0115 - Distinct Subsequences (Hard)
Problem Link
https://leetcode.com/problems/distinct-subsequences/
Problem Statement
Given two strings s
and t
, return the number of distinct subsequences of s
which equals t
.
A string's subsequence is a new string formed from the original string by deleting some (can be none) of the characters without disturbing the remaining characters' relative positions. (i.e., "ACE"
is a subsequence of "ABCDE"
while "AEC"
is not).
The test cases are generated so that the answer fits on a 32-bit signed integer.
Example 1:
Input: s = "rabbbit", t = "rabbit"
Output: 3
Explanation:
As shown below, there are 3 ways you can generate "rabbit" from S.
rabbbit
rabbbit
rabbbit
Example 2:
Input: s = "babgbag", t = "bag"
Output: 5
Explanation:
As shown below, there are 5 ways you can generate "bag" from S.
babgbag
babgbag
babgbag
babgbag
babgbag
Constraints:
1 <= s.length, t.length <= 1000
s
andt
consist of English letters.
Approach 1: Dynamic Programming
Let be the number of distinct subsequences if contains . The base case is when is empty, there is one valid subsequence for each . If the subsequence doesn't contain , then we take . If , then we need to include as well as .
- C++
- Python
class Solution {
public:
int numDistinct(string s, string t) {
int n = s.size(), m = t.size();
unsigned long long dp[n + 1][m + 1];
memset(dp, 0, sizeof(dp));
for(int i = 0; i <= n; i++) dp[i][0] = 1;
for(int i = 1; i <= n; i++) {
for(int j = 1; j <= m; j++) {
dp[i][j] = dp[i - 1][j];
if(s[i - 1] == t[j - 1]) {
dp[i][j] += dp[i - 1][j - 1];
}
}
}
return dp[n][m];
}
};
class Solution:
# Time Complexity: O(S*T) where s is length of S and t is the
# length of T, we must loop through values of s and t.
# Space Complexity: O(S*T) to maintain our dp array.
def numDistinct(self, s: str, t: str) -> int:
# initialize dp array.
# T + 1 columns to handle empty string case
# S + 1 rows to handle empty string case.
dp = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
# base case: for every subsequence in s, only 1, the empty
# string matches up to empty string of t.
for i in range(0, len(s) + 1):
dp[i][0] = 1
# loop through our dp table, skipping the empty string parts.
for i in range(1, len(s) + 1):
for j in range(1, len(t) + 1):
# We should have at least the same amount of matches in
# matches in our string s[:i] as we did on s[:i - 1]
# matching against t[:j]
dp[i][j] = dp[i - 1][j]
if s[i - 1] == t[j - 1]:
# Current character does match for s[:i-1] t[j-1]
# then we need to add all the values from the dp
# table of [i-1][j-1]
dp[i][j] += dp[i - 1][j - 1]
return dp[len(s)][len(t)]
Approach 2: Dynamic Programming (Space Optimised)
In Approach 1, we calculate based on the previous row. We can simplify it by using a one dimensional array of size where is the length of Then we calculate backwards so that the new value won't affect the calculate of the next value.
- C++
class Solution {
public:
int numDistinct(string s, string t) {
int n = s.size(), m = t.size();
unsigned long long dp[m + 1];
memset(dp, 0, sizeof(dp));
dp[0] = 1;
for (int i = 1; i <= n; i++) {
for (int j = min(i, m); j >= 1; j--) {
if (s[i - 1] == t[j - 1]) {
dp[j] += dp[j - 1];
}
}
}
return dp[m];
}
};
Approach 3: Dynamic Programming - Memoization
We can also do the problem recursively, going depth-first, and saving all the data on each iteration in case we run into it again. We can again use and to handle checking the characters of to . We will have 3 main base cases in our dfs.
- We calculated the subproblem of current and before, so we can return that cached value.
- has reached . Meaning we reached a subsequence that matches the target string and can return 1.
- has reached . Meaning we reached the end of string and have not yet found a subsequence that makes the target so we can return 0.
Know those 3 base cases, all we have to do is initialize our subsequence count to be the recursive call at and , then once we have that value, check if the characters of match at the current and if they do we can add the number of subsequences that match our recursive call at , .
Finally remember to cache the subsequence count in our cache using the key of , and returning the subsequence value.
Time Complexity: where is and is . For each and up to and we make a recursive call to explore all possibilities either including or excluding the character in . By caching the answer to the subproblems, we don't have to do the repeated work.
Space Complexity: , as we have to store the answers to each subproblem inside our hash map cache.
- Python
class Solution:
def numDistinct(self, s: str, t: str) -> int:
# initialize our cache
cache = {}
# dfs, i, j where i is position in s, j position in t.
def dfs(i, j):
# if we have calculated s[:i] and t[:j] before:
if (i, j) in cache:
# return that value
return cache[(i, j)]
# base case, j reached length t, meaning we found a subsequence
if j == len(t):
return 1
# if i reached the end of s, and we haven't created j isn't
# len(t) yet, meaning we haven't created t using s.
# We have no subsequences -> return 0
if i == len(s):
return 0
# set current subsequence == subsequences of subproblem where
# s[:i + 1] is matched against current t[:j]
subseq = dfs(i + 1, j)
# check if current ch in subsequence matches target = t[j]
if s[i] == t[j]:
# then we can add the values from the subsequence of the
# next iteration, where s[:i + 1] matched to t[i + 1]
subseq += dfs(i + 1, j + 1)
# cache our current subsequence value for the i,j for
# future subproblems use.
cache[(i, j)] = subseq
# return that value
return subseq
# call our algorithm starting with empty strings
# s[:0] and t[:0]
return dfs(0, 0)