Tuesday, July 1, 2014

[Leetcode] Distinct Subsequences

Problem 

Given a string S and a string T, count the number of distinct subsequences of T in S.
A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., "ACE" is a subsequence of "ABCDE" while "AEC" is not).

Here is an example:
S = "rabbbit", T = "rabbit"
Return 3.

Algorithm

We can use dynamic programming to solve this problem. 
Let dp[i][j] be the number of distinct T.substring(0,j+1) can be found as a subsequence in the S.substring(0,i+1). In order to match, we have two options, 
  • Match T.charAt(j) and S.charAt(i), then solve subproblem dp[i-1][j-1]
  • Or, solve subproblem dp[i-1][j]
Therefore, we have the recurrence, 

      dp[i][j] =    dp[i-1][j] + dp[i-1][j-1]       if S.charAt(i) == T.charAt(j)
                          dp[i-1][j]                            else

Review the Longest Common Subsequences between S and T. Defining L[i][j] as the length of the longest common subsequence between S and T,  we have the following recurrence
      dp[i][j]  =  dp[i-1][j-1] + 1                    if S.charAt(i) == T.charAt(j)
                       Max(dp[i-1][j], dp[i][j-1])   else

The subtle differences between these two problems are: 
  • In the distinct subsequences, we must find a match for every character in T. So, when S[i] does not match T[j], we still need to find a match for T[j], i.e. look for solutions in dp[i-1][j]. 
  • Number of distinct subsequences, dp[i-1][j] + dp[i-1][j-1], while to get length in the LCS dp[i-1][j-1] +1

Code



No comments:

Post a Comment