792. Number of Matching Subsequences
Medium, Dynamic Programming
Given a string s and an array of strings words, return the number of words[i] that is a subsequence of s.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
- For example, "ace" is a subsequence of "abcde".
Example 1:
Input: s = "abcde", words = ["a","bb","acd","ace"]
Output: 3
Explanation: There are three strings in words that are a subsequence of s: "a", "acd", "ace".
Example 2:
Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]
Output: 2
Constraints:
- 1 <= s.length <= 5 * 104
- 1 <= words.length <= 5000
- 1 <= words[i].length <= 50
- s and words[i] consist of only lowercase English letters.
SOLUTION:
To solve this problem, we need to efficiently check how many strings in the words array are subsequences of the string s. A subsequence is defined as a sequence of characters that appear in the same order as in the original string s, but not necessarily consecutively.
Plan:
-
Optimize the Matching Process: Instead of checking each word in
wordsagainstscharacter by character (which would be inefficient), we can use a bucket approach:- Group the words based on their first character.
- For each character in
s, process all words waiting for that character.
-
Iterate Through
s:- As we iterate through the characters of
s, we advance the matching process for words waiting on that character.
- As we iterate through the characters of
-
Use Buckets:
- Maintain a dictionary where each key is a character, and the value is a list of iterators. Each iterator represents a word and its current position in the matching process.
-
Count Matches:
- When a word's iterator reaches the end, it means the word is a subsequence of
s.
- When a word's iterator reaches the end, it means the word is a subsequence of
Implementation:
using System;
using System.Collections.Generic;
public class Solution {
public int NumMatchingSubseq(string s, string[] words) {
// Dictionary to hold buckets of words grouped by the current character they are waiting for
var waiting = new Dictionary<char, Queue<IEnumerator<char>>>();
// Initialize the dictionary with all lowercase letters
for (char c = 'a'; c <= 'z'; c++) {
waiting[c] = new Queue<IEnumerator<char>>();
}
// Populate the buckets based on the first character of each word
foreach (var word in words) {
var iterator = word.GetEnumerator();
if (iterator.MoveNext()) {
waiting[iterator.Current].Enqueue(iterator);
}
}
int count = 0;
// Iterate through characters of the string `s`
foreach (char c in s) {
var bucket = waiting[c]; // Get the bucket for the current character
int size = bucket.Count;
for (int i = 0; i < size; i++) {
var iterator = bucket.Dequeue(); // Get the next word's iterator
if (iterator.MoveNext()) {
waiting[iterator.Current].Enqueue(iterator); // Move to the next character
} else {
count++; // Word is a subsequence
}
}
}
return count;
}
public static void Main(string[] args) {
var solution = new Solution();
Console.WriteLine(solution.NumMatchingSubseq("abcde", new string[] { "a", "bb", "acd", "ace" })); // Output: 3
Console.WriteLine(solution.NumMatchingSubseq("dsahjpjauf", new string[] { "ahjpjau", "ja", "ahbwzgqnuk", "tnmlanowax" })); // Output: 2
}
}
Explanation of Code:
-
Dictionary for Buckets:
waitingis a dictionary where keys are characters ('a'to'z'), and values are queues of iterators for words waiting on that character.
-
Iterating Over
s:- For each character in
s, process all words waiting for it. - If the iterator can move to the next character, enqueue it into the bucket for the next character.
- If the iterator is exhausted, increment the match count.
- For each character in
-
Efficiency:
- Each word is processed exactly once, and each character in
sis processed once. - Total complexity: , where is the total length of all words in
words, and is the length ofs.
- Each word is processed exactly once, and each character in
Example Outputs:
Example 1:
s = "abcde",words = ["a", "bb", "acd", "ace"]- Words
"a","acd", and"ace"are subsequences of"abcde". - Output:
3
Example 2:
s = "dsahjpjauf",words = ["ahjpjau", "ja", "ahbwzgqnuk", "tnmlanowax"]- Words
"ahjpjau"and"ja"are subsequences of"dsahjpjauf". - Output:
2