30. Substring with Concatenation of All Words

Description

You are given a string s and an array of strings words. All the strings of words are of the same length.

A concatenated substring in s is a substring that contains all the strings of any permutation of words concatenated.

  • For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef", "cdefab", "efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated substring because it is not the concatenation of any permutation of words.

Return the starting indices of all the concatenated substrings in s. You can return the answer in any order.

 

Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation: Since words.length == 2 and words[i].length == 3, the concatenated substring has to be of length 6.
The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.
The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.
The output order does not matter. Returning [9,0] is fine too.

Example 2:

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation: Since words.length == 4 and words[i].length == 4, the concatenated substring has to be of length 16.
There is no substring of length 16 in s that is equal to the concatenation of any permutation of words.
We return an empty array.

Example 3:

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]
Explanation: Since words.length == 3 and words[i].length == 3, the concatenated substring has to be of length 9.
The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"] which is a permutation of words.
The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"] which is a permutation of words.
The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"] which is a permutation of words.

 

Constraints:

  • 1 <= s.length <= 104
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 30
  • s and words[i] consist of lowercase English letters.

Solutions

Solution 1: Hash Table + Sliding Window

We use a hash table $cnt$ to count the number of times each word appears in $words$, and use a hash table $cnt1$ to count the number of times each word appears in the current sliding window. We denote the length of the string $s$ as $m$, the number of words in the string array $words$ as $n$, and the length of each word as $k$.

We can enumerate the starting point $i$ of the sliding window, where $0 \lt i < k$. For each starting point, we maintain a sliding window with the left boundary as $l$, the right boundary as $r$, and the number of words in the sliding window as $t$. Additionally, we use a hash table $cnt1$ to count the number of times each word appears in the sliding window.

Each time, we extract the string $s[r:r+k]$. If $s[r:r+k]$ is not in the hash table $cnt$, it means that the words in the current sliding window are not valid. We update the left boundary $l$ to $r$, clear the hash table $cnt1$, and reset the word count $t$ to 0. If $s[r:r+k]$ is in the hash table $cnt$, it means that the words in the current sliding window are valid. We increase the word count $t$ by 1, and increase the count of $s[r:r+k]$ in the hash table $cnt1$ by 1. If $cnt1[s[r:r+k]]$ is greater than $cnt[s[r:r+k]]$, it means that $s[r:r+k]$ appears too many times in the current sliding window. We need to move the left boundary $l$ to the right until $cnt1[s[r:r+k]] = cnt[s[r:r+k]]$. If $t = n$, it means that the words in the current sliding window are exactly valid, and we add the left boundary $l$ to the answer array.

The time complexity is $O(m \times k)$, and the space complexity is $O(n \times k)$. Here, $m$ and $n$ are the lengths of the string $s$ and the string array $words$ respectively, and $k$ is the length of the words in the string array $words$.

Python Code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        cnt = Counter(words)
        m, n = len(s), len(words)
        k = len(words[0])
        ans = []
        for i in range(k):
            cnt1 = Counter()
            l = r = i
            t = 0
            while r + k <= m:
                w = s[r : r + k]
                r += k
                if w not in cnt:
                    l = r
                    cnt1.clear()
                    t = 0
                    continue
                cnt1[w] += 1
                t += 1
                while cnt1[w] > cnt[w]:
                    remove = s[l : l + k]
                    l += k
                    cnt1[remove] -= 1
                    t -= 1
                if t == n:
                    ans.append(l)
        return ans

Java Code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        Map<String, Integer> cnt = new HashMap<>();
        for (String w : words) {
            cnt.merge(w, 1, Integer::sum);
        }
        int m = s.length(), n = words.length;
        int k = words[0].length();
        List<Integer> ans = new ArrayList<>();
        for (int i = 0; i < k; ++i) {
            Map<String, Integer> cnt1 = new HashMap<>();
            int l = i, r = i;
            int t = 0;
            while (r + k <= m) {
                String w = s.substring(r, r + k);
                r += k;
                if (!cnt.containsKey(w)) {
                    cnt1.clear();
                    l = r;
                    t = 0;
                    continue;
                }
                cnt1.merge(w, 1, Integer::sum);
                ++t;
                while (cnt1.get(w) > cnt.get(w)) {
                    String remove = s.substring(l, l + k);
                    l += k;
                    cnt1.merge(remove, -1, Integer::sum);
                    --t;
                }
                if (t == n) {
                    ans.add(l);
                }
            }
        }
        return ans;
    }
}

C++ Code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        unordered_map<string, int> cnt;
        for (auto& w : words) {
            ++cnt[w];
        }
        int m = s.size(), n = words.size(), k = words[0].size();
        vector<int> ans;
        for (int i = 0; i < k; ++i) {
            unordered_map<string, int> cnt1;
            int l = i, r = i;
            int t = 0;
            while (r + k <= m) {
                string w = s.substr(r, k);
                r += k;
                if (!cnt.count(w)) {
                    cnt1.clear();
                    l = r;
                    t = 0;
                    continue;
                }
                ++cnt1[w];
                ++t;
                while (cnt1[w] > cnt[w]) {
                    string remove = s.substr(l, k);
                    l += k;
                    --cnt1[remove];
                    --t;
                }
                if (t == n) {
                    ans.push_back(l);
                }
            }
        }
        return ans;
    }
};

Go Code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
func findSubstring(s string, words []string) (ans []int) {
	cnt := map[string]int{}
	for _, w := range words {
		cnt[w]++
	}
	m, n, k := len(s), len(words), len(words[0])
	for i := 0; i < k; i++ {
		cnt1 := map[string]int{}
		l, r, t := i, i, 0
		for r+k <= m {
			w := s[r : r+k]
			r += k
			if _, ok := cnt[w]; !ok {
				l, t = r, 0
				cnt1 = map[string]int{}
				continue
			}
			cnt1[w]++
			t++
			for cnt1[w] > cnt[w] {
				cnt1[s[l:l+k]]--
				l += k
				t--
			}
			if t == n {
				ans = append(ans, l)
			}
		}
	}
	return
}

TypeScript Code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
function findSubstring(s: string, words: string[]): number[] {
    const cnt: Map<string, number> = new Map();
    for (const w of words) {
        cnt.set(w, (cnt.get(w) || 0) + 1);
    }
    const m = s.length;
    const n = words.length;
    const k = words[0].length;
    const ans: number[] = [];
    for (let i = 0; i < k; ++i) {
        const cnt1: Map<string, number> = new Map();
        let l = i;
        let r = i;
        let t = 0;
        while (r + k <= m) {
            const w = s.slice(r, r + k);
            r += k;
            if (!cnt.has(w)) {
                cnt1.clear();
                l = r;
                t = 0;
                continue;
            }
            cnt1.set(w, (cnt1.get(w) || 0) + 1);
            ++t;
            while (cnt1.get(w)! - cnt.get(w)! > 0) {
                const remove = s.slice(l, l + k);
                cnt1.set(remove, cnt1.get(remove)! - 1);
                l += k;
                --t;
            }
            if (t === n) {
                ans.push(l);
            }
        }
    }
    return ans;
}

C# Code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
public class Solution {
    public IList<int> FindSubstring(string s, string[] words) {
        var cnt = new Dictionary<string, int>();
        foreach (var w in words) {
            if (!cnt.ContainsKey(w)) {
                cnt[w] = 0;
            }
            ++cnt[w];
        }
        int m = s.Length, n = words.Length, k = words[0].Length;
        var ans = new List<int>();
        for (int i = 0; i < k; ++i) {
            var cnt1 = new Dictionary<string, int>();
            int l = i, r = i, t = 0;
            while (r + k <= m) {
                var w = s.Substring(r, k);
                r += k;
                if (!cnt.ContainsKey(w)) {
                    cnt1.Clear();
                    t = 0;
                    l = r;
                    continue;
                }
                if (!cnt1.ContainsKey(w)) {
                    cnt1[w] = 0;
                }
                ++cnt1[w];
                ++t;
                while (cnt1[w] > cnt[w]) {
                    --cnt1[s.Substring(l, k)];
                    l += k;
                    --t;
                }
                if (t == n) {
                    ans.Add(l);
                }
            }
        }
        return ans;
    }
}

Solution 2

C++ Code
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        unordered_map<string, int> d;
        for (auto& w : words) ++d[w];
        vector<int> ans;
        int n = s.size(), m = words.size(), k = words[0].size();
        for (int i = 0; i < k; ++i) {
            int cnt = 0;
            unordered_map<string, int> t;
            for (int j = i; j <= n; j += k) {
                if (j - i >= m * k) {
                    auto s1 = s.substr(j - m * k, k);
                    --t[s1];
                    cnt -= d[s1] > t[s1];
                }
                auto s2 = s.substr(j, k);
                ++t[s2];
                cnt += d[s2] >= t[s2];
                if (cnt == m) ans.emplace_back(j - (m - 1) * k);
            }
        }
        return ans;
    }
};