Description
Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.
Strings consist of lowercase English letters only, and the length of both strings s and p will not be larger than 20,100.
The order of output does not matter.
Example 1:
Input: s: "cbaebabacd" p: "abc"
Output: [0, 6]
Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".

Example 2:
Input: s: "abab" p: "ab"
Output: [0, 1, 2]
Explanation:
The substring with start index = 0 is "ab", which is an anagram of "ab".
The substring with start index = 1 is "ba", which is an anagram of "ab".
The substring with start index = 2 is "ab", which is an anagram of "ab".
My solution
- The simplest idea is to build and compare two maps for every window of s: positions 1 to psize, 2 to psize + 1, 3 to psize + 2, and so on.
- Since only one new element enters and one old element leaves at each step, the algorithm can be simplified to update the map only at those two positions.
- Comparing all the elements of the two maps at every step is also redundant, because only the changes matter: letters whose counts already match between s and p can be ignored. My approach is to keep a map of outstanding differences (named dif). When dif is empty, the window and p have identical counts, so res.push_back is enough; when dif is not empty, the sliding window simply moves on. This avoids the O(psize) cost of comparing the two maps in full at each step.
- In short, the core data structure is an unordered_map. The code could certainly be optimized further.
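#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findAnagrams(string s, string p) {
        vector<int> res;
        int ssize = s.size();
        int psize = p.size();
        // No window of length psize fits in s; this also keeps s[i] in range below.
        if (ssize < psize) return res;

        // mp holds the negated letter counts of p.
        unordered_map<char, int> mp;
        for (int i = 0; i < psize; ++i) --mp[p[i]];

        // dif = (counts in the current window) - (counts in p); zero entries are erased.
        unordered_map<char, int> dif = mp;
        for (int i = 0; i < psize; ++i) {
            if (++dif[s[i]] == 0) dif.erase(s[i]);
        }
        if (dif.empty()) res.push_back(0);

        for (int i = psize; i < ssize; ++i) {
            // s[i] enters the window, s[i - psize] leaves it.
            if (++dif[s[i]] == 0) dif.erase(s[i]);
            if (--dif[s[i - psize]] == 0) dif.erase(s[i - psize]);
            // An empty dif means the window is an anagram of p.
            if (dif.empty()) res.push_back(i - psize + 1);
        }
        return res;
    }
};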
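For comparison, the simplest idea from the first bullet (rebuild the window's map at every start index and compare it with the map for p) could look roughly like the sketch below. The names countLetters and findAnagramsNaive are made up for illustration, and the sketch assumes p is non-empty, as the problem states. It does O(psize) work per window, which is the cost the dif map is meant to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: letter counts of the substring s[start, start + len).
std::unordered_map<char, int> countLetters(const std::string& s,
                                           std::size_t start, std::size_t len) {
    std::unordered_map<char, int> counts;
    for (std::size_t i = start; i < start + len; ++i) ++counts[s[i]];
    return counts;
}

// Naive baseline: rebuild the window's counts from scratch at every start index.
std::vector<int> findAnagramsNaive(const std::string& s, const std::string& p) {
    assert(!p.empty());  // assumption taken from the problem statement
    std::vector<int> res;
    if (s.size() < p.size()) return res;

    const std::unordered_map<char, int> target = countLetters(p, 0, p.size());
    for (std::size_t start = 0; start + p.size() <= s.size(); ++start) {
        // Two count maps are equal exactly when the window is an anagram of p.
        if (countLetters(s, start, p.size()) == target) {
            res.push_back(static_cast<int>(start));
        }
    }
    return res;
}
```

A baseline like this is mostly useful for cross-checking the sliding-window version on small inputs.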
Discuss
The following code is basically the same as my idea, but because the alphabet is limited to lowercase letters, it stores the counts directly in vectors of 26 slots (a shortcut that only works in this special case). Of course, I would expect the vector version to cost more time, especially at the pv == sv comparison steps in the following code.
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> findAnagrams(string s, string p) {
        vector<int> pv(26,0), sv(26,0), res;
        if(s.size() < p.size())
            return res;
        // fill pv, vector of counters for pattern string and sv, vector of counters for the sliding window
        for(int i = 0; i < p.size(); ++i)
        {
            ++pv[p[i]-'a'];
            ++sv[s[i]-'a'];
        }
        if(pv == sv)
            res.push_back(0);

        // here window is moving from left to right across the string.
        // window size is p.size(), so s.size()-p.size() moves are made
        for(int i = p.size(); i < s.size(); ++i)
        {
            // window extends one step to the right. counter for s[i] is incremented
            ++sv[s[i]-'a'];

            // since we added one element to the right,
            // one element to the left should be forgotten.
            // counter for s[i-p.size()] is decremented
            --sv[s[i-p.size()]-'a'];

            // if after move to the right the anagram can be composed,
            // add new position of window's left point to the result
            if(pv == sv)
                res.push_back(i-p.size()+1);
        }
        return res;
    }
};
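One possible way to keep the 26-slot storage while skipping the full pv == sv comparison is to track how many of the 26 letters currently have matching counts, in the same spirit as the dif map. The sketch below is not the code from the discussion; findAnagramsCounted, need, and balanced are names invented here, and it assumes lowercase English letters only.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant: 26-slot counters plus a running count of balanced letters.
std::vector<int> findAnagramsCounted(const std::string& s, const std::string& p) {
    std::vector<int> res;
    const std::size_t n = s.size(), m = p.size();
    if (m == 0 || n < m) return res;

    // need[c] = (count of c in p) - (count of c in the current window)
    std::array<int, 26> need{};
    for (std::size_t i = 0; i < m; ++i) {
        ++need[p[i] - 'a'];
        --need[s[i] - 'a'];
    }

    // balanced = number of letters whose need is exactly zero
    int balanced = 0;
    for (int v : need) {
        if (v == 0) ++balanced;
    }
    if (balanced == 26) res.push_back(0);

    // Apply delta to one letter's need and keep balanced in sync.
    auto adjust = [&](char c, int delta) {
        int& slot = need[c - 'a'];
        if (slot == 0) --balanced;  // this letter was balanced and is about to change
        slot += delta;
        if (slot == 0) ++balanced;  // this letter is balanced after the change
    };

    for (std::size_t i = m; i < n; ++i) {
        adjust(s[i], -1);      // s[i] enters the window
        adjust(s[i - m], +1);  // s[i - m] leaves the window
        if (balanced == 26) res.push_back(static_cast<int>(i - m + 1));
    }
    return res;
}
```

Each window move then touches two slots and one integer, rather than comparing all 26 entries.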
Epilogue
Why do I think my way is excellent? Is it an illusion?