leetcode Quesion 6: Anagram



Anagram
Given an array of strings, return all groups of strings that are anagrams.
Note: All inputs will be in lower-case.

Analysis:
Anagrams is two strings are using the same characters. 
One way to compare two strings is use sort(). e.g. 
sort(str1.begin(), str1.end()); 
sort(str1.begin(), str1.end());
if (str1.compare(str2)==0) // when two strings are equal, the func returns 0

A more efficient way:
1. Scan the whole string vector, for each string, store to a hash map with the "ordered string" as the key. O(n).
2. Scan the whole hash map, output the values where for one key the number of value >=2. O(n)

Note:
1. We can use multimap<string, string> in c++, which allows the duplicate key values.
2. To store key-value into multimap, use ".insert(pair<key_type, value_type>(key,value))"
3. Use "pair<multimap<string,string>::iterator,multimap<string,string>::iterator> ret;" and
    ".equal_range()" which returns a iterator pair(ret) that the "first" is the lower bound and "second" is the upper bound, to get all the key-values pairs for one key.
4. To check the number of values in one key, use the .count(key) method.


Updated 201312
Use one map <sorted string as the key,  position of the first string(unsorted)>, can deal with this problem more space efficiently.
Check the array once, if the current string is not found in the map, then add <sorted string, current position>  into the map. If the current string can be found in the map, then add the current string in the result array, at this time, the first string of this anagram added in the map should also be put into the result array only once! Change the  map value can determine that.

Code(updated 201312):


class Solution {
public:
    vector<string> anagrams(vector<string> &strs) {
        vector<string> res;
        map<string,int> mp;
        for (int i=0;i<strs.size();i++){
            string ss = strs[i];
            sort(ss.begin(),ss.end());
            if (mp.find(ss)!=mp.end()){
                res.push_back(strs[i]);
                if (mp[ss]!=-1){
                    res.push_back(strs[mp[ss]]);    
                    mp[ss]=-1;
                }
            }else{
                mp[ss]=i;
            }
        }
        return res;
    }
};


Updated 201309

Just use the map data structure to store, and use sort algorithm to check the anagram.

Code (updated 201309):

class Solution {
public:
    vector<string> anagrams(vector<string> &strs) {
        // Start typing your C/C++ solution below
        // DO NOT write int main() function
        map<string,int> mp; //store the occurrence of strings, keys are sorted string
        map<string,string> pre; //store the first-time string
        vector<string> res;  //result vector
        for (int i=0;i<strs.size();i++){
            string s = strs[i]; //current string
            sort(s.begin(),s.end()); // get the key (sorted string)
            if (mp.find(s)==mp.end()){ // if this is the 1st time meet this string
                mp[s]=1; // set occurence to 1
                pre[s] = strs[i]; // save the original string
            }else{ // if anagrams found
                mp[s]++;
                if (mp[s]==2){res.push_back(pre[s]);} // need to push the 1st string into result only once
                res.push_back(strs[i]); // push the current string
            }
        }
        return res;
    }
};




Old versions:
The source code O(n^2):

class Solution {
public:
    vector<string> anagrams(vector<string> &strs) {
        // Start typing your C/C++ solution below
        // DO NOT write int main() function
        vector<string>res;
        if (!strs.empty()){
            vector<string>::iterator it1,it2;
            map<string int="int"> hmap;
            for (it1 = strs.begin();it1!=strs.end();it1++){
                for (it2 = it1+1;it2!=strs.end();it2++){
                    string s1 = *it1;
                    string s2 = *it2;
                    if(*it1==*it2){
                        hmap[*it1]=2;
                    }else{
                        sort(s1.begin(),s1.end());
                        sort(s2.begin(),s2.end());
                        if(s1.compare(s2)==0){
                            hmap[*it1]=1;
                            hmap[*it2]=1;
                        }
                    }
                }
            }
            map<string, int>::iterator it3;
            for (it3 = hmap.begin(); it3!=hmap.end();it3++){
                if ((*it3).second==1){
                    res.push_back((*it3).first);
                }
                if ((*it3).second==2){
                    res.push_back((*it3).first);
                    res.push_back((*it3).first);
                }
            }
        }
        return res;
    }



The source code: O(n)


class Solution {
public:
 vector<string> anagrams(vector<string> &strs) {
        // Start typing your C/C++ solution below
        // DO NOT write int main() function
        vector<string> res; //result vector
        set<string> order;   //string keys (ordered charactors)
        if (!strs.empty()){
            multimap<string,string> mmp; //hash map storing all the strings
            multimap<string,string>::iterator iit;
            vector<string>::iterator it; //scan the string vector
            for (it = strs.begin(); it!= strs.end();it++){
                string ss =  *it;
                sort(ss.begin(),ss.end()); //sort the string (if isanagram, the sorted string would be the same)
                mmp.insert(pair<string,string>(ss,*it)); //store in the hash map
                order.insert(ss); //store the key.
            }
             
            pair<multimap<string,string>::iterator,multimap<string,string>::iterator> ret;
            set<string>::iterator oit;
            for (oit=order.begin(); oit!=order.end(); oit++){ // for each key loop
                if (mmp.count(*oit) > 1){                     // anagram only if the number >1
                    ret = mmp.equal_range((*oit));  //get all the pairs with the specific key
                    for (iit=ret.first; iit!=ret.second; ++iit){ 
                        res.push_back((*iit).second);
                    }
                }
                 
            }
        }
        return res;
    }
};

2 comments:

  1. Hello, incredible blog, thank you very much for a bunch of amazing decisions!
    However, here you didn't use the condition "All inputs will be in lower-case.". For example, we could sort the string for O(n).
    Or you can find one more interesting solution :)

    ReplyDelete
  2. how are u maintaining lexicographic order of strings?

    ReplyDelete