leetcode_438_Find All Anagrams in a String_哈希表_java实现

题目：

Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.

Strings consists of lowercase English letters only and the length of both strings s and p will not be larger than 20,100.

The order of output does not matter.

Example 1:

Input:

s: "cbaebabacd" p: "abc"

Output:

[0, 6]

Explanation:

The substring with start index = 0 is "cba", which is an anagram of "abc".

The substring with start index = 6 is "bac", which is an anagram of "abc".

Example 2:

Input:

s: "abab" p: "ab"

Output:

[0, 1, 2]

Explanation:

The substring with start index = 0 is "ab", which is an anagram of "ab".

The substring with start index = 1 is "ba", which is an anagram of "ab".

The substring with start index = 2 is "ab", which is an anagram of "ab".

解题思路：

这道题一个最重要的难点就是时间控制。另一个就是哈希表的应用。

笔者第一次是从s中逐步截取与p等长的子串m，然后再暴力遍历分析m和p是否为Anagrams（只有字符顺序不同的两个字符串），总的时间复杂度为O(s*p*p)大概是O(n^3)，自然不能通过。第二次时间优化是在比较m和p时，采用《如何判断两个String是否是Anagrams》一文中介绍的方法，总的时间复杂度是O(s*p)，大概是O(n^2)，然后依然没有通过，超时。

最后一次进一步优化，使用滑动窗口，使得时间复杂度为为O(n)，这就是算法的力量！

算法介绍：

（1）NumberOfDeference = p.length()，用来表示目前窗口中的字符和p的差异度（由于窗口最大时（为p.length()）才有可能使NumberOfDeference为0，所以NumberOfDeference 为0时窗口中与p为Anagrams）。NumberOfDeference差异度的含义：一开始窗口左右指针都指向第一个，所以差异度最大为p.length()；当窗口中进来一个p中有的字符，差异度就减一，出去一个有的差异度就增一；至于进来或出去的是p中没有的则NumberOfDeference不变，反正窗口的最大值也不过是p.length()。

（2）窗口的左右两个指针：left和right，分别指向窗口的左端和右端。

滑动窗口具体操作：

先滑动右指针，

1.1 加进来的字符如果该字符在数组asciiChars的计数中非负，则NumberOfDeference减一，否则不做操作，

1.2无论在不在数组asciiChars该字符相应位置计数都减一

1.3如果滑动完right，判断如果窗口长度到达p.length()（如果长度到达p.length()，并且NumberOfDeference为0，则将此时的left值加入到result中），滑动左窗口。

滑动左指针：

2.1被踢出窗口的那个字符如果在数组asciiChars的计数中非负，则NumberOfDeference增一，不在则不做操作。

2.2无论在不在数组asciiChars该字符相应位置计数都加一

（3）上面1.1和2.1操作的原理：asciiChars中的记录的数据是代表p中含有的的每个字符的数量去掉当前窗口中存在的p字符串中每个字符的数量，一开始窗口大小为0，啥都不存在，自然asciiChars中记录的就是全部p中含有的的每个字符的数量，如果窗口中有一个，则asciiChars相应的记录中就少一个。如果窗口中多包含一个p中没有的（或者包含的某一个字符的数量比p中有的还多），那么这时候，asciiChars中相应的记录值就会变成负数，代表当前窗口中包含相应字符的“多余”的数量。

所以当进来一个的时候，如果相应记录为非负，那么这个进入是有意义的，则差异度（NumberOfDeference）减一；当出去一个的时候，如果相应记录为非负，那么这个出去是有意义的，则差异度（NumberOfDeference）加一。

（4）asciiChars用到哈希表思想，用来记录p中每一种字符分别有多少个。详见《如何判断两个String是否是Anagrams》一文。

代码：

public static List<Integer> findAnagrams(String s, String p) {

        List<Integer> result = new ArrayList<Integer>();

        int NumberOfDeference = p.length();  //差异度指数

        int left=0,right=0;  //窗口左右指针

        int[] asciiChars = new int[256];

//记录p中字符有哪些及其数量的数组

        for (int i = p.length() - 1; i>=0; --i)

          {     ++asciiChars[p.charAt(i)];      } //记录完毕

        for(;right<s.length();right++){ //滑动右窗口

            asciiChars[s.charAt(right)]--;  //在该字符相应位置减一

            if(asciiChars[s.charAt(right)]>=0) NumberOfDeference--;

//如果加进来的那个在p中，NumberOfDeference减一

            if(right-left == (p.length()-1)){

 //如果这时窗口大小为p.length()

                if(NumberOfDeference==0) result.add(left);

//这时出现一次匹配，将左窗口加到result中

                //下面是滑动左窗口的操作

                if(asciiChars[s.charAt(left)]>=0) {

                    NumberOfDeference++;

//如果被踢出的那个在p中，NumberOfDeference加一

                }

                asciiChars[s.charAt(left)]++;

//数组中相应字符计数位置加回来

                left++; //左窗口向右滑动

            }

        }

        return result;

    }