KMP算法是用來查看一個(gè)字符串是否包含另一個(gè)字符串的算法,暴力求解的時(shí)間復(fù)雜度是O(m*n),而KMP算法的時(shí)間復(fù)雜度是O(m+n)宏蛉,KMP算法的關(guān)鍵概念如下:
- 前綴 abcdab的前綴集合為a,ab,abc,abcda
- 后綴 abcdab的后綴集合為bcdab,cdab,dab,ab,b
- 部分匹配值 部分匹配值就是前綴和后綴最長(zhǎng)的共有元素的長(zhǎng)度肯污,比如abcdab前綴和后綴最長(zhǎng)的共有元素是ab,其長(zhǎng)度是2
- next值 當(dāng)前字符前(不含當(dāng)前字符)的字符串的部分匹配值,比如abcdab最后b的next值就是abcda的部分匹配值彩届,也就是1
- 部分匹配值數(shù)組 設(shè)一個(gè)字符串string的長(zhǎng)度為n伪冰,則部分匹配值數(shù)組array有n個(gè)元素,且array[i]的值為字符串string.sub(0, i+1)的部分匹配值樟蠕,比如abcdab的部分匹配值數(shù)組為0,0,0,0,1,2
- next數(shù)組 設(shè)一個(gè)字符串string的長(zhǎng)度為n贮聂,則next數(shù)組array有n個(gè)元素,且array[i]的值為字符串string.sub(0, i+1)的next值寨辩,特別的吓懈, array[0]永遠(yuǎn)等于-1,array[1]永遠(yuǎn)等于0, next數(shù)組可以用部分匹配值數(shù)組右移,然后在第一個(gè)元素補(bǔ)上-1得到靡狞。例如abcdab的部分匹配值數(shù)組為0,0,0,0,1,2,右移后為0,0,0,0,1,補(bǔ)充-1后為-1,0,0,0,0,1
next數(shù)組的作用在于計(jì)算模版P匹配字符串A失配后耻警,需要在當(dāng)前位置后移多少位繼續(xù)匹配。設(shè)失配字符為模版P的第i個(gè)字符耍攘,計(jì)算公式為distance = i - next(i)
實(shí)例代碼如下:
public class KMP {
private static int[] getNextArray(final String pattern) {
if (pattern == null || pattern.isEmpty()) {
return null;
}
int[] next = new int[pattern.length()];
next[0] = -1;
int k = -1;
int j = 0;
while (j < (pattern.length() - 1)) {
if (k == -1 || pattern.charAt(k) == pattern.charAt(j)) {
k++;
j++;
next[j] = k;
} else {
k = next[k];
}
}
return next;
}
private static void printArray(int[] arr) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < arr.length; i++) {
sb.append(arr[i]);
sb.append(' ');
}
sb.append('\n');
System.out.print(sb.toString());
}
public static int kmpFind(final String src, final String pattern) {
if (src == null || pattern == null) {
return -1;
}
int lenSrc = src.length();
int lenPattern = pattern.length();
if (lenSrc < lenPattern) {
return -1;
}
int[] next = getNextArray(pattern);
if (next == null) {
return -1;
}
int result = -1;
int i = 0;
while (i < lenSrc) {
int j = 0;
while (j < lenPattern) {
if (src.charAt(i) != pattern.charAt(j)) {
break;
}
i++;
j++;
}
if (j < lenPattern) {
int move = j - next[j];
i = i - j + move;
result = -1;
} else {
result = i - lenPattern;
break;
}
}
return result;
}
public static void main(String[] argv) {
System.out.print(String.format("find %s in %s : %d\n", argv[1],
argv[0],
kmpFind(argv[0], argv[1])));
}
}