5.2 definition of string
- A string is a finite sequence of zero or more characters.
5.3 comparison of strings
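- Strings are compared character by character using each character's code (for example, its ASCII value). If one string is a prefix of the other, the string with fewer characters is the smaller one.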
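As a rough illustration of comparing strings by character code and by number of characters, here is a minimal sketch. The class and method names are hypothetical and chosen only for this example; Java's built-in String.compareTo follows the same rule.

```java
// Minimal sketch; class and method names are hypothetical.
final class StringCompareSketch {
    // Negative if s < t, zero if s equals t, positive if s > t.
    static int compare(String s, String t) {
        int shared = Math.min(s.length(), t.length());
        for (int k = 0; k < shared; k++) {
            char a = s.charAt(k);
            char b = t.charAt(k);
            if (a != b) {
                return a - b; // the first differing character code decides
            }
        }
        // No difference in the shared part: the shorter string is smaller.
        return s.length() - t.length();
    }

    public static void main(String[] args) {
        // 'e' has a smaller code than 'y', so "happen" sorts before "happy".
        System.out.println(compare("happen", "happy") < 0);
        // "hap" is a prefix of "happy", so it is the smaller string.
        System.out.println(compare("hap", "happy") < 0);
        System.out.println(compare("abc", "abc") == 0);
    }
}
```

All three lines in main print true.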
5.4 abstract data types of strings
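ADT
DATA
Operation
    StrAssign(T, *chars);         // Generate a string T whose value is chars
    StrCopy(T, S);                // Copy S to T
    ClearString(S);               // Empty S
    StringEmpty(S);               // Test whether S is empty
    StrLength(S);                 // Return the number of characters in S
    StrCompare(S, T);             // Compare S and T
    Concat(T, S1, S2);            // Merge S1 and S2 into T
    SubString(Sub, S, pos, len);  // Cut len characters of S starting at pos into Sub
    Index(S, T, pos);             // Locate T in S after position pos
    Replace(S, T, V);             // Replace occurrences of T in S with V
    StrInsert(S, pos, T);         // Insert T into S at position pos
    StrDelete(S, pos, len);       // Delete len characters of S starting at pos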
5.5 storage structure of string
5.5.1 sequential storage structure of strings
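- The characters are stored in one contiguous block of memory. In C, this block can be allocated dynamically on the heap with malloc() and released with free().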
5.5.2 chain storage structure of string
- Each node can hold several characters, for example: -> ABCD -> IG## ^. When the last node is not full, pad it with "#" or another character that cannot be a string value.
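The chained layout can be sketched roughly as follows. This is only an illustration under stated assumptions: the node size of 4, the '#' filler, and all class and method names are hypothetical choices for this example, not something the text prescribes.

```java
// Illustrative sketch of chained (linked) string storage; all names are hypothetical.
final class ChunkedStringSketch {
    static final int CHUNK_SIZE = 4; // characters per node (assumed for this example)
    static final char PAD = '#';     // filler; must not occur in the stored string

    static final class Node {
        final char[] data = new char[CHUNK_SIZE];
        Node next; // null marks the end of the chain (drawn as "^")
    }

    // Splits s into nodes of CHUNK_SIZE characters and pads the last node.
    static Node fromString(String s) {
        Node head = null;
        Node tail = null;
        for (int start = 0; start < s.length(); start += CHUNK_SIZE) {
            Node node = new Node();
            for (int k = 0; k < CHUNK_SIZE; k++) {
                int idx = start + k;
                node.data[k] = idx < s.length() ? s.charAt(idx) : PAD;
            }
            if (head == null) {
                head = node;
            } else {
                tail.next = node;
            }
            tail = node;
        }
        return head;
    }

    // Renders the chain in the "->ABCD->IG##^" notation.
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            sb.append("->").append(n.data);
        }
        return sb.append('^').toString();
    }

    public static void main(String[] args) {
        System.out.println(render(fromString("ABCDIG"))); // prints ->ABCD->IG##^
    }
}
```

Packing several characters into one node wastes less space on pointers than one character per node, at the cost of padding in the last node.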
5.6 simple pattern matching algorithm
- Locating a substring within a main string is usually called string pattern matching.
    // Returns the index of the first occurrence of substring t in main string s
    // at or after index pos; returns -1 if there is none.
    public int index(String s, String t, int pos) {
        int i = pos; // current index in the main string, starting at pos
        int j = 0;   // current index in the substring
        // Continue while i is inside the main string and j is inside the substring
        while (i < s.length() && j < t.length()) {
            if (s.charAt(i) == t.charAt(j)) {
                // Characters are equal: advance both indices
                i++;
                j++;
            } else {
                // Characters differ: back i up to one past the start of this attempt
                // and restart the substring from its first character
                i = i - j + 1;
                j = 0;
            }
        }
        // j reached the end of t, so a complete match was found
        if (j == t.length()) {
            return i - j;
        }
        return -1;
    }
    // With n = s.length() and m = t.length():
    // Best case: O(m), when the match starts at pos (O(1) if m is treated as a constant)
    // Average: O(n+m)
    // Worst case: O((n-m+1) * m)
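To make the worst-case bound concrete, the sketch below counts character comparisons made by the simple matching loop. The class, the helper zerosThenOne, and the array return value are hypothetical additions for this illustration only.

```java
// Illustrative sketch: the simple matcher instrumented with a comparison counter.
final class NaiveMatchCost {
    // Returns {match index or -1, number of character comparisons}.
    static int[] indexWithCount(String s, String t, int pos) {
        int i = pos;
        int j = 0;
        int comparisons = 0;
        while (i < s.length() && j < t.length()) {
            comparisons++;
            if (s.charAt(i) == t.charAt(j)) {
                i++;
                j++;
            } else {
                i = i - j + 1; // back off in the main string
                j = 0;
            }
        }
        int index = (j == t.length()) ? i - j : -1;
        return new int[] { index, comparisons };
    }

    // Builds a string of `zeros` '0' characters followed by a single '1'.
    static String zerosThenOne(int zeros) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < zeros; k++) {
            sb.append('0');
        }
        return sb.append('1').toString();
    }

    public static void main(String[] args) {
        String s = zerosThenOne(49); // n = 50
        String t = zerosThenOne(9);  // m = 10
        int[] r = indexWithCount(s, t, 0);
        System.out.println("index = " + r[0] + ", comparisons = " + r[1]);
    }
}
```

For this input every one of the 41 starting positions costs 10 comparisons, so the run reports index 40 after (50 - 10 + 1) * 10 = 410 comparisons.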
5.7 KMP pattern matching algorithm
- (1) When the characters at positions i and j are the same, both pointers move to the next position and the comparison continues.
- (2) When the characters at positions i and j differ, i stays where it is and j falls back to position next[j] before comparing again. (The contents of next[] can be ignored for now; just remember that next[0] = -1 by definition.)
- (3) When j has fallen back to index 0 and the characters at positions i and j still differ, rule (2) sets j = next[0] = -1. In that case both i and j simply move forward by one, the same move as in rule (1), so the comparison resumes with the next character of the main string against the first character of the substring.
    /*
     * Returns the index of the first occurrence of substring t in main string s
     * at or after index pos (pos itself included). Returns -1 if there is none.
     */
    public int index_KMP(String s, String t, int pos) {
        int i = pos;             // pointer into the main string
        int j = 0;               // pointer into the substring
        int[] next = getNext(t); // next array of the substring
        while (i < s.length() && j < t.length()) {
            // j == -1 means that even the first character of the substring did not match
            // (the previous step set j = next[0] = -1), so both pointers move forward.
            if (j == -1 || s.charAt(i) == t.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j]; // mismatch: i stays put, j falls back
            }
        }
        if (j == t.length()) {
            return i - j;
        }
        return -1;
    }

    // next[j]: the index that j jumps to when the character at index j does not match.

    /*
     * Returns the next array of the string str.
     */
    public int[] getNext(String str) {
        int length = str.length();
        int[] next = new int[length];
        if (length == 0) {
            return next; // an empty string has no next values
        }
        int i = 0;   // suffix pointer
        int j = -1;  // prefix pointer
        next[0] = -1;
        while (i < length - 1) { // next[++i] is written below, so the bound is length - 1, not length
            if (j == -1 || str.charAt(i) == str.charAt(j)) {
                // j == -1: no prefix equals a suffix, the longest common length is 0,
                // so next[i + 1] = 0, which is also j + 1.
                // str[i] == str[j]: the matching prefix grows by one character,
                // so again next[i + 1] = j + 1.
                next[++i] = ++j;
            } else {
                // str[i] != str[j]: the prefix ending at j cannot be extended,
                // so j jumps to the next candidate position, j = next[j].
                j = next[j];
            }
        }
        return next;
    }
next[j] = the length of the longest proper prefix of the characters before position j that is also a suffix of those characters (with next[0] = -1 by definition).
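As a cross-check of this definition, the next array can also be computed by brute force, directly comparing prefixes and suffixes. This is a slow sketch meant only to illustrate the definition; the class and method names are hypothetical.

```java
// Illustrative sketch: next[] computed straight from its definition (not efficient).
final class NextByDefinition {
    static int[] nextByDefinition(String t) {
        int[] next = new int[t.length()];
        for (int j = 0; j < t.length(); j++) {
            if (j == 0) {
                next[j] = -1; // fixed by definition
                continue;
            }
            next[j] = 0; // default: no prefix of t[0..j-1] equals a suffix of it
            // Try the longest proper prefix first (proper: shorter than j characters).
            for (int len = j - 1; len >= 1; len--) {
                String prefix = t.substring(0, len);
                String suffix = t.substring(j - len, j);
                if (prefix.equals(suffix)) {
                    next[j] = len;
                    break;
                }
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // prints [-1, 0, 0, 0, 1, 2]
        System.out.println(java.util.Arrays.toString(nextByDefinition("abcabx")));
        // prints [-1, 0, 0, 1, 2, 3, 1, 1, 2]
        System.out.println(java.util.Arrays.toString(nextByDefinition("ababaaaba")));
    }
}
```

For "abcabx", next[5] = 2 because the five characters before position 5 are "abcab", whose longest proper prefix that is also a suffix is "ab".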
5.7.1 improvement of KMP pattern matching algorithm
    public class KMP2 {
        public int[] getNextval(String str) {
            int length = str.length();
            int[] nextval = new int[length];
            if (length == 0) {
                return nextval; // an empty string has no nextval values
            }
            int i = 0;   // suffix pointer
            int j = -1;  // prefix pointer
            nextval[0] = -1;
            while (i < length - 1) {
                if (j == -1 || str.charAt(i) == str.charAt(j)) {
                    i++;
                    j++;
                    // Extra check compared with getNext: are the characters at i and j equal?
                    if (str.charAt(i) != str.charAt(j)) {
                        nextval[i] = j; // they differ: keep the prefix length, as in next[]
                    } else {
                        // They are equal: falling back to j would fail again on the same
                        // character, so reuse the fallback position already stored for j.
                        nextval[i] = nextval[j];
                    }
                } else {
                    j = nextval[j];
                }
            }
            return nextval;
        }
    }
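The effect of the improvement can be seen by deriving nextval from an ordinary next array. The sketch below is one way to express that relation, not the method used above; nextvalFromNext and the class name are hypothetical.

```java
// Illustrative sketch: nextval[] derived from next[]; names are hypothetical.
final class NextvalSketch {
    // Standard next array (next[0] = -1).
    static int[] getNext(String t) {
        int[] next = new int[t.length()];
        if (t.isEmpty()) {
            return next;
        }
        int i = 0;
        int j = -1;
        next[0] = -1;
        while (i < t.length() - 1) {
            if (j == -1 || t.charAt(i) == t.charAt(j)) {
                next[++i] = ++j;
            } else {
                j = next[j];
            }
        }
        return next;
    }

    // If t[i] equals the character at its fallback position next[i], falling back
    // there would fail on the same character, so skip ahead to that position's nextval.
    static int[] nextvalFromNext(String t, int[] next) {
        int[] nextval = new int[next.length];
        for (int i = 0; i < next.length; i++) {
            int k = next[i];
            if (k >= 0 && t.charAt(i) == t.charAt(k)) {
                nextval[i] = nextval[k];
            } else {
                nextval[i] = k;
            }
        }
        return nextval;
    }

    public static void main(String[] args) {
        String t = "aaaab";
        int[] next = getNext(t);
        // prints [-1, 0, 1, 2, 3]
        System.out.println(java.util.Arrays.toString(next));
        // prints [-1, -1, -1, -1, 3]
        System.out.println(java.util.Arrays.toString(nextvalFromNext(t, next)));
    }
}
```

With "aaaab", a mismatch on any of the first four characters would make plain next[] retry the same 'a' up to three more times; nextval jumps straight to -1 instead.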
- Core: the partial match table (PMT), stored as an array. The PMT value of a string is the length of the longest element in the intersection of its prefix set and its suffix set, where neither set contains the whole string itself.
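A small sketch may help connect the PMT to the next array. It computes each PMT value literally as a set intersection and then shifts the table right by one position, putting -1 in front, to obtain next. The example pattern and all names here are assumptions made for illustration.

```java
// Illustrative sketch: PMT by set intersection, then next[] as the shifted PMT.
final class PmtSketch {
    // Length of the longest string that is both a proper prefix and a proper suffix of s.
    static int pmtValue(String s) {
        java.util.Set<String> prefixes = new java.util.HashSet<>();
        for (int k = 1; k < s.length(); k++) {
            prefixes.add(s.substring(0, k));
        }
        int best = 0;
        for (int k = 1; k < s.length(); k++) {
            String suffix = s.substring(s.length() - k);
            if (prefixes.contains(suffix)) {
                best = Math.max(best, k);
            }
        }
        return best;
    }

    // pmt[i] is the PMT value of the first i + 1 characters of the pattern.
    static int[] pmt(String pattern) {
        int[] table = new int[pattern.length()];
        for (int i = 0; i < pattern.length(); i++) {
            table[i] = pmtValue(pattern.substring(0, i + 1));
        }
        return table;
    }

    // next[0] = -1 and next[i] = pmt[i - 1] for i >= 1.
    static int[] nextFromPmt(int[] table) {
        int[] next = new int[table.length];
        for (int i = 0; i < table.length; i++) {
            next[i] = (i == 0) ? -1 : table[i - 1];
        }
        return next;
    }

    public static void main(String[] args) {
        int[] table = pmt("abababca");
        // prints [0, 0, 1, 2, 3, 4, 0, 1]
        System.out.println(java.util.Arrays.toString(table));
        // prints [-1, 0, 0, 1, 2, 3, 4, 0]
        System.out.println(java.util.Arrays.toString(nextFromPmt(table)));
    }
}
```

The set-based version is quadratic or worse and is only meant to mirror the definition; in practice the next array is built directly in linear time.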
How to better understand and master KMP algorithm?