Virus Invasion Continuing HDU - 3065
Problem Description
Little t thanks you very much for helping him solve his last problem. However, the virus invasion continues. Through constant effort, Little t has found the "root of all evil" on the network: a huge virus website. It hosts a great many viruses, but they are peculiar ones: their signatures are short and contain only "English uppercase characters". Little t naturally wants to rid the people of this menace, but he never goes into battle unprepared. Know yourself and your enemy, and you will never lose a battle. So the first thing Little t needs is a profile of this virus website: which viruses it contains and how many times each one appears. Can you help him again?
Input
The first line contains an integer N (1<=N<=1000), the number of virus signatures.
Each of the next N lines contains one virus signature: a string of length 1 to 50 consisting only of "English uppercase characters". No two virus signatures are identical.
The line after that is the source code of the "root of all evil" website, a string of length less than 2,000,000. The characters in the string are visible ASCII characters (no line breaks).
Output
Output the number of occurrences of each virus, one per line, in the following format. Viruses that do not appear are not output.
Virus signature: number of occurrences
There is one space after the colon. Output the viruses in the order in which their signatures were entered.
Sample Input
3
AA
BB
CC
ooxxCC%dAAAoen...END
Sample Output
AA: 2
CC: 1
Problem summary: Given n virus signatures (the patterns) and then the source text of a website, output how many times each virus occurs in the text, in input order. A virus that occurs 0 times is not output.
Idea: Count the number of occurrences of each pattern in the text with an AC automaton. While matching the text, every time a word node i is reached, do res[val[i]]++ and then follow the last links so that every shorter signature ending at the same position is counted as well. Afterwards, output the counts in input order. Since the text may contain any visible ASCII character, the second dimension of the dictionary tree is opened to 128. Overlapping matches count separately: in the sample, the consecutive AAA represents two occurrences of the AA virus.
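The overlap rule is easy to check on small inputs with a naive counter before trusting the automaton. The sketch below is not the AC automaton; it is a brute-force search built on `std::string::find`, and the helper name `countOverlapping` is made up for this illustration. It is far too slow for the full input limits when used for all 1000 signatures, so treat it only as a cross-check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count occurrences of `pattern` in `text`, overlapping matches included.
// Naive search for cross-checking small cases only (hypothetical helper,
// not part of the original solution).
int countOverlapping(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());
    int count = 0;
    std::size_t pos = text.find(pattern);
    while (pos != std::string::npos) {
        ++count;
        // Advance by one character, not by the pattern length,
        // so that "AAA" yields two matches of "AA".
        pos = text.find(pattern, pos + 1);
    }
    return count;
}
```

On the sample text `ooxxCC%dAAAoen...END` this gives 2 for `AA`, 0 for `BB`, and 1 for `CC`, which agrees with the sample output.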
#include <cstdio>
#include <cstring>
#include <queue>
using namespace std;

// At most 1000 signatures of length <= 50, plus the root node.
const int N = 1000 * 50 + 5;

int res[N]; // res[v] = number of occurrences of the signature with index v

struct AC_Automata {
    int tire[N][128]; // dictionary tree (trie), children indexed by ASCII code
    int val[N];       // val[u] = index of the signature ending at node u, 0 if none
    int fail[N];      // mismatch (failure) pointer
    int last[N];      // last[u] = nearest word node on u's fail chain, i.e. the
                      // longest signature that is a proper suffix of u's string
    int tot;          // number of nodes allocated so far

    void init() { // reset the automaton to a single root node 0
        tot = 1;
        val[0] = fail[0] = last[0] = 0;
        memset(tire[0], 0, sizeof(tire[0]));
    }

    // Insert signature s with index v; v must be non-zero to mark a word node.
    void insert(char *s, int v) {
        int len = strlen(s);
        int root = 0;
        for (int i = 0; i < len; i++) {
            int id = s[i];
            if (tire[root][id] == 0) {
                tire[root][id] = tot;
                memset(tire[tot], 0, sizeof(tire[tot]));
                val[tot++] = 0;
            }
            root = tire[root][id];
        }
        val[root] = v;
    }

    void build() { // compute fail and last by BFS over the trie
        queue<int> q;
        last[0] = fail[0] = 0;
        for (int i = 0; i < 128; i++) {
            int root = tire[0][i];
            if (root != 0) {
                fail[root] = 0;
                last[root] = 0;
                q.push(root);
            }
        }
        while (!q.empty()) {
            int k = q.front();
            q.pop();
            for (int i = 0; i < 128; i++) {
                int u = tire[k][i];
                if (u == 0) continue;
                q.push(u);
                int v = fail[k];
                while (v && tire[v][i] == 0) v = fail[v];
                fail[u] = tire[v][i];
                last[u] = val[fail[u]] ? fail[u] : last[fail[u]];
            }
        }
    }

    // Count the word at node i and every shorter signature that is a suffix of it.
    void print(int i) {
        if (val[i]) {
            res[val[i]]++;
            print(last[i]);
        }
    }

    void query(char *s) { // run the text through the automaton
        int len = strlen(s);
        int j = 0;
        for (int i = 0; i < len; i++) {
            int id = s[i]; // assumes 7-bit ASCII input, as the problem states
            while (j && tire[j][id] == 0) j = fail[j];
            j = tire[j][id];
            if (val[j]) print(j);
            else if (last[j]) print(last[j]);
        }
    }
} ac;

char P[1000 + 5][50 + 5]; // signatures are stored 1-based, so index 1000 must exist
char T[2000000 + 10];

int main() {
    int n;
    while (scanf("%d", &n) != EOF && n) {
        memset(res, 0, sizeof(res));
        ac.init();
        for (int i = 1; i <= n; i++) {
            scanf("%s", P[i]);
            ac.insert(P[i], i);
        }
        ac.build();
        scanf("%s", T);
        ac.query(T);
        for (int i = 1; i <= n; i++)
            if (res[i]) printf("%s: %d\n", P[i], res[i]);
    }
    return 0;
}
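The solution above opens the child table to 128 entries per node. Because the signatures themselves contain only uppercase letters, one possible alternative is a 26-wide table in which any other character sends the automaton back to the root, since no signature can span such a character. The sketch below is an assumption-laden variant, not the code above: the names `UpperAC`, `Node`, and `count` are invented for illustration, it precomputes all transitions instead of looping on fail during matching, and it walks plain fail links rather than last links.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical 26-letter variant of the automaton (illustrative sketch).
struct UpperAC {
    struct Node {
        int next[26];
        int fail = 0;
        int id = 0; // 1-based signature index, 0 if no signature ends here
        Node() { for (int& c : next) c = 0; }
    };
    std::vector<Node> t;
    UpperAC() : t(1) {} // node 0 is the root

    void insert(const std::string& s, int id) {
        assert(id > 0);
        int u = 0;
        for (char ch : s) {
            assert(ch >= 'A' && ch <= 'Z');
            int c = ch - 'A';
            if (t[u].next[c] == 0) {
                t[u].next[c] = static_cast<int>(t.size());
                t.emplace_back();
            }
            u = t[u].next[c];
        }
        t[u].id = id;
    }

    // BFS: set fail links and fill missing edges with the fail node's edge.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c)
            if (t[0].next[c]) q.push(t[0].next[c]);
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int v = t[u].next[c];
                if (v) {
                    t[v].fail = t[t[u].fail].next[c];
                    q.push(v);
                } else {
                    t[u].next[c] = t[t[u].fail].next[c];
                }
            }
        }
    }

    // Returns res where res[i] is the occurrence count of signature i (1..n).
    std::vector<int> count(const std::string& text, int n) const {
        std::vector<int> res(n + 1, 0);
        int u = 0;
        for (char ch : text) {
            if (ch < 'A' || ch > 'Z') { u = 0; continue; } // no match can cross this
            u = t[u].next[ch - 'A'];
            for (int v = u; v; v = t[v].fail) // at most 50 steps per character
                if (t[v].id) ++res[t[v].id];
        }
        return res;
    }
};
```

Inserting `AA`, `BB`, `CC` with indices 1 to 3, calling `build()`, and then `count("ooxxCC%dAAAoen...END", 3)` yields counts 2, 0, and 1, matching the sample. The trade-off is a smaller table per node in exchange for walking the fail chain at every uppercase character.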