HDU-3065 Virus Invasion Continues (the text uses visible ASCII characters; overlapping matches count, so AAA contains AA twice)


Virus Invasion Continues, HDU-3065
Problem Description

Little t thanks you very much for helping him solve his last problem. However, the virus invasion continues. Through constant effort, Little t has found the "root of all evil" on the network: a huge virus website. It hosts a great many viruses, but they are of a strange kind: their signatures are short and contain only "English uppercase characters". Naturally, Little t wants to rid the people of this menace, but he never goes into battle unprepared. Know yourself and your enemy, and you will never lose a battle. The first thing Little t has to do is learn the characteristics of this website: which viruses it contains and how many times each one appears. Can you help him again?

Input

The first line contains an integer N (1<=N<=1000), the number of virus signatures.
Each of the next N lines contains one virus signature, a string of length 1 to 50 that contains only "English uppercase characters". No two virus signatures are identical.
The following line is the source code of the "root of all evil" website, a string of length less than 2,000,000. The characters in the string are visible ASCII characters (the newline is not included).

Output

Output the number of occurrences of each virus, one per line, in the following format. Viruses that do not appear are not output.
Virus signature: number of occurrences
There is a space after the colon. Output the viruses in the order in which their signatures were entered.

Sample Input

3
AA
BB
CC
ooxxCC%dAAAoen...END

Sample Output

AA: 2
CC: 1

Problem summary: Given n pattern strings (virus signatures) and the source code of a website, output how many times each virus appears in the source code, in input order. Viruses that occur 0 times are not output.
Idea: Count the occurrences of each pattern in the text with an Aho-Corasick automaton. While matching the text, every time a word node i is reached, do res[val[i]]++, then follow the last pointers so that shorter signatures ending at the same position are counted too. After matching, output the non-zero counts in input order. Because the text consists of ASCII characters, the second dimension of the trie is set to 128. Occurrences may overlap: in the sample, the consecutive AAA counts as two occurrences of the AA virus.
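To make the overlap rule concrete, here is a minimal brute-force sketch that counts overlapping occurrences of one pattern with std::string::find. The helper name countOverlapping is hypothetical and is not part of the automaton solution in this post. It is far too slow for 1000 signatures against a 2,000,000-character text, but it is handy for cross-checking the automaton on small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only: counts occurrences of `pattern`
// in `text`, allowing overlaps, by resuming the search one character after
// the start of the previous match.
int countOverlapping(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());  // an empty pattern has no meaningful count
    int count = 0;
    std::size_t pos = text.find(pattern);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(pattern, pos + 1);  // pos + 1, not pos + pattern.size()
    }
    return count;
}
```

Calling it on the sample text ooxxCC%dAAAoen...END with the pattern AA finds matches at both A positions of AAA that start a pair, which agrees with the sample output AA: 2.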

#include<iostream>
#include<cstdio>
#include<cstdlib>
#include<string>
#include<cstring>
#include<cmath>
#include<ctime>
#include<algorithm>
#include<utility>
#include<stack>
#include<queue>
#include<vector>
#include<set>
#include<map>
#define E 1e-9
#define INF 0x3f3f3f3f
#define LL long long
const int MOD=10007;
const int N=1000*50+5;//Max trie nodes: at most 1000 signatures x 50 characters, plus the root
using namespace std;
int res[N];//res[i]: number of occurrences of the i-th signature
struct AC_Automata{
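    int tire[N][128];//Trie (dictionary tree): tire[u][c] is the child of node u on character c, 0 if absent
    int val[N];//val[i]: index of the signature that ends at node i, 0 if none
    int fail[N];//Failure (mismatch) pointer
    int last[N];//last[i]=j: j is the nearest word node on i's fail chain, i.e. the longest signature that is a proper suffix of node i's string (0 if none)
    int tot;//Number of nodes allocated so far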

    void init(){//Reset the automaton to a lone root (node 0)
        tot=1;
        val[0]=fail[0]=last[0]=0;
        memset(tire[0],0,sizeof(tire[0]));
    }

    void insert(char *s,int v){//Insert s into the trie; v must be non-zero and is stored in val[] to mark the word node
        int len=strlen(s);
        int root=0;
        for(int i=0;i<len;i++){
            int id=s[i];
            if(tire[root][id]==0){
                tire[root][id]=tot;
                memset(tire[tot],0,sizeof(tire[tot]));
                val[tot++]=0;
            }
            root=tire[root][id];
        }
        val[root]=v;
    }

    void build(){//Compute the fail and last pointers
        queue<int> q;
        last[0]=fail[0]=0;
        for(int i=0;i<128;i++){
            int root=tire[0][i];
            if(root!=0){
                fail[root]=0;
                last[root]=0;
                q.push(root);
            }
        }

        while(!q.empty()){//BFS: compute fail pointers level by level
            int k=q.front();
            q.pop();
            for(int i=0;i<128;i++){
                int u=tire[k][i];
                if(u==0)
                    continue;
                q.push(u);

                int v=fail[k];
                while(v && tire[v][i]==0)
                    v=fail[v];
                fail[u]=tire[v][i];
                last[u]=val[fail[u]]?fail[u]:last[fail[u]];
            }
        }
    }

    void print(int i){//Count the signature at word node i, then recurse along last[] to every shorter signature that is a suffix of it
        if(val[i]){
            res[val[i]]++;
            print(last[i]);
        }
    }

    void query(char *s){//Match the text s against the automaton
        int len=strlen(s);
        int j=0;
        for(int i=0;i<len;i++){
            int id=s[i];
            while(j && tire[j][id]==0)
                j=fail[j];
            j=tire[j][id];
            if(val[j])
                print(j);
            else if(last[j])
                print(last[j]);
        }
    }
}ac;
char P[1000+5][50+5];//Signatures are stored at indices 1..n (n<=1000), each at most 50 characters
char T[2000000+10];
int main(){
    int n;
    while(scanf("%d",&n)!=EOF&&n){
        memset(res,0,sizeof(res));
        ac.init();

        for(int i=1;i<=n;i++){
            scanf("%s",P[i]);
            ac.insert(P[i],i);
        }
        ac.build();

        scanf("%s",T);
        ac.query(T);
        for(int i=1;i<=n;i++)
            if(res[i])
                printf("%s: %d\n",P[i],res[i]);
    }
    return 0;
}

Posted by pakmannen on Wed, 07 Aug 2019 20:43:23 -0700