BZOJ 4892: [TJOI2017] DNA (hash + binary search)

Problem statement

The Institute of Biology at the University of Garridon has found the gene sequence S that determines whether a person likes to eat lotus root: anyone whose DNA contains this sequence shows the lotus-root-eating trait. The researchers also found that S still expresses the trait after any modification of at most three of its bases. They now want to know where the gene lies on a DNA strand S0. Given the DNA sequence S0 of a person who shows the trait, count how many contiguous substrings of S0 could be the gene, that is, how many contiguous substrings of S0 can be turned into S by changing at most three letters.
Constraints: |S|, |S0| <= 10^5
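
To make the counting rule concrete, here is a minimal brute-force sketch. It assumes that only substitutions are allowed, so every candidate substring has length |S|. The function name countMatchesBrute and the maxMismatch parameter are made up for illustration. The worst case is O(|S0| * |S|), which is too slow for the full limits, but it works as a reference for checking a faster solution on small inputs.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reference helper (not part of the original solution):
// counts the windows of s0 with length |s| that differ from s in at most
// maxMismatch positions.
int countMatchesBrute(const std::string& s0, const std::string& s, int maxMismatch = 3) {
    if (s.empty() || s.size() > s0.size()) return 0;
    int count = 0;
    for (std::size_t start = 0; start + s.size() <= s0.size(); ++start) {
        int mismatches = 0;
        // Stop scanning this window as soon as the budget is exceeded.
        for (std::size_t k = 0; k < s.size() && mismatches <= maxMismatch; ++k) {
            if (s0[start + k] != s[k]) ++mismatches;
        }
        if (mismatches <= maxMismatch) ++count;
    }
    return count;
}
```

For example, with S0 = "ATCGCCCTA" and S = "CTTCA", the windows "CGCCC" and "CCCTA" each differ from S in exactly three positions, so the count is 2.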

Analysis

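Enumerate every position of S0 as a starting point. From there, repeatedly compute the longest common prefix (LCP) of the unmatched remainder of the window and the unmatched remainder of S, jump the pointer past it, and step over the mismatching character. As soon as more than three mismatches would have to be skipped, stop: this starting point cannot match.
A suffix array (SA) could be used to find the LCP, but hashing plus binary search is enough. Each starting point needs at most four LCP queries, so one test case costs O(|S0| log |S|).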
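
The jump idea can be sketched on its own with 0-based, half-open ranges. This is an illustrative sketch rather than the exact code from this post. The names PrefixHash, lcp and countMatches are hypothetical. It assumes uppercase letters, base 27 and natural overflow modulo 2^64, so it is exposed to hash collisions like any single-hash approach.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using ull = unsigned long long;

// Prefix hashes of a string, base 27, arithmetic wraps modulo 2^64.
struct PrefixHash {
    std::vector<ull> h, pw;
    explicit PrefixHash(const std::string& s) : h(s.size() + 1, 0), pw(s.size() + 1, 1) {
        for (std::size_t i = 0; i < s.size(); ++i) {
            h[i + 1] = h[i] * 27 + static_cast<ull>(s[i] - 'A' + 1);
            pw[i + 1] = pw[i] * 27;
        }
    }
    // Hash of the half-open range [l, r).
    ull get(std::size_t l, std::size_t r) const { return h[r] - h[l] * pw[r - l]; }
};

// Length of the longest common prefix of a[i..] and b[j..]:
// binary search for the largest length whose hashes agree.
std::size_t lcp(const PrefixHash& a, std::size_t i, const PrefixHash& b, std::size_t j) {
    std::size_t lo = 0;
    std::size_t hi = std::min(a.h.size() - 1 - i, b.h.size() - 1 - j);
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo + 1) / 2;
        if (a.get(i, i + mid) == b.get(j, j + mid)) lo = mid;
        else hi = mid - 1;
    }
    return lo;
}

// Counts windows of s0 that match s with at most maxMismatch substitutions.
int countMatches(const std::string& s0, const std::string& s, int maxMismatch = 3) {
    if (s.empty() || s.size() > s0.size()) return 0;
    PrefixHash h0(s0), hs(s);
    int count = 0;
    for (std::size_t start = 0; start + s.size() <= s0.size(); ++start) {
        std::size_t pos = 0;  // number of characters of s already handled
        int used = 0;         // mismatches skipped so far
        while (true) {
            pos += lcp(h0, start + pos, hs, pos);      // jump past the matching run
            if (pos >= s.size()) { ++count; break; }   // whole pattern consumed
            if (used == maxMismatch) break;            // a further mismatch is too many
            ++used;
            ++pos;                                     // step over the mismatch
        }
    }
    return count;
}
```

Each starting point runs at most maxMismatch + 1 LCP queries, which is where the logarithmic factor per position comes from.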

Code

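#include<iostream>
#include<cstdio>
#include<cstdlib>
#include<cstring>
#include<algorithm>
using namespace std;

typedef unsigned long long ull;

const int N=100005;

int n,m;
// pw[i] = 27^i; hash1 = prefix hashes of S0, hash2 = prefix hashes of S.
// All arithmetic wraps modulo 2^64 (natural unsigned overflow).
// Named pw rather than pow to avoid a clash with the library function pow.
ull pw[N],hash1[N],hash2[N];
char ch[N];

// Does S0[x..x+len-1] equal S[y..y+len-1] (1-based), judged by hash?
bool check(int x,int y,int len)
{
    return (hash1[x+len-1]-hash1[x-1]*pw[len])==(hash2[y+len-1]-hash2[y-1]*pw[len]);
}

// Returns the last index r of S such that S[y..r] matches S0 starting at x
// (returns y-1 if even the first character differs). Binary search on r.
int get_lcp(int x,int y)
{
    int l=y,r=m;
    while (l<=r)
    {
        int mid=(l+r)/2;
        if (check(x,y,mid-y+1)) l=mid+1;
        else r=mid-1;
    }
    return l-1;
}

int main()
{
    pw[0]=1;
    for (int i=1;i<=100000;i++) pw[i]=pw[i-1]*27;
    int T;scanf("%d",&T); // number of test cases
    while (T--)
    {
        scanf("%s",ch+1);n=strlen(ch+1); // S0, the DNA strand
        for (int i=1;i<=n;i++) hash1[i]=hash1[i-1]*27+ch[i]-'A'+1;
        scanf("%s",ch+1);m=strlen(ch+1); // S, the gene
        for (int i=1;i<=m;i++) hash2[i]=hash2[i-1]*27+ch[i]-'A'+1;
        int ans=0;
        for (int i=1;i+m-1<=n;i++) // candidate window S0[i..i+m-1]
        {
            int len=1; // next position of S that still has to be matched
            for (int j=1;j<=3;j++) // up to three mismatches may be skipped
            {
                // jump past the matching run, then step over the mismatch
                len=get_lcp(i+len-1,len)+2;
                if (len>m) break;
            }
            // after three skips the remainder must match exactly
            if (len<=m) len=get_lcp(i+len-1,len)+1;
            if (len>m) ans++;
        }
        printf("%d\n",ans);
    }
    return 0;
}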

Posted by Trs2988 on Sun, 03 May 2020 14:25:57 -0700