Problem (rand):
Solution:
For the $10\%$ of the data with $mod=2$, every $a[i]$ must be 1, so output 1 directly.
For another $10\%$ of the data, $n=1$: output $a[1]^m\%mod$. Note that the modulus here is $mod$, not $1e9+7$.
For $50\%$ of the data, we consider DP.
Let $f[i][j]$ be the number of schemes whose result is $j$ after the first $i$ operations.
The transition is easy to derive: $f[i][j*a[k]\%mod]=\sum\limits f[i-1][j]$, where the sum runs over every pair $(j,k)$ that lands on the same index (the dumb DP I wrote in the exam room).
But the $O(n*m*mod)$ complexity is prohibitive and earns no extra points.
To optimize, observe that $mod\le 300$ while $n\le 100000$, which suggests an idea similar to discretization: many of the $a[k]$ must share the same value.
Use an array $c[i]$ to record the number of occurrences of the value $i$.
The transition becomes: $f[i][j*k\%mod]=\sum\limits f[i-1][j]*c[k]$, where $k$ now ranges over values rather than over positions in $a$.
The time complexity becomes $O(m*mod^2)$, and the expected score is 50 pts.
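The bucketed transition above can be sketched as follows. This is only a minimal illustration, not the contest code: the helper name `bucketDP` is made up here, the starting value before any operation is assumed to be 1, and the loops also cover residue 0 for generality.

```cpp
#include <cassert>
#include <vector>

typedef long long ll;
const ll P = 1000000007LL;  // counts are kept modulo 1e9+7

// Hypothetical helper: returns the row f[m][*], where f[i][j] is the number
// of schemes whose result is j after i operations.
// Assumption: the value starts at 1 before any operation.
std::vector<ll> bucketDP(const std::vector<int>& a, int m, int mod) {
    assert(mod >= 2 && m >= 0);
    std::vector<ll> c(mod, 0);
    for (int v : a) ++c[v % mod];          // c[k] = occurrences of value k
    std::vector<ll> f(mod, 0), nf(mod, 0);
    f[1] = 1;                              // zero operations: the value is 1
    for (int step = 0; step < m; ++step) {
        nf.assign(mod, 0);
        for (int j = 0; j < mod; ++j) {
            if (!f[j]) continue;
            for (int k = 0; k < mod; ++k) {
                if (!c[k]) continue;
                int to = j * k % mod;      // f[i][j*k%mod] += f[i-1][j]*c[k]
                nf[to] = (nf[to] + f[j] * c[k]) % P;
            }
        }
        f.swap(nf);                        // only the previous row is needed
    }
    return f;
}
```

For example, with $a=\{2,3\}$, $m=2$ and $mod=5$, the four products are $4,1,1,4$, so the returned row has 2 at index 1 and 2 at index 4.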
For $100\%$ of the data, we need further optimization.
Method 1: fast matrix exponentiation + primitive root optimization (see wd dashen's blog).
Method 2: doubling optimization.
Looking back at the DP formula, we can see that two blocks of operations combine in the same way: $f[i+q][j*k\%mod]=\sum\limits f[i][j]*f[q][k]$.
In particular, taking $q=i$: $f[2*i][j*k\%mod]=\sum\limits f[i][j]*f[i][k]$.
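To make the combination rule concrete, here is a small sketch of merging two rows of the DP. The helper name `combine` is an assumption for illustration only; it takes the row for $i$ operations and the row for $q$ operations and returns the row for $i+q$ operations.

```cpp
#include <cassert>
#include <vector>

typedef long long ll;
const ll P = 1000000007LL;  // counts are kept modulo 1e9+7

// Hypothetical helper: x[j] = schemes reaching j in i operations,
// y[k] = schemes reaching k in q operations.
// Returns z with z[j*k%mod] = sum of x[j]*y[k], i.e. the row for i+q operations.
std::vector<ll> combine(const std::vector<ll>& x, const std::vector<ll>& y, int mod) {
    assert((int)x.size() == mod && (int)y.size() == mod);
    std::vector<ll> z(mod, 0);
    for (int j = 0; j < mod; ++j) {
        if (!x[j]) continue;               // skip unreachable residues
        for (int k = 0; k < mod; ++k) {
            int to = j * k % mod;
            z[to] = (z[to] + x[j] * y[k]) % P;
        }
    }
    return z;
}
```

Calling `combine(f, f, mod)` is exactly the doubling step from $i$ to $2*i$ operations, and each call costs $O(mod^2)$.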
Then we can compute $f[2^1],f[2^2],f[2^3],\cdots,f[2^t]$ (where $2^t\le m<2^{t+1}$) in $O(\log m)$ doubling steps.
So we can borrow the idea of fast exponentiation and split $m$ into its binary digits. Here is the usual fast power template:
    ll qpow(ll x, ll y, ll mod) {
        ll ans = 1;
        while (y) {
            if (y & 1) ans = ans * x % mod;
            x = x * x % mod;
            y >>= 1;
        }
        return ans;
    }
Change y to m, x to f, and ans to g.
The meaning of $f[i][j]$ also changes: it is now the number of schemes whose result is $j$ after $2^i$ operations.
$g[i][j]$ is the number of schemes whose result is $j$ after the operations accumulated so far, that is, the sum of those powers among $2^0,2^1,\cdots,2^i$ whose bits are set in $m$.
So $ f[i][j*k\%mod]=\sum\limits_{j=1}^{mod-1}\sum\limits_{k=1}^{mod-1}f[i-1][j]*f[i-1][k] $
$ g[i][j*k\%mod]=\sum\limits_{j=1}^{mod-1}\sum\limits_{k=1}^{mod-1}g[i-1][j]*f[i][k] $
There's one last problem: memory.
At present the space complexity is $O(\log m*mod)$, but the $f$ and $g$ arrays depend only on the previous state, so rolling arrays clearly apply. The final space complexity is $O(2*mod)$, and the time complexity is $O(mod^2\log m)$.
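Putting the binary split and the rolling idea together, the loop might look like the sketch below. The names `mulConv` and `waysAfter` are hypothetical, and `std::vector` copies stand in for the two-row rolling arrays; it is meant as an illustration of the technique rather than the submitted code.

```cpp
#include <cassert>
#include <vector>

typedef long long ll;
const ll P = 1000000007LL;  // counts are kept modulo 1e9+7

// Hypothetical helper: z[j*k%mod] = sum of x[j]*y[k].
static std::vector<ll> mulConv(const std::vector<ll>& x, const std::vector<ll>& y, int mod) {
    std::vector<ll> z(mod, 0);
    for (int j = 0; j < mod; ++j)
        for (int k = 0; k < mod; ++k) {
            int to = j * k % mod;
            z[to] = (z[to] + x[j] * y[k]) % P;
        }
    return z;
}

// Hypothetical helper: f is the row for a single operation (f[v] = occurrences
// of value v). Returns the row after m operations, starting from the value 1.
std::vector<ll> waysAfter(std::vector<ll> f, ll m, int mod) {
    assert(m >= 0 && mod >= 2 && (int)f.size() == mod);
    std::vector<ll> g(mod, 0);
    g[1] = 1;                                  // plays the role of ans = 1
    while (m) {
        if (m & 1) g = mulConv(g, f, mod);     // ans = ans * x
        f = mulConv(f, f, mod);                // x = x * x: f now covers twice as many operations
        m >>= 1;
    }
    return g;                                  // only one f and one g are alive at a time
}
```

With the single-operation row $\{0,0,1,1,0\}$ (values 2 and 3, $mod=5$) and $m=5$, all $2^5=32$ schemes end on residue 2 or 3, 16 each.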
And then we happily get AC.

    #include <cstdio>
    #include <cstring>
    using namespace std;
    typedef long long ll;

    // Fast integer reader.
    inline ll read() {
        ll aa = 0; int bb = 1; char cc = getchar();
        while (cc < '0' || cc > '9') { if (cc == '-') bb = -1; cc = getchar(); }
        while (cc >= '0' && cc <= '9') { aa = (aa << 1) + (aa << 3) + (cc ^ 48); cc = getchar(); }
        return aa * bb;
    }

    const int p = 1e9 + 7;
    const int N = 1005;

    // Standard fast power: x^y modulo mod.
    ll qpow(ll x, ll y, ll mod) {
        ll ans = 1;
        while (y) {
            if (y & 1) ans = ans * x % mod;
            x = x * x % mod; y >>= 1;
        }
        return ans;
    }

    ll n, mod, m, tmp = 0, cur = 0;
    ll ans, f[2][N], g[2][N];

    // Binary split of y: f plays the role of x, g the role of ans in qpow.
    // The parameter is ll (not int) so that a large m is not truncated.
    inline void split(ll y) {
        while (y) {
            if (y & 1) {  // this bit is set: fold the current f into g
                memset(g[cur ^ 1], 0, sizeof(g[cur ^ 1]));
                for (int i = 1; i < mod; ++i)
                    for (int j = 1; j < mod; ++j)
                        g[cur ^ 1][i * j % mod] = (g[cur ^ 1][i * j % mod] + f[tmp][i] * g[cur][j] % p) % p;
                cur ^= 1;
            }
            y >>= 1;
            // Double the number of operations covered by f.
            memset(f[tmp ^ 1], 0, sizeof(f[tmp ^ 1]));
            for (int i = 1; i < mod; ++i)
                for (int j = 1; j < mod; ++j)
                    f[tmp ^ 1][i * j % mod] = (f[tmp ^ 1][i * j % mod] + f[tmp][i] * f[tmp][j]) % p;
            tmp ^= 1;
        }
    }

    int main() {
        n = read(); m = read(); mod = read();
        if (mod == 2) { puts("1"); return 0; }
        if (n == 1) {
            ll x = read();
            ans = qpow(x, m, mod);  // the modulus here is mod, not 1e9+7
            printf("%lld\n", ans % p);
            return 0;
        }
        for (int i = 1; i <= n; ++i)
            ++f[0][read()];         // f[0][v] = occurrences of value v
        g[0][1] = 1;
        split(m);
        for (int i = 1; i < mod; ++i)  // residues only go up to mod-1
            ans = (ans + g[cur][i] * i % p) % p;
        // Multiply by the modular inverse of n^m (Fermat's little theorem).
        ans = ans * qpow(n, m * (p - 2), p) % p;
        printf("%lld\n", ans);
        return 0;
    }