Endpoint detection
Steps of the endpoint detection algorithm based on short-time energy and zero-crossing count
- Split the speech signal x(n) into frames.
- Calculate the short-time energy of each frame.
- Calculate the zero-crossing count of each frame.
- Examine the average energy of the speech and set a higher threshold T1 to determine where the speech begins, then set a lower threshold T2 according to the average energy of the background noise to determine the first-level end point of the speech. The second-level decision also sets a threshold T3, based on the average zero-crossing count of the background noise, to judge the unvoiced sound at the front of the speech and the trailing sound at its end.
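The two-level decision in the last step can be sketched roughly as follows. This is only one common reading of the scheme, not necessarily the exact procedure intended here: the toy vectors E and Z, the three threshold values, and the assumption of a single speech segment are all hypothetical.

```matlab
% Toy per-frame features (hypothetical values, not from a real recording)
E = [0.01 0.02 0.30 0.90 1.20 0.80 0.25 0.03 0.01 0.01];  % short-time energy
Z = [5    28   12   10   9    11   14   30   6    5   ];  % zero-crossing count
T1 = 0.5;   % higher energy threshold (assumed value)
T2 = 0.1;   % lower energy threshold (assumed value)
T3 = 20;    % zero-crossing threshold (assumed value)

% Level 1a: frames above T1 are treated as certain speech
above = find(E > T1);
n1 = above(1);
n2 = above(end);

% Level 1b: widen the segment while the energy stays above T2
while n1 > 1 && E(n1-1) > T2
    n1 = n1 - 1;
end
while n2 < numel(E) && E(n2+1) > T2
    n2 = n2 + 1;
end

% Level 2: widen further while the zero-crossing count stays above T3
while n1 > 1 && Z(n1-1) > T3
    n1 = n1 - 1;
end
while n2 < numel(Z) && Z(n2+1) > T3
    n2 = n2 + 1;
end

fprintf('speech spans frames %d to %d\n', n1, n2);
```

With these toy values the energy step alone gives frames 3 to 7, and the zero-crossing step extends the segment by one frame on each side, which is the role the second level plays for weak unvoiced sounds.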
a. Speech signal framing
filedir=[];                             % set up path
filename='D:\matlab\music\zj3.wav';
file=[filedir filename];
[x,Fs]=audioread(file);                 % read the speech signal
wlen=200;                               % frame length
inc=100;                                % frame shift
win=hamming(wlen);                      % Hamming window
N=length(x);                            % signal length
time=(0:N-1)/Fs;                        % time scale of the signal
X=enframe(x,win,inc)';                  % framing; each column is one frame
fn=size(X,2);                           % number of frames
frameTime=frame2time(fn,wlen,inc,Fs);   % time of each frame (this formula needs another look)
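Since enframe and frame2time are not MATLAB built-ins, it can help to see what framing does on a tiny signal. The sketch below is a hand-written illustration, not the toolbox implementation: the toy signal, the rectangular window, the sampling rate, and the choice of the frame centre as the frame time are all assumptions.

```matlab
% Manual framing of a toy signal (all values hypothetical)
x    = (1:10)';        % toy signal, column vector
Fs   = 8000;           % assumed sampling rate in Hz
wlen = 4;              % frame length in samples
inc  = 2;              % frame shift in samples
win  = ones(wlen,1);   % rectangular window so the frames are easy to check

N  = length(x);
fn = floor((N - wlen)/inc) + 1;     % number of complete frames
X  = zeros(wlen, fn);               % one frame per column
for i = 1:fn
    first  = (i-1)*inc + 1;         % index of the first sample of frame i
    X(:,i) = x(first:first+wlen-1) .* win;
end

% One plausible frame time: the centre of each frame, in seconds (assumption)
frameTime = ((0:fn-1)*inc + wlen/2) / Fs;

disp(X);
```

Consecutive columns overlap by wlen - inc samples, which is why a frame shift of 100 with a frame length of 200 gives 50% overlap in the script above.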
b. Short-time energy
% Short-time energy
E=zeros(1,fn);             % initialization
for i=1:fn
    y=X(:,i);              % data of one frame
    b=0;
    for m=1:wlen           % loop over the frame length
        b=b+y(m).^2;
    end
    E(i)=b;
end

%% Simpler version from reference material (not yet fully understood)
fn=size(X,2);              % number of frames
time=(0:N-1)/Fs;           % time scale of the signal
En=zeros(1,fn);            % initialization
for i=1:fn
    u=X(:,i);              % take out one frame
    u2=u.*u;               % square each sample
    En(i)=sum(u2);         % sum over the frame
end
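Both loops above compute the same quantity: the sum of squared samples in each frame. As a quick check, the same result can be written as a single column-wise sum. The toy matrix below is hypothetical and stands in for the framed signal X.

```matlab
% Toy frames, one per column (hypothetical values)
X = [ 1  0;
      2 -1;
     -2  3];

% Short-time energy: sum of squares down each column
E = sum(X.^2, 1);

disp(E);
```

Here the first frame gives 1 + 4 + 4 = 9 and the second gives 0 + 1 + 9 = 10.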
c. Short-time zero-crossing count
% Short-time zero-crossing count
Z=zeros(1,fn);                 % initialization
for i=1:fn
    y=X(:,i);                  % data of one frame
    b=0;
    for m=1:wlen-1             % depends on the frame length
        if y(m)*y(m+1)<0       % adjacent samples have opposite signs
            b=b+1;
        end
    end
    Z(i)=b;                    % store the count once per frame
end

%% Simpler version (not yet fully understood)
fn=size(X,2);                  % number of frames
zcr1=zeros(1,fn);              % initialization
for i=1:fn
    z=X(:,i);                  % one frame of data
    for j=1:(wlen-1)           % look for zero crossings within the frame
        if z(j)*z(j+1)<0       % is this a zero crossing?
            zcr1(i)=zcr1(i)+1; % yes: count it
        end
    end
end
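The test in both loops is the same: two adjacent samples cross zero when their product is negative. A vectorized sketch of that test is shown below on a hypothetical toy matrix standing in for X.

```matlab
% Toy frames, one per column (hypothetical values)
X = [ 1  0.5;
     -1  0.2;
      2 -0.3;
      3 -0.1];

% Multiply each sample by the next one in the same frame;
% a negative product means the signal changed sign between them
signChange = (X(1:end-1,:) .* X(2:end,:)) < 0;

% Zero-crossing count per frame
Z = sum(signChange, 1);

disp(Z);
```

Note that, like the loops above, this does not count a crossing when a sample is exactly zero, because the product is then zero rather than negative.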
d. Setting thresholds based on average energy
Average energy results:
[Figure: short-time energy]
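E contains 818 values, one per frame.
T1 is set to 0.01: a frame whose energy exceeds T1 marks the beginning of speech. Note that the full script at the end of this section compares against 0.1 for both the start and the end.
T2 is set to 0.001: a frame whose energy falls below T2 marks the end of speech.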
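The T1/T2 rule can be sketched with a small state flag, so that only the first frame above T1 and the first later frame below T2 are recorded rather than every frame that satisfies the condition. The energy values below are hypothetical; only T1 and T2 come from the text.

```matlab
% Toy short-time energy (hypothetical values)
E  = [0.0005 0.002 0.02 0.5 0.3 0.004 0.0008 0.0004 0.03 0.2 0.0002];
T1 = 0.01;    % start threshold from the text
T2 = 0.001;   % end threshold from the text

starts   = [];      % frames where speech begins
stops    = [];      % frames where speech ends
inSpeech = false;   % state flag: are we currently inside a speech segment?
for i = 1:numel(E)
    if ~inSpeech && E(i) > T1
        starts(end+1) = i;      % first frame above T1
        inSpeech = true;
    elseif inSpeech && E(i) < T2
        stops(end+1) = i;       % first later frame below T2
        inSpeech = false;
    end
end

disp([starts; stops]);
```

Frame 6 (0.004) is below T1 but not below T2, so the segment is not closed there. That gap between the two thresholds keeps a brief dip in energy from splitting one utterance into two.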
The points are found, but the program has a problem: it should show only the endpoints, not every point that satisfies the threshold.
[Figure: start point found, but end point not found]
After the fix:
[Figure: endpoints found, but something is still wrong]
The endpoints from the zero-crossing rate are still somewhat problematic.
% Endpoint detection using energy and zero-crossing count, version 1
clear; clc;
filedir=[];                             % set up path
filename='D:\matlab\music\zs.wav';
file=[filedir filename];
[x,Fs]=audioread(file);
xmax=max(abs(x));
x=x/xmax;                               % normalization
x=filter([1 -0.98],1,x);                % pre-emphasis
wlen=200;                               % frame length
inc=100;                                % frame shift
win=hamming(wlen);                      % Hamming window
N=length(x);                            % signal length
time=(0:N-1)/Fs;                        % time scale of the signal
X=enframe(x,win,inc)';                  % framing; each column is one frame
% Xmax=max(abs(X));                     % matrix normalization, did not work well
% X=X/Xmax;
fn=size(X,2);                           % number of frames
frameTime=frame2time(fn,wlen,inc,Fs);   % time of each frame (this formula needs another look)

% Short-time energy
E=zeros(1,fn);
for i=1:fn
    y=X(:,i);                           % data of one frame
    b=0;
    for m=1:wlen                        % samples in one frame
        b=b+y(m).^2;
    end
    E(i)=b;
end

% Short-time zero-crossing count
Z=zeros(1,fn);
for i=1:fn
    y=X(:,i);                           % data of one frame
    b=0;
    for m=1:wlen-1
        if y(m)*y(m+1)<0
            b=b+1;
        end
    end
    Z(i)=b;
end

% Short-time energy threshold: find where speech begins and ends
q=[];                                   % start and end frame indices, in pairs
i1=1;
while i1<=fn
    if E(i1)>0.1
        q=[q max(i1-1,1)];              % start: frame before the crossing
        i2=i1+1;
        while i2<=fn && E(i2)>=0.1      % advance until the energy drops
            i2=i2+1;
        end
        q=[q min(i2+1,fn)];             % end: frame after the drop
        i1=i2+1;
    else
        i1=i1+1;
    end
end

% Zero-crossing threshold: same search with thresholds 120 and 50
w=[];                                   % start and end frame indices, in pairs
i1=1;
while i1<=fn
    if Z(i1)>120
        w=[w i1];
        i2=i1+1;
        while i2<=fn && Z(i2)>=50
            i2=i2+1;
        end
        w=[w min(i2+1,fn)];
        i1=i2+1;
    else
        i1=i1+1;
    end
end

% Plotting
subplot(311)
plot(time,x);
title('Original signal')
xlabel('Time'); ylabel('Amplitude');
subplot(312)
plot(frameTime(q),E(q),'or');
hold on
plot(frameTime,E);
title('Short-time energy')
xlabel('Time'); ylabel('Amplitude');
subplot(313)
plot(frameTime(w),Z(w),'or');
hold on
plot(frameTime,Z);
title('Zero-crossing rate')
xlabel('Time'); ylabel('Count');