Short-term average zero-crossing rate test based on MATLAB and Python

1. Requirements analysis

Five basic waveforms are used to test the short-term average zero-crossing rate algorithm. Generate a 5 s waveform with a sampling frequency of 8000 Hz containing: 1. a sine wave with amplitude 0.5 V and frequency 1 kHz; 2. an all-zero waveform; 3. a sine wave with amplitude 0.5 V and frequency 2 kHz; 4. a sine wave with amplitude 0.5 V and frequency 3 kHz; 5. noise.
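As a rough sketch, a test waveform of this kind could be synthesized in Python as follows. The 1 s duration of each segment, the uniform noise, and the names make_test_signal and SEG_SECONDS are assumptions made for illustration; the original post does not show how synthesis.wav was produced.

```python
import numpy as np

FS = 8000          # sampling frequency in Hz (from the text)
SEG_SECONDS = 1.0  # assumption: each of the five segments lasts 1 s

def make_test_signal(fs=FS, seg_seconds=SEG_SECONDS, seed=0):
    """Concatenate the five test segments into one 5 s signal."""
    n = int(fs * seg_seconds)          # samples per segment
    t = np.arange(n) / fs              # time axis of one segment
    rng = np.random.default_rng(seed)  # seeded so the noise is reproducible
    segments = [
        0.5 * np.sin(2 * np.pi * 1000 * t),  # 1 kHz sine, amplitude 0.5
        np.zeros(n),                         # all-zero waveform
        0.5 * np.sin(2 * np.pi * 2000 * t),  # 2 kHz sine, amplitude 0.5
        0.5 * np.sin(2 * np.pi * 3000 * t),  # 3 kHz sine, amplitude 0.5
        rng.uniform(-0.5, 0.5, n),           # assumption: uniform white noise
    ]
    return np.concatenate(segments)

signal_5s = make_test_signal()
print(len(signal_5s) / FS, "seconds")
```

Keeping the signal in memory like this also avoids the WAV read-back step when only the algorithm itself is being checked.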

Note: After the .wav file is generated, reading it back in MATLAB or Python introduces errors, so the all-zero waveform is no longer exactly zero after reading. Therefore, when verifying the short-term average zero-crossing rate algorithm on the all-zero waveform, an all-zero matrix is passed in directly instead.

2. Short-term average zero-crossing rate code

1) MATLAB-based code

Short-term average zero crossing rate program:

zcr=zeros(1,frameNum);
for i=1:frameNum
    frameMat(:,i)=frameMat(:,i)-mean(frameMat(:,i));        % Remove the DC component
    zcr(i)=sum(frameMat(1:end-1,i).*frameMat(2:end,i)<=0);  % Count adjacent-sample products <= 0
end

The complete code is as follows:

[y,fs]=audioread('synthesis.wav');   % On old MATLAB releases without audioread, use wavread
% To verify the all-zero case, bypass the file and use an all-zero matrix directly:
%y=zeros(1,8000);
%fs = 8000;                          % Sampling frequency
frameSize=200;inc=80;                % Frame length and frame shift, in samples
win=hanning(frameSize);              % Hann window
N=length(y);
frameMat=enframe(y, win, inc)';      % Split into frames (enframe is not built in; assumed here to be the VOICEBOX function). "'" transposes so that each column is one frame
frameNum=size(frameMat, 2);          % Number of frames = number of columns of frameMat
zcr=zeros(1,frameNum);
for i=1:frameNum
    frameMat(:,i)=frameMat(:,i)-mean(frameMat(:,i));        % Remove the DC component
    zcr(i)=sum(frameMat(1:end-1,i).*frameMat(2:end,i)<=0);  % Count adjacent-sample products <= 0
end
gll=zcr/frameSize;                   % Zero-crossing rate: count per frame divided by frame length
sampleTime=(1:N)/fs;
frameTime=((0:frameNum-1)*inc+0.5*frameSize)/fs;   % Time at the centre of each frame
subplot(2,1,1); plot(sampleTime, y); ylabel('Amplitude'); title('waveFile');
subplot(2,1,2); plot(frameTime, gll);
xlabel('Time (s)'); ylabel('Zero crossing rate'); title('ZCR');

2) Python-based code

Framing program:


import numpy as np

def enframe(wave_data, nw, inc, winfunc):
    '''Split an audio signal into overlapping, windowed frames.
         Parameters:
         wave_data: original audio signal (1-D array)
         nw: length of each frame in samples (the sampling frequency multiplied by the frame duration)
         inc: shift between the starts of adjacent frames, in samples
         winfunc: window of length nw that is applied to every frame
         Returns a matrix of shape (number of frames, nw).
    '''
    wlen=len(wave_data) #Total signal length
    if wlen<=nw: #If the signal is no longer than one frame, use a single frame
        nf=1
    else: #Otherwise, compute the number of frames
        nf=int(np.ceil((1.0*wlen-nw+inc)/inc))
    pad_length=int((nf-1)*inc+nw) #Total length covered by all frames
    zeros=np.zeros((pad_length-wlen,)) #Zeros that make up the missing length, similar to zero-padding before an FFT
    pad_signal=np.concatenate((wave_data,zeros)) #The padded signal
    indices=np.tile(np.arange(0,nw),(nf,1))+np.tile(np.arange(0,nf*inc,inc),(nw,1)).T  #Sample indices of every frame, as an nf*nw matrix
    indices=np.array(indices,dtype=np.int32) #Make sure the indices are integers
    frames=pad_signal[indices] #Extract the frames
    win=np.tile(winfunc,(nf,1))  #Repeat the window for every frame
    return frames*win   #Return the windowed frame matrix
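To make the frame-count formula in enframe concrete, here is a small worked example for the 5 s, 8000 Hz test signal (40000 samples) with nw=200 and inc=80. The helper name frame_count is hypothetical and only repeats the arithmetic used inside enframe.

```python
import numpy as np

def frame_count(wlen, nw, inc):
    """Number of frames enframe produces for a signal of wlen samples."""
    if wlen <= nw:
        return 1  # a short signal is treated as a single (zero-padded) frame
    return int(np.ceil((wlen - nw + inc) / inc))

wlen, nw, inc = 5 * 8000, 200, 80      # 5 s at 8000 Hz, frame length 200, shift 80
nf = frame_count(wlen, nw, inc)        # ceil(39880 / 80) = ceil(498.5) = 499
pad_length = (nf - 1) * inc + nw       # samples needed to fill the last frame: 40040
print(nf, pad_length, pad_length - wlen)
```

The last frame therefore contains 40 padded zeros, which slightly affects the zero-crossing count of that one frame.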

Short-term average zero crossing rate program:

zcr = np.zeros((frameNum,1))
for i in range(frameNum):
    #To avoid a DC offset, the mean is subtracted from each frame
    X[:,i]=X[:,i]-np.mean(X[:,i])    # Remove the DC component
    zcr[i] = np.sum(X[0:-1,i]*X[1:,i]<=0)    # Count adjacent-sample products <= 0, as in the MATLAB version
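The loop can also be written without explicit iteration. The following is a minimal self-contained sketch, not the author's code: the function name frame_zcr and the inclusive flag are hypothetical, no window is applied, and incomplete trailing frames are dropped instead of zero-padded. The flag switches between counting products <= 0 (as the MATLAB loop does) and counting only strict sign changes (products < 0), which matters for the all-zero waveform.

```python
import numpy as np

def frame_zcr(x, nw=200, inc=80, inclusive=True):
    """Per-frame zero-crossing rate of a 1-D signal with at least nw samples.

    inclusive=True counts adjacent-sample products <= 0;
    inclusive=False counts only strict sign changes (product < 0).
    """
    x = np.asarray(x, dtype=float)
    nf = 1 + (len(x) - nw) // inc                       # number of complete frames
    idx = np.arange(nw)[None, :] + inc * np.arange(nf)[:, None]
    frames = x[idx]                                     # shape (nf, nw)
    frames = frames - frames.mean(axis=1, keepdims=True)  # remove DC per frame
    prod = frames[:, :-1] * frames[:, 1:]               # adjacent-sample products
    crossings = (prod <= 0) if inclusive else (prod < 0)
    return crossings.sum(axis=1) / nw

fs = 8000
t = np.arange(fs) / fs
sine_2k = np.sin(2 * np.pi * 2000 * t + np.pi / 4)  # phase offset avoids samples that are exactly 0
print(frame_zcr(sine_2k)[:3])                        # 99 crossings / 200 samples = 0.495 per frame
print(frame_zcr(np.zeros(1000))[:3])                 # all 199 products are 0, so 199/200 = 0.995
print(frame_zcr(np.zeros(1000), inclusive=False)[:3])  # no strict sign change, so 0
```

The all-zero case shows why the choice between <= and < decides whether a silent frame is reported as a rate near 1 or a rate of 0.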

Complete code:

import numpy as np
import wave
import scipy.signal as signal
import pylab as pl

def enframe(wave_data, nw, inc, winfunc):
    '''Split an audio signal into overlapping, windowed frames.
         Parameters:
         wave_data: original audio signal (1-D array)
         nw: length of each frame in samples (the sampling frequency multiplied by the frame duration)
         inc: shift between the starts of adjacent frames, in samples
         winfunc: window of length nw that is applied to every frame
         Returns a matrix of shape (number of frames, nw).
    '''
    wlen=len(wave_data) #Total signal length
    if wlen<=nw: #If the signal is no longer than one frame, use a single frame
        nf=1
    else: #Otherwise, compute the number of frames
        nf=int(np.ceil((1.0*wlen-nw+inc)/inc))
    pad_length=int((nf-1)*inc+nw) #Total length covered by all frames
    zeros=np.zeros((pad_length-wlen,)) #Zeros that make up the missing length, similar to zero-padding before an FFT
    pad_signal=np.concatenate((wave_data,zeros)) #The padded signal
    indices=np.tile(np.arange(0,nw),(nf,1))+np.tile(np.arange(0,nf*inc,inc),(nw,1)).T  #Sample indices of every frame, as an nf*nw matrix
    indices=np.array(indices,dtype=np.int32) #Make sure the indices are integers
    frames=pad_signal[indices] #Extract the frames
    win=np.tile(winfunc,(nf,1))  #Repeat the window for every frame
    return frames*win   #Return the windowed frame matrix

fw = wave.open(r'F:\【1】Audio\Characteristic parameter.m\synthesis.wav','rb') #Raw string so that the backslashes are not treated as escapes
params = fw.getparams()
print(params)
nchannels, sampwidth, framerate, nframes = params[:4]
str_data = fw.readframes(nframes)
wave_data = np.frombuffer(str_data, dtype=np.int16).astype(np.float64) #np.fromstring is deprecated for binary data; convert to float to avoid int16 overflow
wave_data = wave_data/np.max(np.abs(wave_data)) #Normalize the amplitude
fw.close()
#To verify the all-zero case, bypass the file and use an all-zero array directly:
#wave_data=np.zeros(8000,np.int16)
#framerate = 8000 #Sampling frequency
nw = 200 #Frame length in samples
inc = 80 #Frame shift in samples
winfunc = signal.windows.hann(nw) #Hann window (signal.hann is not available in recent SciPy releases)
X=enframe(wave_data, nw, inc, winfunc).T   #Transpose because enframe returns a matrix of shape (number of frames, frame length)
frameNum =X.shape[1] #Number of columns of the matrix = number of frames
zcr = np.zeros((frameNum,1))
for i in range(frameNum):
    #To avoid a DC offset, the mean is subtracted from each frame
    X[:,i]=X[:,i]-np.mean(X[:,i])    # Remove the DC component
    zcr[i] = np.sum(X[0:-1,i]*X[1:,i]<=0)    # Count adjacent-sample products <= 0, as in the MATLAB version
#print (zcr.max())
time = np.arange(0, len(wave_data)) * (1.0 / framerate) #Time axis of the samples
time2 = np.arange(0, len(zcr)) * (len(wave_data)/len(zcr) / framerate) #Time axis of the frames
pl.subplot(211)
pl.plot(time, wave_data)
pl.ylabel("Amplitude")
pl.subplot(212)
pl.plot(time2, zcr/nw) #Zero-crossing rate: count per frame divided by frame length
pl.ylabel("ZCR")
pl.xlabel("time (seconds)")
#pl.ylim((-1, 2))
pl.show()

3. Implementation results and analysis

1) Implementation results based on MATLAB

a. All-zero waveform verification:

b. 5 s waveform verification:

2) Implementation results based on Python

a. All-zero waveform verification:

b. 5 s waveform verification:

3) Results analysis and verification

Theoretically: The number of samples per second is 8000. A sine wave crosses zero (the horizontal axis) twice per cycle, so a 1 kHz sine wave has 2000 zero crossings per second, a 2 kHz sine wave has 4000, and a 3 kHz sine wave has 6000. The average zero-crossing rates are therefore 2000/8000 = 0.25 for the 1 kHz sine wave, 4000/8000 = 0.5 for the 2 kHz sine wave, and 6000/8000 = 0.75 for the 3 kHz sine wave. For the all-zero waveform, the product of every pair of adjacent samples is 0, which the <=0 test in the MATLAB code counts as a crossing, so the average zero-crossing rate is 100%. The noise is evenly distributed, so its zero-crossing rate should be 50%.
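The arithmetic above can be checked with a short script. This is only an illustrative sketch: the function names are hypothetical, the sine is measured over one second without framing or windowing, and a small phase offset (an assumption) keeps samples from landing exactly on zero.

```python
import numpy as np

FS = 8000  # sampling frequency in Hz (from the text)

def expected_sine_zcr(freq_hz, fs=FS):
    """Two crossings per cycle, freq_hz cycles per second, fs samples per second."""
    return 2.0 * freq_hz / fs

def measured_sine_zcr(freq_hz, fs=FS, seconds=1.0, phase=0.1):
    """Fraction of adjacent-sample pairs with a strict sign change."""
    t = np.arange(int(fs * seconds)) / fs
    x = np.sin(2 * np.pi * freq_hz * t + phase)
    return np.count_nonzero(x[:-1] * x[1:] < 0) / len(x)

expected = {f: expected_sine_zcr(f) for f in (1000, 2000, 3000)}
measured = {f: measured_sine_zcr(f) for f in (1000, 2000, 3000)}
for f in expected:
    print(f, expected[f], round(measured[f], 4))
```

The measured values fall marginally below the theoretical ones because a crossing that lies at the very end of the signal has no following sample to pair with.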

Experimental results: The MATLAB and Python results agree. The average zero-crossing rate is 0.25 for the 1 kHz sine wave, 0.5 for the 2 kHz sine wave, 0.75 for the 3 kHz sine wave, and 1 for the all-zero waveform; for the noise it is about 0.5.

Conclusion: The verification above shows that the short-term average zero-crossing rate algorithm is correct.
