How to properly normalise an array to plot a spectrogram

I have already obtained a 2D array of amplitudes for each time and frequency. However, to plot a spectrogram, the pixel intensity should follow the amplitude level (e.g. red highest, green lowest).

(I got the amplitude from the FFT as sqrt(real^2 + imag^2).)

First I convert it to a logarithmic scale with 10*log10(amplitude^2). (I'm not sure how to handle the case where the amplitude is 0; no error comes up.)

Then I simply find Max_amp and scale all the elements to values between 0 and 1.
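For reference, the two steps above could be sketched roughly as follows. This is only an illustration, assuming NumPy; the function name `to_normalised_db` and the -80 dB floor are hypothetical choices, not something fixed by the problem. Clamping the power ratio before taking the logarithm is one common way to deal with zero amplitude (in NumPy, `log10(0)` yields `-inf` with a warning rather than raising an error).

```python
import numpy as np

def to_normalised_db(amplitude, floor_db=-80.0):
    """Convert amplitudes to dB relative to the maximum, then scale to [0, 1].

    floor_db is an assumed lower limit; anything quieter (including exact
    zeros) is clamped to it instead of becoming -inf.
    """
    amplitude = np.asarray(amplitude, dtype=float)
    power = amplitude ** 2
    ref = power.max()
    if ref == 0.0:
        # Completely silent input: nothing to scale.
        return np.zeros_like(power)
    # Smallest power ratio we allow, corresponding to floor_db.
    min_ratio = 10.0 ** (floor_db / 10.0)
    db = 10.0 * np.log10(np.maximum(power / ref, min_ratio))
    # db now lies in [floor_db, 0]; shift and scale it to [0, 1].
    return (db - floor_db) / (-floor_db)
```

With this scaling, the loudest cell maps to 1 and anything at or below the floor maps to 0, so the floor effectively sets the dynamic range of the plot.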

The problem is this: when I generate a spectrogram from a noise-free sound, such as a computer-generated sweep from 20 Hz to 20 kHz, I get a nice upward straight line. With an actual song, however, the features of the spectrogram are not distinct enough, which will make things difficult when I apply peak finding in a later phase.

Did I do something wrong in this process?


You may not find single peaks in real songs.

For example, a chord has 3 or more fundamental tones, plus harmonics for each fundamental.

Also, multiple instruments may be playing different tones at different intensities.

Instead of normalizing by the maximum amplitude, normalize by the total power inside the window. If a frequency contains more than x% of that power, you've found a peak.
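A minimal sketch of that idea, assuming NumPy and an amplitude array laid out as (time frames, frequency bins). The function name `peaks_by_power_share` and the default threshold of 5% are hypothetical; the right value of x depends on the material.

```python
import numpy as np

def peaks_by_power_share(amplitude, min_share=0.05):
    """Mark bins that hold more than min_share of their frame's total power.

    amplitude: 2D array, assumed shape (n_frames, n_bins).
    Returns a boolean array of the same shape.
    """
    power = np.asarray(amplitude, dtype=float) ** 2
    # Total power of each time frame, kept as a column for broadcasting.
    total = power.sum(axis=1, keepdims=True)
    # Fraction of the frame's power in each bin; silent frames stay at 0.
    share = np.divide(power, total, out=np.zeros_like(power), where=total > 0)
    return share > min_share
```

Because each frame is normalized by its own power, a quiet passage and a loud passage are judged on the same relative scale.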

If you have close tones, you'll need to deal with spectral leakage. Using an appropriate window and/or bigger FFT may help with discrimination.
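As a rough illustration, assuming NumPy, a Hann window can be applied to each frame before the FFT. The function name `amplitude_spectrum` and its parameters are hypothetical, and Hann is just one reasonable window choice.

```python
import numpy as np

def amplitude_spectrum(frame, use_window=True, n_fft=None):
    """Amplitude spectrum of one real-valued frame.

    use_window: multiply by a Hann window first to reduce spectral leakage.
    n_fft: optional FFT length; if larger than the frame, the frame is
           zero-padded by np.fft.rfft.
    """
    frame = np.asarray(frame, dtype=float)
    if use_window:
        frame = frame * np.hanning(len(frame))
    return np.abs(np.fft.rfft(frame, n=n_fft))
```

Note that zero-padding through `n_fft` only interpolates the spectrum on a finer grid. To actually separate two close tones, the frame itself needs to cover more samples, at the cost of time resolution.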
