Generating Audio Waveforms

This post is more about clarification than it is about implementing some sort of audio waveform algorithm. I've read a myriad of posts concerning the subject (both on SO and out on the web), and here's what I've gathered:

  • In the context of 16-bit WAV, I want to read every two bytes as a short, which will result in a value between -32768 and 32767.
  • With a sample rate of 44.1 kHz, I'll have 44,100 samples for every second of audio.
This is pretty straightforward; however, I have the following questions:

  • A WAV rendered in mono has only one channel, which means two bytes of information per frame; in stereo, this becomes four bytes. In my situation I'm not required to display both channels, so would I simply skip the right channel and read only the left? Some solutions I've read mention combining the left and right channels, though I'm not sure whether that's required.
  • Say I had an audio file that is two seconds long, and another that is thirty seconds long. If I need to grab a minimum of 800 samples to represent the waveform, would grabbing 800 samples spread along the length of the file introduce accuracy issues? E.g. (44,100 * 2) / 800 ≈ 110 frames per drawn sample for the two-second audio file, and (44,100 * 30) / 800 ≈ 1,654 frames per drawn sample for the thirty-second audio file. (See the sketch below for how I'd bucket this.)
An explanation would really be appreciated!
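
To make the plan above concrete, here is a minimal Java sketch of what I have in mind (using javax.sound.sampled, and assuming 16-bit little-endian PCM; the class name, method name, file name, and bucket count are just placeholders, not a definitive implementation):

    import java.io.File;
    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;

    public class WaveformSketch {

        // Reduce a 16-bit PCM WAV to `buckets` peak values (0.0 to 1.0),
        // reading only the left channel of each frame.
        static float[] extractPeaks(File wavFile, int buckets) throws Exception {
            try (AudioInputStream in = AudioSystem.getAudioInputStream(wavFile)) {
                AudioFormat fmt = in.getFormat();
                int frameSize = fmt.getFrameSize();          // 2 bytes (mono) or 4 bytes (stereo)
                long totalFrames = in.getFrameLength();
                long framesPerBucket = Math.max(1, totalFrames / buckets);

                float[] peaks = new float[buckets];
                byte[] frame = new byte[frameSize];
                long frameIndex = 0;
                while (in.read(frame) == frameSize) {
                    // WAV samples are little-endian; the left channel is the first two bytes.
                    short left = (short) ((frame[1] << 8) | (frame[0] & 0xFF));
                    int bucket = (int) Math.min(buckets - 1, frameIndex / framesPerBucket);
                    float amplitude = Math.abs(left) / 32768f;
                    if (amplitude > peaks[bucket]) {
                        peaks[bucket] = amplitude;           // keep the loudest sample per bucket
                    }
                    frameIndex++;
                }
                return peaks;
            }
        }

        public static void main(String[] args) throws Exception {
            // "example.wav" is a hypothetical file name used only for illustration.
            float[] peaks = extractPeaks(new File("example.wav"), 800);
            System.out.println("First bucket peak: " + peaks[0]);
        }
    }

Note that this keeps the peak of each bucket rather than a single sample per bucket, so every sample is still inspected even though only 800 values are drawn.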


    This is outside my area of expertise, but I'll give it a go.

    As far as I can tell, you could probably skip some of the samples and retain reasonable accuracy - if you skip every other sample in a 44.1 kHz file, it would be as if you had recorded the original at 22.05 kHz. However, according to Wikipedia, you run into accuracy issues when your sampling frequency is less than double the frequency of any of the components of the sound you are sampling (the Nyquist rate). Unless you have high-pitched bells and cymbals in your audio, that's probably not much of an issue at 22.05 kHz. But if you're sampling only 800 times per 30 seconds - roughly 27 samples per second - that wouldn't be enough to handle much more than the very lowest note on an organ.

    Imagine you're sampling 800 times per second, and there's a sound at 800 Hz (which is near the G or G# above treble C). Every time you sample, you're going to hit that wave at exactly the same point in its cycle. The place in the wave you happen to sample could be the peak, or it could be a low point - it's impossible for you to know without sampling more often.
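
    A quick Java snippet makes this concrete (a toy illustration of the effect; the class and variable names are just mine): sampling an 800 Hz sine at 800 samples per second lands on the same phase every time, so the sampled "waveform" looks like a flat line.

        public class AliasingDemo {
            public static void main(String[] args) {
                double freq = 800.0, sampleRate = 800.0;   // tone frequency equals the sampling rate
                for (int n = 0; n < 5; n++) {
                    double t = n / sampleRate;             // time of the n-th sample
                    // sin(2*pi*800*t) is ~0.0 at every sample instant, so the wave appears flat
                    System.out.println(Math.sin(2 * Math.PI * freq * t));
                }
            }
        }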

    As far as whether you can sample just one channel, that depends on whether it's OK for you to ignore the other channel. Imagine a stereo file with a voice on the right and music on the left: they're going to have different wave patterns. If it's fine for you to just ignore the music, then you can sample the right channel and ignore the left. If you need both, then you obviously need to sample both.
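
    If you do need both channels, one common approach (a sketch under the same 16-bit little-endian frame layout described in the question; the helper name is my own) is to average the left and right samples into a single mono value per frame:

        // Average the left and right 16-bit samples of one little-endian stereo frame
        // (frame[0..1] = left, frame[2..3] = right) into a single mono value.
        static int mixToMono(byte[] frame) {
            short left  = (short) ((frame[1] << 8) | (frame[0] & 0xFF));
            short right = (short) ((frame[3] << 8) | (frame[2] & 0xFF));
            return (left + right) / 2;   // the average stays within the 16-bit range
        }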
