Writing time sequenced to Android AudioTrack
I am currently writing some code for a sample sequencer in Android. I am using the AudioTrack class. I have been told the only proper way to have accurate timing is to use the timing of the AudioTrack. EG I know that if I write a buffer of X samples to AudioTrack playing at a rate of 44100 samples per second, that the time to write will be (1/44100)X secs.
Then you use that info to know what samples should be written when.
I am trying to implement my first attempt using this approach. I am using only one sample and am writing it as continuous 16th notes at a tempo of 120bpm. But for some reason it is playing at a rate of 240bpm.
First I checked my code to derive the time of a 16th (nanoseconds) note at tempo X. It checks outs.
private void setPeriod()
{
period=(int)((1/(((double)TEMPO)/60))*1000);
period=(period*1000000)/4;
Log.i("test",String.valueOf(period));
}
Then I verified that my code to get the time for my buffer to be played at 44100khz in nanoseconds and it is correct.
long bufferTime=(1000000000/SAMPLE_RATE)*buffSize;
So now I am left thinking that the audio track is playing at a rate that is different from 44100. Maybe 96000khz, which would explain the doubling of speed. But when I instantiate the audioTrack, it was indeed set to 44100khz.
final int SAMPLE_RATE is set to 44100
buffSize = AudioTrack.getMinBufferSize(SAMPLE_RATE, AudioFormat.CHANNEL_OUT_MONO,
AudioFormat.ENCODING_PCM_16BIT);
track = new AudioTrack(AudioManager.STREAM_MUSIC, SAMPLE_RATE,
AudioFormat.CHANNEL_OUT_MONO,
AudioFormat.ENCODING_PCM_16BIT,
buffSize,
AudioTrack.MODE_STREAM);
So I am confused as to why my tempo is being doubled. I ran a debug to compare time elapsed audioTrack to time elapsed system time, and the it seems that the audiotrack is indeed playing twice as fast as it should be. I am confused.
Just to make sure, this is my play loop.
public void run() {
// TODO Auto-generated method stub
int buffSize=192;
byte[] output = new byte[buffSize];
int pos1=0;//index for output array
int pos2=0;//index for sample array
long bufferTime=(1000000000/SAMPLE_RATE)*buffSize;
long elapsed=0;
int writes=0;
currTrigger=trigger[triggerPointer];
Log.i("test","period="+String.valueOf(period));
Log.i("test","bufferTime="+String.valueOf(bufferTime));
long time=System.nanoTime();
while(play)
{
//fill up the buffer
while(pos1<buffSize)
{
output[pos1]=0;
if(currTrigger&&pos2<sample.length)
{
output[pos1]=sample[pos2];
pos2++;
}
pos1++;
}
track.write(output, 0, buffSize);
elapsed=elapsed+bufferTime;
writes++;
//time passed is more than one 16th note
if(elapsed>=period)
{
Log.i("test",String.valueOf(writes));
Log.i("test","elapsed A.T.="+String.valueOf(elapsed)+" elapsed S.T.="+String.valueOf(System.nanoTime()-time));
time=System.nanoTime();
writes=0;
elapsed=0;
triggerPointer++;
if(triggerPointer==16)
triggerPointer=0;
currTrigger=trigger[triggerPointer];
pos2=0;
}
pos1=0;
}
}
}
edited : rephrased and updated due to initial erroneous assumption that system time was used to synchronize sequenced audio :)
As for audio playing back at twice the speed, this is a bit strange as the "write"-method of the AudioTrack is blocking until the native layer has enqueued the next buffer, are you sure the render loop isn't invoked from two different sources (although I assume from your example you invoke the loop from within a thread).
However, what is certain is that there is a time synchronization issue to address: the problem herein lies with the calculation of the buffer time you use in your example:
(1000000000/SAMPLE_RATE)*buffSize;
Which will always return 4353741 at a buffer size of 192 samples at a sample rate of 44100 Hz, thus disregarding any cues in tempo (for instance this will be the same at 300 BPM or 40 BPM), Now, in your example this doesn't have any consequences for the actual syncing per se, but I'd like to point this out as we'll return to it shortly further on in this text.
Also, nanoseconds are a nicely precise unit, but too much as milliseconds will suffice for audio operations. As such, I will continue the illustration in milliseconds.
Your calculation for the period of a 16th note at 120 BPM indeed checks out at the correct value of 125 ms. The previously mentioned calculation for the period corresponding to each buffer size is 4.3537 ms. This indicates you will iterate the buffer loop 28.7112 times before the time of a single sixteenth note passes. In your example however, you check whether the "offset" for this sixteenth note has passed at the END of the buffer iteration loop (where the period for a single buffer has already been added to the elapsed time!), by using:
elapsed>=period
Which will already lead to drift at the first occasion as at this moment "elapsed" would be at (192 * 29 iterations) 5568 samples (or 126.26 ms), rather than at (192 * 28.7112 iterations) 5512 samples (or 126 ms). This is a difference of 56 samples (or when speaking in time : 1.02 ms). This wouldn't of course lead to samples playing back FASTER than expected (as you stated), but already leads to a irregularity in playback. For the second 16th note (which would occur at the 57.4224th iteration, the drift would be 11136 - 11025 = 111 samples or 2.517 ms (more than half your buffer time!) As such, you must perform this check WITHIN the
while(pos1<buffSize)
loop, where you are incrementing the read pointer up until the size of the buffer has been reached. As such you will need to increase another variable by a fraction of the buffer period PER buffer sample.
I hope the above example illustrates why I'd initially proposed counting time by sample iterations rather than elapsed time (of course the samples DO indicate time, as they are merely translations of a unit of time to an amount of samples in a buffer, but you can use these numbers as the markers, rather than adding a fixed interval to a counter as in your render loop).
First of all, some convenience math to help you with getting these values :
// calculate the amount of samples are necessary for storing the given length of time
// ( in milliSeconds ) at the given sample rate ( in Hz )
int millisecondsToSamples( int milliSeconds, int sampleRate )
{
return ( int ) ( milliSeconds * ( sampleRate / 1000 ));
}
OR : These calculations which are more convenient when thinking in a musical context like you mentioned in your post. Calculate the amount of samples that are present in a single bar of music at the given sample rate ( in Hz ), tempo ( in BPM ) and time signature ( timeSigBeatUnit being the "4" and timeSigBeatAmount being the "3" in a time signature of 3/4 - although most sequencers limit themselves to 4/4 I've added the calculation for explaining the logic).
int samplesPerBeat = ( int ) (( sampleRate * 60 ) / tempo );
int samplesPerBar = samplesPerBeat * timeSigBeatAmount;
int samplesPerSixteenth = ( int ) ( samplesPerBeat / 4 ); // 1/4 of a beat being a 16th
etc.
The way you then write the timed samples into the output buffer is by keeping track of the "playback position' in your buffer callback, ie each time you write a buffer, you'll be incrementing the playback position with the length of the buffer. Returning to a musical context: if you were to be "looping a single bar of 120 bpm in 4/4 time", when the playback position would exceed (( sampleRate * 60 ) / 120 * 4 = 88200 samples, you reset it to 0 to "loop" from the beginning.
So let's assume you have two "events" of audio that occur in a sequence of a single bar of 4/4 time at 120 BPM. One event is to play on the 1st beat of a bar and lasts for a quaver (1/8 of a bar) and the other is to play on the 3rd beat of a bar and lasts for another quaver. These two "events" (which you could represent in a value object) would have the following properties, for the first event:
int start = 0; // buffer position 0 is at the 1st beat/start of the bar
int length = 11025; // 1/8 of the full bar size
int end = 11025; // start + length
and the second event:
int start = 44100; // 3rd beat (or half-way through the bar)
int length = 11025;
int end = 55125; // start + length
These value objects could have two additional properties such as "sample" which could be the buffer containing the actual audio and "readPointer" which would hold the last sample-buffer index the sequencer has read from last.
Then in the buffer write loop:
int playbackPosition = 0; // at start of bar
int maximumPlaybackPosition = 88200; // i.e. a single bar of 4/4 at 120 bpm
public void run()
{
// loop through list of "audio events" / samples
for ( CustomValueObject audioEvent : audioEventList )
{
// loop through the buffer length this cycle will write
for ( int i = 0; i < bufferSize; ++i )
{
// calculate "sequence position" from playback position and current iteration
int seqPosition = playbackPosition + i;
// sequence position within start and end range of audio event ?
if ( seqPosition >= audioEvent.start && seqPosition <= audioEvent.end )
{
// YES! write its sample content into the output buffer
output[ i ] += audioEvent.sample[ audioEvent.readPointer ];
// update the sample read pointer to the next slot (but keep in bounds)
if ( ++audioEvent.readPointer == audioEvent.length )
audioEvent.readPointer = 0;
}
}
// update playback position and keep within sequencer range for looping
if ( playbackPosition += bufferSize > maximumPosition )
playbackPosition -= maximumPosition;
}
}
This should give you a perfectly timed approach in writing audio. There's still some magic you have to work out when you're hitting the iteration where the sequence will loop (ie read the remaining unprocessed buffer length from the start of the sample for seamless looping) but I hope this gives you a general idea on a working approach.
链接地址: http://www.djcxy.com/p/5798.html上一篇: 从音频中绘制波形的算法