PCM -> AAC (Encoder) -> PCM (Decoder) in Real Time with Correct Optimization

Android 9 AAC decoder outputs zero samples with ffmpeg-encoded files

I opened an issue on the Android issue tracker:
https://issuetracker.google.com/issues/118398811

For now I have implemented a workaround: when the "encoder-delay" key is present in the MediaFormat object and holds an impossibly high value, I set it to zero. Something like:

// THRESHOLD is an app-chosen upper bound on a plausible encoder delay, in frames.
if (format.containsKey("encoder-delay") && format.getInteger("encoder-delay") > THRESHOLD) {
    format.setInteger("encoder-delay", 0);
}

NB: This means the initial gap will not be trimmed away, but for M4A files that lack this info, that is already the case on devices running Android versions before 9.
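To make the check easy to try outside an Android project, here is a minimal sketch of the same rule in plain Java. A `Map<String, Integer>` stands in for `MediaFormat`, and the class name, the `sanitize` method, and the `THRESHOLD` value of 48000 frames are all assumptions chosen for illustration; pick a cutoff that fits your own content.

```java
import java.util.HashMap;
import java.util.Map;

final class EncoderDelayWorkaround {
    // Hypothetical cutoff: one second of audio at 48 kHz, expressed in frames.
    // Anything larger is treated as a bogus value and reset.
    static final int THRESHOLD = 48_000;

    // The Map is a stand-in for MediaFormat so the sketch runs anywhere.
    static void sanitize(Map<String, Integer> format) {
        Integer delay = format.get("encoder-delay");
        if (delay != null && delay > THRESHOLD) {
            // Same trade-off as in the answer: the initial gap is no longer trimmed.
            format.put("encoder-delay", 0);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> format = new HashMap<>();
        format.put("encoder-delay", 2_000_000_000);
        sanitize(format);
        System.out.println(format.get("encoder-delay"));
    }
}
```

On a device, the same condition would be applied to the real `MediaFormat` before configuring the decoder, as in the snippet above.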

Android MediaCodec, is it ok to have decoded data of double size of the original?

This sounds like you're capturing mono, encoding it, and then the decoder outputs it as a stereo stream with the mono channel duplicated across the left and right channels.

Make sure that when MediaCodec.dequeueOutputBuffer() returns MediaCodec.INFO_OUTPUT_FORMAT_CHANGED, you call MediaCodec.getOutputFormat() to get the current format, and read the channel count and sample rate from it rather than assuming they match the input.
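The doubling follows directly from the byte arithmetic, which the sketch below spells out. It assumes 16-bit interleaved PCM output, and the class and method names are hypothetical. If the reported output format really is stereo with identical channels, `leftChannel` shows one way to get back to mono.

```java
final class PcmChannels {
    // Size in bytes of 16-bit PCM for the given frame and channel counts.
    static int pcmByteCount(int frames, int channelCount) {
        return frames * channelCount * 2;
    }

    // Keeps the left sample of every interleaved stereo frame (L, R, L, R, ...).
    static short[] leftChannel(short[] interleavedStereo) {
        short[] mono = new short[interleavedStereo.length / 2];
        for (int i = 0; i < mono.length; i++) {
            mono[i] = interleavedStereo[2 * i];
        }
        return mono;
    }

    public static void main(String[] args) {
        // One 1024-frame AAC packet: mono in, stereo out.
        System.out.println(pcmByteCount(1024, 1)); // 2048
        System.out.println(pcmByteCount(1024, 2)); // 4096
    }
}
```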

How do I use CoreAudio's AudioConverter to encode AAC in real-time?

AudioConverterFillComplexBuffer does not actually mean "fill the encoder with my input buffer that I have here". It means "fill this output buffer here with encoded data from the encoder". With this perspective, the callback suddenly makes sense -- it is used to fetch source data to satisfy the "fill this output buffer for me" request. Maybe this is obvious to others, but it took me a long time to understand this (and from all the AudioConverter sample code I see floating around where people send input data through inInputDataProcUserData, I'm guessing I'm not the only one).

The AudioConverterFillComplexBuffer call is blocking, and is expecting you to deliver data to it synchronously from the callback. If you are encoding in real time, you will thus need to call FillComplexBuffer on a separate thread that you set up yourself. In the callback, you can then check for available input data, and if it is not available, you need to block on a semaphore. Using an NSCondition, the encoder thread would then look something like this:

- (void)startEncoder
{
    OSStatus creationStatus = AudioConverterNew(&_fromFormat, &_toFormat, &_converter);
    if (creationStatus != noErr) {
        // Converter creation failed; do not start the encoder thread.
        return;
    }

    _running = YES;
    _condition = [[NSCondition alloc] init];
    [self performSelectorInBackground:@selector(_encoderThread) withObject:nil];
}

- (void)_encoderThread
{
    while(_running) {
        // Make quarter-second buffers.
        size_t bufferSize = (_outputBitrate/8) * 0.25;
        NSMutableData *outAudioBuffer = [NSMutableData dataWithLength:bufferSize];
        AudioBufferList outAudioBufferList;
        outAudioBufferList.mNumberBuffers = 1;
        outAudioBufferList.mBuffers[0].mNumberChannels = _toFormat.mChannelsPerFrame;
        outAudioBufferList.mBuffers[0].mDataByteSize = (UInt32)bufferSize;
        outAudioBufferList.mBuffers[0].mData = [outAudioBuffer mutableBytes];

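        // Ask for one packet; on return this holds the number of packets actually produced.
        UInt32 ioOutputDataPacketSize = 1;

        _currentPresentationTime = kCMTimeInvalid; // you need to fill this in during FillComplexBuffer
        const OSStatus conversionResult = AudioConverterFillComplexBuffer(_converter, FillBufferTrampoline, (__bridge void*)self, &ioOutputDataPacketSize, &outAudioBufferList, NULL);
        // Stop on error, or at end of stream (the callback supplied no more input).
        if (conversionResult != noErr || ioOutputDataPacketSize == 0) break;

        // Here I convert the AudioBufferList into a CMSampleBuffer (outSampleBuffer), which I've omitted for brevity.
        // Ping me if you need it.
        [self.delegate encoder:self encodedSampleBuffer:outSampleBuffer];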
    }
}

And the callback could look like this. (Note that I normally use this trampoline to forward immediately to a method on my instance, passing the instance through inUserData. That step is omitted for brevity, which is why instance variables are used directly here.)

static OSStatus FillBufferTrampoline(AudioConverterRef               inAudioConverter,
                                        UInt32*                         ioNumberDataPackets,
                                        AudioBufferList*                ioData,
                                        AudioStreamPacketDescription**  outDataPacketDescription,
                                        void*                           inUserData)
{
    [_condition lock];

    UInt32 countOfPacketsWritten = 0;

    while (true) {
        // If the condition fires and we have shut down the encoder, just pretend like we have written 0 bytes and are done.
        if(!_running) break;

        // Out of input data? Wait on the condition.
        if(_inputBuffer.length == 0) {
            [_condition wait];
            continue;
        }

        // We have data! Fill ioData from your _inputBuffer here and set countOfPacketsWritten accordingly.
        // Also save the input buffer's start presentationTime here.

        // Exit out of the loop, since we're done waiting for data
        break;
    }

    [_condition unlock];

    // Report how many packets were written into ioData.
    // If _running is false, this will be 0, indicating end of stream.
    *ioNumberDataPackets = countOfPacketsWritten;

    return noErr;
}

And for completeness, here's how you would then feed this encoder with data, and how to shut it down properly:

- (void)appendSampleBuffer:(CMSampleBufferRef)sampleBuffer
{
    [_condition lock];
    // Convert sampleBuffer and put it into _inputBuffer here
    [_condition broadcast];
    [_condition unlock];
}

- (void)stopEncoding
{
    [_condition lock];
    _running = NO;
    [_condition broadcast];
    [_condition unlock];
}
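The wait/broadcast handshake is not specific to CoreAudio. Purely as an illustration of the pattern, here is a minimal sketch in Java, where `append` plays the role of appendSampleBuffer, `stop` the role of stopEncoding, and `take` the role of the input callback. The class and method names are hypothetical, and returning `null` stands in for reporting zero packets.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class BlockingInputQueue {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private final ArrayDeque<byte[]> input = new ArrayDeque<>();
    private boolean running = true;

    // Producer side (capture thread): queue a chunk and wake the encoder thread.
    void append(byte[] pcm) {
        lock.lock();
        try {
            input.addLast(pcm);
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Shut down: wake any waiter so it can observe that running is false.
    void stop() {
        lock.lock();
        try {
            running = false;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Consumer side (encoder thread): block until input arrives or stop() is called.
    // Returns null once stopped, mirroring "0 packets means end of stream".
    // awaitUninterruptibly keeps the sketch short; real code may prefer to
    // handle thread interruption explicitly.
    byte[] take() {
        lock.lock();
        try {
            while (running && input.isEmpty()) {
                changed.awaitUninterruptibly();
            }
            return running ? input.pollFirst() : null;
        } finally {
            lock.unlock();
        }
    }
}
```

As in the Objective-C version, the loop re-checks both conditions after every wake-up, so a spurious wake-up or a broadcast from `stop` is handled the same way.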
