C++ Reading the Data Part of a Wav File

[Figure: WAV file format diagram, taken from a Stanford course]

So you can see that, in this canonical layout, the audio data begins immediately after the headers you have already read, and it is Subchunk2Size bytes long.

The pseudocode for this would be

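ReadRIFF();                               // "RIFF", file size, "WAVE"
ReadFMT();                                // the "fmt " subchunk
int32 chunk2Id = Read32(BigEndian);       // should spell "data"
int32 chunk2Size = Read32(LittleEndian);  // Subchunk2Size: number of audio bytes
for (int i = 0; i < chunk2Size; i++)
{
    audioData[i] = ReadByte();
}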

If the audio is stereo, the data holds two channels interleaved sample by sample (left, then right) rather than two separate streams. If the audio is compressed (mp3, aac, etc.), you'll have to decompress it first.
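In a PCM WAV file the channels are interleaved frame by frame (left sample, then right sample, and so on), so splitting them apart is a simple loop. Here is a rough sketch, assuming the bytes have already been turned into 16-bit samples; the name deinterleave is just for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Split interleaved samples (L R L R ...) into one vector per channel.
// Trailing samples that do not form a complete frame are ignored.
std::vector<std::vector<std::int16_t>> deinterleave(
        const std::vector<std::int16_t>& interleaved, std::size_t channels) {
    std::vector<std::vector<std::int16_t>> out(channels);
    if (channels == 0) return out;
    const std::size_t frames = interleaved.size() / channels;
    for (auto& ch : out) ch.reserve(frames);
    for (std::size_t f = 0; f < frames; ++f)
        for (std::size_t c = 0; c < channels; ++c)
            out[c].push_back(interleaved[f * channels + c]);
    return out;
}
```

For stereo, deinterleave(samples, 2) gives the left channel at index 0 and the right channel at index 1.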

Reading and processing WAV file data in C/C++

My first recommendation would be to use some kind of library to help you out. Most sound solutions seem overkill, so a simple library (like the one recommended in the comment of your question, libsndfile) should do the trick.

If you just want to know how to read WAV files so you can write your own reader (since your school might turn its nose up at having you use a library like any other regular person), a quick Google search will give you all the info you need, plus tutorials from people who have already written about reading the .wav format.

If you still don't get it, here's some of my own code where I read the header and all other chunks of the WAV/RIFF data file until I get to the data chunk. It's based entirely on the WAV Format Specification. Extracting the actual sound data is not very hard: you can either read it raw and use it raw, or convert it to a format you're more comfortable with internally (32-bit PCM uncompressed data or something).

When looking at the code below, replace reader.Read...( ... ) with equivalent fread calls for integer values of the indicated size. WavChunks is an enum holding the little-endian values of the chunk IDs found in a WAV file, and the format variable is one of the WavFormat values, the format types a WAV file can contain:

enum class WavChunks {
    RiffHeader = 0x46464952,       // "RIFF"
    WavRiff = 0x45564157,          // "WAVE"
    Format = 0x20746d66,           // "fmt "
    LabeledText = 0x7478746c,      // "ltxt"
    Instrumentation = 0x74736e69,  // "inst"
    Sample = 0x6C706D73,           // "smpl"
    Fact = 0x74636166,             // "fact"
    Data = 0x61746164,             // "data"
    Junk = 0x4b4e554a,             // "JUNK"
};

enum class WavFormat {
    PulseCodeModulation = 0x01,
    IEEEFloatingPoint = 0x03,
    ALaw = 0x06,
    MuLaw = 0x07,
    IMAADPCM = 0x11,
    YamahaITUG723ADPCM = 0x16,
    GSM610 = 0x31,
    ITUG721ADPCM = 0x40,
    MPEG = 0x50,
    Extensible = 0xFFFE
};
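Each chunk ID is just four ASCII characters, and the enum value is what you get when those four bytes are read as one little-endian 32-bit integer. If you'd rather not type the hex constants by hand, a small constexpr helper can compute them. This is a sketch; the name fourcc is my own choice and not part of any WAV API.

```cpp
#include <cstdint>

// Pack four ASCII characters into the 32-bit value obtained by reading
// those four bytes from the file as a little-endian integer.
constexpr std::uint32_t fourcc(char a, char b, char c, char d) {
    return static_cast<std::uint32_t>(static_cast<unsigned char>(a))
         | static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8
         | static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16
         | static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24;
}

// Compile-time checks against the well-known IDs.
static_assert(fourcc('R', 'I', 'F', 'F') == 0x46464952, "RIFF");
static_assert(fourcc('W', 'A', 'V', 'E') == 0x45564157, "WAVE");
static_assert(fourcc('f', 'm', 't', ' ') == 0x20746d66, "fmt ");
static_assert(fourcc('d', 'a', 't', 'a') == 0x61746164, "data");
```

The first character ends up in the lowest byte, which is why the hex constants look like the ID spelled backwards.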

int32 chunkid = 0;
bool datachunk = false;
while ( !datachunk ) {
    chunkid = reader.ReadInt32( );  // 4-byte chunk ID, read as a little-endian integer
    switch ( (WavChunks)chunkid ) {
    case WavChunks::Format:
        formatsize = reader.ReadInt32( );
        format = (WavFormat)reader.ReadInt16( );
        channels = (Channels)reader.ReadInt16( );
        channelcount = (int)channels;
        samplerate = reader.ReadInt32( );
        bitspersecond = reader.ReadInt32( );  // despite the name, this field is the byte rate
        formatblockalign = reader.ReadInt16( );
        bitdepth = reader.ReadInt16( );
        if ( formatsize > 16 ) {
            // 16 bytes were parsed above; skip the rest (18- and 40-byte fmt chunks exist)
            reader.Seek( formatsize - 16, SeekOrigin::Current );
        }
        break;
    case WavChunks::RiffHeader:
        headerid = chunkid;
        memsize = reader.ReadInt32( );    // size of the rest of the file
        riffstyle = reader.ReadInt32( );  // should equal WavChunks::WavRiff
        break;
    case WavChunks::Data:
        datachunk = true;
        datasize = reader.ReadInt32( );   // number of bytes of sample data that follow
        break;
    default: {
        // Unknown chunk: skip its payload, plus the pad byte if the size is odd
        int32 skipsize = reader.ReadInt32( );
        reader.Seek( skipsize + ( skipsize & 1 ), SeekOrigin::Current );
        break;
    }
    }
}
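If you don't have a reader class like the one used above, the calls can be approximated with fread and fseek. The following is a minimal sketch, not the original reader: FileReader, ReadUnsigned, SeekFromCurrent and the ok flag are names I made up for illustration.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for the `reader` object, built on <cstdio>.
// Values are assembled byte by byte, so the result is correct on any host.
struct FileReader {
    std::FILE* file;
    bool ok;  // cleared when a read or seek fails

    explicit FileReader(std::FILE* f) : file(f), ok(true) {}

    // Read `bytes` bytes (1 to 4) as a little-endian unsigned value.
    std::uint32_t ReadUnsigned(int bytes) {
        unsigned char buf[4] = {0, 0, 0, 0};
        if (std::fread(buf, 1, bytes, file) != static_cast<std::size_t>(bytes))
            ok = false;
        return buf[0] | (buf[1] << 8) | (buf[2] << 16)
             | (static_cast<std::uint32_t>(buf[3]) << 24);
    }
    std::int32_t ReadInt32() { return static_cast<std::int32_t>(ReadUnsigned(4)); }
    std::int16_t ReadInt16() { return static_cast<std::int16_t>(ReadUnsigned(2)); }

    // Rough equivalent of reader.Seek(offset, SeekOrigin::Current).
    void SeekFromCurrent(long offset) {
        if (std::fseek(file, offset, SEEK_CUR) != 0) ok = false;
    }
};
```

Open the file with fopen(path, "rb") and check ok inside the loop, otherwise a file with no data chunk would keep the loop spinning at end of file.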

Correct reading of samples from .wav file

I'm a Java programmer, not C++, but I've dealt with this often.

The PCM data is organized by frame. If it's mono, little-endian, 16-bit, the first byte will be the lower half of the value, and the second byte will be the upper half and include the sign bit. Big-endian will reverse the bytes. If it's stereo, a full frame (left sample, then right) is presented intact before moving on to the next frame.

I'm kind of amazed at all the code being shown. In Java, the following suffices for PCM encoded as signed values:

public short[] fromBufferToPCM(short[] audioPCM, byte[] buffer)
{
    for (int i = 0, n = buffer.length; i + 1 < n; i += 2)
    {
        // Two bytes make one sample, so sample i / 2 comes from bytes i and i + 1.
        audioPCM[i / 2] = (short) ((buffer[i] & 0xff) | (buffer[i + 1] << 8));
    }

    return audioPCM;
}

IDK how to translate that directly to C++, but we are simply OR-ing together the two bytes, with the second one first being shifted 8 places to the left. The shift carries the sign bit of the high byte into the result. The & 0xff is needed because Java bytes are signed: without the mask, a low byte of 0x80 or above would be sign-extended and overwrite the upper bits.
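For what it's worth, one possible C++ version of the same idea looks like this. It's a sketch that assumes 16-bit little-endian signed PCM; the function name bytesToPcm16 is made up. Because the bytes are held as unsigned values here, no & 0xff mask is needed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert raw little-endian 16-bit PCM bytes into signed samples.
// A trailing odd byte, if any, is ignored.
std::vector<std::int16_t> bytesToPcm16(const std::vector<std::uint8_t>& buffer) {
    std::vector<std::int16_t> samples;
    samples.reserve(buffer.size() / 2);
    for (std::size_t i = 0; i + 1 < buffer.size(); i += 2) {
        // Low byte first, then the high byte shifted into the upper 8 bits.
        const std::uint16_t raw =
            static_cast<std::uint16_t>(buffer[i] | (buffer[i + 1] << 8));
        // Reinterpret the 16-bit pattern as a signed value.
        samples.push_back(static_cast<std::int16_t>(raw));
    }
    return samples;
}
```

For stereo input the result is still interleaved, so the samples alternate left, right, left, right.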

Curious why so many answers are in the comments and not posted as answers. I thought comments were for requests to clarify the OP's question.


