How to Convert Data of Int16 Audio Samples to Array of Float Audio Samples

How to convert Data of Int16 audio samples to array of float audio samples

"Casting" or "rebinding" a pointer only changes how memory is
interpreted. You want to compute floating point values from integers, and
the new values have a different memory representation (and also a different
size).

Therefore you somehow have to iterate over all input values
and compute the new values. What you can do is to omit the Array
creation:

let floats = sampleData.withUnsafeBytes { raw in
    // View the bytes as Int16 samples; the pointer is only valid inside this closure.
    raw.bindMemory(to: Int16.self).map { Float($0) / Float(Int16.max) }
}
return floats

Another option would be to use the vDSP functions from the
Accelerate framework:

import Accelerate
// ...

let numSamples = sampleData.count / MemoryLayout<Int16>.size
var factor = Float(Int16.max)
var floats: [Float] = Array(repeating: 0.0, count: numSamples)

// Int16 array to Float array:
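sampleData.withUnsafeBytes { raw in
    // baseAddress is nil for empty data, so bail out in that case.
    guard let source = raw.bindMemory(to: Int16.self).baseAddress else { return }
    vDSP_vflt16(source, 1, &floats, 1, vDSP_Length(numSamples))
}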
// Scaling:
vDSP_vsdiv(&floats, 1, &factor, &floats, 1, vDSP_Length(numSamples))

I don't know if that is faster, you'll have to check.
(Update: It is faster, as ColGraff demonstrated in his answer.)

An explicit loop is also much faster than using map:

let factor = Float(Int16.max)
let numSamples = sampleData.count / MemoryLayout<Int16>.size
var floats: [Float] = Array(repeating: 0.0, count: numSamples)
sampleData.withUnsafeBytes { raw in
    // The Int16 view is only valid inside this closure.
    let samples = raw.bindMemory(to: Int16.self)
    for i in 0..<samples.count {
        floats[i] = Float(samples[i]) / factor
    }
}
return floats

An additional option in your case might be to use CMBlockBufferGetDataPointer() instead of CMBlockBufferCopyDataBytes()
into allocated memory.
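
To make the opening point of this answer concrete, here is a minimal Python sketch (standard library only; the four sample values are made up for illustration) that contrasts reinterpreting the bytes with actually converting the values. It is only an illustration of the idea, not a replacement for the Swift code above.

```python
import struct

# Four made-up Int16 samples, packed as little-endian 16-bit PCM (8 bytes).
raw = struct.pack("<4h", 0, 16384, -16384, 32767)

# Reinterpreting: the same 8 bytes read as two 32-bit floats.
# Both the count and the values are meaningless as audio.
reinterpreted = struct.unpack("<2f", raw)

# Converting: read the integers, then compute a new float for each one.
samples = struct.unpack("<4h", raw)
converted = [s / 32767 for s in samples]
```

The reinterpreted tuple has only two elements because each float consumes four bytes, which is the "different size" mentioned at the start.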

How to convert pcm samples in byte array as floating point numbers in the range -1.0 to 1.0 and back?

You ask two questions:

  1. How to downsample from 22kHz to 8kHz?

  2. How to convert from float [-1,1] to 16-bit int and back?

Note that the question has been updated to indicate that #1 is taken care of elsewhere, but I'll leave that part of my answer here in case it helps someone else.

1. How to downsample from 22kHz to 8kHz?

A commenter hinted that this can be solved with the FFT. This is incorrect. (One step in resampling is filtering. I explain why not to use the FFT for filtering here, in case you are interested: http://blog.bjornroche.com/2012/08/when-to-not-use-fft.html)

One very good way to resample a signal is with a polyphase filter. However, this is quite complex, even for someone experienced in signal processing. You have several other options:

  • use a library that implements high quality resampling, like libsamplerate
  • do something quick and dirty

It sounds like you have already gone with the first approach, which is great.

A quick and dirty solution won't sound as good, but since you are going down to 8 kHz, I'm guessing sound quality isn't your first priority. One quick and dirty option is to:

  • Apply a low pass filter to the signal. Try to get rid of as much audio above 4 kHz as you can. You can use the filters described here (although ideally you want something much steeper than those filters, they are at least better than nothing).
  • Select every 2.75th sample from the original signal to produce the new, resampled signal. When you need a non-integer sample, use linear interpolation. If you need help with linear interpolation, try here.

This technique should be more than good enough for voice applications. However, I haven't tried it, so I don't know for sure, so I strongly recommend using someone else's library.
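
The second step of the quick and dirty approach (picking samples at a fractional stride with linear interpolation) might look roughly like the Python sketch below. The function name `resample_linear` and its parameters are hypothetical, and the sketch assumes the input has already been low-pass filtered; it does no filtering itself.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Pick samples at a fractional stride, interpolating linearly.

    Assumes `samples` was already low-pass filtered below dst_rate / 2.
    """
    if not samples:
        return []
    step = src_rate / dst_rate          # e.g. 22000 / 8000 = 2.75
    last = len(samples) - 1
    n_out = int(last / step) + 1        # number of output samples that fit
    out = []
    for n in range(n_out):
        pos = n * step                  # fractional position in the input
        i = min(int(pos), last)         # sample at or before pos
        frac = pos - i                  # distance past that sample, 0..1
        nxt = samples[min(i + 1, last)] # neighbour, clamped at the end
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```

For example, `resample_linear(signal, 22000, 8000)` reads the input at positions 0, 2.75, 5.5, and so on.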

If you really want to implement your own high quality sample rate conversion, such as a polyphase filter, you should research it, and then ask whatever questions you have on https://dsp.stackexchange.com/, not here.

2. How to convert from float [-1,1] to 16-bit int and back?

This was started by c.fogelklou already, but let me embellish.

To start with, the range of 16 bit integers is -32768 to 32767 (usually 16-bit audio is signed). To convert from int to float you do this:

float f;
int16_t i = ...;  /* int16_t comes from <stdint.h> */
f = ((float) i) / (float) 32768;
if( f > 1 ) f = 1;
if( f < -1 ) f = -1;

You usually do not need that extra "bounding" (in fact, you don't if you really are using a 16-bit integer), but it's there in case you have integers wider than 16 bits for some reason.

To convert back, you do this:

float f = ...;
int16_t i;  /* int16_t comes from <stdint.h> */
f = f * 32768;
if( f > 32767 ) f = 32767;
if( f < -32768 ) f = -32768;
i = (int16_t) f;

In this case, it usually is necessary to watch out for out of range values, especially values greater than 32767. You might complain that this introduces some distortion for f = 1. This issue is hotly debated. For some (incomplete) discussion of this, see this blog post.
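
As a quick check of the two conversions, here is a small Python sketch of the same arithmetic. The function names are hypothetical, and `int()` is used because it truncates toward zero like the C cast.

```python
def int16_to_float(i):
    # Scale a 16-bit sample into [-1.0, 1.0).
    return i / 32768.0

def float_to_int16(f):
    # Scale up, clamp to the 16-bit range, then truncate toward zero.
    scaled = f * 32768.0
    scaled = max(-32768.0, min(32767.0, scaled))
    return int(scaled)
```

With this scaling, every 16-bit value survives an int to float to int round trip unchanged, while a float of exactly 1.0 is clamped to 32767, which is the small distortion at f = 1 mentioned above.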

This is more than "good enough for government work". In other words, it will work fine except in the case where you are concerned about ultimate sound quality. Since you are going to 8kHz, I think we have established that's not the case, so this answer is fine.

However, for completeness, I must add this: if you are trying to keep things absolutely pristine, keep in mind that this conversion introduces distortion. Why? Because the error when converting from float to int is correlated with the signal. It turns out that the correlation of that error is terrible and you can actually hear it, even though it's very small. (Fortunately, it's small enough that for things like speech and low-dynamic-range music it doesn't matter much.) To eliminate this error, you must use something called dither in the conversion from float to int. Again, if that's something you care about, research it and ask relevant, specific questions on https://dsp.stackexchange.com/, not here.
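
For readers who want a feel for what dither looks like in code, the Python sketch below adds triangular (TPDF) noise of about one least significant bit before rounding. TPDF is one common choice and is an assumption here, not something the answer prescribes; the function name is hypothetical.

```python
import random

def float_to_int16_dithered(f, rng=random):
    # TPDF noise: the difference of two uniform values lies in (-1, 1) LSB
    # and has a triangular distribution centred on zero.
    noise = rng.random() - rng.random()
    value = round(f * 32768.0 + noise)
    # Clamp to the 16-bit range after rounding.
    return max(-32768, min(32767, value))
```

The noise makes the rounding error random rather than signal-dependent, so on average the output tracks the input even for values that fall between two integer steps.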

You might also be interested in the slides from my talk on the basics of digital audio programming, which has a slide on this topic, although it basically says the same thing (maybe even less than what I just said): http://blog.bjornroche.com/2011/11/slides-from-fundamentals-of-audio.html

convert numpy int16 audio array to float32

By convention, floating point audio data is normalized to the range of [-1.0,1.0] which you can do by scaling:

audio = audio.astype(np.float32, order='C') / 32768.0

This may fix the problem for you but you need to make sure that soundfile.write writes a wav header that indicates float32. It may do that automatically based on the dtype of the array.

How to play an array of [Int16] audio samples from memory in Swift

Had a quick look at SDL library and its audio capabilities. Seems you can just feed whatever buffer type you want, and it just works:

var desiredSpec = SDL_AudioSpec()
desiredSpec.freq = 48000
desiredSpec.format = SDL_AudioFormat(AUDIO_S16) // Specify Int16 as format
desiredSpec.samples = 1024

var obtainedSpec = SDL_AudioSpec()

SDL_OpenAudio(&desiredSpec, &obtainedSpec)
SDL_QueueAudio(1, samples, Uint32(sampleCount)) // Samples and count from original post
SDL_PauseAudio(0) // Starts playing, virtually no lag!

Would still appreciate any feedback on the original post/question, but in terms of a solution I think this is as good as (or better than) any.

How to convert an array of int16 sound samples to a byte array to use in MonoGame/XNA

This method converts the sample data to a byte array. It works with any channel count (tested on mono and stereo).

public static byte[] GetSamplesWaveData(float[] samples, int samplesCount)
{
    var pcm = new byte[samplesCount * 2];
    int sampleIndex = 0,
        pcmIndex = 0;

    while (sampleIndex < samplesCount)
    {
        var outsample = (short)(samples[sampleIndex] * short.MaxValue);
        pcm[pcmIndex] = (byte)(outsample & 0xff);
        pcm[pcmIndex + 1] = (byte)((outsample >> 8) & 0xff);

        sampleIndex++;
        pcmIndex += 2;
    }

    return pcm;
}

Please note that the float[] samples values are expected to be in the range [-1, 1].
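
The same low-byte-first packing can be sketched in Python with the standard `struct` module. The function name is hypothetical, and the clamp is an addition that the method above does not have: it guards against inputs outside [-1, 1], which would otherwise overflow the 16-bit range.

```python
import struct

def floats_to_pcm16le(samples):
    """Convert floats in [-1, 1] to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))      # clamp out-of-range input
        ints.append(int(s * 32767))     # scale by short.MaxValue
    # "<h" is a little-endian signed 16-bit integer: low byte first.
    return struct.pack("<%dh" % len(ints), *ints)
```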

How to convert 16Bit byte array to audio clip data correctly?

Eventually I did it this way:

public static float[] Convert16BitByteArrayToAudioClipData(byte[] source)
{
    int x = sizeof(Int16);
    int convertedSize = source.Length / x;
    float[] data = new float[convertedSize];
    Int16 maxValue = Int16.MaxValue;

    // The loop header already increments i; an extra ++i in the body
    // would skip every other sample.
    for (int i = 0; i < convertedSize; i++)
    {
        int offset = i * x;
        // BitConverter reads the two bytes in the platform's byte order.
        data[i] = (float)BitConverter.ToInt16(source, offset) / maxValue;
    }

    return data;
}

How to convert []Int16 to []float using C# and NAudio?

The return value from Stream.Read is the count of the number of bytes read, not what you're after. The data you want is in the buffer, but each 32-bit sample is spread across 4 8-bit bytes.

There are a number of ways to get the data as 32-bit float.

The first is to use an ISampleProvider which converts the data into the floating point format and gives a simple way to read the data in that format:

WaveFileReader reader = new WaveFileReader("wavetest.wav");
ISampleProvider provider = new Pcm16BitToSampleProvider(reader);

int blockSize = 2000;
float[] buffer = new float[blockSize];

// Read blocks of samples until no more available
int rc;
while ((rc = provider.Read(buffer, 0, blockSize)) > 0)
{
    // Process the array of samples in here.
    // rc is the number of valid samples in the buffer
    // ....
}

Alternatively, there is a method in WaveFileReader that lets you read floating point samples directly. The downside is that it reads one sample group (that is, one sample for each channel: one for mono, two for stereo) at a time, which can be time consuming. Reading and processing arrays is faster in most cases.

WaveFileReader reader = new WaveFileReader("wavetest.wav");
float[] buffer;

while ((buffer = reader.ReadNextSampleFrame()) != null)
{
    // Process samples in here.
    // buffer contains one sample per channel
    // ....
}

Get audio samples from byte array

This page indicates read() is expecting a char* to store the data in. If you have set up the format of the audio device properly the data will indeed be 'segmented' as shorts in the char array and you can simply cast the char* to a short* before passing it to your library.
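
The idea of viewing a byte buffer as 16-bit samples without copying can be illustrated in Python, where `memoryview.cast` plays roughly the role of the pointer cast. This is only an analogy under the assumption that the buffer holds native-endian signed 16-bit data; the values are made up.

```python
import struct

# A byte buffer as an audio device might fill it: three native-endian shorts.
buf = bytearray(struct.pack("=3h", 100, -200, 32767))

# View the same memory as signed 16-bit integers; no bytes are copied.
shorts = memoryview(buf).cast("h")
values = list(shorts)
```

As with the C cast, this only works if the buffer length is a multiple of two bytes and the byte order matches the machine's.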

Convert int16 array to float

You already have your signed int16 values, so you just have to divide each one by the magnitude of the maximum or minimum int16 value, depending on its sign.

let buffer = new Int16Array([0x0001, 0x7fff, 0x000, 0xffff, 0x8000]);
// [1, 32767, 0, -1, -32768]
let result = new Float32Array(buffer.length);
for(let i=0; i<buffer.length; i++) result[i] = buffer[i] / (buffer[i] >= 0 ? 32767 : 32768);
console.log(result[0], result[1], result[2], result[3], result[4]);
// 0.000030518509447574615 1 0 -0.000030517578125 -1

