How do I plot the spectrum of a wav file using FFT?
The signature is
public static void FFT( float[] data, int length, FourierDirection direction )
- You pass an array of complex numbers, represented as pairs. Since you only have real numbers (the samples), put your samples in the even locations of the array (data[0], data[2], data[4] and so on) and set the odd locations to 0 (data[1] = data[3] = ... = 0).
- The length is the number of samples you want to run the FFT on; it should be exactly half the length of the data array. You can FFT the entire WAV or only part of it, depending on what you wish to see. Audacity plots the power spectrum of the selected part of the file; to do the same, pass the entire WAV or the selected part.
- The FFT only shows frequencies up to half of your sampling rate, so your values lie between 0 and half the sampling rate. The number of values depends on the number of samples you pass in, which determines the frequency resolution.
- Audacity plots the power spectrum. Take each complex pair in the array you get back and calculate its absolute value (ABS), defined as sqrt(r^2 + i^2); squaring that magnitude gives the power. Each ABS value corresponds to a single frequency.
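The last two points can be sketched in a few lines. This is a minimal Ruby illustration, not part of the FFT library discussed here: `bin_magnitudes` and `bin_frequency` are hypothetical helper names, the spectrum values are made up, and the array is assumed to hold interleaved (real, imaginary) pairs as described above.

```ruby
# Hypothetical helpers (not part of the FFT library above).

# Magnitude sqrt(re^2 + im^2) of each complex pair in an interleaved
# [re0, im0, re1, im1, ...] array.
def bin_magnitudes(interleaved)
  interleaved.each_slice(2).map { |re, im| Math.hypot(re, im) }
end

# Frequency in Hz of bin k for an n-point transform at the given sample rate.
# For real input only bins 0..n/2 carry distinct information.
def bin_frequency(k, n, sample_rate)
  k * sample_rate.to_f / n
end

spectrum = [4.0, 0.0, 3.0, 4.0, 0.0, 0.0, 3.0, -4.0] # made-up 4-point result
n = spectrum.length / 2                              # number of complex bins
mags = bin_magnitudes(spectrum)
(0..n / 2).each do |k|
  puts format("%8.1f Hz  |X| = %.3f", bin_frequency(k, n, 44_100), mags[k])
end
```

Plotting the magnitudes (or their squares, for power) against these frequencies gives the kind of spectrum view described above.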
Here's a working example:
float[] data = new float[8];
data[0] = 1; data[2] = 1; data[4] = 1; data[6] = 1;
Fourier.FFT(data, data.Length/2, FourierDirection.Forward);
I'm giving it 4 samples, all the same, so I expect to get something only at frequency 0. And indeed, after running it, only data[0] (the real part of the zero-frequency bin) is nonzero.
All the other entries are 0.
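That expectation can be cross-checked without the library by using a direct (slow) DFT. The Ruby sketch below is only an illustration: `naive_dft` is a hypothetical helper, it is not how an FFT library computes the result, and it applies no scaling, so the zero-frequency bin equals the sum of the samples. A library that scales its forward transform would report a proportionally different value.

```ruby
# Direct O(n^2) DFT used purely as a cross-check (hypothetical helper).
# No scaling is applied to the result.
def naive_dft(samples)
  n = samples.length
  (0...n).map do |k|
    (0...n).sum(Complex(0, 0)) do |t|
      samples[t] * Complex.polar(1.0, -2.0 * Math::PI * k * t / n)
    end
  end
end

bins = naive_dft([1.0, 1.0, 1.0, 1.0])
bins.each_with_index { |c, k| puts format("bin %d: |X| = %.3f", k, c.abs) }
```

Only bin 0 comes out with a non-negligible magnitude; the remaining bins are zero up to floating-point rounding.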
If I want to use the Complex array overload:
Complex[] data2 = new Complex[4];
data2[0] = new Complex(1,0);
data2[1] = new Complex(1, 0);
data2[2] = new Complex(1, 0);
data2[3] = new Complex(1, 0);
Fourier.FFT(data2, data2.Length, FourierDirection.Forward);
Please note that here the second parameter equals the length of the array, since each array member is a complex number. I get the same result as before.
I think I missed the complex overload before. It seems less error-prone and more natural to use, unless your data already comes in pairs.
sox convert to spectrogram parameters meaning
The official SoX manual describes the parameters in full, and the source code is in spectrogram.c.
But briefly:
-X num:
X-axis pixels/second; the default is auto-calculated to fit the given
or known audio duration to the X-axis size, or 100 otherwise. If given
in conjunction with -d, this option affects the width of the
spectrogram; otherwise, it affects the duration of the spectrogram.
num can be from 1 (low time resolution) to 5000 (high time resolution)
and need not be an integer.
and
-Y num:
Sets the target total height of the spectrogram(s). The default value is 550
pixels. Using this option (and by default), SoX will
choose a height for individual spectrogram channels that is one more
than a power of two, so the actual total height may fall short of the
given number.
For -X 50, the horizontal time resolution is:
dt = 1000/50 = 20 ms/pixel
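For -Y 200, the largest power of 2 less than 200 is 128, so each channel is drawn with 129 rows spanning 0 Hz to the Nyquist frequency. Assuming a sampling rate of 44.1 kHz (Nyquist frequency 22.05 kHz), the frequency resolution is:
bin_size = 22050/128 = 172.3 Hz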
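The two lookups used in this calculation can be written down as a small Ruby sketch. The helper names are hypothetical and the code only reproduces the arithmetic in the text; it does not call SoX or claim to mirror its internals.

```ruby
# Milliseconds of audio covered by one horizontal pixel for a given -X value.
def ms_per_pixel(pixels_per_second)
  1000.0 / pixels_per_second
end

# Largest power of two strictly below the target height, as used in the text
# when reasoning about -Y.
def largest_power_of_two_below(target)
  power = 1
  power *= 2 while power * 2 < target
  power
end

puts ms_per_pixel(50)                 # time resolution for -X 50
puts largest_power_of_two_below(200)  # power of two used for -Y 200
```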
Any way I can get SoX to just print the amplitude values from a wav file?
If you want the data specifically for use in C++, it's very easy to use something like Libsndfile. It's a pretty mature C library, but comes with a convenient C++ wrapper (sndfile.hh).
Here's example usage lifted from something I wrote recently where I needed easy access to audio data.
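// Headers this fragment needs (the code below sits inside main()):
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>
#include <sndfile.hh>
std::string infile_name = "/path/to/vocal2.wav";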
// Open input file.
SndfileHandle infile_handle( infile_name );
if( !infile_handle || infile_handle.error() != 0 )
{
std::cerr << "Unable to read " << infile_name << std::endl;
std::cerr << infile_handle.strError() << std::endl;
return 1;
}
// Show file stats
int64_t in_frames = infile_handle.frames();
int in_channels = infile_handle.channels();
int in_samplerate = infile_handle.samplerate();
std::cerr << "Input file: " << infile_name << std::endl;
std::cerr << " * Frames : " << std::setw(6) << in_frames << std::endl;
std::cerr << " * Channels : " << std::setw(6) << in_channels << std::endl;
std::cerr << " * Sample Rate : " << std::setw(6) << in_samplerate << std::endl;
// Read audio data as float
std::vector<float> in_data( in_frames * in_channels );
infile_handle.read( in_data.data(), in_data.size() );
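For multichannel files, libsndfile delivers the samples interleaved frame by frame (left, right, left, right, ... for stereo), so the buffer usually has to be split per channel before analysis. The idea is language-independent; here is a minimal Ruby sketch with made-up sample values, where `deinterleave` is a hypothetical helper and the buffer is assumed to contain only whole frames.

```ruby
# Split an interleaved sample buffer into one array per channel.
# Assumes samples.length is a multiple of channels (whole frames only);
# Array#transpose raises IndexError otherwise.
def deinterleave(samples, channels)
  samples.each_slice(channels).to_a.transpose
end

stereo = [0.1, -0.1, 0.2, -0.2, 0.3, -0.3]  # made-up L/R pairs
left, right = deinterleave(stereo, 2)
puts "left:  #{left.inspect}"
puts "right: #{right.inspect}"
```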
If you just want to use SoX on the command line and get text output, you can do something like this:
sox vocal2.wav -t f32 - | od -An -v -t f4 | more
Here I've specified an output of raw 32-bit float and run it through GNU od, with -t f4 telling od to read 4-byte floats. od prints several values per line by default; GNU od's -w option sets the number of bytes per output line (for example, -w4 for one sample per line), or you can clean the output up with other simple tools. Have a look at the manpage for od if you want different sample encodings.
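If the od layout is inconvenient, the raw float stream can also be decoded directly. The Ruby sketch below is an assumption-laden illustration: `decode_f32le` is a hypothetical helper, and it assumes the stream is little-endian 32-bit floats, which is what a raw f32 stream looks like on a little-endian machine. The SoX invocation appears only in a comment and is not executed here.

```ruby
# Decode a raw stream of 32-bit little-endian floats into Ruby Floats.
# Trailing bytes that do not form a whole float are ignored.
def decode_f32le(bytes)
  bytes.unpack("e*")
end

# With a real file this might be fed from SoX, for example (not run here):
#   IO.popen(["sox", "vocal2.wav", "-t", "f32", "-"], "rb") { |io| decode_f32le(io.read) }
raw = [0.0, 0.5, -1.0].pack("e*")  # stand-in for SoX output
decode_f32le(raw).each { |v| puts format("%+.6f", v) }
```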
How do I get an audio file sample rate using sox?
Just use:
soxi <filename>
or
sox --i <filename>
to produce output such as:
Input File : 'final.flac'
Channels : 4
Sample Rate : 44100
Precision : 16-bit
Duration : 00:00:11.48 = 506179 samples = 860.849 CDDA sectors
File Size : 2.44M
Bit Rate : 1.70M
Sample Encoding: 16-bit FLAC
Comment : 'Comment=Processed by SoX'
The latter is for cases such as the win32 version, which doesn't include soxi by default. To grab the sample rate only, just use:
soxi -r <filename>
or
sox --i -r <filename>
which will return the sample rate alone.
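The sample rate also ties together the numbers on the Duration line of the sample output above. A small Ruby sketch of that arithmetic follows; the helper names are hypothetical, and it assumes the sample count is per channel and that a CDDA sector is 1/75 of a second.

```ruby
# Duration in seconds from a per-channel sample count and a sample rate.
def duration_seconds(samples, sample_rate)
  samples.to_f / sample_rate
end

# CDDA sectors, assuming 75 sectors per second of audio.
def cdda_sectors(samples, sample_rate)
  duration_seconds(samples, sample_rate) * 75
end

puts format("%.2f s", duration_seconds(506_179, 44_100))
puts format("%.3f sectors", cdda_sectors(506_179, 44_100))
```

With the values from the sample output (506179 samples at 44100 Hz) this reproduces the 11.48 seconds and 860.849 sectors shown there.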
Extract Fast Fourier Transform data from file
Here's the final solution to what I was trying to achieve, with thanks to Randall Cook for his helpful advice. The code below extracts the sound wave and the FFT of a WAV file in Ruby:
require "ruby-audio"
require "fftw3"

fname = ARGV[0]
window_size = 1024
wave = []
# One independent array per frequency bin.
fft = Array.new(window_size / 2) { [] }

begin
  buf = RubyAudio::Buffer.float(window_size)
  RubyAudio::Sound.open(fname) do |snd|
    # Read the file one window at a time until no samples are left.
    while snd.read(buf) != 0
      wave.concat(buf.to_a)
      na = NArray.to_na(buf.to_a)
      # Keep the first half of the spectrum; for real input the rest mirrors it.
      fft_slice = FFTW3.fft(na).to_a[0, window_size / 2]
      fft_slice.each_with_index { |x, j| fft[j] << x }
    end
  end
rescue => err
  warn "error reading audio file: #{err.message}"
  exit 1
end

# now I can work on analyzing the "fft" and "wave" arrays...
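One Ruby detail matters when collecting per-bin values like this: each bin needs its own array. `Array.new(n, [])` stores the same array object in every slot, whereas the block form creates a fresh array per slot. A minimal demonstration with made-up values:

```ruby
shared   = Array.new(3, [])     # one array object referenced three times
separate = Array.new(3) { [] }  # three independent arrays

shared[0] << :x
separate[0] << :x

p shared    # the appended value shows up in every slot
p separate  # the appended value shows up in slot 0 only
```

With the shared form, every frequency bin would end up holding the values of all bins, so the block form is the one to use for a per-bin history.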