Ruby Mp3 Id3 Parsing

Ruby mp3 Id3 parsing

http://www.hakubi.us/ruby-taglib/

I used this for a project and it worked quite well. It is a wrapper around TagLib, which is very portable.

Read ID3 Tags of Remote MP3 File in Ruby/Rails?

Which Ruby version are you using?

Which ID3 tag version are you trying to read?

ID3v1 tags are at the end of a file, in the last 128 bytes. The header {"Range" => "bytes=128-"} does not fetch them: it asks for everything from byte 128 to the end of the file, so almost the complete file is downloaded. A suffix range, {"Range" => "bytes=-128"}, asks for only the last 128 bytes, provided the server honors Range requests. Either way it is no big loss, because ID3 version 1 is pretty much outdated at this point because of its limitations (fixed-length format, ASCII-only text, ...). iTunes uses ID3 version 2.2.0.
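A minimal sketch of decoding those last 128 bytes once you have them, however they were obtained. It assumes the standard ID3v1 layout (the marker "TAG", then title, artist, album, year, comment and a genre byte). The method name parse_id3v1 is hypothetical, and the sample bytes are built in the script itself, so nothing is downloaded.

```ruby
ID3V1_SIZE = 128

# Hypothetical helper: decode an ID3v1 block from the tail of +bytes+.
# Returns a Hash of fields, or nil when no "TAG" marker is present.
def parse_id3v1(bytes)
  return nil unless bytes && bytes.bytesize >= ID3V1_SIZE

  block = bytes.byteslice(-ID3V1_SIZE, ID3V1_SIZE)
  # 3-byte marker, three 30-byte text fields, 4-byte year,
  # 30-byte comment, 1-byte genre index
  marker, title, artist, album, year, comment, genre =
    block.unpack('a3 Z30 Z30 Z30 Z4 Z30 C')
  return nil unless marker == 'TAG'

  { title: title.strip, artist: artist.strip, album: album.strip,
    year: year.strip, comment: comment.strip, genre: genre }
end

# Build a fake 128-byte tag to try the parser on.
sample = ['TAG', 'Kaguya Hime', 'Juno Reactor', 'Bible of Dreams',
          '1997', '', 12].pack('a3 a30 a30 a30 a4 a30 C')
p parse_id3v1(sample)
```

The text fields are fixed-width and padded, which is why each one is cut at the first null byte and then stripped.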

ID3v2 tags are at the beginning of a file, to support streaming, so you can download just the initial part of the MP3 file, which contains the ID3v2 header, with an HTTP Range request (HTTP 1.1 or later).

The short answer:

require 'net/http'
require 'uri'
require 'id3'    # id3 Ruby library
require 'hexdump'

file_url = 'http://example.com/filename.mp3'
uri = URI(file_url)

size = 1000   # ID3v2 tags can be considerably larger, because of embedded album pictures

Net::HTTP.version_1_2  # make sure we use higher HTTP protocol version than 1.0
http = Net::HTTP.new(uri.host, uri.port)

resp = http.get( uri.request_uri , {'Range' => "bytes=0-#{size - 1}"} )   # byte ranges are inclusive
# should check the response status code here (206 Partial Content for a ranged reply)

if resp.body.start_with?('ID3')   # we most likely only read a small portion of the ID3v2 tag..
   # file has ID3v2 tag

   puts resp.body.hexdump

   tag2 = ID3::Tag2.new
   tag2.read_from_buffer( resp.body )
   @id3_tag_size = tag2.ID3v2tag_size   # that's the size of the whole ID3v2 tag
                                        # we should now re-fetch the tag with the correct / known size
   # ...
end

Example hexdump of such a response:

 index       0 1 2 3  4 5 6 7  8 9 A B  C D E F

00000000    ["49443302"] ["00000000"] ["11015454"] ["3200000d"]    ID3.......TT2...
00000010    ["004b6167"] ["75796120"] ["48696d65"] ["00545031"]    .Kaguya Hime.TP1
00000020    ["00000e00"] ["4a756e6f"] ["20726561"] ["63746f72"]    ....Juno reactor
00000030    ["0054414c"] ["00001100"] ["4269626c"] ["65206f66"]    .TAL....Bible of
00000040    ["20447265"] ["616d7300"] ["54524b00"] ["00050036"]     Dreams.TRK....6
00000050    ["2f390054"] ["59450000"] ["06003139"] ["39370054"]    /9.TYE....1997.T
00000060    ["434f0000"] ["1300456c"] ["65637472"] ["6f6e6963"]    CO....Electronic
00000070    ["612f4461"] ["6e636500"] ["54454e00"] ["000d0069"]    a/Dance.TEN....i
00000080    ["54756e65"] ["73207632"] ["2e300043"] ["4f4d0000"]    Tunes v2.0.COM..
00000090    ["3e00656e"] ["67695475"] ["6e65735f"] ["43444442"]    >.engiTunes_CDDB
000000a0    ["5f494473"] ["00392b36"] ["34374334"] ["36373436"]    _IDs.9+647C46746
000000b0    ["38413234"] ["38313733"] ["41344132"] ["30334544"]    8A248173A4A203ED
000000c0    ["32323034"] ["4341422b"] ["31363333"] ["39390000"]    2204CAB+163399..
000000d0    ["00000000"] ["00000000"] ["00000000"] ["00000000"]    ................
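The size of the whole tag can also be worked out by hand from the first 10 bytes: "ID3", two version bytes, one flags byte, then four "synchsafe" size bytes that carry 7 bits each. A minimal sketch, assuming that standard header layout and ignoring the optional ID3v2.4 footer; the method name id3v2_total_size is hypothetical and is not part of the id3 library.

```ruby
# Hypothetical helper: total ID3v2 tag size (header included) computed
# from the first 10 bytes of a file, or nil if no ID3v2 header is found.
def id3v2_total_size(data)
  header = data.to_s.b
  return nil unless header.bytesize >= 10 && header.start_with?('ID3'.b)

  size_bytes = header.byteslice(6, 4).bytes
  # In a synchsafe integer the top bit of every byte must be zero.
  return nil if size_bytes.any? { |b| b > 0x7f }

  body_size = size_bytes.reduce(0) { |acc, b| (acc << 7) | b }
  body_size + 10   # the size field does not count the 10-byte header
end

# The first 10 bytes of the dump above
header = ['49443302000000001101'].pack('H*')
puts id3v2_total_size(header)
```

For the dump above the size bytes are 00 00 11 01, which gives 2177 bytes of frames, or 2187 bytes including the header.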

The long answer looks something like this: (you'll need id3 library version 1.0.0_pre or newer)

require 'net/http'
require 'uri'
require 'id3'    # id3 Ruby library
require 'hexdump'

file_url = 'http://example.com/filename.mp3'

def get_remote_id3v2_tag( file_url )    # you would call this..
  id3v2tag_size = get_remote_id3v2_tag_size( file_url )
  if id3v2tag_size > 0
    buffer = get_remote_bytes(file_url, id3v2tag_size )
    tag2 = ID3::Tag2.new
    tag2.read_from_buffer( buffer )
    return tag2
  else
    return nil
  end
end

private
def get_remote_id3v2_tag_size( file_url )
  buffer = get_remote_bytes( file_url, 100 )
  if buffer.bytesize > 0
    return buffer.ID3v2tag_size
  else
    return 0
  end
end

private
def get_remote_bytes( file_url, n)
  uri = URI(file_url)
  Net::HTTP.version_1_2  # make sure we use higher HTTP protocol version than 1.0
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')   # otherwise https URLs fail
  resp = http.get( uri.request_uri , {'Range' => "bytes=0-#{n - 1}"} )   # first n bytes
  resp_code = resp.code.to_i
  # 206 means partial content; a plain 200 means the server ignored Range and sent the whole file
  if (resp_code >= 200 && resp_code < 300) then
    return resp.body
  else
    return ''
  end
end

get_remote_id3v2_tag_size( file_url )  
 => 2262

See:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35

http://en.wikipedia.org/wiki/Byte_serving

Some examples of how to download files in parts can be found in the question below. Please note that starting a download "in the middle" only works when the server honors the Range header; otherwise the whole file is sent.

How do I download a binary file over HTTP?

http://unixgods.org/~tilo/Ruby/ID3/docs/index.html

Use ruby mp3info to read mp3 ID3 from external site (without loading whole file)

Well, this solution works for ID3v2 (the current standard). ID3v1 stores its metadata at the end of the file rather than the beginning, so it wouldn't work in those cases.

This reads the first 4096 bytes of the file, which is arbitrary. As far as I could tell from the ID3 documentation, tags can be much larger than that (the header's size field allows up to 256 MB), but 4 KB was when I stopped getting parsing errors in my library.

I was able to build a simple Dropbox audio player, which can be seen here:
soundstash.heroku.com

and open-sourced the code here: github.com/miketucker/Dropbox-Audio-Player

require 'stringio'
require 'net/http'
require 'uri'
require 'mp3info'

url = URI.parse('http://example.com/filename.mp3') # turn the string into a URI
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = (url.scheme == 'https') # needed for https URLs
req = Net::HTTP::Get.new(url.request_uri) # init a request with the path and query
req.range = (0..4095) # limit the load to the first 4096 bytes
res = http.request(req) # load the start of the mp3 file
child = {} # prepare an empty hash to store the metadata we grab
Mp3Info.open( StringIO.open(res.body) ) do |m|  # do the parsing
    child['title'] = m.tag.title
    child['album'] = m.tag.album
    child['artist'] = m.tag.artist
    child['length'] = m.length # based on a partial file, so it may be inaccurate
end

Access More Info of File using Ruby

In general, the answer is no: OS X is using specific libraries to access the metadata of the files based on type. These are not stored in a common attribute manner in the filesystem, but are inherent to the data. For example, PNG and JPG files record their height and width differently and can store different types of metadata about the image. The OS is reading these files and extracting this information for the More Info section.

Specifically, however, the answer is yes: you want an ID3 library for Ruby like taglib-ruby or ruby-taglib. See the question "Ruby mp3 Id3 parsing" for more information.

Read audio file metadata with ruby

Taglib has Ruby bindings and does what you want.

c# MP3 ID3 data (without taglib)

The code you provided only supports ID3v1. That version does not support images.

However, the ID3v2 tag does support images. See section 4.15 of the 2.3 informal standard for an explanation of the "attached image" tag. Note that you'll need to write a full ID3v2 parser if you wish to read out the image without a tag library.

Good luck!

How symbian V3 get id3 info from a MP3 file

You can get the ID3 data from an MP3 using the CMetaDataUtility API. It's not part of the public SDK, but you can download the SDK API Plug-in via this Nokia Developer page that also has example code.

This will work in S60 3rd edition, 5th edition, and Symbian^3.

How to normalise the filename to just the mp3 filename with no path ? Ruby

Preliminary remark: post all the code needed to reproduce the error, and only that, so that we can copy-paste and execute it. An RSpec tag and the version of RSpec would also be useful in this case.

When I execute your code :

   No such file or directory @ dir_chdir - ./spec/fixtures/mp3s
 # ./lib/t_a.rb:14:in `chdir'

the error is in the statement at line 14 :

Dir.chdir(@path)

This gives a clue that chdir does not find the requested subdirectory in the current working directory. Why? Add a trace to display the current working directory:

def files
    puts "in files, path=#{@path}"
    puts "wd=...#{Dir.getwd.sub(/.*ruby(.*)/, '\1')}"
    current_dir = Dir.getwd
    Dir.chdir(@path)
...

and run the tests (I'm working in ...devl/ruby/zintlist/mp3_importer) :

$ rspec

MP3Importer
  #initialize
    accepts a file path to parse mp3 files from
  #files
in files, path=./spec/fixtures/mp3s
wd=.../zintlist/mp3_importer
    loads all the mp3 files in the path directory
  #xxxx
in files, path=./spec/fixtures/mp3s
wd=.../zintlist/mp3_importer/spec/fixtures/mp3s

and you see the difference :

wd=.../zintlist/mp3_importer
wd=.../zintlist/mp3_importer/spec/fixtures/mp3s

When executing files, you have a side effect : the current directory is changed. In the second execution of files, Dir.chdir starts searching in the current directory left by the first execution, that is .../mp3_importer/spec/fixtures/mp3s, and mp3s of course does not contain ./spec/fixtures/mp3s, hence the error No such file or directory.

The solution is to restore the directory which is current when entering the method :

def files
    puts "in files, path=#{@path}"
    puts "wd=...#{Dir.getwd.sub(/.*ruby(.*)/, '\1')}"
    current_dir = Dir.getwd
    Dir.chdir(@path)
    filenames = Dir.glob("*.mp3")
    Dir.chdir(current_dir)
    filenames
end

Then the trace shows that it has been restored :

wd=.../zintlist/mp3_importer
...
wd=.../zintlist/mp3_importer

You may already know that if you process a file inside a File.open ... do ... end block, the file is closed when the block exits. The same works for restoring the current directory. From The Pickaxe Dir.chdir :

If a block is given, it is passed the name of the new current
directory, and the block is executed with that as the current
directory. The original working directory is restored when the block
exits.

Given these files :

#file t.rb

class MP3Importer
    attr_accessor :path

    def initialize(path)
        @path = path
    end

    def files
#        puts "in files, path=#{@path}"
#        puts "wd=#{Dir.getwd.sub(/.*ruby(.*)/, '\1')}"
        filenames = Dir.chdir(@path) do | path |
#            puts path
            Dir.glob("*.mp3")
        end
        puts "names=#{filenames}"
        filenames
    end
end

# file t_spec.rb

require 't'

RSpec.describe MP3Importer do
    let(:test_music_path) { "./spec/fixtures/mp3s" }
    let(:music_importer)  { MP3Importer.new(test_music_path) }

    describe '#initialize' do
        it 'accepts a file path to parse mp3 files from' do
            expect(music_importer.path).to eq(test_music_path)
        end
    end

    describe '#files' do
        it 'loads all the mp3 files in the path directory' do
            expect(music_importer.files.size).to eq(4)
        end
    end

    describe '#xxxx' do
        it 'normalizes the filename to just the mp3 filename with no path' do
            expect(music_importer.files).to include('f4.mp3')
        end
    end
end

Execution :

$ ruby -v
ruby 2.4.0rc1 (2016-12-12 trunk 57064) [x86_64-darwin15]
$ rspec -v
RSpec 3.6.0.beta2
  - rspec-core 3.6.0.beta2
  - rspec-expectations 3.6.0.beta2
  - rspec-mocks 3.6.0.beta2
  - rspec-support 3.6.0.beta2
$ rspec

MP3Importer
  #initialize
    accepts a file path to parse mp3 files from
  #files
names=["f1.mp3", "f2.mp3", "f3.mp3", "f4.mp3"]
    loads all the mp3 files in the path directory
  #xxxx
names=["f1.mp3", "f2.mp3", "f3.mp3", "f4.mp3"]
    normalizes the filename to just the mp3 filename with no path

Finished in 0.00315 seconds (files took 0.09868 seconds to load)
3 examples, 0 failures

All tests are green.

As the method return value is that of the last executed expression, you can simplify files like so :

def files
    Dir.chdir(@path) do | path |
        Dir.glob("*.mp3")
    end
end
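To check the restoring behaviour outside RSpec, here is a minimal, self-contained sketch. It creates a throwaway directory with Dir.mktmpdir, so the file names and the method name mp3_files are made up for the demonstration and are not taken from the original project.

```ruby
require 'tmpdir'
require 'fileutils'

# Same idea as the simplified files method: glob inside a chdir block.
def mp3_files(path)
  Dir.chdir(path) { Dir.glob('*.mp3').sort }
end

Dir.mktmpdir do |dir|
  # Made-up fixture files for the demonstration
  %w[f1.mp3 f2.mp3 notes.txt].each { |name| FileUtils.touch(File.join(dir, name)) }

  before = Dir.getwd
  first  = mp3_files(dir)
  second = mp3_files(dir)   # a second call still works

  puts "first=#{first.inspect} second=#{second.inspect}"
  puts "working directory restored: #{Dir.getwd == before}"
end
```

Because the block form puts the working directory back on exit, calling the method twice with the same relative path no longer fails.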

What does the statement "Normalise ..." mean?

I don't know. I suppose it's collecting only the files whose name corresponds to a certain pattern, here *.mp3.

What I can say is that RDoc takes input file names from the command line and passes them to a routine called normalized_file_list:

# file rdoc.rb
  ##
  # Given a list of files and directories, create a list of all the Ruby
  # files they contain.
  #
  # If +force_doc+ is true we always add the given files, if false, only
  # add files that we guarantee we can parse.  It is true when looking at
  # files given on the command line, false when recursing through
  # subdirectories.
  #
  # The effect of this is that if you want a file with a non-standard
  # extension parsed, you must name it explicitly.

  def normalized_file_list(relative_files, force_doc = false,
                           exclude_pattern = nil)
    file_list = []

    relative_files.each do |rel_file_name|
      next if rel_file_name.end_with? 'created.rid'
      next if exclude_pattern && exclude_pattern =~ rel_file_name
      stat = File.stat rel_file_name rescue next

      case type = stat.ftype
      when "file" then
        next if last_modified = @last_modified[rel_file_name] and
                stat.mtime.to_i <= last_modified.to_i

        if force_doc or RDoc::Parser.can_parse(rel_file_name) then
          file_list << rel_file_name.sub(/^\.\//, '')
          @last_modified[rel_file_name] = stat.mtime
        end
      when "directory" then
        next if rel_file_name == "CVS" || rel_file_name == ".svn"

        created_rid = File.join rel_file_name, "created.rid"
        next if File.file? created_rid

        dot_doc = File.join rel_file_name, RDoc::DOT_DOC_FILENAME

        if File.file? dot_doc then
          file_list << parse_dot_doc_file(rel_file_name, dot_doc)
        else
          file_list << list_files_in_directory(rel_file_name)
        end
      else
        warn "rdoc can't parse the #{type} #{rel_file_name}"
      end
    end

    file_list.flatten
  end

  ##
  # Return a list of the files to be processed in a directory. We know that
  # this directory doesn't have a .document file, so we're looking for real
  # files. However we may well contain subdirectories which must be tested
  # for .document files.

  def list_files_in_directory dir
    files = Dir.glob File.join(dir, "*")

    normalized_file_list files, false, @options.exclude
  end
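Going back to the original question, one plausible reading of "normalise the filename to just the mp3 filename with no path" is to reduce each path to its base name. A minimal sketch of that reading with File.basename, which needs no chdir at all. This is an assumption about what the exercise wants, and the method name mp3_basenames and the fixture names are hypothetical.

```ruby
require 'tmpdir'
require 'fileutils'

# Glob with the directory in the pattern, then strip the directory part.
def mp3_basenames(path)
  Dir.glob(File.join(path, '*.mp3')).map { |file| File.basename(file) }.sort
end

Dir.mktmpdir do |dir|
  # Made-up fixture files for the demonstration
  %w[f1.mp3 f2.mp3 cover.jpg].each { |name| FileUtils.touch(File.join(dir, name)) }
  p mp3_basenames(dir)
end
```

Since the current directory is never changed, there is no side effect to undo between calls.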
