How to Test Whether a String Would Match a Glob in Ruby

How do I test whether a string would match a glob in Ruby?

Use the File.fnmatch method. The pattern comes first, the string to test second:

File.fnmatch("foo*", "food") #=> true
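A few more calls may help show how the optional flags change the result. This is only a sketch of behaviour described in the File.fnmatch documentation; none of it touches the filesystem.

```ruby
# The pattern must match the whole string, not just a prefix.
p File.fnmatch("cat", "cat")        # true
p File.fnmatch("cat", "category")   # false

# "?" matches exactly one character, "*" matches zero or more.
p File.fnmatch("c?t", "cat")        # true
p File.fnmatch("c*t", "c/a/b/t")    # true: "*" crosses "/" by default

# With FNM_PATHNAME, wildcards no longer match "/".
p File.fnmatch("*", "/", File::FNM_PATHNAME)          # false

# A leading dot is not matched by "*" unless FNM_DOTMATCH is set.
p File.fnmatch("*", ".profile")                       # false
p File.fnmatch("*", ".profile", File::FNM_DOTMATCH)   # true
```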

Check if string is a glob pattern

Next, I want to get an array of matches if the string is a glob pattern. If the string is just a plain path, I want an array with a single element: that path.

They're both valid glob patterns. One contains a wildcard, one does not. Run them both through Pathname.glob() and you'll always get an array back. As a bonus, it also tells you whether the pattern matched anything.

$ irb
2.3.3 :001 > require "pathname"
=> true
2.3.3 :002 > Pathname.glob("test.data")
=> [#<Pathname:test.data>]
2.3.3 :003 > Pathname.glob("test.*")
=> [#<Pathname:test.asm>, #<Pathname:test.c>, #<Pathname:test.cpp>, #<Pathname:test.csv>, #<Pathname:test.data>, #<Pathname:test.dSYM>, #<Pathname:test.html>, #<Pathname:test.out>, #<Pathname:test.php>, #<Pathname:test.pl>, #<Pathname:test.py>, #<Pathname:test.rb>, #<Pathname:test.s>, #<Pathname:test.sh>]
2.3.3 :004 > Pathname.glob("doesnotexist")
=> []

This is a great way to normalize and validate your data early, so the rest of the program doesn't have to.
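Here is a minimal sketch of that normalization step. The helper name expand_paths is made up for illustration, and the sample files are created in a temporary directory so the example runs anywhere.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical helper: turn a user-supplied path or glob into a sorted
# array of Pathnames. Returns [] when nothing on disk matches.
def expand_paths(arg)
  Pathname.glob(arg).sort
end

Dir.mktmpdir do |dir|
  # Create a few sample files so the example is self-contained.
  %w[test.c test.rb test.data].each { |name| File.write(File.join(dir, name), "") }

  literal = expand_paths(File.join(dir, "test.data"))
  globbed = expand_paths(File.join(dir, "test.*"))
  missing = expand_paths(File.join(dir, "doesnotexist"))

  puts literal.map { |path| path.basename.to_s }.inspect  # one element
  puts globbed.map { |path| path.basename.to_s }.inspect  # three elements
  puts missing.inspect                                    # empty array
end
```

The rest of the program can then iterate over the result without caring whether the input was a literal path or a pattern.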


If you really want to figure out if something is a literal path or a glob, you could try scanning for any special glob characters, but that rapidly gets complicated and error prone. It requires knowing how glob works in detail and remembering to check for quoting and escaping. foo* is a glob pattern. foo\* is not. foo[123] is. foo\[123] is not. And I'm not sure what foo[123\] is doing; I think it counts as a non-terminated set.
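The escaping cases can be checked with File.fnmatch, which applies the same pattern rules without reading the disk. This is just a sketch; single-quoted strings are used so the backslash reaches fnmatch unchanged.

```ruby
# Single-quoted strings keep the backslash, so the pattern really is foo\*
p File.fnmatch('foo*',  'food')   # true:  "*" is a wildcard
p File.fnmatch('foo\*', 'food')   # false: the escaped "*" is a literal
p File.fnmatch('foo\*', 'foo*')   # true:  matches only a literal asterisk

p File.fnmatch('foo[123]',  'foo1')       # true:  bracket set
p File.fnmatch('foo\[123]', 'foo1')       # false: the escaped "[" is a literal
p File.fnmatch('foo\[123]', 'foo[123]')   # true:  matches the literal text
```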

In general, you want to avoid writing code that has to reproduce the inner workings of another piece of code. If there was a Pathname.has_glob_chars you could use that, but there isn't such a thing.

Pathname.glob uses File.fnmatch to do the globbing and you can use that without touching the filesystem. You might be able to come up with something using that, but I can't make it work. I thought maybe only a literal path will match itself, but foo* defeats that.
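A short sketch of why the "matches itself" idea does not work: the answer depends on which special characters the pattern happens to contain.

```ruby
# A plain path matches itself, as expected.
p File.fnmatch("foo.txt", "foo.txt")     # true

# But so does a wildcard pattern, because "*" also matches a literal "*".
p File.fnmatch("foo*", "foo*")           # true

# A bracket pattern does not match its own text, so the test is inconsistent.
p File.fnmatch("foo[123]", "foo[123]")   # false
```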

Instead, check if it exists.

Pathname.new(path).exist?

If it exists, it was a real path to a real file. If it didn't exist, it might have been a real path, or it might be a glob. That's probably good enough.

You can also check by looking to see if Pathname.glob(path) returned a single element that matches the original path. Note that when matching paths it's important to normalize both sides with cleanpath.

paths = Pathname.glob(path)

if paths.size == 1 && paths[0].cleanpath == Pathname.new(path).cleanpath
  puts "#{path} is a literal path"
elsif paths.size == 0
  puts "#{path} matched nothing"
else
  puts "#{path} was a glob"
end
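To see why cleanpath matters in that comparison, here is a small sketch. Pathname compares paths as written, so two spellings of the same file are not equal until both are cleaned.

```ruby
require "pathname"

a = Pathname.new("./test.data")
b = Pathname.new("test.data")

p a == b                       # false: compared as written
p a.cleanpath == b.cleanpath   # true:  the leading "./" is dropped

# cleanpath works on the string only; it does not look at the disk.
p Pathname.new("dir//sub/../test.data").cleanpath.to_s   # "dir/test.data"
```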

Check if file matching regex, glob, or wildcard pattern exists?

You can use Dir["/pictures/tank_man.*"].
It returns an array of the existing paths that match, or an empty array if there are none.
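If all you need is a yes/no answer, you can test whether that array is empty. A minimal sketch, where glob_exists? is a made-up helper name and the file is created in a temporary directory:

```ruby
require "tmpdir"

# Hypothetical helper: true if at least one existing path matches the glob.
def glob_exists?(pattern)
  !Dir[pattern].empty?
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "tank_man.jpg"), "")

  p glob_exists?(File.join(dir, "tank_man.*"))   # true
  p glob_exists?(File.join(dir, "missing.*"))    # false
end
```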

Is there a way to write a glob pattern that matches all files except those in a folder?

If you really need to use a glob then you can list what is allowed, making it equivalent to the negation:

extglob = "{[^f]*,f,f[^o]*,fo,fo[^o]*,foo?*}/**/*"

File.fnmatch(extglob, "hello/world.js", File::FNM_EXTGLOB | File::FNM_PATHNAME)
#=> true

File.fnmatch(extglob, "test/some/globs", File::FNM_EXTGLOB | File::FNM_PATHNAME)
#=> true

File.fnmatch(extglob, "foo/bar/something.txt", File::FNM_EXTGLOB | File::FNM_PATHNAME)
#=> false

File.fnmatch(extglob, "food/bar/something.txt", File::FNM_EXTGLOB | File::FNM_PATHNAME)
#=> true

{[^f]*,f,f[^o]*,fo,fo[^o]*,foo?*} means:

  • Any string that doesn't start with f
  • The string f
  • Any string that starts with f and whose second character is not an o
  • The string fo
  • Any string that starts with fo and whose third character is not an o
  • Any string that starts with foo if it has at least one more character


Update

If your string literal is too long it could become a pain to generate a glob that negates it, so why not make a function for it?

def extglob_neg(str)
  str.each_char.with_index.with_object([]) do |(_, i), arr|
    arr << "#{str[0, i]}[^#{str[i]}]*"
    arr << str[0..i]
  end.join(',').prepend('{').concat('?*}')
end

extglob_neg "Food"
#=> "{[^F]*,F,F[^o]*,Fo,Fo[^o]*,Foo,Foo[^d]*,Food?*}"

Note: I didn't implement any glob escaping in this function because it seemed a little complicated. I may be wrong, though.
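To tie the generator back to the matching step, here is a sketch that builds the pattern for "foo" and runs the same checks as earlier. The function is repeated so the example runs on its own; it still assumes the folder name contains no glob special characters.

```ruby
# Same function as above, reproduced so the example is self-contained.
def extglob_neg(str)
  str.each_char.with_index.with_object([]) do |(_, i), arr|
    arr << "#{str[0, i]}[^#{str[i]}]*"
    arr << str[0..i]
  end.join(",").prepend("{").concat("?*}")
end

flags   = File::FNM_EXTGLOB | File::FNM_PATHNAME
pattern = "#{extglob_neg('foo')}/**/*"

p pattern                                                  # "{[^f]*,f,f[^o]*,fo,fo[^o]*,foo?*}/**/*"
p File.fnmatch(pattern, "hello/world.js", flags)           # true
p File.fnmatch(pattern, "foo/bar/something.txt", flags)    # false
p File.fnmatch(pattern, "food/bar/something.txt", flags)   # true
```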

Check if there is any file or directory matching pattern using ruby

Ruby-only

You could use Find, find and find :D.

I couldn't find any other File/Dir method that returns an Enumerator.

require 'find'
Find.find("/var/data/").find{|f| f=~/\.xml$/i }
#=> first xml file found inside "/var/data". nil otherwise
# or
Find.find("/var/data/").find{|f| File.extname(f).downcase == ".xml" }

If you really just want a boolean :

require 'find'
Find.find("/var/data/").any?{|f| f=~/\.xml$/i }

Note that if "/var/data/" exists but there is no .xml file inside it, this method will be at least as slow as Dir.glob.

As far as I can tell :

Dir.glob("/var/data/**/*.xml"){|f| break f}

creates a complete array first before returning its first element.
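Here is a self-contained sketch of the Find approach. The helper name first_with_ext is made up for illustration. Find.find raises Errno::ENOENT when the starting directory does not exist, so the sketch turns that into nil.

```ruby
require "find"
require "tmpdir"

# Hypothetical helper: first path under +root+ with the given extension
# (case-insensitive), or nil when there is none or +root+ is missing.
def first_with_ext(root, ext)
  # Without a block, Find.find returns an Enumerator, so Enumerable#find
  # can stop walking as soon as one path matches.
  Find.find(root).find { |f| File.extname(f).downcase == ext }
rescue Errno::ENOENT
  nil
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "notes.txt"), "")
  File.write(File.join(dir, "feed.XML"), "")

  found = first_with_ext(dir, ".xml")
  puts File.basename(found)                                # feed.XML
  p first_with_ext(File.join(dir, "nope"), ".xml")         # nil
end
```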

Bash-only

For a bash-only solution, you could use :

  • compgen
  • Shell find

Ruby Dir.glob alternative with Regexp

The thing with glob is that it can match subpaths. When you write Dir.glob('*/*'), it returns all the files and directories directly under the subdirectories of the current directory. It can do that because a glob pattern is simple enough to tell it which directories to descend into. With a regex it would have to scan the entire filesystem and compare each path with the pattern, which is simply too much.

However, you can combine - use Dir.glob to choose where to search, and grep to choose what to search:

[1] pry(main)> Dir.glob('/usr/lib/*').grep(/\/lib[A-Z]+\.so$/)
=> ["/usr/lib/libFLAC.so", "/usr/lib/libEGL.so", "/usr/lib/libHACD.so", "/usr/lib/libLTO.so", "/usr/lib/libGLEW.so", "/usr/lib/libGL.so", "/usr/lib/libSDL.so", "/usr/lib/libGLU.so", "/usr/lib/libSM.so", "/usr/lib/libICE.so"]

How to Find a String in a Directory of Files Using grep and glob?

Your code looks almost right, except that the .each is missing.

I tested the following and it works fine.

Dir.glob('spec/features/*.rb').each do |f|
  puts 'haha' if File.readlines(f).any? { |line| line.include?('capybara') }
end
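If you want the list of matching files rather than a printout, a variation along these lines might work. The helper name files_containing is made up for illustration. It uses File.foreach instead of File.readlines so a file is read line by line and reading stops at the first hit.

```ruby
require "tmpdir"

# Hypothetical helper: paths matching +glob+ whose contents include +needle+.
def files_containing(glob, needle)
  Dir.glob(glob).select do |path|
    # Skip directories; File.foreach without a block returns an Enumerator.
    File.file?(path) && File.foreach(path).any? { |line| line.include?(needle) }
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "login_spec.rb"), "require 'capybara'\n")
  File.write(File.join(dir, "model_spec.rb"), "require 'rspec'\n")

  hits = files_containing(File.join(dir, "*.rb"), "capybara")
  puts hits.map { |path| File.basename(path) }.inspect   # ["login_spec.rb"]
end
```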

Extglob syntax style matching in Ruby

The documentation that you linked to describes everything that can be done with FNM_EXTGLOB in Ruby; that is, the braces are the only extra functionality that you get while using that flag. Not sure if there are any external libraries but I doubt that there are.
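For reference, this is roughly what the flag buys you. The calls follow the examples in the File.fnmatch documentation.

```ruby
# Without the flag, braces are treated as ordinary characters.
p File.fnmatch("c{at,ub}s", "cats")                       # false

# With FNM_EXTGLOB, {a,b} expands to alternatives.
p File.fnmatch("c{at,ub}s", "cats", File::FNM_EXTGLOB)    # true
p File.fnmatch("c{at,ub}s", "cubs", File::FNM_EXTGLOB)    # true
p File.fnmatch("c{at,ub}s", "cots", File::FNM_EXTGLOB)    # false
```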

In Ruby, how can I interpret (expand) a glob relative to a directory?

If I were implementing that behaviour, I would filter the array returned by Dir.entries:

Dir.entries("#{target}").select { |f| f =~ /\A#{filename}\z/i }

Please be aware that on Unix platforms the . and .. entries are listed as well, but they are unlikely to match in the second step. Also, if the filename is meant to be matched literally, it should probably be escaped with Regexp.escape:

Dir.entries("#{target}").select { |f| f =~ /\A#{Regexp.escape(filename)}\z/i }
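If you are on Ruby 2.5 or later, another option may be the base: keyword of Dir.glob, which expands a real glob relative to a directory and returns paths relative to it. A minimal sketch, where glob_in is a made-up helper name:

```ruby
require "tmpdir"

# Hypothetical helper: expand +pattern+ relative to +target+.
# Assumes Ruby 2.5+, where Dir.glob accepts the base: keyword.
def glob_in(target, pattern)
  Dir.glob(pattern, base: target).sort
end

Dir.mktmpdir do |target|
  File.write(File.join(target, "a.rb"), "")
  File.write(File.join(target, "b.txt"), "")

  # Results are relative to +target+, and . and .. are not included.
  p glob_in(target, "*.rb")   # ["a.rb"]
  p glob_in(target, "*")      # ["a.rb", "b.txt"]
end
```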

