How to safely join pathnames in Ruby?
I recommend using File.join
>> File.join("path", "to", "join")
=> "path/to/join"
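To illustrate why File.join beats gluing strings together with "/", here is a small sketch of how it treats separators already present on the pieces. The example paths are made up, and the last line is a reminder that File.join alone does not make untrusted input safe.

```ruby
# File.join puts exactly one separator between pieces, even when the
# pieces already carry leading or trailing slashes.
joined_plain = File.join("path", "to", "join")
joined_slashes = File.join("path/", "/to", "join")

puts joined_plain    # path/to/join
puts joined_slashes  # path/to/join

# It does not resolve ".." segments, so untrusted input can still point
# outside the directory you started from.
puts File.join("uploads", "../secret.txt")  # uploads/../secret.txt
```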
Implement #absolute_path for Pathname child
Are you trying to get a faux-realpath for files that don't exist? If so, you might be better served by using .join and handing in the necessary components:
Pathname.new(Dir.pwd).join("some/nonexistent/path")
If you have a file that exists, but your path string is a fragment and you need to provide another base directory to realpath, you can do that too:
path = Pathname.new('path/and/file.jpg')
path.realpath('/some/existing')
#=> #<Pathname:/some/existing/path/and/file.jpg>
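The difference between the two approaches can be sketched as follows: join only manipulates the path string, while realpath consults the filesystem and fails for a path that isn't there. The directory and file names are hypothetical, picked so that they should not exist.

```ruby
require 'pathname'
require 'tmpdir'

base = Pathname.new(Dir.tmpdir)
# Hypothetical name, chosen so that it will not exist on disk.
missing = base.join("no-such-dir-#{Process.pid}", "file.txt")

# join is pure string manipulation, so a missing file is fine.
puts missing.absolute?  # true

# realpath resolves against the real filesystem and raises if the path is missing.
realpath_failed =
  begin
    missing.realpath
    false
  rescue Errno::ENOENT
    true
  end
puts realpath_failed  # true
```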
An example implementation might be…
require 'pathname'

class FileResolver
  BASE_DIR = Dir.pwd.freeze

  def initialize(filepath)
    @filepath = filepath
  end

  # Resolve the stored path against BASE_DIR without touching the filesystem.
  def absolute_path
    Pathname.new(BASE_DIR).join(@filepath)
  end
end
Or, if your files aren't relative to this class but instead to your Rails project, a better way is to use Rails.root.join:
Rails.root.join('your/path', 'some/filename.txt')
How to make a Ruby string safe for a filesystem?
From http://web.archive.org/web/20110529023841/http://devblog.muziboo.com/2008/06/17/attachment-fu-sanitize-filename-regex-and-unicode-gotcha/:
def sanitize_filename(filename)
  # Object#tap yields the stripped copy and returns it after the in-place edits
  filename.strip.tap do |name|
    # NOTE: File.basename doesn't work right with Windows paths on Unix
    # get only the filename, not the whole path
    name.gsub!(/^.*(\\|\/)/, '')
    # Replace everything except ASCII letters, digits, dots and hyphens
    name.gsub!(/[^0-9A-Za-z.\-]/, '_')
  end
end
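To see what this kind of sanitizing does to a few inputs, here is a minimal non-mutating sketch of the same idea. The method name sanitize_filename_sketch and the sample filenames are made up for illustration. Note that it replaces spaces and ASCII punctuation too, not only non-ASCII characters.

```ruby
# Hypothetical helper: same two steps as above, written without in-place edits.
def sanitize_filename_sketch(filename)
  name = filename.strip
  # Drop everything up to the last slash or backslash (Unix or Windows style).
  name = name.sub(/\A.*[\\\/]/m, '')
  # Replace anything that is not an ASCII letter, digit, dot or hyphen.
  name.gsub(/[^0-9A-Za-z.\-]/, '_')
end

puts sanitize_filename_sketch('C:\\Users\\me\\my file.txt')   # my_file.txt
puts sanitize_filename_sketch("../../etc/passwd")             # passwd
puts sanitize_filename_sketch("  r\u00E9sum\u00E9.pdf ")      # r_sum_.pdf
```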
Ruby's Dir vs File vs Pathname?
According to the Ruby docs for Dir, File, and Pathname, they definitely appear to have a lot in common.
The principal difference between Dir and File seems to be that Dir assumes the object it's working with is a directory and File assumes files. For most purposes they can apparently be used interchangeably, but even if the code works, it might be confusing to anyone reading your code if you manipulate directories using File and files using Dir.
Pathname looks to be a multi-OS method of locating files and directories. Since Windows and *nix machines handle file management differently, it can be a pain to refer to files or directories in an OS-specific way if you want scripts to run anywhere. From the docs:
Pathname represents a pathname which locates a file in a filesystem. The pathname depends on OS: Unix, Windows, etc. Pathname library works with pathnames of local OS. However non-Unix pathnames are supported experimentally.
It does not represent the file itself. A Pathname can be relative or absolute. It’s not until you try to reference the file that it even matters whether the file exists or not.
Pathname is immutable. It has no method for destructive update.
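As a rough illustration of that division of labour, the sketch below asks the same questions through File, Dir and Pathname inside a throwaway temporary directory. The file name notes.txt is made up, and Dir.children assumes Ruby 2.5 or later.

```ruby
require 'pathname'
require 'tmpdir'

observations = {}

Dir.mktmpdir do |dir|
  file = File.join(dir, "notes.txt")   # hypothetical file name
  File.write(file, "hello")

  # File answers questions about a single filesystem entry.
  observations[:is_file] = File.file?(file)
  observations[:is_dir] = File.directory?(dir)

  # Dir deals with the contents of a directory.
  observations[:entries] = Dir.children(dir)   # needs Ruby 2.5+

  # Pathname wraps the path string and offers both kinds of query.
  path = Pathname.new(dir).join("notes.txt")
  observations[:basename] = path.basename.to_s
  observations[:content] = path.read
  observations[:siblings] = path.dirname.children.map { |child| child.basename.to_s }
end

p observations
```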
Hope this helps.
How to split a directory string in Ruby?
There's no File counterpart to File.join that splits a path into its component directories, but you can try to fake it in a cross-platform way:
directory_string.split(File::SEPARATOR)
This works with relative paths and on non-Unix platforms, but for a path that starts with "/" as the root directory, you'll get an empty string as the first element of the array, and we'd want "/" instead.
directory_string.split(File::SEPARATOR).map {|x| x=="" ? File::SEPARATOR : x}
If you want just the directories without the root directory like you mentioned above, then you can change it to select from the first element on.
directory_string.split(File::SEPARATOR).map {|x| x=="" ? File::SEPARATOR : x}[1..-1]
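If pulling in the pathname library is an option, Pathname#each_filename may save the manual fix-ups. The sketch below compares it with the plain split on two made-up example paths. It drops the root entirely, so whether that fits depends on what you need.

```ruby
require 'pathname'

absolute = "/usr/local/bin"
relative = "lib/tasks/build"

# Plain String#split keeps an empty string where the root used to be.
p absolute.split(File::SEPARATOR)   # ["", "usr", "local", "bin"]

# Pathname#each_filename yields only the named components.
p Pathname.new(absolute).each_filename.to_a   # ["usr", "local", "bin"]
p Pathname.new(relative).each_filename.to_a   # ["lib", "tasks", "build"]

# Pathname#absolute? recovers the information that the root was dropped.
p Pathname.new(absolute).absolute?   # true
```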
Escape spaces in a linux pathname with Ruby gsub
Stefan is right; I just want to point out that if you have to escape strings for shell use, you should check Shellwords::shellescape:
require 'shellwords'
puts Shellwords.shellescape "/mnt/drive/site/usa/1201 East/1201 East Invoice.pdf"
# prints /mnt/drive/site/usa/1201\ East/1201\ East\ Invoice.pdf
# or
puts "/mnt/drive/site/usa/1201 East/1201 East Invoice.pdf".shellescape
# prints /mnt/drive/site/usa/1201\ East/1201\ East\ Invoice.pdf
# or (as reported by @hagello)
puts Shellwords.escape "/mnt/drive/site/usa/1201 East/1201 East Invoice.pdf"
# prints /mnt/drive/site/usa/1201\ East/1201\ East\ Invoice.pdf
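One way to convince yourself the escaping is right is to round-trip it: build a command string with the escaped path and let Shellwords.split parse it back the way a POSIX shell would. This is only a sketch, and the ls -l command is just a placeholder.

```ruby
require 'shellwords'

path = "/mnt/drive/site/usa/1201 East/1201 East Invoice.pdf"

# Escape the path before interpolating it into a shell command string.
escaped = Shellwords.escape(path)
command = "ls -l #{escaped}"
puts command

# Shellwords.split parses the string like a POSIX shell would, so the
# path comes back as one argument with its spaces intact.
arguments = Shellwords.split(command)
p arguments

# When no shell features are needed, passing arguments separately, as in
# system("ls", "-l", path), avoids the escaping question altogether.
```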
Extract filename with and without terminating characters
Do it in three stages.
- Split on ; to separate out the statements.
- Split the key/value pair on =.
- Deal with the quoting of the value.
Here's a basic example.
def get_value(line)
  # Split into statements
  statements = line.split(/\s*;\s*/)
  # Extract the value of the 2nd statement
  _, value = statements[1].split(/\s*=\s*/)
  # Strip the quotes
  value.gsub!(/^(['"]?)(.*)\1$/, '\2')
  return value
end
There are a few edge cases that it doesn't handle: what if the statement you're interested in isn't the second one? But that can be fixed up as needed. It's a lot easier to improve your parsing when it's done in multiple steps rather than trying to cram it into one regex.
For example, this correctly handles embedded and escaped quotes like %q[inline; filename="name's.extension"] and %q[inline; filename="name's.\\"extension\\""].
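As one possible fix-up for the "not the second statement" case, here is a sketch that looks the statement up by key instead of by position. The method name filename_from and the sample lines are made up, and it still assumes no semicolons appear inside quoted values.

```ruby
# Hypothetical variant: find the filename statement by key, wherever it sits.
def filename_from(line)
  line.split(/\s*;\s*/).each do |statement|
    # Limit the split to 2 so an "=" inside the value is left alone.
    key, value = statement.split(/\s*=\s*/, 2)
    next unless key && value && key.strip.casecmp("filename").zero?
    # Strip one pair of matching surrounding quotes, if present.
    return value.strip.sub(/\A(['"])(.*)\1\z/m, '\2')
  end
  nil # no filename statement found
end

p filename_from(%q[inline; filename="name's.extension"])    # "name's.extension"
p filename_from("attachment; foo=bar; filename=name.ext")   # "name.ext"
p filename_from("inline")                                   # nil
```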
If you really want to do it as a single regex, ok, you asked for it.
re = /
  \bfilename
  \s*=\s*
  (?:
    (?<quote>['"])(?<value>.*)\k<quote> |
    (?<value>[^;]+)
  )
/x
return re.match(line)['value']
That splits the handling of the extension into two alternatives: one with quotes and one without. Otherwise filename=name.ext; will pick up the semicolon, and I can't figure out another way to stop it that doesn't introduce a new problem.
For example, /\bfilename\s*=\s*(?<quote>['"]?)(?<value>.*?)\k<quote>;?$/ will work on the test data, but then it will fail if there's anything after the semicolon like %q[inline; filename='name.extension'; foo].
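For anyone who wants to poke at the two-alternative regex, here it is wrapped in a small runnable check. The constant name FILENAME_RE and the sample lines are made up. It relies on Ruby returning whichever of the two value groups actually matched, and the greedy .* in the quoted branch will run to the last matching quote on the line, so treat it as a sketch rather than a full parser.

```ruby
FILENAME_RE = /
  \bfilename
  \s*=\s*
  (?:
    (?<quote>['"])(?<value>.*)\k<quote> |   # quoted value
    (?<value>[^;]+)                         # bare value, stops at a semicolon
  )
/x

samples = [
  %q[inline; filename="name.extension"],
  %q[inline; filename='name.extension'; foo],
  %q[inline; filename=name.extension; foo],
]

# Both alternatives capture into the same group name, "value".
extracted = samples.map { |line| FILENAME_RE.match(line)['value'] }
p extracted
```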
You asked for expert regex knowledge. Part of being a regex expert is to know when you shouldn't use a regex. This should probably be handled with a grammar or you'll be constantly chasing edge cases.