Difference between tr and gsub

Performance implications of String#gsub chains?


str.tr('-_', ' ') 

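is worth considering: a single tr call replaces every '-' and '_' with a space in one pass, so no chain of gsub calls is needed.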
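As a rough illustration of the point above, the sketch below compares a gsub chain, a single gsub with a character class, and tr on the same input. The sample string is made up for the example; all three calls are expected to produce the same result here.

```ruby
# Hypothetical sample input containing both characters to replace.
str = 'foo-bar_baz'

chained = str.gsub('-', ' ').gsub('_', ' ')  # two passes, one intermediate string
single  = str.gsub(/[-_]/, ' ')              # one pass, uses the regex engine
via_tr  = str.tr('-_', ' ')                  # one pass, plain character mapping

puts chained
puts single
puts via_tr
```

tr only maps single characters, so it fits this case; replacing multi-character substrings still needs gsub.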

Using gsub in Ruby strings correctly

Your second attempt was very close. The problem is that you left a space after the closing bracket, meaning it was only looking for one of those symbols followed by a space.

Try this:

channelName = rhash["Channel"].gsub(/[':;]/, " ")
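To make the difference concrete, here is a small sketch with a made-up rhash value. The first pattern has a space after the closing bracket and so only matches a symbol followed by a space; the second matches the symbols on their own.

```ruby
# Hypothetical data standing in for the asker's hash.
rhash = { "Channel" => "Bob's: Show;Live" }

# Space after the bracket: only ": " matches, the other symbols are left alone.
with_space = rhash["Channel"].gsub(/[':;] /, " ")

# No trailing space: every ', : and ; is replaced.
without_space = rhash["Channel"].gsub(/[':;]/, " ")

p with_space
p without_space
```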

What is Enumerator object? (Created with String#gsub)

An Enumerator object provides some methods common to enumerations -- next, each, each_with_index, rewind, etc.

You're getting the Enumerator object here because gsub is extremely flexible:

gsub(pattern, replacement) → new_str
gsub(pattern, hash) → new_str
gsub(pattern) {|match| block } → new_str
gsub(pattern) → enumerator

In the first three cases, the substitution can take place immediately, and return a new string. But, if you don't give a replacement string, a replacement hash, or a block for replacements, you get back the Enumerator object that lets you get to the matched pieces of the string to work with later:

irb(main):022:0> s="one two three four one"
=> "one two three four one"
irb(main):023:0> enum = s.gsub("one")
=> #<Enumerable::Enumerator:0x7f39a4754ab0>
irb(main):024:0> enum.each_with_index {|e, i| puts "#{i}: #{e}"}
0: one
1: one
=> " two three four "
irb(main):025:0>
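The session above is from Ruby 1.8.7; from Ruby 1.9 on the class is simply Enumerator. A minimal sketch of two things you can do with the returned object, using the same sample string: collect the matches, or supply the replacement block later through with_index.

```ruby
s = "one two three four one"

# With no replacement, hash, or block, gsub returns an Enumerator over the matches.
enum = s.gsub("one")
matches = enum.to_a  # every matched substring, in order

# Supplying the block later (here via with_index) performs the substitution;
# each block result becomes the replacement for that match.
numbered = s.gsub("one").with_index { |match, i| "#{match}#{i}" }

puts enum.class
p matches
puts numbered
```

The original string s is not modified in either case.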

Why is String#split("\n") and Array#join(' ') quicker than String#gsub(/\n/, ' ')?

I think gsub takes more time for two reasons:

The first is that using a regex engine has an initial cost, at least to parse the pattern.

The second, and probably the more important one here, is that the regex engine walks the string character by character and tests the pattern at each position, whereas split (with a literal string here) uses a fast string search algorithm (probably the Boyer-Moore algorithm).

Note that even if the split/join approach is faster, it probably uses more memory, since it has to build an intermediate array of new strings.

Note 2: some regex engines are able to use this fast string search algorithm before the walk to find candidate positions, but I have no information about whether the Ruby regex engine does.

Note 3: To get a better idea of what happens, it may be interesting to include tests with few repetitions but larger strings. [edit] After several tests with @spickermann's code, it seems that this doesn't change anything (or nothing very significant), even with very few repetitions. So the initial cost may not be so important.
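If you want to measure this yourself, a minimal timing sketch could look like the following. The input text, the repetition count, and the time_it helper are all assumptions made for this example, and the numbers will vary by Ruby version and machine; tr is included only as a third point of comparison.

```ruby
# Hypothetical input: many short lines, with no trailing newline.
text = (["some line of text"] * 10_000).join("\n")

# Small timing helper written for this sketch (not part of any library).
def time_it(repetitions)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  repetitions.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Confirm that the three approaches agree on this input before timing them.
via_split = text.split("\n").join(' ')
via_gsub  = text.gsub(/\n/, ' ')
via_tr    = text.tr("\n", ' ')

split_time = time_it(50) { text.split("\n").join(' ') }
gsub_time  = time_it(50) { text.gsub(/\n/, ' ') }
tr_time    = time_it(50) { text.tr("\n", ' ') }

printf("split/join: %.4fs\n", split_time)
printf("gsub:       %.4fs\n", gsub_time)
printf("tr:         %.4fs\n", tr_time)
```

One caveat when comparing results: split drops trailing empty strings, so for input that ends in a newline the split/join version has no trailing space while the gsub version does.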

String#gsub to maintain case?

How about using String#tr:

'Strings'.tr('sS', 'zZ')
# => "Ztringz"
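For comparison, the same case-preserving swap can be sketched with gsub, either with a replacement hash or with a block that checks the case of each match. These variants are only illustrations of the alternatives; tr remains the shortest option for one-to-one character mappings.

```ruby
word = 'Strings'

via_tr    = word.tr('sS', 'zZ')                            # positional mapping: s->z, S->Z
via_hash  = word.gsub(/[sS]/, 's' => 'z', 'S' => 'Z')      # lookup per match
via_block = word.gsub(/s/i) { |m| m == m.upcase ? 'Z' : 'z' }  # decide by the match's case

puts via_tr
puts via_hash
puts via_block
```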

Ruby multiple string replacement

Since Ruby 1.9.2, String#gsub accepts a hash as its second parameter and replaces each match with the value stored under the matching key. You can use a regular expression to match the substrings that need to be replaced and pass a hash of the replacement values.

Like this:

'hello'.gsub(/[eo]/, 'e' => 3, 'o' => '*')    #=> "h3ll*"
'(0) 123-123.123'.gsub(/[()\-,. ]/, '') #=> "0123123123"

In Ruby 1.8.7, you would achieve the same with a block:

dict = { 'e' => 3, 'o' => '*' }
'hello'.gsub(/[eo]/) do |match|
  dict[match.to_s]
end #=> "h3ll*"
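When the things to replace are whole substrings rather than single characters, one way to build the pattern is Regexp.union over the hash keys. The replacement table below is made up for the example.

```ruby
# Hypothetical replacement table: substring => replacement.
replacements = { 'cat' => 'dog', 'red' => 'blue' }

# Regexp.union builds an alternation that matches any of the keys literally.
pattern = Regexp.union(replacements.keys)

# Each match is looked up in the hash and replaced by its value.
result = 'the red cat'.gsub(pattern, replacements)

puts result
```

If one key is a prefix of another, the order of the keys affects which alternative matches first, so sorting the keys longest-first before calling Regexp.union may be needed.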

