Ruby string strip defined characters
Ruby has no built-in method for this, but you can easily define one:
def my_strip(string, chars)
  chars = Regexp.escape(chars)
  string.gsub(/\A[#{chars}]+|[#{chars}]+\z/, "")
end
my_strip " [la[]la] ", " []"
#=> "la[]la"
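If you only need to trim one side, the same idea splits into two helpers. This is a minimal sketch: the names my_lstrip and my_rstrip are made up for illustration, and the guard for an empty character list is an added assumption (an empty [] character class is not a valid regex).

```ruby
# Hypothetical one-sided variants of my_strip (names are illustrative).
def my_lstrip(string, chars)
  return string.dup if chars.empty?   # /[]+/ would be an invalid regex
  string.sub(/\A[#{Regexp.escape(chars)}]+/, "")
end

def my_rstrip(string, chars)
  return string.dup if chars.empty?
  string.sub(/[#{Regexp.escape(chars)}]+\z/, "")
end

p my_lstrip("--^hello^--", "-^")  # => "hello^--"
p my_rstrip("--^hello^--", "-^")  # => "--^hello"
```

Regexp.escape takes care of characters such as - and ^ that would otherwise have a special meaning inside the character class.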
Strip ruby string of a specific control character
I figured it out! .gsub(/\u2028/, '')
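For a single literal character, String#delete does the same job without a regex. A small sketch; the sample string is invented for illustration:

```ruby
# U+2028 is LINE SEPARATOR; the sample text is made up.
text = "first\u2028second"

cleaned_regex  = text.gsub(/\u2028/, '')
cleaned_delete = text.delete("\u2028")

p cleaned_regex                    # => "firstsecond"
p cleaned_regex == cleaned_delete  # => true
```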
Deleting all special characters from a string - ruby
You can do this
a.gsub!(/[^0-9A-Za-z]/, '')
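One thing to watch for: gsub! returns nil when it makes no substitution, so don't chain on its result. A short sketch with made-up input, which also shows String#delete with a negated set as a regex-free alternative:

```ruby
a = +"He(l)lo, W0rld!"          # unary + gives a mutable string
result = a.gsub!(/[^0-9A-Za-z]/, '')
p a        # => "HelloW0rld"
p result   # => "HelloW0rld" (the same object as a)

# On a string that is already clean, gsub! changes nothing and returns nil.
second = a.gsub!(/[^0-9A-Za-z]/, '')
p second   # => nil

# Non-mutating alternative: delete everything NOT in the listed ranges.
p "He(l)lo, W0rld!".delete("^0-9A-Za-z")  # => "HelloW0rld"
```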
How to achieve Python like string strip in Ruby?
You can use regular expressions:
"atestabctestcb".gsub(/(^[abc]*)|([abc]*$)/, '')
# => "testabctest"
Of course you can make this a method as well:
def strip_arbitrary(s, chars)
  r = chars.chars.map { |c| Regexp.quote(c) }.join
  s.gsub(/(^[#{r}]*)|([#{r}]*$)/, '')
end
strip_arbitrary("foobar", "fra") # => "oob"
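Note that ^ and $ match at the start and end of every line, not just of the whole string, so the version above strips each line of a multi-line string. If you want Python's behavior of touching only the two ends, \A and \z are the anchors to use. A sketch of that variant (strip_ends is a hypothetical name):

```ruby
# Hypothetical variant anchored to the whole string rather than to each line.
def strip_ends(s, chars)
  r = Regexp.escape(chars)
  s.gsub(/\A[#{r}]+|[#{r}]+\z/, '')
end

multi = "aXa\naYa"
p multi.gsub(/(^[a]*)|([a]*$)/, '')  # => "X\nY"    (every line stripped)
p strip_ends(multi, "a")             # => "Xa\naY"  (only the two ends)
p strip_ends("foobar", "fra")        # => "oob"
```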
How can I remove non-printable invisible characters from string?
First, let's figure out what the offending character is:
str = "Kanha\u202C" # the trailing character is invisible, so it is written as an escape here
p str.codepoints
# => [75, 97, 110, 104, 97, 8236]
The first five codepoints are between 0 and 127, meaning they're ASCII characters. It's safe to assume they're the letters K-a-n-h-a, although this is easy to verify if you want:
p [75, 97, 110, 104, 97].map(&:chr)
# => ["K", "a", "n", "h", "a"]
That means the offending character is the last one, codepoint 8236. That's a decimal (base 10) number, though, and Unicode characters are usually listed by their hexadecimal (base 16) number. 8236 in hexadecimal is 202C (8236.to_s(16) # => "202c"), so we just have to google for U+202C.
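To save doing the conversion by hand, you can print every codepoint in U+XXXX form directly. A small sketch, writing the invisible character as an explicit \u202C escape so it survives copy and paste:

```ruby
str = "Kanha\u202C"  # escape used so the invisible character is explicit

labels = str.codepoints.map { |cp| format("U+%04X", cp) }
p labels
# => ["U+004B", "U+0061", "U+006E", "U+0068", "U+0061", "U+202C"]
```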
Google very quickly tells us that the offending character is U+202C POP DIRECTIONAL FORMATTING and that it's a member of the "Other, Format" category of Unicode characters. Wikipedia says of this category:
Includes the soft hyphen, joining control characters (zwnj and zwj), control characters to support bi-directional text, and language tag characters
It also tells us that the "value" or code for the category is "Cf". If these sound like characters you want to remove from your string along with U+202C, you can use the \p{Cf} property in a Ruby regular expression. You can also use \P{Print} (note the capital P) as an equivalent to [^[:print:]]:
str = "Kanha\u202C"
p str.length # => 6
p str.gsub(/\P{Print}|\p{Cf}/, '') # => "Kanha"
p str.gsub(/\P{Print}|\p{Cf}/, '').length # => 5
See it on repl.it: https://repl.it/@jrunning/DutifulRashTag
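Be aware that \p{Cf} covers more than bidirectional controls. The sketch below tries it on a few other format characters from that category (soft hyphen, zero-width joiner, byte order mark); the sample strings are invented. Because the zero-width joiner is included, stripping \p{Cf} will also split emoji sequences that rely on it.

```ruby
samples = {
  "soft hyphen"       => "co\u00ADoperate",
  "zero-width joiner" => "a\u200Db",
  "byte order mark"   => "\uFEFFdata",
}

samples.each do |name, s|
  cleaned = s.gsub(/\p{Cf}/, '')
  puts "#{name}: #{s.length} -> #{cleaned.length} characters"
end
# soft hyphen: 10 -> 9 characters
# zero-width joiner: 3 -> 2 characters
# byte order mark: 5 -> 4 characters
```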
Remove everything after some characters
This can be accomplished fairly easily by calling split("More info") on your strings. That breaks the string into an array like so:
new_string = "Posted today More info Go to Last Post"
new_string = new_string.split("More info")
# becomes ["Posted today ", " Go to Last Post"]
split breaks a string apart into an array of the pieces found between occurrences of its argument. So if you have "1,2,3", then split(",") will return ["1", "2", "3"].
So to continue your solution, you can get the posting date like this:
new_string[0].strip
strip removes whitespace from the front and back of a string, so you'll be left with just "Posted today".
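If you only care about the text before the first occurrence, String#partition expresses that directly and behaves predictably when the marker is missing. A sketch using the same sample string:

```ruby
line = "Posted today More info Go to Last Post"

before, marker, after = line.partition("More info")
p before.strip   # => "Posted today"
p marker         # => "More info"
p after.strip    # => "Go to Last Post"

# When the marker is absent, the whole string comes back as the first part.
p "Posted today".partition("More info")  # => ["Posted today", "", ""]
```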
Remove character from string if it starts with that character?
Why not just include the regex in the sub! method?
string.sub!(/^1/, '')
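On Ruby 2.5 and later, String#delete_prefix does this without a regex. A brief sketch with made-up values; note that sub! returns nil when nothing matched, while delete_prefix always returns a string:

```ruby
p "1234".delete_prefix("1")   # => "234"
p "2341".delete_prefix("1")   # => "2341" (no leading 1, returned unchanged)

s = +"2341"
outcome = s.sub!(/^1/, '')
p outcome                     # => nil (no match, s is untouched)
p s                           # => "2341"
```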
How do I remove a substring after a certain character in a string using Ruby?
new_str = str.slice(0..(str.index('blah')))
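Because 0..index is an inclusive range, the line above keeps everything up to and including the first character of the match. The sketch below spells out both choices and handles a missing marker explicitly, since index returns nil in that case; the helper names are hypothetical.

```ruby
# Hypothetical helpers; both return a copy of the input when the marker is absent.
def keep_through(str, marker)
  i = str.index(marker)
  i.nil? ? str.dup : str[0, i + marker.length]
end

def keep_before(str, marker)
  i = str.index(marker)
  i.nil? ? str.dup : str[0, i]
end

p keep_through("key=value", "=")  # => "key="
p keep_before("key=value", "=")   # => "key"
p keep_before("no marker", "=")   # => "no marker"
```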
What is the easiest way to remove the first character from a string?
I kind of favor using something like:
asdf = "[12,23,987,43"
asdf[0] = ''
p asdf
# >> "12,23,987,43"
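Keep in mind that asdf[0] = '' modifies the string in place, so it fails on a frozen string, whereas [1..-1] returns a new string and leaves the original alone. A small sketch of the difference:

```ruby
original = "[12,23,987,43".freeze

# Non-mutating: works on frozen strings and returns a new object.
rest = original[1..-1]
p rest       # => "12,23,987,43"
p original   # => "[12,23,987,43"

# Mutating: raises on a frozen string (FrozenError is a RuntimeError subclass).
begin
  original[0] = ''
rescue RuntimeError => e
  puts "cannot modify: #{e.class}"
end
```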
I'm always looking for the fastest and most readable way of doing things:
require 'benchmark'
N = 1_000_000
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
  b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
  b.report('sub') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
  b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
  b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
  b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
  b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
end
Running on my Mac Pro:
1.9.3
             user     system      total        real
[0]      0.840000   0.000000   0.840000 (  0.847496)
sub      1.960000   0.010000   1.970000 (  1.962767)
gsub     4.350000   0.020000   4.370000 (  4.372801)
[1..-1]  0.710000   0.000000   0.710000 (  0.713366)
slice    1.020000   0.000000   1.020000 (  1.020336)
length   1.160000   0.000000   1.160000 (  1.157882)
Updating to incorporate one more suggested answer:
require 'benchmark'
N = 1_000_000
class String
  def eat!(how_many = 1)
    self.replace self[how_many..-1]
  end
  def first(how_many = 1)
    self[0...how_many]
  end
  def shift(how_many = 1)
    shifted = first(how_many)
    self.replace self[how_many..-1]
    shifted
  end
  alias_method :shift!, :shift
end
class Array
  def eat!(how_many = 1)
    self.replace self[how_many..-1]
  end
end
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
  b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
  b.report('sub') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
  b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
  b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
  b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
  b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
  b.report('eat!') { N.times { "[12,23,987,43".eat! } }
  b.report('reverse') { N.times { "[12,23,987,43".reverse.chop.reverse } }
end
Which results in:
2.1.2
             user     system      total        real
[0]      0.300000   0.000000   0.300000 (  0.295054)
sub      0.630000   0.000000   0.630000 (  0.631870)
gsub     2.090000   0.000000   2.090000 (  2.094368)
[1..-1]  0.230000   0.010000   0.240000 (  0.232846)
slice    0.320000   0.000000   0.320000 (  0.320714)
length   0.340000   0.000000   0.340000 (  0.341918)
eat!     0.460000   0.000000   0.460000 (  0.452724)
reverse  0.400000   0.000000   0.400000 (  0.399465)
And another using /^./ to find the first character:
require 'benchmark'
N = 1_000_000
class String
  def eat!(how_many = 1)
    self.replace self[how_many..-1]
  end
  def first(how_many = 1)
    self[0...how_many]
  end
  def shift(how_many = 1)
    shifted = first(how_many)
    self.replace self[how_many..-1]
    shifted
  end
  alias_method :shift!, :shift
end
class Array
  def eat!(how_many = 1)
    self.replace self[how_many..-1]
  end
end
puts RUBY_VERSION
STR = "[12,23,987,43"
Benchmark.bm(7) do |b|
  b.report('[0]') { N.times { "[12,23,987,43"[0] = '' } }
  b.report('[/^./]') { N.times { "[12,23,987,43"[/^./] = '' } }
  b.report('[/^\[/]') { N.times { "[12,23,987,43"[/^\[/] = '' } }
  b.report('sub+') { N.times { "[12,23,987,43".sub(/^\[+/, "") } }
  b.report('sub') { N.times { "[12,23,987,43".sub(/^\[/, "") } }
  b.report('gsub') { N.times { "[12,23,987,43".gsub(/^\[/, "") } }
  b.report('[1..-1]') { N.times { "[12,23,987,43"[1..-1] } }
  b.report('slice') { N.times { "[12,23,987,43".slice!(0) } }
  b.report('length') { N.times { "[12,23,987,43"[1..STR.length] } }
  b.report('eat!') { N.times { "[12,23,987,43".eat! } }
  b.report('reverse') { N.times { "[12,23,987,43".reverse.chop.reverse } }
end
Which results in:
# >> 2.1.5
# >>              user     system      total        real
# >> [0]      0.270000   0.000000   0.270000 (  0.270165)
# >> [/^./]   0.430000   0.000000   0.430000 (  0.432417)
# >> [/^\[/]  0.460000   0.000000   0.460000 (  0.458221)
# >> sub+     0.590000   0.000000   0.590000 (  0.590284)
# >> sub      0.590000   0.000000   0.590000 (  0.596366)
# >> gsub     1.880000   0.010000   1.890000 (  1.885892)
# >> [1..-1]  0.230000   0.000000   0.230000 (  0.223045)
# >> slice    0.300000   0.000000   0.300000 (  0.299175)
# >> length   0.320000   0.000000   0.320000 (  0.325841)
# >> eat!     0.410000   0.000000   0.410000 (  0.409306)
# >> reverse  0.390000   0.000000   0.390000 (  0.393044)
Here's another update on faster hardware and a newer version of Ruby:
2.3.1
             user     system      total        real
[0]      0.200000   0.000000   0.200000 (  0.204307)
[/^./]   0.390000   0.000000   0.390000 (  0.387527)
[/^\[/]  0.360000   0.000000   0.360000 (  0.360400)
sub+     0.490000   0.000000   0.490000 (  0.492083)
sub      0.480000   0.000000   0.480000 (  0.487862)
gsub     1.990000   0.000000   1.990000 (  1.988716)
[1..-1]  0.180000   0.000000   0.180000 (  0.181673)
slice    0.260000   0.000000   0.260000 (  0.266371)
length   0.270000   0.000000   0.270000 (  0.267651)
eat!     0.400000   0.010000   0.410000 (  0.398093)
reverse  0.340000   0.000000   0.340000 (  0.344077)
Why is gsub so slow?
After doing a search/replace, gsub has to check for possible additional matches before it can tell if it's finished. sub only does one and finishes. Think of gsub as a minimum of two sub calls.
Also, it's important to remember that gsub and sub can be handicapped by a poorly written regex, which matches much more slowly than a substring search. If possible, anchor the regex to get the most speed from it. There are answers here on Stack Overflow demonstrating that, so search around if you want more information.
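The regex variants in the benchmark are also not interchangeable, which matters once the input differs from the test string. A sketch with made-up inputs showing where they diverge:

```ruby
doubled = "[[12,23"
p doubled.sub(/^\[/, '')    # => "[12,23"  (one bracket removed)
p doubled.sub(/^\[+/, '')   # => "12,23"   (the whole leading run removed)

# ^ matches at the start of every line, so gsub keeps scanning after the
# first hit and finds the second line too; sub stops after one replacement.
two_lines = "[a\n[b"
p two_lines.sub(/^\[/, '')   # => "a\n[b"
p two_lines.gsub(/^\[/, '')  # => "a\nb"
```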