ruby string split with terminal strings empty
You need to say:
string.split(',',-1)
to avoid omitting the trailing blanks.
per Why does Ruby String#split not treat consecutive trailing delimiters as separate entities?
The second parameter is the "limit" parameter, documented at http://ruby-doc.org/core-2.0.0/String.html#method-i-split as follows:
If the "limit" parameter is omitted, trailing null fields are
suppressed. If limit is a positive number, at most that number of
fields will be returned (if limit is 1, the entire string is returned
as the only entry in an array). If negative, there is no limit to the
number of fields returned, and trailing null fields are not
suppressed.
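A minimal sketch of the three cases the documentation describes. The sample string and variable names are made up for illustration:

```ruby
# Hypothetical sample line with two trailing empty fields.
line = "a,b,,"

default_fields   = line.split(',')      # limit omitted: trailing empty fields dropped
unlimited_fields = line.split(',', -1)  # negative limit: trailing empty fields kept
two_fields       = line.split(',', 2)   # positive limit: at most two fields

p default_fields    # => ["a", "b"]
p unlimited_fields  # => ["a", "b", "", ""]
p two_fields        # => ["a", "b,,"]
```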
Ignore empty captures when splitting string
When splitting with a regex containing capturing groups, consecutive matches always produce empty array items.
Rather than switch to a matching approach, use
arr = arr.reject { |c| c.empty? }
Or any other method, see How do I remove blank elements from an array?
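As a rough illustration, assuming a made-up input where the delimiter is a single * and the regex keeps it via a capturing group:

```ruby
# Hypothetical input: two consecutive delimiters between "a" and "b".
s = "a**b"

raw     = s.split(/(\*)/)                # the capture group keeps each delimiter
cleaned = raw.reject { |c| c.empty? }    # drop the empty item between the two matches

p raw      # => ["a", "*", "", "*", "b"]
p cleaned  # => ["a", "*", "*", "b"]
```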
Otherwise, you will have to match the substrings using a regex that matches the delimiters first and then any text that does not start a delimiter (that is, you will need to build a tempered greedy token):
arr = s.scan(/(?x)\*{2}|[*\n.]|(?:(?!\*{2})[^*\n.])+/)
See the regex demo.
Here,
(?x) - a freespacing/comment modifier
\*{2} - a ** substring
| - or
[*\n.] - a char that is either *, a newline (LF) or a .
| - or
(?:(?!\*{2})[^*\n.])+ - 1 or more (+) chars that are not *, LF or . ([^*\n.]) and that do not start a ** substring.
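A small sketch of that pattern applied to a made-up string (the sample input is an assumption, the regex is the one quoted above):

```ruby
# Hypothetical sample input; the pattern is copied from the answer above.
s = "**bold**.x*y"

tokens = s.scan(/(?x)\*{2}|[*\n.]|(?:(?!\*{2})[^*\n.])+/)

p tokens  # => ["**", "bold", "**", ".", "x", "*", "y"]
```

Because scan returns only what matched, there are no empty items to filter out afterwards.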
Splitting into empty substrings
Logic is described in the documentation:
If the limit parameter is omitted, trailing null fields are suppressed.
Trailing empty fields are removed, but not leading ones.
If, by any chance, what you were asking is "yeah, but where's the logic in that?", then imagine we're parsing some CSV.
fname,sname,id,email,status
,,1,sergio@example.com,
We want the first two positions to remain empty (rather than be removed, so that fname becomes 1 and sname becomes sergio@example.com).
We care less about trailing empty fields. Removed or kept, they don't shift data.
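To make this concrete, here is a sketch using the CSV row above. The variable names are illustrative, and a real CSV file is usually better handled by a CSV parser than by split:

```ruby
header = "fname,sname,id,email,status".split(',')
row    = ",,1,sergio@example.com,"

default_fields = row.split(',')      # leading empties kept, trailing empty dropped
all_fields     = row.split(',', -1)  # every field kept

p default_fields  # => ["", "", "1", "sergio@example.com"]
p all_fields      # => ["", "", "1", "sergio@example.com", ""]

# Either way the leading columns stay aligned with the header;
# with the default split, the missing trailing field simply becomes nil.
record = header.zip(default_fields).to_h
p record["id"]      # => "1"
p record["status"]  # => nil
```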
Why does Ruby String#split not treat consecutive trailing delimiters as separate entities?
You need to pass a negative value as the second parameter to split. This prevents it from suppressing trailing null fields:
"w$x$$\r\n".chomp.split('$', -1)
# => ["w", "x", "", ""]
See the docs on split.
The letter disappeared after splitting a string in my Ruby program
It looks like you were expecting String#tr to behave like String#gsub.
Calling string.tr("GPS:", '') does not replace the complete string "GPS:" with the empty string. Instead, it replaces any character from within the string "GPS:" with an empty string. Commonly you will find .tr() called with an equal number of input and replacement characters, and in that case each input character is replaced by the output character in the corresponding position. But the way you have called it, with only the empty string '' as its translation argument, it will delete any of G, P, S, : from anywhere within the string.
>> "String with S and G and a: P".tr("GPS:", '')
=> "tring with  and  and a "
Instead, use .gsub('GPS:', '') to replace the complete match as a group.
string = "GPS:3;S23.164865;E113.428970;88"
info = string.gsub('GPS:', '')
info_array = info.split(";")
puts "GPS: #{info_array[0]},#{info_array[1]},#{info_array[2]}"
# prints: GPS: 3,S23.164865,E113.428970
Here we've called .gsub() with a string argument. It is probably more often called with a regexp search match argument though.
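For comparison, a sketch of the regexp form. Anchoring with \A and using sub rather than gsub is my own assumption here, on the basis that the prefix should only be removed once, at the start of the string:

```ruby
string = "GPS:3;S23.164865;E113.428970;88"

# \A anchors the match to the very start of the string, so a "GPS:"
# appearing later in the data would be left alone.
info   = string.sub(/\AGPS:/, '')
fields = info.split(';')

p info    # => "3;S23.164865;E113.428970;88"
p fields  # => ["3", "S23.164865", "E113.428970", "88"]
```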
Split a string into a string and an integer
Use a regex based on a positive lookbehind and a positive lookahead in string.split.
> "10480ABCD".split(/(?<=\d)(?=[A-Za-z])/)
=> ["10480", "ABCD"]
(?<=\d) Positive lookbehind which asserts that the match must be preceded by a digit character.
(?=[A-Za-z]) Positive lookahead which asserts that the match must be followed by a letter.
So the above regex matches the boundary between a digit and a letter. Splitting your input at the matched boundary will give you the desired output.
OR
Use string.scan
> "10480ABCD".scan(/\d+|[A-Za-z]+/)
=> ["10480", "ABCD"]
Split a string with multiple delimiters in Ruby
What about the following:
options.gsub(/ or /i, ",").split(",").map(&:strip).reject(&:empty?)
- replaces all delimiters but the , with a ,
- splits it at ,
- trims each string, since stuff like " ice cream" with a leading space might be left
- removes all blank strings
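The chain above, run step by step on a made-up options string (the input is an assumption, since the question's data is not shown here):

```ruby
# Hypothetical input mixing "," and " or " delimiters, with an empty entry.
options = "coffee, ice cream or cake,, OR tea"

replaced = options.gsub(/ or /i, ",")   # "coffee, ice cream,cake,,,tea"
pieces   = replaced.split(",")          # may contain padded and empty strings
result   = pieces.map(&:strip).reject(&:empty?)

p result  # => ["coffee", "ice cream", "cake", "tea"]
```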
Can't split/strip by space in a string in Ruby because it's an NBSP character
You should split on all whitespaces, including the non-ASCII ones:
a, b = str.split(/[[:space:]]/)
I'm assuming you are using Ruby 1.9+ and that your str has the right encoding (e.g. UTF-8). As explained in the regex reference, \s matches only ASCII spaces, while [[:space:]] will match all Unicode spaces (same for \d vs [[:digit:]], etc.).
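A quick way to see the difference, assuming a UTF-8 string with a non-breaking space (U+00A0) between two words:

```ruby
# "\u00A0" is the NBSP character; the literal below is UTF-8 encoded.
str = "foo\u00A0bar"

ascii_split   = str.split(/\s/)           # \s does not match NBSP
unicode_split = str.split(/[[:space:]]/)  # [[:space:]] does

p ascii_split.length  # => 1
p unicode_split       # => ["foo", "bar"]
```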
What is the best way to split a string to get all the substrings by Ruby?
def split_word(s)
  (0..s.length).inject([]) { |ai, i|
    (1..s.length - i).inject(ai) { |aj, j|
      aj << s[i, j]
    }
  }.uniq
end
And you can also consider using Set instead of Array for the result.
PS: Here's another idea, based on array product:
def split_word(s)
  indices = (0...s.length).to_a
  indices.product(indices).reject { |i, j| i > j }.map { |i, j| s[i..j] }.uniq
end
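The Set suggestion could look roughly like this. The method name substrings is hypothetical, chosen to avoid clashing with split_word above:

```ruby
require 'set'

# Collect every non-empty substring of s; the Set discards duplicates
# as they are inserted, so no separate uniq pass is needed.
def substrings(s)
  result = Set.new
  (0...s.length).each do |start|
    (1..s.length - start).each do |len|
      result << s[start, len]
    end
  end
  result
end

subs = substrings("aba")
p subs.to_a  # => ["a", "ab", "aba", "b", "ba"]
```

Note that a string of length n has n(n+1)/2 substrings before deduplication, so this grows quadratically with the input.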
Using string .split (and a regular expression) to check for inner quotes
You want to use this regular expression (see on rubular.com):
/"[^"]*"|'[^']*'|[^"'\s]+/
This regex matches the tokens instead of the delimiters, so you'd want to use scan instead of split.
The […] construct is called a character class. [^"] is "anything but the double quote".
There are essentially 3 alternates:
- "[^"]*" - double quoted token (may include spaces and single quotes)
- '[^']*' - single quoted token (may include spaces and double quotes)
- [^"'\s]+ - a token consisting of one or more of anything but quotes and whitespace
References
- regular-expressions.info/Character Class
Snippet
Here's a Ruby implementation:
s = %_foobar "your mom"bar'test course''test lesson'asdf_
puts s
puts s.scan(/"[^"]*"|'[^']*'|[^"'\s]+/)
The above prints (as seen on ideone.com):
foobar "your mom"bar'test course''test lesson'asdf
foobar
"your mom"
bar
'test course'
'test lesson'
asdf
See also
- Which style of Ruby string quoting do you favour?