Why Does Ruby's String#to_i Sometimes Return 0 When the String Contains a Number

Why does Ruby's String#to_i sometimes return 0 when the string contains a number?

The to_i method returns the integer formed by the parseable digits at the start of a string. Your first string starts with a digit, so to_i returns that number; the second string doesn't start with a digit, so 0 is returned. BTW, leading whitespace is ignored, so " 123abc".to_i returns 123.
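
A few more cases, as a small sketch of that rule (the sample strings are made up for illustration):

```ruby
# String#to_i reads an optional sign and digits from the start of the string
# and stops at the first character that cannot continue the number.
leading_digits = "123abc".to_i      # digits first, so they are parsed
leading_letter = "abc123".to_i      # no digits at the start, so 0
leading_space  = " 123abc".to_i     # leading whitespace is skipped
negative       = "-42 apples".to_i  # a leading sign is accepted
empty          = "".to_i            # nothing to parse, so 0

p [leading_digits, leading_letter, leading_space, negative, empty]
# => [123, 0, 123, -42, 0]
```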

How to return a number using to_i when the input starts with $

Meditate on these:

"$1\n"[1..-1].to_i # => 1
"$1.99\n"[1..-1].to_i # => 1
"$100.99\n"[1..-1].to_i # => 100

This breaks down when there's a leading -:

"-$100.99\n"[1..-1].to_i # => 0

That can be fixed using sub instead of a slice:

"-$100.99\n".sub('$', '').to_i # => -100

The problem is that money amounts are not all integers, so to_i is probably not what you want. Instead, use to_f:

"$1\n".sub('$', '').to_f # => 1.0
"$1.99\n".sub('$', '').to_f # => 1.99
"$100.99\n".sub('$', '').to_f # => 100.99
"-$100.99\n".sub('$', '').to_f # => -100.99
"$-100.99\n".sub('$', '').to_f # => -100.99

Note: It isn't necessary to use chomp to remove the trailing newline. to_i and to_f stop at the first character that can't be part of the number:

"1\n".to_i # => 1
"1.99\n".to_f # => 1.99

"1 1\n".to_i # => 1
"1.99 2\n".to_f # => 1.99

Again, because of this behavior, the trailing "\n", " 1\n" or " 2\n" is ignored.
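
Pulling the pieces together, here is a minimal sketch of a helper. The name parse_dollars is hypothetical, and it assumes at most one "$" and no thousands separators:

```ruby
# Hypothetical helper: drop the first "$" and let to_f parse the rest.
# Assumes input like "$1.99", "-$100.99" or "$-100.99", optionally
# followed by a newline; thousands separators are not handled.
def parse_dollars(str)
  str.sub('$', '').to_f
end

p parse_dollars("$1\n")        # => 1.0
p parse_dollars("-$100.99\n")  # => -100.99
p parse_dollars("$-100.99\n")  # => -100.99
p parse_dollars("free")        # => 0.0, since nothing is parseable
```

Note that unparseable input silently becomes 0.0, which may or may not be what you want.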

Ruby Nil and Zero

NilClass defines #to_i for the same reason it defines a #to_a that returns []. It's giving you something of the right type but an empty sort of value.

This is actually quite useful. For example:

<%= big.long.expr.nil? ? "" : big.long.expr %>

becomes:

<%= big.long.expr %>

Much nicer! (ERB calls #to_s on the result, which for nil is "".) And:

if how.now.brown.cow && how.now.brown.cow[0]
  how.now.brown.cow[0]
else
  0
end

becomes:

how.now.brown.cow.to_a[0].to_i
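
As a quick sketch of why that one-liner works, here are nil's conversions side by side. The method first_or_zero is a hypothetical stand-in for the how.now.brown.cow chain:

```ruby
# nil's "short" conversions return an empty value of the requested type.
p nil.to_s  # => ""
p nil.to_a  # => []
p nil.to_i  # => 0
p nil.to_f  # => 0.0

# Hypothetical stand-in for how.now.brown.cow: the argument may be nil
# or an array.
def first_or_zero(maybe_array)
  maybe_array.to_a[0].to_i
end

p first_or_zero(nil)     # => 0
p first_or_zero([])      # => 0
p first_or_zero([7, 8])  # => 7
```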

The short conversions (to_i, to_s, to_a) exist for when only a representation is needed. The long conversions (to_int, to_str, to_ary) are the ones that the Ruby core methods call, and they require the object to already be very close to the target type. Use them if you want a type check.

That is:

thing.to_int # only works when thing is almost an Integer already; nil raises NoMethodError

thing.to_i # this works for anything that cares to define a conversion
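
The difference can be seen in a short sketch. The array and its values are made up for illustration:

```ruby
# to_i is defined widely, including on nil and String.
p nil.to_i   # => 0
p "12".to_i  # => 12

# to_int is only defined on objects that are already numeric.
p 3.7.to_int                 # => 3
p nil.respond_to?(:to_int)   # => false
p "12".respond_to?(:to_int)  # => false

# Core methods use the implicit conversion, so Array#[] accepts a Float
# index (via to_int) but rejects nil or a String.
list = [10, 20, 30]
p list[1.9]                  # => 20

begin
  list["1"]
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```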

Ruby How to convert string to integer without .to_i

You can use Kernel::Integer:

Integer("219")
#=> 219
Integer("21cat9")
# ArgumentError: invalid value for Integer(): "21cat9"

Sometimes this method is used as follows:

def convert_to_i(str)
  begin
    Integer(str)
  rescue ArgumentError
    nil
  end
end

convert_to_i("219")
#=> 219
convert_to_i("21cat9")
#=> nil
convert_to_i("1_234")
#=> 1234
convert_to_i(" 12 ")
#=> 12
convert_to_i("0b11011") # binary representation
#=> 27
convert_to_i("054") # octal representation
#=> 44
convert_to_i("0xC") # hexadecimal representation
#=> 12

Some use an "inline rescue" (though it is less selective, as it rescues any StandardError, not just ArgumentError):

def convert_to_i(str)
  Integer(str) rescue nil
end
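
To make the difference in selectivity concrete, here is a small sketch with both variants under hypothetical names (convert_strict and convert_loose), fed a nil argument:

```ruby
# Rescues only ArgumentError, i.e. strings that are not valid integers.
def convert_strict(str)
  Integer(str)
rescue ArgumentError
  nil
end

# Inline rescue: swallows any StandardError.
def convert_loose(str)
  Integer(str) rescue nil
end

p convert_strict("21cat9")  # => nil
p convert_loose("21cat9")   # => nil

# Integer(nil) raises TypeError, not ArgumentError.
p convert_loose(nil)        # => nil, the TypeError is swallowed

begin
  convert_strict(nil)
rescue TypeError => e
  puts "TypeError reached the caller: #{e.message}"
end
```

Whether swallowing the TypeError is acceptable depends on whether a nil argument is a bug you would want to hear about.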

There are similar Kernel methods, Float and Rational, to convert a string to a float or rational.
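
A brief sketch of those strict conversions follows. The exception: false keyword shown at the end assumes Ruby 2.6 or later:

```ruby
p Float("1.5")      # => 1.5
p Rational("1/3")   # => (1/3)
p Rational("0.75")  # => (3/4)

begin
  Float("1.5abc")
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end

# Ruby 2.6 and later accept exception: false, which returns nil
# instead of raising when the string cannot be converted.
p Integer("21cat9", exception: false)  # => nil
p Float("1.5abc", exception: false)    # => nil
```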

Use case for `nil.to_i #=> 0`, `"".to_i #=> 0`, `nil.to_f #=> 0.0`, `"".to_f #=> 0.0`

Two situations:

  1. Starting value.

    @count = @count.to_i.next

    as opposed to

    @count = 0
    @count += 1

  2. Convenience with transformations.

    Oftentimes you get back a collection in which some values are nil. Usually, you want to either remove those or count them as default values.

    Let's say you have a method that is supposed to calculate the average score of the highest-rated post of each Stack Overflow user. User#highest_rated will return nil if the user has no posts at all:

    users.map{ |user| user.highest_rated.to_i }.reduce(:+) / users.size

    This was not the perfect example, but it happens in everyday transformations all the time.
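
Both situations can be sketched in a few lines. The User struct, the Counter class and the sample scores are hypothetical stand-ins, not part of any real model:

```ruby
# Hypothetical stand-in for the User model; highest_rated holds the score
# of the user's best post, or nil when the user has no posts.
User = Struct.new(:name, :highest_rated)

users = [
  User.new("ann", 12),
  User.new("bob", nil),
  User.new("cy", 3)
]

# nil.to_i is 0, so users without posts count as a score of 0.
average = users.map { |user| user.highest_rated.to_i }.reduce(:+) / users.size
p average  # => 5 (integer division of 15 by 3)

# The starting-value idiom: the first call sees nil and starts from 0.
class Counter
  attr_reader :count

  def increment
    @count = @count.to_i.next
  end
end

counter = Counter.new
counter.increment
counter.increment
p counter.count  # => 2
```

Note that the average uses integer division; call to_f on the sum first if you need a fractional result.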


Why not raise an exception or return nil?

Ruby excels at usability. Having to check for nil values everywhere would look a little clumsy. In fact, in a lot of cases where I use these operations, it is precisely to make sure nil values are transformed into their default counterparts.

Also, there is the expectation that to_i (for example) will return an integer. Returning nil would be a slight WTF moment.

How to specify a range in Ruby

Regex really is the right way to do this. It's specifically for testing patterns in strings. This is how you'd test "do all characters in this string fall in the range of characters 0-9?":

pin.match(/\A[0-9]+\z/)

This regex asks "Does this string consist of one or more of the characters 0-9 and nothing else?" The \A and \z are start-of-string and end-of-string anchors, and [0-9]+ matches one or more characters in that range.

You could even do your entire check in one line of regex:

pin.match(/\A([0-9]{4}|[0-9]{6})\z/)

Which says "Does this string consist of the characters 0-9 repeated exactly 4 times, or the characters 0-9, repeated exactly 6 times?"
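
Wrapped in a method, that check might look like the sketch below. The name valid_pin? is hypothetical, and String#match? assumes Ruby 2.4 or later:

```ruby
# Hypothetical helper: true when pin is exactly 4 or exactly 6 digits.
def valid_pin?(pin)
  pin.match?(/\A([0-9]{4}|[0-9]{6})\z/)
end

p valid_pin?("1234")    # => true
p valid_pin?("123456")  # => true
p valid_pin?("12345")   # => false, wrong length
p valid_pin?("12a4")    # => false, non-digit character
p valid_pin?("1234\n")  # => false, \z does not allow a trailing newline
```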

Ruby's String#count method does something similar to this, though it just counts the number of occurrences of the characters passed, and it uses something similar to regex ranges to allow you to specify character ranges.

The sequence c1-c2 means all characters between c1 and c2.

Thus, it expands the parameter "0-9" into the list of characters "0123456789", and then it tests how many of the characters in the string match that list of characters.

This will work to verify that a certain number of digits exist in the string, and the length checks let you implicitly test that no other characters exist in the string. However, regexes let you assert that directly, by ensuring that the whole string matches a given pattern, including length constraints.
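
For comparison, a count-based version of the same check could be sketched as follows. The method name valid_pin_by_count? is hypothetical:

```ruby
# String#count with a "0-9" range counts the digit characters.
p "a1b2".count("0-9")  # => 2
p "1234".count("0-9")  # => 4

# Hypothetical count-based check: the length must be 4 or 6, and every
# character must be a digit (digit count equals total length).
def valid_pin_by_count?(pin)
  [4, 6].include?(pin.length) && pin.count("0-9") == pin.length
end

p valid_pin_by_count?("1234")   # => true
p valid_pin_by_count?("12a4")   # => false
p valid_pin_by_count?("12345")  # => false
```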


