Count, Size, Length...Too Many Choices in Ruby

For arrays and hashes, size is an alias for length. They are synonyms and do exactly the same thing.

count is more versatile: it can take an element or a block (a predicate) and count only the items that match.

> [1,2,3].count{|x| x > 2 }
=> 1

If you call count with no argument and no block, it has basically the same effect as calling length. There can be a performance difference, though.
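
As a quick illustration of the three call styles, here is a minimal sketch in plain Ruby. The array contents and variable names are made up for the example:

```ruby
numbers = [1, 2, 2, 3]

# length and size are aliases, so they always agree.
total_by_length = numbers.length
total_by_size   = numbers.size

# With no argument and no block, count returns the same number.
total_by_count = numbers.count

# With an argument, count returns how many elements equal it.
twos = numbers.count(2)

# With a block, count returns how many elements satisfy it.
greater_than_one = numbers.count { |x| x > 1 }

puts total_by_length, total_by_size, total_by_count, twos, greater_than_one
```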

We can see from the source code for Array that they do almost exactly the same thing. Here is the C code for the implementation of array.length:

static VALUE
rb_ary_length(VALUE ary)
{
    long len = RARRAY_LEN(ary);
    return LONG2NUM(len);
}

And here is the relevant part from the implementation of array.count:

static VALUE
rb_ary_count(int argc, VALUE *argv, VALUE ary)
{
    long n = 0;

    if (argc == 0) {
        VALUE *p, *pend;

        if (!rb_block_given_p())
            return LONG2NUM(RARRAY_LEN(ary));

        // etc..
    }
}

The code for array.count does a few extra checks but in the end calls the exact same code: LONG2NUM(RARRAY_LEN(ary)).

Hashes, on the other hand, don't seem to implement their own optimized version of count, so the implementation from Enumerable is used, which iterates over all the elements and counts them one by one.
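
To see what falling back to Enumerable#count means in practice, here is a small sketch. TracingCollection is a hypothetical class written only for this illustration. It says nothing about Hash internals; it just shows that Enumerable#count gets its answer by walking each:

```ruby
# Hypothetical class, used only to observe how Enumerable#count behaves.
class TracingCollection
  include Enumerable

  attr_reader :yielded

  def initialize(items)
    @items = items
    @yielded = 0
  end

  # Enumerable builds count (and its other methods) on top of each.
  def each
    @items.each do |item|
      @yielded += 1 # record every element handed out
      yield item
    end
  end
end

collection = TracingCollection.new(%w[a b c])
result = collection.count

puts result             # 3
puts collection.yielded # 3: every element was visited to produce the count
```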

In general I'd advise using length (or its alias size) rather than count if you want to know how many elements there are altogether.


With ActiveRecord, on the other hand, there are important differences. Check out this post:

  • Counting ActiveRecord associations: count, size or length?

Are there performance reasons to prefer size over length or count in Ruby?

That preference seems a bit wrong. For the most commonly used classes (Array, Hash, String), size and length are either aliases or implemented in the same way (you can check the implementation of each method), and they run in O(1).

count however:

  • For Hash, it is not redefined and falls back to Enumerable#count, which means its complexity is O(n), since all key-value pairs are traversed.
  • For Array, it is redefined (Array#count), and at the very least it checks the number of arguments given, which is something neither Array#size nor Array#length has to do.
  • For String, it does something different altogether: it counts the characters that belong to a given set, so it is not a substitute for length.

All in all, I would say that "Prefer size or length over count for performance reasons" would be more accurate.
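
The differences between the three classes can be sketched as follows. The hash contents and variable names are invented for the example:

```ruby
stock = { apples: 3, pears: 0, plums: 7 }

# size/length report the number of key-value pairs directly.
pairs = stock.size

# count with no block gives the same number, but gets there via Enumerable.
pairs_by_count = stock.count

# count with a block receives each key-value pair.
in_stock = stock.count { |_name, quantity| quantity > 0 }

word = "banana"

# String#length is the number of characters.
characters = word.length

# String#count needs at least one argument: a set of characters to look for.
a_or_n = word.count("an")

puts pairs, pairs_by_count, in_stock, characters, a_or_n
```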

Which is faster count or length?

In Ruby, count, length and size all do pretty much the same thing for arrays.

When using ActiveRecord objects, however, count is better than length, and size is even better.

find_all_by_country always returns a plain array, so you shouldn't use that method here. Instead, use where(country: params[:country]), which returns a relation.

I'll let Code School's Rails Best Practices slide nº 93 speak for itself (and hope they don't get mad at me for reproducing it here).

Just in case the image gets taken down, basically:

  1. length always pulls all the records and then calls .length on the array - bad

  2. count always does a count query - good

  3. size uses the counter cache if you have one, otherwise does a count query - best

size, length and count in Rails

length will load all your objects just to count them, issuing something like:

select * from addresses...

and then returning the number of results.
As you can imagine, this performs badly.

count will just issue

select count(*) from addresses...

which is better, because we are not loading all addresses just to count them.

size is smarter: it checks whether the association is already loaded and, if so, returns its length (without issuing a call to the database).

size also checks for a counter_cache. If a counter cache is set up and your user model has a column named addresses_count, then size will use that column for the count, so there is no need to issue a count on the addresses table.

If all else fails, size will issue a select count(*) on the database.
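
The decision order described above can be sketched in plain Ruby. This is a simulation only: FakeAssociation, its queries log and its counter_cache argument are hypothetical names invented for this illustration, not ActiveRecord code, and the real implementation handles more cases.

```ruby
# Hypothetical stand-in for an association; NOT ActiveRecord code.
class FakeAssociation
  attr_reader :queries

  def initialize(rows, counter_cache: nil)
    @rows = rows                   # pretend these rows live in the database
    @counter_cache = counter_cache # pretend this is the addresses_count column
    @loaded = nil                  # rows already fetched into memory, if any
    @queries = []                  # log of the simulated SQL statements
  end

  # length: load every row, then measure the resulting array.
  def length
    load_rows.length
  end

  # count: always ask the database to count.
  def count
    @queries << "SELECT COUNT(*)"
    @rows.length
  end

  # size: loaded rows first, then the counter cache, then a COUNT query.
  def size
    return @loaded.length if @loaded
    return @counter_cache if @counter_cache

    count
  end

  private

  def load_rows
    return @loaded if @loaded

    @queries << "SELECT *"
    @loaded = @rows.dup
  end
end

fresh = FakeAssociation.new(%w[home work])
fresh.size   # nothing loaded, no cache: falls through to a COUNT query
fresh.length # loads all rows
fresh.size   # answered from the loaded rows, no new query

cached = FakeAssociation.new(%w[home work], counter_cache: 2)
cached_size = cached.size # answered from the counter cache, no query at all

p fresh.queries  # ["SELECT COUNT(*)", "SELECT *"]
p cached.queries # []
```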

Confusing difference between `count` and `size`

Check this documentation out on size for Rails: Rails ActiveRecord Size Documentation

Also the documentation for count is here as well: Rails ActiveRecord Count Documentation


Ruby and ActiveRecord both have methods named length, size, and count, and they behave quite differently from each other.




Your first example, @order.products.count, calls the Rails ActiveRecord count method (counting records in the DB), while your other example, @order.products.to_a.count, calls the Ruby count method (counting items in an in-memory container, with no connection to the DB).




So, to answer your question: with @order = Order.new(products: [my_product]) you are only creating the object in memory, not in the DB. The documentation on size linked above explains why it can return either the length of the in-memory collection or the count of records in the DB, depending on the context of its use.

Hope this helps!

Would someone help me understand the Ruby manual's example of str.counts?

It counts the number of occurrences of the characters you pass in as arguments. The asterisks mark the characters that are counted.

a = "hello world"

a.count("lo")          # 5, counts [l, o]
# hello world
#   ***  * *

# counts all [h, e, o], but not "l" because of the caret
a.count "hello", "^l"  # 4
# hello world
# **  *  *

a.count "ej-m"         # 4, counts e and the characters in range j, k, l, m
# hello world
#  ***     *

There are a few special characters:

  • A leading caret ^ negates the set.
  • A - between two characters means a range.
  • A \ escapes either of the other two and is itself ignored.
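
A few more cases exercising those special characters, as a minimal sketch. The strings and variable names are chosen just for this example:

```ruby
greeting = "hello world"

# Leading caret: count every character that is NOT "l" (the space included).
not_l = greeting.count("^l")

# Range: count characters between "h" and "m" (here: h, l, l, l).
h_to_m = greeting.count("h-m")

# Several arguments are intersected: lowercase letters that are not l or o.
neither_l_nor_o = greeting.count("a-z", "^lo")

# A caret that is not the first character is an ordinary character.
o_or_caret = "hello^world".count("o^")

# An escaped dash is an ordinary character, not a range.
with_dash = "hello-world".count("a\\-eo")

puts not_l, h_to_m, neither_l_nor_o, o_or_caret, with_dash
```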

Loop using count + modulo (when applicable) in Ruby

Yes. This is very well supported by Rails; you do not have to roll your own code for finding batches of records.

The easiest way is to simply use find_each, which seamlessly loads 1000 records at a time:

Model.find_each do |model|
  # ...
end

The underlying mechanism is find_in_batches with a default batch size of 1000. You can use find_in_batches directly, but you do not have to; find_each is sufficient:

Model.find_in_batches(batch_size: 100) do |batch|
  batch.each do |model|
    # ...
  end
end
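
For comparison, the same "process a few at a time" shape exists for plain in-memory collections through Enumerable#each_slice, with no counter or modulo arithmetic. This is only an analogy in plain Ruby, not how find_in_batches is implemented, and the data here is made up:

```ruby
records = (1..7).to_a # stand-in for rows; the real ones would come from the database

batches = []
records.each_slice(3) do |batch|
  # Each batch holds up to 3 items; the last one may be shorter.
  batches << batch
end

p batches # [[1, 2, 3], [4, 5, 6], [7]]
```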

