Difference between 'Range#include?' and 'Range#cover?'

What is the difference between `Range#include?` and `Range#cover?`?

The two methods are deliberately designed to do two slightly different things, and internally they are implemented very differently too. If you look at the sources in the documentation, you can see that .include? does a lot more than .cover?.

The .cover? method is related to the Comparable module, and checks whether an item would fit between the end points in a sorted list. It will return true even if the item is not in the set implied by the Range.

The .include? method is related to the Enumerable module, and checks whether an item is actually in the complete set implied by the Range. There is some finessing with numerics - Integer ranges are counted as including all the implied Float values (I'm not sure why).

These examples might help:

('a'..'z').cover?('yellow')
# => true

('a'..'z').include?('yellow')
# => false

('yellaa'..'yellzz').include?('yellow')
# => true
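
The numeric special case mentioned above can be checked the same way. This is only a minimal sketch; the variable name `digits` is made up for illustration.

```ruby
digits = (1..10)

# With numeric end points, include? compares against the end points
# instead of walking the integers, so a Float in between counts as included.
puts digits.include?(5.5)        # prints "true"
puts digits.cover?(5.5)          # prints "true"

# Walking the actual elements gives a different answer.
puts digits.to_a.include?(5.5)   # prints "false"
```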

Additionally, if you try

('aaaaaa'..'zzzzzz').include?('yellow')

you should notice it takes a much longer time than

('aaaaaa'..'zzzzzz').cover?('yellow')
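
A rough way to see the gap without waiting on the six-letter range is to time both calls on a smaller one. The sketch below is an illustration, not a benchmark: `time_it` is a hypothetical helper, and the four-letter range and the probe 'yell' are stand-ins chosen so the slow call still finishes quickly.

```ruby
# Hypothetical helper: returns [result, elapsed seconds] for a block.
def time_it
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

words = ('aaaa'..'zzzz') # 26**4 = 456,976 strings

included, include_secs = time_it { words.include?('yell') } # walks the range
covered,  cover_secs   = time_it { words.cover?('yell') }   # two comparisons

puts "include? => #{included} in #{include_secs.round(6)}s"
puts "cover?   => #{covered} in #{cover_secs.round(6)}s"
```

Both calls return true here; only the elapsed times differ, and the exact numbers depend on the machine.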

Difference between Range.include? and Range.member? in Ruby

They are aliases of each other. If you expand the source code in the docs you see they both refer to the same internal function.
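
A quick sanity check of that claim, as a sketch: the probe values below are arbitrary, and agreeing on a handful of inputs is of course weaker evidence than reading the source.

```ruby
letters = ('a'..'e')
probes  = ['c', 'cc', 'z', 3, nil]

# Both names should give the same answer for every probe value.
same = probes.all? { |v| letters.include?(v) == letters.member?(v) }
puts same # prints "true"
```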

What is the difference between regular for statement and range-based for statement in C++

The difference in your case is that the first version uses iterators (which is why cout << i << endl; does not work), while the second version (the range-based for loop) gives you either a copy, a reference, or a const reference to each element.

So this:

for(auto i = vec.begin(); i != vec.end(); i++)
{
    cout << i << endl; // should be *i
}

uses iterators (vec.begin() gives you an iterator to the first element).

Whereas this:

for(auto i : vec)
{
    cout << i << endl;
}

uses copies of elements in your vector.

While this:

for(auto& i : vec)
{
    cout << i << endl;
}

uses references to your vector elements.

Why doesn't `Range#cover?` raise an exception when comparison fails?

The docs leave out an implementation detail: range_cover is implemented in terms of r_less (via r_cover_p), and the comment on r_less says:

/* compares _a_ and _b_ and returns:
 * < 0: a < b
 * = 0: a = b
 * > 0: a > b or non-comparable
 */

Here is the source of r_cover_p:

static VALUE
r_cover_p(VALUE range, VALUE beg, VALUE end, VALUE val)
{
    if (r_less(beg, val) <= 0) {
        int excl = EXCL(range);
        if (r_less(val, end) <= -excl)
            return Qtrue;
    }
    return Qfalse;
}

As we can see, a positive number returned from either of the r_less invocations results in Qfalse.

Now, the reason why the doc doesn't mention it, I think, is to keep it light. Normally (99.9999% of cases), you're supposed to compare comparable things, right? And in the odd case you don't, you still get a correct answer ("this Time does not belong to this range of integers").
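
Seen from the Ruby side, the "non-comparable" branch looks like this. A small sketch; the range and the probe values are arbitrary examples.

```ruby
numbers = (1..10)

# Integer#<=> returns nil for values it cannot be compared with ...
p(1 <=> Time.now)             # prints "nil"

# ... and cover? treats that as "not covered" instead of raising.
p numbers.cover?(Time.now)    # prints "false"
p numbers.cover?('5')         # prints "false"
p numbers.cover?(nil)         # prints "false"
p numbers.cover?(5)           # prints "true"
```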

Range and the mysteries that it covers going out of my range

Why does ([1]..[10]).cover?([9,11,335]) return true

Let's take a look at the source. In Ruby 1.9.3 we can see the following definition:

static VALUE
range_cover(VALUE range, VALUE val)
{
    VALUE beg, end;

    beg = RANGE_BEG(range);
    end = RANGE_END(range);
    if (r_le(beg, val)) {
        if (EXCL(range)) {
            if (r_lt(val, end))
                return Qtrue;
        }
        else {
            if (r_le(val, end))
                return Qtrue;
        }
    }
    return Qfalse;
}

If the beginning of the range isn't less than or equal to the given value, cover? returns false. Here "less than or equal to" is determined by the r_le function, which uses the <=> operator for comparison. Let's see how it behaves for arrays:

[1] <=> [9,11,335] # => -1

So apparently [1] is indeed less than [9,11,335]. As a result we go into the body of the first if. Inside, we check whether the range excludes its end and do a second comparison, once again using the <=> operator.

[10] <=> [9,11,335] # => 1

Therefore [10] is greater than [9,11,335]. The method returns true.
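
Both comparisons can be replayed in a Ruby session. This is a sketch of the same reasoning; the extra probe [10, 0] is an added illustration that is not part of the original question.

```ruby
arrays = ([1]..[10])

# Array#<=> compares element by element, so here the first elements decide.
p([1] <=> [9, 11, 335])        # prints "-1": the lower bound check passes
p([9, 11, 335] <=> [10])       # prints "-1": the upper bound check passes
p arrays.cover?([9, 11, 335])  # prints "true"

# When the shared elements tie, the longer array is the greater one.
p([10, 0] <=> [10])            # prints "1"
p arrays.cover?([10, 0])       # prints "false"
```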

Why do you see ArgumentError: bad value for range

The function responsible for raising this error is range_failed. It is called only when range_check returns nil. When does that happen? When the beginning and the end of the range are not comparable (yes, once again in terms of our dear friend, the <=> operator).

true <=> false # => nil

true and false are uncomparable. The range cannot be created and the ArgumentError is raised.
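
The same failure can be triggered on purpose. In this sketch `try_range` is a hypothetical helper that returns either the new range or the error message.

```ruby
# Hypothetical helper: build a range, or report why it could not be built.
def try_range(first, last)
  Range.new(first, last)
rescue ArgumentError => e
  e.message
end

p try_range(1, 10)         # prints "1..10"
p try_range(true, false)   # prints "\"bad value for range\""
p try_range(1, 'a')        # prints "\"bad value for range\""
```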

On a closing note, the dependence of Range#cover? on <=> is in fact expected and documented behaviour. See RubySpec's specification of cover?.

Difference between '..' (double-dot) and '...' (triple-dot) in range generation?

The documentation for Range says this:

Ranges constructed using .. run from the beginning to the end inclusively. Those created using ... exclude the end value.

So a..b is like a <= x <= b, whereas a...b is like a <= x < b.


Note that, while to_a on a Range of integers gives a collection of integers, a Range is not a set of values, but simply a pair of start/end values:

(1..5).include?(5)          #=> true
(1...5).include?(5)         #=> false

(1..4).include?(4.1)        #=> false
(1...5).include?(4.1)       #=> true
(1..4).to_a == (1...5).to_a #=> true
(1..4) == (1...5)           #=> false



The docs used to not include this, instead requiring reading the Pickaxe’s section on Ranges. Thanks to @MarkAmery (see below) for noting this update.

Minimal interface for Range#include? support

The documentation is incorrect or (rather, I suspect) outdated. Range#cover? works the way you expect [bold emphasis mine]:

cover?(object) → true or false

Returns true if the given argument is within self, false otherwise.

With non-range argument object, evaluates with <= and <.

The documentation for Range#include? contains a somewhat ominous statement [bold emphasis mine]:

If begin and end are numeric, include? behaves like cover?

[…]

But when not numeric, the two methods may differ:

('a'..'d').include?('cc') # => false
('a'..'d').cover?('cc') # => true

Here you can see the difference: Range#cover? evaluates to true because 'a' <= 'cc' && 'cc' <= 'd', whereas Range#include? evaluates to false because ('a'..'d').to_a == ['a', 'b', 'c', 'd'] and thus ('a'..'d').each.include?('cc') is falsey.

Note that the introductory example using Time still works because Time is explicitly special-cased in the spec.

There is a spec which says both Range#include? and Range#cover? use <=>, but it is only tested with Integers, for which we know from the ominous documentation above that Range#include? and Range#cover? behave the same.
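
To make the interface question concrete, here is a sketch with two made-up classes (`Tick` and `SteppingTick` are hypothetical and exist only for this illustration). It assumes the behavior described above: cover? needs nothing beyond <=>, while include? on end points that are neither numeric nor String falls back to iterating the range and therefore also needs succ.

```ruby
# Comparable only: enough for cover?, not enough for iteration.
class Tick
  include Comparable
  attr_reader :n

  def initialize(n)
    @n = n
  end

  def <=>(other)
    other.is_a?(Tick) ? n <=> other.n : nil
  end
end

# Adds succ, so a range of these can be walked element by element.
class SteppingTick < Tick
  def succ
    SteppingTick.new(n + 1)
  end
end

plain = Tick.new(1)..Tick.new(5)
p plain.cover?(Tick.new(3))                # prints "true"

begin
  plain.include?(Tick.new(3))
rescue TypeError => e
  puts "include? failed: #{e.message}"     # no succ, so the range cannot be iterated
end

stepping = SteppingTick.new(1)..SteppingTick.new(5)
p stepping.include?(SteppingTick.new(3))   # prints "true"
p stepping.include?(SteppingTick.new(9))   # prints "false"
```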

There is quite a lot of special-casing going on for Ranges and it is not the first time this has led to bugs and/or non-intuitive behavior:

  • Ruby: Can't Iterate From Time Despite Responding to Succ / Bug #18237 Remove unnecessary checks for Time in Range#each as per the comment / https://github.com/ruby/spec/pull/852 / https://github.com/ruby/ruby/pull/4928
  • https://bugs.ruby-lang.org/issues/18155
  • https://bugs.ruby-lang.org/issues/18577
  • https://bugs.ruby-lang.org/issues/18580
  • https://github.com/ruby/dev-meeting-log/blob/master/DevMeeting-2022-02-17.md#bug-18580-rangeinclude-inconsistency-for-beginless-string-ranges-zverok

Personally, I am not a big fan of all this special-casing. I assume it is done for performance reasons, but the way to get better performance is not to add weird special cases to the language specification, it is to remove them which makes the language simpler and thus easier to optimize. Or, put another way: at any given point in time, a compiler writer can either spend the time implementing weird special cases or awesome optimizations, but not both. XRuby, Ruby.NET, MacRuby, MagLev, JRuby, IronRuby, TruffleRuby, Rubinius, Topaz, and friends have shown that the way to get high-performance Ruby is a powerful compiler, not weird hand-rolled special-cased C code.

I would file a bug, if only to get some clarification into the docs and specs.

Determine if a range is completely covered by a set of ranges

You want a recursive query that finds the real ranges (0 to 60 and 80 to 100 in your case). We start with the given ranges and look for ranges that extend them. In the end we keep only the most extended ranges (e.g. the range 10 to 30 can be extended to 0 to 40 and then to 0 to 60, so we keep the widest range, 0 to 60).

with wider_ranges(a, b, grp) as
(
  select a, b, id from ranges
  union all
  select
    case when r1.a < r2.a then r1.a else r2.a end,
    case when r1.b > r2.b then r1.b else r2.b end,
    r1.grp
  from wider_ranges r1
  join ranges r2 on (r2.a < r1.a and r2.b >= r1.a)
                 or (r2.b > r1.b and r2.a <= r1.b)
)
, real_ranges(a, b) as
(
  select distinct min(a), max(b)
  from wider_ranges
  group by grp
)
select *
from tests
where exists
(
  select *
  from real_ranges
  where tests.a >= real_ranges.a and tests.b <= real_ranges.b
);

Rextester demo: http://rextester.com/BDJA16583

As requested this works in SQL Server, but is standard SQL, so it should work in about every DBMS featuring recursive queries.
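
For comparison, the same idea (collapse the given ranges into the widest ones, then test containment) can be sketched outside the database. This is not a translation of the query: it swaps the recursive CTE for a sort-and-merge pass, and the method names `merge_ranges` and `covered?` as well as the sample data are assumptions made up for the illustration.

```ruby
# Merge overlapping or touching [a, b] pairs into the widest ranges.
def merge_ranges(ranges)
  ranges.sort_by(&:first).each_with_object([]) do |(a, b), merged|
    if merged.any? && a <= merged.last[1]
      # Overlaps the previous merged range: extend its end if needed.
      merged.last[1] = [merged.last[1], b].max
    else
      merged << [a, b]
    end
  end
end

# A test range is covered if one merged range contains it completely.
def covered?(test, ranges)
  a, b = test
  merge_ranges(ranges).any? { |lo, hi| lo <= a && b <= hi }
end

ranges = [[10, 30], [0, 40], [20, 60], [80, 100]] # made-up sample data

p merge_ranges(ranges)        # prints "[[0, 60], [80, 100]]"
p covered?([5, 55], ranges)   # prints "true"
p covered?([50, 90], ranges)  # prints "false": the gap between 60 and 80 is not covered
```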

Sorting ranges of numbers to find one range that covers the values of the others using python

If you are only looking for a simple graphic representation to get a feel for the distribution of the intervals, you can simply use a plot with transparent lines to indicate highly populated intervals and vice versa.

For example as follows:

import matplotlib.pyplot as plt

intervals = [[0.5, 1.5],
             [2, 3],
             [0.2, 4]]

# Draw each interval as a thick, semi-transparent line; overlaps appear darker.
for interval in intervals:
    plt.plot(interval, [0, 0], 'b', alpha=0.2, linewidth=100)

plt.show()

Giving the following result:

[Sample image: the intervals drawn as overlapping semi-transparent bars]

This clearly indicates that the interval 1.5-2 is highly populated. I now see that I failed to copy your intervals correctly, but the principle is the same.


