Finding Number of Elements in One Vector That Are Less Than an Element in Another Vector

findInterval requires a to be sorted in non-decreasing order, so sort it first:

a <- sort(a)
## gives points less than or equal to b[i]
findInterval(b, a)
# [1] 1 3 3 4 5
## for strictly less than, subtract a small tolerance from b
## sqrt(.Machine$double.eps) is about 1.5e-8; this only works if distinct
## values in a and b differ by more than that tolerance
findInterval(b - sqrt(.Machine$double.eps), a)
# [1] 0 1 3 4 4
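
As a self-contained check of both counts, here is a minimal sketch with made-up vectors (the a and b below are illustrative, not the asker's data). It uses the left.open argument of findInterval, which, assuming a reasonably recent R version, gives the strict count without subtracting a tolerance.

```r
# Illustrative data (hypothetical, not from the original question)
a <- sort(c(4, 1, 2, 7, 2))   # findInterval needs a sorted vector: 1 2 2 4 7
b <- c(0, 2, 3, 7, 10)

# Number of elements of a that are <= each b[i]
le_counts <- findInterval(b, a)

# Number of elements of a that are strictly < each b[i]
# (left.open = TRUE makes the intervals open on the left)
lt_counts <- findInterval(b, a, left.open = TRUE)

print(le_counts)  # 0 3 3 5 5
print(lt_counts)  # 0 1 3 4 5
```

The strict variant avoids the tolerance entirely, so it should also behave when values in a and b are very close together.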

Count values in a vector less than each one of the elements in another vector

You can use singleton expansion with bsxfun. It is faster and more elegant than a loop, but also more memory-intensive, because it builds a full numel(r)-by-numel(d) logical matrix:

result = sum(bsxfun(@lt, r(:), d(:).'), 1);

In MATLAB R2016b and later, bsxfun can be dropped thanks to implicit singleton expansion:

result = sum(r(:)<d(:).', 1);
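
For readers following the R answers on this page, the same pairwise-comparison idea can be sketched with outer. This is an R analogue rather than a translation of the MATLAB answer, and the vectors r and d below are made up for illustration. It has the same memory trade-off, since the full comparison matrix is built.

```r
# Illustrative data (hypothetical)
r <- c(1, 5, 3, 8)
d <- c(4, 0, 9)

# length(r) x length(d) logical matrix: less_than[i, j] is r[i] < d[j]
less_than <- outer(r, d, "<")

# Column sums count, for each d[j], how many elements of r are smaller
result <- colSums(less_than)
print(result)  # 2 0 4
```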

An alternative approach is to use the histcounts function with the 'cumcount' normalization. Because d supplies the bin edges, this requires d to be sorted in increasing order:

result = histcounts(r(:), [-inf; d(:); inf], 'Normalization', 'cumcount');
result = result(1:end-1);

How many elements of a vector are smaller or equal to each element of this vector?

You can also use the *apply family:

sapply(x, function(i) sum(x <= i))
# [1] 1 3 7 6 5 4 3 8 9
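
The sapply call compares every element against every other, so it does quadratic work. A possible alternative, sketched here on a made-up vector and assuming x has no NA values, is rank with ties.method = "max", which assigns tied values the highest rank in their group and therefore equals the number of elements less than or equal to each one.

```r
# Illustrative data (hypothetical)
x <- c(3, 1, 4, 1, 5)

# Quadratic version from the answer above
by_sapply <- sapply(x, function(i) sum(x <= i))

# Sort-based version: the "max" rank of a value is the count of elements <= it
by_rank <- rank(x, ties.method = "max")

print(by_sapply)  # 3 2 4 2 5
print(by_rank)    # 3 2 4 2 5
```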

How to tell what is in one vector and not another?

You can use the setdiff() (set difference) function, which returns the elements of x that are not in y:

> setdiff(x, y)
[1] 1
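
Since x and y are not shown above, here is a small self-contained sketch with made-up vectors. It illustrates two things worth knowing: setdiff is not symmetric, and it drops duplicates, whereas indexing with %in% keeps them.

```r
# Illustrative data (hypothetical)
x <- c(1, 1, 2, 3)
y <- c(2, 3, 4)

in_x_not_y <- setdiff(x, y)        # unique elements of x absent from y
in_y_not_x <- setdiff(y, x)        # the reverse direction differs
keep_dups  <- x[!(x %in% y)]       # same filter, duplicates preserved

print(in_x_not_y)  # 1
print(in_y_not_x)  # 4
print(keep_dups)   # 1 1
```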

Find the first n elements of one vector which contain all the elements of another vector

One option could be:

max(match(vecB, vecA))

Results for different situations:

vecB <- 1:3
vecA <- c(1, 2, 2, 1, 3, 2)

[1] 5

vecB <- 1:3
vecA <- c(3, 2, 2, 1)

[1] 4

vecB <- 1:3
vecA <- c(2, 2, 1)

[1] NA
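
The three cases above can be reproduced with a small wrapper. The function name first_n_covering is hypothetical, and returning 0 for an empty vecB is an assumption made here, because max over an empty match result would otherwise give -Inf with a warning.

```r
# Hypothetical helper: smallest n such that vecA[1:n] contains all of vecB,
# or NA if some element of vecB never appears in vecA
first_n_covering <- function(vecA, vecB) {
  if (length(vecB) == 0L) return(0L)  # assumption: nothing to cover
  # match() gives the first position of each vecB element in vecA (NA if absent);
  # the largest of those positions is where the last needed element shows up
  max(match(vecB, vecA))
}

print(first_n_covering(c(1, 2, 2, 1, 3, 2), 1:3))  # 5
print(first_n_covering(c(3, 2, 2, 1), 1:3))        # 4
print(first_n_covering(c(2, 2, 1), 1:3))           # NA
```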

Find each element that is less than some element to its right

Your algorithm is slow because any(...) has to check n items on the first iteration, n-1 items on the second, and so on down to a single item on the last iteration. Overall it does roughly n^2/2 comparisons, so its running time is quadratic in the length of the input vector.

One solution that is linear in time and memory is to first compute a vector holding the maximum from each position to the end, which takes a single backwards pass. You could call this a reversed cumulative maximum; it is written as a loop here, although newer MATLAB releases may offer cummax(x, 'reverse') for the same result. This vector is then compared directly to x (untested):

% calculate vector mx for which mx(i) = max(x(i:end))
mx = zeros(size(x));
mx(end) = x(end);
for i = length(x)-1:-1:1 % iterate backwards
    mx(i) = max(x(i), mx(i+1));
end

for i = 1:length(x) - 1
    if mx(i) > x(i)
        do_such_and_such(i);
    end
end

In case you don't care about the order in which do_such_and_such is executed, these for loops can even be combined like so:

mx = x(end);
for i = length(x)-1:-1:1 % iterate backwards
    if x(i) < mx
        do_such_and_such(i);
    end
    mx = max(x(i), mx); % maximum of x(i:end)
end
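
For comparison with the R answers elsewhere on this page, the reversed cumulative maximum can be sketched in R without an explicit loop by reversing, applying cummax, and reversing back. The input vector is made up, and this is an R analogue of the idea rather than the MATLAB code above.

```r
# Illustrative data (hypothetical)
x <- c(3, 1, 4, 1, 5, 2)

# mx[i] is the maximum of x[i], x[i+1], ..., x[length(x)]
mx <- rev(cummax(rev(x)))

# x[i] < mx[i] holds exactly when some element to the right of i is larger
idx <- which(x < mx)

print(mx)   # 5 5 5 5 5 2
print(idx)  # 1 2 3 4
```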

How many elements in a vector are greater than x without using a loop

Use length or sum:

> length(x[x > 10])
[1] 2
> sum(x > 10)
[1] 2

In the first approach, you create a vector containing only the values that match your condition, and then retrieve its length.

In the second approach, you create a logical vector that states whether each value matches the condition (TRUE) or doesn't (FALSE). Since TRUE and FALSE are coerced to 1 and 0, sum gives the count directly.

Because the first approach has to subset before counting, I am almost certain that the second approach is faster.
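
The two approaches agree on clean data but behave differently when NA is present, which may matter more than speed. The sketch below uses made-up vectors to show the difference.

```r
# Illustrative data (hypothetical)
x <- c(4, 12, 9, 15)

n_by_length <- length(x[x > 10])   # subset, then count
n_by_sum    <- sum(x > 10)         # count TRUE values directly
print(c(n_by_length, n_by_sum))    # 2 2

# With a missing value the results diverge
x_na <- c(x, NA)
n_length_na <- length(x_na[x_na > 10])          # the NA index yields an NA element
n_sum_na    <- sum(x_na > 10)                   # NA propagates through sum
n_sum_narm  <- sum(x_na > 10, na.rm = TRUE)     # drop NA before summing
print(n_length_na)  # 3
print(n_sum_na)     # NA
print(n_sum_narm)   # 2
```

If missing values are possible, sum with na.rm = TRUE is probably the safest of the three.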


