Ruby Equivalent of Numpy

Numpy equivalents of Ruby array functions

Initially this looked like a bincount or histogram, but the output is the bins where each value fits, not the number of items per bin:

In [3]: eq_width_bin(data,3)                                                    
Out[3]: [1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1]

Your bins:

In [10]: np.linspace(np.min(data),np.max(data),4)                               
Out[10]: array([ 10., 50., 90., 130.])

Since the bins start at 10 and each is 40 wide, we can identify the bin for each value with a simple integer division:

In [12]: (data-10)//40                                                          
Out[12]: array([1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 3, 1])

and fold that 3 (the maximum value, which lands exactly on the last edge) into the last bin with:

In [16]: np.minimum((data-10)//40,2)                                            
Out[16]: array([1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1])

But that doesn't answer your question about .select, .collect, .inject and .sort_by. Offhand I'm not familiar with those (though I was a fan of Squeak years ago, and dabbled in Ruby a bit). They sound more like iterators, such as those collected in itertools.
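For readers coming from Ruby, the usual Python counterparts of those four methods can be sketched roughly as follows. The sample list is made up for illustration, and the mapping is the common idiom rather than anything taken from the question:

```python
from functools import reduce

values = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical sample data

# Ruby: values.select { |v| v.even? }
selected = [v for v in values if v % 2 == 0]

# Ruby: values.collect { |v| v * 10 }   (collect is an alias of map)
collected = [v * 10 for v in values]

# Ruby: values.inject(0) { |acc, v| acc + v }
injected = reduce(lambda acc, v: acc + v, values, 0)

# Ruby: values.sort_by { |v| -v }
sorted_by = sorted(values, key=lambda v: -v)

print(selected, collected, injected, sorted_by)
```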

With numpy we don't usually take an iterative approach. Rather we try to look at the arrays as a whole, doing things like division and min/max for the whole thing.
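Put together, the whole-array steps above might be wrapped up like this. It is only a sketch: the body of eq_width_bin here is an assumption about what the question's function does, the data values are hypothetical stand-ins chosen to span 10 to 130, and it assumes the data is not all one value (otherwise the bin width is zero):

```python
import numpy as np

def eq_width_bin(data, n_bins):
    """Return, for each value, the index of the equal-width bin it falls in."""
    data = np.asarray(data)
    lo, hi = data.min(), data.max()
    width = (hi - lo) / n_bins              # assumes hi > lo
    idx = ((data - lo) // width).astype(int)
    # The maximum lands exactly on the last edge; fold it into the last bin.
    return np.minimum(idx, n_bins - 1)

# Hypothetical data, not the values from the question.
data = np.array([55, 100, 60, 20, 30, 10, 70, 80, 95, 120, 130, 65])
print(eq_width_bin(data, 3))
```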

===

searchsorted also works for this problem:

In [19]: np.searchsorted(Out[10],data)                                              
Out[19]: array([2, 3, 2, 1, 1, 0, 2, 2, 3, 3, 3, 2])

In [21]: np.maximum(0,np.searchsorted(Out[10],data)-1)
Out[21]: array([1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1])
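A related option, not part of the original answer, is np.digitize on the interior edges only, which avoids the subtract-and-clip step. This is a sketch with hypothetical data and a hypothetical function name:

```python
import numpy as np

def bin_with_digitize(data, n_bins):
    """Bin index per value, using only the interior edges as thresholds."""
    data = np.asarray(data)
    edges = np.linspace(data.min(), data.max(), n_bins + 1)
    # With the default right=False, a value x gets index i when
    # interior[i-1] <= x < interior[i], so the minimum maps to 0
    # and the maximum maps to n_bins - 1.
    return np.digitize(data, edges[1:-1])

# Hypothetical data, not the values from the question.
data = np.array([55, 100, 60, 20, 30, 10, 70, 80, 95, 120, 130, 65])
print(bin_with_digitize(data, 3))
```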

A (possibly) cleaner expression of your Python loop:

def foo(i, edges):
    for j,n in enumerate(edges):
        if i<n:
            return j-1
    return j-1
In [34]: edges = np.linspace(np.min(data),np.max(data),4).tolist()
In [35]: [foo(i,edges) for i in data]
Out[35]: [1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1]

I converted edges to a list, because it's faster to iterate that way.

With itertools:

In [55]: [len(list(itertools.takewhile(lambda x: x<i,edges)))-1 for i in data]  
Out[55]: [1, 2, 1, 0, 0, -1, 1, 1, 2, 2, 2, 1]

===

Another approach

In [45]: edges = np.linspace(np.min(data),np.max(data),4)                       
In [46]: data[:,None]<edges
Out[46]:
array([[False, False,  True,  True],
       [False, False, False,  True],
       [False, False,  True,  True],
       [False,  True,  True,  True],
       [False,  True,  True,  True],
       [False,  True,  True,  True],
       [False, False,  True,  True],
       [False, False,  True,  True],
       [False, False, False,  True],
       [False, False, False,  True],
       [False, False, False, False],
       [False, False,  True,  True]])
In [47]: np.argmax(data[:,None]<edges, axis=1)-1
Out[47]: array([ 1, 2, 1, 0, 0, 0, 1, 1, 2, 2, -1, 1])

That -1 needs cleaning (the row where there's no True).
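One way to clean that -1, sketched under the assumption that any row without a True belongs in the last bin. The function name and sample data are hypothetical:

```python
import numpy as np

def bin_by_broadcast(data, n_bins):
    """Bin index per value via a broadcast comparison against the edges."""
    data = np.asarray(data)
    edges = np.linspace(data.min(), data.max(), n_bins + 1)
    below = data[:, None] < edges           # shape (len(data), n_bins + 1)
    idx = np.argmax(below, axis=1) - 1      # first True in each row, minus one
    # A row with no True (the maximum value) yields argmax 0, i.e. -1;
    # send those rows to the last bin instead.
    return np.where(below.any(axis=1), idx, n_bins - 1)

# Hypothetical data, not the values from the question.
data = np.array([55, 100, 60, 20, 30, 10, 70, 80, 95, 120, 130, 65])
print(bin_by_broadcast(data, 3))
```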

edit

Lists have an index method; with that we can get an expression that's a lot like your last Ruby line. Looks like list comprehension is a lot like the Ruby collect:

In [88]: [[i<j for i in edges].index(False)-1 for j in data]                    
Out[88]: [1, 2, 1, 0, 0, -1, 1, 1, 2, 2, 2, 1]

Anything like SciPy in Ruby?

There's nothing quite as mature or well done as SciPy, but check out SciRuby and Numerical Ruby.

What is the equivalent of numpy.random.choice([0,1], p=[0.2, 0.8]) in ruby?

You could use https://github.com/fl00r/pickup

A simple example is:

require 'pickup'
headings = {
  A: 40,
  B: 20,
  C: 40,
}
pickup = Pickup.new(headings)
pickup.pick
#=> A
pickup.pick
#=> B
pickup.pick
#=> A
pickup.pick
#=> C
pickup.pick
#=> C

In your case you have two options, 0 and 1, with weights 20 and 80. The same approach also works when there are more than two choices.
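To make the idea of weighted picking concrete, here is a small Python sketch based on cumulative weights. It illustrates the general technique only; it is not a description of how the pickup gem works internally, and weighted_pick is a hypothetical helper name:

```python
import bisect
import itertools
import random

def weighted_pick(weights, rng=random):
    """Pick one key from a {key: weight} dict, proportionally to its weight."""
    items = list(weights)
    cumulative = list(itertools.accumulate(weights[k] for k in items))
    r = rng.random() * cumulative[-1]       # uniform in [0, total)
    idx = bisect.bisect_right(cumulative, r)
    return items[min(idx, len(items) - 1)]  # guard against rounding at the top

rng = random.Random(0)                      # seeded so runs are repeatable
picks = [weighted_pick({0: 20, 1: 80}, rng) for _ in range(10000)]
share_of_ones = picks.count(1) / len(picks)
print(share_of_ones)                        # should be close to 0.8
```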

Scientific Programming with Ruby

Linear algebra is at the heart of most large-scale scientific computing. LAPACK is the gold standard for linear algebra libraries, first written in FORTRAN.

There's a port to Ruby here. Once you have that, the rest is incidental, but there are also plotting routines in Ruby.

Python Equivalent to Ruby's chunk_while?

Python doesn't have an equivalent function. You'll have to write your own.

Here's my implementation, using an iterator and the yield statement:

def chunk_while(iterable, predicate):
    itr = iter(iterable)

    try:
        prev_value = next(itr)
    except StopIteration:
        # if the iterable is empty, yield nothing
        return

    chunk = [prev_value]
    for value in itr:
        # if the predicate returns False, start a new chunk
        if not predicate(prev_value, value):
            yield chunk
            chunk = []

        chunk.append(value)
        prev_value = value

    # don't forget to yield the final chunk
    if chunk:
        yield chunk

Which can be used like so:

>>> list(chunk_while([1, 3, 2, 5, 5], lambda prev, next_: next_ <= prev + 2))
[[1, 3, 2], [5, 5]]

Ruby NArray.to_na() and Python numpy.array()

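You can do this by reinterpreting the string's bytes as floats. Older NumPy versions did this with the fromstring function; binary-mode fromstring is deprecated in current releases, so frombuffer on the encoded bytes is used here instead:

import numpy as np

line = "#!/usr/bin/ruby\n#\n# Gen"
# 24 bytes are read as three 8-byte floats
array = np.frombuffer(line.encode("ascii"), dtype=float)
print(array)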

Executing the above code results in

[  9.05457127e+164   3.30197767e-258   6.15310337e+223]

Python equivalent of ruby's `map.with_index`?

Use enumerate, which yields (index, value) tuples:

>>> array = [200, 100, 150, 500, 25, 650, 175]
>>> enumerate(array)
<enumerate object at 0x7f4ad46d0190>
>>> list(enumerate(array))
[(0, 200), (1, 100), (2, 150), (3, 500), (4, 25), (5, 650), (6, 175)]

Combining it with a list comprehension:

>>> [i for i, x in enumerate(array) if x < 200]
[1, 2, 4, 6]
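For the mapping case itself, where both the element and its index feed into the result, a comprehension over enumerate plays the role of map.with_index. A short sketch using the same list; the start argument corresponds to Ruby's with_index(1):

```python
array = [200, 100, 150, 500, 25, 650, 175]

# Ruby: array.map.with_index { |x, i| x * i }
scaled = [x * i for i, x in enumerate(array)]

# Ruby: array.map.with_index(1) { |x, i| "#{i}: #{x}" }
labeled = [f"{i}: {x}" for i, x in enumerate(array, start=1)]

print(scaled)
print(labeled)
```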
