Numpy equivalents of Ruby array functions
Initially this looked like a bincount or histogram, but the output is the bin where each value fits, not the number of items per bin:
In [3]: eq_width_bin(data,3)
Out[3]: [1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1]
Your bins:
In [10]: np.linspace(np.min(data),np.max(data),4)
Out[10]: array([ 10., 50., 90., 130.])
We can identify the bin for each value with a simple integer division (subtract the minimum, 10, then divide by the bin width, 40):
In [12]: (data-10)//40
Out[12]: array([1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 3, 1])
and clamp that stray 3 (the maximum value, which sits on the last edge) into the last bin with:
In [16]: np.minimum((data-10)//40,2)
Out[16]: array([1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1])
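The two steps above generalize to any number of bins. Here is a minimal sketch, assuming NumPy is available. The name equal_width_bins and the sample values are invented for illustration; the sample is not the asker's data, it was only chosen so that it produces the same bin pattern as the output above.

```python
import numpy as np

def equal_width_bins(data, n_bins):
    """Return the 0-based equal-width bin index of each value (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # All values identical: everything lands in bin 0.
        return np.zeros(data.shape, dtype=int)
    width = (hi - lo) / n_bins
    # Floor division assigns each value to the half-open bin [edge_k, edge_k+1).
    idx = ((data - lo) // width).astype(int)
    # The maximum sits exactly on the last edge, so clamp it into the last bin.
    return np.minimum(idx, n_bins - 1)

# Invented sample data, not taken from the question.
sample = [60, 100, 70, 20, 30, 10, 50, 80, 110, 120, 130, 55]
print(equal_width_bins(sample, 3).tolist())
```

With bin widths that are not exactly representable in floating point, a value lying right on an edge may land in the neighbouring bin, so treat the edge behaviour as approximate.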
But that doesn't answer your question about .select, .collect, .inject and .sort_by. Offhand I'm not familiar with those (though I was a fan of Squeak years ago, and dabbled in Ruby a bit). They sound more like iterators, such as those collected in itertools.
With numpy we don't usually take an iterative approach. Rather we try to look at the arrays as a whole, doing things like division and min/max for the whole thing.
===
searchsorted also works for this problem:
In [19]: np.searchsorted(Out[10],data)
Out[19]: array([2, 3, 2, 1, 1, 0, 2, 2, 3, 3, 3, 2])
In [21]: np.maximum(0,np.searchsorted(Out[10],data)-1)
Out[21]: array([1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1])
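One detail worth checking with searchsorted is which bin a value gets when it equals an interior edge. With the default side="left", such a value is assigned to the lower bin, whereas the integer-division version puts it in the upper one. The sketch below uses side="right" to match the integer-division behaviour; the function name and the sample values are invented for illustration.

```python
import numpy as np

def bins_via_searchsorted(data, n_bins):
    """Equal-width bin index per value using searchsorted (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    edges = np.linspace(data.min(), data.max(), n_bins + 1)
    # side="right" returns the count of edges <= value, so subtracting 1
    # gives the index of the bin whose left edge is <= value.
    idx = np.searchsorted(edges, data, side="right") - 1
    # Only the maximum can overshoot (it equals the last edge); clip it back.
    return np.clip(idx, 0, n_bins - 1)

# Invented sample data; 50 lies exactly on an interior edge.
sample = [60, 100, 70, 20, 30, 10, 50, 80, 110, 120, 130, 55]
print(bins_via_searchsorted(sample, 3).tolist())
```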
A (possibly) cleaner expression of your Python loop:
def foo(i, edges):
    for j,n in enumerate(edges):
        if i<n:
            return j-1
    return j-1
In [34]: edges = np.linspace(np.min(data),np.max(data),4).tolist()
In [35]: [foo(i,edges) for i in data]
Out[35]: [1, 2, 1, 0, 0, 0, 1, 1, 2, 2, 2, 1]
I converted edges to a list, because it's faster to iterate that way.
With itertools (note that the minimum value comes out as -1 here and still needs clamping to 0):
In [55]: [len(list(itertools.takewhile(lambda x: x<i,edges)))-1 for i in data]
Out[55]: [1, 2, 1, 0, 0, -1, 1, 1, 2, 2, 2, 1]
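For a pure-Python version without the -1 edge case, the standard library's bisect module can do the edge lookup. This is a sketch under the assumption that edges is a sorted list whose first and last entries are the data minimum and maximum; bin_index and the sample values are invented names and data.

```python
import bisect

def bin_index(value, edges):
    """Return the 0-based bin of value for sorted bin edges (hypothetical helper)."""
    last_bin = len(edges) - 2
    # bisect_right counts the edges <= value; subtracting 1 gives the bin
    # whose left edge is <= value. The maximum overshoots by one, so clamp it.
    return min(bisect.bisect_right(edges, value) - 1, last_bin)

# Invented sample data and the matching equal-width edges.
sample = [60, 100, 70, 20, 30, 10, 50, 80, 110, 120, 130, 55]
edges = [10.0, 50.0, 90.0, 130.0]
print([bin_index(v, edges) for v in sample])
```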
===
Another approach
In [45]: edges = np.linspace(np.min(data),np.max(data),4)
In [46]: data[:,None]<-edges
Out[46]:
array([[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False]])
In [47]: np.argmax(data[:,None]<edges, axis=1)-1
Out[47]: array([ 1, 2, 1, 0, 0, 0, 1, 1, 2, 2, -1, 1])
That -1 needs cleaning (it comes from the row where there's no True).
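One possible way to clean that up is to detect the rows without any True and send them to the last bin. A minimal sketch, assuming 1-D input; the function name and sample values are invented for illustration.

```python
import numpy as np

def bins_via_comparison(data, n_bins):
    """Equal-width bin index per value via a broadcast comparison (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    edges = np.linspace(data.min(), data.max(), n_bins + 1)
    # Row i, column k is True when data[i] is strictly below edges[k].
    below_edge = data[:, None] < edges
    # argmax finds the first True in each row; the bin is one to its left.
    idx = np.argmax(below_edge, axis=1) - 1
    # Rows with no True (the maximum value) would give -1; put them in the last bin.
    idx[~below_edge.any(axis=1)] = n_bins - 1
    return idx

# Invented sample data, not taken from the question.
sample = [60, 100, 70, 20, 30, 10, 50, 80, 110, 120, 130, 55]
print(bins_via_comparison(sample, 3).tolist())
```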
edit
Lists have an index method; with that we can get an expression that's a lot like your last Ruby line. It looks like a list comprehension is a lot like the Ruby collect:
In [88]: [[i<j for i in edges].index(False)-1 for j in data]
Out[88]: [1, 2, 1, 0, 0, -1, 1, 1, 2, 2, 2, 1]
Anything like SciPy in Ruby?
There's nothing quite as mature or well done as SciPy, but check out SciRuby and Numerical Ruby.
What is the equivalent of numpy.random.choice([0,1], p=[0.2, 0.8]) in ruby?
You could use https://github.com/fl00r/pickup
A simple example is:
require 'pickup'
headings = {
  A: 40,
  B: 20,
  C: 40,
}
pickup = Pickup.new(headings)
pickup.pick
#=> A
pickup.pick
#=> B
pickup.pick
#=> A
pickup.pick
#=> C
pickup.pick
#=> C
In your case you have two options, 1 and 0, and the weights are 80 and 20. But this solution also applies when you have more than two outcomes.
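To make the idea behind a weighted pick concrete, here is a sketch in Python of the usual cumulative-weight technique. It is not the pickup gem's implementation, just an illustration; weighted_pick and its parameters are hypothetical names.

```python
import bisect
import itertools
import random

def weighted_pick(weights, rng=random):
    """Pick one key from a dict of key -> non-negative weight (hypothetical helper)."""
    keys = list(weights)
    # Running totals, e.g. weights 20 and 80 become [20, 100].
    cumulative = list(itertools.accumulate(weights[k] for k in keys))
    # Draw a point in [0, total) and find the first running total above it.
    point = rng.random() * cumulative[-1]
    position = bisect.bisect_right(cumulative, point)
    # Guard against floating-point rounding pushing the point onto the total.
    return keys[min(position, len(keys) - 1)]

# Roughly the numpy.random.choice([0, 1], p=[0.2, 0.8]) case from the question.
rng = random.Random(0)
picks = [weighted_pick({0: 20, 1: 80}, rng) for _ in range(10000)]
print(sum(picks) / len(picks))  # fraction of 1s, expected to be near 0.8
```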
Scientific Programming with Ruby
Linear algebra is at the heart of most large-scale scientific computing. LAPACK is the gold standard for linear algebra libraries, first written in FORTRAN.
There's a port to Ruby here. Once you have that, the rest is incidental, but there are also plotting routines in Ruby.
Python Equivalent to Ruby's chunk_while?
Python doesn't have an equivalent function. You'll have to write your own.
Here's my implementation, using an iterator and the yield statement:
def chunk_while(iterable, predicate):
    itr = iter(iterable)
    try:
        prev_value = next(itr)
    except StopIteration:
        # if the iterable is empty, yield nothing
        return
    chunk = [prev_value]
    for value in itr:
        # if the predicate returns False, start a new chunk
        if not predicate(prev_value, value):
            yield chunk
            chunk = []
        chunk.append(value)
        prev_value = value
    # don't forget to yield the final chunk
    if chunk:
        yield chunk
Which can be used like so:
>>> list(chunk_while([1, 3, 2, 5, 5], lambda prev, next_: next_ <= prev + 2))
[[1, 3, 2], [5, 5]]
Ruby NArray.to_na() and Python numpy.array()
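You can achieve this by reinterpreting the string's raw bytes as floats. Older NumPy releases did this with fromstring, but its binary mode has since been deprecated in favor of frombuffer:
import numpy as np
line = "#!/usr/bin/ruby\n#\n# Gen"
# every 8 bytes are read as one float64, so the byte length must be a multiple of 8
array = np.frombuffer(line.encode("ascii"), dtype=float)
print(array)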
Executing the above code results in
[ 9.05457127e+164 3.30197767e-258 6.15310337e+223]
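The odd-looking numbers come from the fact that no parsing takes place: the bytes are simply reinterpreted as 8-byte floats. A small round-trip sketch, with values and byte strings invented for illustration, shows this.

```python
import numpy as np

# Three float64 values occupy 3 * 8 = 24 bytes.
values = np.array([1.5, -2.0, 3.25], dtype=np.float64)
raw = values.tobytes()

# Reading the same bytes back as float64 recovers the original values.
restored = np.frombuffer(raw, dtype=np.float64)
print(len(raw), restored.tolist())

# Any 8 bytes can be viewed as one float64, text included; the result is
# just whatever number those bytes happen to encode.
text_bytes = b"ABCDEFGH"
as_float = np.frombuffer(text_bytes, dtype=np.float64)
print(as_float.size)
```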
Python equivalent of ruby's `map.with_index`?
Use enumerate, which yields tuples pairing each index with the original value:
>>> array = [200,100,150,500,25,650,175]
>>> enumerate(array)
<enumerate object at 0x7f4ad46d0190>
>>> list(enumerate(array))
[(0, 200), (1, 100), (2, 150), (3, 500), (4, 25), (5, 650), (6, 175)]
Combining it with a list comprehension:
>>> [i for i, x in enumerate(array) if x < 200]
[1, 2, 4, 6]
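Two details are easy to trip over when coming from Ruby: map.with_index yields (element, index), while enumerate yields (index, element), and the offset form with_index(1) corresponds to the start argument of enumerate. A short sketch; the variable names are only illustrative.

```python
array = [200, 100, 150, 500, 25, 650, 175]

# Map over values together with their index; note the (index, value) order.
labelled = ["{}: {}".format(i, x) for i, x in enumerate(array)]

# Start counting at 1 instead of 0, similar to Ruby's with_index(1).
one_based = [i for i, x in enumerate(array, start=1) if x < 200]

print(labelled[:2])
print(one_based)
```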