R's which() and which.min() Equivalent in Python
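NumPy has built-in functions for this: np.where() corresponds to which(), and np.argmin() to which.min(). Unlike R, the indices returned are 0-based.
import numpy as np
x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])
np.where(x == 2)            # indices of every element equal to 2
np.min(np.where(x == 2))    # index of the first element equal to 2
np.argmin(x)                # index of the first minimum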
np.where(x == 2)
Out[9]: (array([1, 6], dtype=int64),)
np.min(np.where(x==2))
Out[10]: 1
np.argmin(x)
Out[11]: 4
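If you want something that reads more like R, the calls above can be wrapped in a small helper. This is only a sketch: the name which is a hypothetical helper, not a NumPy function, and it returns 0-based positions (add 1 if you need R-style indices).

```python
import numpy as np

def which(mask):
    """Hypothetical helper: 0-based positions where a boolean array is True."""
    return np.flatnonzero(np.asarray(mask))

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])

positions = which(x == 2)          # positions of every 2 in x
first_match = int(positions[0])    # first position where x == 2
first_min = int(np.argmin(x))      # position of the first minimum, like which.min()
r_style = positions + 1            # shift to R's 1-based convention

print(positions, first_match, first_min, r_style)
```

np.flatnonzero returns a flat array directly, which avoids unpacking the one-element tuple that np.where returns.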
What is the equivalent of python's idxmin() in R?
which.min() is R's equivalent of idxmin(). Both find the minimum value and return the index of the first such value, which matters when there are ties. Note that idxmin() returns the index label, whereas which.min() returns a 1-based position.
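A small example on the pandas side may make the tie behaviour concrete. The Series and its labels are made up for illustration.

```python
import pandas as pd

# Two tied minima (value 1) at labels "b" and "d".
s = pd.Series([3, 1, 4, 1], index=["a", "b", "c", "d"])

label = s.idxmin()                        # label of the first minimum
position = int(s.to_numpy().argmin())     # 0-based position of the first minimum

print(label, position)
```

R's which.min(c(a = 3, b = 1, c = 4, d = 1)) would report position 2 for the same data, since R counts from 1.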
Pandas Equivalent of R's which()
I may not have fully understood the question, but the answer looks simpler than you might expect.
Using a pandas DataFrame:
df['colname'] > somenumberIchoose
returns a pandas Series of True / False values that keeps the original index of the DataFrame.
You can then use that boolean Series to index the original DataFrame and get the subset you are looking for:
df[df['colname'] > somenumberIchoose]
That should be enough.
See http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
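If you need the matching indices themselves, as which() would give you, rather than the filtered rows, something along these lines should work. The DataFrame contents and the threshold value are invented for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"colname": [5, 12, 7, 20]}, index=["a", "b", "c", "d"])
threshold = 6  # hypothetical stand-in for somenumberIchoose

mask = df["colname"] > threshold                        # boolean Series
subset = df[mask]                                       # rows where the condition holds
labels = df.index[mask.to_numpy()].tolist()             # index labels of those rows
positions = np.flatnonzero(mask.to_numpy()).tolist()    # 0-based row positions

print(subset)
print(labels, positions)
```

The labels are what pandas uses for alignment, while the 0-based positions are the closest match to what which() returns in R (which counts from 1).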
R equivalent of Python's range() function
There is no exact equivalent. As noted, seq doesn't work because the by argument is automatically set with the correct sign, and it generates an error if you try to explicitly pass a positive by when to < from. Just create a very simple wrapper if you need an exact match.
py_range <- function(from, to) {
  if (to <= from) return(integer(0))
  seq(from = from, to = to - 1)
}
py_range(1, 4)
#> [1] 1 2 3
py_range(1, 0)
#> integer(0)
py_range(1, 1)
#> integer(0)
These will work in a loop with printing as you desire.
for (i in py_range(1, 4)) {
  print(i)
}
#> [1] 1
#> [1] 2
#> [1] 3
for (i in py_range(1, 0)) {
  print(i)
}
#> Nothing was actually printed here!
for (i in py_range(1, 1)) {
  print(i)
}
#> Nothing was actually printed here!
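For reference, this is the Python behaviour the wrapper is mimicking: the end point is excluded, and the range is empty whenever the stop is not greater than the start. In R, by contrast, 1:0 counts down and yields 1 0, so a plain for (i in 1:0) loop runs twice.

```python
# Python's range() excludes the stop value and never counts down implicitly.
ascending = list(range(1, 4))
empty_descending = list(range(1, 0))
empty_same = list(range(1, 1))

for i in range(1, 4):
    print(i)

print(ascending, empty_descending, empty_same)
```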
R `summary` function closest equivalent in python
Without pandas:
from scipy import stats
import numpy as np
a = np.random.rand(100,3)
summary = stats.describe(a, axis = 0)
print(summary.mean)
print(summary.minmax)
...
Using pandas:
import pandas as pd
summary_across_rows = pd.DataFrame(a).describe()  # across axis=0
print(summary_across_rows)
                0           1           2
count  100.000000  100.000000  100.000000
mean     0.495204    0.573827    0.476202
std      0.275131    0.246189    0.271626
min      0.005202    0.037195    0.023595
25%      0.295210    0.399358    0.258712
50%      0.512023    0.562181    0.417322
75%      0.710216    0.790970    0.712047
max      0.998371    0.997717    0.980840
Note: for the summary across the other dimension you need:
summary_across_columns = pd.DataFrame(a.T).describe() # across axis=1
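If you want output shaped like R's summary() for a single numeric vector (Min., 1st Qu., Median, Mean, 3rd Qu., Max.), a small helper can be put together with NumPy alone. This is a sketch: r_summary is a hypothetical name, and it relies on np.percentile's default linear interpolation, which I believe matches R's default quantile type 7, though that is worth checking against your own R output.

```python
import numpy as np

def r_summary(values):
    """Hypothetical helper: the six statistics R's summary() prints for a numeric vector."""
    arr = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(arr, [25, 50, 75])
    return {
        "Min.": float(arr.min()),
        "1st Qu.": float(q1),
        "Median": float(median),
        "Mean": float(arr.mean()),
        "3rd Qu.": float(q3),
        "Max.": float(arr.max()),
    }

result = r_summary([1, 2, 3, 4])
print(result)
```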
What are Python pandas equivalents for R functions like str(), summary(), and head()?
summary() ~ describe()
head() ~ head()
I'm not sure about the str() equivalent.
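One commonly suggested counterpart to str() is DataFrame.info(), which lists each column with its dtype and non-null count. It is not identical to str(), since it does not preview the values. A short sketch with a made-up DataFrame:

```python
import io
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})

# info() writes to stdout by default; capture it in a buffer here.
buffer = io.StringIO()
df.info(buf=buffer)               # roughly str(df): columns, dtypes, non-null counts
structure = buffer.getvalue()

stats = df.describe()             # roughly summary(df); numeric columns only by default
first_rows = df.head(2)           # roughly head(df, 2)

print(structure)
print(stats)
print(first_rows)
```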
R equivalent of performing operations on an empty list in Python?
The equivalent of Python's S1 = []; S1.append(x) in R is S1 <- list(); S1 <- c(S1, list(x)).
In your example c(S1, x) will work because the numeric value you are trying to append will be automatically wrapped in a list, but it's safer to do it explicitly. If x is already a list, then c(S1, x) will append its elements to S1, while c(S1, list(x)) will append a single entry containing a copy of x to S1.
You could use the append() function in R, but remember that it's rare for R functions to modify their arguments, so you would write
S1 <- append(S1, list(x))
In this situation it's essentially identical to c().
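The same distinction exists on the Python side, which may help when translating in either direction. Roughly, c(S1, list(x)) behaves like append() and c(S1, x) with a list x behaves like extend(). One difference to keep in mind: Python stores a reference to x rather than a copy.

```python
x = [1, 2]

# Like c(S1, list(x)): x is added as a single entry.
appended = []
appended.append(x)

# Like c(S1, x) when x is already a list: its elements are added one by one.
extended = []
extended.extend(x)

# Non-mutating form, closer in spirit to S1 <- c(S1, list(x)):
# builds a new list and leaves `extended` unchanged.
combined = extended + [x]

print(appended, extended, combined)
```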
What is the equivalent of R's lm function for fitting simple linear regressions in python?
Use the OLS implementation from statsmodels and call its .summary() method. Don't forget to add the constant manually using add_constant, since an intercept is not added by default.
import statsmodels.api as sm
reg = sm.OLS(y, sm.add_constant(X)).fit()
print(reg.summary())
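To see why the constant matters, here is a NumPy-only sketch (not statsmodels itself) that fits the same kind of model with np.linalg.lstsq. The column of ones plays the role that add_constant plays above; the data are invented so that y = 2 + 3x exactly.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 3.0 * x                      # exact line: intercept 2, slope 3

# With a column of ones, the fit recovers both intercept and slope.
design = np.column_stack([np.ones_like(x), x])
coef = np.linalg.lstsq(design, y, rcond=None)[0]

# Without it, the line is forced through the origin and the slope is biased.
slope_only = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)[0][0]

print(coef, slope_only)
```

R's lm(y ~ x) includes the intercept automatically, which is why forgetting add_constant tends to give results that disagree with R.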