Loop Over Rows of Dataframe Applying Function with If-Statement

This operation doesn't require loops, apply statements, or if statements. Vectorised operations and subsetting are all you need:

t.d <- within(t.d, V4 <- V1 + V3)
t.d[!(t.d$V1>1 & t.d$V3<9), "V4"] <- 0
t.d

  V1 V2 V3 V4
1  1  4  7  0
2  2  5  8 10
3  3  6  9  0

Why does this work?

In the first step I create a new column that is the straight sum of columns V1 and V3. I use within as a convenient way of referring to the columns of t.d without having to write t.d$ all the time.

In the second step I subset all of the rows that don't fulfill your conditions and set V4 for these to 0.

Using apply() to loop over each row of a dataframe with an if statement

There are many ways in R to create a new column based on conditions. There should be no need to loop or apply for this kind of thing. Instead, use R's "vectorised" operations, which act on all rows at once:

iris$NewColumn = ifelse(iris$Petal.Width == 0.2, "Response String 1", "Other")

Vectorised operations are generally much faster. They are also less error-prone, because filling a column element by element can cause unintended problems.
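For readers working in pandas rather than R (as in the later answers on this page), the same idea can be sketched with numpy.where, which plays the role of ifelse. The small frame below is made-up illustration data, not the real iris dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few iris rows (illustration only).
df = pd.DataFrame({"Petal.Width": [0.2, 0.4, 0.2]})

# np.where(condition, value_if_true, value_if_false) evaluates all rows at once.
df["NewColumn"] = np.where(df["Petal.Width"] == 0.2, "Response String 1", "Other")

print(df)
```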

Applying an if then loop to each row of data frame in R

You almost certainly want to avoid loops and if statements here in favor of vectorized conditionals and assignment.

Let's take your first example, if (RT2 > Mean + (2.5 * SD)) RT2 = Mean + 2.5 * SD, assuming your data.frame is called dat and its columns are named RT2, Mean and SD:

sel <- dat$RT2 > dat$Mean + 2.5*dat$SD # creates a logical vector of length nrow(dat)
dat$RT2[sel] <- with(dat[sel,], Mean + 2.5*SD)

You can use with() to save a lot of typing of "dat$".

N.B. I haven't tested this since there's no reproducible dataset. There's almost certainly a typo somewhere!

Using IF, ELSE conditions to iterate through a pandas dataframe, then obtaining and using data from where the set condition was satisfied

Use .loc to filter your dataframe and apply your function. Use a lambda function as a proxy to call your function with the right signature.

def myfunc(firstname, surname):
    return firstname + ' ' + surname

out = df.loc[df['Age'].gt(11) & df['Size'].lt(51), ['First Name', 'Surname']] \
        .apply(lambda x: myfunc(*x), axis=1)

Output:

>>> out
1      Alex Mulan
2    Leo Carlitos
dtype: object

>>> type(out)
pandas.core.series.Series
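Since the original dataframe isn't shown, here is a self-contained sketch of the same approach. The rows and the Age and Size values are invented for illustration, chosen only so that the two names from the output above pass the filter. It also shows that for a simple concatenation like this you could skip apply entirely.

```python
import pandas as pd

# Hypothetical data (not from the question).
df = pd.DataFrame({
    "First Name": ["Mia", "Alex", "Leo", "Ana"],
    "Surname": ["Stone", "Mulan", "Carlitos", "Perez"],
    "Age": [10, 12, 15, 20],
    "Size": [45, 40, 50, 60],
})

def myfunc(firstname, surname):
    return firstname + " " + surname

# Build the boolean mask once, then reuse it.
mask = df["Age"].gt(11) & df["Size"].lt(51)

# Row-wise call through apply, as in the answer above.
out = df.loc[mask, ["First Name", "Surname"]].apply(lambda x: myfunc(*x), axis=1)

# Fully vectorised alternative for this particular function.
vectorised = df.loc[mask, "First Name"] + " " + df.loc[mask, "Surname"]

print(out)
```

The apply version is the one to keep if myfunc does something that cannot be expressed with column operations.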

Fastest way to use if/else statements when looping through dataframe with pandas

Locate the relevant rows and modify them:

df.loc[df["date"].str.len() == 7, "date"] = "0" + df.loc[df["date"].str.len()== 7, "date"]
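A small runnable sketch of the same idea, assuming the dates are stored as digit strings (the sample values are made up). Storing the mask in a variable avoids computing the string lengths twice.

```python
import pandas as pd

# Hypothetical sample: two 7-character dates and one 8-character date.
df = pd.DataFrame({"date": ["1012020", "12312020", "5072021"]})

# True for rows whose date string is one character short.
short = df["date"].str.len() == 7

# Prepend "0" only to those rows; the others are left untouched.
df.loc[short, "date"] = "0" + df.loc[short, "date"]

print(df)
```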

Dataframe for loops, if statements and append()

I think what you are looking for is a way to loop over all your rows and do some calculation based on each row, so I will not go through your whole process in detail and will just show the basic usage.

Please take a look at apply().

The apply() function lets you iterate along either axis.

You can easily write your logic as a function (like switch() in the following example).

Inside this function, you can access a column value with the dot operator (like row.var1).

Here is a minimal example.

import pandas as pd
d = {'var1': [0.7, 0.5],
     'var2': [0.6, 0.3],
     'var3': [3, 4],
     'var4': [3, 4]}
df = pd.DataFrame(data=d)

def switch(row):
    if row.var1 >= 0.5 and row.var2 <= 0.5:
        return 'foo'
    elif row.var1 <= 0.5 and row.var1 >= 0.5:
        return 'bar'
    else:
        return 'baz'

port_switching = df.apply(switch, axis=1)
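If the branches only compare column values, as they do here, the if/elif/else chain can also be written without apply. The following is one possible sketch using numpy.select, with the conditions copied from switch() as written above. It is an alternative technique, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"var1": [0.7, 0.5], "var2": [0.6, 0.3]})

# Conditions are checked in order and the first True one wins,
# which mirrors the if/elif chain in switch().
conditions = [
    (df["var1"] >= 0.5) & (df["var2"] <= 0.5),
    (df["var1"] <= 0.5) & (df["var1"] >= 0.5),
]
choices = ["foo", "bar"]

# Rows matching none of the conditions get the default, like the else branch.
df["port_switching"] = np.select(conditions, choices, default="baz")

print(df)
```

On large frames this tends to be considerably faster than apply with axis=1, because the comparisons run on whole columns rather than one row at a time.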

How to combine for loop through rows and if statement

If I understand correctly, you want to check if any of the values are outside the given limits, and if so, print the indices (row and column) of those values.

To do so, you need to compare each individual value to the limits, not the whole row:

for index, row in df.iterrows():
    for column, value in row.items():
        if value > sup_value:
            print(index, column, value)
        if value < inf_value:
            print(index, column, value)
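The nested loop above works, but the same check can be done without iterating. Here is a rough sketch, assuming a purely numeric dataframe with no missing values. The sample data and the limit values are invented for illustration.

```python
import pandas as pd

# Hypothetical numeric data and limits.
df = pd.DataFrame({"a": [1, 5, 12], "b": [-3, 4, 6]})
inf_value, sup_value = 0, 10

# Boolean frame: True where a value falls outside the limits.
out_of_range = (df > sup_value) | (df < inf_value)

# stack() turns both frames into Series indexed by (row, column),
# so the boolean Series can select the offending values directly.
hits = df.stack()[out_of_range.stack()]

for (index, column), value in hits.items():
    print(index, column, value)
```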

Iterate over rows and write a new column if condition meets python

If I understand correctly, content is a list, not a dataframe. If this is the case, you can use .isin, which returns True or False for each row; that result can then be mapped to whatever suffix you want.

import pandas as pd
content = ['P53-Malat1','Neat1-Malat1','Gap1-Malat1']

f2 = pd.DataFrame({'intA': {0: 'P53-Malat1', 1: 'Gap1-Malat1'},
                   'intB': {0: 'Neat1-Malat1', 1: 'Malat1-Pias3'}})

f2['col1_search'] = f2.intA + f2.intA.isin(content).map({True:'_found',False:'_not_found'})
f2['col2_search'] = f2.intB + f2.intB.isin(content).map({True:'_found',False:'_not_found'})

Output

          intA          intB        col1_search             col2_search
0   P53-Malat1  Neat1-Malat1   P53-Malat1_found      Neat1-Malat1_found
1  Gap1-Malat1  Malat1-Pias3  Gap1-Malat1_found  Malat1-Pias3_not_found

Or perhaps if you have many columns:

(f2 + f2.isin(content).replace({True:'_found',False:'_not_found'})).add_suffix('_search')

Output

         intA_search             intB_search
0   P53-Malat1_found      Neat1-Malat1_found
1  Gap1-Malat1_found  Malat1-Pias3_not_found

which could be merged back to the original data with

pd.concat([f2,(f2 + f2.isin(content).replace({True:'_found',False:'_not_found'})).add_suffix('_search')], axis=1)

Output

          intA          intB        intA_search             intB_search
0   P53-Malat1  Neat1-Malat1   P53-Malat1_found      Neat1-Malat1_found
1  Gap1-Malat1  Malat1-Pias3  Gap1-Malat1_found  Malat1-Pias3_not_found

Trying to for-loop over and find specific strings in a dataframe using if-statement

You should not iterate over rows when you can apply a vectorized function. Also, your i here is an integer, so it will never match "Happy".

Instead, to test for membership in a list, use:

df["Emotional State"].isin(['Happy']).sum()

or, for exact match:

df["Emotional State"].eq('Happy').sum()

or, for partial match:

df["Emotional State"].str.contains('Happy').sum()

or, to count all (exact) values:

df["Emotional State"].value_counts()
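To make the difference between these options concrete, here is a short sketch on a made-up column (the values are illustrative, not the asker's data). Note that the partial match also counts "Very Happy".

```python
import pandas as pd

# Hypothetical survey answers.
df = pd.DataFrame({"Emotional State": ["Happy", "Sad", "Very Happy", "Happy"]})

in_list = df["Emotional State"].isin(["Happy"]).sum()        # membership in a list
exact = df["Emotional State"].eq("Happy").sum()              # exact match
partial = df["Emotional State"].str.contains("Happy").sum()  # substring match
counts = df["Emotional State"].value_counts()                # count per distinct value

print(in_list, exact, partial)
print(counts)
```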

S9 dtype

Those values are bytes; you need to decode them to strings first:

df['Emotional State'] = df['Emotional State'].str.decode("utf-8")
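A minimal sketch of why the decode step matters, using a made-up column of byte strings: comparing bytes against a str never matches, so the count stays at zero until the values are decoded.

```python
import pandas as pd

# Hypothetical column holding bytes, as you would get from S9 data.
df = pd.DataFrame({"Emotional State": [b"Happy", b"Sad", b"Happy"]})

# bytes never compare equal to str, so nothing matches yet.
before = df["Emotional State"].eq("Happy").sum()

# Decode the bytes to regular strings.
df["Emotional State"] = df["Emotional State"].str.decode("utf-8")

after = df["Emotional State"].eq("Happy").sum()

print(before, after)
```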

