In Pandas, Is Inplace = True Considered Harmful, or Not

Understanding inplace=True in pandas

When inplace=True is passed, the data is modified in place and the method returns None, so you'd use:

df.an_operation(inplace=True)

When inplace=False is passed (this is the default, so you don't need to pass it explicitly), the method performs the operation and returns a copy of the object, leaving the original unchanged, so you'd use:

df = df.an_operation(inplace=False) 
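To make the difference concrete, here is a minimal sketch using sort_values as a stand-in for an_operation. The frame and its column name "a" are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 2]})

# inplace=True mutates df itself; the call returns None.
result = df.sort_values("a", inplace=True)
inplace_returned_none = result is None
inplace_values = df["a"].tolist()

# The default (inplace=False) leaves the original alone and returns a new object.
df2 = pd.DataFrame({"a": [3, 1, 2]})
sorted_df2 = df2.sort_values("a")
original_values = df2["a"].tolist()
sorted_values = sorted_df2["a"].tolist()
```

This is also why assigning the result of an inplace=True call back to df leaves you with None instead of a data frame.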

Python sort_values (inplace=True) but not really?

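You need to reset the index to see the sorted order in the loop. Because sort_values(inplace=True) returns None, the two calls cannot be chained and have to be written separately:

frame.sort_values(by=['state','date1'], inplace=True)
frame.reset_index(drop=True, inplace=True)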

Otherwise, a loop that looks rows up by index label follows the original row labels, which sort_values keeps attached to their rows. Hence you see the same order as in the original data frame. You can verify this by looking at the index values in your examples.
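A small sketch of the effect, assuming the loop reads rows by label with .loc (the sample data here is invented, only the column names come from the question):

```python
import pandas as pd

frame = pd.DataFrame({
    "state": ["NY", "CA", "NY", "CA"],
    "date1": ["2020-02-01", "2020-03-01", "2020-01-01", "2020-01-01"],
})

frame.sort_values(by=["state", "date1"], inplace=True)

# Rows are now physically sorted, but they keep their old labels,
# so label-based access still walks them in the original order.
by_label = [frame.loc[i, "state"] for i in range(len(frame))]

# After resetting the index, labels 0..n-1 match the sorted positions.
frame.reset_index(drop=True, inplace=True)
by_label_after_reset = [frame.loc[i, "state"] for i in range(len(frame))]
```

Positional access (frame.iloc[i]) or frame.itertuples() would follow the sorted order even without resetting the index.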

Pandas: peculiar performance drop for inplace rename after dropna

This is a copy of the explanation on GitHub.

There is no guarantee that an inplace operation is actually faster. Often they are actually the same operation that works on a copy, but the top-level reference is reassigned.

The reason for the difference in performance in this case is as follows.

The (df1-df2).dropna() call creates a slice of the dataframe. When you apply a new operation, this triggers a SettingWithCopy check because it could be a copy (but often is not).

This check must perform a garbage collection to wipe out some cache references to see if it's a copy. Unfortunately Python syntax makes this unavoidable.

You can avoid this by simply making a copy first.

df = (df1-df2).dropna().copy()

An inplace operation applied after that will be as performant as before.
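As a rough sketch of that pattern, with made-up frames df1 and df2 and hypothetical column names (this only shows the shape of the code, it does not measure the timing difference):

```python
import pandas as pd

df1 = pd.DataFrame({"x": [1.0, 2.0, None], "y": [4.0, 5.0, 6.0]})
df2 = pd.DataFrame({"x": [0.5, 0.5, 0.5], "y": [1.0, 1.0, 1.0]})

# Explicit copy, so the result is no longer treated as a possible slice
# of another frame when the next operation runs.
df = (df1 - df2).dropna().copy()

# The inplace rename now operates on an independent frame.
df.rename(columns={"x": "dx", "y": "dy"}, inplace=True)
renamed_columns = list(df.columns)
row_count = len(df)
```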

My personal opinion: I never use in-place operations. The syntax is harder to read and it does not offer any advantages.

pandas.series.replace() inplace = True not working

My interpretation of your question might be incorrect, but if you are cycling through a list of punctuation characters in punc and you want to just remove all of them while keeping the rest of the text, I think you can do something simpler like the following:

for ch in punc:
    # regex=False so characters like "." or "?" are matched literally
    des = des.str.replace(ch, "", regex=False)

As you probably know, replace is the standard Python string method for replacing one sequence of characters with another. E.g.:

'abc'.replace('b', 'z')

returns 'azc'

When you use Series.str.replace() you are applying the same kind of replacement to every element in the Series (note that older pandas versions treat the pattern as a regular expression by default). AFAIK, all string methods can be applied element-wise to a Series using the same syntax: Series.str.some_string_method()
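Here is a self-contained sketch of the loop. I don't know what your punc actually contains, so string.punctuation is used as an assumed stand-in, and the sample strings are invented:

```python
import string

import pandas as pd

des = pd.Series(["Hello, world!", "What's up?", "no punctuation"])
punc = string.punctuation  # assumption: stand-in for the asker's `punc`

for ch in punc:
    # str.replace returns a new Series, so the result is assigned back.
    # regex=False keeps characters such as "." or "?" literal.
    des = des.str.replace(ch, "", regex=False)

cleaned = des.tolist()
```

Note that there is no inplace argument involved at all: each pass returns a new Series and rebinds des to it.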

Python Pandas data frame cell value does not update after using clip

It may be better to default to explicit assignment so that it's clearer what's happening. Calling clip with inplace=True on this slice of the dataframe doesn't appear to update the original consistently.

There's some debate on whether the flag should stick around at all; see "In pandas, is inplace = True considered harmful, or not?"

df_init['W_EX'] = df_init['W_EX'].clip(upper=1.0)
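A runnable version of that assignment, using invented values for the W_EX column:

```python
import pandas as pd

df_init = pd.DataFrame({"W_EX": [0.4, 1.7, 0.9, 2.3]})

# clip returns a new Series; assigning it back to the column makes the
# update explicit instead of relying on inplace=True reaching the parent frame.
df_init["W_EX"] = df_init["W_EX"].clip(upper=1.0)
clipped = df_init["W_EX"].tolist()
```

Values above the upper bound are set to 1.0 and everything else is left as it was.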

