How to deal with SettingWithCopyWarning in Pandas
The SettingWithCopyWarning was created to flag potentially confusing "chained" assignments, such as the following, which does not always work as expected, particularly when the first selection returns a copy. (See GH5390 and GH5597 for background discussion.)
df[df['A'] > 2]['B'] = new_val # new_val not set in df
The warning offers a suggestion to rewrite as follows:
df.loc[df['A'] > 2, 'B'] = new_val
However, this doesn't fit your usage, which is equivalent to:
df = df[df['A'] > 2]
df['B'] = new_val
While it's clear that you don't care about writes making it back to the original frame (since you are overwriting the reference to it), unfortunately this pattern cannot be differentiated from the first chained assignment example. Hence the (false positive) warning. The potential for false positives is addressed in the docs on indexing, if you'd like to read further. You can safely disable this warning with the following assignment; note that it turns the warning off for the whole session, not just for this statement.
import pandas as pd
pd.options.mode.chained_assignment = None # default='warn'
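As a minimal sketch of the two cases discussed above, the example below uses a small made-up frame (the values and the names `new_val` and `subset` are assumptions for illustration). A single `.loc` call writes into the original frame, while an explicit `.copy()` makes it unambiguous that the filtered frame is independent, which is another way to avoid the warning without disabling it.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})
new_val = 0

# Case 1: the write should reach the original frame, so use one .loc call.
df.loc[df["A"] > 2, "B"] = new_val

# Case 2: only the filtered rows matter, so take an explicit copy and modify that.
subset = df[df["A"] > 2].copy()
subset["B"] = 99

print(df["B"].tolist())      # [10, 20, 0, 0]
print(subset["B"].tolist())  # [99, 99]
```

The second write does not touch `df`, because `subset` owns its own data.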
Other Resources
- pandas User Guide: Indexing and selecting data
- Python Data Science Handbook: Data Indexing and Selection
- Real Python: SettingWithCopyWarning in Pandas: Views vs Copies
- Dataquest: SettingwithCopyWarning: How to Fix This Warning in Pandas
- Towards Data Science: Explaining the SettingWithCopyWarning in pandas
Pandas DataFrame: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
I think you need copy() and can omit loc when selecting columns:
df = df[df['my_col'].notnull()].copy()
df['my_col'] = df['my_col'].astype(int).astype(str)
Explanation:
If you modify values in df later without taking a copy, you will find that the modifications do not propagate back to the original data (the unfiltered df), and that pandas issues a warning.
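The pattern from this answer can be tried on a small made-up frame. In the sketch below, the sample values and the names `original` and `cleaned` are assumptions introduced so that both frames stay visible; the answer itself reuses `df` for both.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value.
original = pd.DataFrame({"my_col": [1.0, np.nan, 3.0]})

# Filter out the missing rows and take an explicit copy of the result.
cleaned = original[original["my_col"].notnull()].copy()

# Converting on the copy does not trigger the warning.
cleaned["my_col"] = cleaned["my_col"].astype(int).astype(str)

print(cleaned["my_col"].tolist())  # ['1', '3']
print(len(original))               # 3
```

The original frame still has all three rows, including the missing value.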
solve SettingWithCopyWarning in pandas
Copy AMZN when you create it:
AMZN = Data_stack[Data_stack.Ticker=='AMZN'].copy()
#                                           ^^^^^^^
Then the rest of your code won't have a warning.
Pandas weird SettingWithCopyWarning warning
I got it.
What I didn't say (and thought wasn't relevant) in the question is that I was building df out of slicing an existing DataFrame:
df = other_df[['volume_']]
When I do this instead:
df = other_df[['volume_']].copy()
Then everything falls back in order.
Still, my takeaway is that the warning message could probably benefit from being worded a bit more clearly, to say the least.
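The fix described in this answer can be sketched on a small made-up frame. Only `other_df`, `df`, and the `volume_` column come from the answer; the `price_` column and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical source frame; price_ is an invented extra column.
other_df = pd.DataFrame({"volume_": [100, 200], "price_": [1.5, 2.5]})

# Selecting a column subset and copying it yields an independent frame.
df = other_df[["volume_"]].copy()
df["volume_"] = df["volume_"] * 2

print(df["volume_"].tolist())        # [200, 400]
print(other_df["volume_"].tolist())  # [100, 200]
```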
How can I get rid of settingwithcopywarning pandas
As mentioned by @jezrael, whenever you want to duplicate or subset a DataFrame, call .copy() at the end to avoid this type of warning. To understand why this happens, I would recommend the second answer, by "cs95", in this post: How to deal with SettingWithCopyWarning in Pandas?
Pandas - SettingWithCopyWarning in a loop
Indeed in this line:
gdp_ire = gdp_data[gdp_data['LOCATION'] == "IRL"]
you are selecting a portion of the global dataframe and in the line below, you are modifying this subset.
One simple fix could be:
gdp_ire = gdp_data[gdp_data['LOCATION'] == "IRL"].copy()
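Since the question concerns a loop, here is a rough sketch of how the same fix might look when several subsets are built in turn. The `TIME` and `Value` columns, the second location code, the `change` column, and the `frames` dictionary are all assumptions for illustration; only `gdp_data`, `gdp_ire`, and `LOCATION` come from the original code.

```python
import pandas as pd

# Hypothetical data shaped loosely like the frame in the question.
gdp_data = pd.DataFrame({
    "LOCATION": ["IRL", "IRL", "FRA", "FRA"],
    "TIME": [2019, 2020, 2019, 2020],
    "Value": [100.0, 110.0, 200.0, 190.0],
})

frames = {}
for code in ["IRL", "FRA"]:
    # Copy each subset so that later writes target an independent frame.
    subset = gdp_data[gdp_data["LOCATION"] == code].copy()
    subset["change"] = subset["Value"].diff()
    frames[code] = subset

gdp_ire = frames["IRL"]
print(gdp_ire["change"].tolist()[1])  # 10.0
```

The new column exists only on the copies; `gdp_data` keeps its original three columns.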