Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
The 'or' and 'and' Python statements require truth values. For pandas, these are considered ambiguous, so you should use the "bitwise" | (or) or & (and) operations:
df = df[(df['col'] < -0.25) | (df['col'] > 0.25)]
These are overloaded for these kinds of data structures to yield the element-wise 'or' or 'and'.

Just to add some more explanation to this statement:
The exception is thrown when you want to get the bool of a pandas.Series:
>>> import pandas as pd
>>> x = pd.Series([1])
>>> bool(x)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
You hit a place where the operator implicitly converted the operands to bool (you used 'or', but it also happens for 'and', 'if' and 'while'):
>>> x or x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> x and x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> if x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> while x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Besides these four statements, there are several Python functions that hide some bool calls (like any, all, filter, ...). These are normally not problematic with pandas.Series, but for completeness I wanted to mention these.

In your case, the exception isn't really helpful, because it doesn't mention the right alternatives. For 'and' and 'or', if you want element-wise comparisons, you can use:

numpy.logical_or:
>>> import numpy as np
>>> np.logical_or(x, y)
or simply the | operator:
>>> x | y

numpy.logical_and:
>>> np.logical_and(x, y)
or simply the & operator:
>>> x & y
There are several logical NumPy functions which should work on pandas.Series.
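As a rough sketch of how the NumPy functions line up with the operators, assuming two small Boolean Series (the data here is made up purely for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen only for illustration.
x = pd.Series([True, False, True, False])
y = pd.Series([True, True, False, False])

either = np.logical_or(x, y)        # same result as x | y
both = np.logical_and(x, y)         # same result as x & y
exactly_one = np.logical_xor(x, y)  # same result as x ^ y
negated = np.logical_not(x)         # same result as ~x

# Comparisons need parentheses, because | and & bind more tightly than < and >.
s = pd.Series([-0.5, 0.0, 0.1, 0.4])
outside = (s < -0.25) | (s > 0.25)
print(outside.tolist())
```

Leaving out the parentheses in the last line would make Python evaluate -0.25 | s first, which is not what you want.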
The alternatives mentioned in the exception are better suited if you encountered it when doing 'if' or 'while'. I'll briefly explain each of these:

If you want to check if your Series is empty:
>>> x = pd.Series([])
>>> x.empty
True
>>> x = pd.Series([1])
>>> x.empty
False
Python normally interprets the length of containers (like list, tuple, ...) as the truth value if they have no explicit Boolean interpretation. So if you want the Python-like check, you could do: if x.size or if not x.empty instead of if x.

If your Series contains one and only one Boolean value (note that Series.bool() is deprecated as of pandas 2.1; .item() covers the same case):
>>> x = pd.Series([100])
>>> (x > 50).bool()
True
>>> (x < 50).bool()
False

If you want to check the first and only item of your Series (like .bool(), but it works even for non-Boolean contents):
>>> x = pd.Series([100])
>>> x.item()
100

If you want to check if all or any item is not-zero, not-empty or not-False:
>>> x = pd.Series([0, 1, 2])
>>> x.all()   # because one element is zero
False
>>> x.any()   # because one (or more) elements are non-zero
True
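To tie these together, here is a small sketch of how the checks might be used in an 'if'. The helper name describe is made up for this example and is not part of pandas:

```python
import pandas as pd

def describe(s):
    """Return explicit truth checks for a Series (hypothetical helper)."""
    return {
        "has_rows": not s.empty,      # Python-like "is the container non-empty?"
        "any_truthy": bool(s.any()),  # at least one element is truthy
        "all_truthy": bool(s.all()),  # every element is truthy
    }

mixed = describe(pd.Series([0, 1, 2]))
empty = describe(pd.Series([], dtype="float64"))

# For a Series with exactly one element, .item() returns that element as a scalar.
single = pd.Series([100])
value = single.item()
above_50 = (single > 50).item()

if mixed["has_rows"] and mixed["any_truthy"]:
    print("Series has rows and at least one truthy value")
```

Note that all() on an empty Series is True, just like Python's built-in all() on an empty list, so check .empty first if that matters for you.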
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). df[condition]
As Michael Szczesny also pointed out in the comments, DataFrame.apply uses a Series as input. The change(name) function defined expects a string. The message ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). comes from comparing a Series to a string and then using the result as a condition.
One fix pointed out by Register Sole is to use conditions instead.
condition = (df['embark_town'] == 'Southampton')
# Assign through .loc; df[condition]['embark_town'] = ... would modify a temporary copy.
df.loc[condition, 'embark_town'] = 'Manchester'
To keep using apply, the change function would need to look something like this:
def change(series):
    if series.name == 'embark_town':
        series[series.values == 'Southampton'] = 'Manchester'
    return series
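Here is a self-contained sketch comparing the two approaches. The DataFrame is a made-up stand-in for the asker's data, and the function copies the column before modifying it, which is my own addition to avoid changing the original frame:

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.
df = pd.DataFrame({
    "embark_town": ["Southampton", "Cherbourg", "Southampton"],
    "fare": [7.25, 71.28, 8.05],
})

def change(series):
    # apply() passes each column in as a Series; series.name is the column label.
    if series.name == "embark_town":
        series = series.copy()  # work on a copy so df itself is left alone
        series[series == "Southampton"] = "Manchester"
    return series

via_apply = df.apply(change)

# The same replacement with a Boolean mask and .loc, no apply needed.
via_mask = df.copy()
via_mask.loc[via_mask["embark_town"] == "Southampton", "embark_town"] = "Manchester"

print(via_apply["embark_town"].tolist())
```

Both give the same column, but the mask version touches only the column you care about.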
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). in conditional in python
Are these Pandas DataFrames?
df1['new_column'] = df1.apply(lambda row: 1 if row.Year > 5 and row.TotalMntProducts > 2000 else 0, axis=1)
Improved Edit... let's vectorize:
df1['new_column'] = (df1.Year.gt(5) & df1.TotalMntProducts.gt(2000)).astype(int)
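A quick sketch to check that the row-wise and vectorized versions agree, assuming df1 has numeric Year and TotalMntProducts columns (the values below are invented):

```python
import pandas as pd

# Hypothetical sample; only the column names come from the question.
df1 = pd.DataFrame({
    "Year": [3, 6, 8, 10],
    "TotalMntProducts": [2500, 1500, 2001, 3000],
})

# Row-wise: inside the lambda each comparison yields a single bool, so `and` is fine.
row_wise = df1.apply(
    lambda row: 1 if row.Year > 5 and row.TotalMntProducts > 2000 else 0,
    axis=1,
)

# Vectorized: whole-column comparisons yield Boolean Series, so `&` is required.
vectorized = (df1["Year"].gt(5) & df1["TotalMntProducts"].gt(2000)).astype(int)

print(vectorized.tolist())
```

The key difference is what is being compared: scalars in the first case, whole columns in the second.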
Pandas dataframe ValueError: The truth value of a Series is ambiguous when using .apply
It is because x >= 1 returns a Series of True/False values based on the original numeric values, because x is a Series representing a column inside your lambda.
You could use (x >= 1).all() or any() or such, but that won't suit your needs.
Instead you may use the below to transform each value in the df:
df.apply(lambda x : [1 if e >= 1 else 0 for e in x])
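If the goal is only to turn every value into 0 or 1, a vectorized comparison may be simpler. This is a sketch on made-up data, not necessarily what the asker's frame looks like:

```python
import pandas as pd

# Hypothetical numeric frame for illustration.
df = pd.DataFrame({"a": [0, 1, 5], "b": [2, 0, 1]})

# Element-by-element inside apply, as in the answer above.
per_element = df.apply(lambda x: [1 if e >= 1 else 0 for e in x])

# Vectorized: compare the whole frame at once, then cast the Booleans to int.
vectorized = (df >= 1).astype(int)

print(vectorized)
```

The comparison df >= 1 never needs a single truth value, so the ambiguity error cannot occur.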
For Loop in Python Error: The truth value of a Series is ambiguous
Update 3
I rewrote your function in the same style as yours, without changing the logic or the types of your columns. You can compare the two versions:
def delivery_year(delivery_date, build_year, service_year):
    out = []
    for i in range(len(delivery_date)):
        if pd.notna(delivery_date[i]):
            out.append(delivery_date[i])
        elif pd.isna(delivery_date[i]) and pd.notna(build_year[i]):
            out.append(build_year[i])
        elif pd.isna(build_year[i]) and pd.notna(service_year[i]):
            out.append(service_year[i].strip()[-4:])
        else:
            out.append(float("nan"))
    return out

df["Delivery Year"] = delivery_year(df["Delivery Date"],
                                    df["Build Year"],
                                    df["In Service Date"])
Notes:
- I changed the name of your first parameter because delivery_year is also the name of your function, so it can be confusing.
- I also replaced the .isna() and .notna() methods by their equivalent functions: pd.isna(...) and pd.notna(...).
- The second 'if' became 'elif'.
Use combine_first to replace your function. combine_first updates the first series ('Delivery Date') with the second series where values are NaN. You can chain them to fill your 'Delivery Year'.
df['Delivery Year'] = df['Delivery Date'] \
.combine_first(df['Build Year']) \
.combine_first(df['In Service Date'].str[-4:])
Output:
>>> df
   Platform ID  Delivery Date  Build Year  In Service Date  Delivery Year
0            1           2009         NaN              NaN           2009
1            2            NaN        2009       14-11-2010           2009
2            3            NaN         NaN       14-11-2009           2009
3            4            NaN         NaN              NaN            NaN
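To try the combine_first chain yourself, here is a sketch that rebuilds a frame shaped like the output above. I am assuming the year columns hold strings, which the question does not confirm:

```python
import numpy as np
import pandas as pd

# Reconstructed sample; string-typed years are an assumption.
df = pd.DataFrame({
    "Platform ID": [1, 2, 3, 4],
    "Delivery Date": ["2009", np.nan, np.nan, np.nan],
    "Build Year": [np.nan, "2009", np.nan, np.nan],
    "In Service Date": [np.nan, "14-11-2010", "14-11-2009", np.nan],
})

# Each combine_first fills only the values that are still missing.
df["Delivery Year"] = (
    df["Delivery Date"]
    .combine_first(df["Build Year"])
    .combine_first(df["In Service Date"].str[-4:])
)

print(df["Delivery Year"].tolist())
```

The order of the chain sets the priority: 'Delivery Date' wins over 'Build Year', which wins over the last four characters of 'In Service Date'.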
Update
You forgot the [i]:
if delivery_year[i].notna():
The truth value of a Series is ambiguous:
>>> delivery_year.notna()
0 True # <- 2009
1 False # <- NaN
2 False
3 False
Name: Delivery Date, dtype: bool
Should pandas consider the Series to be True (2009) or False (NaN)? You have to aggregate the result with .any() or .all():
>>> delivery_year.notna().any()
True  # because there is at least one non-NaN value.
>>> delivery_year.notna().all()
False  # because not all values are non-NaN.
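The difference between checking one element and checking the whole Series can be sketched like this. The Series is a made-up version of the 'Delivery Date' column:

```python
import numpy as np
import pandas as pd

# Hypothetical column mirroring the example above.
delivery_date = pd.Series([2009, np.nan, np.nan, np.nan], name="Delivery Date")

mask = delivery_date.notna()               # Boolean Series, one value per row
first_ok = pd.notna(delivery_date.iloc[0])  # single bool, safe to use in an `if`
any_ok = mask.any()                         # True if at least one row is non-NaN
all_ok = mask.all()                         # True only if every row is non-NaN

if first_ok:
    print("first row has a delivery date")
```

Inside a loop over rows you want the scalar check; for a question about the whole column you want .any() or .all().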
Why am i getting this error The truth value of a Series is ambiguous python
You can use the apply() method:
df_reps["Cat"] = df_reps["TermCnt"].apply(lambda x: x if x < 3 else 99)
Now if you print df_reps you will get your desired output.
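A vectorized alternative is Series.where, which keeps values where the condition holds and replaces the rest. This is a sketch on invented data; only the column names come from the question:

```python
import pandas as pd

# Hypothetical sample data.
df_reps = pd.DataFrame({"TermCnt": [1, 2, 3, 7]})

# Element by element, as in the answer above.
with_apply = df_reps["TermCnt"].apply(lambda x: x if x < 3 else 99)

# Vectorized: keep values below 3, replace everything else with 99.
with_where = df_reps["TermCnt"].where(df_reps["TermCnt"] < 3, 99)

df_reps["Cat"] = with_where
print(df_reps)
```

Both produce the same column; where avoids calling a Python function once per row.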