Python-Pandas: the Truth Value of a Series Is Ambiguous

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

The Python keywords or and and require single truth values. For pandas objects these are ambiguous, so use the "bitwise" operators | (or) and & (and) instead:

df = df[(df['col'] < -0.25) | (df['col'] > 0.25)]

pandas overloads these operators so that | yields the element-wise or and & yields the element-wise and.
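As a minimal, self-contained sketch of the filter above: the column name col comes from the snippet, while the sample values are made up for illustration. Each parenthesized comparison produces a Boolean Series, and | combines the two element by element.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'col' comes from the text above.
df = pd.DataFrame({'col': [-0.5, -0.1, 0.0, 0.2, 0.3]})

# Each comparison yields a Boolean Series; | combines them element-wise.
mask = (df['col'] < -0.25) | (df['col'] > 0.25)
filtered = df[mask]

print(mask.tolist())
print(filtered['col'].tolist())
```

Only the rows where the mask is True survive the indexing step.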


Just to add some more explanation to this statement:

The exception is thrown when you want to get the bool of a pandas.Series:

>>> import pandas as pd
>>> x = pd.Series([1])
>>> bool(x)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

You hit a place where the operator implicitly converted the operands to bool (you used or but it also happens for and, if and while):

>>> x or x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> x and x
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> if x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> while x:
...     print('fun')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Besides these four statements, there are several Python functions that hide some bool calls (like any, all, filter, ...). These are normally not problematic with pandas.Series, but for completeness I wanted to mention these.


In your case, the exception isn't really helpful, because it doesn't mention the right alternatives. For and and or, if you want element-wise comparisons, you can use:

  • numpy.logical_or:

    >>> import numpy as np
    >>> np.logical_or(x, y)

    or simply the | operator:

    >>> x | y

  • numpy.logical_and:

    >>> np.logical_and(x, y)

    or simply the & operator:

    >>> x & y

If you're using the operators, then be sure to set your parentheses correctly because of operator precedence.
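To illustrate the precedence point, here is a small sketch with two made-up integer Series (a and b are hypothetical names). Because & binds more tightly than the comparison operators, leaving out the parentheses turns the expression into a chained comparison, which implicitly uses and and so raises the same error.

```python
import pandas as pd

# Hypothetical integer Series for illustration.
a = pd.Series([1, 2, 3])
b = pd.Series([3, 2, 1])

# With parentheses: two Boolean Series combined element-wise.
ok = (a > 1) & (b < 3)

# Without parentheses, & binds tighter than > and <, so Python evaluates
# a > (1 & b) < 3, a chained comparison that implicitly uses `and`.
try:
    a > 1 & b < 3
    raised = False
except ValueError:
    raised = True

print(ok.tolist(), raised)
```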

Several other logical NumPy functions, such as numpy.logical_not and numpy.logical_xor, also work on pandas.Series.
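As a brief sketch of two such functions applied to Boolean Series (the sample values are invented), numpy.logical_not and numpy.logical_xor operate element-wise, and the ~ and ^ operators should give the same results:

```python
import numpy as np
import pandas as pd

# Hypothetical Boolean Series for illustration.
x = pd.Series([True, False, True])
y = pd.Series([True, True, False])

negated = np.logical_not(x)       # element-wise not
exclusive = np.logical_xor(x, y)  # element-wise exclusive or

print(negated.tolist())
print(exclusive.tolist())

# The operator forms on Boolean Series.
print((~x).tolist(), (x ^ y).tolist())
```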


The alternatives mentioned in the exception are better suited if you hit it in an if or while statement. I'll briefly explain each of them:

  • If you want to check if your Series is empty:

    >>> x = pd.Series([])
    >>> x.empty
    True
    >>> x = pd.Series([1])
    >>> x.empty
    False

    Python normally uses the length of a container (list, tuple, ...) as its truth value when the container defines no explicit Boolean interpretation. So for the Python-like check, use if x.size or if not x.empty instead of if x.

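  • If your Series contains one and only one Boolean value (note that .bool() is deprecated as of pandas 2.1; on newer versions, .item() is the usual replacement):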

    >>> x = pd.Series([100])
    >>> (x > 50).bool()
    True
    >>> (x < 50).bool()
    False

  • If you want to check the first and only item of your Series (like .bool(), but it works even for non-Boolean contents):

    >>> x = pd.Series([100])
    >>> x.item()
    100

  • If you want to check if all or any item is not-zero, not-empty or not-False:

    >>> x = pd.Series([0, 1, 2])
    >>> x.all()  # because one element is zero
    False
    >>> x.any()  # because one (or more) elements are non-zero
    True
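Putting these alternatives together, a sketch of how an if chain might spell out its intent explicitly instead of writing if s. The describe function and its labels are hypothetical, chosen only for this illustration.

```python
import pandas as pd

def describe(s):
    """Hypothetical helper showing explicit truth tests on a Series."""
    if s.empty:             # no elements at all
        return 'empty'
    if (s > 0).all():       # every element satisfies the condition
        return 'all positive'
    if (s > 0).any():       # at least one element satisfies it
        return 'some positive'
    return 'none positive'

print(describe(pd.Series([], dtype='float64')))
print(describe(pd.Series([1, 2])))
print(describe(pd.Series([0, 1])))
print(describe(pd.Series([-1, 0])))
```

Each branch reduces the Series to a single Boolean before Python's if sees it, so no ambiguity is left.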

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). df[condition]

As Michael Szczesny also pointed out in the comments, DataFrame.apply passes a Series to the function, but the change(name) function as defined expects a string. The message ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). comes from comparing a Series to a string and then using the resulting Boolean Series where a single True/False is expected.

One fix pointed out by Register Sole is to use conditions instead.

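condition = (df['embark_town'] == 'Southampton')
# Assign through .loc so the change is written to df itself;
# df[condition]['embark_town'] = ... would modify a temporary copy.
df.loc[condition, 'embark_town'] = 'Manchester'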

To keep using apply, the change function would need to look something like this:

def change(series):
    if series.name == 'embark_town':
        series[series.values == 'Southampton'] = 'Manchester'

    return series
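For a plain value substitution like this one, a further option (not part of the original answers, offered only as a sketch) is Series.replace, which needs neither a mask nor apply. The sample rows and the fare column below are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only 'embark_town' and its values come from the text.
df = pd.DataFrame({'embark_town': ['Southampton', 'Cherbourg', 'Southampton'],
                   'fare': [7.25, 71.28, 8.05]})

# Replace every occurrence of one value in the column and assign the result back.
df['embark_town'] = df['embark_town'].replace('Southampton', 'Manchester')

print(df['embark_town'].tolist())
```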

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). in conditional in python

Are these Pandas DataFrames?

df1['new_column'] = df1.apply(lambda row: 1 if row.Year > 5 and row.TotalMntProducts > 2000 else 0, axis=1)

Improved edit: let's vectorize.

df1['new_column'] = (df1.Year.gt(5) & df1.TotalMntProducts.gt(2000)).astype(int)
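A quick way to convince yourself that the row-wise and vectorized forms agree is to run both on a small frame. The column names come from the snippets above; the values are invented for this sketch.

```python
import pandas as pd

# Hypothetical sample values; column names are taken from the answer above.
df1 = pd.DataFrame({'Year': [3, 6, 10, 7],
                    'TotalMntProducts': [2500, 1500, 3000, 2000]})

# Row-wise: `and` is fine here because each row.Year is a scalar.
row_wise = df1.apply(
    lambda row: 1 if row.Year > 5 and row.TotalMntProducts > 2000 else 0,
    axis=1)

# Vectorized: whole columns are compared, so & is required instead of `and`.
vectorized = (df1.Year.gt(5) & df1.TotalMntProducts.gt(2000)).astype(int)

print(row_wise.tolist())
print(vectorized.tolist())
```

The row-wise version works with and because each value is a scalar, whereas the vectorized version compares entire columns and therefore needs &.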

Pandas dataframe ValueError: The truth value of a Series is ambiguous when using .apply

This happens because, inside your lambda, x is a Series representing a whole column, so x >= 1 returns a Series of True/False values rather than a single Boolean.

You could reduce it to a single value with (x >= 1).all() or (x >= 1).any(), but that won't suit your needs.

Instead you may use the below to transform each value in the df:

df.apply(lambda x : [1 if e >= 1 else 0 for e in x])
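If every column is numeric, a vectorized form should produce the same 0/1 values without a Python-level loop. This is a sketch under that assumption, with a made-up frame:

```python
import pandas as pd

# Hypothetical numeric frame for illustration.
df = pd.DataFrame({'a': [0, 1, 5], 'b': [2, 0.5, -1]})

# Compare the whole frame at once, then convert True/False to 1/0.
vectorized = (df >= 1).astype(int)

print(vectorized['a'].tolist())
print(vectorized['b'].tolist())
```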

For Loop in Python Error: The truth value of a Series is ambiguous

Update 3

I rewrote your function in the same style as yours, without changing the logic or the types of your columns. You can compare the two versions:

def delivery_year(delivery_date, build_year, service_year):
    out = []
    for i in range(len(delivery_date)):
        if pd.notna(delivery_date[i]):
            out.append(delivery_date[i])
        elif pd.isna(delivery_date[i]) and pd.notna(build_year[i]):
            out.append(build_year[i])
        elif pd.isna(build_year[i]) and pd.notna(service_year[i]):
            out.append(service_year[i].strip()[-4:])
        else:
            out.append(float("nan"))
    return out

df["Delivery Year"] = delivery_year(df["Delivery Date"],
                                    df["Build Year"],
                                    df["In Service Date"])

Notes:

  1. I changed the name of your first parameter because delivery_year is also the name of your function, so it can be confusing.

  2. I also replaced the .isna() and .notna() methods by their equivalent functions: pd.isna(...) and pd.notna(...).

  3. The second if became elif

Update 2

Use combine_first to replace your function. combine_first fills the NaN values of the first Series ('Delivery Date') with the values from the second Series. You can chain calls to fill your 'Delivery Year'.

df['Delivery Year'] = df['Delivery Date'] \
    .combine_first(df['Build Year']) \
    .combine_first(df['In Service Date'].str[-4:])

Output:

>>> df
   Platform ID  Delivery Date  Build Year  In Service Date  Delivery Year
0            1           2009         NaN              NaN           2009
1            2            NaN        2009       14-11-2010           2009
2            3            NaN         NaN       14-11-2009           2009
3            4            NaN         NaN              NaN            NaN
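For reference, a self-contained sketch of the combine_first chain above, using the four rows shown in the output. One variation is my own assumption rather than part of the answer: the year sliced from 'In Service Date' is a string, so it is converted with pd.to_numeric here to keep the resulting column uniformly numeric.

```python
import numpy as np
import pandas as pd

# Sample rows reconstructed from the output shown above.
df = pd.DataFrame({
    'Platform ID': [1, 2, 3, 4],
    'Delivery Date': [2009, np.nan, np.nan, np.nan],
    'Build Year': [np.nan, 2009, np.nan, np.nan],
    'In Service Date': [np.nan, '14-11-2010', '14-11-2009', np.nan],
})

# Assumption: the year is always the last four characters of the date string.
service_year = pd.to_numeric(df['In Service Date'].str[-4:])

# Each combine_first fills only the positions that are still NaN.
df['Delivery Year'] = (df['Delivery Date']
                       .combine_first(df['Build Year'])
                       .combine_first(service_year))

print(df['Delivery Year'].tolist())
```

Row 1 keeps 2009 from 'Build Year' even though its service date says 2010, because earlier Series in the chain take priority.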

Update

You forgot the [i]:

if delivery_year[i].notna():

The truth value of a Series is ambiguous:

>>> delivery_year.notna()
0     True  # <- 2009
1    False  # <- NaN
2    False
3    False
Name: Delivery Date, dtype: bool

Should pandas consider the Series True (because of 2009) or False (because of the NaN values)?

You have to aggregate the result with .any() or .all():

>>> delivery_year.notna().any()
True  # because there is at least one non-NaN value

>>> delivery_year.notna().all()
False  # because not all values are non-NaN

Why am i getting this error The truth value of a Series is ambiguous python

You can use the apply() method:

df_reps["Cat"]=df_reps["TermCnt"].apply(lambda x:x if x < 3 else 99)

Now if you print df_reps, you will get your desired output.
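As an alternative sketch (not from the original answer), Series.where expresses the same rule without a lambda: it keeps values where the condition holds and substitutes 99 elsewhere. The sample TermCnt values are invented.

```python
import pandas as pd

# Hypothetical sample data; the column names come from the answer above.
df_reps = pd.DataFrame({'TermCnt': [1, 2, 3, 7]})

# Keep TermCnt where it is below 3; use 99 everywhere else.
df_reps['Cat'] = df_reps['TermCnt'].where(df_reps['TermCnt'] < 3, 99)

print(df_reps['Cat'].tolist())
```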


