What Does 'ValueError: cannot reindex from a duplicate axis' Mean

What does `ValueError: cannot reindex from a duplicate axis` mean?

This error usually arises when you join on, or assign to, a column while the index has duplicate values. Since you are assigning to a row, I suspect that there is a duplicate value in affinity_matrix.columns, perhaps not shown in your question.
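A minimal sketch of how the error can be triggered, assuming a small made-up frame `df` and Series `s` (neither is from the original question). Assigning `s` forces pandas to reindex it to `df.index`, which is ambiguous because label 1 appears twice in `s`:

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"a": [1, 2, 3]})            # unique index: 0, 1, 2
s = pd.Series([10, 20, 30], index=[0, 1, 1])   # label 1 appears twice

try:
    # pandas has to reindex `s` to df.index before storing it
    df["b"] = s
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

print(raised, message)
```

The exact wording of the message depends on the pandas version; newer releases say "cannot reindex on an axis with duplicate labels".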

ValueError: cannot reindex from a duplicate axis in groupby Pandas

This error is often thrown because of duplicated column names (not necessarily duplicated values).

First, just check if there is any duplication in your column names using the code:
df.columns.duplicated().any()

If it returns True, remove the duplicated columns (this keeps the first occurrence of each name) and assign the result back:

df = df.loc[:, ~df.columns.duplicated()]

After you remove the duplicated columns, you should be able to run your groupby operation.
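The steps above can be sketched end to end. The frame and the column names `key` and `val` are assumptions made up for this example:

```python
import pandas as pd

# Hypothetical frame with the column name "val" used twice
df = pd.DataFrame([[1, 2, 3], [1, 5, 6]], columns=["key", "val", "val"])

has_dupes = df.columns.duplicated().any()                   # any repeated names?
dupe_names = df.columns[df.columns.duplicated()].tolist()   # which names repeat

# Keep only the first occurrence of each column name
deduped = df.loc[:, ~df.columns.duplicated()]

# The groupby now sees a single, unambiguous "val" column
result = deduped.groupby("key")["val"].sum()
print(result)
```

Note that this silently drops the later column with the same name; if both columns carry data you need, renaming one of them is the safer choice.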

Convenient way to deal with ValueError: cannot reindex from a duplicate axis

Operations between Series align values on the index, so the indices should not contain duplicate labels; otherwise pandas doesn't know how to align the values in calculations. Your data currently has duplicate index labels.
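To find out which object carries the duplicate labels, a quick check along these lines may help. The frame `temp` here is a small stand-in with a deliberately duplicated index:

```python
import pandas as pd

# Stand-in frame whose index contains the label 1 twice
temp = pd.DataFrame({"stamp": ["1", "2", "3"]}, index=[0, 1, 1])

is_unique = temp.index.is_unique   # False when any label repeats

# keep=False marks every occurrence of a repeated label
dupes = temp.index[temp.index.duplicated(keep=False)].unique().tolist()
print(is_unique, dupes)
```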

If you are certain that your series are aligned by position, you can call reset_index on each dataframe:

import pandas as pd

wind = pd.DataFrame({'DATE (MM/DD/YYYY)': ['2018-01-01', '2018-02-01', '2018-03-01']})
temp = pd.DataFrame({'stamp': ['1', '2', '3']}, index=[0, 1, 1])

# ATTEMPT 1: FAIL
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']
# ValueError: cannot reindex from a duplicate axis

# ATTEMPT 2: SUCCESS
wind = wind.reset_index(drop=True)
temp = temp.reset_index(drop=True)
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']

print(wind)

  DATE (MM/DD/YYYY)     timestamp
0        2018-01-01  2018-01-01 1
1        2018-02-01  2018-02-01 2
2        2018-03-01  2018-03-01 3
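If only one side has a problematic index, another option is to strip the index from that side so pandas combines the values purely by position. This is a sketch of an alternative, not the approach used above, and it assumes both objects have the same number of rows in the same order:

```python
import pandas as pd

wind = pd.DataFrame({"DATE (MM/DD/YYYY)": ["2018-01-01", "2018-02-01", "2018-03-01"]})
temp = pd.DataFrame({"stamp": ["1", "2", "3"]}, index=[0, 1, 1])

# to_numpy() drops the duplicated index, so no label alignment is attempted
wind["timestamp"] = wind["DATE (MM/DD/YYYY)"] + " " + temp["stamp"].to_numpy()
print(wind)
```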

How to fix ValueError: cannot reindex on an axis with duplicate labels in Python?

Pad the lists so that both columns hold lists of the same length in every row; the two columns can then be exploded at the same time.

import pandas as pd

data = [
    [1, "user1", [1, 2, 3, 4], ["absd", "efgh", "ij``k"]],
    [2, "user2", [5, 6, 7, 8], ["lmkf", "sfajf"]],
    [3, "user3", [9], []],
]
df = pd.DataFrame(
    data,
    columns=list("ABCD")
)

def fill_list(a, length):
    # Return a copy of `a` padded with None up to `length`
    _a = a.copy()
    tail = [None for i in range(length - len(a))]
    _a.extend(tail)
    return _a

# Pad each list in D to the length of the list in C, then explode both columns together
df.assign(
    D=df[["C", "D"]].apply(lambda x: fill_list(x["D"], len(x["C"])), axis=1, raw=False)
).explode(["C", "D"])

Tested with pandas 1.3.5. Exploding several columns in one call requires pandas 1.3.0 or later.
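To see why the padding matters, here is a small sketch (with made-up data, assuming pandas 1.3.0 or later) of what happens when the lists in a row have different lengths and the columns are exploded together without padding:

```python
import pandas as pd

# Hypothetical row: C has three elements, D has only two
df = pd.DataFrame({"C": [[1, 2, 3]], "D": [["a", "b"]]})

try:
    df.explode(["C", "D"])
    error = None
except ValueError as exc:
    # pandas refuses to explode columns whose element counts differ
    error = str(exc)

print(error)
```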

Solution for multiple columns

import pandas as pd

data = [
    [1, "user1", [1, 2, 3, 4], ["absd", "efgh", "ij``k"], [3, 2]],
    [2, "user2", [5, 6, 7, 8], ["lmkf", "sfajf"], [3, 2, 1, 4, 2, 6]],
    [3, "user3", [9], [], [3, 2]],
]
df = pd.DataFrame(
    data,
    columns=list("ABCDE")
)

def fill_list(*lists):
    # Pad every list (in place) with None up to the length of the longest one
    _lists = lists[:]
    max_len = max([len(x) for x in _lists])
    for l in _lists:
        tail = [None for i in range(max_len - len(l))]
        l.extend(tail)
    return _lists

list_cols = ["C", "D", "E"]

# Pad the list columns row by row, then explode them together
df[list_cols] = df[list_cols].apply(lambda x: fill_list(*x), axis=1, raw=False, result_type="expand")
df.explode(list_cols)
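Before padding, it can be useful to check which rows actually have mismatched list lengths. The following is a sketch on a small made-up frame; it relies on `Series.str.len()`, which also returns the length of list elements:

```python
import pandas as pd

# Hypothetical data: row 0 is mismatched (4 vs 3), row 1 is fine (2 vs 2)
df = pd.DataFrame({
    "C": [[1, 2, 3, 4], [5, 6]],
    "D": [["a", "b", "c"], ["d", "e"]],
})
list_cols = ["C", "D"]

# Length of each list, per row and per column
lengths = df[list_cols].apply(lambda col: col.str.len())

# A row is mismatched when its lists do not all share one length
mismatched = lengths.nunique(axis=1) > 1
print(lengths)
print(mismatched)
```

Rows flagged here are the ones that need padding (or cleaning) before a multi-column explode.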

ValueError: cannot reindex from a duplicate axis in explode

Try this:

df.explode(['Product_ID', 'No_of_items'])

   initial_row_index        Date Product_ID No_of_items
0                  1  2021-07-11       A13N           3
0                  1  2021-07-11       A4BE           5
0                  1  2021-07-11       5GH$           1
1                  2  2021-07-12       A13N           7
1                  2  2021-07-12       X9HE           2
1                  2  2021-07-12       7H3T           4
2                  3  2021-07-13       A4BE           8
2                  3  2021-07-13       X9HE           4
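As the output shows, the exploded frame repeats index labels (0, 0, 0, 1, ...), which can itself lead to the duplicate-axis error in later assignments. One way to avoid that, sketched here on a shortened, illustrative version of the data, is the `ignore_index=True` option of `explode` (multi-column explode assumes pandas 1.3.0 or later):

```python
import pandas as pd

# Shortened, illustrative version of the frame
df = pd.DataFrame({
    "Date": ["2021-07-11", "2021-07-12"],
    "Product_ID": [["A13N", "A4BE"], ["X9HE"]],
    "No_of_items": [[3, 5], [7]],
})

# ignore_index=True relabels the result 0..n-1 instead of repeating labels
exploded = df.explode(["Product_ID", "No_of_items"], ignore_index=True)
print(exploded)
```

Calling `reset_index(drop=True)` on the exploded frame has the same effect if you prefer to keep the two steps separate.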

