What does `ValueError: cannot reindex from a duplicate axis` mean?
This error usually arises when you join on, or assign to, a column while the index has duplicate values. Since you are assigning to a row, I suspect there is a duplicate label in affinity_matrix.columns, perhaps not shown in your question.
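One way to confirm that suspicion is to list the duplicated labels on each axis. The sketch below is only illustrative: the helper name duplicate_labels and the sample affinity_matrix values are made up here, and the only pandas call doing the real work is Index.duplicated.

```python
import pandas as pd


def duplicate_labels(frame):
    """Hypothetical helper: return the labels that occur more than once on each axis."""
    idx = frame.index[frame.index.duplicated()].unique().tolist()
    cols = frame.columns[frame.columns.duplicated()].unique().tolist()
    return {"index": idx, "columns": cols}


# Assumed sample data: the column label "b" appears twice.
affinity_matrix = pd.DataFrame(
    [[1.0, 0.2, 0.2], [0.2, 1.0, 0.5]],
    index=["a", "b"],
    columns=["a", "b", "b"],
)

report = duplicate_labels(affinity_matrix)
print(report)  # {'index': [], 'columns': ['b']}
```

If either list is non-empty, that axis is the likely source of the reindexing error.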
ValueError: cannot reindex from a duplicate axis in groupby Pandas
This error is often thrown because of duplicated column names (not necessarily duplicated values).
First, check whether any column names are duplicated:
df.columns.duplicated().any()
If this returns True, remove the duplicated columns, keeping the first occurrence of each name:
df = df.loc[:, ~df.columns.duplicated()]
After removing the duplicated columns, you should be able to run your groupby operation.
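As a small worked example of this check-then-drop approach, here is a sketch on a made-up frame (the column names key and val and all values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical frame in which the column name "val" appears twice.
df = pd.DataFrame(
    [[1, 2, 3], [1, 5, 6], [2, 7, 8]],
    columns=["key", "val", "val"],
)

# True when at least one column name is repeated.
has_duplicates = bool(df.columns.duplicated().any())

# Keep only the first column for each name; the second "val" column is discarded.
deduped = df.loc[:, ~df.columns.duplicated()]

# With unique column names, selecting "val" yields a single column to aggregate.
totals = deduped.groupby("key")["val"].sum()
print(totals)
```

Be aware that this throws away the data in the later duplicate columns. If those columns hold different data, renaming them may be a better choice than dropping them.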
Convenient way to deal with ValueError: cannot reindex from a duplicate axis
Operations between series require non-duplicated indices; otherwise Pandas doesn't know how to align values in calculations. Your data currently does not meet that requirement.
If you are certain that your series are aligned by position, you can call reset_index(drop=True) on each dataframe:
import pandas as pd
wind = pd.DataFrame({'DATE (MM/DD/YYYY)': ['2018-01-01', '2018-02-01', '2018-03-01']})
temp = pd.DataFrame({'stamp': ['1', '2', '3']}, index=[0, 1, 1])  # index label 1 appears twice
# ATTEMPT 1: FAIL
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']
# ValueError: cannot reindex from a duplicate axis
# (newer pandas versions word this as "cannot reindex on an axis with duplicate labels")
# ATTEMPT 2: SUCCESS
wind = wind.reset_index(drop=True)
temp = temp.reset_index(drop=True)  # index is now 0, 1, 2
wind['timestamp'] = wind['DATE (MM/DD/YYYY)'] + ' ' + temp['stamp']
print(wind)
   DATE (MM/DD/YYYY)     timestamp
0         2018-01-01  2018-01-01 1
1         2018-02-01  2018-02-01 2
2         2018-03-01  2018-03-01 3
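If you would rather not reset the index of either frame, another way to express "align by position" is to hand pandas a plain NumPy array, which has no index to align on. The following is a sketch under that assumption; assign_by_position is a hypothetical helper written for this example, not a pandas function.

```python
import pandas as pd


def assign_by_position(target, column, values):
    """Hypothetical helper: add `values` to `target` row by row, ignoring index labels."""
    if len(values) != len(target):
        raise ValueError(
            f"length mismatch: {len(values)} values for {len(target)} rows"
        )
    out = target.copy()
    # A NumPy array carries no index, so pandas assigns it purely by position.
    out[column] = values.to_numpy()
    return out


wind = pd.DataFrame({"DATE (MM/DD/YYYY)": ["2018-01-01", "2018-02-01", "2018-03-01"]})
temp = pd.DataFrame({"stamp": ["1", "2", "3"]}, index=[0, 1, 1])

combined = assign_by_position(wind, "stamp", temp["stamp"])
# Both operands now share the same unique index, so alignment is unambiguous.
combined["timestamp"] = combined["DATE (MM/DD/YYYY)"] + " " + combined["stamp"]
print(combined["timestamp"].tolist())
```

The length check is there because positional assignment silently pairs up the wrong rows if the two frames are not actually in the same order and of the same size.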
How to fix ValueError: cannot reindex on an axis with duplicate labels in python when I try to do?
Pad the lists so that both columns hold lists of the same length in every row; the two columns can then be exploded at the same time.
import pandas as pd

data = [
    [1, "user1", [1, 2, 3, 4], ["absd", "efgh", "ij``k"]],
    [2, "user2", [5, 6, 7, 8], ["lmkf", "sfajf"]],
    [3, "user3", [9], []],
]
df = pd.DataFrame(data, columns=list("ABCD"))

def fill_list(a, length):
    # Return a copy of `a` padded with None up to `length` items.
    return a + [None] * (length - len(a))

# Pad each list in D to the length of the list in C, then explode both columns together.
df.assign(
    D=df[["C", "D"]].apply(lambda x: fill_list(x["D"], len(x["C"])), axis=1)
).explode(["C", "D"])
Tested with pandas 1.3.5. Passing a list of columns to explode requires pandas 1.3.0 or later.
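Before padding, it can be useful to see which rows actually have lists of different lengths. A minimal sketch, reusing the C and D columns from the example data (the variable names c_len, d_len and mismatched are made up here):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "C": [[1, 2, 3, 4], [5, 6, 7, 8], [9]],
        "D": [["absd", "efgh", "ij``k"], ["lmkf", "sfajf"], []],
    }
)

# Length of the list stored in each cell.
c_len = df["C"].map(len)
d_len = df["D"].map(len)

# Index labels of rows whose two lists differ in length;
# these are the rows that need padding before a joint explode.
mismatched = df.index[c_len != d_len].tolist()
print(mismatched)  # [0, 1, 2]
```

In this sample every row is mismatched, which is why the padding step is needed for all of them.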
Solution for multiple columns
import pandas as pd

data = [
    [1, "user1", [1, 2, 3, 4], ["absd", "efgh", "ij``k"], [3, 2]],
    [2, "user2", [5, 6, 7, 8], ["lmkf", "sfajf"], [3, 2, 1, 4, 2, 6]],
    [3, "user3", [9], [], [3, 2]],
]
df = pd.DataFrame(data, columns=list("ABCDE"))

def fill_list(*lists):
    # Pad every list with None up to the length of the longest one (copies, no in-place changes).
    max_len = max(len(x) for x in lists)
    return [list(x) + [None] * (max_len - len(x)) for x in lists]

list_cols = ["C", "D", "E"]
# result_type="expand" spreads the padded lists back across the three columns.
df[list_cols] = df[list_cols].apply(lambda x: fill_list(*x), axis=1, result_type="expand")
df.explode(list_cols)
ValueError: cannot reindex from a duplicate axis in explode
Try exploding both list columns in a single call:
df.explode(['Product_ID', 'No_of_items'])
   initial_row_index        Date  Product_ID  No_of_items
0                  1  2021-07-11        A13N            3
0                  1  2021-07-11        A4BE            5
0                  1  2021-07-11        5GH$            1
1                  2  2021-07-12        A13N            7
1                  2  2021-07-12        X9HE            2
1                  2  2021-07-12        7H3T            4
2                  3  2021-07-13        A4BE            8
2                  3  2021-07-13        X9HE            4
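Note that the exploded frame repeats each original index label (0, 0, 0, 1, ...), so later index-aligned operations on it can run into the same duplicate-axis error. A sketch of one way around that, using the ignore_index parameter of explode; the input frame below is an assumption reconstructed from the output above, not the asker's actual data.

```python
import pandas as pd

# Hypothetical input, reconstructed from the exploded output shown above.
df = pd.DataFrame(
    {
        "initial_row_index": [1, 2, 3],
        "Date": ["2021-07-11", "2021-07-12", "2021-07-13"],
        "Product_ID": [["A13N", "A4BE", "5GH$"], ["A13N", "X9HE", "7H3T"], ["A4BE", "X9HE"]],
        "No_of_items": [[3, 5, 1], [7, 2, 4], [8, 4]],
    }
)

# Default behaviour: each new row keeps the index label of the row it came from.
exploded = df.explode(["Product_ID", "No_of_items"])
print(exploded.index.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2]

# ignore_index=True replaces those repeated labels with a fresh 0..n-1 index.
flat = df.explode(["Product_ID", "No_of_items"], ignore_index=True)
print(flat.index.is_unique)  # True
```

The initial_row_index column still records which source row each exploded row came from, so nothing is lost by discarding the repeated index labels.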