Pandas "Can Only Compare Identically-Labeled Dataframe Objects" Error

Pandas Can only compare identically-labeled DataFrame objects error

Here's a small example that demonstrates the error (it applied only to DataFrames, not Series, until pandas 0.19; since then it applies to both):

In [1]: df1 = pd.DataFrame([[1, 2], [3, 4]])

In [2]: df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

In [3]: df1 == df2
Exception: Can only compare identically-labeled DataFrame objects

One solution is to sort the index first (Note: some functions require sorted indexes):

In [4]: df2.sort_index(inplace=True)

In [5]: df1 == df2
Out[5]: 
      0     1
0  True  True
1  True  True

Note: == is also sensitive to the order of columns, so you may have to use sort_index(axis=1):

In [11]: df1.sort_index().sort_index(axis=1) == df2.sort_index().sort_index(axis=1)
Out[11]: 
      0     1
0  True  True
1  True  True

Note: This can still raise (if the index/columns aren't identically labelled after sorting).
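One way to avoid the exception in that last case is to check the labels before comparing. The following is only a sketch: `safe_eq` is a hypothetical helper name, not part of pandas, and `df3` is made-up data. It sorts both axes, checks that the index and columns match, and only then applies ==.

```python
import pandas as pd


def safe_eq(left, right):
    """Hypothetical helper: element-wise == after sorting both axes.

    Returns None instead of raising when the labels still differ after sorting.
    """
    left = left.sort_index().sort_index(axis=1)
    right = right.sort_index().sort_index(axis=1)
    same_labels = (left.index.equals(right.index)
                   and left.columns.equals(right.columns))
    return left == right if same_labels else None


df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])
df3 = pd.DataFrame([[1, 2], [3, 4]], index=[0, 2])  # label 2 has no match in df1

result_ok = safe_eq(df1, df2)    # boolean DataFrame
result_bad = safe_eq(df1, df3)   # None, because the row labels differ
```

Returning None is just one possible design; raising a more descriptive error or reindexing one frame to the other would work as well.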

Comparing 2 dataframes gives: Can only compare identically-labeled DataFrame objects

You can use reindex_like to give bru2 the same index and columns as bru, then compare the DataFrames:

bru2.reindex_like(bru).compare(bru)

And you can use pd.Index.difference to find the rows or columns in bru that are not in bru2.

bru.index.difference(bru2.index)  # and likewise with bru.columns and bru2.columns
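A small end-to-end sketch of both calls follows. The contents of `bru` and `bru2` are made up here, since the original frames are not shown, and `DataFrame.compare` assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical stand-ins for the bru and bru2 frames from the question.
bru = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])
bru2 = pd.DataFrame({"b": [5, 4], "a": [2, 10]}, index=["y", "x"])

# Labels present in bru but absent from bru2.
missing_rows = bru.index.difference(bru2.index)
missing_cols = bru.columns.difference(bru2.columns)

# reindex_like gives bru2 the row and column labels of bru;
# rows that bru2 lacks are filled with NaN.
aligned = bru2.reindex_like(bru)

# compare reports only the cells that differ, in "self"/"other" columns.
diff = aligned.compare(bru)
print(missing_rows)
print(diff)
```

Rows that exist only in bru show up in the comparison as NaN on the bru2 side, so it can help to look at the missing labels first.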

Compare two DataFrames for differences but getting 'Can only compare identically-labeled DataFrame objects' error

It seems some index values are different. You can extract the ones present in both DataFrames with Index.intersection:

BOOL_FIELDS = ['is_mobile','is_desktop','is_cancelled','is_existing_customer']

customer_df_2020.set_index('customer_id',inplace=True)
customer_df_2021.set_index('customer_id',inplace=True)

sameidx = customer_df_2020.index.intersection(customer_df_2021.index)

temp_df  = (customer_df_2020.loc[sameidx, BOOL_FIELDS] != 
            customer_df_2021.loc[sameidx, BOOL_FIELDS])
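To see the effect on concrete data, here is a self-contained sketch. The customer records are invented and BOOL_FIELDS is shortened to two columns for brevity; only the intersection step mirrors the answer above.

```python
import pandas as pd

BOOL_FIELDS = ["is_mobile", "is_cancelled"]  # shortened for the example

# Hypothetical data: customer 3 exists only in 2020, customer 4 only in 2021.
customer_df_2020 = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "is_mobile": [True, False, True],
    "is_cancelled": [False, False, True],
}).set_index("customer_id")

customer_df_2021 = pd.DataFrame({
    "customer_id": [2, 1, 4],
    "is_mobile": [False, False, True],
    "is_cancelled": [True, False, False],
}).set_index("customer_id")

# Keep only the customers present in both years.
sameidx = customer_df_2020.index.intersection(customer_df_2021.index)

# Selecting with the same label list puts both frames in the same row order,
# so != no longer raises.
changed = (customer_df_2020.loc[sameidx, BOOL_FIELDS]
           != customer_df_2021.loc[sameidx, BOOL_FIELDS])
print(changed)
```

Customers that appear in only one year are dropped from the comparison, so they need separate handling if they matter.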

Error: Can only compare identically-labeled Series objects and sort_index

I think you need reset_index so that both objects have the same index values, and then compare. To create the new column, it is better to use mask or numpy.where:

Also, use | instead of + because you are working with booleans.

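import numpy as np

# Give both frames the same 0..n-1 index so the Series can be compared
df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)

# Option 1: Series.mask replaces values where the condition is True
df1['v_100'] = df1['choice'].mask(df1['choice'] != df2['choice'],
                                  (df1['choice'] | df2['choice']) * 0.5)

# Option 2: numpy.where picks between the two alternatives
df1['v_100'] = np.where(df1['choice'] != df2['choice'],
                       (df1['choice'] | df2['choice']) * 0.5,
                        df1['choice'])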

Samples:

print (df1)
   v_100  choice
5      7    True
6      0    True
7      7   False
8      2    True

print (df2)
   v_100  choice
4      1   False
5      2    True
6     74    True
7      6    True

df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)
print (df1)
   v_100  choice
0      7    True
1      0    True
2      7   False
3      2    True

print (df2)
   v_100  choice
0      1   False
1      2    True
2     74    True
3      6    True

df1['v_100'] = df1['choice'].mask(df1['choice'] != df2['choice'],
                                  (df1['choice'] | df2['choice']) * 0.5)

print (df1)
   v_100  choice
0    0.5    True
1    1.0    True
2    0.5   False
3    1.0    True
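Note that reset_index(drop=True) makes the comparison positional: row 0 of df1 is compared with row 0 of df2, whatever their original labels were. If rows should instead be matched by their original labels, one alternative (an assumption about the intent, not something the question states) is Series.align with join="inner". The sketch below reuses the choice columns from the samples.

```python
import pandas as pd

df1 = pd.DataFrame({"choice": [True, True, False, True]}, index=[5, 6, 7, 8])
df2 = pd.DataFrame({"choice": [False, True, True, True]}, index=[4, 5, 6, 7])

# Same length but different labels: the != operator refuses to compare.
try:
    df1["choice"] != df2["choice"]
    raised = False
except ValueError:
    raised = True

# Positional comparison: labels are discarded first.
by_position = (df1["choice"].reset_index(drop=True)
               != df2["choice"].reset_index(drop=True))

# Label-based comparison: keep only the labels present in both Series.
left, right = df1["choice"].align(df2["choice"], join="inner")
by_label = left != right

print(by_position.tolist())
print(by_label.tolist())
```

The two results differ because label 5 in df1 is compared with label 4 in df2 in the positional version, but with label 5 in the label-based version.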

Pandas Join: Can only compare identically-labeled Series objects

I suggest you use pd.merge:

df = pd.merge(telemetry, errors1, how='left', on=['machineID', 'datetime'])
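As an illustration of what the left merge produces, here is a sketch with invented rows. The `volt` and `errorID` columns and all values are assumptions; only the `machineID` and `datetime` key names come from the answer. Because the key columns have the same names in both frames, `on=` can be used.

```python
import pandas as pd

# Hypothetical telemetry readings (column "volt" is made up).
telemetry = pd.DataFrame({
    "machineID": [1, 1, 2],
    "datetime": pd.to_datetime(["2015-01-01 06:00",
                                "2015-01-01 07:00",
                                "2015-01-01 06:00"]),
    "volt": [170.0, 162.5, 158.1],
})

# Hypothetical error log (column "errorID" is made up).
errors1 = pd.DataFrame({
    "machineID": [1],
    "datetime": pd.to_datetime(["2015-01-01 07:00"]),
    "errorID": ["error1"],
})

# how="left" keeps every telemetry row; rows without a matching error get NaN.
df = pd.merge(telemetry, errors1, how="left", on=["machineID", "datetime"])
print(df)
```

The merge matches rows on key values rather than on index labels, which is why it sidesteps the identically-labeled requirement.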

Python Pandas Only Compare Identically Labeled DataFrame Objects

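To get around this, you can compare the underlying NumPy arrays. This compares values by position and ignores the row and column labels, so it is only meaningful when both DataFrames have the same shape and ordering.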

import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'], index=['One', 'Two'])
df2 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'], index=['one', 'two'])

df1.values == df2.values

array([[ True,  True],
       [ True,  True]], dtype=bool)
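Because the array comparison ignores labels, a shape check beforehand can make it safer. This is a minimal sketch: `values_equal` is a hypothetical helper name, `df3` is made-up data, and `to_numpy()` is used as the method form of `.values`.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=['A', 'B'], index=['One', 'Two'])
df2 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'], index=['one', 'two'])
df3 = pd.DataFrame([[1, 2]], columns=['A', 'B'])  # different shape


def values_equal(left, right):
    """Hypothetical helper: positional comparison that ignores labels.

    Returns None when the shapes differ, since positions then do not line up.
    """
    a, b = left.to_numpy(), right.to_numpy()
    if a.shape != b.shape:
        return None
    return a == b


same = values_equal(df1, df2)      # 2x2 boolean array
mismatch = values_equal(df1, df3)  # None

# A single True/False answer for the whole frame.
all_equal = np.array_equal(df1.to_numpy(), df2.to_numpy())
```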
