Pandas dataframe with multiindex column - merge levels
You could always change the columns directly:
grouped.columns = ['%s%s' % (a, '|%s' % b if b else '') for a, b in grouped.columns]
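As a minimal sketch of how that one-liner behaves, assume `grouped` comes from a `groupby(...).agg(...)` followed by `reset_index()` (the sample data and column names here are hypothetical). The key column ends up with an empty second level, which is why the expression only appends `|b` when `b` is non-empty:

```python
import pandas as pd

# Hypothetical sample data; only the flattening line comes from the answer above.
df = pd.DataFrame({"key": ["x", "x", "y"], "a": [1, 2, 3], "b": [4, 5, 6]})

# agg with a dict of lists produces two-level columns; reset_index adds ('key', '').
grouped = df.groupby("key").agg({"a": ["sum", "max"], "b": ["sum"]}).reset_index()

# Join both levels with '|' and skip the separator when the second level is empty.
grouped.columns = ['%s%s' % (a, '|%s' % b if b else '') for a, b in grouped.columns]
print(list(grouped.columns))
```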
Merge MultiIndex columns together into 1 level
There's discussion of this here:
Python Pandas - How to flatten a hierarchical index in columns
And the consensus seems to be:
x.columns = ['_'.join(col) for col in x.columns.values]
print(x)
          sum_a  sum_b  max_a  max_b
date
1/1/2016      2      6      1      4
1/2/2016      1      1      1      1
It would be nice if there were a built-in method for this, but there doesn't seem to be one.
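A self-contained sketch of the same idea follows. The frame is rebuilt from the printed result, so the original two-level layout (aggregation name, then source column) is an assumption. It also uses `to_flat_index()` and `map(str, ...)`, which are not in the answer above, so that the join does not fail when a level contains non-string labels:

```python
import pandas as pd

# Sample frame rebuilt from the printed result; the two-level column
# layout (aggregation, then source column) is an assumption.
x = pd.DataFrame(
    [[2, 6, 1, 4], [1, 1, 1, 1]],
    index=pd.Index(["1/1/2016", "1/2/2016"], name="date"),
    columns=pd.MultiIndex.from_product([["sum", "max"], ["a", "b"]]),
)

# to_flat_index() turns the MultiIndex into an Index of tuples;
# map(str, ...) keeps the join from failing on non-string labels.
x.columns = ["_".join(map(str, col)) for col in x.columns.to_flat_index()]
print(x)
```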
Merge DataFrame with multi index columns
Use:
print(df1)
       stockA              stockB
            O   L   H   C       O   L   H   C
1/1/19     10  15  20  17      35  30  39  37
2/1/19     12  13  26  27      31  50  29  17
print(df2)
        stockA  stockB
2/1/19     1.5     3.2
3/1/19     1.2     6.2
Convert both indices to datetimes if necessary:
df1.index = pd.to_datetime(df1.index, format='%d/%m/%y')
df2.index = pd.to_datetime(df2.index, format='%d/%m/%y')
Get the values shared by both indices with Index.intersection:
idx = df1.index.intersection(df2.index)
print (idx)
DatetimeIndex(['2019-01-02'], dtype='datetime64[ns]', freq=None)
Create a MultiIndex for the columns of df2 with MultiIndex.from_product:
df2.columns = pd.MultiIndex.from_product([df2.columns, ['new']])
print (df2)
           stockA stockB
              new    new
2019-01-02    1.5    3.2
2019-01-03    1.2    6.2
Filter both DataFrames with DataFrame.loc, join them with DataFrame.join, and finally sort the column MultiIndex with DataFrame.sort_index:
df = df1.loc[idx].join(df2.loc[idx]).sort_index(level=0, axis=1)
print (df)
           stockA                  stockB
                C   H   L   O  new      C   H   L   O  new
2019-01-02     27  26  13  12  1.5     17  29  50  31  3.2
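The steps above can be put together as one runnable sketch. The two input frames are reconstructed from the printed output, so their exact construction is an assumption; the transformation lines are the ones from the answer:

```python
import pandas as pd

# Inputs rebuilt from the printed frames above (construction is assumed).
cols = pd.MultiIndex.from_product([["stockA", "stockB"], ["O", "L", "H", "C"]])
df1 = pd.DataFrame(
    [[10, 15, 20, 17, 35, 30, 39, 37], [12, 13, 26, 27, 31, 50, 29, 17]],
    index=["1/1/19", "2/1/19"],
    columns=cols,
)
df2 = pd.DataFrame(
    {"stockA": [1.5, 1.2], "stockB": [3.2, 6.2]}, index=["2/1/19", "3/1/19"]
)

# Parse day/month/year strings so both indices are comparable.
df1.index = pd.to_datetime(df1.index, format="%d/%m/%y")
df2.index = pd.to_datetime(df2.index, format="%d/%m/%y")

# Keep only the dates present in both frames.
idx = df1.index.intersection(df2.index)

# Give df2 a second column level so it lines up with df1's two levels.
df2.columns = pd.MultiIndex.from_product([df2.columns, ["new"]])

# Join on the shared dates, then sort columns so each stock's fields sit together.
df = df1.loc[idx].join(df2.loc[idx]).sort_index(level=0, axis=1)
print(df)
```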
Merging Pandas DataFrames on MultiIndex column values
For me, it works when the tuples are wrapped in a list ([]):
df = pd.merge(df1,df2, left_on=[('df1_labels','col2')], right_on=[('df2_labels','col4')])
print (df)
  df1_labels      df2_labels
        col1 col2       col3 col4
0          1   10        100   10
1          1   10        100   10
2          2   20        200   20
3          2   20        200   20
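For a version that runs on its own, here is a sketch with input frames guessed from the printed result (the original question's data is not shown, so the row values below are an assumption):

```python
import pandas as pd

# Inputs guessed from the merged output above; the real data may differ.
df1 = pd.DataFrame(
    [[1, 10], [1, 10], [2, 20], [2, 20]],
    columns=pd.MultiIndex.from_tuples(
        [("df1_labels", "col1"), ("df1_labels", "col2")]
    ),
)
df2 = pd.DataFrame(
    [[100, 10], [200, 20]],
    columns=pd.MultiIndex.from_tuples(
        [("df2_labels", "col3"), ("df2_labels", "col4")]
    ),
)

# Each key is one tuple naming a full column; the list says "one key".
df = pd.merge(
    df1,
    df2,
    left_on=[("df1_labels", "col2")],
    right_on=[("df2_labels", "col4")],
)
print(df)
```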
Pandas multi-index dataframe merge issue
The problem here is that on can take one or more columns to merge the two DataFrames on, so when you pass on=('id', '0') it thinks you want to merge on two fields. Writing on=[('id', '0')] removes the ambiguity: one column to merge on, with two labels specified as part of the MultiIndex:
df3 = pd.merge(df1, df2, on=[('id', '0')], how="outer")
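A small sketch of the list-wrapped form, assuming both frames share a two-level column `('id', '0')`. The other columns and all values are hypothetical, since the question's data is not shown here:

```python
import pandas as pd

# Hypothetical frames; only the ('id', '0') key comes from the answer above.
df1 = pd.DataFrame(
    [[1, "a"], [2, "b"]],
    columns=pd.MultiIndex.from_tuples([("id", "0"), ("left", "val")]),
)
df2 = pd.DataFrame(
    [[2, "x"], [3, "y"]],
    columns=pd.MultiIndex.from_tuples([("id", "0"), ("right", "val")]),
)

# The list holds a single key; the tuple inside it names one MultiIndex column.
df3 = pd.merge(df1, df2, on=[("id", "0")], how="outer")
print(df3)
```

With `how="outer"`, ids that appear in only one frame are kept and the missing side is filled with NaN.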
pandas merge with multiindex columns but single index index
Is this what you expect?
>>> pd.merge(dfL, dfR, left_index=True, right_on='index1', suffixes=('_n','_p'))
      bar_n     bar_p index1
        one       one
0  0.610819  0.238307      1
From the merge documentation:
If it is a MultiIndex, the number of keys in the other DataFrame (either the index or a number of columns) must match the number of levels.
Combine Pandas DataFrames while creating MultiIndex Columns
You can use the keys kwarg of concat:
In [11]: res = pd.concat([close, volume], axis=1, keys=["close", "volume"])
In [12]: res
Out[12]:
             close                  volume
              AAPL   CSCO   MSFT      AAPL      CSCO      MSFT
Date
2016-10-03  112.52  31.50  57.42  21701800  14070500  19189500
2016-10-04  113.00  31.35  57.24  29736800  18460400  20085900
2016-10-05  113.05  31.59  57.64  21453100  11808600  16726400
With a little rearrangement:
In [13]: res.swaplevel(0, 1, axis=1).sort_index(axis=1)
Out[13]:
              AAPL             CSCO             MSFT
             close    volume  close    volume  close    volume
Date
2016-10-03  112.52  21701800  31.50  14070500  57.42  19189500
2016-10-04  113.00  29736800  31.35  18460400  57.24  20085900
2016-10-05  113.05  21453100  31.59  11808600  57.64  16726400
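The same two calls as a standalone script, using a two-ticker, two-day subset of the values printed above (how `close` and `volume` were originally built is not shown, so their construction here is an assumption):

```python
import pandas as pd

# Subset of the printed data; construction of the inputs is assumed.
dates = pd.to_datetime(["2016-10-03", "2016-10-04"])
dates.name = "Date"
close = pd.DataFrame({"AAPL": [112.52, 113.00], "CSCO": [31.50, 31.35]}, index=dates)
volume = pd.DataFrame(
    {"AAPL": [21701800, 29736800], "CSCO": [14070500, 18460400]}, index=dates
)

# keys= becomes the outer column level: ('close', ticker), ('volume', ticker).
res = pd.concat([close, volume], axis=1, keys=["close", "volume"])

# Swap the two column levels, then sort so each ticker's columns are adjacent.
by_ticker = res.swaplevel(0, 1, axis=1).sort_index(axis=1)
print(by_ticker)
```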
Pandas merge with MultiIndex for repeated columns
Try concat with the keys parameter and join='inner':
print(pd.concat([left_feet, right_feet], axis=1, keys=['Left','Right'], join='inner'))
    Left        Right
  Length Width Length Width
1     30    10     30    10
2     25     9     24     8
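A runnable sketch of that call. The input frames are inferred from the printed result, and the extra row with index 3 in `left_feet` is a hypothetical addition to show what `join='inner'` drops:

```python
import pandas as pd

# Inputs inferred from the printed output; row 3 is hypothetical and exists
# only in left_feet so the effect of join='inner' is visible.
left_feet = pd.DataFrame({"Length": [30, 25, 28], "Width": [10, 9, 9]}, index=[1, 2, 3])
right_feet = pd.DataFrame({"Length": [30, 24], "Width": [10, 8]}, index=[1, 2])

# keys= labels each frame's columns; join='inner' keeps only shared index values.
feet = pd.concat(
    [left_feet, right_feet], axis=1, keys=["Left", "Right"], join="inner"
)
print(feet)
```

Because both frames use the same column names, the outer level added by `keys` is what keeps the repeated `Length` and `Width` columns distinguishable.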