Prepend a level to a pandas MultiIndex
A nice way to do this in one line using pandas.concat()
:
import pandas as pd
pd.concat([df], keys=['Foo'], names=['Firstlevel'])
An even shorter way:
pd.concat({'Foo': df}, names=['Firstlevel'])
This can be generalized to many data frames, see the docs.
How can I add a level to a MultiIndex?
The simpliest is use stack
with new DataFrame
with columns by new level values:
df1 = pd.DataFrame(data=1,index=df.index, columns=new_level_labels).stack()
df1.index.names = ['Level0','Level1',new_level_name]
print (df1)
Level0 Level1 New level
foo a p 1
q 1
b p 1
q 1
qux a p 1
q 1
dtype: int64
print (df1.index)
MultiIndex(levels=[['foo', 'qux'], ['a', 'b'], ['p', 'q']],
labels=[[0, 0, 0, 0, 1, 1], [0, 0, 1, 1, 0, 0], [0, 1, 0, 1, 0, 1]],
names=['Level0', 'Level1', 'New level'])
Pandas add a second level index to the columns using a list
Pandas multi index creation requires a list(or list like) passed as an argument:
df.columns = pd.MultiIndex.from_arrays([food_cat, df.columns])
df
fv j
apple strawberry banana chocolate cake
0 7 3 1 5 4
1 5 5 2 8 4
2 6 2 1 4 5
3 4 1 2 2 1
4 7 3 2 1 3
5 5 0 2 6 0
6 8 4 1 4 0
7 6 2 3 5 3
How do I add a multi-level column index to an existing df?
You can manually construct a pandas.MultiIndex
using one of several constructors. From the docs for your case:
MultiIndex.from_arrays
Convert list of arrays to MultiIndex.MultiIndex.from_tuples
Convert list of tuples to a MultiIndex.MultiIndex.from_frame
Make a MultiIndex from a DataFrame.
For your case, I think pd.MultiIndex.from_arrays
might be the easiest way:
df.columns=pd.MultiIndex.from_arrays([['H','H'],['Cat1','Cat2'],df.columns],names=['Importance','Category',''])
output:
Importance| H | H |
Category | Cat1 | Cat2 |
|Total Assets | AUMs |
Firm 1 | 100 | 300 |
Firm 2 | 200 | 3400 |
Firm 3 | 300 | 800 |
Firm 4 | NaN | 800 |
pd.MultiIndex: How do I add 1 more level (0) to a multi-index column?
Use concat()
:
df=pd.concat([df],keys=['H'],names=['Importance'],axis=1)
Pandas - Add a column level to multi index
Here's a way:
cols = pd.MultiIndex(levels=[['Foo', 'Bar'], ['A', 'B', 'C'], ['a']],
labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2], [0, 0, 0, 0, 0, 0]],
names=['L1', 'L2', 'L3'])
pd.DataFrame(columns = cols).T\
.assign(x = [0.01, 0.01, 0.01, 0.02, 0.02, 0.02])\
.set_index('x', append=True).T
Output:
Is there a way to add a new column to a pandas multiindex that only aligns with one level?
Use pd.IndexSlice
:
idx = pd.IndexSlice
df.loc[idx[:, 1], 'Color'] = ['Yellow', 'Red']
print(df)
# Output
Value Color
A 1 300 Yellow
2 850 NaN
3 2000 NaN
B 1 100 Red
2 70 NaN
3 400 NaN
Or only with slice
:
df.loc[(slice(None), 1), 'Color'] = ['Yellow', 'Red']
print(df)
# Output
Value Color
A 1 300 Yellow
2 850 NaN
3 2000 NaN
B 1 100 Red
2 70 NaN
3 400 NaN
Where to add multiindex level in pandas?
The most straightforward modification would be to use the subset
argument of Styler.apply
in conjunction with pd.IndexSlice
. Then reduce from DataFrame applied styles to Series level applied styles for each column in the MultiIndex matching the subset:
def highlight_cells(x):
c = 'background-color: white'
c1 = 'background-color: red'
c2 = 'background-color: blue'
c3 = 'background-color: green'
k1 = x.str.contains("a", na=False)
k2 = x.str.contains("b", na=False)
k3 = x.str.contains("c", na=False)
# Build Series for _column_ level Styles
color_data = pd.Series(c, index=x.index)
color_data[k1] = c1
color_data[k2] = c2
color_data[k3] = c3
return color_data
idx = pd.IndexSlice
end = Ldata.style.apply(highlight_cells, subset=idx[:, idx[:, 'new']])
end
Naturally this can be done without the separate variables as well:
def highlight_cells(x):
color_data = pd.Series('background-color: white', index=x.index)
color_data[x.str.contains("a", na=False)] = 'background-color: red'
color_data[x.str.contains("b", na=False)] = 'background-color: blue'
color_data[x.str.contains("c", na=False)] = 'background-color: green'
return color_data
idx = pd.IndexSlice
end = Ldata.style.apply(highlight_cells, subset=idx[:, idx[:, 'new']])
end
Or without building an indexed structure with np.select
:
def highlight_cells(x):
return np.select(
[x.str.contains("a", na=False),
x.str.contains("b", na=False),
x.str.contains("c", na=False)],
['background-color: red',
'background-color: blue',
'background-color: green'],
default='background-color: white'
)
idx = pd.IndexSlice
end = Ldata.style.apply(highlight_cells, subset=idx[:, idx[:, 'new']])
end
All options produce:
Add an additional level using MultiIndex.from_tuples in pandas
Try:
# cache the original columns
columns = df1.columns
# force columns to RangeIndex
df1.columns = np.arange(df1.shape[1])
# rename the columns with the new name
df1.columns = pd.MultiIndex.from_tuples([
(x,a,y) for (x,y),a in zip(columns, [1,2,1,3])
])
Output:
first second
1 2 1 3
D a b c
0 A 1 2 3
1 B -4 5 -6
2 d 7 0 9
Intestingly, a direct reassignment of the columns throw a warning and did not do anything, hence the go-around.
Related Topics
Download File Using Partial Download (Http)
Asyncio.Gather VS Asyncio.Wait
Group Dataframe and Get Sum and Count
How to Change the String Representation of a Python Class
How to Cycle Through Line Styles in Matplotlib
How to Return a Subset of a List That Matches a Condition
Test Case Execution Order in Pytest
What Does "Error: Option --Single-Version-Externally-Managed Not Recognized" Indicate
Why Is 'Object' an Instance of 'Type' and 'Type' an Instance of 'Object'
Beautiful Soup 4 Find_All Don't Find Links That Beautiful Soup 3 Finds
Pandas: Resample Timeseries with Groupby
Pandas: Valueerror: Cannot Convert Float Nan to Integer
Convert Year/Month/Day to Day of Year in Python
Removing Control Characters from a String in Python
How to Use Python Numpy.Savetxt to Write Strings and Float Number to an Ascii File