How to simply add a column level to a pandas dataframe
As suggested by @StevenG himself, a better answer:
df.columns = pd.MultiIndex.from_product([df.columns, ['C']])
print(df)
# A B
# C C
# a 0 0
# b 1 1
# c 2 2
# d 3 3
# e 4 4
Inserting/Adding another column level to pandas dataframe
Try pd.MultiIndex.from_arrays
df.columns = pd.MultiIndex.from_arrays([col_l1,df.columns],names=['L0','L1'])
L0 col0_L0 col1_L0 col2_L0
L1 col0_L1 col1_L1 col2_L1
0 14 58 52
1 92 21 16
2 39 86 93
print(df.columns)
MultiIndex([('col0_L0', 'col0_L1'),
('col1_L0', 'col1_L1'),
('col2_L0', 'col2_L1')],
names=['L0', 'L1'])
how to add levels to a new column in pandas dataframe based on a condition?
Update: @enke's comment is MUCH cleaner than my code. If you want to place the new column in the first position, simply add the column first.
Update2: Adding more code per your question below this post about adding the answer_option column:
import pandas as pd
cols = ['Answer', 'Avg_Score']
data=[['w1', 'N/A'],
['w2', 4.3],
['w3', 1.2],
['w4', 3.5],
['w5', 'N/A'],
['w6', 3.1],
['w7', 2.4],
['w8', 1.7],
['w9', 4.6],
['w10', 'N/A'],
['w11', 3.0]]
df = pd.DataFrame(data, columns = cols)
# insert new answer column in first so it's before the Answer and the Avg_Score
df.insert(0,'answer_option','')
# insert new question number column in the first position
df.insert(0,'question_number','')
# Line from @enke's comment
df['question_number'] = 'Q' + df['Avg_Score'].eq('N/A').cumsum().astype(str)
df['answer_option'] = 'A' + df.groupby('question_number').cumcount().add(1).astype(str)
print(df)
New Output:
question_number answer_option Answer Avg_Score
0 Q1 A1 w1 N/A
1 Q1 A2 w2 4.3
2 Q1 A3 w3 1.2
3 Q1 A4 w4 3.5
4 Q2 A1 w5 N/A
5 Q2 A2 w6 3.1
6 Q2 A3 w7 2.4
7 Q2 A4 w8 1.7
8 Q2 A5 w9 4.6
9 Q3 A1 w10 N/A
10 Q3 A2 w11 3.0
My original post is below and you can ignore if ya wanna
I inserted a new column in the first column position then loop through each row. I rename the column names just because I can. :D I did not include the quotes and periods in your original df. But the code below may still be useful.
import pandas as pd
cols = ['Answer', 'Avg_Score']
data=[['w1', 'N/A'],
['w2', 4.3],
['w3', 1.2],
['w4', 3.5],
['w5', 'N/A'],
['w6', 3.1],
['w7', 2.4],
['w8', 1.7],
['w9', 4.6],
['w10', 'N/A'],
['w11', 3.0]]
df = pd.DataFrame(data, columns = cols)
# insert new column before the Answer and the Avg_Score
df.insert(0,'Question','')
# start the question counter at 0
qnum = 0
# loop through each row
for index,row in df.iterrows():
# if 'N/A' found increase the question counter
# this assume first row will always have an 'N/A'
if df.loc[index,'Avg_Score'] == 'N/A':
qnum += 1
df.loc[index,'Question'] = 'Q{}'.format(qnum)
print(df)
Ouput:
Question Answer Avg_Score
0 Q1 w1 N/A
1 Q1 w2 4.3
2 Q1 w3 1.2
3 Q1 w4 3.5
4 Q2 w5 N/A
5 Q2 w6 3.1
6 Q2 w7 2.4
7 Q2 w8 1.7
8 Q2 w9 4.6
9 Q3 w10 N/A
10 Q3 w11 3.0
Pandas - Add a column level to multi index
Here's a way:
cols = pd.MultiIndex(levels=[['Foo', 'Bar'], ['A', 'B', 'C'], ['a']],
labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2], [0, 0, 0, 0, 0, 0]],
names=['L1', 'L2', 'L3'])
pd.DataFrame(columns = cols).T\
.assign(x = [0.01, 0.01, 0.01, 0.02, 0.02, 0.02])\
.set_index('x', append=True).T
Output:
Adding a multi-level column to a single-level pandas dataframe
First we make the existing columns multi-index:
df.columns = pd.MultiIndex.from_arrays([df.columns,['']*len(df.columns)])
and then add new ones indexed by tuples
df[('z','z1')] = [1,1,1]
df[('z','z2')] = [2,2,2]
df
to get
x y z
z1 z2
0 0 0 1 2
1 0 0 1 2
2 0 0 1 2
Pandas add column level which increases total columns count
Use DataFrame.reindex
for repeat columns by MultiIndex
values:
mux = pd.MultiIndex.from_product([df.columns, ['foo','bar','baz']])
df = df.reindex(mux, axis=1)
print (df)
a b c d
foo bar baz foo bar baz foo bar baz foo bar baz
idx1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
idx2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
idx3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
How to add a column level to pandas dataframe
Use GroupBy.cumcount
for counts duplicates with converted columns to series by Index.to_series
, add prefix and last assign back for MultiIndex in columns
:
df = pd.DataFrame({
'A':list('abcdef'),
'B':[4,5,4,5,5,4],
'C':[7,8,9,4,2,3],
'D':[1,3,5,7,1,0],
'E':[5,3,6,9,2,4],
'F':list('aaabbb')
})
df.columns= list('ABCABC')
print (df)
A B C A B C
0 a 4 7 1 5 a
1 b 5 8 3 3 a
2 c 4 9 5 6 a
3 d 5 4 7 9 b
4 e 5 2 1 2 b
5 f 4 3 0 4 b
s = df.columns.to_series()
df.columns = ['ticker ' + s.groupby(s).cumcount().add(1).astype(str), s]
print (df)
ticker 1 ticker 2
A B C A B C
0 a 4 7 1 5 a
1 b 5 8 3 3 a
2 c 4 9 5 6 a
3 d 5 4 7 9 b
4 e 5 2 1 2 b
5 f 4 3 0 4 b
Related Topics
How to Remove Unused Packages from Virtualenv
Combine Year, Month and Day in Python to Create a Date
Merge Multiple Rows to Single Row - Pandas
How to Compute Mean() for Particular Column in Pandas Dataframe Without Considering Nan Values
Changing Value in Data Frame Column in a Loop Python
Checking If a Pair of Value Is Inside a 2D Array Python
Python:Compare Two CSV Files and Print Out Differences
How to Append Dataframes Inside a for Loop in Python
How to Make the Program to Rerun Itself in Python
Django Rest Framework Csrf Failed: Csrf Cookie Not Set
What Is the Easiest Way to Convert Ndarray into Cv::Mat
How to Get the Amount of Consecutive Sub Strings of an Object in a List
How to Simply Add a Column Level to a Pandas Dataframe
Efficient Date Range Overlap Calculation
How to Remove Comma and Brackets
How to Split Vector into Columns - Using Pyspark
Comparing Two Dataframes and Getting the Differences
How to Determine Whether a Column/Variable Is Numeric or Not in Pandas/Numpy