Convert row to column header for Pandas DataFrame,
In [21]: df = pd.DataFrame([(1,2,3), ('foo','bar','baz'), (4,5,6)])
In [22]: df
Out[22]:
0 1 2
0 1 2 3
1 foo bar baz
2 4 5 6
Set the column labels to equal the values in the 2nd row (index location 1):
In [23]: df.columns = df.iloc[1]
If the index has unique labels, you can drop the 2nd row using:
In [24]: df.drop(df.index[1])
Out[24]:
1 foo bar baz
0 1 2 3
2 4 5 6
If the index is not unique, you could use:
In [133]: df.iloc[pd.RangeIndex(len(df)).drop(1)]
Out[133]:
1 foo bar baz
0 1 2 3
2 4 5 6
Using df.drop(df.index[1])
removes all rows with the same label as the second row. Because non-unique indexes can lead to stumbling blocks (or potential bugs) like this, it's often better to take care that the index is unique (even though Pandas does not require it).
Python Pandas: Convert Rows as Column headers
You're looking for pivot_table
:
In [11]: medals = df.pivot_table('no of medals', ['Year', 'Country'], 'medal')
In [12]: medals
Out[12]:
medal Bronze Gold Silver
Year Country
1896 Afghanistan 3 5 4
Algeria 3 1 2
and if you want to reorder the columns:
In [12]: medals.reindex_axis(['Gold', 'Silver', 'Bronze'], axis=1)
Out[12]:
medal Gold Silver Bronze
Year Country
1896 Afghanistan 5 4 3
Algeria 1 2 3
Convert combinations of row+column as Column headers
Let us do pivot_table
then swaplevel
s = df.pivot_table(index= ['Machine','Time'],
columns = df.Part.astype(str).radd('Part'),
values=['PowerA','PowerB'],
fill_value=-1).swaplevel(1,0, axis=1).sort_index(level=0, axis=1)
s.columns = s.columns.map('_'.join)
s.reset_index(inplace=True)
s
Out[751]:
Machine Time Part1_PowerA Part1_PowerB Part2_PowerA Part2_PowerB
0 1 20:30 0.1 0.4 0.9 0.7
1 1 20:31 0.3 0.1 0.2 0.3
2 2 20:30 0.2 0.5 -1.0 -1.0
3 2 20:31 0.8 0.4 -1.0 -1.0
How to Convert rows into columns headers and values of other column as data?
I think you need convert Name
to index, select by double []
for one column DataFrame
and transpose:
df1 = df.set_index('Name')[['Data']].T
Convert Rows as Column Headers
First set Bulan
columns to ordered Categoricals, for correct sorting:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
'Oct', 'Nov', 'Dec']
df.Bulan = pd.Categorical(df.Bulan,
ordered=True,
categories=months)
Then reshape by melt
with pivoting:
df = (df.melt(['Tahun','Bulan','Dashboard Name'])
.pivot(index=['variable','Tahun','Bulan'],
columns='Dashboard Name',
values='value')
.sort_index(level=[1,2])
.reset_index(level=[1,2])
.rename_axis(index=None, columns=None)
)
print (df)
Tahun Bulan Potensi Realisasi Usulan
Hutan Adat 2021 Jan 0.0 3.0 0.0
PS 2021 Jan 0.0 0.0 0.0
TORA 2021 Jan 0.0 15.0 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
TORA 2021 Feb 4.0 NaN 0.0
Or:
df = (df.set_index(['Tahun','Bulan','Dashboard Name'])
.stack()
.unstack(level=2)
.reset_index(level=[0,1])
.rename_axis(index=None, columns=None)
)
print (df)
Tahun Bulan Potensi Realisasi Usulan
TORA 2021 Jan 0.0 15.0 0.0
PS 2021 Jan 0.0 0.0 0.0
Hutan Adat 2021 Jan 0.0 3.0 0.0
TORA 2021 Feb 4.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0
Last if order of last columns is important:
df = df[df.columns[2:].tolist() + df.columns[:2].tolist()]
print (df)
Potensi Realisasi Usulan Tahun Bulan
Hutan Adat 0.0 3.0 0.0 2021 Jan
PS 0.0 0.0 0.0 2021 Jan
TORA 0.0 15.0 0.0 2021 Jan
Hutan Adat 0.0 NaN 0.0 2021 Feb
PS 6.0 NaN 0.0 2021 Feb
TORA 4.0 NaN 0.0 2021 Feb
Using Pandas convert column values to column header
I think code below will help you:
# create your processing function
def handle_function(row: pd.Series) -> pd.Series:
"""
What you want to do here...
"""
# get label and value
label = row["field_label"]
value = row["field_value"]
# create column for label if it exists no create
row[label] = value
# return new row
return row
# apply on your dataframe
df.apply(handle_function, axis=1)
# remove columns for label and values
df.drop(["field_label", "field_value", axis=1, inplace=True]
How to convert row values as column names and same column names as column values in python?
You can use pivot
, but you need to specify the index
argument to keep your "col1", "col2", "col3". Once you do that, you can clean up the dataframe a little bit to get the result you want.
out = (df.pivot(index=["col1", "col2", "col3"], columns="col4", values="col4")
.rename_axis(columns=None)
.reset_index())
print(out)
col1 col2 col3 a1 a2 a3
0 x1 y1 z1 a1 a2 a3
Steps
pivot(...)
: pivot the dataframe as desired. This makes aMultiIndex
of "col1", "col2", and "col3". Then the actual columns & values from the value of "col4"rename_axis(columns=None)
:pivot
makes a columns anIndex
object with a name. I find that this name throws people off be adjusting how the dataframe is represented. Here I remove the name from the columnIndex
so that it can be represented how OP expectsreset_index()
as stated earlier-pivot
makes aMultiIndex
of "col1", "col2", and "col3". We usereset_index()
here to take these values and insert it into the actual data of theDataFrame
instead of being aMultiIndex
How can I create column headers from row values in a slice of columns resulting in the column row values matching the column headers?
The idea is to convert the DataFrame to the long form and then pivot:
df = df.reset_index()
df = pd.melt(df, id_vars=['index']).pivot(index = 'index', values='variable', columns='value')
for c in df.columns:
df.loc[df[c].notna(), c] = c
value bar baz egg foo ham
index
0 bar NaN NaN foo ham
1 bar baz NaN foo NaN
2 NaN baz egg NaN ham
Detailed explanation:
melt
would convert the DataFrame to the following form:
index variable value
0 0 A foo
1 1 A baz
2 2 A ham
3 0 B bar
4 1 B foo
5 2 B baz
6 0 C ham
7 1 C bar
8 2 C egg
Then use pivot to make the columns to be all the unique values:
value bar baz egg foo ham
index
0 B NaN NaN A C
1 C A NaN B NaN
2 NaN B C NaN A
Then simply replace all non-na columns to be the column name.
Convert first row of pandas dataframe to column name
I believe need to add parameter to read_html
:
df = pd.read_html(url, header=1)[0]
Or:
df = pd.read_html(url, skiprows=1)[0]
Related Topics
What's the Best Way to Return Multiple Values from a Function
Can Modules Have Properties the Same Way That Objects Can
Pandas Column Access W/Column Names Containing Spaces
Getting an Instance Name Inside Class _Init_()
How to Release Memory Used by a Pandas Dataframe
How to Return 0 with Divide by Zero
How to Get Value Counts for Multiple Columns at Once in Pandas Dataframe
Pip Freeze Creates Some Weird Path Instead of the Package Version
How to Get the Index of a Maximum Element in a Numpy Array Along One Axis
Plotly: How to Define Colors in a Figure Using Plotly Graph Objects and Plotly Express
Using a Django Variable in a CSS File
R Markdown: How to Make Rstudio Display Python Plots Inline Instead of in New Window
How to Add Sum to Zero Constraint to Glm in Python
Data Scraping from Published Power Bi Visual