Python Pandas: Convert Rows as Column Headers

Convert row to column header for Pandas DataFrame,

In [21]: df = pd.DataFrame([(1,2,3), ('foo','bar','baz'), (4,5,6)])

In [22]: df
Out[22]:
0 1 2
0 1 2 3
1 foo bar baz
2 4 5 6

Set the column labels to equal the values in the 2nd row (index location 1):

In [23]: df.columns = df.iloc[1]

If the index has unique labels, you can drop the 2nd row using:

In [24]: df.drop(df.index[1])
Out[24]:
1 foo bar baz
0 1 2 3
2 4 5 6

If the index is not unique, you could use:

In [133]: df.iloc[pd.RangeIndex(len(df)).drop(1)]
Out[133]:
1 foo bar baz
0 1 2 3
2 4 5 6

Using df.drop(df.index[1]) removes all rows with the same label as the second row. Because non-unique indexes can lead to stumbling blocks (or potential bugs) like this, it's often better to take care that the index is unique (even though Pandas does not require it).

Python Pandas: Convert Rows as Column headers

You're looking for pivot_table:

In [11]: medals = df.pivot_table('no of medals', ['Year', 'Country'], 'medal')

In [12]: medals
Out[12]:
medal Bronze Gold Silver
Year Country
1896 Afghanistan 3 5 4
Algeria 3 1 2

and if you want to reorder the columns:

In [12]: medals.reindex_axis(['Gold', 'Silver', 'Bronze'], axis=1)
Out[12]:
medal Gold Silver Bronze
Year Country
1896 Afghanistan 5 4 3
Algeria 1 2 3

Convert combinations of row+column as Column headers

Let us do pivot_table then swaplevel

s = df.pivot_table(index= ['Machine','Time'], 
columns = df.Part.astype(str).radd('Part'),
values=['PowerA','PowerB'],
fill_value=-1).swaplevel(1,0, axis=1).sort_index(level=0, axis=1)

s.columns = s.columns.map('_'.join)
s.reset_index(inplace=True)
s
Out[751]:
Machine Time Part1_PowerA Part1_PowerB Part2_PowerA Part2_PowerB
0 1 20:30 0.1 0.4 0.9 0.7
1 1 20:31 0.3 0.1 0.2 0.3
2 2 20:30 0.2 0.5 -1.0 -1.0
3 2 20:31 0.8 0.4 -1.0 -1.0

How to Convert rows into columns headers and values of other column as data?

I think you need convert Name to index, select by double [] for one column DataFrame and transpose:

df1 = df.set_index('Name')[['Data']].T

Convert Rows as Column Headers

First set Bulan columns to ordered Categoricals, for correct sorting:

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
'Oct', 'Nov', 'Dec']

df.Bulan = pd.Categorical(df.Bulan,
ordered=True,
categories=months)

Then reshape by melt with pivoting:

df = (df.melt(['Tahun','Bulan','Dashboard Name'])
.pivot(index=['variable','Tahun','Bulan'],
columns='Dashboard Name',
values='value')
.sort_index(level=[1,2])
.reset_index(level=[1,2])
.rename_axis(index=None, columns=None)

)
print (df)
Tahun Bulan Potensi Realisasi Usulan
Hutan Adat 2021 Jan 0.0 3.0 0.0
PS 2021 Jan 0.0 0.0 0.0
TORA 2021 Jan 0.0 15.0 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
TORA 2021 Feb 4.0 NaN 0.0

Or:

df = (df.set_index(['Tahun','Bulan','Dashboard Name'])
.stack()
.unstack(level=2)
.reset_index(level=[0,1])
.rename_axis(index=None, columns=None)

)
print (df)
Tahun Bulan Potensi Realisasi Usulan
TORA 2021 Jan 0.0 15.0 0.0
PS 2021 Jan 0.0 0.0 0.0
Hutan Adat 2021 Jan 0.0 3.0 0.0
TORA 2021 Feb 4.0 NaN 0.0
PS 2021 Feb 6.0 NaN 0.0
Hutan Adat 2021 Feb 0.0 NaN 0.0

Last if order of last columns is important:

df = df[df.columns[2:].tolist() + df.columns[:2].tolist()]
print (df)
Potensi Realisasi Usulan Tahun Bulan
Hutan Adat 0.0 3.0 0.0 2021 Jan
PS 0.0 0.0 0.0 2021 Jan
TORA 0.0 15.0 0.0 2021 Jan
Hutan Adat 0.0 NaN 0.0 2021 Feb
PS 6.0 NaN 0.0 2021 Feb
TORA 4.0 NaN 0.0 2021 Feb

Using Pandas convert column values to column header

I think code below will help you:

# create your processing function
def handle_function(row: pd.Series) -> pd.Series:
"""
What you want to do here...
"""

# get label and value
label = row["field_label"]
value = row["field_value"]

# create column for label if it exists no create
row[label] = value

# return new row
return row

# apply on your dataframe
df.apply(handle_function, axis=1)

# remove columns for label and values
df.drop(["field_label", "field_value", axis=1, inplace=True]

How to convert row values as column names and same column names as column values in python?

You can use pivot, but you need to specify the index argument to keep your "col1", "col2", "col3". Once you do that, you can clean up the dataframe a little bit to get the result you want.

out = (df.pivot(index=["col1", "col2", "col3"], columns="col4", values="col4")
.rename_axis(columns=None)
.reset_index())

print(out)
col1 col2 col3 a1 a2 a3
0 x1 y1 z1 a1 a2 a3

Steps

  • pivot(...): pivot the dataframe as desired. This makes a MultiIndex of "col1", "col2", and "col3". Then the actual columns & values from the value of "col4"
  • rename_axis(columns=None): pivot makes a columns an Index object with a name. I find that this name throws people off be adjusting how the dataframe is represented. Here I remove the name from the column Index so that it can be represented how OP expects
  • reset_index() as stated earlier- pivot makes a MultiIndex of "col1", "col2", and "col3". We use reset_index() here to take these values and insert it into the actual data of the DataFrame instead of being a MultiIndex

How can I create column headers from row values in a slice of columns resulting in the column row values matching the column headers?

The idea is to convert the DataFrame to the long form and then pivot:

df = df.reset_index()
df = pd.melt(df, id_vars=['index']).pivot(index = 'index', values='variable', columns='value')
for c in df.columns:
df.loc[df[c].notna(), c] = c
value   bar baz egg foo ham
index
0 bar NaN NaN foo ham
1 bar baz NaN foo NaN
2 NaN baz egg NaN ham

Detailed explanation:

melt would convert the DataFrame to the following form:

   index    variable    value
0 0 A foo
1 1 A baz
2 2 A ham
3 0 B bar
4 1 B foo
5 2 B baz
6 0 C ham
7 1 C bar
8 2 C egg

Then use pivot to make the columns to be all the unique values:

value   bar baz egg foo ham
index
0 B NaN NaN A C
1 C A NaN B NaN
2 NaN B C NaN A

Then simply replace all non-na columns to be the column name.

Convert first row of pandas dataframe to column name

I believe need to add parameter to read_html:

df = pd.read_html(url, header=1)[0]

Or:

df = pd.read_html(url, skiprows=1)[0]


Related Topics



Leave a reply



Submit