Pandas: Multiple Columns into One Column

Pandas: Multiple columns into one column

Update

pandas has a built-in method for this, stack, which does what you want; see the other answer.

This was my first answer, written many years ago before I knew about stack:

In [227]:

import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'], 'Column 2': ['E', 'F', 'G', 'H']})
df
Out[227]:
  Column 1 Column 2
0        A        E
1        B        F
2        C        G
3        D        H

[4 rows x 2 columns]

In [228]:

# Series.append was removed in pandas 2.0; pd.concat is the replacement
pd.concat([df['Column 1'], df['Column 2']]).reset_index(drop=True)
Out[228]:
0    A
1    B
2    C
3    D
4    E
5    F
6    G
7    H
dtype: object
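
For comparison, here is a minimal sketch of the stack approach mentioned in the update, using the same two-column frame. One detail worth knowing: stack walks the frame row by row, so transposing first is one way to get the column-after-column order shown in the output above. The variable names row_major and col_major are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Column 1': ['A', 'B', 'C', 'D'],
                   'Column 2': ['E', 'F', 'G', 'H']})

# stack() interleaves the columns row by row: A, E, B, F, ...
row_major = df.stack().reset_index(drop=True)

# Transposing first yields column-after-column order: A, B, C, D, E, ...
col_major = df.T.stack().reset_index(drop=True)

print(row_major.tolist())
print(col_major.tolist())
```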

How to stack/append all columns into one column in Pandas?

Very simply with melt:

import pandas as pd
df.melt().drop(columns='variable').rename(columns={'value': 'A'})


   A
0  1
1  2
2  3
3  4
4  5
5  6
6  7
7  8
8 9
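
The frame behind that output is not shown, so here is a self-contained sketch that assumes a hypothetical 3x3 frame holding the values 1 to 9. The column names X, Y and Z are made up for the example.

```python
import pandas as pd

# Hypothetical input; the original question's frame is not shown.
df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6], 'Z': [7, 8, 9]})

# melt() stacks the columns on top of each other in a 'value' column and
# records the source column name in 'variable', which is dropped here.
single = df.melt().drop(columns='variable').rename(columns={'value': 'A'})

print(single)
```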

How to convert multiple columns in one column in pandas?

Use melt:

>>> df.melt(var_name='route', value_name='edge')
    route  edge
0  route1  19.0
1  route1  47.0
2  route1  56.0
3  route1  43.0
4  route2  51.0
5  route2  46.0
6  route2  37.0
7  route2   2.0

If some columns should be kept as they are rather than flattened, pass them as id_vars=['col1', 'col2', ...].
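
As a rough illustration of id_vars, the sketch below assumes a hypothetical frame with a 'day' column that should survive the reshape. Only the route1/route2 names and the var_name/value_name arguments come from the answer above.

```python
import pandas as pd

# Hypothetical frame: 'day' is kept as-is, the route columns get melted.
df = pd.DataFrame({'day': ['Mon', 'Tue'],
                   'route1': [19.0, 47.0],
                   'route2': [51.0, 46.0]})

long_df = df.melt(id_vars=['day'], var_name='route', value_name='edge')
print(long_df)
```

Each id_vars value is repeated once per melted column, so 'day' appears next to every route/edge pair.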

Merge multiple column values into one column in python pandas

You can call apply and pass axis=1 to apply row-wise, then convert the dtype to str and join:

In [153]:
df['ColumnA'] = df[df.columns[1:]].apply(
    lambda x: ','.join(x.dropna().astype(int).astype(str)),
    axis=1
)
df

Out[153]:
  Column1  Column2  Column3  Column4  Column5  ColumnA
0       a        1        2        3        4  1,2,3,4
1       a        3        4        5      NaN    3,4,5
2       b        6        7        8      NaN    6,7,8
3       c        7        7      NaN      NaN      7,7

Here I call dropna to get rid of the NaN. The NaN values force each row to float, so the remaining values need to be cast back to int before the cast to str; otherwise the floats end up in the string (e.g. '1.0').
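
To see why the int cast matters, here is a small sketch on a frame reconstructed from the output above. The names values, as_float and as_int are illustrative only.

```python
import numpy as np
import pandas as pd

# Frame reconstructed from the output shown above.
df = pd.DataFrame({'Column1': ['a', 'a', 'b', 'c'],
                   'Column2': [1, 3, 6, 7],
                   'Column3': [2, 4, 7, 7],
                   'Column4': [3, 5, 8, np.nan],
                   'Column5': [4, np.nan, np.nan, np.nan]})

values = df[df.columns[1:]]

# Each row arrives as float64 because of the NaN, so str alone gives '1.0'.
as_float = values.apply(lambda x: ','.join(x.dropna().astype(str)), axis=1)

# Casting to int after dropna() restores '1', '2', ...
as_int = values.apply(lambda x: ','.join(x.dropna().astype(int).astype(str)), axis=1)

print(as_float.tolist())
print(as_int.tolist())
```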

Transpose multiple columns into one column using Python

This can be accomplished with melt:

df.melt(id_vars=['Date'], value_vars=df.columns.drop('Date').tolist())
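
A small sketch of that call on a hypothetical frame (only the 'Date' column name comes from the answer; x and y are made up). It also shows that when value_vars is left out, melt falls back to every column not listed in id_vars, so the shorter form should give the same result here.

```python
import pandas as pd

# Hypothetical frame; only the 'Date' column name comes from the answer above.
df = pd.DataFrame({'Date': ['2020-01-01', '2020-01-02'],
                   'x': [1, 2],
                   'y': [3, 4]})

explicit = df.melt(id_vars=['Date'], value_vars=df.columns.drop('Date').tolist())
implicit = df.melt(id_vars=['Date'])  # value_vars defaults to the remaining columns

print(explicit)
print(explicit.equals(implicit))
```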

Append multiple columns to single column

Try:

single_column_frame = pd.concat([df[col] for col in df.columns])

If you want to create a single column and get rid of month names:

df_new = df.melt()['value'].to_frame()

Or you can do:

single_column_frame = single_column_frame.reset_index().drop(columns=['index'])

You can also do:

single_column_frame = df.stack().reset_index().loc[:,0]
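
One thing the concat variant leaves behind is the original row labels, repeated once per column, which is why the reset_index step is useful. A minimal sketch, assuming hypothetical month columns Jan and Feb:

```python
import pandas as pd

# Hypothetical month columns, standing in for the frame from the question.
df = pd.DataFrame({'Jan': [1, 2], 'Feb': [3, 4]})

stacked = pd.concat([df[col] for col in df.columns])
print(stacked.index.tolist())   # original row labels repeat once per column

# drop=True discards the old labels instead of turning them into a column.
clean = stacked.reset_index(drop=True)
print(clean.tolist())
```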

Pandas: sum up multiple columns into one column without last column

You can first select by iloc and then sum:

df['Fruit Total'] = df.iloc[:, -4:-1].sum(axis=1)
print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          5.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          2.0

To sum all columns, use:

df['Fruit Total'] = df.sum(axis=1)
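
Here is a self-contained sketch on a frame reconstructed from the printed output. It uses iloc[:, :-1] as an alternative way to say "everything except the last column", and computes both totals before the new column is added, since calling df.sum(axis=1) afterwards would include 'Fruit Total' itself. The names without_last and all_fruit are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Apples': [2.0, 1.0, np.nan],
                   'Bananas': [3.0, 3.0, np.nan],
                   'Grapes': [np.nan, 7.0, 2.0],
                   'Kiwis': [1.0, np.nan, 3.0]})

# Every column except the last one; sum() skips NaN by default.
without_last = df.iloc[:, :-1].sum(axis=1)

# All four fruit columns, computed before any total column exists.
all_fruit = df.sum(axis=1)

df['Fruit Total'] = without_last
print(df)
```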

How to re-arrange multiple columns into one column with same index

This looks a little like the melt function in pandas, with the only difference being the index.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.melt.html

Here is some code you can run to test:

import pandas as pd
df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},'B': {0: 1, 1: 3, 2: 5},'C': {0: 2, 1: 4, 2: 6}})
pd.melt(df)


With a little manipulation, you could solve for the indexing issue.

This is not particularly pythonic, but if you have a limited number of columns, you could make do with:

molten = pd.melt(df)
a = molten.merge(df, left_on='value', right_on = 'A')
b = molten.merge(df, left_on='value', right_on = 'B')
c = molten.merge(df, left_on='value', right_on = 'C')
merge = pd.concat([a,b,c])
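
As a possible alternative to the merges, recent pandas versions (1.1 or later) let melt keep the original row labels through the ignore_index parameter. A minimal sketch on the same test frame:

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1, 3, 5], 'C': [2, 4, 6]})

# ignore_index=False keeps the original row labels instead of
# renumbering the melted rows from 0.
molten = df.melt(ignore_index=False)
print(molten)
```

Each original label then appears once per melted column, so rows that came from the same source row share an index value.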


Combine different values of multiple columns into one column

Thanks to the comments from @cannero and @ritchie46 (on the polars issue tracker), I was able to make it work.

This is a working version (Float64):

use polars::prelude::*;

fn my_black_box_function(a: f64, b: f64) -> f64 {
    // do something
    a
}

fn apply_multiples(lf: LazyFrame) -> Result<DataFrame> {
    let ergebnis = lf
        .select([col("struct_col").map(
            |s| {
                let ca = s.struct_()?;

                // Pull the two fields out of the struct column by name
                let a = ca.field_by_name("a")?;
                let b = ca.field_by_name("b")?;
                let a = a.f64()?;
                let b = b.f64()?;

                // Apply the function element-wise, propagating nulls
                let out: Float64Chunked = a
                    .into_iter()
                    .zip(b.into_iter())
                    .map(|(opt_a, opt_b)| match (opt_a, opt_b) {
                        (Some(a), Some(b)) => Some(my_black_box_function(a, b)),
                        _ => None,
                    })
                    .collect();

                Ok(out.into_series())
            },
            GetOutput::from_type(DataType::Float64),
        )])
        .collect();

    ergebnis
}

fn main() {
    // We start with a normal DataFrame
    let df = df![
        "a" => [1.0, 2.0, 3.0],
        "b" => [3.0, 5.1, 0.3]
    ]
    .unwrap();

    // We CONVERT the df into a StructChunked and WRAP this into a new LazyFrame
    let lf = df![
        "struct_col" => df.into_struct("StructChunked")
    ]
    .unwrap()
    .lazy();

    let processed = apply_multiples(lf);

    match processed {
        Ok(..) => println!("We did it"),
        Err(e) => println!("{:?}", e),
    }
}

Here is a version for my initial question (String):

use polars::prelude::*;

fn my_fruit_box(fruit: String, color: String) -> String {
    // do something
    format!("{} has {} color", fruit, color)
}

fn apply_multiples(lf: LazyFrame) -> Result<DataFrame> {
    let ergebnis = lf
        .select([col("struct_col").map(
            |s| {
                let ca = s.struct_()?;

                let fruit = ca.field_by_name("Fruit")?;
                let color = ca.field_by_name("Color")?;
                let color = color.utf8()?;
                let fruit = fruit.utf8()?;

                let out: Utf8Chunked = fruit
                    .into_iter()
                    .zip(color.into_iter())
                    .map(|(opt_fruit, opt_color)| match (opt_fruit, opt_color) {
                        (Some(fruit), Some(color)) => {
                            Some(my_fruit_box(fruit.to_string(), color.to_string()))
                        }
                        _ => None,
                    })
                    .collect();

                Ok(out.into_series())
            },
            GetOutput::from_type(DataType::Utf8),
        )])
        .collect();

    ergebnis
}

fn main() {
    // We start with a normal DataFrame
    let s1 = Series::new("Fruit", &["Apple", "Apple", "Pear"]);
    let s2 = Series::new("Color", &["Red", "Yellow", "Green"]);

    let df = DataFrame::new(vec![s1, s2]).unwrap();

    // We CONVERT the df into a StructChunked and WRAP this into a new LazyFrame
    let lf = df![
        "struct_col" => df.into_struct("StructChunked")
    ]
    .unwrap()
    .lazy();

    let processed = apply_multiples(lf);

    match processed {
        Ok(..) => println!("We did it"),
        Err(e) => println!("{:?}", e),
    }
}


