Concatenate rows of two dataframes in pandas
Call concat and pass axis=1 to concatenate column-wise:
In [5]:
pd.concat([df_a,df_b], axis=1)
Out[5]:
AAseq Biorep Techrep Treatment mz AAseq1 Biorep1 Techrep1 \
0 ELVISLIVES A 1 C 500.0 ELVISLIVES A 1
1 ELVISLIVES A 1 C 500.5 ELVISLIVES A 1
2 ELVISLIVES A 1 C 501.0 ELVISLIVES A 1
Treatment1 inte1
0 C 1100
1 C 1050
2 C 1010
There is a useful guide to the various methods of merging, joining and concatenating online.
For example, since the two frames have no clashing column names, you can merge on their indices, which match because both frames have the same number of rows (and the same default index):
In [6]:
df_a.merge(df_b, left_index=True, right_index=True)
Out[6]:
AAseq Biorep Techrep Treatment mz AAseq1 Biorep1 Techrep1 \
0 ELVISLIVES A 1 C 500.0 ELVISLIVES A 1
1 ELVISLIVES A 1 C 500.5 ELVISLIVES A 1
2 ELVISLIVES A 1 C 501.0 ELVISLIVES A 1
Treatment1 inte1
0 C 1100
1 C 1050
2 C 1010
For the same reasons, a simple join works too:
In [7]:
df_a.join(df_b)
Out[7]:
AAseq Biorep Techrep Treatment mz AAseq1 Biorep1 Techrep1 \
0 ELVISLIVES A 1 C 500.0 ELVISLIVES A 1
1 ELVISLIVES A 1 C 500.5 ELVISLIVES A 1
2 ELVISLIVES A 1 C 501.0 ELVISLIVES A 1
Treatment1 inte1
0 C 1100
1 C 1050
2 C 1010
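The three calls above can be tried side by side. This is a minimal sketch: the original df_a and df_b are not shown, so the frames below are trimmed stand-ins that reuse a few of the column names from the printed output.

```python
import pandas as pd

# Hypothetical, trimmed versions of df_a and df_b (the real frames have more columns)
df_a = pd.DataFrame({'AAseq': ['ELVISLIVES'] * 3, 'mz': [500.0, 500.5, 501.0]})
df_b = pd.DataFrame({'AAseq1': ['ELVISLIVES'] * 3, 'inte1': [1100, 1050, 1010]})

by_concat = pd.concat([df_a, df_b], axis=1)                      # align on index, glue columns
by_merge = df_a.merge(df_b, left_index=True, right_index=True)   # inner join on the index
by_join = df_a.join(df_b)                                        # left join on the index

print(by_concat)
```

All three work here only because the indices are identical and no column name appears in both frames. With overlapping names, join raises unless you pass lsuffix/rsuffix, and merge adds _x/_y suffixes.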
How to concatenate combinations of rows from two different dataframes?
Use itertools.product():
import itertools
pd.DataFrame(list(itertools.product(df1.A,df2.B)),columns=['A','B'])
A B
0 1 a
1 1 b
2 1 c
3 2 a
4 2 b
5 2 c
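For a version you can run as-is, here is a small sketch with made-up df1 and df2 (the question's frames are not shown). It also shows a cross merge, which needs pandas 1.2 or later and should give the same combinations.

```python
import itertools

import pandas as pd

# Hypothetical inputs: one column in each frame
df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'B': ['a', 'b', 'c']})

# Every (A, B) pair, built with itertools.product
pairs = pd.DataFrame(list(itertools.product(df1.A, df2.B)), columns=['A', 'B'])

# Same idea using a cross join (pandas >= 1.2)
crossed = df1.merge(df2, how='cross')

print(pairs)
```

The cross merge keeps every column of both frames, which is handy when each side has more than one column to carry along.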
How to concatenate two Dataframe rows using a mapping index
One option is to perform a double merge:
(df1.merge(df2.merge(MAP, left_on='C', right_on='C_index'),
           left_on='A', right_on='A_index')
    .filter(regex=r'^((?!_index).)*$')  # remove the "X_index" columns
    .drop(columns='C')
)
NB: I used MAP as the name of the mapping dataframe because map is a Python builtin.
Alternative, more linear, syntax:
(df1.merge(MAP, left_on='A', right_on='A_index')
    .merge(df2, left_on='C_index', right_on='C')
    .filter(regex=r'^((?!_index).)*$')
    .drop(columns='C')
)
output:
A B D
0 2 bike blue
1 3 pedestrian red
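The input frames are not shown above, so here is a self-contained sketch with invented data chosen to give the same two output rows. The values in df1, df2 and MAP are assumptions; only the column names A, B, C, D, A_index and C_index come from the answer.

```python
import pandas as pd

# Hypothetical inputs
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': ['car', 'bike', 'pedestrian']})
df2 = pd.DataFrame({'C': [10, 20, 30], 'D': ['green', 'blue', 'red']})
# Mapping table: which A in df1 goes with which C in df2
MAP = pd.DataFrame({'A_index': [2, 3], 'C_index': [20, 30]})

out = (df1.merge(MAP, left_on='A', right_on='A_index')   # attach the mapping to df1
          .merge(df2, left_on='C_index', right_on='C')    # follow the mapping into df2
          .drop(columns=['A_index', 'C_index', 'C']))     # drop the helper key columns

print(out)
```

Dropping the helper columns by name is a plainer alternative to the regex filter when you know exactly which keys were added.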
In Python Pandas, How do I concatenate rows of a df based on two columns? and in the order of a third one?
Setup:
Here is a short example and some code that moves the 'Sales' data into a separate column for each hour. The example only uses three hours; for a full day, change range(3) to range(24).
import pandas as pd
df = pd.DataFrame([['Dave', 1, 0, 10],['Dave', 1, 1, 20],['Dave', 1, 2, 30],
['Dave', 2, 0, 40],['Dave', 2, 1, 50],['Dave', 2, 2, 60],
['Carl', 1, 0, 15],['Carl', 1, 1, 25],['Carl', 1, 2, 35],
['Carl', 2, 0, 45],['Carl', 2, 1, 55],['Carl', 2, 2, 65]],
columns=['ID', 'Date', 'Hour', 'Sales'])
Output (df):
ID Date Hour Sales
0 Dave 1 0 10
1 Dave 1 1 20
2 Dave 1 2 30
3 Dave 2 0 40
4 Dave 2 1 50
5 Dave 2 2 60
6 Carl 1 0 15
7 Carl 1 1 25
8 Carl 1 2 35
9 Carl 2 0 45
10 Carl 2 1 55
11 Carl 2 2 65
'Where' and 'Merge':
The key here is using pandas.merge with the on argument to choose which columns act as the keys for merging. df.where, df.merge, and df.dropna are very versatile parts of pandas that are good to learn.
new = pd.DataFrame(columns=['ID','Date'])
for hour in range(3):
    # keep only the rows for this hour (other rows become all-NaN and are dropped)
    tmp = df.where(df.Hour == hour).dropna(axis=0, how='all')
    tmp[hour] = tmp['Sales']
    tmp.drop(['Hour','Sales'], axis=1, inplace=True)
    new = new.merge(tmp, how='outer', on=['ID','Date'])
new.set_index(['ID','Date'], inplace=True)
Output (new):
0 1 2
ID Date
Dave 1.0 10.0 20.0 30.0
2.0 40.0 50.0 60.0
Carl 1.0 15.0 25.0 35.0
2.0 45.0 55.0 65.0
Pivot Tables:
For this specific problem, you can use pivot tables to do all that work for you.
dfp = df.pivot(index=['ID','Date'], columns='Hour', values='Sales')
Output (dfp):
Hour 0 1 2
ID Date
Carl 1 15 25 35
2 45 55 65
Dave 1 10 20 30
2 40 50 60
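If your pandas version does not accept a list for the index argument of pivot (as far as I know that needs 1.1 or later), the same reshape can be sketched with set_index and unstack. The data is a shortened copy of the setup above, and the Hour_ column prefix is just an illustrative choice.

```python
import pandas as pd

# Shortened version of the setup data: two hours per (ID, Date)
df = pd.DataFrame(
    [['Dave', 1, 0, 10], ['Dave', 1, 1, 20], ['Dave', 2, 0, 40], ['Dave', 2, 1, 50],
     ['Carl', 1, 0, 15], ['Carl', 1, 1, 25], ['Carl', 2, 0, 45], ['Carl', 2, 1, 55]],
    columns=['ID', 'Date', 'Hour', 'Sales'])

# Move Hour from the row index to the columns
wide = df.set_index(['ID', 'Date', 'Hour'])['Sales'].unstack('Hour')

# Optional: flat, labelled columns instead of a MultiIndex result
wide.columns = [f'Hour_{h}' for h in wide.columns]
flat = wide.reset_index()

print(flat)
```

Like pivot, unstack raises if an (ID, Date, Hour) combination appears more than once; in that case pivot_table with an aggregation function is the usual fallback.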
Pandas: Combining Two DataFrames Horizontally
concat is indeed what you're looking for; you just have to pass it a different value for the axis argument than the default. Code sample below:
import pandas as pd
df1 = pd.DataFrame({
'A': [1,2,3,4,5],
'B': [1,2,3,4,5]
})
df2 = pd.DataFrame({
'C': [1,2,3,4,5],
'D': [1,2,3,4,5]
})
df_concat = pd.concat([df1, df2], axis=1)
print(df_concat)
With the result being:
A B C D
0 1 1 1 1
1 2 2 2 2
2 3 3 3 3
3 4 4 4 4
4 5 5 5 5
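One thing to watch: with axis=1, concat lines rows up by index label, not by position. The sketch below uses two made-up frames whose indices only partly overlap, then shows resetting the index to get a purely positional side-by-side result.

```python
import pandas as pd

# Hypothetical frames with different index labels
df1 = pd.DataFrame({'A': [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'C': [10, 20, 30]}, index=[2, 3, 4])

# Aligned on labels: the union of both indices, NaN where a label is missing
misaligned = pd.concat([df1, df2], axis=1)

# Positional: drop the old labels first so row i meets row i
positional = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1)

print(misaligned)
print(positional)
```

If you see unexpected NaN values or extra rows after a horizontal concat, mismatched indices are the first thing to check.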
Concatenate row values in Pandas DataFrame
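merge does not concatenate the dfs as you want. On pandas versions before 2.0 you can use append instead:
ndf = df1.append(df2).sort_values('name')
DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions use concat: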
ndf = pd.concat([df1, df2]).sort_values('name')
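A runnable sketch of the concat version, with invented data (only the 'name' column comes from the answer; the 'score' column and all values are assumptions):

```python
import pandas as pd

# Hypothetical inputs sharing the same columns
df1 = pd.DataFrame({'name': ['bob', 'dan'], 'score': [1, 2]})
df2 = pd.DataFrame({'name': ['amy', 'cal'], 'score': [3, 4]})

# Stack the rows, sort by name, and renumber the index
ndf = (pd.concat([df1, df2], ignore_index=True)
         .sort_values('name')
         .reset_index(drop=True))

print(ndf)
```

Without ignore_index or reset_index, the result keeps the original labels from both frames, so index values such as 0 and 1 appear twice.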
concatenate rows on dataframe one by one
One way is to change the indices of your input dataframes. Then concatenate and sort by index. This will also handle situations where your dataframes have mismatched lengths.
df1.index = df1.index*2
df2.index = df2.index*2 + 1
res = pd.concat([df1, df2]).sort_index()
print(res)
data type
0 a 1
1 v 2
2 b 1
3 w 2
4 c 1
5 x 2
6 d 1
7 y 2
8 e 1
9 z 2
If you need to normalize your index when your dataframes have inconsistent lengths, you can use reset_index as a final step:
res = res.reset_index(drop=True)
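Here is the same interleaving trick as a self-contained sketch with frames of different lengths. The data is invented; the column names 'data' and 'type' follow the printed output above.

```python
import pandas as pd

# Hypothetical inputs of unequal length
df1 = pd.DataFrame({'data': ['a', 'b', 'c'], 'type': 1})
df2 = pd.DataFrame({'data': ['v', 'w'], 'type': 2})

df1.index = df1.index * 2        # even slots: 0, 2, 4
df2.index = df2.index * 2 + 1    # odd slots: 1, 3

# Interleave by sorting on the new labels, then renumber 0..n-1
res = pd.concat([df1, df2]).sort_index().reset_index(drop=True)

print(res)
```

Note that this overwrites the index of df1 and df2 in place; work on copies if you need the original labels later.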
Joining two dataframes then combining data in fields with same name using Pandas
Instead of merging, concatenate
# concatenate and groupby to join the strings
df = pd.concat([data1, data2]).groupby('State', as_index=False).agg(lambda x: '; '.join(el for el in x if pd.notna(el)))
print(df)
State Product Cashier Type
0 CA Banana; Shirt Sally;
1 MN Apple; Shoe Gretta; Trish
2 NM Socks Paula Hourly
3 NV Orange Samantha
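To try this without the original data1 and data2 (which are not shown), here is a small sketch with made-up frames that reuse the State, Product and Cashier column names. The values are assumptions.

```python
import pandas as pd

# Hypothetical inputs; None marks a missing cashier
data1 = pd.DataFrame({'State': ['CA', 'MN'],
                      'Product': ['Banana', 'Apple'],
                      'Cashier': ['Sally', 'Gretta']})
data2 = pd.DataFrame({'State': ['CA', 'MN'],
                      'Product': ['Shirt', 'Shoe'],
                      'Cashier': [None, 'Trish']})

# Stack the rows, then join the non-missing strings within each State
df = (pd.concat([data1, data2])
        .groupby('State', as_index=False)
        .agg(lambda x: '; '.join(el for el in x if pd.notna(el))))

print(df)
```

Because missing values are skipped inside the join, a column that is entirely missing for a state comes out as an empty string rather than NaN.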