Concatenate Rows of Two Dataframes in Pandas

Call pd.concat and pass axis=1 to concatenate column-wise:

In [5]:

pd.concat([df_a,df_b], axis=1)
Out[5]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010

The pandas user guide has a useful section on the various methods of merging, joining and concatenating.

For example, because the two DataFrames have no clashing column names and share the same index (same number of rows), you can merge on the indices:

In [6]:

df_a.merge(df_b, left_index=True, right_index=True)
Out[6]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010

For the same reasons, a simple join works too:

In [7]:

df_a.join(df_b)
Out[7]:
        AAseq Biorep  Techrep Treatment     mz      AAseq1 Biorep1  Techrep1  \
0  ELVISLIVES      A        1         C  500.0  ELVISLIVES       A         1   
1  ELVISLIVES      A        1         C  500.5  ELVISLIVES       A         1   
2  ELVISLIVES      A        1         C  501.0  ELVISLIVES       A         1   

  Treatment1  inte1  
0          C   1100  
1          C   1050  
2          C   1010
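To try the three approaches side by side, here is a minimal sketch. df_a and df_b are rebuilt from the printed output above, so their exact dtypes are an assumption; the point is only that concat(axis=1), merge on the indices, and join all line rows up by index label.

```python
import pandas as pd

# Frames reconstructed from the printed output above (dtypes are assumed).
df_a = pd.DataFrame({
    'AAseq': ['ELVISLIVES'] * 3,
    'Biorep': ['A'] * 3,
    'Techrep': [1, 1, 1],
    'Treatment': ['C'] * 3,
    'mz': [500.0, 500.5, 501.0],
})
df_b = pd.DataFrame({
    'AAseq1': ['ELVISLIVES'] * 3,
    'Biorep1': ['A'] * 3,
    'Techrep1': [1, 1, 1],
    'Treatment1': ['C'] * 3,
    'inte1': [1100, 1050, 1010],
})

# All three calls align rows on the index labels 0, 1, 2.
via_concat = pd.concat([df_a, df_b], axis=1)
via_merge = df_a.merge(df_b, left_index=True, right_index=True)
via_join = df_a.join(df_b)

for name, frame in [('concat', via_concat), ('merge', via_merge), ('join', via_join)]:
    print(name, frame.shape, list(frame.columns))
```

Note that merge and join would need suffixes if the two frames shared column names, whereas concat would simply keep both columns under the same label.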

How to concatenate combinations of rows from two different dataframes?

Use itertools.product():

import itertools
pd.DataFrame(list(itertools.product(df1.A,df2.B)),columns=['A','B'])
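A small runnable sketch of the same idea, with made-up df1 and df2 (the original question's data is not shown, so both frames are hypothetical). It also shows merge(how='cross'), which is available in pandas 1.2 and later and builds the same Cartesian product without leaving pandas.

```python
import itertools

import pandas as pd

# Hypothetical inputs, for illustration only.
df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'B': ['x', 'y', 'z']})

# Every value of A paired with every value of B.
pairs = pd.DataFrame(list(itertools.product(df1['A'], df2['B'])),
                     columns=['A', 'B'])

# Same combinations via a cross join (requires pandas >= 1.2).
cross = df1.merge(df2, how='cross')

print(pairs)
```

The cross join has the advantage of carrying along every column of both frames, not just the two selected ones.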

How to concatenate two Dataframe rows using a mapping index

One option is to perform a double merge:

(df1.merge(df2.merge(MAP, left_on='C', right_on='C_index'),
           left_on='A', right_on='A_index')
    .filter(regex=r'^((?!_index).)*$') # remove the "X_index" columns
    .drop(columns='C')
)

NB. I used MAP as the name for the mapping DataFrame because map is a Python builtin.

Alternative, more linear, syntax:

(df1.merge(MAP, left_on='A', right_on='A_index')        
    .merge(df2, left_on='C_index', right_on='C')
    .filter(regex=r'^((?!_index).)*$')
    .drop(columns='C')
)

output:

   A           B     D
0  2        bike  blue
1  3  pedestrian   red
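The input frames are not shown in this answer, so here is a sketch with hypothetical df1, df2 and MAP chosen to reproduce the output above. Only the column names (A, B, C, D, A_index, C_index) come from the code; the values are assumptions.

```python
import pandas as pd

# Hypothetical inputs; only the column names are taken from the answer.
df1 = pd.DataFrame({'A': [1, 2, 3],
                    'B': ['car', 'bike', 'pedestrian']})
df2 = pd.DataFrame({'C': [10, 20, 30],
                    'D': ['green', 'blue', 'red']})
# MAP pairs a value of df1.A with a value of df2.C.
MAP = pd.DataFrame({'A_index': [2, 3],
                    'C_index': [20, 30]})

result = (
    df1.merge(MAP, left_on='A', right_on='A_index')   # attach the mapping to df1
       .merge(df2, left_on='C_index', right_on='C')   # follow the mapping into df2
       .filter(regex=r'^((?!_index).)*$')             # drop the "X_index" columns
       .drop(columns='C')
)
print(result)
```

Rows of df1 with no entry in MAP (A == 1 here) are dropped because merge defaults to an inner join; pass how='left' to the first merge if they should be kept.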

In pandas, how do I concatenate rows of a DataFrame based on two columns, in the order of a third one?

Setup:

Here is a short example and some code that moves the 'Sales' data into a separate column for each hour. The example has only three hours; for a full day, change range(3) to range(24).

import pandas as pd
df = pd.DataFrame([['Dave', 1, 0, 10],['Dave', 1, 1, 20],['Dave', 1, 2, 30],
                   ['Dave', 2, 0, 40],['Dave', 2, 1, 50],['Dave', 2, 2, 60],
                   ['Carl', 1, 0, 15],['Carl', 1, 1, 25],['Carl', 1, 2, 35],
                   ['Carl', 2, 0, 45],['Carl', 2, 1, 55],['Carl', 2, 2, 65]],
                  columns=['ID', 'Date', 'Hour', 'Sales'])

Output (df):

      ID  Date  Hour  Sales
0   Dave     1     0     10
1   Dave     1     1     20
2   Dave     1     2     30
3   Dave     2     0     40
4   Dave     2     1     50
5   Dave     2     2     60
6   Carl     1     0     15
7   Carl     1     1     25
8   Carl     1     2     35
9   Carl     2     0     45
10  Carl     2     1     55
11  Carl     2     2     65

'Where' and 'Merge':

The key here is pandas.merge with the on argument, which selects the columns to use as join keys.

df.where, df.merge, and df.dropna are versatile parts of pandas that are worth learning.

new = pd.DataFrame(columns=['ID','Date'])
for hour in range(3):
    tmp = df.where(df.Hour == hour).dropna(axis=0, how='all')
    tmp[hour] = tmp['Sales']
    tmp.drop(['Hour','Sales'], axis=1, inplace=True)
    new = new.merge(tmp, how='outer', on=['ID','Date'])
new.set_index(['ID','Date'], inplace=True)

Output (new):

              0     1     2
ID   Date                  
Dave 1.0   10.0  20.0  30.0
     2.0   40.0  50.0  60.0
Carl 1.0   15.0  25.0  35.0
     2.0   45.0  55.0  65.0
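The floats in that output come from df.where, which fills non-matching rows with NaN and so upcasts the integer columns. If that matters, one possible variant (a sketch, not the original answer's method) filters with a boolean mask and concatenates the per-hour pieces along axis=1; new_int and parts are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame([['Dave', 1, 0, 10], ['Dave', 1, 1, 20], ['Dave', 1, 2, 30],
                   ['Dave', 2, 0, 40], ['Dave', 2, 1, 50], ['Dave', 2, 2, 60],
                   ['Carl', 1, 0, 15], ['Carl', 1, 1, 25], ['Carl', 1, 2, 35],
                   ['Carl', 2, 0, 45], ['Carl', 2, 1, 55], ['Carl', 2, 2, 65]],
                  columns=['ID', 'Date', 'Hour', 'Sales'])

parts = []
for hour in range(3):
    # Boolean indexing keeps only matching rows, so no NaN is introduced.
    part = df.loc[df['Hour'] == hour, ['ID', 'Date', 'Sales']]
    # Name the value column after the hour and index by the join keys.
    part = part.rename(columns={'Sales': hour}).set_index(['ID', 'Date'])
    parts.append(part)

# Align the per-hour pieces on the shared (ID, Date) index.
new_int = pd.concat(parts, axis=1)
print(new_int)
```

This relies on every (ID, Date) pair appearing in every hour; a pair missing from one hour would still produce NaN in that column.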

Pivot Tables:

For this specific problem, DataFrame.pivot does all of that work for you. Passing a list as index requires pandas 1.1 or later.

dfp = df.pivot(index=['ID','Date'], columns='Hour', values='Sales')

Output (dfp):

Hour        0   1   2
ID   Date            
Carl 1     15  25  35
     2     45  55  65
Dave 1     10  20  30
     2     40  50  60
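For comparison, the same reshape can be written with set_index and unstack, which also works on pandas versions where pivot does not accept a list for index. This is a sketch of an equivalent approach rather than part of the original answer; wide is a name introduced here.

```python
import pandas as pd

df = pd.DataFrame([['Dave', 1, 0, 10], ['Dave', 1, 1, 20], ['Dave', 1, 2, 30],
                   ['Dave', 2, 0, 40], ['Dave', 2, 1, 50], ['Dave', 2, 2, 60],
                   ['Carl', 1, 0, 15], ['Carl', 1, 1, 25], ['Carl', 1, 2, 35],
                   ['Carl', 2, 0, 45], ['Carl', 2, 1, 55], ['Carl', 2, 2, 65]],
                  columns=['ID', 'Date', 'Hour', 'Sales'])

# Move ID, Date and Hour into the index, then spread Hour across the columns.
wide = df.set_index(['ID', 'Date', 'Hour'])['Sales'].unstack('Hour')
print(wide)
```

Both pivot and unstack raise an error if an (ID, Date, Hour) combination occurs more than once; in that case pivot_table with an aggregation function is the usual choice.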

Pandas: Combining Two DataFrames Horizontally

concat is indeed what you're looking for; you just have to pass a value other than the default for the axis argument (axis=1 instead of axis=0). Code sample below:

import pandas as pd

df1 = pd.DataFrame({
    'A': [1,2,3,4,5],
    'B': [1,2,3,4,5]
})

df2 = pd.DataFrame({
    'C': [1,2,3,4,5],
    'D': [1,2,3,4,5]
})

df_concat = pd.concat([df1, df2], axis=1)

print(df_concat)

With the result being:

   A  B  C  D
0  1  1  1  1
1  2  2  2  2
2  3  3  3  3
3  4  4  4  4
4  5  5  5  5
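One thing to keep in mind is that concat with axis=1 lines rows up by index label, not by position. The sketch below uses two hypothetical frames, left and right, whose indexes only partly overlap, and shows how reset_index(drop=True) forces a purely positional pairing.

```python
import pandas as pd

# Hypothetical frames whose index labels only partly overlap.
left = pd.DataFrame({'A': [1, 2, 3]}, index=[0, 1, 2])
right = pd.DataFrame({'C': [10, 20, 30]}, index=[1, 2, 3])

# Label-based alignment: the result covers labels 0..3, with NaN
# wherever one of the frames has no row for that label.
aligned = pd.concat([left, right], axis=1)

# Positional pairing: discard the old labels first.
positional = pd.concat([left.reset_index(drop=True),
                        right.reset_index(drop=True)], axis=1)

print(aligned)
print(positional)
```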

Concatenate row values in Pandas DataFrame

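merge does not concatenate the DataFrames the way you want; use concat instead:

ndf = pd.concat([df1, df2]).sort_values('name')

On pandas versions before 2.0 you could also use append, but DataFrame.append was deprecated in 1.4 and removed in 2.0:

ndf = df1.append(df2).sort_values('name')  # pandas < 2.0 only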
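A minimal runnable sketch of the concat approach follows. The two frames are hypothetical (only the 'name' column is taken from the answer), and ignore_index=True in sort_values, available since pandas 1.0, is added here to get a clean 0..n-1 index after sorting.

```python
import pandas as pd

# Hypothetical inputs; only the 'name' column comes from the answer.
df1 = pd.DataFrame({'name': ['carol', 'alice'], 'value': [3, 1]})
df2 = pd.DataFrame({'name': ['bob', 'dave'], 'value': [2, 4]})

# Stack the rows of both frames, then order them by name.
ndf = pd.concat([df1, df2]).sort_values('name', ignore_index=True)
print(ndf)
```

Without ignore_index the result keeps the original labels from df1 and df2, so index values can repeat.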

concatenate rows on dataframe one by one

One way is to change the indices of your input dataframes. Then concatenate and sort by index. This will also handle situations where your dataframes have mismatched lengths.

df1.index = df1.index*2
df2.index = df2.index*2 + 1

res = pd.concat([df1, df2]).sort_index()

print(res)

  data  type
0    a     1
1    v     2
2    b     1
3    w     2
4    c     1
5    x     2
6    d     1
7    y     2
8    e     1
9    z     2

If you need to normalize your index when your dataframes have inconsistent lengths, you can use reset_index as a final step:

res = res.reset_index(drop=True)
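Here is the whole interleaving recipe as one self-contained sketch. The inputs are reconstructed from the printed result above; working on copies (a and b, names introduced here) is a choice made for this sketch so that the original frames keep their indexes.

```python
import pandas as pd

# Inputs reconstructed from the printed result above.
df1 = pd.DataFrame({'data': list('abcde'), 'type': 1})
df2 = pd.DataFrame({'data': list('vwxyz'), 'type': 2})

# Work on copies so df1 and df2 are left untouched.
a = df1.copy()
b = df2.copy()
a.index = a.index * 2        # even slots: 0, 2, 4, ...
b.index = b.index * 2 + 1    # odd slots: 1, 3, 5, ...

# Sorting by the new index interleaves the rows.
res = pd.concat([a, b]).sort_index().reset_index(drop=True)
print(res)
```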

Joining two dataframes then combining data in fields with same name using Pandas

Instead of merging, concatenate the two DataFrames and group by State:

# concatenate and groupby to join the strings
df = pd.concat([data1, data2]).groupby('State', as_index=False).agg(lambda x: '; '.join(el for el in x if pd.notna(el)))
print(df)
  State        Product        Cashier    Type
0    CA  Banana; Shirt        Sally;         
1    MN    Apple; Shoe  Gretta; Trish        
2    NM          Socks          Paula  Hourly
3    NV         Orange       Samantha
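The question's input frames are not reproduced here, so the sketch below uses small hypothetical data1 and data2 (with the result stored in a new name, combined) to show how the aggregation behaves: values sharing a State are joined with '; ', and columns that are missing for a State come out as empty strings.

```python
import pandas as pd

# Hypothetical inputs; column names follow the printed output above.
data1 = pd.DataFrame({'State': ['MN', 'CA'],
                      'Product': ['Apple', 'Banana'],
                      'Cashier': ['Gretta', 'Sally']})
data2 = pd.DataFrame({'State': ['MN', 'CA', 'NM'],
                      'Product': ['Shoe', 'Shirt', 'Socks'],
                      'Type': [None, None, 'Hourly']})

# Stack the rows, then join the non-missing strings within each State.
combined = (
    pd.concat([data1, data2])
      .groupby('State', as_index=False)
      .agg(lambda x: '; '.join(el for el in x if pd.notna(el)))
)
print(combined)
```

The pd.notna filter is what keeps the NaN values introduced by concat (for columns that exist in only one frame) out of the joined strings.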
