Pandas Version of Rbind

Pandas version of rbind

This turned out to be caused by how I created the DataFrame, not by how I was combining them. In short, if you build a frame in a loop with a statement like this:

Frame = Frame.append(pandas.DataFrame(data=SomeNewLineOfData))

you must ignore the index:

Frame = Frame.append(pandas.DataFrame(data=SomeNewLineOfData), ignore_index=True)

Otherwise every appended piece keeps its own index labels, and the resulting duplicate labels cause problems later when combining data. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0; on current versions, use pd.concat with ignore_index=True instead.
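
As a minimal sketch of the same loop on pandas versions that no longer provide append: collect the pieces in a list and concatenate once at the end. The new_lines data and its column names are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical input: each item stands in for one "new line of data".
new_lines = [{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}]

pieces = []
for line in new_lines:
    # On its own, each one-row frame gets index label 0.
    pieces.append(pd.DataFrame(data=[line]))

# Concatenate once; ignore_index=True renumbers the rows 0..n-1.
frame = pd.concat(pieces, ignore_index=True)
print(frame)
```

Concatenating once also avoids copying the growing frame on every iteration.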

Pandas equivalent rbind operation

>>> df1
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398

>>> df2
          a         b
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

>>> pd.concat([df1, df2])
          a         b
0 -1.417866 -0.828749
1  0.212349  0.791048
2 -0.451170  0.628584
3  0.612671 -0.995330
4  0.078460 -0.322976
5  1.244803  1.576373
6  1.169629 -1.135926
7 -0.652443  0.506388
8  0.549604 -0.691054
9 -0.512829 -0.959398
0 -0.652161  0.940932
1  2.495067  0.004833
2 -2.187792  1.692402
3  1.900738  0.372425
4  0.245976  1.894527
5  0.627297  0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8  0.543389  0.703457
9 -0.755059  1.239968

Unless I'm misinterpreting the question, pd.concat is what you need. Note that the original index labels are kept, so 0 through 9 each appear twice in the result.
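
A small sketch of what the repeated labels mean in practice, and of how ignore_index=True gives the fresh row numbering that R's rbind would produce. The tiny frames here are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
df2 = pd.DataFrame({"a": [5.0, 6.0], "b": [7.0, 8.0]})

stacked = pd.concat([df1, df2])
# Index labels are kept, so label 0 now refers to two rows.
rows_at_zero = stacked.loc[0]

renumbered = pd.concat([df1, df2], ignore_index=True)
# With a fresh index, label 0 refers to exactly one row (returned as a Series).
first_row = renumbered.loc[0]
```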

Equivalent of R rbind.fill in Python Pandas

You are looking for the function concat:

import pandas as pd

df1 = pd.DataFrame({'col1':['a','b'],'col2':[33,44]})

df2 = pd.DataFrame({'col3':['dog'],'col2':[32], 'col4':[1]})

In [8]: pd.concat([df1, df2])
Out[8]:
  col1  col2 col3  col4
0    a    33  NaN   NaN
1    b    44  NaN   NaN
0  NaN    32  dog     1
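
For comparison, a short sketch of the two join modes of pd.concat on the same frames: the default join="outer" behaves like rbind.fill (union of columns, gaps filled with NaN), while join="inner" keeps only the columns every frame shares. Using ignore_index=True here is an optional addition to renumber the rows.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['a', 'b'], 'col2': [33, 44]})
df2 = pd.DataFrame({'col3': ['dog'], 'col2': [32], 'col4': [1]})

# Default join="outer": union of columns, missing cells filled with NaN.
filled = pd.concat([df1, df2], ignore_index=True)

# join="inner": keep only the columns present in every frame.
shared = pd.concat([df1, df2], join="inner", ignore_index=True)
```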

pandas equivalent of R's cbind (concatenate/stack vectors vertically)

test3 = pd.concat([test1, test2], axis=1)
test3.columns = ['a','b']

(But see the detailed answer by @feng-mai, below)
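
A self-contained sketch of the same idea, assuming test1 and test2 are Series of equal length (the originals are not shown, so the values below are hypothetical). Passing keys= names the columns in the same call instead of assigning test3.columns afterwards.

```python
import pandas as pd

# Hypothetical vectors; the original test1 and test2 are not shown.
test1 = pd.Series([1, 2, 3])
test2 = pd.Series([4, 5, 6])

# axis=1 places the Series side by side; keys= supplies the column names.
test3 = pd.concat([test1, test2], axis=1, keys=['a', 'b'])
```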

Equivalent of R's rbindlist function in Python

I found a solution for my problem:

ests_list = []
for i in range(1, num_vars):
    # eval looks up the variable named combine<i>_lvl, which holds the join key(s)
    ests_list.append(df1.merge(df2, how='left', on=eval("combine%s_lvl" % i)))
pd.concat(ests_list)

I create an empty list and append the output of each loop iteration to it.

Then I combine everything in the list with pd.concat, which returns the result as a single pandas DataFrame.
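
The same pattern can be sketched without eval by keeping the join keys in a list. Everything below (the frames, the column names k1, k2, left, right, and the join_keys list) is hypothetical and only illustrates the collect-then-concat idea; selecting just the key and value column from df2 is an extra simplification to keep the merged columns identical across iterations.

```python
import pandas as pd

# Hypothetical frames and key columns, for illustration only.
df1 = pd.DataFrame({"k1": [1, 2], "k2": ["x", "y"], "left": [10, 20]})
df2 = pd.DataFrame({"k1": [1, 2], "k2": ["x", "z"], "right": [0.1, 0.2]})

# Keep the join keys in a list instead of separately named variables.
join_keys = ["k1", "k2"]

# One left merge per key; only the key and the value column are taken from df2.
ests_list = [
    df1.merge(df2[[key, "right"]], how="left", on=key) for key in join_keys
]

# Stack all merge results into one frame, like rbindlist.
result = pd.concat(ests_list, ignore_index=True)
```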

Pandas column bind (cbind) two data frames

If you're sure the rows of both frames are in the same order, you can avoid index alignment altogether by calling reset_index(drop=True), which replaces the index with the default 0, 1, 2, ...:

df_c = pd.concat([df_a.reset_index(drop=True), df_b], axis=1)
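
A small sketch of why the reset matters, using made-up frames whose index labels do not overlap. Resetting both frames is a defensive variation on the line above, for the case where df_b does not already have a default index.

```python
import pandas as pd

# Hypothetical frames: same number of rows, different index labels.
df_a = pd.DataFrame({"x": [1, 2, 3]}, index=[10, 11, 12])
df_b = pd.DataFrame({"y": ["p", "q", "r"]}, index=[0, 1, 2])

# Without resetting, concat aligns on index labels and pads with NaN.
misaligned = pd.concat([df_a, df_b], axis=1)

# Resetting both frames binds the columns purely by row position.
df_c = pd.concat(
    [df_a.reset_index(drop=True), df_b.reset_index(drop=True)], axis=1
)
```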

R rbind a dataframe of dataframes

The issue here seems to be not the data.frame nested within a data.frame, but the non-unique rownames in the result. If you make sure the rownames are unique after rbind, it should work:

df1 <- data.frame(a=1:3)
df2 <- data.frame(a=1:3)
df1$df <- data.frame(a=1:3, row.names=letters[1:3])
df2$df <- data.frame(a=1:3, row.names=LETTERS[1:3])

> res <- rbind(df1, df2)
> res
  a a
1 1 1
2 2 2
3 3 3
4 1 1
5 2 2
6 3 3

> res$df
  a
a 1
b 2
c 3
A 1
B 2
C 3

The problem seems to be that rbind adjusts the rownames for the two data.frames being merged, but does not adjust the rownames for data.frames within data.frames.

Append multiple pandas data frames at once

Have you tried simply passing a list to append? Or am I missing something?

import numpy as np
import pandas as pd

dates = np.asarray(pd.date_range('1/1/2000', periods=8))
df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df2 = df1.copy()
df3 = df1.copy()
# append accepts a list of frames (DataFrame.append was removed in pandas 2.0;
# there, pd.concat([df1, df2, df3]) gives the same result)
df = df1.append([df2, df3])

print(df)
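
For pandas versions without DataFrame.append, the same operation can be sketched with a single pd.concat call. The keys= argument in the second call is an optional extra, not part of the original answer; it adds an outer index level recording which frame each row came from.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df2 = df1.copy()
df3 = df1.copy()

# One call stacks all three frames on top of each other.
df = pd.concat([df1, df2, df3])

# Optional: keys= adds an outer index level naming each row's source frame.
labelled = pd.concat([df1, df2, df3], keys=['df1', 'df2', 'df3'])
```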

Issue concatenating two dataframes on top of each other

Look at the example in the pandas documentation for DataFrame.append: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html

>>> df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
>>> df
   A  B
0  1  2
1  3  4
>>> df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))
>>> df.append(df2)
   A  B
0  1  2
1  3  4
0  5  6
1  7  8

You can see that the original index labels are kept and simply stacked, so 0 and 1 each appear twice.

So, as the comment suggested, use ignore_index=True to renumber the index in order. On pandas 2.0 and later, where DataFrame.append no longer exists, pd.concat([df, df2], ignore_index=True) gives the same result:

>>> df.append(df2, ignore_index=True)
   A  B
0  1  2
1  3  4
2  5  6
3  7  8

