Pandas version of rbind
Ah, this is to do with how I created the DataFrame, not with how I was combining them. The long and the short of it is, if you are creating a frame using a loop and a statement that looks like this:
Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData))
You must ignore the index:
Frame = Frame.append(pandas.DataFrame(data = SomeNewLineOfData), ignore_index=True)
Otherwise the appended rows keep their own, repeated index labels, and you will have issues later when combining data.
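Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same loop is usually written by collecting the pieces in a list and calling pd.concat once. A minimal sketch, assuming each new line of data arrives as a dict (the rows below are made up for illustration):

```python
import pandas as pd

# Hypothetical rows produced inside a loop; the values are illustrative only.
rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}, {"a": 5, "b": 6}]

pieces = []
for row in rows:
    # Each piece is a one-row frame whose only index label is 0.
    pieces.append(pd.DataFrame(data=[row]))

# ignore_index=True discards the repeated 0 labels and renumbers 0..n-1.
frame = pd.concat(pieces, ignore_index=True)
print(frame)
```

Concatenating once at the end also avoids copying the growing frame on every iteration.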
Pandas equivalent rbind operation
>>> df1
a b
0 -1.417866 -0.828749
1 0.212349 0.791048
2 -0.451170 0.628584
3 0.612671 -0.995330
4 0.078460 -0.322976
5 1.244803 1.576373
6 1.169629 -1.135926
7 -0.652443 0.506388
8 0.549604 -0.691054
9 -0.512829 -0.959398
>>> df2
a b
0 -0.652161 0.940932
1 2.495067 0.004833
2 -2.187792 1.692402
3 1.900738 0.372425
4 0.245976 1.894527
5 0.627297 0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8 0.543389 0.703457
9 -0.755059 1.239968
>>> pd.concat([df1, df2])
a b
0 -1.417866 -0.828749
1 0.212349 0.791048
2 -0.451170 0.628584
3 0.612671 -0.995330
4 0.078460 -0.322976
5 1.244803 1.576373
6 1.169629 -1.135926
7 -0.652443 0.506388
8 0.549604 -0.691054
9 -0.512829 -0.959398
0 -0.652161 0.940932
1 2.495067 0.004833
2 -2.187792 1.692402
3 1.900738 0.372425
4 0.245976 1.894527
5 0.627297 0.029331
6 -0.828628 -1.600014
7 -0.991835 -0.061202
8 0.543389 0.703457
9 -0.755059 1.239968
Unless I'm misinterpreting what you need, this is what you need.
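One thing to watch in the output above: the index labels 0-9 appear twice, so a label-based lookup returns two rows. A small sketch with made-up three-row frames (not the random data shown above) illustrating the effect:

```python
import pandas as pd

# Small stand-in frames; the values are illustrative only.
df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
df2 = pd.DataFrame({"a": [7.0, 8.0, 9.0], "b": [10.0, 11.0, 12.0]})

stacked = pd.concat([df1, df2])
# Label 0 now refers to one row from each input frame.
print(stacked.loc[0])

# Renumbering the rows gives unique labels, closer to what rbind shows in R.
renumbered = pd.concat([df1, df2], ignore_index=True)
print(renumbered.index.is_unique)
```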
Equivalent of R rbind.fill in Python Pandas
You are looking for the function concat:
import pandas as pd
df1 = pd.DataFrame({'col1':['a','b'],'col2':[33,44]})
df2 = pd.DataFrame({'col3':['dog'],'col2':[32], 'col4':[1]})
In [8]: pd.concat([df1, df2])
Out[8]:
col1 col2 col3 col4
0 a 33 NaN NaN
1 b 44 NaN NaN
0 NaN 32 dog 1
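By default concat takes the union of the columns (join='outer') and fills the gaps with NaN, which is what makes it behave like rbind.fill. A short sketch of that default next to join='inner', which keeps only the shared columns; ignore_index=True is an optional extra added here to renumber the rows:

```python
import pandas as pd

df1 = pd.DataFrame({'col1': ['a', 'b'], 'col2': [33, 44]})
df2 = pd.DataFrame({'col3': ['dog'], 'col2': [32], 'col4': [1]})

# Default outer join: union of columns, missing cells become NaN.
filled = pd.concat([df1, df2], ignore_index=True)
print(filled)

# Inner join: only columns present in every frame survive (here just col2).
shared = pd.concat([df1, df2], join='inner', ignore_index=True)
print(shared)
```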
pandas equivalent of R's cbind (concatenate/stack vectors vertically)
test3 = pd.concat([test1, test2], axis=1)
test3.columns = ['a','b']
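A self-contained sketch of the same idea, assuming test1 and test2 are Series of equal length (their contents are not shown in the question, so the values below are invented). Passing keys= names the columns in the same call instead of assigning test3.columns afterwards:

```python
import pandas as pd

# Hypothetical inputs standing in for test1 and test2.
test1 = pd.Series([1, 2, 3])
test2 = pd.Series([4, 5, 6])

# axis=1 places the Series side by side; keys become the column names.
test3 = pd.concat([test1, test2], axis=1, keys=['a', 'b'])
print(test3)
```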
Python equivalent of R's rbindlist function
I found a solution for my problem:
ests_list = []
for i in range(1, num_vars):
    # eval looks up the variable named combine<i>_lvl, which holds the join key(s)
    ests_list.append(df1.merge(df2, how='left', on=eval("combine%s" % i + "_lvl")))
pd.concat(ests_list)
I create an empty list and append the output of each loop iteration to it.
Then I combine everything in the list with the pd.concat function, which returns the result as a pandas DataFrame.
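The same collect-then-concat pattern can be sketched without eval by keeping the join keys in a list. Everything below (the frames, the key lists, the column names) is invented for illustration, since the original df1, df2 and combine*_lvl variables are not shown:

```python
import pandas as pd

# Hypothetical frames and join keys; the original data is not shown.
df1 = pd.DataFrame({"k1": [1, 2], "k2": ["x", "y"], "val": [10, 20]})
df2 = pd.DataFrame({"k1": [1, 2], "k2": ["x", "z"], "est": [0.1, 0.2]})
key_sets = [["k1"], ["k2"], ["k1", "k2"]]  # stands in for combine1_lvl, ...

ests_list = []
for keys in key_sets:
    # One left merge per key set; each result is a DataFrame.
    ests_list.append(df1.merge(df2, how="left", on=keys))

# Stack all merge results row-wise, as rbindlist would in R.
result = pd.concat(ests_list, ignore_index=True)
print(result)
```

Columns that appear in only some of the merge results are filled with NaN in the others.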
Pandas column bind (cbind) two data frames
If you're sure the rows of the two frames are in the same order, you can avoid index alignment by calling reset_index(), which resets the index values to start from 0:
df_c = pd.concat([df_a.reset_index(drop=True), df_b], axis=1)
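To see why the reset matters, here is a sketch with two invented frames whose index labels differ. Without the reset, concat aligns on the labels and pads with NaN. Note that df_b is reset here as well, which is an addition to the line above and is needed whenever df_b does not already have the default 0..n-1 index:

```python
import pandas as pd

# Hypothetical frames with the same number of rows but different labels.
df_a = pd.DataFrame({"x": [1, 2, 3]}, index=[10, 11, 12])
df_b = pd.DataFrame({"y": ["p", "q", "r"]}, index=[5, 6, 7])

# Aligned on labels: no label is shared, so the result has 6 rows with NaN.
aligned = pd.concat([df_a, df_b], axis=1)
print(aligned.shape)

# Positional bind: drop both indexes first, then concatenate column-wise.
df_c = pd.concat(
    [df_a.reset_index(drop=True), df_b.reset_index(drop=True)], axis=1
)
print(df_c)
```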
R rbind a dataframe of dataframes
The issue here seems to be not another data.frame within a data frame, but the non-unique rownames in the result. If you make sure that the rownames are unique after rbind, it should work:
df1 <- data.frame(a=1:3)
df2 <- data.frame(a=1:3)
df1$df <- data.frame(a=1:3, row.names=letters[1:3])
df2$df <- data.frame(a=1:3, row.names=LETTERS[1:3])
> res <- rbind(df1, df2)
> res
a a
1 1 1
2 2 2
3 3 3
4 1 1
5 2 2
6 3 3
> res$df
a
a 1
b 2
c 3
A 1
B 2
C 3
The problem seems to be that rbind adjusts the rownames for the two data.frames being merged, but does not adjust the rownames for data.frames within data.frames.
Append multiple pandas data frames at once
Have you simply tried using a list as argument of append? Or am I missing anything?
import numpy as np
import pandas as pd
dates = np.asarray(pd.date_range('1/1/2000', periods=8))
df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df2 = df1.copy()
df3 = df1.copy()
# DataFrame.append accepts a list of frames (pandas versions before 2.0)
df = df1.append([df2, df3])
print(df)
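On pandas 2.0 and later DataFrame.append is no longer available; passing the same list of frames to pd.concat is the usual replacement. A minimal sketch reusing the setup above:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df1 = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df2 = df1.copy()
df3 = df1.copy()

# One call stacks all three frames; the date index is kept as-is.
df = pd.concat([df1, df2, df3])
print(df.shape)
```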
Issue concatenating two dataframes on top of each other
Look at the example in the pandas documentation (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html):
df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df
A B
0 1 2
1 3 4
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))
df.append(df2)
A B
0 1 2
1 3 4
0 5 6
1 7 8
You will see that the index labels are stacked along with the rows (0, 1, 0, 1).
So, as the comment suggested, use ignore_index=True to reset the index to numeric order:
df.append(df2, ignore_index=True)
A B
0 1 2
1 3 4
2 5 6
3 7 8
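If duplicated labels should be caught rather than silently kept, pd.concat has a verify_integrity flag that raises on overlapping index values. A short sketch using the same two frames (pd.concat is used here in place of append, which is a substitution on my part):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'))

try:
    # Both frames carry labels 0 and 1, so this raises ValueError.
    pd.concat([df, df2], verify_integrity=True)
    overlap_detected = False
except ValueError:
    overlap_detected = True

# With ignore_index=True the rows are renumbered and no overlap remains.
combined = pd.concat([df, df2], ignore_index=True)
print(overlap_detected, list(combined.index))
```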