pandas concat generates nan values
I think the problem is the different index values: pd.concat with axis=1 aligns rows on index labels, so any label that exists in only one DataFrame gets NaN in the other DataFrame's columns:
aaa = pd.DataFrame([0,1,0,1,0,0], columns=['prediction'], index=[4,5,8,7,10,12])
print(aaa)
prediction
4 0
5 1
8 0
7 1
10 0
12 0
bbb = pd.DataFrame([0,0,1,0,1,1], columns=['groundTruth'])
print(bbb)
groundTruth
0 0
1 0
2 1
3 0
4 1
5 1
print (pd.concat([aaa, bbb], axis=1))
prediction groundTruth
0 NaN 0.0
1 NaN 0.0
2 NaN 1.0
3 NaN 0.0
4 0.0 1.0
5 1.0 1.0
7 1.0 NaN
8 0.0 NaN
10 0.0 NaN
12 0.0 NaN
The solution is reset_index, if the index values are not needed:
aaa.reset_index(drop=True, inplace=True)
bbb.reset_index(drop=True, inplace=True)
print(aaa)
prediction
0 0
1 1
2 0
3 1
4 0
5 0
print(bbb)
groundTruth
0 0
1 0
2 1
3 0
4 1
5 1
print (pd.concat([aaa, bbb], axis=1))
prediction groundTruth
0 0 0
1 1 0
2 0 1
3 1 0
4 0 1
5 0 1
EDIT: If you need to keep the same index as aaa and both DataFrames have the same length, use:
bbb.index = aaa.index
print (pd.concat([aaa, bbb], axis=1))
prediction groundTruth
4 0 0
5 1 0
8 0 1
7 1 0
10 0 1
12 0 1
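A quick way to confirm that mismatched indexes are the cause is to compare the two indexes before concatenating. This is only a sketch built on the aaa and bbb frames from this answer; the variable names misaligned and aligned are mine.

```python
import pandas as pd

aaa = pd.DataFrame([0, 1, 0, 1, 0, 0], columns=['prediction'],
                   index=[4, 5, 8, 7, 10, 12])
bbb = pd.DataFrame([0, 0, 1, 0, 1, 1], columns=['groundTruth'])

# Index.equals is True only if both indexes hold the same labels in the same order.
print(aaa.index.equals(bbb.index))

# With different labels, axis=1 concat builds the union of labels and fills gaps with NaN.
misaligned = pd.concat([aaa, bbb], axis=1)
print(misaligned.shape)

# Resetting both indexes gives matching 0..5 labels, so rows pair up by position.
aligned = pd.concat([aaa.reset_index(drop=True),
                     bbb.reset_index(drop=True)], axis=1)
print(aligned.isna().any().any())
```

Passing reset_index(drop=True) results straight into concat also avoids modifying aaa and bbb in place.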
Pandas concat resulting in NaN rows?
I used to have the same problem when I generated the training and testing sets. This is my solution; however, I do not know why pd.concat does not work in this situation either.
l1=df.values.tolist()
l2=df_resolved.values.tolist()
for i in range(len(l1)):
l1[i].extend(l2[i])
df=pd.DataFrame(l1,columns=df.columns.tolist()+df_resolved.columns.tolist())
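A likely explanation, assuming the two frames had different index labels (common after a train/test split), is that concat aligned on the index rather than on row position. In that case the list workaround can probably be replaced by resetting both indexes first. The frames below are hypothetical stand-ins for df and df_resolved.

```python
import pandas as pd

# Hypothetical stand-ins: same number of rows, different index labels.
df = pd.DataFrame({'x': [1, 2, 3]}, index=[10, 20, 30])
df_resolved = pd.DataFrame({'y': ['a', 'b', 'c']}, index=[0, 1, 2])

# Drop the old labels so both frames share 0..n-1, then join side by side.
combined = pd.concat(
    [df.reset_index(drop=True), df_resolved.reset_index(drop=True)],
    axis=1,
)
print(combined)
```

Unlike the list-based version, this keeps each column's dtype instead of rebuilding the frame from Python lists.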
pandas concat two dataframes of different row size without nan values
Here's how I did it and I don't get any additional NaNs.
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'a':[1,2,3,4,5,6],
'b':['a','b','c','d',np.nan,np.nan],
'c':['x',np.nan,np.nan,np.nan,'y','z']})
df2 = pd.DataFrame(np.random.randint(0,10,(3,3)), columns = list('abc'))
print (df1)
print (df2)
df = pd.concat([df1,df2]).reset_index(drop=True)
print (df)
The output of this is:
DF1:
a b c
0 1 a x
1 2 b NaN
2 3 c NaN
3 4 d NaN
4 5 NaN y
5 6 NaN z
DF2:
a b c
0 4 8 4
1 8 4 4
2 2 8 1
DF: after concat
a b c
0 1 a x
1 2 b NaN
2 3 c NaN
3 4 d NaN
4 5 NaN y
5 6 NaN z
6 4 8 4
7 8 4 4
8 2 8 1
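The same row-wise stacking can also be written with ignore_index=True instead of a separate reset_index call. A small sketch, with df2 hard-coded to the values printed above rather than generated randomly:

```python
import pandas as pd
import numpy as np

df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                    'b': ['a', 'b', 'c', 'd', np.nan, np.nan],
                    'c': ['x', np.nan, np.nan, np.nan, 'y', 'z']})
# Fixed values copied from the DF2 output above, so the example is repeatable.
df2 = pd.DataFrame([[4, 8, 4], [8, 4, 4], [2, 8, 1]], columns=list('abc'))

# ignore_index=True discards the old labels and numbers the result 0..n-1.
stacked = pd.concat([df1, df2], ignore_index=True)
print(stacked)

# Only the NaNs already present in df1 remain; the concat adds none,
# because both frames have the same column names.
print(stacked.isna().sum().sum() == df1.isna().sum().sum())
```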
Pandas concat producing NaN
I think you want concat(df, axis=1).
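To illustrate the difference the axis argument makes, here is a minimal sketch with two hypothetical frames (left and right are my names, not from the question). With the default axis=0 the frames are stacked as rows and columns that do not match are filled with NaN; with axis=1 they are placed side by side.

```python
import pandas as pd

# Hypothetical frames with the same index but different columns.
left = pd.DataFrame({'a': [1, 2]})
right = pd.DataFrame({'b': [3, 4]})

# Default axis=0: rows are stacked, each frame lacks the other's column.
rows = pd.concat([left, right])
print(rows)

# axis=1: columns are joined on the shared index labels 0 and 1.
cols = pd.concat([left, right], axis=1)
print(cols)
```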
Why am I getting NaN values when I concat two panda dataframes
If you simply want to add the column from your second DataFrame (ma5xdf) to the end of your first DataFrame (dfmas), you can do this:
ma5xdf['ma5x'] = ma5xdf['ma5x'].astype(float)
dfmas['ma5x'] = ma5xdf['ma5x']
That is a simple, direct solution.
However, I see that the date column is set as the index in dfmas, so here is another approach:
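# Keep the old index as a column (no drop=True) so that set_index('Date') below can restore it.
# This assumes the index of dfmas is named 'Date'.
dfmas.reset_index(inplace=True)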
ma5xdf['ma5x'] = ma5xdf['ma5x'].astype(float)
dfmas['ma5x'] = ma5xdf['ma5x']
dfmas = dfmas.set_index('Date')
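Column assignment from a Series also aligns on index labels, which is why the date index matters here. If the rows are known to be in the same order, one alternative (my suggestion, not part of the original answer) is to assign the raw values with to_numpy(), which leaves the index of dfmas untouched. The data below is hypothetical.

```python
import pandas as pd

# Hypothetical data: dfmas indexed by date, ma5xdf with a default 0..n-1 index.
dfmas = pd.DataFrame(
    {'close': [10.0, 11.0, 12.0]},
    index=pd.to_datetime(['2020-01-01', '2020-01-02', '2020-01-03']),
)
dfmas.index.name = 'Date'
ma5xdf = pd.DataFrame({'ma5x': ['1.5', '2.5', '3.5']})

# Assigning a Series aligns on labels; no date matches 0, 1, 2, so all NaN.
by_label = dfmas.copy()
by_label['ma5x'] = ma5xdf['ma5x'].astype(float)
print(by_label)

# Assigning a NumPy array is positional; lengths must match exactly.
by_position = dfmas.copy()
by_position['ma5x'] = ma5xdf['ma5x'].astype(float).to_numpy()
print(by_position)
```

The positional version is only safe when both frames have the same number of rows in the same order.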
How to complete NaN cells based on another Pandas dataframe in Python
You can drop the NaN values from df2, then update with concat and groupby:
pd.concat([df2.dropna(), df1]).groupby('id', as_index=False).first()
Output:
id col1 col2
0 1 13.0 23.0
1 2 14.0 24.0
2 3 150.0 250.0
3 4 NaN NaN
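The question's data is not shown here, so the following is only a sketch with hypothetical df1 and df2, picked to show how the expression behaves. GroupBy.first() returns the first non-null value per column in each group, so putting the cleaned df2 rows first lets them take priority over df1.

```python
import pandas as pd
import numpy as np

# Hypothetical inputs: df1 is missing values for ids 3 and 4,
# df2 has values for id 3 only.
df1 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'col1': [13.0, 14.0, np.nan, np.nan],
                    'col2': [23.0, 24.0, np.nan, np.nan]})
df2 = pd.DataFrame({'id': [3, 4],
                    'col1': [150.0, np.nan],
                    'col2': [250.0, np.nan]})

# df2.dropna() keeps only the id 3 row; first() then picks the first
# non-null value per id, falling back to df1 where df2 has nothing.
result = pd.concat([df2.dropna(), df1]).groupby('id', as_index=False).first()
print(result)
```

An id with no value in either frame, such as id 4 here, stays NaN.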