Pandas Concat Generates Nan Values

pandas concat generates nan values

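I think the problem is that the two DataFrames have different index values. concat with axis=1 aligns rows by index label, so wherever a label exists in only one DataFrame, the other column gets NaN: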

aaa  = pd.DataFrame([0,1,0,1,0,0], columns=['prediction'], index=[4,5,8,7,10,12])
print(aaa)
    prediction
4            0
5            1
8            0
7            1
10           0
12           0

bbb  = pd.DataFrame([0,0,1,0,1,1], columns=['groundTruth'])
print(bbb)
   groundTruth
0            0
1            0
2            1
3            0
4            1
5            1

print (pd.concat([aaa, bbb], axis=1))
    prediction  groundTruth
0          NaN          0.0
1          NaN          0.0
2          NaN          1.0
3          NaN          0.0
4          0.0          1.0
5          1.0          1.0
7          1.0          NaN
8          0.0          NaN
10         0.0          NaN
12         0.0          NaN
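
One way to confirm that mismatched labels are the cause is to compare the two indexes before concatenating. This is a minimal sketch that rebuilds aaa and bbb from above; the names same_index, shared and combined are only for illustration.

```python
import pandas as pd

aaa = pd.DataFrame([0, 1, 0, 1, 0, 0], columns=['prediction'], index=[4, 5, 8, 7, 10, 12])
bbb = pd.DataFrame([0, 0, 1, 0, 1, 1], columns=['groundTruth'])

# False here means concat(axis=1) has to align rows by label
same_index = aaa.index.equals(bbb.index)

# Labels present in both frames: only these rows get a value in both columns
shared = aaa.index.intersection(bbb.index)

# axis=1 uses an outer join by default, so the result holds the union of labels
combined = pd.concat([aaa, bbb], axis=1)

print(same_index)
print(list(shared))
print(combined.shape)
```

If same_index is False and shared is shorter than either DataFrame, the NaN values come from alignment rather than from the data itself.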

If the index values are not needed, the solution is to call reset_index on both DataFrames:

aaa.reset_index(drop=True, inplace=True)
bbb.reset_index(drop=True, inplace=True)

print(aaa)
   prediction
0           0
1           1
2           0
3           1
4           0
5           0

print(bbb)
   groundTruth
0            0
1            0
2            1
3            0
4            1
5            1

print (pd.concat([aaa, bbb], axis=1))
   prediction  groundTruth
0           0            0
1           1            0
2           0            1
3           1            0
4           0            1
5           0            1

EDIT: If you need to keep the index of aaa and both DataFrames have the same length, use:

bbb.index = aaa.index
print (pd.concat([aaa, bbb], axis=1))
    prediction  groundTruth
4            0            0
5            1            0
8            0            1
7            1            0
10           0            1
12           0            1
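
If you would rather not overwrite the index of bbb in place, a possible variant is set_axis, which returns a relabelled copy. This is only a sketch under the same assumption as above, that both DataFrames have the same number of rows in the intended order; bbb_relabelled is a name chosen for illustration.

```python
import pandas as pd

aaa = pd.DataFrame([0, 1, 0, 1, 0, 0], columns=['prediction'], index=[4, 5, 8, 7, 10, 12])
bbb = pd.DataFrame([0, 0, 1, 0, 1, 1], columns=['groundTruth'])

# set_axis returns a new DataFrame, so bbb itself keeps its 0..5 index
bbb_relabelled = bbb.set_axis(aaa.index, axis=0)

# Both frames now share the same labels, so no NaN is introduced
combined = pd.concat([aaa, bbb_relabelled], axis=1)
print(combined)
```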

Pandas concat resulting in NaN rows?

I had the same problem when I generated the training and testing sets. This is my workaround, although I do not know why pd.concat does not work in this situation either:

l1=df.values.tolist()
l2=df_resolved.values.tolist()
for i in range(len(l1)):
    l1[i].extend(l2[i])

df=pd.DataFrame(l1,columns=df.columns.tolist()+df_resolved.columns.tolist())
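
A likely explanation is again index alignment: after a train/test split the two DataFrames often carry different row labels, while the list-based workaround pairs rows purely by position. The same positional pairing can probably be had from concat directly. In this sketch df and df_resolved are hypothetical stand-ins with made-up columns x and y.

```python
import pandas as pd

# Hypothetical stand-ins: same length, different row labels
df = pd.DataFrame({'x': [1, 2, 3]}, index=[10, 20, 30])
df_resolved = pd.DataFrame({'y': [7, 8, 9]}, index=[0, 1, 2])

# Dropping both indexes makes concat pair the rows by position
paired = pd.concat(
    [df.reset_index(drop=True), df_resolved.reset_index(drop=True)],
    axis=1,
)
print(paired)
```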

pandas concat two dataframes of different row size without nan values

Here's how I did it and I don't get any additional NaNs.

import pandas as pd
import numpy as np
df1 = pd.DataFrame({'a':[1,2,3,4,5,6],
                    'b':['a','b','c','d',np.nan,np.nan],
                    'c':['x',np.nan,np.nan,np.nan,'y','z']})
df2 = pd.DataFrame(np.random.randint(0,10,(3,3)), columns = list('abc'))
print (df1)
print (df2)
df = pd.concat([df1,df2]).reset_index(drop=True)
print (df)

The output of this is:

DF1:

   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z

DF2 (random values, so they differ on each run):

   a  b  c
0  4  8  4
1  8  4  4
2  2  8  1

DF: after concat

   a    b    c
0  1    a    x
1  2    b  NaN
2  3    c  NaN
3  4    d  NaN
4  5  NaN    y
5  6  NaN    z
6  4    8    4
7  8    4    4
8  2    8    1

Pandas concat producing NaN

I think you want concat(df, axis=1).
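
To illustrate the difference the axis argument makes, here is a minimal sketch; left and right are hypothetical DataFrames with different column names but the same default index.

```python
import pandas as pd

left = pd.DataFrame({'x': [1, 2]})
right = pd.DataFrame({'y': [3, 4]})

# axis=0 (the default) stacks rows; a column missing from one frame is filled with NaN
stacked = pd.concat([left, right])

# axis=1 places the frames side by side, matching rows by index label
side_by_side = pd.concat([left, right], axis=1)

print(stacked)
print(side_by_side)
```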

Why am I getting NaN values when I concat two panda dataframes

If you simply want to add the column from your second DataFrame (ma5xdf) to the end of your first DataFrame (dfmas), you can do this:

ma5xdf['ma5x'] = ma5xdf['ma5x'].astype(float)
dfmas['ma5x'] = ma5xdf['ma5x']

A simple and precise solution.

I see that the index of dfmas is set to the date column, so here is another approach:

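# Move the Date index into a regular column (drop=True would discard it)
dfmas.reset_index(inplace=True)
ma5xdf['ma5x'] = ma5xdf['ma5x'].astype(float)
# Rows now line up by position, assuming ma5xdf also has a default 0..n-1 index
dfmas['ma5x'] = ma5xdf['ma5x']
# Restore Date as the index (assumes the index was named 'Date')
dfmas = dfmas.set_index('Date')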
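
Column assignment aligns on index labels just like concat does, which is why the date index gets in the way. An alternative that leaves the index untouched is to assign the raw values. This is a sketch with hypothetical data for dfmas and ma5xdf (the close column and the dates are invented), and it assumes both DataFrames have the same number of rows in the same order.

```python
import pandas as pd

# Hypothetical data: dfmas is indexed by Date, ma5xdf has a default index
dfmas = pd.DataFrame(
    {'close': [10.0, 11.0, 12.0]},
    index=pd.Index(['2020-01-01', '2020-01-02', '2020-01-03'], name='Date'),
)
ma5xdf = pd.DataFrame({'ma5x': ['1.5', '2.5', '3.5']})

# Assigning a Series aligns on index labels; none match here, so all values are NaN
misaligned = dfmas.assign(ma5x=ma5xdf['ma5x'].astype(float))

# A NumPy array carries no index, so the values are assigned by position
dfmas['ma5x'] = ma5xdf['ma5x'].astype(float).to_numpy()
print(dfmas)
```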

How to complete NaN cells based on another Pandas dataframe in Python

You can drop the NaN values from df2, then update with concat and groupby:

pd.concat([df2.dropna(), df1]).groupby('id', as_index=False).first()

Output:

   id   col1   col2
0   1   13.0   23.0
1   2   14.0   24.0
2   3  150.0  250.0
3   4    NaN    NaN
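
The original df1 and df2 are not shown here, so the following is only a sketch with hypothetical inputs chosen to be consistent with the output above. It relies on groupby(...).first() returning the first non-null value per column within each group.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: df1 has gaps for id 4, df2 carries newer values for id 3
df1 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'col1': [13, 14, 15, np.nan],
                    'col2': [23, 24, 25, np.nan]})
df2 = pd.DataFrame({'id': [3, 4],
                    'col1': [150, np.nan],
                    'col2': [250, np.nan]})

# Complete df2 rows come first, so first() prefers them over the df1 rows
result = pd.concat([df2.dropna(), df1]).groupby('id', as_index=False).first()
print(result)
```

Rows that are missing in both DataFrames, such as id 4 in this made-up data, stay NaN.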
