How to remove NaN values while combining two columns in a pandas DataFrame?
You can use combine_first or fillna:
print(df['feedback_id'].combine_first(df['_id']))
0 568a8c25cac4991645c287ac
1 568df45b177e30c6487d3603
2 568df434832b090048f34974
3 568cd22e9e82dfc166d7dff1
4 568df3f0832b090048f34711
5 568e5a38b4a797c664143dda
Name: feedback_id, dtype: object
print(df['feedback_id'].fillna(df['_id']))
0 568a8c25cac4991645c287ac
1 568df45b177e30c6487d3603
2 568df434832b090048f34974
3 568cd22e9e82dfc166d7dff1
4 568df3f0832b090048f34711
5 568e5a38b4a797c664143dda
Name: feedback_id, dtype: object
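As a minimal, self-contained sketch of the two calls above: the column names follow the question, but the short IDs are made-up sample data. Both calls take values from feedback_id and fall back to _id only where feedback_id is missing. A row missing in both columns stays missing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the IDs are invented for illustration.
df = pd.DataFrame({
    "feedback_id": ["a1", np.nan, "c3", np.nan],
    "_id": [np.nan, "b2", "x9", np.nan],
})

# Prefer feedback_id; use _id only where feedback_id is NaN.
combined = df["feedback_id"].combine_first(df["_id"])
filled = df["feedback_id"].fillna(df["_id"])

print(combined)
print(filled)
```

Note that row 2 keeps "c3" from feedback_id even though _id also has a value, because the calling Series takes priority.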
Combine multiple columns in Pandas excluding NaNs
You can apply ",".join() on each row by passing axis=1 to the apply method. You first need to drop the NaNs, though; otherwise you will get a TypeError.
df.apply(lambda x: ','.join(x.dropna()), axis=1)
Out:
0 a,d,f
1 e
2 c,b,g
dtype: object
You can assign this back to the original DataFrame with
df["keywords_all"] = df.apply(lambda x: ','.join(x.dropna()), axis=1)
Or if you want to specify columns as you did in the question:
cols = ['keywords_0', 'keywords_1', 'keywords_2', 'keywords_3']
df["keywords_all"] = df[cols].apply(lambda x: ','.join(x.dropna()), axis=1)
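The following sketch puts the pieces together on a small frame. The column names come from the answer above, but the values are assumed sample data chosen to reproduce the output shown. The astype(str) call is an extra safeguard that is not in the original answer; it only matters if a column holds non-string values.

```python
import numpy as np
import pandas as pd

# Assumed sample data, arranged to match the output shown above.
df = pd.DataFrame({
    "keywords_0": ["a", np.nan, "c"],
    "keywords_1": ["d", "e", np.nan],
    "keywords_2": [np.nan, np.nan, "b"],
    "keywords_3": ["f", np.nan, "g"],
})

cols = ["keywords_0", "keywords_1", "keywords_2", "keywords_3"]

# Drop the NaNs in each row, then join what is left with commas.
df["keywords_all"] = df[cols].apply(
    lambda row: ",".join(row.dropna().astype(str)), axis=1
)

print(df["keywords_all"])
```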
pandas combine two columns with null values
Use fillna on one column, with the other column as the fill values:
df['foodstuff'].fillna(df['type'])
The resulting output:
0 apple-martini
1 apple-pie
2 strawberry-tart
3 dessert
4 None
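A small sketch of why the last row above is still empty: fillna can only fill from the other column where that column has a value. The data below is assumed, arranged to match the output shown. The second fillna with the placeholder "unknown" is an optional addition, not part of the original answer.

```python
import pandas as pd

# Assumed sample data; row 4 is missing in both columns.
df = pd.DataFrame({
    "foodstuff": ["apple-martini", "apple-pie", None, None, None],
    "type": [None, None, "strawberry-tart", "dessert", None],
})

# Fill gaps in foodstuff from type; row 4 stays missing.
merged = df["foodstuff"].fillna(df["type"])

# Optional: chain a second fillna to supply a placeholder (hypothetical value).
merged_with_default = merged.fillna("unknown")

print(merged)
print(merged_with_default)
```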
Concatenate two columns in pandas with NaN
The idea is to prepend _ to the second column first. Rows where colB is missing stay NaN after that step and are then replaced by an empty string, so no _ is added for missing values:
df['colC'] = df['colA'] + ('_' + df['colB']).fillna('')
print (df)
ID colA colB colC
0 ID1 A D A_D
1 ID2 B NaN B
2 ID3 C E C_E
If you are not sure which column contains the missing values (colA or colB):
df['colC'] = (df['colA'].fillna('') + '_' + df['colB'].fillna('')).str.strip('_')
It is also possible to test each column separately:
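# Masks for missing values in each column
m1 = df['colA'].isna()
m2 = df['colB'].isna()
# np.select uses the first matching condition, so test "both missing" first
df['colC'] = np.select([m1 & m2, m1, m2],
                       [np.nan, df['colB'], df['colA']],
                       default=df['colA'] + '_' + df['colB'])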
print (df)
ID colA colB colC
0 ID1 A D A_D
1 ID2 B NaN B
2 ID3 NaN E E
3 ID4 NaN NaN NaN
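To compare the first two approaches side by side, here is a runnable sketch on the four-row frame shown above. It highlights two differences: the first approach returns NaN whenever colA is missing, and the strip-based approach returns an empty string (not NaN) when both columns are missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": ["ID1", "ID2", "ID3", "ID4"],
    "colA": ["A", "B", np.nan, np.nan],
    "colB": ["D", np.nan, "E", np.nan],
})

# Approach 1: only safe when colA is never missing.
first = df["colA"] + ("_" + df["colB"]).fillna("")

# Approach 2: handles a missing value in either column.
second = (df["colA"].fillna("") + "_" + df["colB"].fillna("")).str.strip("_")

print(first)
print(second)
```

If a both-missing row should be NaN rather than an empty string, the np.select variant above is the one that preserves it.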
Python - Drop row if two columns are NaN
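Either of the following works. The first drops a row only when every column in subset is NaN; the second keeps rows that have at least one non-NaN value in subset: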
df.dropna(subset=[1, 2], how='all')
or
df.dropna(subset=[1, 2], thresh=1)
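A quick sketch to check that the two calls agree. The frame is assumed sample data with integer column labels 0, 1, 2, so that subset=[1, 2] refers to real columns. Only the last row is NaN in both columns 1 and 2, so only that row is dropped.

```python
import numpy as np
import pandas as pd

# Assumed sample data with integer column labels.
df = pd.DataFrame({
    0: ["x", "y", "z"],
    1: [1.0, np.nan, np.nan],
    2: [np.nan, 2.0, np.nan],
})

# Drop rows where columns 1 and 2 are both NaN.
by_how = df.dropna(subset=[1, 2], how="all")

# Keep rows with at least one non-NaN value among columns 1 and 2.
by_thresh = df.dropna(subset=[1, 2], thresh=1)

print(by_how)
print(by_thresh)
```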
pandas combine two strings ignore nan values
Call fillna and pass an empty str as the fill value, then sum with param axis=1:
In [3]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': ['asd', np.nan, 'asdsa'], 'b': ['asdas', 'asdas', np.nan]})
df
Out[3]:
a b
0 asd asdas
1 NaN asdas
2 asdsa NaN
In [7]:
df['a+b'] = df.fillna('').sum(axis=1)
df
Out[7]:
a b a+b
0 asd asdas asdasdas
1 NaN asdas asdas
2 asdsa NaN asdsa
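As an alternative sketch (a different technique from the fillna/sum answer above), Series.str.cat with na_rep='' concatenates two string columns and treats missing values as empty strings. It uses the same sample frame as above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": ["asd", np.nan, "asdsa"],
    "b": ["asdas", "asdas", np.nan],
})

# na_rep="" substitutes an empty string for NaN on either side.
df["a+b"] = df["a"].str.cat(df["b"], na_rep="")

print(df)
```

This keeps the operation explicitly string-based, which may be easier to read than summing across a row.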