Pandas does not fill nan values with empty string
Accessing with square brackets and a list of columns creates a copy, so you modify a temporary object, not the original dataframe.
You have three possible solutions: pass a dict mapping each column to its replacement, assign the result back, or loop over the columns.
Looping
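for col in (col_buyername, col_product):
    # Assign back rather than calling fillna(..., inplace=True) on df[col]:
    # in recent pandas the chained in-place call may act on a temporary object.
    df[col] = df[col].fillna('')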
Assignment
df[[col_buyername, col_product]] = df[[col_buyername, col_product]].fillna('')
dict
df.fillna({col_buyername: '', col_product: ''}, inplace=True)
The loop and the dict approach should be a little more efficient than the reassignment.
For more info on when pandas creates copies and when it does not, see https://stackoverflow.com/a/53954986/3838691
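To make the copy behaviour concrete, here is a minimal sketch. The data and the column names ("buyer", "product", "price") are hypothetical placeholders, not taken from the question. It shows that filling a list-of-columns selection in place leaves the original dataframe untouched, while the dict form fills only the listed columns.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are placeholders.
col_buyername, col_product = "buyer", "product"
df = pd.DataFrame({
    col_buyername: ["Ann", np.nan],
    col_product: [np.nan, "pen"],
    "price": [1.0, np.nan],
})

# Selecting a list of columns returns a new DataFrame, so the in-place fill
# only touches that temporary object (pandas may also emit a warning here).
df[[col_buyername, col_product]].fillna("", inplace=True)
nan_count_after_copy_fill = int(df[[col_buyername, col_product]].isna().sum().sum())

# The dict form works on df itself and only on the listed columns.
df = df.fillna({col_buyername: "", col_product: ""})
nan_count_after_dict_fill = int(df[[col_buyername, col_product]].isna().sum().sum())
```

The "price" column is deliberately left out of the dict, so its missing value stays as NaN.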
How to replace empty strings in a dataframe with NA (missing value) not NA string
Specifying just NA gives a logical value. According to ?NA: "NA is a logical constant of length 1 which contains a missing value."
The class can be checked:
class(NA)
#[1] "logical"
class(NA_character_)
#[1] "character"
Both of them are identified by standard functions such as is.na:
is.na(NA)
#[1] TRUE
is.na(NA_character_)
#[1] TRUE
if_else is type-sensitive, so instead of specifying NA, which is logical, the replacement can be specified as NA_real_, NA_integer_, or NA_character_, depending on the type of the 'boat' column. Assuming that 'boat' is of class character, we need NA_character_:
library(dplyr)
titanic %>%
  mutate(boat = if_else(boat == "", NA_character_, boat))
How to replace None only with empty string using pandas?
It looks like None is being promoted to NaN, so you cannot use replace as usual. The following works:
In [126]:
# applymap is deprecated since pandas 2.1; DataFrame.map is the newer name
mask = df.applymap(lambda x: x is None)
cols = df.columns[mask.any()]
for col in cols:
    df.loc[mask[col], col] = ''
df
Out[126]:
A B C D E
0 A 2014-01-02 02:00:00 A 1
1 B 2014-01-02 03:00:00 B B 2
2 2014-01-02 04:00:00 C C NaN
3 C NaT C 4
So we generate a mask of the None values using applymap, then iterate over each column of interest and use the boolean mask to set the values.
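The same masking idea can be sketched as a self-contained example. The data here is hypothetical, and two choices are assumptions rather than part of the answer above: the string column is created with dtype=object so that None and NaN stay distinct, and the mask is built per column with Series.map instead of applymap so the snippet does not depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical data; dtype=object keeps None and NaN as distinct values.
df = pd.DataFrame({
    "A": pd.Series(["A", None, np.nan], dtype=object),
    "E": pd.Series([1.0, np.nan, 3.0]),
})

# Element-wise identity test against None, built column by column.
mask = df.apply(lambda s: s.map(lambda x: x is None))

# Only visit columns that actually contain a None.
for col in df.columns[mask.any()]:
    df.loc[mask[col], col] = ""

values_a = df["A"].tolist()
```

Only the None in column "A" is replaced; the NaN values in both columns are left alone, which is the point of testing with `is None` rather than `isna`.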
Replace null with empty string when writing Spark dataframe
You can use when and otherwise.
from pyspark.sql import functions as F
df.show()
#InputDF
# +-------------+----------+
# |UNIQUE_MEM_ID| DATE|
# +-------------+----------+
# | 1156| null|
# | 3787|2016-07-05|
# | 1156| null|
# +-------------+----------+
df.withColumn("DATE", F.when(F.col("DATE").isNull(), '').otherwise(F.col("DATE"))).show()
#OUTPUTDF
# +-------------+----------+
# |UNIQUE_MEM_ID| DATE|
# +-------------+----------+
# | 1156| |
# | 3787|2016-07-05|
# | 1156| |
# +-------------+----------+
To apply the above logic to all columns of the dataframe, you can loop over the columns and fill in an empty string wherever the column value is null.
df.select( *[ F.when(F.col(column).isNull(),'').otherwise(F.col(column)).alias(column) for column in df.columns]).show()