Replacing Blank Values (White Space) With NaN in Pandas

How do I replace a long run of white space with NaN in a column of object dtype in Python?

Use Series.str.contains with a regex to filter out the rows whose value is empty or contains only white space:

FGS_data[~FGS_data['Ins ISIN code'].str.contains(r'^\s*$')]


    Ins ISIN code  Quantity
51   JP3970300004   30500.0
143  JP3970300004   37000.0
176  JP3982800009   11500.0

Regex details:

  • ^ : Asserts position at the start of the line
  • \s* : Matches a whitespace character zero or more times
  • $ : Asserts position at the end of the line
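
The filter above drops the blank rows rather than replacing them. If the goal is to keep the rows and turn the blank values into NaN, the same regex can be passed to Series.replace. This is a minimal sketch, not the original poster's data: the two ISIN codes come from the output above, while the blank rows and their quantities are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two real-looking codes plus a whitespace-only
# value and an empty string (the blank rows are invented for this sketch).
FGS_data = pd.DataFrame({
    "Ins ISIN code": ["JP3970300004", "        ", "", "JP3982800009"],
    "Quantity": [30500.0, 1.0, 2.0, 11500.0],
})

cleaned = FGS_data.copy()
# ^\s*$ matches strings that are empty or consist only of white space;
# regex=True makes replace() treat the pattern as a regular expression.
cleaned["Ins ISIN code"] = cleaned["Ins ISIN code"].replace(
    r"^\s*$", np.nan, regex=True
)

print(cleaned)
```

After this, cleaned['Ins ISIN code'].isna() marks the formerly blank rows, so dropna or fillna can be applied as needed.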

Replacing empty strings with NaN in Pandas

If you are reading a CSV file and want fields that contain only white space to be read as NaN while the file is loaded, you can use the option

skipinitialspace=True

Example code

pd.read_csv('Sample.csv', skipinitialspace=True)

This removes any white space that appears directly after a delimiter, so fields that contain only white space become empty and are read as NaN.

From the documentation (http://pandas.pydata.org/pandas-docs/stable/io.html): skipinitialspace defaults to False; when set to True, spaces after the delimiter are skipped.

Note: This option also removes leading white space from valid data. If you need to keep leading white space for any reason, this option is not a good choice.
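
To see both effects side by side, here is a small sketch that reads the same CSV text with and without the option. It uses an in-memory string instead of 'Sample.csv', and the column names and values are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV: row 1 has a whitespace-only "code" field,
# row 2 has a valid value preceded by a single space.
csv_text = "id,code,qty\n1,   ,10\n2, JP3970300004,20\n"

# Default behaviour: white space after the delimiter is kept as data.
raw = pd.read_csv(io.StringIO(csv_text))

# With skipinitialspace=True, white space after each delimiter is skipped,
# so the whitespace-only field becomes empty and is parsed as NaN.
skipped = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(raw["code"].tolist())
print(skipped["code"].tolist())
```

The second read also strips the leading space from the valid code in row 2, which is the side effect the note above warns about.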

Replacing NaN values in a Pandas data frame with lists

You have to handle the three cases (empty string, NaN, NaN in list) separately.

For the NaN in list you need to loop over each occurrence and replace the elements one by one.

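NB. applymap is slow, so if you know in advance which columns contain lists, you can apply it to that subset only. In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.

For empty strings, replace them with NaN, then use fillna.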

sub = 'X'
(df.applymap(lambda x: [sub if (pd.isna(e) or e == '') else e for e in x]
                       if isinstance(x, list) else x)
   .replace('', float('nan'))
   .fillna(sub)
)

Output:

  col1  col2       col3    col4
0    X  Jhon  [X, 1, 2]  [k, j]
1  1.0     X  [1, 1, 5]       3
2  2.0     X          X       X
3  3.0  Samy  [1, 1, X]  [b, X]

Used input:

import pandas as pd
from numpy import nan

df = pd.DataFrame({'col1': {0: nan, 1: 1.0, 2: 2.0, 3: 3.0},
                   'col2': {0: 'Jhon', 1: nan, 2: '', 3: 'Samy'},
                   'col3': {0: [nan, 1, 2], 1: [1, 1, 5], 2: nan, 3: [1, 1, nan]},
                   'col4': {0: ['k', 'j'], 1: '3', 2: nan, 3: ['b', '']}})
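
The remark about subsetting columns can be sketched as follows. This is one possible variant, not the answer's exact code: it assumes you already know that only col3 and col4 hold lists, and it uses Series.map on those columns instead of an element-wise pass over the whole frame. The names list_cols and fill_list are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": {0: np.nan, 1: 1.0, 2: 2.0, 3: 3.0},
    "col2": {0: "Jhon", 1: np.nan, 2: "", 3: "Samy"},
    "col3": {0: [np.nan, 1, 2], 1: [1, 1, 5], 2: np.nan, 3: [1, 1, np.nan]},
    "col4": {0: ["k", "j"], 1: "3", 2: np.nan, 3: ["b", ""]},
})

sub = "X"
list_cols = ["col3", "col4"]  # assumption: the list-holding columns are known


def fill_list(x):
    """Replace NaN or '' inside a list; leave non-list values untouched."""
    if isinstance(x, list):
        return [sub if (pd.isna(e) or e == "") else e for e in x]
    return x


out = df.copy()
for col in list_cols:
    # Only the list columns need the slow element-wise pass.
    out[col] = out[col].map(fill_list)

# Top-level empty strings become NaN, then every NaN is filled with sub.
out = out.replace("", np.nan).fillna(sub)

print(out)
```

Restricting the element-wise step to the list columns leaves col1 and col2 to the vectorized replace and fillna calls.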

