Converting Pandas Column of Comma-Separated Strings into Integers

I think your solution should actually be:

df['col3'] = df.col3.str.split(',').str.join('').astype(int)

   col1 col2   col3
0     1    x  12123
1     2    x   1123
2     3    y  45998

This is needed because str.strip only removes characters from the left and right ends of each string, so it cannot remove commas in the middle.

Explanation

  • str: exposes vectorized string methods on a Series
  • split: splits each string in the Series on a pattern (a comma in this case), producing a Series of lists
  • join: joins the elements of each list with the given delimiter, '' here because you want to create ints

Finally, .astype(int) converts each resulting string to an integer.
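As a minimal, self-contained sketch of the steps above: the sample data is an assumption chosen to match the output shown earlier, and the variable names are hypothetical. It also shows str.replace as an equivalent way to delete the commas in one step.

```python
import pandas as pd

# Hypothetical sample data, chosen to match the output shown above.
df = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": ["x", "x", "y"],
    "col3": ["12,123", "1,123", "45,998"],
})

# Split on commas, rejoin the pieces with no separator, then cast to int.
via_split_join = df["col3"].str.split(",").str.join("").astype(int)

# Equivalent alternative: delete the commas directly as a literal string.
via_replace = df["col3"].str.replace(",", "", regex=False).astype(int)

print(via_split_join.tolist())
```

Both routes should give the same integers; str.replace avoids building an intermediate Series of lists.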

How to convert comma-separated values to integers in pandas

One way is to go through the float type first:

df['no'].str.replace(',','').astype(float).astype(int)

Output:

0    1234450445
1    1234450446
2    1234450447
Name: no, dtype: int64

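Or remove the trailing '.00' from every row. Note that str.strip('.00') strips any run of '.' and '0' characters from both ends, so a value such as '1,230.00' would lose digits. An anchored regular expression removes only the literal suffix:

df['no'].str.replace(r'\.00$', '', regex=True).str.replace(',','').astype(int)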
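As a minimal sketch, assuming a hypothetical Series whose values end in '.00', the following compares the float route with removing the suffix through an anchored regular expression. It also records what str.strip('.00') returns when the integer part itself ends in a zero.

```python
import pandas as pd

# Hypothetical sample data; the second value ends in a zero before the decimals.
s = pd.Series(["1,234,450,445.00", "1,234,450,440.00"], name="no")

# Route 1: go through float, which parses the decimal part, then cast to int.
via_float = s.str.replace(",", "", regex=False).astype(float).astype(int)

# Route 2: remove only a literal trailing ".00" with an anchored regex.
via_regex = (
    s.str.replace(r"\.00$", "", regex=True)
     .str.replace(",", "", regex=False)
     .astype(int)
)

# str.strip(".00") treats its argument as a set of characters to strip,
# so it also removes the final zero of "...440".
stripped = s.str.strip(".00")

print(via_float.tolist())
print(stripped.tolist())
```

The float route is the simpler choice when the decimals are always zero; the regex route avoids a float round trip for very large values.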

How to convert comma-separated numbers from a DataFrame to numbers and get the average value

You can define a function that unpacks those values and returns their mean.

def get_mean(x):
    # Split into a list of strings
    parts = x.split(',')
    # Convert the strings to numbers
    values = [float(n) for n in parts]
    return sum(values) / len(values)

#Apply on desired column
df['col'] = df['col'].apply(get_mean)
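A vectorized alternative may also work here. This is a sketch under assumptions: the sample data and the col_mean column name are hypothetical, and it relies on pd.to_numeric turning missing entries into NaN so that rows of different lengths can be averaged.

```python
import pandas as pd

# Hypothetical sample data; the column name "col" follows the snippet above.
df = pd.DataFrame({"col": ["1,2,3", "10,20", "5"]})

# Split into one column per value; shorter rows are padded with missing values,
# which pd.to_numeric converts to NaN.
parts = df["col"].str.split(",", expand=True).apply(pd.to_numeric)

# mean(axis=1) skips NaN by default, so each row is averaged over its own values.
df["col_mean"] = parts.mean(axis=1)

print(df)
```

Writing the result to a new column keeps the original strings available, whereas the apply version above overwrites them.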

How to split comma separated strings in a column into different columns if they're not of same length using python or pandas in jupyter notebook

We can use a regular expression to find all the key-value pairs in each row of column_A, map each row's list of pairs to a dictionary to create a record, and then construct a DataFrame from these records:

pd.DataFrame(map(dict, df['column_A'].str.findall(r'\s*([^:,]+):\s*([^,]+)')))

        Garbage Organics          Recycle   Junk
0       Tissues     Milk       Cardboards    NaN
1  Paper Towels     Eggs            Glass  Feces
2          cups      NaN  Plastic bottles    NaN

Here is an alternative approach in case you don't want to use regular expressions:

df['column_A'].str.split(', ').explode()\
.str.split(': ', expand=True)\
.set_index(0, append=True)[1].unstack()
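To try both approaches end to end, here is a minimal sketch. The input strings are an assumption, reconstructed to be consistent with the output table above; the original question's data is not shown, so treat them as hypothetical.

```python
import pandas as pd

# Hypothetical input, reconstructed to be consistent with the output table above.
df = pd.DataFrame({
    "column_A": [
        "Garbage: Tissues, Organics: Milk, Recycle: Cardboards",
        "Garbage: Paper Towels, Organics: Eggs, Recycle: Glass, Junk: Feces",
        "Garbage: cups, Recycle: Plastic bottles",
    ]
})

# findall returns a list of (key, value) tuples per row;
# dict() turns each list into one record.
pairs = df["column_A"].str.findall(r"\s*([^:,]+):\s*([^,]+)")
wide_regex = pd.DataFrame([dict(p) for p in pairs])

# Non-regex route: one "key: value" item per row, split into two columns,
# then pivot the keys back into columns.
wide_split = (
    df["column_A"].str.split(", ").explode()
      .str.split(": ", expand=True)
      .set_index(0, append=True)[1]
      .unstack()
)

print(wide_regex)
print(wide_split)
```

The two results hold the same values, although unstack sorts the column labels while the regex version keeps them in order of first appearance. The non-regex route assumes the separators are exactly ', ' and ': '.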

Splitting two integer values in a cell separated by a comma using Pandas

df[['Price A', 'Price B']] = df['Price'].str.split(',', expand=True)

Outcome

       Price Price A Price B
0  79.9,99.9    79.9    99.9
1       59.9    59.9    None
2  49.9,89.9    49.9    89.9
3       59.9    59.9    None
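Note that the split produces strings, not numbers. As a sketch under assumptions (the sample data is hypothetical and mirrors the outcome above), the new columns can be converted with pd.to_numeric, which turns the missing entries into NaN:

```python
import pandas as pd

# Hypothetical sample data matching the outcome shown above.
df = pd.DataFrame({"Price": ["79.9,99.9", "59.9", "49.9,89.9", "59.9"]})

# Split into two string columns; rows with a single value get a missing second entry.
df[["Price A", "Price B"]] = df["Price"].str.split(",", expand=True)

# Convert both new columns to floats; missing entries become NaN.
df[["Price A", "Price B"]] = df[["Price A", "Price B"]].apply(pd.to_numeric)

print(df.dtypes)
```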

