Pandas Make New Column from String Slice of Another Column

Pandas make new column from string slice of another column

You can use the .str accessor and apply a slice. This is vectorised, so it will be much quicker than the apply approach shown further down (thanks @unutbu):

df['New_Sample'] = df.Sample.str[:1]

You can also apply a lambda function to the column, but this will be slower on larger dataframes:

In [187]:
df['New_Sample'] = df.Sample.apply(lambda x: x[:1])
df

Out[187]:
  Sample  Value New_Sample
0    AAB     23          A
1    BAB     25          B
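Apart from speed, the two approaches differ in how they treat missing values. The sketch below uses a made-up frame with one missing Sample (the third row is an assumption, not part of the original data) to show that the .str slice passes the missing value through, while the lambda version raises.

```python
import pandas as pd

# Hypothetical data: the original two rows plus one row with a missing Sample
df = pd.DataFrame({'Sample': ['AAB', 'BAB', None], 'Value': [23, 25, 27]})

# Vectorised slice: missing values stay missing
df['New_Sample'] = df['Sample'].str[:1]

# The lambda version tries to slice the missing value itself and fails
try:
    df['Sample'].apply(lambda x: x[:1])
    apply_failed = False
except TypeError:
    apply_failed = True

print(df)
print('apply raised TypeError:', apply_failed)
```

If you do need apply, guarding the lambda (for example with `x[:1] if isinstance(x, str) else x`) is one way around this.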

Create column from a substring of another column

You can use pd.Series.str.split for this:

df[['want1', 'want2', 'want3']] = df['variable'].str.split(' - ', expand=True)
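As a quick illustration, here is that line run on a small made-up frame. The column names come from the answer; the values are assumptions chosen so that every row splits into exactly three pieces.

```python
import pandas as pd

# Hypothetical input: each value holds three parts separated by ' - '
df = pd.DataFrame({'variable': ['a - b - c', 'd - e - f']})

# expand=True returns one column per piece, which is assigned to three new columns
df[['want1', 'want2', 'want3']] = df['variable'].str.split(' - ', expand=True)

print(df)
```

Note that the assignment expects the split to produce exactly as many columns as there are names on the left, so rows with a different number of separators need to be handled first.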

Pandas: creating a new column conditional on substring searches of one column and inverse of another column

You don't want to index into df here. Indexing with the mask returns a filtered DataFrame, whereas the condition needs the boolean mask itself.

Change: (df[~df['Color'].str.contains('Red', na=False)])

to: ~df['Color'].str.contains('Red', na=False)

and it should work.
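The difference between the two expressions is easy to see on a small frame. This is a minimal sketch with made-up Color values; only the column name and the contains call come from the answer.

```python
import pandas as pd

# Hypothetical data, including a missing colour
df = pd.DataFrame({'Color': ['Red', 'Dark Red', 'Blue', None]})

# The mask: one boolean per row, same length as df
not_red = ~df['Color'].str.contains('Red', na=False)

# Indexing with the mask: a shorter DataFrame, not a mask
filtered = df[not_red]

print(not_red.tolist())
print(len(df), len(filtered))
```

Only the first form can be combined with other row-wise conditions using & and |, because it still lines up with every row of df.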

Also, if you want to break this up for readability and to eliminate some repetition, I would suggest something like this:

# define the parameters that determine the Country variable in another table
df_countries = pd.DataFrame(
    {'letters': ['ABC', 'DEF', 'ABC'],
     'is_red': [True, False, False],
     'Country': ['United States', 'Canada', 'England']})

# add those identifying parameters to your current table as temporary columns
# (regex=True is needed: recent pandas treats the pattern as a literal string by default)
df['letters'] = df.Manufacturer.str.replace('-.*', '', regex=True)
df['is_red'] = df.Color.str.contains('Red', na=False)

# merge the tables together and drop the temporary key columns
df = df.merge(df_countries, how='left', on=['letters', 'is_red'])
df = df.drop(columns=['letters', 'is_red'])
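To make the lookup-table idea concrete, here is a self-contained run of it. The Manufacturer and Color values are invented for illustration (the original question's data is not shown here), and the pattern is passed with regex=True so that it is interpreted as a regular expression.

```python
import pandas as pd

# Hypothetical input data
df = pd.DataFrame({
    'Manufacturer': ['ABC-001', 'DEF-002', 'ABC-003', 'XYZ-004'],
    'Color': ['Red', 'Blue', None, 'Red'],
})

# Lookup table from the answer
df_countries = pd.DataFrame({
    'letters': ['ABC', 'DEF', 'ABC'],
    'is_red': [True, False, False],
    'Country': ['United States', 'Canada', 'England'],
})

# Temporary key columns: manufacturer prefix and a red / not-red flag
df['letters'] = df['Manufacturer'].str.replace('-.*', '', regex=True)
df['is_red'] = df['Color'].str.contains('Red', na=False)

# Left merge keeps every original row; unmatched keys get a missing Country
df = df.merge(df_countries, how='left', on=['letters', 'is_red'])
df = df.drop(columns=['letters', 'is_red'])

print(df)
```

Rows whose key combination is not in the lookup table (the XYZ row here) end up with a missing Country, which makes gaps in the rules easy to spot.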

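Or, more concisely:

in_col = lambda col, string: df[col].str.contains(string, na=False)

conds = {'United States': in_col('Manufacturer', 'ABC') & in_col('Color', 'Red'),
         'Canada': in_col('Manufacturer', 'DEF'),
         'England': in_col('Manufacturer', 'ABC') & ~in_col('Color', 'Red')}

# pass lists rather than dict views, and give an explicit string default:
# the implicit default of 0 does not mix with string choices on recent NumPy.
# 'Unknown' is only a placeholder for rows that match no condition.
df['Country'] = np.select(condlist=list(conds.values()), choicelist=list(conds.keys()), default='Unknown')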
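Here is a runnable sketch of the np.select variant on invented data. Two things in it are assumptions rather than part of the answer: the sample rows, and the 'Unknown' default, which is passed explicitly because a string choicelist combined with the implicit integer default may be rejected by recent NumPy versions.

```python
import numpy as np
import pandas as pd

# Hypothetical input data
df = pd.DataFrame({
    'Manufacturer': ['ABC-001', 'DEF-002', 'ABC-003', 'DEF-004', 'XYZ-005'],
    'Color': ['Red', 'Blue', None, 'Red', 'Red'],
})

def in_col(col, string):
    """Boolean mask: does df[col] contain the substring? Missing values count as False."""
    return df[col].str.contains(string, na=False)

conds = {
    'United States': in_col('Manufacturer', 'ABC') & in_col('Color', 'Red'),
    'Canada': in_col('Manufacturer', 'DEF'),
    'England': in_col('Manufacturer', 'ABC') & ~in_col('Color', 'Red'),
}

# The first matching condition wins; 'Unknown' is a placeholder for no match
df['Country'] = np.select(list(conds.values()), list(conds.keys()), default='Unknown')

print(df)
```

With these rules a red DEF row is labelled Canada, since the Canada condition ignores colour. A lookup table that only lists DEF with is_red False would leave such a row unmatched, so the two approaches are not strictly equivalent.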

Add new column to pandas data frame based on string + value from another column in the data frame

Use:

df['axis'] = 'up to ' + df['end'].astype(str)
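A minimal example of that line, assuming a numeric end column (the values are made up). The astype(str) cast is what allows the numbers to be concatenated with the prefix.

```python
import pandas as pd

# Hypothetical data: 'end' holds integers
df = pd.DataFrame({'end': [5, 10]})

# Cast to string first, then prepend the fixed text
df['axis'] = 'up to ' + df['end'].astype(str)

print(df)
```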

Pandas make new column from substring slice based on the number in a substring of another column

The following should work. It keeps only the digit characters of each string and converts the result to an integer:

table['NUMBER'] = table.STRING.apply(lambda x: int(''.join(filter(str.isdigit, x))))
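The sketch below runs that line on made-up strings and, next to it, shows a vectorised alternative based on str.extract. The alternative is a swapped-in technique, not the answer's method, and it is not identical: it takes the first run of digits, whereas the filter version joins every digit in the string.

```python
import pandas as pd

# Hypothetical input strings, each containing a single number
table = pd.DataFrame({'STRING': ['abc123', 'x45y', 'item 7']})

# The answer's approach: keep digit characters, join them, convert to int
table['NUMBER'] = table['STRING'].apply(lambda x: int(''.join(filter(str.isdigit, x))))

# Alternative (assumption: one number per string): extract the first run of digits
table['NUMBER_EXTRACT'] = pd.to_numeric(table['STRING'].str.extract(r'(\d+)', expand=False))

print(table)
```

The apply version raises a ValueError for a string with no digits at all, while the extract version yields a missing value for it.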

How to create column in pandas with part of another column?

Try:

segura['NEW_ID'] = segura['cnpj'].str[:8]
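The slice above assumes cnpj is stored as text. If the column was read as integers, the .str accessor is not available and leading zeros are lost, so one possible workaround is to cast and left-pad to the 14 digits of a CNPJ before slicing. The values below are made up, and whether your column needs this step is an assumption to check against your data.

```python
import pandas as pd

# Hypothetical data: CNPJ numbers read as integers, so leading zeros were dropped
segura = pd.DataFrame({'cnpj': [191000119, 33000167000101]})

# Cast to string, restore the 14-digit width, then take the first 8 characters
segura['NEW_ID'] = segura['cnpj'].astype(str).str.zfill(14).str[:8]

print(segura)
```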

Make new column from slice of string from one column pandas python

I think the first column is the index, so use .index. Also, for the year, change the 4:5 slice to 3:5. The 0 in 0:3 can be omitted:

df['month'] = df.index.str[:3]
df['year'] = df.index.str[3:5]
print(df)
             Noodles  FaceCream  BodyWash  Powder  Soap month year
SKU
Jan10_Sales      122        100        50     200   300   Jan   10
Feb10_Sales      100         50        80      90   250   Feb   10
Mar10_sales       40         30       100      10    11   Mar   10
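For a version you can run directly, the sketch below rebuilds a cut-down frame with the same SKU index (only two of the product columns are kept, to stay short) and applies the same two slices.

```python
import pandas as pd

# Reduced version of the frame above, with SKU as the index
df = pd.DataFrame(
    {'Noodles': [122, 100, 40], 'Soap': [300, 250, 11]},
    index=pd.Index(['Jan10_Sales', 'Feb10_Sales', 'Mar10_sales'], name='SKU'),
)

# .str works on a string index just as it does on a string column
df['month'] = df.index.str[:3]
df['year'] = df.index.str[3:5]

print(df)
```

Both new columns hold strings, so year would need a cast if you want to do arithmetic with it.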

Pandas DataFrame: use column value to slice string in another column

Use apply with axis=1, because each row has to be processed separately:

my_df['new_col'] = my_df.apply(lambda x: x['col3'][x['col1'] - 1:x['col2']], axis=1)
print(my_df)
   col1  col2      col3 new_col
0     1     3   ABCDEFG     ABC
1     1     5  HIJKLMNO   HIJKL
2     1     2   PQRSTUV      PQ
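The frame below is rebuilt from the printed output so the row-wise slice can be run as is. It also shows a list comprehension over zip as an alternative, which is a different technique from the answer's apply but is often quicker on larger frames because it avoids building a Series per row.

```python
import pandas as pd

# Data reconstructed from the output shown above
my_df = pd.DataFrame({
    'col1': [1, 1, 1],
    'col2': [3, 5, 2],
    'col3': ['ABCDEFG', 'HIJKLMNO', 'PQRSTUV'],
})

# Row-wise apply: col1 is a 1-based start, col2 an inclusive end
my_df['new_col'] = my_df.apply(lambda x: x['col3'][x['col1'] - 1:x['col2']], axis=1)

# Alternative: iterate the three columns in parallel
my_df['new_col_zip'] = [
    s[start - 1:end]
    for s, start, end in zip(my_df['col3'], my_df['col1'], my_df['col2'])
]

print(my_df)
```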

