Quickest Way to Make a Get_Dummies Type Dataframe from a Column with a Multiple of Strings

You can use:

>>> df['col2'].str.get_dummies(sep=',')
   A  B  C  G
0  1  1  0  0
1  1  0  1  1
2  0  1  0  0

To join the result to the original DataFrame:

>>> pd.concat([df, df['col2'].str.get_dummies(sep=',')], axis=1)
   col1   col2  A  B  C  G
0     6    A,B  1  1  0  0
1    15  C,G,A  1  0  1  1
2    25      B  0  1  0  0
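
For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame is rebuilt from the output shown above, and the names dummies and result are just illustrative.

```python
import pandas as pd

# Sample data rebuilt from the output shown above.
df = pd.DataFrame({"col1": [6, 15, 25], "col2": ["A,B", "C,G,A", "B"]})

# One indicator column per distinct token found between the commas.
dummies = df["col2"].str.get_dummies(sep=",")

# Append the indicator columns to the original frame (aligned on the index).
result = pd.concat([df, dummies], axis=1)
print(result)
```

The indicator columns come back in sorted order, so the layout does not depend on which token happens to appear first in the data.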

Pandas: split string in column into columns with 0/1; get dummies on all characters of a string

You can use str.get_dummies with an empty separator to get all letters:

df['Code'].str.get_dummies(sep='')

Joining to the original data:

df2 = df.drop('Code', axis=1).join(df['Code'].str.get_dummies(sep=''))

Output:

  other columns  A  B  C  R
0           ...  1  1  1  0
1           ...  1  1  1  0
2           ...  0  0  0  1
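
If you would rather not rely on how an empty separator is handled (I have not checked that on every pandas version), a different route is to insert an explicit separator between the characters first. This is only a sketch: the Code values and the "other" column are made up, since the question's data is not shown here, and it assumes "|" never occurs in a code.

```python
import pandas as pd

# Hypothetical data; the real 'Code' values are not shown in the answer.
df = pd.DataFrame({"other": [10, 20, 30], "Code": ["ABC", "CAB", "R"]})

# Put an explicit "|" between the characters ("ABC" -> "A|B|C"),
# then split on it ("|" is the default separator of str.get_dummies).
letters = df["Code"].apply(lambda s: "|".join(s)).str.get_dummies()

# Replace the original column with the per-character indicators.
df2 = df.drop("Code", axis=1).join(letters)
print(df2)
```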

Apply pd.get_dummies() on string type columns of pandas dataframe?

After some investigation, I still can't tell why this happens, especially since it works for single columns. My guess is that it's a bug: quite a few bugs seem to be centered around the pd.NA type (which convert_dtypes exists to support).

I recommend opening a bug report at https://github.com/pandas-dev/pandas/issues.

Running get_dummies on several DataFrame columns?

As of pandas 0.19, you can do this in a single line:

pd.get_dummies(data=df, columns=['A', 'B'])

The columns parameter specifies which columns to one-hot encode.

>>> df
   A  B  C
0  a  c  1
1  b  c  2
2  a  b  3

>>> pd.get_dummies(data=df, columns=['A', 'B'])
   C  A_a  A_b  B_b  B_c
0  1  1.0  0.0  0.0  1.0
1  2  0.0  1.0  0.0  1.0
2  3  1.0  0.0  1.0  0.0
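
Here is a self-contained version of the same call, as a sketch. The dtype of the indicator columns has changed between pandas releases, so the floats above may not be what you see. Passing dtype=int is my addition (not part of the original one-liner) to get plain 0/1 integers.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "a"], "B": ["c", "c", "b"], "C": [1, 2, 3]})

# Encode only A and B; C passes through unchanged and comes first.
# dtype=int is an addition here so the indicators are 0/1 integers
# regardless of the default dtype of the installed pandas version.
encoded = pd.get_dummies(data=df, columns=["A", "B"], dtype=int)
print(encoded)
```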

Create dummies from column with multiple values in pandas

I know it's been a while since this question was asked, but there is (at least now there is) a one-liner that is supported by the documentation:

In [4]: df
Out[4]:
       label
0  (a, c, e)
1     (a, d)
2       (b,)
3     (d, e)

In [5]: df['label'].str.join(sep='*').str.get_dummies(sep='*')
Out[5]:
   a  b  c  d  e
0  1  0  1  0  1
1  1  0  0  1  0
2  0  1  0  0  0
3  0  0  0  1  1
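
The same thing as a runnable sketch, with the frame rebuilt from the output above. The one thing to watch is the separator: it has to be a character that never appears inside a label, otherwise that label is split in two. The variable name sep is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"label": [("a", "c", "e"), ("a", "d"), ("b",), ("d", "e")]})

sep = "*"  # assumption: "*" does not occur inside any label

# Step 1: ("a", "c", "e") -> "a*c*e"
joined = df["label"].str.join(sep)

# Step 2: split on the same separator and build one 0/1 column per label.
dummies = joined.str.get_dummies(sep=sep)
print(dummies)
```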

If the name of the column is in string return 1 in Python

How about str.get_dummies?

df = pd.concat([df, df['LanguageHaveWorkedWith'].str.get_dummies(sep=';')], axis=1)
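
A small sketch of what that does, including a respondent who left the field empty. The sample rows are invented (only the column name comes from the question), and the fillna("") is my addition to make the handling of missing answers explicit.

```python
import pandas as pd

# Hypothetical survey rows; only the column name comes from the question.
df = pd.DataFrame({"LanguageHaveWorkedWith": ["Python;SQL", "C++;Python", None]})

# fillna("") makes the treatment of missing answers explicit:
# such rows get 0 in every language column.
langs = df["LanguageHaveWorkedWith"].fillna("").str.get_dummies(sep=";")

df = pd.concat([df, langs], axis=1)
print(df)
```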

How to create dummies for certain columns with pandas.get_dummies()

It can be done without concatenation by passing the prefix and columns parameters to get_dummies():

In [294]: pd.get_dummies(df, prefix=['A', 'D'], columns=['A', 'D'])
Out[294]:
   B  C  A_x  A_y  D_j  D_l
0  z  1  1.0  0.0  1.0  0.0
1  u  2  0.0  1.0  0.0  1.0
2  z  3  1.0  0.0  1.0  0.0
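
To see what prefix actually controls, here is a sketch where the prefixes differ from the column names. The input frame is reconstructed from the output above, so its original column order is a guess. The "colA"/"colD" prefixes and dtype=int are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Input reconstructed from the output above (original column order assumed).
df = pd.DataFrame(
    {"A": ["x", "y", "x"], "B": ["z", "u", "z"], "C": [1, 2, 3], "D": ["j", "l", "j"]}
)

# prefix can also be a dict mapping column name -> prefix for its dummies.
out = pd.get_dummies(
    df,
    columns=["A", "D"],
    prefix={"A": "colA", "D": "colD"},
    dtype=int,  # 0/1 integers regardless of the pandas version's default
)
print(out)
```

The untouched columns (B and C) stay in place at the front, and the encoded columns are replaced by their dummies at the end.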

