Combine Two or More Columns in a Dataframe into a New Column With a New Name

Pandas - combine column values into a list in a new column

Try this:

t['combined'] = t.values.tolist()

t
Out[50]: 
         A         B     C        D                       combined
0    hello         1  GOOD  long.kw      [hello, 1, GOOD, long.kw]
1     1.20  chipotle   NaN    bingo    [1.2, chipotle, nan, bingo]
2  various       NaN  3000   123.46  [various, nan, 3000, 123.456]
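If only some of the columns should go into the new column, the same idea works on a column subset. The following is a minimal sketch, not part of the original answer: the frame `t`, the list `cols`, and the column names `combined` and `joined` are made up for illustration. It also shows a string-joined variant, since a column of Python lists is often less convenient than a single string.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the frame shown above.
t = pd.DataFrame({
    'A': ['hello', 'various', 'bingo'],
    'B': [1, 2, 3],
    'C': ['GOOD', 'BAD', 'OK'],
})

cols = ['A', 'B']  # assumed subset of columns to combine

# One Python list per row, built from the selected columns only.
t['combined'] = pd.Series(t[cols].values.tolist(), index=t.index)

# Alternative: cast to str and join each row into a single string.
t['joined'] = t[cols].astype(str).apply('_'.join, axis=1)
```

Selecting the columns explicitly also avoids picking up a previously created `combined` column if the line is run twice.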

Combine two columns with same name pandas

You could do:

df.T.reset_index().groupby('index').agg(','.join).T

index             city country house_number  ...           road state     unit
0      greensboro,7611      us         3200  ...  northline ave    nc  ste
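For readers who find the double transpose hard to follow, here is a more explicit sketch of the same idea. It is an alternative under assumptions rather than the answer's own method: the helper name `merge_duplicate_columns` and the sample data are hypothetical, and all values are cast to strings before joining.

```python
import pandas as pd

def merge_duplicate_columns(df, sep=','):
    """Join the values of columns that share a name into one column per name."""
    merged = {}
    for name in df.columns.unique():
        # Select every column carrying this name (one or more).
        block = df.loc[:, df.columns == name]
        # Join the values of each row into a single string.
        merged[name] = block.astype(str).apply(sep.join, axis=1)
    return pd.DataFrame(merged)

# Hypothetical frame with a duplicated 'city' column.
df = pd.DataFrame(
    [['greensboro', '7611', 'us']],
    columns=['city', 'city', 'country'],
)

result = merge_duplicate_columns(df)
```

Unlike the groupby version, this keeps the columns in their order of first appearance instead of sorting them by name.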

How to add multiple columns to pandas dataframe in one assignment?

I would have expected your syntax to work too. The problem arises because, when you create new columns with the column-list syntax (df[[new1, new2]] = ...), pandas requires the right-hand side to be a DataFrame (note that it doesn't actually matter whether the columns of that DataFrame have the same names as the columns you are creating).

Your syntax works fine for assigning scalar values to existing columns, and pandas is also happy to assign scalar values to a new column using the single-column syntax (df[new1] = ...). So the solution is either to convert this into several single-column assignments, or create a suitable DataFrame for the right-hand side.

Here are several approaches that will work:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7]
})

Then one of the following:

1) Three assignments in one, using list unpacking:

df['column_new_1'], df['column_new_2'], df['column_new_3'] = [np.nan, 'dogs', 3]

2) `DataFrame` conveniently expands a single row to match the index, so you can do this:

df[['column_new_1', 'column_new_2', 'column_new_3']] = pd.DataFrame([[np.nan, 'dogs', 3]], index=df.index)

3) Make a temporary data frame with new columns, then combine with the original data frame later:

df = pd.concat(
    [
        df,
        pd.DataFrame(
            [[np.nan, 'dogs', 3]], 
            index=df.index, 
            columns=['column_new_1', 'column_new_2', 'column_new_3']
        )
    ], axis=1
)

4) Similar to the previous, but using `join` instead of `concat` (may be less efficient):

df = df.join(pd.DataFrame(
    [[np.nan, 'dogs', 3]], 
    index=df.index, 
    columns=['column_new_1', 'column_new_2', 'column_new_3']
))

5) Using a dict is a more "natural" way to create the new data frame than the previous two, but the new columns will be sorted alphabetically (at least before Python 3.6 or 3.7):

df = df.join(pd.DataFrame(
    {
        'column_new_1': np.nan,
        'column_new_2': 'dogs',
        'column_new_3': 3
    }, index=df.index
))

6) Use `.assign()` with multiple column arguments.

I like this variant on @zero's answer a lot, but like the previous one, the new columns will always be sorted alphabetically, at least with early versions of Python:

df = df.assign(column_new_1=np.nan, column_new_2='dogs', column_new_3=3)

7) This is interesting (based on https://stackoverflow.com/a/44951376/3830997), but I don't know when it would be worth the trouble:

new_cols = ['column_new_1', 'column_new_2', 'column_new_3']
new_vals = [np.nan, 'dogs', 3]
df = df.reindex(columns=df.columns.tolist() + new_cols)   # add empty cols
df[new_cols] = new_vals  # multi-column assignment works for existing cols

8) In the end it's hard to beat three separate assignments:

df['column_new_1'] = np.nan
df['column_new_2'] = 'dogs'
df['column_new_3'] = 3

Note: many of these options have already been covered in other answers: Add multiple columns to DataFrame and set them equal to an existing column, Is it possible to add several columns at once to a pandas DataFrame?, Add multiple empty columns to pandas DataFrame
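When the new column names and values already live in a dict, options 6 and 8 can be driven from that dict. This is only a sketch under assumptions: the names `new_values`, `via_assign`, and `via_loop` are invented here, and the column order relies on dicts preserving insertion order (Python 3.7+).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7]
})

# Hypothetical mapping of new column name -> scalar value.
new_values = {'column_new_1': np.nan, 'column_new_2': 'dogs', 'column_new_3': 3}

# Option 6: unpack the dict into .assign(), which returns a new frame.
via_assign = df.assign(**new_values)

# Option 8: separate single-column assignments, written as a loop.
via_loop = df.copy()
for name, value in new_values.items():
    via_loop[name] = value

# Both routes should produce the same frame.
same = via_assign.equals(via_loop)
```

The `.assign()` form leaves `df` untouched, which is convenient in method chains; the loop modifies its copy in place.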

Merge two different dataframes on different column names

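Well, if you declare column A as the index, it works (note that recent pandas versions may reject combining left_on/right_on with left_index/right_index and raise a MergeError, so this form is tied to the pandas version in use at the time):

Both_DFs = pd.merge(df1.set_index('A', drop=True), df2.set_index('A', drop=True), how='left', left_on=['B'], right_on=['CC'], left_index=True, right_index=True).dropna().reset_index()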

This results in:

    A    B   C  BB   CC  DD
0  A1  123  K0  B0  121  D0
1  A1  345  K1  B0  121  D0
2  A3  146  K1  B3  345  D1

EDIT

You just needed:

Both_DFs = pd.merge(df1, df2, how='left', left_on=['A', 'B'], right_on=['A', 'CC']).dropna()

Which gives:

    A    B   C  BB   CC  DD
0  A1  121  K0  B0  121  D0
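The question's original df1 and df2 are not shown here, so the following sketch uses made-up frames with the same column names to illustrate the left_on/right_on form. The data and the variable names `left_matched` and `inner` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data; only the column names follow the output above.
df1 = pd.DataFrame({
    'A': ['A1', 'A1', 'A3'],
    'B': [121, 345, 146],
    'C': ['K0', 'K1', 'K1'],
})
df2 = pd.DataFrame({
    'A': ['A1', 'A3'],
    'BB': ['B0', 'B3'],
    'CC': [121, 345],
    'DD': ['D0', 'D1'],
})

# Match rows where A equals A and B equals CC; unmatched left rows get NaN,
# which dropna() then removes.
left_matched = pd.merge(
    df1, df2, how='left', left_on=['A', 'B'], right_on=['A', 'CC']
).dropna()

# An inner join keeps only matching rows directly, without the dropna() step.
inner = pd.merge(df1, df2, how='inner', left_on=['A', 'B'], right_on=['A', 'CC'])
```

Because B and CC have different names, both key columns are kept in the result; one of them can be dropped afterwards if it is redundant.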

pandas: merge (join) two data frames on multiple columns

Try this:

new_df = pd.merge(A_df, B_df, how='left', left_on=['A_c1', 'c2'], right_on=['B_c1', 'c2'])

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html

left_on : label or list, or array-like
    Field names to join on in left DataFrame. Can be a vector or list of vectors of the length of the DataFrame to use a particular vector as the join key instead of columns.
right_on : label or list, or array-like
    Field names to join on in right DataFrame or vector/list of vectors per left_on docs.
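To make the multi-key merge concrete, here is a small sketch with invented data. Only the names A_df, B_df, A_c1, B_c1, and c2 come from the answer; the values and the columns `left_val` and `right_val` are hypothetical.

```python
import pandas as pd

# Hypothetical frames using the column names from the answer above.
A_df = pd.DataFrame({
    'A_c1': [1, 1, 2],
    'c2': ['x', 'y', 'x'],
    'left_val': [10, 20, 30],
})
B_df = pd.DataFrame({
    'B_c1': [1, 2, 2],
    'c2': ['x', 'x', 'y'],
    'right_val': [100, 200, 300],
})

# A row matches only when both key pairs agree: A_c1 == B_c1 and c2 == c2.
new_df = pd.merge(
    A_df, B_df, how='left',
    left_on=['A_c1', 'c2'], right_on=['B_c1', 'c2'],
)
```

With how='left', every row of A_df is kept; the row (1, 'y') has no partner in B_df, so its right-hand columns are filled with NaN.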

Pandas: control new column names when merging two dataframes?

The suffixes option in the merge function does this. The defaults are suffixes=('_x', '_y').

In general, renaming columns can be done with the rename method.
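A short sketch of both options follows. The frames `left` and `right` and the new column names are invented for illustration; only `suffixes` and `rename` come from the answer.

```python
import pandas as pd

# Hypothetical frames that share a non-key column name, 'value'.
left = pd.DataFrame({'key': [1, 2], 'value': [10, 20]})
right = pd.DataFrame({'key': [1, 2], 'value': [100, 200]})

# Default behaviour: overlapping names get '_x' and '_y' appended.
default = pd.merge(left, right, on='key')

# Custom suffixes chosen at merge time.
merged = pd.merge(left, right, on='key', suffixes=('_left', '_right'))

# Or rename afterwards to arbitrary names.
renamed = default.rename(columns={'value_x': 'old_value', 'value_y': 'new_value'})
```

Suffixes only apply to overlapping column names that are not join keys, so `key` stays unchanged in all three results.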
