Comparing strings in two different dataframes and adding a column
Using merge() should solve the problem:
df3 = pd.merge(df1, df2, on='Name')
Complete example:
import pandas as pd
df1 = pd.DataFrame({ "Name":["Bob1", "Bob2", "Bob3"], "Age":[20,21,22]})
df2 = pd.DataFrame({ "Country":["US", "UK", "US", "Canada", "Canada", "US", "UK", "UK", "UK", "Canada"],
"Name":["Bob1", "Bob123", "Bob234", "Bob2", "Bob987", "Bob3", "Mary1", "Mary2", "Mary3", "Mary65"]})
df3 = pd.merge(df1, df2, on='Name')
df3
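Note that merge() defaults to an inner join, so any name in df1 without a match in df2 is dropped from the result. As a minimal sketch, assuming you would rather keep every row of df1 and see which ones matched, a left join with indicator=True could look like this (the shortened frames and the name Bob4 are made up for illustration):

```python
import pandas as pd

# Hypothetical, shortened data: Bob4 has no match in df2.
df1 = pd.DataFrame({"Name": ["Bob1", "Bob2", "Bob4"], "Age": [20, 21, 23]})
df2 = pd.DataFrame({"Country": ["US", "Canada", "US"],
                    "Name": ["Bob1", "Bob2", "Bob3"]})

# Default inner join: only names present in both frames survive.
inner = pd.merge(df1, df2, on="Name")

# Left join: every row of df1 is kept; Country is NaN where nothing matched.
# indicator=True adds a "_merge" column saying where each row came from.
left = pd.merge(df1, df2, on="Name", how="left", indicator=True)

print(inner)
print(left)
```

Rows marked left_only in the _merge column are the ones that exist only in df1.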
pandas: compare string columns from two different dataframes of different sizes
Use Series.isin if you need boolean True/False values:
df1['result'] = df1['text'].isin(df2['text'])
print (df1)
                      text  result
0      the old man is here    True
1  the young girl is there   False
2    the old woman is here   False
3   the young boy is there    True
4   the young girl is here   False
5     the old girl is here   False
It works the same as:
# quotes removed from 'True' and 'False' so the values are booleans
df1['result'] = np.where(df1['text'].isin(df2['text']), True, False)
Your solution creates strings, so it fails if you need to use the column for filtering:
df1['result'] = np.where(df1['text'].isin(df2['text']), 'True', 'False')
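To make the difference concrete, here is a small sketch comparing the two kinds of column. The contents of df2 and the column names as_bool and as_str are assumptions made for illustration, since the original df2 is not shown:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"text": ["the old man is here",
                             "the young girl is there",
                             "the young boy is there"]})
# Hypothetical df2: two of the three texts above appear in it.
df2 = pd.DataFrame({"text": ["the old man is here",
                             "the young boy is there",
                             "something else"]})

# Boolean column: can be used directly as a row mask.
df1["as_bool"] = df1["text"].isin(df2["text"])
matched = df1[df1["as_bool"]]

# String column: not a boolean mask, so it needs an explicit comparison.
df1["as_str"] = np.where(df1["text"].isin(df2["text"]), "True", "False")
matched_from_str = df1[df1["as_str"] == "True"]

print(matched["text"].tolist())
```

Both selections pick the same rows, but only the boolean column can be passed straight into df1[...] as a mask.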
How to compare two DataFrames on one string column when the number of samples differs (pandas)
You can use apply on the smaller DataFrame (dftest) and check each value against the unique() values of the larger DataFrame (dftrain), like below:
>>> dftrain = pd.DataFrame({'col1': ['text', 'Hello', 'How are you?', 'Hello', 'Hello' , 'Hello']})
>>> dftest = pd.DataFrame({'col2': ['text', 'hello', 'How are you?', 'hello']})
>>> dftest.loc[dftest['col2'].apply(lambda x : x in dftrain.col1.unique()), 'col2']
0            text
2    How are you?
Name: col2, dtype: object
>>> dftest.loc[dftest['col2'].apply(lambda x : x in dftrain.col1.unique()), 'col2'].tolist()
['text', 'How are you?']
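As an alternative sketch (not the method used above), Series.isin gives the same selection without a Python-level lambda. The case-insensitive variant at the end is an assumption about what you might want, since 'hello' and 'Hello' do not match in the exact comparison:

```python
import pandas as pd

dftrain = pd.DataFrame({'col1': ['text', 'Hello', 'How are you?',
                                 'Hello', 'Hello', 'Hello']})
dftest = pd.DataFrame({'col2': ['text', 'hello', 'How are you?', 'hello']})

# Exact match: same rows as the apply/unique() version.
exact = dftest.loc[dftest['col2'].isin(dftrain['col1']), 'col2'].tolist()

# Case-insensitive match: lowercase both sides before comparing.
mask = dftest['col2'].str.lower().isin(dftrain['col1'].str.lower())
ignore_case = dftest.loc[mask, 'col2'].tolist()

print(exact)
print(ignore_case)
```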
Create a new dataframe column by comparing two other columns in different dataframes
Use map after converting alpha2 to a mappable object.
First we make our map:
>>> country_map = alpha2.set_index('Code')['Name'].to_dict()
>>> # country_map = dict(alpha2[['Code', 'Name']].values)
>>> # country_map = alpha2.set_index('Code')['Name']
>>> print(country_map)
{'ES': 'Spain', 'UK': 'United Kingdom', 'GH': 'Ghana', 'SL': 'Sierra Leone'}
Then we map it on the Country Code column:
>>> cube_data['Country'] = cube_data['Country Code'].map(country_map)
>>> print(cube_data)
   Country Code         Country
0            UK  United Kingdom
1            ES           Spain
2            SL    Sierra Leone
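One thing to watch for: map returns NaN for any code that is missing from the mapping. A minimal sketch, assuming alpha2 looks like the printed dictionary and using a made-up code 'XX' and a placeholder label 'Unknown':

```python
import pandas as pd

# alpha2 reconstructed from the printed country_map (an assumption).
alpha2 = pd.DataFrame({'Code': ['ES', 'UK', 'GH', 'SL'],
                       'Name': ['Spain', 'United Kingdom',
                                'Ghana', 'Sierra Leone']})
# 'XX' is a hypothetical code with no entry in alpha2.
cube_data = pd.DataFrame({'Country Code': ['UK', 'ES', 'SL', 'XX']})

country_map = alpha2.set_index('Code')['Name'].to_dict()

# Codes without a mapping become NaN; fillna swaps in a placeholder.
cube_data['Country'] = cube_data['Country Code'].map(country_map).fillna('Unknown')

print(cube_data)
```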
Comparing columns of two Data Frames and returning the values of a different column using Pandas
You can use a DataFrame merge for this:
import pandas as pd
df_1 = pd.DataFrame({
'product_id': ['p1', 'p2', 'p3', 'p4'],
'product_price': [100, 200, 300, 400],
'invoice_total': [200, 300, 600, 700]
})
df_2 = pd.DataFrame({
'product_id': ['p1', 'p6', 'p2'],
'quantity': [8, 3, 5],
'invoice_total': [700, 900, 600]
})
df_merged = df_1.merge(
df_2,
on='product_id',
suffixes=('_df1', '')
)
Contents of df_merged
   product_id  product_price  invoice_total_df1  quantity  invoice_total
0          p1            100                200         8            700
1          p2            200                300         5            600
Then filter to only the columns you need
df_merged = df_merged[['product_id', 'invoice_total']]
Final contents of df_merged
   product_id  invoice_total
0          p1            700
1          p2            600
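If only those two columns are needed, a possible shortcut is to select them before merging, which avoids the suffixes and the later filtering step. This is a sketch of that variation, not the approach used above; the name df_selected is made up:

```python
import pandas as pd

df_1 = pd.DataFrame({
    'product_id': ['p1', 'p2', 'p3', 'p4'],
    'product_price': [100, 200, 300, 400],
    'invoice_total': [200, 300, 600, 700]
})
df_2 = pd.DataFrame({
    'product_id': ['p1', 'p6', 'p2'],
    'quantity': [8, 3, 5],
    'invoice_total': [700, 900, 600]
})

# Keep only the key from df_1 and the key plus wanted column from df_2,
# so the merged frame has no name clashes to resolve.
df_selected = df_1[['product_id']].merge(
    df_2[['product_id', 'invoice_total']],
    on='product_id'
)

print(df_selected)
```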
Compare the values of two columns of different length in two different DataFrames and perform a math operation if matches a condition
Use a MultiIndex if the MultiIndex values are unique:
df11 = df1.set_index(['ID','Class'])
df11['VALUE'] = df11['VALUE'].mul(df2.set_index(['ID','Class'])['NUMBER'])
df = df11.reset_index()
Or use a left join with DataFrame.merge and multiply column VALUE by NUMBER, using DataFrame.pop to remove NUMBER after the operation:
df = df1.merge(df2, on=['ID','Class'], how='left')
df['VALUE'] *= df.pop('NUMBER')
Or:
df1['VALUE'] *= df1.merge(df2, on=['ID','Class'], how='left')['NUMBER']
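Since df1 and df2 are not shown here, the following is a sketch on made-up data. It also assumes that rows of df1 with no match in df2 should keep their original VALUE, which is why the missing NUMBER is filled with 1; without the fillna, those rows would become NaN:

```python
import pandas as pd

# Hypothetical frames: only (1, 'a') and (2, 'a') have a NUMBER.
df1 = pd.DataFrame({'ID': [1, 1, 2, 3],
                    'Class': ['a', 'b', 'a', 'a'],
                    'VALUE': [10, 20, 30, 40]})
df2 = pd.DataFrame({'ID': [1, 2],
                    'Class': ['a', 'a'],
                    'NUMBER': [2, 3]})

# Left join keeps every row of df1; unmatched rows get NaN in NUMBER.
df = df1.merge(df2, on=['ID', 'Class'], how='left')

# pop removes NUMBER from df and returns it; fillna(1) leaves
# unmatched rows unchanged by the multiplication.
df['VALUE'] = df['VALUE'] * df.pop('NUMBER').fillna(1)

print(df)
```

Because NUMBER contains NaN after the left join, the resulting VALUE column is float rather than int.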
How to compare two (2) unequal dataframes in Python and assign elements from the one to another?
Use df.merge():
In [240]: res = df1.merge(df2, on='number1')
In [241]: res
Out[241]:
   number1  start   end
0       10   17.8  17.8
1       20   25.0  28.0
2       30   18.4  19.5
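The input frames are not shown, so here is a sketch with made-up data chosen to reproduce the result above. It also passes validate='one_to_one', an optional merge argument that raises an error if number1 is duplicated in either frame. Whether that check fits your data is an assumption:

```python
import pandas as pd

# Hypothetical inputs of unequal length.
df1 = pd.DataFrame({'number1': [10, 20, 30],
                    'start': [17.8, 25.0, 18.4]})
df2 = pd.DataFrame({'number1': [10, 20, 30, 40, 50],
                    'end': [17.8, 28.0, 19.5, 21.1, 30.3]})

# Inner join on number1; rows 40 and 50 have no partner in df1 and drop out.
# validate='one_to_one' guards against duplicate keys multiplying rows.
res = df1.merge(df2, on='number1', validate='one_to_one')

print(res)
```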