Check If Value from One Dataframe Exists in Another Dataframe

Check if value from one dataframe exists in another dataframe

Use isin

Df1.name.isin(Df2.IDs).astype(int)

0 1
1 1
2 0
3 0
Name: name, dtype: int32

Show result in data frame

Df1.assign(InDf2=Df1.name.isin(Df2.IDs).astype(int))

   name  InDf2
0  Marc      1
1  Jake      1
2   Sam      0
3  Brad      0

In a Series object

pd.Series(Df1.name.isin(Df2.IDs).values.astype(int), Df1.name.values)

Marc 1
Jake 1
Sam 0
Brad 0
dtype: int32
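
The snippets above assume Df1 and Df2 already exist. A minimal self-contained sketch, using the same sample values that appear in the R version of this question further down, might look like this (the variable names mask and result are only illustrative):

```python
import pandas as pd

# Sample frames with the names used in the answer above.
Df1 = pd.DataFrame({"name": ["Marc", "Jake", "Sam", "Brad"]})
Df2 = pd.DataFrame({"IDs": ["Jake", "John", "Marc", "Tony", "Bob"]})

# Boolean mask: True where a name from Df1 appears anywhere in Df2.IDs.
mask = Df1["name"].isin(Df2["IDs"])

# Cast to 0/1 and attach as a new column; assign returns a copy,
# so Df1 itself is left unchanged.
result = Df1.assign(InDf2=mask.astype(int))
print(result)
```

Keeping the boolean mask around is often handy, since Df1[mask] and Df1[~mask] then select the matching and non-matching rows directly.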

Check if column pair in one dataframe exists in another?

I'm not sure why you don't want to use merge, but you can also turn each (id, ref) pair into a tuple and use isin:

df2['check'] = df2[['id','ref']].apply(tuple, axis=1)\
.isin(df1[['id','ref']].apply(tuple, axis=1))

Output:

  id     ref  check
0  a   apple   True
1  b  orange   True
2  d  banana  False
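
For comparison, the merge route mentioned above can be sketched as follows. This is only one possible approach: the contents of df1 are an assumption (the original question's data is not shown here), and only df2's rows are taken from the output above.

```python
import pandas as pd

# Hypothetical df1; df2 mirrors the rows shown in the output above.
df1 = pd.DataFrame({"id": ["a", "b", "c"], "ref": ["apple", "orange", "banana"]})
df2 = pd.DataFrame({"id": ["a", "b", "d"], "ref": ["apple", "orange", "banana"]})

# De-duplicate the lookup pairs so the left merge cannot multiply rows of df2.
pairs = df1[["id", "ref"]].drop_duplicates()

# indicator=True adds a "_merge" column; "both" means the pair exists in df1.
merged = df2.merge(pairs, on=["id", "ref"], how="left", indicator=True)

# A left merge keeps df2's row order, so the flags can be assigned positionally.
df2["check"] = (merged["_merge"] == "both").to_numpy()
print(df2)
```

On large frames this tends to avoid the row-wise apply(tuple, axis=1) call, which can be slow.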

Check if value from one dataframe exists in another dataframe in R

Using the same data and outcome as the original Python example

Df1 <- data.frame(name =  c('Marc', 'Jake', 'Sam', 'Brad'))
Df2 <- data.frame(IDs = c('Jake', 'John', 'Marc', 'Tony', 'Bob'))
Df1$presentinDf2 <- as.integer(Df1$name %in% Df2$IDs)
Df1
#> name presentinDf2
#> 1 Marc 1
#> 2 Jake 1
#> 3 Sam 0
#> 4 Brad 0

Check if value from one dataframe exists in another dataframe and create column

df = df.join(df2.groupby('StartDate')['User'].apply('; '.join), how='left', on='Dates').fillna('')

Output:

>>> df
       Dates          User
0 2021-10-01
1 2021-10-02
2 2021-10-03
3 2021-10-04         Test1
4 2021-10-05
5 2021-10-06
6 2021-10-07  Test2; Test1
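
A runnable sketch of the same idea, with made-up input frames shaped like the output above (the actual data from the question is not shown, so every value here is an assumption). It fills only the User column rather than the whole frame, which leaves the date column untouched:

```python
import pandas as pd

# Hypothetical inputs shaped like the output above.
df = pd.DataFrame({"Dates": pd.date_range("2021-10-01", periods=7)})
df2 = pd.DataFrame({
    "StartDate": pd.to_datetime(["2021-10-04", "2021-10-07", "2021-10-07"]),
    "User": ["Test1", "Test2", "Test1"],
})

# Collapse df2 to one row per StartDate, joining the user names with "; ".
users_by_date = df2.groupby("StartDate")["User"].apply("; ".join)

# Left-join on Dates; dates without a match get NaN in the User column.
out = df.join(users_by_date, on="Dates", how="left")

# Replace the missing users with an empty string.
out["User"] = out["User"].fillna("")
print(out)
```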

Check for a value from one dataframe exists in another

First Row

Option 1

You can use Series.isin:

first_flag = df2[df2.score.isin([df1.marks[0]])].flag
print(first_flag)
0 T

To get the values, use .values.tolist():

print(first_flag.values.tolist())
['T']

To get the value as a single item, use .item():

print(first_flag.item())
T

Option 2

Using df.eval:

score = df1.marks[0]
first_flag = df2[df2.eval('score == @score')].flag
print(first_flag)
0 T

Option 3

Using Series.eq:

score = df1.marks[0]
first_flag = df2[df2.score.eq(score)].flag
print(first_flag)
0 T

All Rows

Use df.merge.

flags = df1.merge(df2, left_on='marks', right_on='score').flag
print(flags)
0 T
1 F
Name: flag, dtype: object

If you want to retrieve NaN for rows where no flag exists, you can do a left join:

flags = df1.merge(df2, left_on='marks', right_on='score', how='left').flag
print(flags)
0 T
1 NaN
2 F
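
If the goal is to keep df1's rows and simply attach the flag, Series.map with a lookup Series is another option. The sketch below uses invented sample values chosen to be consistent with the outputs above, and it assumes each score occurs only once in df2 (otherwise map raises an error on the duplicated index):

```python
import pandas as pd

# Hypothetical data consistent with the outputs above.
df1 = pd.DataFrame({"marks": [50, 60, 70]})
df2 = pd.DataFrame({"score": [50, 70], "flag": ["T", "F"]})

# Build a score -> flag lookup; assumes scores in df2 are unique.
lookup = df2.set_index("score")["flag"]

# map preserves df1's row order and gives NaN where no score matches.
df1["flag"] = df1["marks"].map(lookup)

# isin gives a plain True/False existence check alongside the flag.
df1["in_df2"] = df1["marks"].isin(df2["score"])
print(df1)
```

Unlike merge, this never changes the number of rows in df1.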

