Existing Function for Seeing If a Row Exists in a Data Frame

Existing function for seeing if a row exists in a data frame?

Using the data from @Marek's answer:

nrow(merge(row_to_find, X)) > 0  # TRUE if the row exists
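
For readers working in pandas rather than R, the same merge-and-count idea can be sketched as follows. The frame X, the two one-row frames, and the helper name row_exists are all made up for illustration, since the data from the original answer is not shown here.

```python
import pandas as pd

# Hypothetical data; not the data from the original answer.
X = pd.DataFrame({'a': [1, 2, 3], 'b': ['x', 'y', 'z']})
row_to_find = pd.DataFrame({'a': [2], 'b': ['y']})
missing_row = pd.DataFrame({'a': [2], 'b': ['z']})

def row_exists(frame, row):
    # With no `on` argument, merge does an inner join on all shared columns,
    # so the result is non-empty only if a matching row exists.
    return len(frame.merge(row)) > 0

print(row_exists(X, row_to_find))
print(row_exists(X, missing_row))
```

This assumes the one-row frame has the same column names and compatible dtypes as the frame being searched.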

Check if a row in one data frame exists in another data frame

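You can use merge with the indicator parameter, then drop the Rating column and turn the indicator into a boolean. Comparing the indicator with 'both' already yields True/False, so numpy.where is not required:

import pandas as pd

# df1 holds the rows to check; df2 holds the reference rows (including Rating)
df = pd.merge(df1, df2, on=['User','Movie'], how='left', indicator='Exist')
df = df.drop('Rating', axis=1)
df['Exist'] = df['Exist'] == 'both'
print (df)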
   User  Movie  Exist
0     1    333  False
1     1   1193   True
2     1      3  False
3     2    433  False
4     3     54   True
5     3    343  False
6     3     76   True
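
A self-contained variant of the same idea is sketched below. The input frames are reconstructed from the output above, and the Rating values are invented for illustration. Merging only the de-duplicated key columns is an extra assumption on top of the original answer: it avoids having to drop Rating afterwards and keeps one output row per row of df1 even if df2 repeats a key.

```python
import pandas as pd

# Sample data reconstructed from the output above; Rating values are made up.
df1 = pd.DataFrame({'User': [1, 1, 1, 2, 3, 3, 3],
                    'Movie': [333, 1193, 3, 433, 54, 343, 76]})
df2 = pd.DataFrame({'User': [1, 3, 3],
                    'Movie': [1193, 54, 76],
                    'Rating': [5, 4, 3]})

keys = ['User', 'Movie']
# Left merge against the unique key pairs of df2; _merge records the match status.
merged = df1.merge(df2[keys].drop_duplicates(), on=keys, how='left', indicator=True)
# 'both' means the key pair was found in df2.
result = df1.assign(Exist=(merged['_merge'] == 'both').to_numpy())
print(result)
```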

Check if a row exists in pandas

I think you need to compare the index values with the label; the result is a boolean NumPy array.
To reduce it to a single value, use any to check whether at least one element is True, or all to check whether every element is True:

(df.index == 'entry').any()

(df.index == 'entry').all()

Another solution, from a comment by John Galt:

'entry' in df.index

To check for a substring:

df.index.str.contains('en').any()

Sample:

df = pd.DataFrame({'Apr 2013':[1,2,3]}, index=['entry','pdf','sum'])
print(df)
       Apr 2013
entry         1
pdf           2
sum           3

print (df.index == 'entry')
[ True False False]

print ((df.index == 'entry').any())
True
print ((df.index == 'entry').all())
False

#check columns values
print ('entry' in df)
False
#same as explicitly checking df.columns (more readable)
print ('entry' in df.columns)
False
#check index values
print ('entry' in df.index)
True
#check columns values
print ('Apr 2013' in df)
True
#check columns values
print ('Apr 2013' in df.columns)
True

df = pd.DataFrame({'Apr 2013':[1,2,3]}, index=['entry','entry','entry'])
print(df)
       Apr 2013
entry         1
entry         2
entry         3

print (df.index == 'entry')
[ True  True  True]

print ((df.index == 'entry').any())
True
print ((df.index == 'entry').all())
True
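
When several labels need to be checked at once, a possible extension of the membership tests above is Index.isin. This is only a sketch; the label 'total' is a made-up example of a label that is absent from the index.

```python
import pandas as pd

df = pd.DataFrame({'Apr 2013': [1, 2, 3]}, index=['entry', 'pdf', 'sum'])

labels = ['entry', 'total']  # hypothetical labels to look up

# Index.isin marks which index entries are among the labels.
mask = df.index.isin(labels)

# For the reverse question (which labels are present), test each label.
found = {label: label in df.index for label in labels}

print(mask)
print(found)
```

Here mask has one element per row of df, while found has one entry per requested label.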

Pandas Check if a Row Exists Anywhere in a Column and Return True or False

You can use the Series.isin method with a list of values, so first build a list of the words in the Description column. Note that the [0] below keeps only the words from the first row:

In [915]: vals = [x.split() for x in df.Description.values][0]
In [917]: df['Check'] = df.Keyword.isin(vals)

In [918]: df
Out[918]: 
  Keyword        Description  Check
0    spam  eggs spam foo bar   True
1    eggs                      True
2   house                     False
3     foo                      True
4     bar                      True
5  turtle                     False
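
If the words can appear in any row of Description rather than only the first, one possible variant is to collect the words from every row before calling isin. The sketch below assumes the blank Description cells are empty strings, which the output above does not confirm.

```python
import pandas as pd

# Sample data reconstructed from the output above; blank cells are assumed
# to be empty strings.
df = pd.DataFrame({
    'Keyword': ['spam', 'eggs', 'house', 'foo', 'bar', 'turtle'],
    'Description': ['eggs spam foo bar', '', '', '', '', ''],
})

# Split each description on whitespace, flatten the lists into one Series,
# and drop the NaN entries produced by empty descriptions.
words = set(df['Description'].str.split().explode().dropna())

df['Check'] = df['Keyword'].isin(words)
print(df)
```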

How to check if values in one dataframe exist in another dataframe in R?

Try this, using %in% against a single vector that combines all the values to match:

#Code
df1$reply <- df1$user_name %in% c(df2$name,df2$organisation)

Output:

df1
  id reply user_name
1  1  TRUE      John
2  2  TRUE    Amazon
3  3 FALSE       Bob
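
For comparison, a rough pandas counterpart of the %in% approach is sketched below, using the same sample values. It is an analogue for pandas users, not part of the R answer.

```python
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3], 'user_name': ['John', 'Amazon', 'Bob']})
df2 = pd.DataFrame({'name': ['John', 'Pat'], 'organisation': ['Amazon', 'Apple']})

# Pool the values of both df2 columns, as c(df2$name, df2$organisation) does in R.
pool = pd.concat([df2['name'], df2['organisation']])

# isin plays the role of %in%: one boolean per row of df1.
df1['reply'] = df1['user_name'].isin(pool)
print(df1)
```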

Some data used:

#Data1
df1 <- structure(list(id = 1:3, reply = c(NA, NA, NA), user_name = c("John", 
"Amazon", "Bob")), class = "data.frame", row.names = c(NA, -3L
))

#Data2
df2 <- structure(list(name = c("John", "Pat"), organisation = c("Amazon", 
"Apple")), class = "data.frame", row.names = c(NA, -2L))

How to quickly check if row exists in PySpark Dataframe?

It would be better to create a spark dataframe from the entries that you want to look up, and then do a semi join or an anti join to get the rows that exist or do not exist in the lookup dataframe. This should be more efficient than checking the entries one by one.

import pyspark.sql.functions as F

# df holds the entries to look up; lookup is the existing DataFrame to search
df = spark.createDataFrame([[2,5],[2,10]],['A','B'])

result1 = df.join(lookup, ['A','B'], 'semi').withColumn('exists', F.lit(True))

result2 = df.join(lookup, ['A','B'], 'anti').withColumn('exists', F.lit(False))

result = result1.unionAll(result2)

result.show()
+---+---+------+
|  A|  B|exists|
+---+---+------+
|  2|  5|  true|
|  2| 10| false|
+---+---+------+

Pandas: check if a row exists in another dataframe and append its index

You can do it this way:

Data (note the index of the B DataFrame):

In [276]: cols = ['SampleID', 'ParentID']

In [277]: A
Out[277]:
   Real_ID  SampleID  ParentID Something AnotherThing
0      NaN        10        11         a            b
1      NaN        20        21         a            b
2      NaN        40        51         a            b

In [278]: B
Out[278]:
   SampleID  ParentID
3        10        11
5        20        21

Solution:

In [279]: merged = pd.merge(A[cols], B, on=cols, how='outer', indicator=True)

In [280]: merged
Out[280]:
   SampleID  ParentID     _merge
0        10        11       both
1        20        21       both
2        40        51  left_only

In [281]: B = pd.concat([B, merged.loc[merged._merge=='left_only', cols]])

In [282]: B
Out[282]:
   SampleID  ParentID
3        10        11
5        20        21
2        40        51

In [285]: A['Real_ID'] = pd.merge(A[cols], B.reset_index(), on=cols)['index']

In [286]: A
Out[286]:
   Real_ID  SampleID  ParentID Something AnotherThing
0        3        10        11         a            b
1        5        20        21         a            b
2        2        40        51         a            b
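
The session above can be reproduced as a standalone script along these lines. The input frames are rebuilt from the printed output, and .loc is used for the row selection because .ix is no longer available in current pandas releases.

```python
import numpy as np
import pandas as pd

cols = ['SampleID', 'ParentID']

# Inputs rebuilt from the printed output above.
A = pd.DataFrame({'Real_ID': [np.nan] * 3,
                  'SampleID': [10, 20, 40],
                  'ParentID': [11, 21, 51],
                  'Something': ['a'] * 3,
                  'AnotherThing': ['b'] * 3})
B = pd.DataFrame({'SampleID': [10, 20], 'ParentID': [11, 21]}, index=[3, 5])

# Find the key pairs of A that are missing from B.
merged = pd.merge(A[cols], B, on=cols, how='outer', indicator=True)
missing = merged.loc[merged['_merge'] == 'left_only', cols]

# Append the missing pairs to B; they keep their index labels from merged.
B = pd.concat([B, missing])

# Look up the index label in B for every row of A.
A['Real_ID'] = pd.merge(A[cols], B.reset_index(), on=cols)['index']
print(A)
```

As in the original answer, the label given to an appended row is simply its position in merged, so it could clash with a label already used in B on other data.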

Check if row with correct values in dataframe exists and append if not

The idea is to use DataFrame.loc to set the value to 89: if the row does not exist, a new row is added; if it does exist, its value is overwritten. DataFrame.astype is then used to convert the columns back to their original dtypes, which can change when a new row is appended:

df2 = pd.DataFrame({'id':[1,2,3,4] ,                  
               'value':[23,34,45,56]})

df = pd.DataFrame({'id':[1,2,3,4,5] ,                  
               'value':[23,34,45,56,67]})

def test(df, value_to_check):
    df = df.set_index('id')
    dtypes = df.dtypes
    df.loc[value_to_check, ['value']] = 89
    return df.astype(dtypes).reset_index()

df1 = test(df, 5)
print (df1)
   id  value
0   1     23
1   2     34
2   3     45
3   4     56
4   5     89

df1 = test(df2, 5)
print (df1)
   id  value
0   1     23
1   2     34
2   3     45
3   4     56
4   5     89
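
If the caller also needs to know whether the row was already there, the function can be extended to report it. The sketch below is one possible variant; the name upsert_value and the returned flag are additions for illustration and are not part of the original answer.

```python
import pandas as pd

def upsert_value(df, row_id, value):
    """Set `value` for `row_id`, adding the row if it is missing.

    Returns the updated frame and a flag telling whether the row existed.
    """
    indexed = df.set_index('id')
    existed = row_id in indexed.index          # membership test on the index
    dtypes = indexed.dtypes.to_dict()          # remember the original dtypes
    indexed.loc[row_id, 'value'] = value       # overwrite, or enlarge with a new row
    return indexed.astype(dtypes).reset_index(), existed

df_short = pd.DataFrame({'id': [1, 2, 3, 4], 'value': [23, 34, 45, 56]})
df_full = pd.DataFrame({'id': [1, 2, 3, 4, 5], 'value': [23, 34, 45, 56, 67]})

added, existed_short = upsert_value(df_short, 5, 89)
updated, existed_full = upsert_value(df_full, 5, 89)
print(added, existed_short)
print(updated, existed_full)
```

Because set_index returns a new frame, the input frames are left unchanged.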
