Finding Rows Containing a Value (Or Values) in Any Column

Finding rows containing a value (or values) in any column

How about

apply(df, 1, function(r) any(r %in% c("M017", "M018")))

The ith element will be TRUE if the ith row contains one of the values, and FALSE otherwise. Or, if you want just the row numbers, enclose the above statement in which(...).

Pandas find rows with value in any column

You can use for match first ID/id/Id substring in all columns by filter output rows before and then convert first row to columns:

mask = (df.select_dtypes(object)
          .apply(lambda x: x.str.contains('id', case=False))
          .any(axis=1)
          .cumsum()
          .gt(0))

df = df[mask].copy()
df.columns = df.iloc[0].rename(None)
df = df.iloc[1:].reset_index(drop=True)

Another idea for test not subtrings:

mask = df.isin(['id','ID','Id']).any(axis=1).cumsum().gt(0)

df = df[mask].copy()
df.columns = df.iloc[0].rename()
df = df.iloc[1:].reset_index(drop=True)

Select rows containing certain values from pandas dataframe

Introduction

At the heart of selecting rows, we would need a 1D mask or a pandas-series of boolean elements of length same as length of df, let's call it mask. So, finally with df[mask], we would get the selected rows off df following boolean-indexing.

Here's our starting df :

In [42]: df
Out[42]: 
        A       B      C
1   apple  banana   pear
2    pear    pear  apple
3  banana    pear   pear
4   apple   apple   pear

I. Match one string

Now, if we need to match just one string, it's straight-foward with elementwise equality :

In [42]: df == 'banana'
Out[42]: 
       A      B      C
1  False   True  False
2  False  False  False
3   True  False  False
4  False  False  False

If we need to look ANY one match in each row, use .any method :

In [43]: (df == 'banana').any(axis=1)
Out[43]: 
1     True
2    False
3     True
4    False
dtype: bool

To select corresponding rows :

In [44]: df[(df == 'banana').any(axis=1)]
Out[44]: 
        A       B     C
1   apple  banana  pear
3  banana    pear  pear

II. Match multiple strings

1. Search for ANY match

Here's our starting df :

In [42]: df
Out[42]: 
        A       B      C
1   apple  banana   pear
2    pear    pear  apple
3  banana    pear   pear
4   apple   apple   pear

NumPy's np.isin would work here (or use pandas.isin as listed in other posts) to get all matches from the list of search strings in df. So, say we are looking for 'pear' or 'apple' in df :

In [51]: np.isin(df, ['pear','apple'])
Out[51]: 
array([[ True, False,  True],
       [ True,  True,  True],
       [False,  True,  True],
       [ True,  True,  True]])

# ANY match along each row
In [52]: np.isin(df, ['pear','apple']).any(axis=1)
Out[52]: array([ True,  True,  True,  True])

# Select corresponding rows with masking
In [56]: df[np.isin(df, ['pear','apple']).any(axis=1)]
Out[56]: 
        A       B      C
1   apple  banana   pear
2    pear    pear  apple
3  banana    pear   pear
4   apple   apple   pear

2. Search for ALL match

Here's our starting df again :

In [42]: df
Out[42]: 
        A       B      C
1   apple  banana   pear
2    pear    pear  apple
3  banana    pear   pear
4   apple   apple   pear

So, now we are looking for rows that have BOTH say ['pear','apple']. We will make use of NumPy-broadcasting :

In [66]: np.equal.outer(df.to_numpy(copy=False),  ['pear','apple']).any(axis=1)
Out[66]: 
array([[ True,  True],
       [ True,  True],
       [ True, False],
       [ True,  True]])

So, we have a search list of 2 items and hence we have a 2D mask with number of rows = len(df) and number of cols = number of search items. Thus, in the above result, we have the first col for 'pear' and second one for 'apple'.

To make things concrete, let's get a mask for three items ['apple','banana', 'pear'] :

In [62]: np.equal.outer(df.to_numpy(copy=False),  ['apple','banana', 'pear']).any(axis=1)
Out[62]: 
array([[ True,  True,  True],
       [ True, False,  True],
       [False,  True,  True],
       [ True, False,  True]])

The columns of this mask are for 'apple','banana', 'pear' respectively.

Back to 2 search items case, we had earlier :

In [66]: np.equal.outer(df.to_numpy(copy=False),  ['pear','apple']).any(axis=1)
Out[66]: 
array([[ True,  True],
       [ True,  True],
       [ True, False],
       [ True,  True]])

Since, we are looking for ALL matches in each row :

In [67]: np.equal.outer(df.to_numpy(copy=False),  ['pear','apple']).any(axis=1).all(axis=1)
Out[67]: array([ True,  True, False,  True])

Finally, select rows :

In [70]: df[np.equal.outer(df.to_numpy(copy=False),  ['pear','apple']).any(axis=1).all(axis=1)]
Out[70]: 
       A       B      C
1  apple  banana   pear
2   pear    pear  apple
4  apple   apple   pear

How do I select rows from a DataFrame based on column values?

To select rows whose column value equals a scalar, some_value, use ==:

df.loc[df['column_name'] == some_value]

To select rows whose column value is in an iterable, some_values, use isin:

df.loc[df['column_name'].isin(some_values)]

Combine multiple conditions with &:

df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]

Note the parentheses. Due to Python's operator precedence rules, & binds more tightly than <= and >=. Thus, the parentheses in the last example are necessary. Without the parentheses

df['column_name'] >= A & df['column_name'] <= B

is parsed as

df['column_name'] >= (A & df['column_name']) <= B

which results in a Truth value of a Series is ambiguous error.

To select rows whose column value does not equal some_value, use !=:

df.loc[df['column_name'] != some_value]

isin returns a boolean Series, so to select rows whose value is not in some_values, negate the boolean Series using ~:

df.loc[~df['column_name'].isin(some_values)]

For example,

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
print(df)
#      A      B  C   D
# 0  foo    one  0   0
# 1  bar    one  1   2
# 2  foo    two  2   4
# 3  bar  three  3   6
# 4  foo    two  4   8
# 5  bar    two  5  10
# 6  foo    one  6  12
# 7  foo  three  7  14

print(df.loc[df['A'] == 'foo'])

yields

     A      B  C   D
0  foo    one  0   0
2  foo    two  2   4
4  foo    two  4   8
6  foo    one  6  12
7  foo  three  7  14

If you have multiple values you want to include, put them in a
list (or more generally, any iterable) and use isin:

print(df.loc[df['B'].isin(['one','three'])])

yields

     A      B  C   D
0  foo    one  0   0
1  bar    one  1   2
3  bar  three  3   6
6  foo    one  6  12
7  foo  three  7  14

Note, however, that if you wish to do this many times, it is more efficient to
make an index first, and then use df.loc:

df = df.set_index(['B'])
print(df.loc['one'])

yields

       A  C   D
B              
one  foo  0   0
one  bar  1   2
one  foo  6  12

or, to include multiple values from the index use df.index.isin:

df.loc[df.index.isin(['one','two'])]

yields

       A  C   D
B              
one  foo  0   0
one  bar  1   2
two  foo  2   4
two  foo  4   8
two  bar  5  10
one  foo  6  12

SQL Server SELECT where any column contains 'x'

First Method(Tested)
First get list of columns in string variable separated by commas and then you can search 'foo' using that variable by use of IN

Check stored procedure below which first gets columns and then searches for string:

DECLARE @TABLE_NAME VARCHAR(128)
DECLARE @SCHEMA_NAME VARCHAR(128)

-----------------------------------------------------------------------

-- Set up the name of the table here :
SET @TABLE_NAME = 'testing'
-- Set up the name of the schema here, or just leave set to 'dbo' :
SET @SCHEMA_NAME = 'dbo'

-----------------------------------------------------------------------

DECLARE @vvc_ColumnName VARCHAR(128)
DECLARE @vvc_ColumnList VARCHAR(MAX)

IF @SCHEMA_NAME =''
  BEGIN
  PRINT 'Error : No schema defined!'
  RETURN
  END

IF NOT EXISTS (SELECT * FROM sys.tables T JOIN sys.schemas S
      ON T.schema_id=S.schema_id
      WHERE T.Name=@TABLE_NAME AND S.name=@SCHEMA_NAME)
  BEGIN
  PRINT 'Error : The table '''+@TABLE_NAME+''' in schema '''+
      @SCHEMA_NAME+''' does not exist in this database!' 
  RETURN
 END

DECLARE TableCursor CURSOR FAST_FORWARD FOR
SELECT   CASE WHEN PATINDEX('% %',C.name) > 0 
     THEN '['+ C.name +']' 
     ELSE C.name 
     END
FROM     sys.columns C
JOIN     sys.tables T
ON       C.object_id  = T.object_id
JOIN     sys.schemas S
ON       S.schema_id  = T.schema_id
WHERE    T.name    = @TABLE_NAME
AND      S.name    = @SCHEMA_NAME
ORDER BY column_id


SET @vvc_ColumnList=''

OPEN TableCursor
FETCH NEXT FROM TableCursor INTO @vvc_ColumnName

WHILE @@FETCH_STATUS=0
  BEGIN
   SET @vvc_ColumnList = @vvc_ColumnList + @vvc_ColumnName

    -- get the details of the next column
   FETCH NEXT FROM TableCursor INTO @vvc_ColumnName

  -- add a comma if we are not at the end of the row
   IF @@FETCH_STATUS=0
    SET @vvc_ColumnList = @vvc_ColumnList + ','
   END

 CLOSE TableCursor
 DEALLOCATE TableCursor

-- Now search for `foo`


SELECT *
FROM testing 
WHERE 'foo' in (@vvc_ColumnList );

2nd Method
In sql server you can get object id of table then using that object id you can fetch columns. In that case it will be as below:

Step 1: First get Object Id of table

select * from sys.tables order by name

Step 2: Now get columns of your table and search in it:

 select * from testing where 'foo' in (select name from sys.columns  where  object_id =1977058079)

Note: object_id is what you get fetch in first step for you relevant table

Finding Rows Containing a Value (Or Values) in Any Column