How to Get the Column Name in Pandas Based on Row Values

Get column name where value is something in pandas dataframe

Here is one, perhaps inelegant, way to do it:

df_result = pd.DataFrame(ts, columns=['value'])

Set up a function which grabs the column name which contains the value (from ts):

def get_col_name(row):
    # .ix is long removed; look the row up by its index label instead
    b = (df.loc[row.name] == row['value'])
    return b.index[b.argmax()]

For each row, this tests which elements equal the value and extracts the column name of a True.

And apply it (row-wise):

In [3]: df_result.apply(get_col_name, axis=1)
1979-01-01 00:00:00 col5
1979-01-01 06:00:00 col3
1979-01-01 12:00:00 col1
1979-01-01 18:00:00 col1

i.e. use df_result['Column'] = df_result.apply(get_col_name, axis=1).


Note: there is quite a lot going on in get_col_name so perhaps it warrants some further explanation:

In [4]: row = df_result.iloc[0] # an example row to pass to get_col_name

In [5]: row
value 1181.220328
Name: 1979-01-01 00:00:00

In [6]: row.name # use to get rows of df
Out[6]: <Timestamp: 1979-01-01 00:00:00>

In [7]: df.loc[row.name]
col5 1181.220328
col4 912.154923
col3 648.848635
col2 390.986156
col1 138.185861
Name: 1979-01-01 00:00:00

In [8]: b = (df.loc[row.name] == row['value'])
# checks whether each element equals row['value'] = 1181.220328

In [9]: b
col5 True
col4 False
col3 False
col2 False
col1 False
Name: 1979-01-01 00:00:00

In [10]: b.argmax() # index of a True value
Out[10]: 0

In [11]: b.index[b.argmax()] # the index value (column name)
Out[11]: 'col5'

There may be a more efficient way to do this...
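Since .ix and .irow were removed in modern pandas, here is a hedged, self-contained sketch of the same approach using .loc and idxmax (the DataFrame, timestamps and values below are made up for illustration):

```python
import pandas as pd

# Made-up data: each entry of ts is one value taken from the matching row of df
df = pd.DataFrame({'col1': [1.0, 5.0],
                   'col2': [2.0, 6.0],
                   'col3': [3.0, 7.0]},
                  index=pd.to_datetime(['1979-01-01', '1979-01-02']))
ts = pd.Series([3.0, 5.0], index=df.index)
df_result = ts.to_frame('value')

def get_col_name(row):
    # Look up the full df row by this row's index label, compare it to the
    # single value, and return the first matching column name
    b = df.loc[row.name] == row['value']
    return b.idxmax()

df_result['Column'] = df_result.apply(get_col_name, axis=1)
```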

Pandas selecting the column name based on row information

A general solution that also works when the row or val has no match:

val = 70
row = 10

mask = df.reindex(index=[row]).eq(val).squeeze()
col = next(iter(mask.index[mask]), 'no match')
print (col)
no match
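A runnable version of this pattern on a small made-up frame (reindex avoids a KeyError for a missing row, squeeze turns the one-row result into a Series, and next(..., 'no match') supplies the default):

```python
import pandas as pd

df = pd.DataFrame({'name1': [5], 'name2': [6],
                   'name3': [7], 'name4': [8]}, index=[1])

def first_match(df, row, val):
    # reindex returns a NaN row (instead of raising) when the label is absent
    mask = df.reindex(index=[row]).eq(val).squeeze()
    return next(iter(mask.index[mask]), 'no match')

first_match(df, 1, 7)    # existing row and value
first_match(df, 10, 70)  # neither exists
```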

Another general solution:

def get_col(row, val):
    try:
        a = df.loc[row].eq(val)
        c = a.index[a][0]
    except KeyError:
        c = 'not matched row'
    except IndexError:
        c = 'not matched value'
    return c

print (get_col(1, 7))
name3
print (get_col(10, 7))
not matched row
print (get_col(1, 70))
not matched value
print (get_col(10, 70))
not matched row

A simpler solution if val and row always exist in the DataFrame - if they don't, df.loc[row].eq(val) returns all Falses, and idxmax then returns the index of the first False, i.e. the first column name.

val = 7
row = 1
col = df.loc[row].eq(val).idxmax()
#if you want to select by position use iloc
#col = df.iloc[row].eq(val).idxmax()
print (col)
name3


First select row by DataFrame.loc:

print (df.loc[row])
name1 5
name2 6
name3 7
name4 8
Name: 1, dtype: int64

Then compare by eq

print (df.loc[row].eq(val))
name1 False
name2 False
name3 True
name4 False
Name: 1, dtype: bool

And finally get the index value of the first True by idxmax:

print (df.loc[row].eq(val).idxmax())
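To see the caveat from above in action - when the value is absent, idxmax over an all-False row silently returns the first column name - consider this small sketch (same made-up frame as in the walkthrough):

```python
import pandas as pd

df = pd.DataFrame({'name1': [5], 'name2': [6],
                   'name3': [7], 'name4': [8]}, index=[1])

hit = df.loc[1].eq(7).idxmax()    # value present -> the real match
miss = df.loc[1].eq(70).idxmax()  # all False -> first column, not an error
```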

How to get the column name in pandas based on row values?

Using dot, then stripping the trailing separator:

df.dot(df.columns + ',').str[:-1]
0 id_0,id_2
1 id_0
2 id_1
3 id_0,id_1
4 id_2
dtype: object
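For instance, with a boolean frame whose columns are id_0 through id_2 (reconstructed here to match the output above), bool-times-string multiplication repeats each column name zero or one times and the dot product concatenates them per row:

```python
import pandas as pd

df = pd.DataFrame({'id_0': [True, True, False, True, False],
                   'id_1': [False, False, True, True, False],
                   'id_2': [True, False, False, False, True]})

# True * 'id_0,' -> 'id_0,', False * 'id_0,' -> ''; the dot product sums
# (concatenates) per row, and .str[:-1] strips the trailing comma
out = df.dot(df.columns + ',').str[:-1]
```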

Pandas get column value based on row value

Per this page:

import numpy as np

idx, cols = pd.factorize(df['flag'])
df['COl_VAL'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]


>>> df
flag col1 col2 col3 col4 COl_VAL
A col3 1 5 6 0 6
B col2 3 2 3 4 2
C col2 2 4 6 4 4
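A self-contained version of this lookup, with data made up to match the table above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'flag': ['col3', 'col2', 'col2'],
                   'col1': [1, 3, 2], 'col2': [5, 2, 4],
                   'col3': [6, 3, 6], 'col4': [0, 4, 4]},
                  index=['A', 'B', 'C'])

# factorize maps each flag to an integer code; reordering the columns to
# match those codes lets one fancy index pick a single value per row
idx, cols = pd.factorize(df['flag'])
df['COl_VAL'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
```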

Find column name in Pandas that contains a specific value in the row from another column

I want to search only columns A through F and find the column name for the first instance (leftmost) the value exists

You can use idxmax on axis=1 after comparing the Value column with the slice of the DataFrame (using .loc[])

df['Value_Col'] = df.loc[:,'A':'F'].isin(df['Value']).idxmax(1)

   Date  Time    A    B    C    D    E    F  Value Value_Col
0 Jan1 1245 3.0 3.2 4.6 5.7 2.1 8.0 5.7 D
1 Jan2 1045 4.5 8.4 3.9 2.2 9.4 8.3 3.9 C
2 Jan3 1350 1.4 3.3 4.5 8.9 1.4 0.4 1.4 A
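A runnable sketch with data reconstructed from the table above (isin with a Series aligns on the row index, so each cell is compared against that row's Value; idxmax(axis=1) then picks the leftmost True):

```python
import pandas as pd

df = pd.DataFrame({'A': [3.0, 4.5, 1.4], 'B': [3.2, 8.4, 3.3],
                   'C': [4.6, 3.9, 4.5], 'D': [5.7, 2.2, 8.9],
                   'E': [2.1, 9.4, 1.4], 'F': [8.0, 8.3, 0.4],
                   'Value': [5.7, 3.9, 1.4]})

# leftmost column in A..F equal to that row's Value
df['Value_Col'] = df.loc[:, 'A':'F'].isin(df['Value']).idxmax(axis=1)
```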

If there is a chance that none of the columns contains the df['Value'] value, you can use:

m = df.loc[:,'A':'F']
#dot concatenates the matching (here single-character) column names; .str[0] keeps the first, or NaN if there is none
df['Value_Col'] = m.isin(df['Value']).dot(m.columns).str[0]

Pandas transform dataframe to get column names based on row condition

Just use DataFrame.apply on axis=1, then join (by ',') the names of the columns whose value is less than, or greater than or equal to, the given threshold.

df.assign(normal_speed_t=df.apply(lambda x: ','.join(x[x < 100].index), axis=1),
          high_speed_t=df.apply(lambda x: ','.join(x[x >= 100].index), axis=1))


        speed_t1  speed_t2  speed_t3  speed_t4              normal_speed_t                high_speed_t
car id
1.0 90 80 120 34 speed_t1,speed_t2,speed_t4 speed_t3
2.0 110 130 140 99 speed_t4 speed_t1,speed_t2,speed_t3
3.0 40 110 20 110 speed_t1,speed_t3 speed_t2,speed_t4


  • assign just lets you assign a new column with given values
  • .apply applies a function to the DataFrame column-wise for axis=0, and row-wise for axis=1
  • x[x<100].index filters the values that are less than 100 and gets their index, i.e. the column names
  • ','.join(...) joins the column names coming from the step above
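The steps above, as one runnable sketch on data reconstructed from the table:

```python
import pandas as pd

df = pd.DataFrame({'speed_t1': [90, 110, 40],
                   'speed_t2': [80, 130, 110],
                   'speed_t3': [120, 140, 20],
                   'speed_t4': [34, 99, 110]},
                  index=pd.Index([1.0, 2.0, 3.0], name='car id'))

# each row x is a Series indexed by the column names, so x[x < 100].index
# is exactly the list of qualifying column names
out = df.assign(
    normal_speed_t=df.apply(lambda x: ','.join(x[x < 100].index), axis=1),
    high_speed_t=df.apply(lambda x: ','.join(x[x >= 100].index), axis=1),
)
```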

Get column names of a data frame based on values from a list in pandas python

General solution for multiple rows - it tests whether at least one value, or all values, per column are in val.

You can test membership by DataFrame.isin and then test by DataFrame.any or DataFrame.all:

#added new row for see difference
print (df)
col1 col2 col3 col4 col5
0 a1 b1 c_d d1 e10
1 a1 d1 c_e f1 e10

val = ['a1', 'c_d', 'e10']

#tested membership
print (df.isin(val))
col1 col2 col3 col4 col5
0 True False True False True
1 True False False False True

#test if at least one True per column
print (df.isin(val).any())
col1 True
col2 False
col3 True
col4 False
col5 True
dtype: bool

#test if all Trues per column
print (df.isin(val).all())
col1 True
col2 False
col3 False
col4 False
col5 True
dtype: bool

names = df.columns[df.isin(val).any()]
print (names)
Index(['col1', 'col3', 'col5'], dtype='object')

names = df.columns[df.isin(val).all()]
print (names)
Index(['col1', 'col5'], dtype='object')

If the DataFrame has only one row, it is possible to select the first row as a Series by DataFrame.iloc and then test membership by Series.isin:

names = df.columns[df.iloc[0].isin(val)]
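The membership tests above as one runnable sketch (frame reconstructed from the printed output):

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a1', 'a1'], 'col2': ['b1', 'd1'],
                   'col3': ['c_d', 'c_e'], 'col4': ['d1', 'f1'],
                   'col5': ['e10', 'e10']})
val = ['a1', 'c_d', 'e10']

any_names = df.columns[df.isin(val).any()]  # at least one match per column
all_names = df.columns[df.isin(val).all()]  # every value matches
```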

EDIT: If upgrading to the latest version of pandas does not help, here is one solution that replaces all non-string values in object columns with missing values:

import numpy as np
import pandas as pd

data = [
{'id': 1, 'content': [{'values': 3}]},
{'id': 2, 'content': 'a1'},
{'id': 3, 'content': 'c_d'},
{'id': 4, 'content': np.array([4,5])}]

df = pd.DataFrame(data)

mask1 = ~df.columns.isin(df.select_dtypes(object).columns)
mask2 = df.applymap(lambda x: isinstance(x, str))

df = df.where(mask2 | mask1)
print (df)
id content
0 1 NaN
1 2 a1
2 3 c_d
3 4 NaN

val = ['a1', 'c_d', 'e10']
print (df.isin(val))
id content
0 False False
1 False True
2 False True
3 False False
