Find Column Whose Name Contains a Specific String

Iterate over DataFrame.columns. The following example builds a list of the column names that match:

import pandas as pd

data = {'spike-2': [1,2,3], 'hey spke': [4,5,6], 'spiked-in': [7,8,9], 'no': [10,11,12]}
df = pd.DataFrame(data)

spike_cols = [col for col in df.columns if 'spike' in col]
print(list(df.columns))
print(spike_cols)

Output:

['spike-2', 'hey spke', 'spiked-in', 'no']
['spike-2', 'spiked-in']

Explanation:

df.columns returns an Index of the column names, which can be iterated like a list.
[col for col in df.columns if 'spike' in col] iterates over df.columns with the variable col and adds col to the resulting list if it contains 'spike'. This syntax is called a list comprehension.
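
The comprehension above assumes that every column label is a string and that case matters. A minimal sketch of a more defensive variant follows; the helper name columns_containing and the sample frame are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

def columns_containing(df, needle, case=True):
    """Return the column labels of df whose text contains needle."""
    matches = []
    for col in df.columns:
        label = str(col)  # non-string labels (e.g. integers) are compared by their text
        target = needle
        if not case:
            label, target = label.lower(), needle.lower()
        if target in label:
            matches.append(col)
    return matches

# hypothetical frame with a capitalised label and an integer label
df = pd.DataFrame({'spike-2': [1], 'Spiked-in': [2], 0: [3], 'no': [4]})
case_sensitive = columns_containing(df, 'spike')
case_insensitive = columns_containing(df, 'spike', case=False)
print(case_sensitive)
print(case_insensitive)
```

Converting each label with str() keeps the 'in' test from raising a TypeError on integer labels.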

If you only want the resulting data set with the columns that match, you can do this:

df2 = df.filter(regex='spike')
print(df2)

Output:

   spike-2  spiked-in
0        1          7
1        2          8
2        3          9
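
DataFrame.filter also accepts like= for a plain substring test, and the regex= pattern can be anchored when a substring match is too loose. A short sketch on the same sample data (the anchored pattern is only an illustration):

```python
import pandas as pd

df = pd.DataFrame({'spike-2': [1, 2, 3], 'hey spke': [4, 5, 6],
                   'spiked-in': [7, 8, 9], 'no': [10, 11, 12]})

# like= keeps labels that contain the given substring
by_like = df.filter(like='spike')

# regex= uses re.search, so the pattern may match anywhere in the label
by_regex = df.filter(regex='spike')

# anchoring the pattern requires the whole label to have the form spike-<digits>
anchored = df.filter(regex=r'^spike-\d+$')

print(list(by_like.columns))
print(list(by_regex.columns))
print(list(anchored.columns))
```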

Find column whose name contains a specific value that is in a fixed column

It is not clear whether you want the column names that contain the string, or the names of the columns in which at least one value contains the string.

If the dataframe is:

In [1]: import pandas as pd
   ...: df = pd.DataFrame({'a_1': ['b_1', 'b_2'], 'b_1': ['a_1', 'a_2']})
In [2]: df
Out[2]: 
   a_1  b_1
0  b_1  a_1
1  b_2  a_2

For the first case, to find all the column names that match the pattern a_.*:

In [3]: import re
In [4]: columns = [col for col in df.columns if isinstance(col, str) and re.match('a_.*', col)]
In [5]: columns
Out[5]: ['a_1']

For the second case, to find all the columns in which at least one value matches a_.*:

In [6]: columns = [col for col, ser in df.items() if ser.str.match('a_.*').any()]
In [7]: columns
Out[7]: ['b_1']

In which:

df.items: returns an iterator of (column name, column values as a Series) pairs. It replaces df.iteritems, which was removed in pandas 2.0.

Series.any: returns True if any value in the Series is True.
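
The .str accessor raises an AttributeError on non-string columns, so the value search above only works when every column holds strings. A hedged sketch that casts each column to text first (the extra numeric column n is an assumption added for illustration):

```python
import pandas as pd

df = pd.DataFrame({'a_1': ['b_1', 'b_2'], 'b_1': ['a_1', 'a_2'], 'n': [1, 2]})

cols_with_match = [
    col
    for col, ser in df.items()
    # casting to str keeps the .str accessor from raising on numeric columns
    if ser.astype(str).str.match(r'a_.*').any()
]
print(cols_with_match)
```

Note that astype(str) turns missing values into the text 'nan', which matters only if the pattern could match that text.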

Creating new column from columns whose name contains a specific string

You can use:

# get columns with "Time" in the name
cols = list(df.filter(like='Time'))
# ['Run_Time', 'Rest_Time']

# add the value of df['Temp']
df[cols] = df[cols].add(df['Temp'], axis=0)

Output:

   Run_Time  Temp  Rest_Time
0        70    10         15
1        40    20         25
2        60    30         35
3        95    50         55
4       130    60         65
5       200   100        105
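
The input frame is not shown in the answer. The sketch below assumes it was the output above with Temp subtracted from each Time column, and adds a hypothetical Total_Time column to show how a new column can be derived from the matched ones:

```python
import pandas as pd

# Assumed input: reconstructed by subtracting Temp from the output shown above
df = pd.DataFrame({
    'Run_Time': [60, 20, 30, 45, 70, 100],
    'Temp': [10, 20, 30, 50, 60, 100],
    'Rest_Time': [5, 5, 5, 5, 5, 5],
})

# iterating a DataFrame yields its column labels, so list() gives the matching names
cols = list(df.filter(like='Time'))

# axis=0 aligns df['Temp'] with the rows, adding it to every selected column
df[cols] = df[cols].add(df['Temp'], axis=0)

# hypothetical extra step: a new column built from the matched columns
df['Total_Time'] = df[cols].sum(axis=1)
print(df)
```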

Select columns based on columns names containing a specific string in pandas

Alternative methods, using the string methods of df.columns to build a boolean mask:

In [13]: df.loc[:, df.columns.str.startswith('alp')]
Out[13]:
       alp1      alp2
0  0.357564  0.108907
1  0.341087  0.198098
2  0.416215  0.644166
3  0.814056  0.121044
4  0.382681  0.110829
5  0.130343  0.219829
6  0.110049  0.681618
7  0.949599  0.089632
8  0.047945  0.855116
9  0.561441  0.291182

In [14]: df.loc[:, df.columns.str.contains('alp')]
Out[14]:
       alp1      alp2
0  0.357564  0.108907
1  0.341087  0.198098
2  0.416215  0.644166
3  0.814056  0.121044
4  0.382681  0.110829
5  0.130343  0.219829
6  0.110049  0.681618
7  0.949599  0.089632
8  0.047945  0.855116
9  0.561441  0.291182
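
The two calls return the same columns here only because every match happens to be at the start of the name. A small sketch of where they differ, on an illustrative frame whose column names are assumptions; it also shows that str.contains treats its pattern as a regular expression unless regex=False is passed:

```python
import pandas as pd

# Illustrative frame; the column names are chosen to show the differences
df = pd.DataFrame({'alp1': [1], 'alp2': [2], 'x_alp': [3], 'a.p': [4], 'beta': [5]})

starts = df.loc[:, df.columns.str.startswith('alp')]
contains = df.loc[:, df.columns.str.contains('alp')]

# by default '.' is a regex wildcard, so 'a.p' also matches 'alp'
as_regex = df.loc[:, df.columns.str.contains('a.p')]
# regex=False makes the pattern a literal substring
literal = df.loc[:, df.columns.str.contains('a.p', regex=False)]

print(list(starts.columns))
print(list(contains.columns))
print(list(as_regex.columns))
print(list(literal.columns))
```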

Find column name containing a string after a punctuation

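To get the list of columns that contain mango after an underscore _ in your dataframe df, you can either test for the substring:

mango_list = [word for word in df.columns if '_mango' in word]

or compare the second underscore-separated token (the slice avoids an IndexError for names without an underscore):

mango_list = [word for word in df.columns if word.split("_")[1:2] == ["mango"]]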
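
The two approaches are not equivalent. The sketch below compares them on hypothetical column names and adds a third variant, an assumption beyond the original answer, that checks only the text after the last underscore:

```python
import pandas as pd

# hypothetical column names chosen to expose the differences
df = pd.DataFrame(columns=['red_mango', 'red_mangosteen', 'very_ripe_mango',
                           'mango_juice', 'apple'])

# substring test: also matches longer words that start with 'mango'
substring = [c for c in df.columns if '_mango' in c]

# second token test: the slice returns [] instead of raising when there is no underscore
second_token = [c for c in df.columns if c.split('_')[1:2] == ['mango']]

# suffix test: rsplit with maxsplit=1 isolates the text after the last underscore
suffix = [c for c in df.columns if '_' in c and c.rsplit('_', 1)[1] == 'mango']

print(substring)
print(second_token)
print(suffix)
```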

Pandas Find name of column in which a string is found

Definitely not the most elegant answer, but it does the trick:

word = 'Giraffe'
df.columns[df[df==word].notna().sum()>0][0]

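This returns 'Animal' as a string.

It only works under the assumption that exactly one column can contain the word: with several matching columns only the first is returned, and with none the [0] lookup raises an IndexError.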
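
If more than one column may contain the word, or none at all, a variant that collects every match is safer. A minimal sketch on illustrative data (the frame used in the original answer is not shown, so this one is an assumption):

```python
import pandas as pd

# Illustrative data; only the 'Animal' column name comes from the answer above
df = pd.DataFrame({
    'Animal': ['Giraffe', 'Lion'],
    'Keeper': ['Ann', 'Bob'],
    'Mascot': ['Tiger', 'Giraffe'],
})

word = 'Giraffe'
# (df == word) is a boolean frame; .any() reduces it to one flag per column
hits = df.columns[(df == word).any()].tolist()
# fall back to None instead of raising when nothing matches
first_hit = hits[0] if hits else None
print(hits)
```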
