How to Get a List of Keywords in Python

Is it possible to get a list of keywords in Python?

You asked about statements, while showing keywords in your output example.

If you're looking for keywords, they're all listed in the keyword module:

>>> import keyword
>>> keyword.kwlist
['and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif',
'else', 'except', 'exec', 'finally', 'for', 'from', 'global', 'if', 'import',
'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try',
'while', 'with', 'yield']

From the keyword.kwlist doc:

Sequence containing all the keywords defined for the interpreter. If any keywords are defined to only be active when particular __future__ statements are in effect, these will be included as well.
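The listing above includes exec and print, which suggests it came from a Python 2 interpreter. On Python 3 those two are no longer keywords, and names such as True, False, None and nonlocal are. A minimal sketch of querying the running interpreter, assuming Python 3 (the variable names are illustrative only):

```python
import keyword

# Hard keywords reserved by the interpreter that runs this script.
print(keyword.kwlist)

# iskeyword() answers the membership question for a single name.
is_for_reserved = keyword.iskeyword("for")
is_spam_reserved = keyword.iskeyword("spam")

# Soft keywords (such as "match") are only exposed on Python 3.9 and later;
# getattr keeps this line working on older versions.
soft_keywords = getattr(keyword, "softkwlist", [])
print(soft_keywords)
```

Printing the list on your own interpreter is the most reliable way to see which keywords apply to your version.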

Search for list of keywords in python

For a minimal change to get this working, you can change any(keywords) in string_tags to the following:

any(keyword in string_tags for keyword in keywords)

Or an alternative using sets:

keywords = {'diy', 'decorate', 'craft', 'home decor', 'food'}

def get_tags(blog_soup):
    tags_html = blog_soup.find('div', attrs={'style': 'margin-left: 60px; margin-bottom: 15px;'})
    tags = [tag.string for tag in tags_html.findAll('a')]
    # A non-empty intersection means at least one tag is a keyword.
    if keywords.intersection(tags):
        print(url)  # url is assumed to be defined elsewhere in your code
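To see why any(keywords) does not do what was intended, here is a small sketch that runs without BeautifulSoup. The tag string and tag list are made-up sample data, not taken from the question.

```python
keywords = {"diy", "decorate", "craft", "home decor", "food"}
string_tags = "travel, food, photography"      # hypothetical tag string
tag_list = ["travel", "food", "photography"]   # hypothetical tag list

# any(keywords) only asks "is any keyword a non-empty string?", so it is
# True regardless of what the tags contain.
always_true = any(keywords)

# Substring test of each keyword against one string of tags.
found_in_string = any(keyword in string_tags for keyword in keywords)

# Exact-match test against a list of tags; the result is the set of hits.
matched = keywords.intersection(tag_list)

print(always_true, found_in_string, matched)
```

Note that the two fixes are not equivalent: the any(...) version matches substrings, while the set intersection only matches whole tags.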

Python: How to use list of keywords to search for a string in a text

You could replace

if (keywords in text):
    ...

with

if any(keyword in text for keyword in keywords):
    ...
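If you also want to know which keywords matched, or want to ignore case, a sketch along these lines may help. The sample keywords and text are hypothetical.

```python
keywords = ["python", "keyword", "tuple"]           # hypothetical keywords
text = "How to get a List of Keywords in Python"    # hypothetical text

# Lower-case both sides so that "Python" and "python" compare equal.
text_lower = text.lower()
matched = [kw for kw in keywords if kw.lower() in text_lower]

# An empty list is falsy, so this mirrors the any(...) check.
has_match = bool(matched)

print(matched, has_match)
```

This is still substring matching, so "keyword" also matches inside "Keywords".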

How to get all elements in a python list between key word elements?

You could use two indices (e.g. ind1, ind2) to find the key words. Then, slice the original list with the aid of these two indices.

test_list = ['garbage','######## KEY WORD ONE ####', 'data1', 'data2', '### KEY WORD TWO ######', 'junk']

ind1 = -1
ind2 = -1
for ind, item in enumerate(test_list):
    if "KEY WORD ONE" in item:
        ind1 = ind
    if "KEY WORD TWO" in item:
        ind2 = ind
    if ind1 != -1 and ind2 != -1:
        break

result_list = test_list[ind1+1:ind2]

print(result_list)

Output:

['data1', 'data2']

One loop is enough to find both indices, and you can break out of it as soon as both key words have been found. I guess this is faster than calling the built-in index method twice, which would scan the list twice to find both indices. However, using index would take fewer lines.
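One caveat: list.index looks for an element that is equal to its argument, so it cannot find a marker that is only part of an element. A possible variant, using a hypothetical helper named find_marker, keeps the substring test and also handles a missing marker explicitly:

```python
test_list = ['garbage', '######## KEY WORD ONE ####', 'data1', 'data2',
             '### KEY WORD TWO ######', 'junk']

def find_marker(items, marker, start=0):
    """Return the index of the first item at or after start that contains marker, else -1."""
    for ind in range(start, len(items)):
        if marker in items[ind]:
            return ind
    return -1

ind1 = find_marker(test_list, "KEY WORD ONE")
# Only search for the second marker after the first one.
ind2 = find_marker(test_list, "KEY WORD TWO", ind1 + 1) if ind1 != -1 else -1

# Slicing with -1 would silently give misleading results, so fall back to [].
result_list = test_list[ind1 + 1:ind2] if ind2 != -1 else []

print(result_list)
```

The empty-list fallback is a design choice for this sketch; raising an exception may suit your case better.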

List of python keywords

You are better off using the keyword module:

>>> import keyword
>>> keyword.kwlist
['and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'exec', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try', 'while', 'with', 'yield']

How to search for a keyword in a list of strings, and return that string?

Try this:

list_ = ["The man walked the dog", "The lady walked the dog","Dogs are cool", "Cats are interesting creatures", "Cats and Dogs was an interesting movie", "The man has a brown dog"]
l1 = [k for k in list_ if 'man' in k and 'dog' in k]

Output (the value of l1):

['The man walked the dog', 'The man has a brown dog']

Note: Avoid using list as a variable name, because it shadows the built-in list type.
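If the number of required words varies, the chained and conditions can be replaced with all(). A minimal sketch, where sentences and required are illustrative names:

```python
sentences = ["The man walked the dog", "The lady walked the dog",
             "Dogs are cool", "Cats are interesting creatures",
             "Cats and Dogs was an interesting movie",
             "The man has a brown dog"]
required = ["man", "dog"]  # every word here must appear in the sentence

# all() is True only when each required word is a substring of the sentence.
matches = [s for s in sentences if all(word in s for word in required)]

print(matches)
```

This is a case-sensitive substring test, so "man" would also match inside "woman", and "dog" does not match "Dogs".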

search keywords in a list and create new list

The problem with your solution is that you search for the records one by one, and for each record you append a negative result for all the other values, not only for the one being searched.

What you should do instead is check whether the keyword is present and take its value, or append a negative value for that particular keyword only.

Here is a possible solution

data = ['CUSTOMER/client1', 'DC/Dc1', 'OS/windows', 'PRODUCT/p1', '']
newdata = []

# For each keyword, keep the first matching entry or a "No ..." placeholder.
customer = [s for s in data if "CUSTOMER" in s]
newdata.append(customer[0] if customer else "No Customer")

dc = [s for s in data if "DC" in s]
newdata.append(dc[0] if dc else "No DC")

# Trailing underscores avoid shadowing the os module and the built-in type.
os_ = [s for s in data if "OS" in s]
newdata.append(os_[0] if os_ else "No OS")

product = [s for s in data if "PRODUCT" in s]
newdata.append(product[0] if product else "No Product")

type_ = [s for s in data if "TYPE" in s]
newdata.append(type_[0] if type_ else "No Type")

Here is the output

['CUSTOMER/client1', 'DC/Dc1', 'OS/windows', 'PRODUCT/p1', 'No Type']

This solution, though, has O(n^2) complexity: for each of the n values you look for, you iterate over the whole list (another n iterations).


To reduce the complexity of the algorithm you can convert the list into a dictionary, which has O(1) average-case lookup.

To convert the list to dict you may do the following.

data = dict(d.split('/') for d in data if "/" in d)

You will have the following

{'CUSTOMER': 'client1', 'DC': 'Dc1', 'OS': 'windows', 'PRODUCT': 'p1'}

Now you can iterate through the keywords and get whatever you want:

data = dict(d.split('/') for d in data if "/" in d)

keywords = ['CUSTOMER', 'DC', 'OS', 'PRODUCT', 'TYPE']
newdata = []

for k in keywords:
    newdata.append(f"{k}/{data[k]}" if k in data else f"No {k}")

You'll get:

['CUSTOMER/client1', 'DC/Dc1', 'OS/windows', 'PRODUCT/p1', 'No TYPE']

In this case you iterate over the initial data list once (n iterations) and over the keywords once (again about n iterations), so the final complexity is O(n).


In this particular example the list is short, so you shouldn't worry too much about complexity, but in similar cases with more values the dictionary-based approach would save some time.
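The dictionary approach can also be wrapped in a function so that the original list is not overwritten. This is only a sketch: extract_fields is a hypothetical name, and split("/", 1) is a small change from the code above so that values containing a slash do not break the conversion.

```python
def extract_fields(data, keywords):
    """Return 'KEY/value' for each keyword found in data, else 'No KEY'."""
    # Split on the first "/" only, so a value such as "a/b" stays intact.
    lookup = dict(d.split("/", 1) for d in data if "/" in d)
    return [f"{k}/{lookup[k]}" if k in lookup else f"No {k}" for k in keywords]

data = ['CUSTOMER/client1', 'DC/Dc1', 'OS/windows', 'PRODUCT/p1', '']
keywords = ['CUSTOMER', 'DC', 'OS', 'PRODUCT', 'TYPE']

newdata = extract_fields(data, keywords)
print(newdata)
```

Because the lookup is an exact key match, "OS" can no longer accidentally match an entry that merely contains those letters.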


