How to Check If a Word Is an English Word with Python

How to check if a word is an English word with Python?

For (much) more power and flexibility, use a dedicated spellchecking library like PyEnchant. There's a tutorial, or you could just dive straight in:

>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.check("Hello")
True
>>> d.check("Helo")
False
>>> d.suggest("Helo")
['He lo', 'He-lo', 'Hello', 'Helot', 'Help', 'Halo', 'Hell', 'Held', 'Helm', 'Hero', "He'll"]
>>>

PyEnchant comes with a few dictionaries (en_GB, en_US, de_DE, fr_FR), but can use any of the OpenOffice ones if you want more languages.
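If a word should count as English when any one of several dictionaries accepts it (say, both en_US and en_GB), the check can be wrapped in a small helper. This is only a minimal sketch and not part of PyEnchant: `is_english_word` and `SetChecker` are hypothetical names, and the helper assumes nothing beyond each object having a `check(word)` method like `enchant.Dict`. As far as I know, `Dict.check` raises `ValueError` on an empty string, so the helper rejects empty input up front.

```python
def is_english_word(word, checkers):
    """Return True if any checker accepts the word.

    `checkers` is any iterable of objects with a check(word) -> bool method,
    e.g. [enchant.Dict("en_US"), enchant.Dict("en_GB")] (assumed installed).
    """
    word = word.strip()
    if not word:
        # Empty input is never a word; this also avoids passing "" to check().
        return False
    return any(checker.check(word) for checker in checkers)


class SetChecker:
    """Stand-in for enchant.Dict, backed by a plain set (demo only)."""

    def __init__(self, known_words):
        self.known_words = set(known_words)

    def check(self, word):
        return word in self.known_words


if __name__ == "__main__":
    us = SetChecker({"color", "hello"})
    gb = SetChecker({"colour", "hello"})
    print(is_english_word("colour", [us, gb]))
    print(is_english_word("", [us, gb]))
```

With real dictionaries you would pass the `enchant.Dict` objects instead of the stand-ins; the helper itself does not change.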

There appears to be a pluralisation library called inflect, but I've no idea whether it's any good.

How to check if an input is a valid English word using PyDictionary

If you have to use PyDictionary, you can look up the meaning of a word and wrap the result in bool():

from PyDictionary import PyDictionary
dictionary = PyDictionary()
valid_word = bool(dictionary.meaning("hdaoij")) # False
valid_word = bool(dictionary.meaning("hello")) # True

or

valid_word = bool(dictionary.meaning(input("enter a word: ")))

Otherwise, I would use the check function in enchant, as in @RoadJDK's answer.

Is there a way to check with Python if a string from a list is a real word used in common English?

First, install nltk (pip install nltk).

Then:

import nltk
nltk.download('words')

from nltk.corpus import words

samplewords=['apple','a%32','j & quod','rectangle','house','fsdfdsoij','fdfd']

english_words = set(words.words())  # build the set once; words.words() is a list
[i for i in samplewords if i in english_words]

['apple', 'rectangle', 'house']
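Since `words.words()` returns a list, every `in` test against it scans the whole corpus. A rough sketch of a set-based, case-insensitive variant follows. `filter_known_words` is a hypothetical helper name, and the vocabulary is passed in as a parameter so the example runs with a tiny stand-in list; in practice you would pass `words.words()`.

```python
def filter_known_words(candidates, vocabulary):
    """Keep the candidates that appear in the vocabulary (case-insensitive)."""
    # Build the lookup set once; membership tests on a set are O(1) on average.
    known = {entry.lower() for entry in vocabulary}
    return [c for c in candidates if c.lower() in known]


if __name__ == "__main__":
    # Tiny stand-in vocabulary; with NLTK you would pass words.words() instead.
    vocabulary = ["apple", "house", "rectangle"]
    samplewords = ['apple', 'a%32', 'j & quod', 'Rectangle', 'house', 'fsdfdsoij']
    print(filter_known_words(samplewords, vocabulary))
    # ['apple', 'Rectangle', 'house']
```

Lowercasing both sides is a design choice: it lets "Rectangle" match "rectangle", but it also makes a capitalised proper noun in the vocabulary match its lowercase form.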

How to determine if a string is an English word?

You can simply use the pyenchant library, as mentioned in this post:

import enchant
d = enchant.Dict("en_US")
print(d.check("Hello"))

Output:

True

You can install it by typing pip install pyenchant in your command line. In your case, you have to loop through all the space-separated substrings in the string and check whether each one is an English word. Here is the full code to do it:

import enchant
d = enchant.Dict("en_US")

# Raw string, so that the backslash in \int is kept literally
string = r"Taking the derivative of: f(x) = \int_{0}^{1} z^3, we can see that we always get x^2 = y_2 + 4 which is the same as taking the double integral of g(x)"

stringlst = string.split(' ')
wordlst = []

# Keep only the pieces that the dictionary accepts
for word in stringlst:
    if d.check(word):
        wordlst.append(word)

print(wordlst)

Output:

['Taking', 'the', 'derivative', 'we', 'can', 'see', 'that', 'we', 'always', 'get', '4', 'which', 'is', 'the', 'same', 'as', 'taking', 'the', 'double', 'integral', 'of']
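Note that splitting on spaces leaves punctuation attached, which is why "of:" is missing from the output above even though "of" is a word. One possible refinement, sketched here under assumptions, is to pull out alphabetic tokens with a regular expression before checking them. `extract_words` and `TOKEN_PATTERN` are hypothetical names, and `check` is any callable that returns a bool, such as `d.check`; the demo uses a small stand-in set so it runs without pyenchant.

```python
import re

# Runs of ASCII letters, optionally with one internal apostrophe (e.g. "don't").
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def extract_words(text, check):
    """Return the alphabetic tokens in `text` for which check(token) is true."""
    return [token for token in TOKEN_PATTERN.findall(text) if check(token)]


if __name__ == "__main__":
    # Stand-in for d.check from the answer above (demo vocabulary only).
    known = {"taking", "the", "derivative", "of"}
    text = r"Taking the derivative of: f(x) = \int_{0}^{1} z^3"
    print(extract_words(text, lambda token: token.lower() in known))
    # ['Taking', 'the', 'derivative', 'of']
```

One caveat: fragments of formulas such as "x" or "int" become separate tokens, so with a real dictionary you may also want a minimum-length filter.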

Hope that this helps!

How to check to see if a string is contained in any english word?

Based on the solution in the linked answer.

We can define the following utility function using the Dict.suggest method:

def is_part_of_existing_word(string, words_dictionary):
    suggestions = words_dictionary.suggest(string)
    return any(string in suggestion
               for suggestion in suggestions)

Then simply:

>>> import enchant
>>> english_dictionary = enchant.Dict("en")
>>> is_part_of_existing_word('wat', words_dictionary=english_dictionary)
True
>>> is_part_of_existing_word('wate', words_dictionary=english_dictionary)
True
>>> is_part_of_existing_word('way', words_dictionary=english_dictionary)
True
>>> is_part_of_existing_word('wayt', words_dictionary=english_dictionary)
False
>>> is_part_of_existing_word('wayter', words_dictionary=english_dictionary)
False
>>> is_part_of_existing_word('wayterlx', words_dictionary=english_dictionary)
False
>>> is_part_of_existing_word('lackjack', words_dictionary=english_dictionary)
True
>>> is_part_of_existing_word('ucumber', words_dictionary=english_dictionary)
True
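Because suggest returns only a limited list of near spellings, the function above can miss words that contain the string but are spelled quite differently from it. If a full word list is available (for instance the NLTK words corpus), a direct substring scan is a different technique that avoids that limit. The following is only a sketch: `words_containing` and `is_part_of_any_word` are hypothetical names, and the word list is passed in so the demo runs on a tiny stand-in list.

```python
def words_containing(fragment, word_list, limit=5):
    """Return up to `limit` words from word_list that contain `fragment`."""
    fragment = fragment.lower()
    matches = []
    for word in word_list:
        if fragment in word.lower():
            matches.append(word)
            if len(matches) >= limit:
                break  # stop early once enough matches are found
    return matches


def is_part_of_any_word(fragment, word_list):
    """True if at least one word in word_list contains `fragment`."""
    return bool(words_containing(fragment, word_list, limit=1))


if __name__ == "__main__":
    # Tiny stand-in list; a real run would pass a full English word list.
    sample = ["water", "blackjack", "cucumber", "wayward"]
    print(is_part_of_any_word("ucumber", sample))  # True
    print(is_part_of_any_word("wayt", sample))     # False
    print(words_containing("wa", sample))          # ['water', 'wayward']
```

The scan is linear in the size of the word list, which is usually fine for a one-off check; for many queries a precomputed index would be the next step.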

How to check for English words in a list

  • Using from nltk.corpus import wordnet with wordnet.synsets did not successfully identify English words.

    • All of the words in word_list were identified as True.
  • Whether an English word is identified successfully depends on the dictionary in use.

Use the following code:

from nltk.corpus import words

def check_words(word_list: list):
    english_words = set(words.words())  # build the lookup set once
    for word in word_list:
        print(word in english_words)

word_list = ['poisson', 'stark', 'nihongo', 'abstract', 'pedo']
check_words(word_list)

Output:

False
True
False
True
False

