Extracting Nouns and Verbs from Text

Extracting all Nouns from a text file using nltk

If you are open to options other than NLTK, check out TextBlob. It extracts all nouns and noun phrases easily:

>>> from textblob import TextBlob
>>> txt = """Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages."""
>>> blob = TextBlob(txt)
>>> print(blob.noun_phrases)
[u'natural language processing', 'nlp', u'computer science', u'artificial intelligence', u'computational linguistics']
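The snippet above only shows noun phrases. For single-word nouns, one option is to filter TextBlob's `tags` property, which yields (word, tag) pairs, by the `NN` tag prefix. The following is a minimal sketch, not part of the original answer: the helper names `nouns_from_tags` and `textblob_nouns` and the sample tags are made up for illustration.

```python
def nouns_from_tags(tags):
    """Keep words whose Penn Treebank tag starts with 'NN' (NN, NNS, NNP, NNPS)."""
    return [word for word, tag in tags if tag.startswith("NN")]


def textblob_nouns(text):
    """Tag `text` with TextBlob and return its single-word nouns."""
    # Imported lazily so nouns_from_tags stays usable without TextBlob installed.
    from textblob import TextBlob
    return nouns_from_tags(TextBlob(text).tags)


# Hand-written tags standing in for TextBlob(...).tags output (illustrative only).
sample_tags = [("computer", "NN"), ("science", "NN"), ("is", "VBZ"), ("fun", "JJ")]
print(nouns_from_tags(sample_tags))  # ['computer', 'science']
```

With TextBlob and its corpora installed, `textblob_nouns(txt)` would run the same filter on real tagger output.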

Extracting most common nouns and verbs from category using numpy and NLTK

Thanks for your help. This is what I ended up with, and it served my purpose:

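import nltk
from nltk import word_tokenize

# text_reviews is a pandas DataFrame with "Trip Type" and "text" columns
ByTripType = text_reviews.groupby("Trip Type")

def findtags(tag_prefix, tagged_text):
    # Count words per tag, keeping only tags that start with tag_prefix
    cfd = nltk.ConditionalFreqDist((tag, word) for (word, tag) in tagged_text if tag.startswith(tag_prefix))
    # Map each matching tag to its 10 most common words
    return dict((tag, cfd[tag].most_common(10)) for tag in cfd.conditions())

for name, group in ByTripType:
    # Join all reviews of this trip type into one lowercase string
    sentences = group['text'].str.cat(sep = ' ')
    sentences = sentences.lower()
    # remove_punctuation is a helper defined elsewhere; assumed to return the cleaned string
    sentences = remove_punctuation(sentences)
    text = word_tokenize(sentences)
    tagged = nltk.pos_tag(text)
    # 'NN' also matches NNS, NNP and NNPS; 'VBP' matches only present-tense, non-3rd-person verbs
    for i in ('NN', 'VBP'):
        tagdict = findtags(i, tagged)
        print(name, tagdict)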
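To make the shape of the result easier to see, here is a rough standard-library sketch of the same counting idea. It is not the code from the answer: `findtags_plain` is a hypothetical name, and the tagged pairs are hand-made stand-ins for `nltk.pos_tag` output.

```python
from collections import Counter, defaultdict


def findtags_plain(tag_prefix, tagged_text, n=10):
    """Map each tag starting with tag_prefix to its n most common words."""
    counts = defaultdict(Counter)
    for word, tag in tagged_text:
        if tag.startswith(tag_prefix):
            counts[tag][word] += 1
    return {tag: counter.most_common(n) for tag, counter in counts.items()}


# Hand-made (word, tag) pairs standing in for nltk.pos_tag output (illustrative only).
tagged = [("hotel", "NN"), ("rooms", "NNS"), ("hotel", "NN"), ("pool", "NN"), ("love", "VBP")]
result = findtags_plain("NN", tagged)
print(result)
```

Note that the keys are the full tags, so the prefix 'NN' produces separate lists for NN and NNS rather than one combined noun list.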

Extract nouns and verbs using NLTK

Use NLTK's POS tagger:

>>> import nltk
>>> text = nltk.word_tokenize("They refuse to permit us to obtain the refuse permit")
>>> pos_tagged = nltk.pos_tag(text)
>>> pos_tagged
[('They', 'PRP'), ('refuse', 'VBP'), ('to', 'TO'), ('permit', 'VB'), ('us', 'PRP'),
('to', 'TO'), ('obtain', 'VB'), ('the', 'DT'), ('refuse', 'NN'), ('permit', 'NN')]
>>> nouns = list(filter(lambda x: x[1] == 'NN', pos_tagged))
>>> nouns
[('refuse', 'NN'), ('permit', 'NN')]

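NLTK uses Penn Treebank tags: noun tags start with NN (NN, NNS, NNP, NNPS) and verb tags start with VB (VB, VBD, VBG, VBN, VBP, VBZ). Comparing against 'NN' alone keeps only singular common nouns, so match on the tag prefix if you want every noun or verb form.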
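As a small sketch of prefix matching, the snippet below reuses the tagged output shown above as a literal, so it runs without NLTK. The helper name `words_with_prefix` is made up for this example.

```python
# Tagged output copied from the example above.
pos_tagged = [('They', 'PRP'), ('refuse', 'VBP'), ('to', 'TO'), ('permit', 'VB'), ('us', 'PRP'),
              ('to', 'TO'), ('obtain', 'VB'), ('the', 'DT'), ('refuse', 'NN'), ('permit', 'NN')]


def words_with_prefix(tagged, prefix):
    """Return the words whose tag starts with the given prefix."""
    return [word for word, tag in tagged if tag.startswith(prefix)]


nouns = words_with_prefix(pos_tagged, 'NN')
verbs = words_with_prefix(pos_tagged, 'VB')
print(nouns)  # ['refuse', 'permit']
print(verbs)  # ['refuse', 'permit', 'obtain']
```

Here 'refuse' and 'permit' appear in both lists because the tagger assigns them different tags depending on their position in the sentence.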

NOTE:
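If you have not yet downloaded the punkt and averaged_perceptron_tagger resources for nltk, you may need to do so first. Newer NLTK releases may ask for differently named resources; the LookupError message names the one to download.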

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

Extract verbs from sentence in R?

You can get them with the udpipe_annotate function from the udpipe library:

library(udpipe)
ud_model <- udpipe_download_model(language = "english")
ud_model <- udpipe_load_model(ud_model$file_model)
system.time(
  x <- udpipe_annotate(ud_model, x = df$recipe_name, doc_id = df$id)
)
x <- as.data.frame(x)
# Keep tokens whose Penn Treebank tag (xpos) contains NN (nouns) or VB (verbs)
abc <- c("NN","VB")
stats <- dplyr::filter(x, grepl(pattern = paste(abc, collapse = "|"), x = xpos, ignore.case = TRUE))

You can also add other word types (part-of-speech tags) from this list to abc.