Twitter sentiment analysis with R using the German language set SentiWS

This may work for you:

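# Read a SentiWS file and flatten it into a plain vector of lowercase words.
# Each line looks like: Wort|POS<TAB>weight<TAB>inflection1,inflection2,...
readAndflattenSentiWS <- function(filename) {
  words <- readLines(filename, encoding = "UTF-8")
  # Replace the "|POS<TAB>weight<TAB>" part with a comma, so the base form
  # and its inflections end up in one comma-separated list.
  words <- sub("\\|[A-Z]+\t[0-9.-]+\t?", ",", words)
  words <- unlist(strsplit(words, ","))
  words <- tolower(words)
  return(words)
}

# Combine the English word lists with the German SentiWS word lists.
pos.words <- c(scan("positive-words.txt", what = "character", comment.char = ";", quiet = TRUE),
               readAndflattenSentiWS("SentiWS_v1.8c_Positive.txt"))
neg.words <- c(scan("negative-words.txt", what = "character", comment.char = ";", quiet = TRUE),
               readAndflattenSentiWS("SentiWS_v1.8c_Negative.txt"))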

score.sentiment = function(sentences, pos.words, neg.words, .progress='none') {
# ... see OP ...
}

sample <- c("ich liebe dich. du bist wunderbar",
            "Ich hasse dich, geh sterben!",
            "i love you. you are wonderful.",
            "i hate you, die.")
(test.sample <- score.sentiment(sample,
                                pos.words,
                                neg.words))
#   score                              text
# 1     2 ich liebe dich. du bist wunderbar
# 2    -2      ich hasse dich, geh sterben!
# 3     2    i love you. you are wonderful.
# 4    -2                  i hate you, die.

Sentiment analysis of non-English texts

As Andy pointed out above, the best approach would be to train your own classifier. A quicker and dirtier alternative is to use a German sentiment lexicon such as SentiWS and compute the polarity of a sentence from the polarity values of its individual words, for example by summing them. This method isn't foolproof (it doesn't take negation into account, for example, so "nicht gut" would still count as positive), but it gives reasonable results relatively quickly.
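The summing idea could look roughly like this in R. This is only a minimal sketch: it assumes the SentiWS line format `Wort|POS<TAB>weight<TAB>inflections`, the helper names `parseSentiWS` and `scoreSentence` are made up for illustration, and the three lexicon lines with their weights are invented toy data, not values from the real SentiWS files.

```r
# Hypothetical helper: turn SentiWS-style lines into a named numeric vector
# that maps every lowercase word form (base form and inflections) to its weight.
parseSentiWS <- function(lines) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  weights <- lapply(fields, function(f) {
    base   <- sub("\\|.*$", "", f[1])   # drop the "|POS" suffix
    weight <- as.numeric(f[2])
    forms  <- if (length(f) >= 3) strsplit(f[3], ",", fixed = TRUE)[[1]] else character(0)
    forms  <- tolower(c(base, forms))
    setNames(rep(weight, length(forms)), forms)
  })
  unlist(weights)
}

# Hypothetical helper: sum the weights of all words found in the lexicon.
# Words that are not in the lexicon contribute nothing.
scoreSentence <- function(sentence, lexicon) {
  clean  <- gsub("[[:punct:]]+", " ", tolower(sentence))  # punctuation -> spaces
  tokens <- strsplit(trimws(clean), "\\s+")[[1]]
  hits   <- lexicon[tokens]                               # NA for unknown words
  sum(hits, na.rm = TRUE)
}

# Toy lexicon with invented weights (NOT real SentiWS values).
toy <- c("gut|ADJX\t0.5\tgute,guter,gutes",
         "wunderbar|ADJX\t0.75",
         "schlecht|ADJX\t-0.75\tschlechte,schlechter")
lexicon <- parseSentiWS(toy)

scoreSentence("Das Essen war gut, aber der Service war schlecht.", lexicon)
# 0.5 + (-0.75) = -0.25
```

With the real files, the lines would come from `readLines("SentiWS_v1.8c_Positive.txt", encoding = "UTF-8")` and its negative counterpart. Unlike a plain +1/-1 word count, this keeps the lexicon's weights, but it still ignores negation.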


