Count Occurrences of Words in ArrayList

Count occurrences of words in ArrayList

If your list of strings is not huge, the shortest implementation uses the Collections.frequency method. Each call scans the whole list, so this does one pass per unique key:

List<String> list = new ArrayList<String>();
list.add("aaa");
list.add("bbb");
list.add("aaa");

Set<String> unique = new HashSet<String>(list);
for (String key : unique) {
    System.out.println(key + ": " + Collections.frequency(list, key));
}

Output:

aaa: 2
bbb: 1
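
For larger lists, a single pass that builds a map of counts avoids rescanning the list for every key. A minimal sketch, assuming Java 8+ for Map.merge; the names FrequencyCount and countAll are made up for illustration:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FrequencyCount {

    // Counts every element in one pass; LinkedHashMap keeps first-seen order.
    static Map<String, Integer> countAll(List<String> list) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String item : list) {
            // Insert 1 for a new key, otherwise add 1 to the existing count.
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("aaa", "bbb", "aaa");
        countAll(list).forEach((key, count) -> System.out.println(key + ": " + count));
    }
}
```

For the three-element list above this prints the same two lines as the Collections.frequency version.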

Count occurrence of word in ArrayList of String

Do not call subList, and return only after iterating over the whole list:

public static int counter(List<String> comments) {
    int count = 0;
    String word = "the";
    for (String comment : comments) {
        // split the comment into words and compare each one
        String[] a = comment.split(" ");
        for (int j = 0; j < a.length; j++) {
            if (word.equals(a[j])) {
                count++;
            }
        }
        System.out.println(comment);
    }
    // return only after every comment has been checked
    return count;
}
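
The method above hard-codes the word to look for. A sketch of a parameterized variant, assuming whitespace-separated words and exact, case-sensitive matching; WordCounter and countWord are illustrative names:

```java
import java.util.Arrays;
import java.util.List;

class WordCounter {

    // Returns how many times `word` appears as a whole token across all comments.
    static int countWord(List<String> comments, String word) {
        int count = 0;
        for (String comment : comments) {
            // trim() avoids an empty leading token; \\s+ collapses repeated whitespace
            for (String token : comment.trim().split("\\s+")) {
                if (word.equals(token)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> comments = Arrays.asList("the cat sat on the mat", "the  end");
        System.out.println(countWord(comments, "the")); // prints 3
    }
}
```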

How do I count the occurrences of each word in each of sentences stored in arraylists?

  1. Let's call your ArrayList<String> list.
  2. Make a second list, list2, of String[].
  3. Split each sentence into an array of words.
  4. Count the occurrences.

The code:

ArrayList<String> list = new ArrayList<>();
//add sentences here
list.add("My first sentence sentence");
list.add("My second sentence1 sentence1");

ArrayList<String[]> list2 = new ArrayList<>();
for (String s : list) {
    list2.add(s.split(" "));
}
for (String[] s : list2) {
    // a fresh map per sentence, so counts do not carry over
    Map<String, Integer> wordCounts = new HashMap<String, Integer>();

    for (String word : s) {
        Integer count = wordCounts.get(word);
        if (count == null) {
            count = 0;
        }
        wordCounts.put(word, count + 1);
    }
    for (String key : wordCounts.keySet()) {
        System.out.println(key + ": " + wordCounts.get(key).toString());
    }
}
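
The same per-sentence count can also be written with streams. A sketch assuming Java 8+; SentenceWordCounts and countWords are hypothetical names, and the counts come back as Long because that is what Collectors.counting() produces:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SentenceWordCounts {

    // Groups the words of one sentence by value and counts each group.
    static Map<String, Long> countWords(String sentence) {
        return Arrays.stream(sentence.split(" "))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList(
                "My first sentence sentence",
                "My second sentence1 sentence1");
        for (String s : list) {
            // One map per sentence; HashMap iteration order is not guaranteed.
            System.out.println(countWords(s));
        }
    }
}
```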

How to count the number of elements of an arraylist with a certain term/word?

Map<String, Integer> wordCounts = new HashMap<String, Integer>();

// collect every distinct word, starting each count at 0
for (String s : allDocuments) {
    for (String s2 : s.split(" ")) {
        if (!wordCounts.containsKey(s2)) {
            wordCounts.put(s2, 0);
        }
    }
}

// count how many documents contain each word as a whole word;
// a substring test such as indexOf would also match "the" inside "other"
for (String k : wordCounts.keySet()) {
    for (String s : allDocuments) {
        if (Arrays.asList(s.split(" ")).contains(k)) {
            wordCounts.put(k, wordCounts.get(k) + 1);
        }
    }
}
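
The two nested passes above rescan every document for every word. A sketch of a single-pass count of how many list elements contain each word, assuming whitespace-separated words; DocumentFrequency, documentFrequency, and the sample documents are illustrative:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DocumentFrequency {

    // For each word, counts the number of documents that contain it at least once.
    static Map<String, Integer> documentFrequency(List<String> allDocuments) {
        Map<String, Integer> counts = new HashMap<>();
        for (String document : allDocuments) {
            // A set removes repeats, so each document contributes at most 1 per word.
            Set<String> words = new HashSet<>(Arrays.asList(document.split(" ")));
            for (String word : words) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("the cat and the dog", "another cat", "the end");
        Map<String, Integer> df = documentFrequency(docs);
        System.out.println(df.get("the")); // prints 2
        System.out.println(df.get("cat")); // prints 2
    }
}
```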

Counting occurrences of strings in an ArrayList in java - Processing

Seems like Multisets from the Guava library would be perfect for this job. You could store all the words you've read into a Multiset and when you want to get occurrences (counts) out, you could simply iterate over the copy returned by Multisets.copyHighestCountFirst(myMultiset):

import com.google.common.collect.*;
...

// data contains the words from the text file
Multiset<String> myMultiset = ImmutableMultiset.copyOf(data);

for (String word : Multisets.copyHighestCountFirst(myMultiset).elementSet()) {
    System.out.println(word + ": " + myMultiset.count(word));
}

That should do it.
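
If adding Guava is not an option, a similar highest-count-first listing can be sketched with the JDK alone. This assumes Java 8+; CountOrder, sortedByCount, and the sample data are illustrative and not part of Guava:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CountOrder {

    // Counts the words, then returns the entries sorted by descending count.
    static List<Map.Entry<String, Integer>> sortedByCount(List<String> data) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : data) {
            counts.merge(word, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("b", "a", "b", "c", "b", "a");
        for (Map.Entry<String, Integer> e : sortedByCount(data)) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

With the sample data this lists b (3), then a (2), then c (1). Words with equal counts come out in no particular order.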

Count the occurrence of a word in a list containing sentences

This is a working solution; I did not take care of the printing. The result is a Map from each word to a list that holds the count of that word in each sentence, indexed from 0. The counting pass runs in O(N) time for N words in total. Play here: https://repl.it/Bg6D

List<List<String>> sort = new ArrayList<>();
Map<String, ArrayList<Integer>> res = new HashMap<>();

// split the text into sentences, then each sentence into a list of words
for (String sentence : someText.split("[.?!]\\s*")) {
    sort.add(Arrays.asList(sentence.split("[ ,;:]+")));
}

// give every distinct word a list of per-sentence counts, initialized to 0
final int sentenceCount = sort.size();
for (List<String> sentence : sort) {
    for (String s : sentence) {
        res.computeIfAbsent(s, k -> new ArrayList<Integer>(Collections.nCopies(sentenceCount, 0)));
    }
}

int index = 0;
// count the occurrences of each word in each sentence
for (List<String> sentence : sort) {
    for (String s : sentence) {
        res.get(s).set(index, res.get(s).get(index) + 1);
    }
    index++;
}

EDIT:
In answer to your comment.

List<Integer> getSentence(int sentence, Map<String, ArrayList<Integer>> map) {
    return map.entrySet().stream().map(e -> e.getValue().get(sentence)).collect(Collectors.toList());
}

Then you can call

List<Integer> sentence0List = getSentence(0, res);

However, be aware that this approach is not optimal, since each call walks the whole map and so runs in O(W) time, with W being the number of distinct words. For a small vocabulary that is totally fine, but it does not scale. You need to decide what you will do with the result: if you need to call getSentence many times, this is not the right approach, and you will need the data structured differently. Something like

Sentences = [
    {'word1': N, 'word2': N, ...}, // sentence 1
    {'word1': N, 'word2': N, ...}, // sentence 2
    ...
]

That way you can easily look up the word counts for each sentence.
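
One way to sketch that per-sentence layout in Java is a list with one word-count map per sentence. This reuses the same splitting regexes as the solution above; PerSentenceCounts and countPerSentence are made-up names, and Java 8+ is assumed for Map.merge:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PerSentenceCounts {

    // Builds one word-count map per sentence, in sentence order.
    static List<Map<String, Integer>> countPerSentence(String someText) {
        List<Map<String, Integer>> sentences = new ArrayList<>();
        for (String sentence : someText.split("[.?!]\\s*")) {
            Map<String, Integer> counts = new HashMap<>();
            for (String word : sentence.split("[ ,;:]+")) {
                if (!word.isEmpty()) { // split can yield an empty leading token
                    counts.merge(word, 1, Integer::sum);
                }
            }
            sentences.add(counts);
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> sentences = countPerSentence("One two two. Three one!");
        System.out.println(sentences.get(0).get("two")); // prints 2
        System.out.println(sentences.get(1).get("one")); // prints 1
    }
}
```

Looking up all counts for sentence i is then a single sentences.get(i) call, at the cost of making per-word lookups across sentences slower.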

EDIT 2:
Call this method:

Map<String, Float> getFrequency(Map<String, ArrayList<Integer>> stringMap) {
    Map<String, Float> res = new HashMap<>();
    // average count per sentence = total occurrences / number of sentences
    stringMap.entrySet().stream().forEach(e -> res.put(e.getKey(),
            e.getValue().stream().mapToInt(Integer::intValue).sum() / (float) e.getValue().size()));
    return res;
}

Will return something like this:

{standard=0.25, but=0.25, industry's=0.25, been=0.25, 1500s=0.25, software=0.25, release=0.25, type=0.5, when=0.25, dummy=0.5, Aldus=0.25, only=0.25, passages=0.25, text=0.5, has=0.5, 1960s=0.25, Ipsum=1.0, five=0.25, publishing=0.25, took=0.25, centuries=0.25, including=0.25, in=0.25, like=0.25, containing=0.25, printer=0.25, is=0.25, t

