Find All Instances of Word Occurring in a File

Find all instances of word occurring in a file

You can just use grep with the right regex:

grep '^ *\$temp' test.conf
$temp_test
$temp_234

UPDATE: As per comments:

while read -r l; do
    echo "$l"
done < <(sed -n '/^ *\$temp_/s/^ *\$temp_//p' t.conf)

test
234
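
For comparison, here is a minimal Python sketch of the same prefix-stripping idea. The helper name strip_temp_prefix and the sample lines are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Same pattern as the sed expression: optional leading spaces, then a literal $temp_
PREFIX = re.compile(r"^ *\$temp_")

def strip_temp_prefix(lines):
    """Return the remainder of every line that starts with the $temp_ prefix."""
    result = []
    for line in lines:
        line = line.rstrip("\n")
        if PREFIX.match(line):
            # Drop the prefix once, keep whatever follows it
            result.append(PREFIX.sub("", line, count=1))
    return result

# Hypothetical sample input standing in for the config file
sample = ["$temp_test", "  $temp_234", "other = 1", "x $temp_no"]
print(strip_temp_prefix(sample))  # ['test', '234']
```

To run it against a real file, passing an open file object (for example open("t.conf")) instead of the sample list should work, since the function only iterates over lines.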

Finding number of occurrences of a word in a file using R functions

As pointed out by @andrew, my previous answer would give wrong results if a word repeats on the same line. Based on other answers and comments, this one seems OK:

names = scan('http://pastebin.com/raw.php?i=kC9aRvfB', what=character(), quote=NULL )
idxs = grep("memory", names, ignore.case = TRUE)

length(idxs)
# [1] 10

Count all occurrences of a string in lots of files with grep

cat * | grep -c string
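
Keep in mind that grep -c reports the number of matching lines, so a line containing the string twice is counted once. The short Python sketch below shows the difference on a made-up sample text (the text is an assumption, used only for illustration):

```python
# Hypothetical sample: the first line contains the string twice
text = "string and string\nno match here\nanother string\n"

# What grep -c reports: the number of lines with at least one match
matching_lines = sum(1 for line in text.splitlines() if "string" in line)

# The number of (non-overlapping) occurrences of the string itself
occurrences = text.count("string")

print(matching_lines, occurrences)  # 2 3
```

If you need the occurrence count from the shell, printing each match on its own line with grep -o and counting the lines with wc -l is one common way to get it.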

python - find the occurrence of the word in a file

Use the update method of Counter. Example:

from collections import Counter

data = '''\
ashwin programmer india
amith programmer india'''

c = Counter()
for line in data.splitlines():
    c.update(line.split())
print(c)

Output:

Counter({'india': 2, 'programmer': 2, 'amith': 1, 'ashwin': 1})

How to find files with multiple occurrences of a specific string

@echo off
for %%a in (*.txt) do (
    for /f %%b in ('type "%%a"^|find /c "myword"') do (
        if %%b geq 2 echo %%a [actual count: %%b]
    )
)

Notes:

  • find /c doesn't count occurrences of a string; it counts the lines that contain the string (once or several times). That isn't the same thing, but it might be good enough for you.
  • You might want to use find /i /c to make the search case insensitive (finding "myword" as well as "MyWord").
  • echo %%a [actual count: %%b] is for troubleshooting only; replace it with the copy command in your final code.
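
If you need true occurrence counts rather than line counts, a small Python sketch along these lines could serve as an alternative. The function name files_with_min_occurrences and its parameters are assumptions for illustration, and it counts plain substring matches, case sensitively.

```python
from pathlib import Path

def files_with_min_occurrences(directory, word, minimum=2, pattern="*.txt"):
    """Return {file name: count} for files where word occurs at least minimum times."""
    hits = {}
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        count = text.count(word)  # occurrences of the substring, not matching lines
        if count >= minimum:
            hits[path.name] = count
    return hits

if __name__ == "__main__":
    # Scan the current directory, like the batch loop over *.txt
    for name, count in files_with_min_occurrences(".", "myword").items():
        print(f"{name} [actual count: {count}]")
```

As in the batch version, the print call is only a placeholder for whatever you want to do with each matching file.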

How to count occurrences of a word in all the files of a directory?

grep -roh aaa . | wc -w

This greps recursively through all files and directories under the current directory for aaa and outputs only the matches (-o), without file names (-h), instead of entire lines. wc -w then counts how many words were output.

Finding occurrences of specific word line by line from text file

Maybe you need to write a strword() function like this. I'm assuming you can use the classification functions (macros) from <ctype.h>, but there are workarounds if that isn't allowed either.

#include <assert.h>
#include <ctype.h>
#include <stdio.h>

char *strword(char *haystack, char *needle);

char *strword(char *haystack, char *needle)
{
    char *pos = haystack;
    char old_ch = ' ';
    while (*pos != '\0')
    {
        if (!isalpha(old_ch) && *pos == *needle)
        {
            char *txt = pos + 1;
            char *str = needle + 1;
            while (*txt == *str)
            {
                if (*str == '\0')
                    return pos; // Exact match at end of haystack
                txt++, str++;
            }
            if (*str == '\0' && !isalpha(*txt))
                return pos;
        }
        old_ch = *pos++;
    }
    return 0;
}

int main(void)
{
    /*
    ** Note that 'the' appears in the haystack as a prefix to a word,
    ** wholly contained in a word, and at the end of a word - and is not
    ** counted in any of those places. And punctuation is OK.
    */
    char haystack[] =
        "the way to blithely count the occurrences (tithe)"
        " of 'the' in their line is the";
    char needle[] = "the";

    char *curpos = haystack;
    char *word;
    int count = 0;
    while ((word = strword(curpos, needle)) != 0)
    {
        count++;
        printf("Found <%s> at [%.20s]\n", needle, word);
        curpos = word + 1;
    }

    printf("Found %d occurrences of <%s> in [%s]\n", count, needle, haystack);

    assert(strword("the", "the") != 0);
    assert(strword("th", "the") == 0);
    assert(strword("t", "t") != 0);
    assert(strword("", "t") == 0);
    assert(strword("if t fi", "t") != 0);
    assert(strword("if t fi", "") == 0);
    return 0;
}

When run, this produces:

Found <the> at [the way to blithely ]
Found <the> at [the occurrences (tit]
Found <the> at [the' in their line i]
Found <the> at [the]
Found 4 occurrences of <the> in [the way to blithely count the occurrences (tithe) of 'the' in their line is the]

Is there a way to do the strword function without <ctype.h>?

Yes. I said as much in the opening paragraph. Since the only function/macro used is isalpha(), you can make some assumptions (that you're not on a system using EBCDIC) so that the Latin alphabet is contiguous, and you can use this is_alpha() in place of isalpha() — and omit <ctype.h> from the list of included headers:

static inline int is_alpha(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

Taking words that I have in a list and searching for them within a Text File and getting a count for each word

To count the number of occurrences of a specific word in a text file, read the content of the file into a string and call the string's count() method with the word as its argument.

Syntax:

n = String.count(word)

where word is the string to look for, and count() returns the number of non-overlapping occurrences of word in String.

So you can read the file and use the count() method:

# Get a file object reference to the file
with open("file.txt", "r") as file:
    # Read the content of the file into a string
    data = file.read()

words = ['apple', 'orange']

for word in words:
    print('{} occurred {} times.'.format(word, data.count(word)))

Note:
You could also loop over every word in the file and increment a counter yourself, but in a high-level language like Python it is simpler to use built-in methods such as count().
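
One caveat: count() matches substrings, so 'apple' is also counted inside 'pineapple'. If you want whole words only, a regex with word boundaries is one option. The sketch below is a minimal illustration; the helper name count_whole_words and the sample string are assumptions, not part of the original answer.

```python
import re

def count_whole_words(text, words):
    """Map each word to the number of whole-word matches in text."""
    counts = {}
    for word in words:
        # \b marks a word boundary; re.escape keeps special characters literal
        pattern = r"\b" + re.escape(word) + r"\b"
        counts[word] = len(re.findall(pattern, text))
    return counts

# Hypothetical sample text standing in for the file content
data = "apple pineapple apple-pie orange oranges"

print(data.count("apple"))  # 3
print(count_whole_words(data, ["apple", "orange"]))  # {'apple': 2, 'orange': 1}
```

Note that \b treats the hyphen in 'apple-pie' as a boundary, so that 'apple' still counts as a whole word here.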

How to find all occurrences of a substring?

There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:

import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]

If you want to find overlapping matches, lookahead will do that:

[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]

If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:

search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
#[1]

re.finditer returns an iterator, so you could change the [] in the above to () to get a generator instead of a list, which will be more efficient if you're only iterating through the results once.
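
If you would rather avoid regular expressions, a small generator built on str.find is another possible approach. This is only a sketch: the name find_all and the overlapping flag are assumptions chosen for illustration.

```python
def find_all(text, sub, overlapping=False):
    """Yield every start index of sub in text, scanning left to right."""
    if not sub:
        return  # an empty needle would match everywhere; treat it as no matches
    # Advance by one character to allow overlaps, else skip past the whole match
    step = 1 if overlapping else len(sub)
    start = text.find(sub)
    while start != -1:
        yield start
        start = text.find(sub, start + step)

print(list(find_all("test test test test", "test")))  # [0, 5, 10, 15]
print(list(find_all("ttt", "tt", overlapping=True)))  # [0, 1]
print(list(find_all("ttt", "tt")))                    # [0]
```

Because it is a generator, results are produced lazily, which fits the case where you only iterate through the matches once.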


