How to Find a Whole Word in a String in Java

How to find a whole word in a String in Java?

The example below is based on your comments. It takes a List of keywords and searches for them in a given String using word boundaries. It uses StringUtils from Apache Commons Lang to build the regular expression, then prints the matched groups.

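String text = "I will come and meet you at the woods 123woods and all the woods";

// Keywords to search for (example values chosen to fit the text above)
List<String> tokens = new ArrayList<String>();
tokens.add("123woods");
tokens.add("woods");

// Wrap the alternatives in \b...\b so that only whole words match
String patternString = "\\b(" + StringUtils.join(tokens, "|") + ")\\b";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);

while (matcher.find()) {
    System.out.println(matcher.group(1));
}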

If you are looking for more performance, you could have a look at StringSearch: high-performance pattern matching algorithms in Java.
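
If you would rather not depend on Apache Commons Lang, the same idea can be sketched with the JDK alone. This is a minimal sketch, not the answer's own code: the class name KeywordFinder and the method findKeywords are hypothetical, and quoting each keyword with Pattern.quote() is an added assumption so that keywords containing regex metacharacters are matched literally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class; the names are illustrative only.
class KeywordFinder {

    // Returns every whole-word occurrence of any keyword, in order of appearance.
    static List<String> findKeywords(String text, List<String> keywords) {
        List<String> found = new ArrayList<>();
        if (keywords.isEmpty()) {
            // An empty alternation would match the empty string at every word boundary.
            return found;
        }
        // Quote each keyword so that regex metacharacters are treated literally.
        String alternatives = keywords.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile("\\b(" + alternatives + ")\\b");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "I will come and meet you at the woods 123woods and all the woods";
        // prints [woods, 123woods, woods]
        System.out.println(findKeywords(text, Arrays.asList("123woods", "woods")));
    }
}
```

Note that \b only behaves as expected here when each keyword starts and ends with a word character (letter, digit or underscore).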

Finding the first matching whole word given a substring in a long text in Java

What could be wrong with my code?

Because your regex matches only overflow, not the word that contains it.

Use the following regex instead:


String token = "\\b\\S*overflow\\S*";
Pattern pattern = Pattern.compile(token);
Matcher matcher = pattern.matcher(fullText);
if (matcher.find()) {
    System.out.println("Whole word is: " + matcher.group());
}


  • \b matches a word boundary

  • \S* matches zero or more non-whitespace characters

  • overflow matches overflow literally

  • \S* matches zero or more non-whitespace characters
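
The pieces listed above can be wrapped into a small reusable method. This is only a sketch: the class name WordAroundSubstring and the method firstWordContaining are hypothetical, and the use of Pattern.quote() on the fragment is an added assumption so that the fragment is always treated as literal text.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class WordAroundSubstring {

    // Returns the first run of non-whitespace characters that contains the fragment.
    static Optional<String> firstWordContaining(String text, String fragment) {
        // \b\S* + literal fragment + \S*, as in the regex described above.
        Pattern pattern = Pattern.compile("\\b\\S*" + Pattern.quote(fragment) + "\\S*");
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String fullText = "Stackoverflow is the best and rocks !!!";
        // prints Whole word is: Stackoverflow
        firstWordContaining(fullText, "overflow")
                .ifPresent(word -> System.out.println("Whole word is: " + word));
    }
}
```

Because \S* stops only at whitespace, punctuation attached to the word (for example a trailing comma) is included in the result.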

Alternative two: split the text on whitespace, iterate through the words, and break as soon as you find the one that contains the substring:

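String fullText = "Stackoverflow is the best and rocks !!!";
String[] strWords = fullText.split("\\s");
for (String strWord : strWords) {
    if (!strWord.contains("overflow")) continue;  // skip words without the substring
    System.out.println("Whole word is: " + strWord);
    break;  // stop at the first match
}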

Java match whole word in String

Since a word boundary does not match between a word character and an underscore, you need:

String pattern = "(?<=_|\\b)" + str + "(?=_|\\b)";

Here, (?<=_|\b) positive lookbehind requires a word boundary or an underscore to appear before the str, and the (?=_|\b) positive lookahead requires an underscore or a word boundary to appear right after the str.
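
A minimal sketch of how these lookarounds behave in practice follows. The class name UnderscoreBoundary and the method containsToken are hypothetical, and the Pattern.quote() call is an addition that is not part of the pattern as written above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class UnderscoreBoundary {

    // True if str occurs delimited by underscores or word boundaries on both sides.
    static boolean containsToken(String text, String str) {
        Pattern pattern = Pattern.compile(
                "(?<=_|\\b)" + Pattern.quote(str) + "(?=_|\\b)");
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsToken("my_str_here", "str")); // prints true
        System.out.println(containsToken("a str b", "str"));     // prints true
        System.out.println(containsToken("mystring", "str"));    // prints false
    }
}
```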

If your word may contain special characters, you might want to use a more straightforward word boundary:

"(?<![^\\W_])" + Pattern.quote(str) + "(?![^\\W_])"

Here, the negative lookbehind (?<![^\W_]) fails the match if the str is immediately preceded by a word character other than an underscore. [^...] is a negated character class that matches any character other than the characters, ranges, etc. defined inside it, so [^\W_] matches every character that is neither a non-word character (\W) nor a _, that is, letters and digits. Likewise, the negative lookahead (?![^\W_]) fails the match if a word character other than an underscore immediately follows the str.

Note that the second example quotes the search string with Pattern.quote(), so that a search string such as AA.A is matched literally, for example inside AA.A_str.txt.
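
The quoted variant can be tried out with a short sketch like the one below. The class name QuotedWordBoundary and the method containsWord are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class QuotedWordBoundary {

    // True if str occurs with no letter or digit directly before or after it.
    static boolean containsWord(String text, String str) {
        Pattern pattern = Pattern.compile(
                "(?<![^\\W_])" + Pattern.quote(str) + "(?![^\\W_])");
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("AA.A_str.txt", "AA.A"));  // prints true
        // The dot is literal because of Pattern.quote(), so "AAxA" is not a match.
        System.out.println(containsWord("AAxA_str.txt", "AA.A"));  // prints false
        // A letter directly before the search string blocks the match.
        System.out.println(containsWord("XAA.A_str.txt", "AA.A")); // prints false
    }
}
```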

How to find a whole word in java?

Basically, you want to find what the user entered and want to make sure that it won't match part of a word?

In that case, stay far away from \b and its cousin \w, as those are utterly useless for anything you would consider a word (they are a rough approximation of what some programming languages think of as identifiers, nothing more). Best spell out explicitly what you want:


which means that preceding and following your search term is either whitespace or the beginning/end of the string. You may want to alter the lookahead to something like


maybe to allow for punctuation (and likewise, at least the opening parenthesis, in the lookbehind).
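
The patterns this answer refers to are not reproduced in the text, so the sketch below is only one plausible reading of the description, not the author's own regexes. The class name ExplicitBoundaries, both method names and the particular set of punctuation characters are assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; names and punctuation set are assumptions.
class ExplicitBoundaries {

    // Strict: the term must be surrounded by whitespace or the start/end of the string.
    static boolean matchesStrict(String text, String term) {
        Pattern pattern = Pattern.compile(
                "(?<=^|\\s)" + Pattern.quote(term) + "(?=\\s|$)");
        return pattern.matcher(text).find();
    }

    // Lenient: also allows an opening parenthesis before the term
    // and common punctuation after it.
    static boolean matchesLenient(String text, String term) {
        Pattern pattern = Pattern.compile(
                "(?<=^|[\\s(])" + Pattern.quote(term) + "(?=[\\s.,;:!?)]|$)");
        return pattern.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesStrict("I like cats", "cats"));   // prints true
        System.out.println(matchesStrict("I like cats.", "cats"));  // prints false
        System.out.println(matchesLenient("I like cats.", "cats")); // prints true
        System.out.println(matchesLenient("(cats) rule", "cats"));  // prints true
        System.out.println(matchesLenient("concatenate", "cat"));   // prints false
    }
}
```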

Java Regex : match whole word with word boundary

It appears you only want to match "words" enclosed with whitespace (or at the start/end of strings).


String pattern = "(?<!\\S)" + Pattern.quote(word) + "(?!\\S)";

The (?<!\S) negative lookbehind fails any match that is immediately preceded by a character other than whitespace, and (?!\S) is a negative lookahead that fails any match that is immediately followed by a character other than whitespace. Pattern.quote() is necessary to escape special characters that need to be treated as literal characters in the regex pattern.
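
As a rough illustration, the same pattern can be used to count whitespace-delimited occurrences of a word. The class name WhitespaceDelimited and the method countOccurrences are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class WhitespaceDelimited {

    // Counts occurrences of word that have whitespace or a string edge on both sides.
    static int countOccurrences(String text, String word) {
        Pattern pattern = Pattern.compile(
                "(?<!\\S)" + Pattern.quote(word) + "(?!\\S)");
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "xa.b" and "a.b," are rejected by the lookarounds; prints 2
        System.out.println(countOccurrences("a.b a.b xa.b a.b,", "a.b"));
        // The dot is literal thanks to Pattern.quote(), so "axb" is not counted; prints 2
        System.out.println(countOccurrences("a.b axb a.b", "a.b"));
    }
}
```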

Find the whole word from a Sentence with matching String

Do it with a regex, something like:

about pro.*?\b

This matches about pro, then as few characters as possible up to the next word boundary (for example, the position before whitespace or a punctuation mark). This way you don't have to make multiple substrings (which is a costly operation).
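
In Java, that regex could be applied roughly as follows. This is a sketch under assumptions: the class name PhraseCompleter and the method completePhrase are hypothetical, and the typed prefix is passed through Pattern.quote() so that it is matched literally.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class PhraseCompleter {

    // Returns the prefix extended up to the next word boundary, if the prefix occurs.
    static Optional<String> completePhrase(String sentence, String prefix) {
        Pattern pattern = Pattern.compile(Pattern.quote(prefix) + ".*?\\b");
        Matcher matcher = pattern.matcher(sentence);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // prints about programming
        completePhrase("Tell me about programming today", "about pro")
                .ifPresent(System.out::println);
    }
}
```

If the prefix already ends at a word boundary (for example when it is followed directly by a space or a hyphen), the lazy .*? matches nothing and the result is the prefix itself.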

