Finding @Mentions in String

Finding @mentions in string

Replace the question mark (?) quantifier ("optional") with a + ("one or more") after your character class:

@([^@ ]+)
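
As a minimal sketch of the difference, here is the same idea in Python. The sample string and variable names are made up for illustration, and the "before" pattern is an assumption about what the original looked like.

```python
import re

text = "hello @alice and @bob"  # made-up sample input

# Assumed original pattern: "?" lets the class match at most one character.
optional = re.findall(r"@([^@ ]?)", text)

# Suggested pattern: "+" requires one or more characters that are not "@" or a space.
one_or_more = re.findall(r"@([^@ ]+)", text)

print(optional)     # ['a', 'b']
print(one_or_more)  # ['alice', 'bob']
```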

How to list all mentions in a String

Use this Regex. It finds groups of word characters (letters, digits, underscores) that follow an @.

val atMentions: List<String> = "(?<=@)\\w+".toRegex().findAll(editText.text).map { it.value }.toList()

If you need to define a different set of word characters, replace the \\w above with [\\w] and put the other acceptable characters right after the w.
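
To illustrate the effect of widening the character class, here is a small Python sketch. The sample text and the choice of "." and "-" as extra characters are assumptions for the example; in a Python raw string the backslash is not doubled.

```python
import re

text = "ping @jane.doe and @bob-smith, also @carol_1"  # made-up sample input

# Plain \w+ stops at the first character that is not a letter, digit or underscore.
plain = re.findall(r"(?<=@)\w+", text)

# [\w.-]+ additionally accepts "." and "-" (an assumed choice of extra characters).
extended = re.findall(r"(?<=@)[\w.-]+", text)

print(plain)     # ['jane', 'bob', 'carol_1']
print(extended)  # ['jane.doe', 'bob-smith', 'carol_1']
```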

how to pull @ mentions out of strings like twitter in javascript

I have found that this is the best way to find mentions inside a string in JavaScript.

var str = "@jpotts18 what is up man? Are you hanging out with @kyle_clegg";
var pattern = /\B@[a-z0-9_-]+/gi;
str.match(pattern);
// => ["@jpotts18", "@kyle_clegg"]

I have purposely restricted it to letters (upper and lower case), digits, hyphens and underscores, so that a period is never treated as part of a username (as in @j.potts).

This is what twitter-text.js is doing behind the scenes.

// Mention related regex collection
twttr.txt.regexen.validMentionPrecedingChars = /(?:^|[^a-zA-Z0-9_!#$%&*@＠]|RT:?)/;
twttr.txt.regexen.atSigns = /[@＠]/;
twttr.txt.regexen.validMentionOrList = regexSupplant(
    '(#{validMentionPrecedingChars})' +  // $1: Preceding character
    '(#{atSigns})' +                     // $2: At mark
    '([a-zA-Z0-9_]{1,20})' +             // $3: Screen name
    '(\/[a-zA-Z][a-zA-Z0-9_\-]{0,24})?'  // $4: List (optional)
  , 'g');
twttr.txt.regexen.endMentionMatch = regexSupplant(/^(?:#{atSigns}|[#{latinAccentChars}]|:\/\/)/);
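
For readers who want to experiment with these pieces outside the library, the following is a simplified Python transcription of the preceding-character, at-sign and mention-or-list patterns quoted above. It is only a sketch: it leaves out the endMentionMatch check, and extract_mentions is a hypothetical helper, not part of twitter-text.

```python
import re

# Simplified transcription of the pieces quoted above (not the library itself).
PRECEDING = r"(?:^|[^a-zA-Z0-9_!#$%&*@＠]|RT:?)"
AT_SIGNS = r"[@＠]"
MENTION_OR_LIST = re.compile(
    "(" + PRECEDING + ")"                   # group 1: preceding character
    + "(" + AT_SIGNS + ")"                  # group 2: at mark
    + r"([a-zA-Z0-9_]{1,20})"               # group 3: screen name
    + r"(/[a-zA-Z][a-zA-Z0-9_\-]{0,24})?"   # group 4: list (optional)
)

def extract_mentions(text):
    """Hypothetical helper: return (screen_name, list_slug_or_None) pairs."""
    return [(m.group(3), m.group(4)) for m in MENTION_OR_LIST.finditer(text)]

print(extract_mentions("RT @alice: hi @bob/friends, mail me at foo@bar.com"))
# [('alice', None), ('bob', '/friends')]
```

The address foo@bar.com is skipped because the character before its @ is a letter, which the preceding-character class excludes.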

Please let me know if you have used anything that is more efficient, or accurate. Thanks!

How @mention works, how can I find mention during comment in .Net

Looks like a good fit for regular expressions. There are multiple ways to solve this.

Here's the simplest one:

 (?<mention>@[a-zA-Z0-9_.]+)[^a-zA-Z0-9_.]

It matches the allowed characters followed by one character outside that set; [^ ... ] is the negation.
(?<mention> ... ) declares an explicit named group that captures the mention without the non-matching character immediately following it.
Note that this pattern requires a non-matching character after the mention, so a mention at the very end of the string is not found; work around that if it matters.
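
The end-of-string limitation is easy to see in a quick Python sketch. Python spells the named group (?P<mention>...) rather than .NET's (?<mention>...); the sample string is the one used in the C# example further down.

```python
import re

# Same pattern as above, with Python's named-group syntax.
pattern = re.compile(r"(?P<mention>@[a-zA-Z0-9_.]+)[^a-zA-Z0-9_.]")

comment = "hi @fri.tara3^ @hjh not a mention @someone"
found = [m.group("mention") for m in pattern.finditer(comment)]

# "@someone" is missing: nothing follows it, so the trailing class cannot match.
print(found)  # ['@fri.tara3', '@hjh']
```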

A cleaner pattern would use a feature called look-ahead:

@[a-zA-Z0-9_.]+?(?![a-zA-Z0-9_.])

(?! ... ) is a negative lookahead, meaning "only match if it is NOT followed by this".
A named capture is not required, because a lookahead does not consume the text it inspects.
The non-greedy quantifier +? keeps each match as short as possible, so every mention in the string is matched separately. Because the lookahead forces the match to run to the end of the allowed characters, a greedy + finds the same mentions here.

Lookaheads are a tad less known and may become a pain to read if the pattern grows too long, but they are a useful tool to know.
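
As a quick check of how the lookahead behaves, this Python sketch runs the pattern with both the non-greedy +? and a greedy + on the sample string from the C# example. It is only an illustration; the variable names are made up.

```python
import re

comment = "hi @fri.tara3^ @hjh not a mention @someone"

# Non-greedy quantifier, as in the pattern above.
lazy = re.findall(r"@[a-zA-Z0-9_.]+?(?![a-zA-Z0-9_.])", comment)

# Greedy quantifier for comparison.
greedy = re.findall(r"@[a-zA-Z0-9_.]+(?![a-zA-Z0-9_.])", comment)

print(lazy)            # ['@fri.tara3', '@hjh', '@someone']
print(lazy == greedy)  # True
```

Unlike the first pattern, the mention at the end of the string is found, because the lookahead succeeds when nothing follows.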

Full example using C#:

using System;
using System.Text.RegularExpressions;

string comment = "hi @fri.tara3^ @hjh not a mention @someone";
const string pattern = "@[a-zA-Z0-9_.]+?(?![a-zA-Z0-9_.])";
var matches = Regex.Matches(comment, pattern);

// Prints @fri.tara3, @hjh and @someone, one per line
for (int i = 0; i < matches.Count; i++)
{
    Console.WriteLine(matches[i].Value);
}

Find all valid user mentions in text with regex

You may consider a good-enough pattern like

r'\B@(?!(?:[a-z0-9.]*_){2})(?!(?:[a-z0-9_]*\.){2})[._a-z0-9]{3,24}\b'

The only drawback of the pattern is that if a valid mention may end with a ., the match stops just before that trailing . and does not include it.

Details

\B@ - a @ not preceded with a word char
(?!(?:[a-z0-9.]*_){2}) - no two _ chars anywhere after @
(?!(?:[a-z0-9_]*\.){2}) - no two . chars anywhere after @
[._a-z0-9]{3,24} - three to twenty-four letters, digits, . and _
\b - word boundary
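
A short Python sketch of the pattern in use, run on a sample string with valid and invalid usernames (the name MENTION is made up for the example):

```python
import re

MENTION = re.compile(
    r"\B@(?!(?:[a-z0-9.]*_){2})(?!(?:[a-z0-9_]*\.){2})[._a-z0-9]{3,24}\b"
)

s = ("text @valid_username text @unvalid_username_ text @valid.username "
     "text @unvalid..username  @validusername.")

found = MENTION.findall(s)
print(found)
# ['@valid_username', '@valid.username', '@validusername']
```

The last mention comes back without its trailing ., because \b has to sit right after a word character.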

Note that you can also match with the simpler \B@[._a-z0-9]{3,24} and filter the results with some Python code:

import re
s = 'text @valid_username text @unvalid_username_ text @valid.username text @unvalid..username  @validusername.'
print([x for x in re.findall(r'\B@[._a-z0-9]{3,24}', s) if x.count('.') < 2 and x.count('_') < 2 ])
# => ['@valid_username', '@valid.username', '@validusername.']

PHP regex on mention (@name)

You can use this regex (\@(?P<names>[a-zA-Z\-\_]+)) :

<?php
$matches = [];
$text = "I recently saw @john-doe riding a bike, did you notice that too @foo-bar?";
preg_match_all("(\@(?P<names>[a-zA-Z\-\_]+))", $text, $matches);
var_dump($matches['names']);

In this example, I used ?P<names> to name the capture group, which makes the matches easier to retrieve from $matches.

I've made a Regex101 demo and a PHP sandbox for testing:

https://regex101.com/r/ZFWvCG/1

http://sandbox.onlinephpfunctions.com/code/1d04ce64a2a290994bf0effd7cf8f0039f20277b

Regex Valid Twitter Mention

Here's a regex that should work:

/^(?!.*\bRT\b)(?:.+\s)?@\w+/i

Explanation:

/^             //Start of the string.
(?!.*\bRT\b)   //Verify that RT does not appear as a whole word anywhere in the string.
(?:.+\s)?      //Optionally match any leading characters followed by a whitespace character.
                  //Note: (?: ) makes the group non-capturing.
@\w+           //Find @ followed by one or more word chars.
/i             //Make it case insensitive.
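
To see which strings the pattern accepts, here is a small Python sketch. The helper name has_valid_mention and the sample tweets are made up for illustration; re.IGNORECASE stands in for the /i flag.

```python
import re

# Same pattern as above; re.IGNORECASE plays the role of the /i flag.
VALID = re.compile(r"^(?!.*\bRT\b)(?:.+\s)?@\w+", re.IGNORECASE)

def has_valid_mention(tweet):
    """Hypothetical helper: True if the pattern matches the tweet."""
    return VALID.search(tweet) is not None

samples = ["hello @bob", "@bob hi", "RT @bob hi", "rt @bob",
           "mail foo@bar.com", "no mention"]
results = {t: has_valid_mention(t) for t in samples}

for tweet, ok in results.items():
    print(ok, tweet)
```

Only the first two samples are accepted: the retweets are rejected by the lookahead, and the @ in the email address is not at the start of the string or after whitespace.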

regex for mentions

You may use the following regex:

/\B@\w+/g

\B matches at a non-word boundary, thus, it requires a non-word (or start of string) to be right before @.

See the regex demo

var re = /\B@\w+/g;
var str = 'The @dog went to the park.\nBut not here: The d@og went to the park.\nOr here: The@dog went to the park.';
var res = str.match(re);
document.body.innerHTML = "<pre>" + JSON.stringify(res, 0, 4) + "</pre>";

How to fix this regex for mentions and hashtags?

Try this pattern:

(?:^|\s+)(?:(?<mention>@)|(?<hash>#))(?<item>\w+)(?=\s+)

Here it is broken down:

(?: creates a non-capturing group
^|\s+ matches the beginning of the string or whitespace
(?: creates a non-capturing group
(?<mention>@)|(?<hash>#) matches @ or # and captures it in the named group mention or hash, respectively
(?<item>\w+) matches one or more word characters (letters, digits, underscores) and captures them in the group item for easy access
(?=\s+) is a positive lookahead that requires whitespace after the item without consuming it, so an item at the very end of the string does not match
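
A rough Python equivalent, as a sketch only: Python writes named groups as (?P<name>...), and tagged_items is a hypothetical helper introduced for the example.

```python
import re

TAG = re.compile(
    r"(?:^|\s+)(?:(?P<mention>@)|(?P<hash>#))(?P<item>\w+)(?=\s+)"
)

def tagged_items(text):
    """Hypothetical helper: return (kind, item) pairs for each match."""
    return [
        ("mention" if m.group("mention") else "hash", m.group("item"))
        for m in TAG.finditer(text)
    ]

print(tagged_items("@alice likes #regex and #python"))
# [('mention', 'alice'), ('hash', 'regex')]
```

Reading the item group sidesteps the leading whitespace in the overall match. The trailing #python is not reported, because the lookahead requires whitespace after the item.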

You would then need to use the underlying language to trim the returned match, since the full match can include the leading whitespace consumed by (?:^|\s+).

Update
Since you mentioned that you are using C#, here is a .NET solution that does not require RegEx. I did not benchmark it, but I would guess that it is also faster than the RegEx approach.

Personally, my flavor of .NET is Visual Basic, so I'm providing you with a VB.NET solution, but you can just as easily run it through a converter since I never use anything that can't be used in C#:

Private Function FindTags(ByVal lead As Char, ByVal source As String) As String()
    Dim matches As List(Of String) = New List(Of String)
    Dim current_index As Integer = 0

    'Loop through all but the last character in the source
    For index As Integer = 0 To source.Length - 2
        'Reset the current index
        current_index = index

        'Check that the current character is the lead ("@" or "#"), that it starts the String or follows whitespace, and that the next character is a letter or digit
        If source(index) = lead AndAlso (index = 0 OrElse Char.IsWhiteSpace(source, index - 1)) AndAlso Char.IsLetterOrDigit(source, index + 1) Then
            'Loop until the next character is no longer a letter or digit
            Do
                current_index += 1
            Loop While current_index + 1 < source.Length AndAlso Char.IsLetterOrDigit(source, current_index + 1)

            'Check if we're at the end of the line or the next character is whitespace
            If current_index = source.Length - 1 OrElse Char.IsWhiteSpace(source, current_index + 1) Then
                'Add the match to the collection
                matches.Add(source.Substring(index, current_index + 1 - index))
            End If
        End If
    Next

    Return matches.ToArray()
End Function
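
For comparison, here is a loose Python port of the same scanning idea. It is a sketch rather than a line-by-line translation: it assumes a tag needs at least one letter or digit after the lead character and must end at whitespace or the end of the string.

```python
def find_tags(lead, source):
    """Return tokens that start with `lead` at the start of the string or after
    whitespace, and consist of letters/digits up to whitespace or the end."""
    matches = []
    for start, ch in enumerate(source):
        if ch != lead:
            continue
        if start > 0 and not source[start - 1].isspace():
            continue  # the lead character must begin a token
        end = start + 1
        # Advance past the run of letters and digits that follows the lead
        while end < len(source) and source[end].isalnum():
            end += 1
        has_name = end > start + 1
        ends_cleanly = end == len(source) or source[end].isspace()
        if has_name and ends_cleanly:
            matches.append(source[start:end])
    return matches

print(find_tags("@", "hi @bob and @alice! @carol"))  # ['@bob', '@carol']
print(find_tags("#", "#one two#three #four"))        # ['#one', '#four']
```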
