Finding @mentions in string
Replace the question mark (?) quantifier ("optional") and add in a + ("one or more") after your character class:
@([^@ ]+)
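As a quick check, here is a minimal Python sketch of that pattern. The sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample text, not taken from the original question.
text = "thanks @alice and @bob_42 for the review"

# [^@ ]+ captures one or more characters that are neither "@" nor a space.
mentions = re.findall(r"@([^@ ]+)", text)
print(mentions)  # ['alice', 'bob_42']
```

Note that the character class only excludes @ and the space, so trailing punctuation such as a comma would be captured as part of the name.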
How to list all mentions in a String
Use this Regex. It finds groups of word characters (letters, digits, underscores) that follow an @.
val atMentions: List<String> = "(?<=@)\\w+".toRegex().findAll(editText.text).map { it.value }.toList()
If you need to define a different set of word characters, replace the \\w above with [\\w] and put the other acceptable characters right after the w.
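The effect of widening the character class can be sketched in Python (raw strings need only a single backslash there). The input text and the choice of . and - as extra characters are assumptions for illustration.

```python
import re

# Hypothetical input; allowing "." and "-" is an assumed requirement.
text = "ping @jane.doe and @build-bot, then @sam"

# \w alone stops at "." and "-"; [\w.-] also accepts them.
plain = re.findall(r"(?<=@)\w+", text)
extended = re.findall(r"(?<=@)[\w.-]+", text)

print(plain)     # ['jane', 'build', 'sam']
print(extended)  # ['jane.doe', 'build-bot', 'sam']
```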
how to pull @ mentions out of strings like twitter in javascript
I have found that this is the best way to find mentions inside of a string in javascript.
var str = "@jpotts18 what is up man? Are you hanging out with @kyle_clegg";
var pattern = /\B@[a-z0-9_-]+/gi;
str.match(pattern);
// ["@jpotts18", "@kyle_clegg"]
I have purposefully restricted it to upper- and lowercase alphanumeric characters plus the - and _ symbols, in order to avoid periods that could be mistaken for part of a username, as in (@j.potts).
This is what twitter-text.js is doing behind the scenes.
// Mention related regex collection
twttr.txt.regexen.validMentionPrecedingChars = /(?:^|[^a-zA-Z0-9_!#$%&*@＠]|RT:?)/;
twttr.txt.regexen.atSigns = /[@＠]/;
twttr.txt.regexen.validMentionOrList = regexSupplant(
'(#{validMentionPrecedingChars})' + // $1: Preceding character
'(#{atSigns})' + // $2: At mark
'([a-zA-Z0-9_]{1,20})' + // $3: Screen name
'(\/[a-zA-Z][a-zA-Z0-9_\-]{0,24})?' // $4: List (optional)
, 'g');
twttr.txt.regexen.endMentionMatch = regexSupplant(/^(?:#{atSigns}|[#{latinAccentChars}]|:\/\/)/);
Please let me know if you have used anything that is more efficient, or accurate. Thanks!
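To see how those fragments compose, here is a rough Python transcription of validMentionOrList. It is a sketch, not the library's API: the constant names are invented, only the ASCII @ is handled, and the endMentionMatch check is left out.

```python
import re

# Simplified, hypothetical stand-ins for the twitter-text fragments (ASCII "@" only).
PRECEDING = r"(?:^|[^a-zA-Z0-9_!#$%&*@]|RT:?)"
AT_SIGN = r"@"

VALID_MENTION_OR_LIST = re.compile(
    "(" + PRECEDING + ")"                  # group 1: preceding character
    "(" + AT_SIGN + ")"                    # group 2: at mark
    "([a-zA-Z0-9_]{1,20})"                 # group 3: screen name
    "(/[a-zA-Z][a-zA-Z0-9_-]{0,24})?"      # group 4: list (optional)
)

text = "RT @jack: thanks @twitter/team and mail me at bob@example.com"
found = [(m.group(3), m.group(4)) for m in VALID_MENTION_OR_LIST.finditer(text)]
print(found)  # [('jack', None), ('twitter', '/team')]
```

The e-mail address is skipped because the character before its @ is a letter, which the preceding-character class rejects.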
How @mention works, how can I find mention during comment in .Net
Looks like a good fit for regular expressions. There are multiple ways to solve this.
Here's the simplest one:
(?<mention>@[a-zA-Z0-9_.]+)[^a-zA-Z0-9_.]
- It searches for matching characters followed by a non-matching character. [^ ... ] does the negation bit.
- (?<mention> ... ) declares an explicit group to capture the mention without including the non-matching character immediately following it.
- Note that this pattern requires a non-matching character after the mention, so a mention at the very end of the text is missed; work around that if it matters.
A cleaner pattern would use a feature called look-ahead:
@[a-zA-Z0-9_.]+?(?![a-zA-Z0-9_.])
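- (?! ... ) is a negative lookahead, meaning "only match if it is NOT followed by this".
- A named capture is not required, as the lookahead does not consume the characters it checks.
- It uses the non-greedy quantifier +?, which keeps each matched mention as short as possible. Because the lookahead forbids stopping in front of another allowed character, each match still runs to the end of its mention, so several mentions in one string are found separately.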
Lookaheads are a tad less known and may become a pain to read if pattern grows too long. But it is a useful tool to know.
Full example using C#:
using System;
using System.Text.RegularExpressions;

string comment = "hi @fri.tara3^ @hjh not a mention @someone";
const string pattern = "@[a-zA-Z0-9_.]+?(?![a-zA-Z0-9_.])";
var matches = Regex.Matches(comment, pattern);
// Prints @fri.tara3, @hjh and @someone, one per line.
for (int i = 0; i < matches.Count; i++)
{
    Console.WriteLine(matches[i].Value);
}
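The difference between the two patterns shows up on a mention at the end of the text. Here is a small Python comparison using the same sample comment; note that Python spells named groups (?P<name>...) rather than .NET's (?<name>...), and the variable names are made up.

```python
import re

comment = "hi @fri.tara3^ @hjh not a mention @someone"

# Variant 1: named group plus a required trailing non-mention character.
with_group = re.compile(r"(?P<mention>@[a-zA-Z0-9_.]+)[^a-zA-Z0-9_.]")
# Variant 2: negative lookahead, which consumes nothing after the mention.
with_lookahead = re.compile(r"@[a-zA-Z0-9_.]+?(?![a-zA-Z0-9_.])")

group_hits = [m.group("mention") for m in with_group.finditer(comment)]
lookahead_hits = with_lookahead.findall(comment)

print(group_hits)      # ['@fri.tara3', '@hjh']
print(lookahead_hits)  # ['@fri.tara3', '@hjh', '@someone']
```

The first variant drops @someone because nothing follows it, while the lookahead variant also succeeds at the end of the string.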
Find all valid user mentions in text with regex
You may consider a good-enough pattern like
r'\B@(?!(?:[a-z0-9.]*_){2})(?!(?:[a-z0-9_]*\.){2})[._a-z0-9]{3,24}\b'
See the regex demo. The only drawback of the pattern is that if a valid mention can end with ., the match stops just before that . (see demo).
Details
\B@ - a @ not preceded by a word char
(?!(?:[a-z0-9.]*_){2}) - no two _ chars anywhere after @
(?!(?:[a-z0-9_]*\.){2}) - no two . chars anywhere after @
[._a-z0-9]{3,24} - three to twenty-four letters, digits, . and _
\b - word boundary
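To see how these pieces behave together, here is a small Python check of the full pattern. The variable names are illustrative only.

```python
import re

pattern = re.compile(
    r"\B@(?!(?:[a-z0-9.]*_){2})(?!(?:[a-z0-9_]*\.){2})[._a-z0-9]{3,24}\b"
)

s = ("text @valid_username text @unvalid_username_ text @valid.username "
     "text @unvalid..username @validusername.")

# Names with two underscores or two dots are rejected by the lookaheads.
single_regex_hits = pattern.findall(s)
print(single_regex_hits)
# ['@valid_username', '@valid.username', '@validusername']
```

The last hit illustrates the drawback: the trailing . of @validusername. is left out because \b cannot match after a dot at the end of the string.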
Note that you may instead use some Python code to filter the results obtained with the simpler pattern \B@[._a-z0-9]{3,24}:
import re
s = 'text @valid_username text @unvalid_username_ text @valid.username text @unvalid..username @validusername.'
print([x for x in re.findall(r'\B@[._a-z0-9]{3,24}', s) if x.count('.') < 2 and x.count('_') < 2 ])
# => ['@valid_username', '@valid.username', '@validusername.']
PHP regex on mention (@name)
You can use this regex, (\@(?P<names>[a-zA-Z\-\_]+)):
<?php
$matches = [];
$text = "I recently saw @john-doe riding a bike, did you noticed that too @foo-bar?";
preg_match_all ("(\@(?P<names>[a-zA-Z\-\_]+))" ,$text, $matches);
var_dump($matches['names']);
In this example, I used ?P<names> to name the capture group, which makes the matches easier to retrieve.
I've made a Regex101 and a PHP sandbox for testing:
https://regex101.com/r/ZFWvCG/1
http://sandbox.onlinephpfunctions.com/code/1d04ce64a2a290994bf0effd7cf8f0039f20277b
Regex Valid Twitter Mention
Here's a regex that should work:
/^(?!.*\bRT\b)(?:.+\s)?@\w+/i
Explanation:
/^ //Start of the string.
(?!.*\bRT\b) //Verify that RT is not in the string as a separate word.
(?:.+\s)? //Optionally match any chars followed by a whitespace char.
//Note: (?: ) makes the group non-capturing.
@\w+ //Find @ followed by one or more word chars.
/i //Make it case insensitive.
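A few inputs make the behaviour concrete. This Python sketch uses re.IGNORECASE in place of the /i flag; the sample strings are invented.

```python
import re

# Same pattern as above; re.IGNORECASE plays the role of the /i flag.
valid_mention = re.compile(r"^(?!.*\bRT\b)(?:.+\s)?@\w+", re.IGNORECASE)

samples = [
    "@alice hello",          # mention at the start
    "hello there @alice",    # mention after other text
    "RT @alice hello",       # retweet marker, rejected
    "rt this: @alice",       # lowercase rt is rejected too
    "mail bob@example.com",  # @ is not preceded by whitespace
]

results = [bool(valid_mention.search(s)) for s in samples]
print(results)  # [True, True, False, False, False]
```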
regex for mentions
You may use the following regex:
/\B@\w+/g
\B matches at a non-word boundary, thus, it requires a non-word char (or start of string) to be right before @.
See the regex demo
var re = /\B@\w+/g;
var str = 'The @dog went to the park.\nBut not here: The d@og went to the park.\nOr here: The@dog went to the park.';
var res = str.match(re);
document.body.innerHTML = "<pre>" + JSON.stringify(res, 0, 4) + "</pre>";
How to fix this regex for mentions and hashtags?
Try this pattern:
(?:^|\s+)(?:(?<mention>@)|(?<hash>#))(?<item>\w+)(?=\s+)
Here it is broken down:
(?: creates a non-capturing group
^|\s+ matches the beginning of the string or whitespace
(?: creates a non-capturing group
(?<mention>@)|(?<hash>#) creates two groups, named mention and hash, that match @ and # respectively
(?<item>\w+) matches one or more word characters (letters, digits, underscores) and captures the item for easy usage
(?=\s+) creates a positive lookahead that requires whitespace after the item
Fiddle: Live Demo
You would then need to use the underlying language to trim the returning match to remove any leading/trailing whitespace.
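Reading the named groups instead of the whole match sidesteps the trimming step. Here is a hedged Python sketch: it uses Python's (?P<name>...) syntax, and the $ alternative in the lookahead is an addition (not part of the pattern above) so that a tag at the end of the text is also found. The sample text is made up.

```python
import re

# Adapted pattern: (?=\s|$) is an assumption added to accept end of string.
TAG = re.compile(r"(?:^|\s)(?:(?P<mention>@)|(?P<hash>#))(?P<item>\w+)(?=\s|$)")

text = "@ann shipped #release42 with help from @bob"

# Which marker group matched tells us the kind; "item" holds the bare name.
tags = [
    ("mention" if m.group("mention") else "hash", m.group("item"))
    for m in TAG.finditer(text)
]
print(tags)  # [('mention', 'ann'), ('hash', 'release42'), ('mention', 'bob')]
```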
Update
Since you mentioned that you were using C#, I thought that I'd provide a .NET solution that does not require RegEx. I did not test the results, but I would guess that it is also faster than using RegEx.
Personally, my flavor of .NET is Visual Basic, so I'm providing a VB.NET solution, but you can just as easily run it through a converter, since I never use anything that can't be used in C#:
Private Function FindTags(ByVal lead As Char, ByVal source As String) As String()
    Dim matches As List(Of String) = New List(Of String)
    Dim current_index As Integer = 0
    'Loop through all but the last character in the source
    For index As Integer = 0 To source.Length - 2
        'Reset the current index
        current_index = index
        'Check if the current character is a "@" or "#" and either we're starting at the beginning of the String or the last character was whitespace and then if the next character is a letter, digit, or end of the String
        If source(index) = lead AndAlso (index = 0 OrElse Char.IsWhiteSpace(source, index - 1)) AndAlso (Char.IsLetterOrDigit(source, index + 1) OrElse index + 1 = source.Length - 1) Then
            'Loop until the next character is no longer a letter or digit
            Do
                current_index += 1
            Loop While current_index + 1 < source.Length AndAlso Char.IsLetterOrDigit(source, current_index + 1)
            'Check if we're at the end of the line or the next character is whitespace
            If current_index = source.Length - 1 OrElse Char.IsWhiteSpace(source, current_index + 1) Then
                'Add the match to the collection
                matches.Add(source.Substring(index, current_index + 1 - index))
            End If
        End If
    Next
    Return matches.ToArray()
End Function
Fiddle: Live Demo
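For readers who do not use .NET, the same scanning idea can be sketched in Python. This is a loose adaptation rather than a line-by-line port: the function name find_tags is hypothetical, and it always requires at least one letter or digit after the lead character.

```python
def find_tags(lead: str, source: str) -> list[str]:
    """Collect tags that start with `lead` and are delimited by whitespace."""
    matches = []
    n = len(source)
    for i, ch in enumerate(source):
        # The lead must sit at the start of the text or right after whitespace.
        if ch != lead or (i > 0 and not source[i - 1].isspace()):
            continue
        # Advance over the run of letters and digits that follows the lead.
        j = i + 1
        while j < n and source[j].isalnum():
            j += 1
        # Keep non-empty tags that end at whitespace or at the end of the text.
        if j > i + 1 and (j == n or source[j].isspace()):
            matches.append(source[i:j])
    return matches


print(find_tags("@", "hi @ann, meet @bob and a@b @carl"))  # ['@bob', '@carl']
print(find_tags("#", "#one two #three"))                   # ['#one', '#three']
```

As in the function above, @ann, is skipped because the tag is followed by a comma rather than whitespace.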