Regular expression for twitter username
This should do: ^@?(\w){1,15}$
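A minimal sketch of how this pattern could be used for validation in Python. The helper name is_valid_handle is hypothetical, and the re.ASCII flag is an assumption added here so that \w stays limited to ASCII letters, digits and underscores. fullmatch replaces the ^ and $ anchors so that a trailing newline is not accepted.

```python
import re

# Optional leading @, then 1 to 15 word characters.
# re.ASCII is an assumption: it keeps \w to [A-Za-z0-9_].
HANDLE_RE = re.compile(r"@?(\w){1,15}", re.ASCII)


def is_valid_handle(text):
    """Return True if the whole string looks like a handle (hypothetical helper)."""
    return HANDLE_RE.fullmatch(text) is not None


print(is_valid_handle("@jack"))
print(is_valid_handle("bad-name"))
```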
regex for Twitter username
(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9-_]+)
I've used this as it disregards emails.
Here is a sample tweet:
@Hello how are @you doing @my_friend, email @000 me @ whats.up@example.com @shahmirj
Matches:
- @Hello
- @you
- @my_friend
- @shahmirj
It will also work for hashtags; I use the same expression with the @ changed to #.
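As a rough check, the expression can be run against the sample tweet with Python's re module. This is only a sketch: the names MENTION_RE, tweet and mentions are made up for illustration, and other regex engines may treat the nested lookbehind differently.

```python
import re

# The expression from the answer, unchanged. Both alternatives inside the
# outer lookbehind are zero-width, which Python's re module requires.
MENTION_RE = re.compile(r"(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9-_]+)")

tweet = "@Hello how are @you doing @my_friend, email @000 me @ whats.up@example.com @shahmirj"

# group(0) is the mention including the @, group(1) is the bare name.
mentions = [m.group(0) for m in MENTION_RE.finditer(tweet)]
print(mentions)
```

Note that the pattern needs at least two characters after the @, so a one-letter handle would not be matched.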
RegEx for twitter usernames
You can assert a whitespace boundary to the left, and assert that no @ occurs in the rest of the line to the right.
(?<!\S)@(\w{4,15})\b(?![^@\n]*@)
Explanation
(?<!\S) - Assert a whitespace boundary to the left
@ - Match literally
(\w{4,15})\b - Capture 4-15 word chars followed by a word boundary
(?![^@\n]*@) - Assert that what follows to the right is not optional repetitions of any char except @ or a newline, followed by @
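A short Python sketch of how the pattern behaves. The sample strings are invented for illustration. Because of the final lookahead, a mention is only matched when no other @ follows it on the same line, so only the last mention in a line is returned.

```python
import re

PATTERN = re.compile(r"(?<!\S)@(\w{4,15})\b(?![^@\n]*@)")

samples = [
    "follow @alice_w for news",   # single mention, nothing after it
    "hi @alice_w and @bobby",     # first mention is followed by another @
    "user@example.com",           # no whitespace boundary before the @
]

# findall returns the contents of the capture group, without the @.
results = [PATTERN.findall(s) for s in samples]
for sample, found in zip(samples, results):
    print(sample, "->", found)
```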
Python Regex Twitter username including @
You basically need to put @ inside the capturing group you are returning. However, the pattern is highly cryptic and can be greatly simplified:
(?<![\w.-])@[A-Za-z][\w-]+
Details
(?<![\w.-]) - a negative lookbehind that fails the match if, immediately to the left of the current location, there is a word char, . or -
@ - a @ char
[A-Za-z] - an ASCII letter
[\w-]+ - 1 or more word chars or hyphens.
In Python 3, compile the pattern with the re.ASCII flag to make \w only match ASCII letters, digits and underscores.
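The effect of the flag can be seen in a small sketch like the following. The sample strings are invented for illustration; the accented handle is there only to show the difference between Unicode-aware and ASCII-only \w.

```python
import re

pattern = r"(?<![\w.-])@[A-Za-z][\w-]+"

text = "Ping @alice and @bob-smith, not me@example.com or @1abc"
found = re.findall(pattern, text, re.ASCII)
print(found)

# "\u00e9" is an accented e. Without re.ASCII it counts as a word char.
accented = "@jos\u00e9"
unicode_match = re.findall(pattern, accented)
ascii_match = re.findall(pattern, accented, re.ASCII)
print(unicode_match, ascii_match)
```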
Regular Expression in R for twitter username
The ^@.[A-z0-9_].:$ pattern matches the start of string (^), then a @, then any char (with .), then one char from the [A-z0-9_] class (a letter, a digit, _, `, [, \, ] or ^, because the A-z range also covers the ASCII punctuation between Z and a), then any char again, a : and end of string ($). So, it can match, say, a @§`‘: string.
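The same character class behaves identically in Python's re module, so the point can be checked with a quick sketch there. The variable names are hypothetical, and the odd sample string is the one mentioned above, written with escapes.

```python
import re

# ASCII characters that sit between "Z" and "a" and are covered by A-z.
extras = "[\\]^_`"
covered = [c for c in extras if re.fullmatch(r"[A-z]", c)]
print(covered)

# The original pattern accepts a string that is clearly not a username:
# @, section sign, backtick, left single quote, colon.
odd_match = re.search(r"^@.[A-z0-9_].:$", "@\u00a7`\u2018:")
print(odd_match is not None)
```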
You may use stringr str_extract_all like this:
str_extract_all(x, "(?<=@)[^\\s:]+")
If you must check for the : presence, add a lookahead check:
str_extract_all(x, "(?<=@)[^\\s:]+(?=:)")
                                  ^^^^^
Details
(?<=@) - a location in string that is immediately preceded with @ symbol
[^\\s:]+ - 1 or more (due to +) chars other than whitespace and :
(?=:) - a positive lookahead that requires the presence of : immediately to the right of the current location.
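For comparison, the two patterns should behave the same way in Python's re module. This is a sketch under that assumption, with an invented sample string. In a Python raw string the backslash is written once (\s), whereas the R string literal needs it doubled (\\s).

```python
import re

x = "RT @username: lorem ipsum @cjoudrey etc"

# Everything after an @ up to whitespace or a colon.
all_names = re.findall(r"(?<=@)[^\s:]+", x)
print(all_names)

# Same, but only when a colon immediately follows the name.
with_colon = re.findall(r"(?<=@)[^\s:]+(?=:)", x)
print(with_colon)
```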
Regex for twitter usernames but for some matching a specified format - Python
You can use
(?<!\S)@(?!COMPANY|CEO|MEDIA)\b[^@\s]+
The pattern matches:
(?<!\S)@ - Assert a whitespace boundary to the left, then match @
(?!COMPANY|CEO|MEDIA)\b - Negative lookahead to assert not any of the alternatives directly to the right
[^@\s]+ - Match 1+ times any char except @ or a whitespace char.
In the replacement you could use "@MENTION"
import re

# Replace every @mention except the excluded names with a placeholder.
pattern = re.compile(r"(?<!\S)@(?!COMPANY|CEO|MEDIA)\b[^@\s]+")
tests = ["hello @CEO said the @user in the @MEDIA", "there is a new @EMPLOYEE said the @user"]
for t in tests:
    result = pattern.sub("@MENTION", t)
    print(result)
Output
hello @CEO said the @MENTION in the @MEDIA
there is a new @MENTION said the @MENTION
How to validate a Twitter username using Regex
To validate if a string is a valid Twitter handle:
function validate_username($username)
{
    // preg_match returns 1 when the string matches and 0 when it does not
    return preg_match('/^[A-Za-z0-9_]{1,15}$/', $username);
}
If you are trying to match @username within a string, for example:
RT @username: lorem ipsum @cjoudrey etc...
use the following:
$string = 'RT @username: lorem ipsum @cjoudrey etc...';
preg_match_all('/@([A-Za-z0-9_]{1,15})/', $string, $usernames);
print_r($usernames);
You can use the latter with preg_replace_callback to linkify usernames in a string.
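The linkify idea can be sketched in Python with re.sub and a callback, which plays a role similar to preg_replace_callback. The function name linkify and the HTML markup are assumptions made for illustration; the twitter.com profile URL format is the one that appears elsewhere in this thread.

```python
import re

MENTION_RE = re.compile(r"@([A-Za-z0-9_]{1,15})")


def linkify(text):
    """Wrap each @username in an anchor tag (hypothetical helper)."""
    def to_link(match):
        name = match.group(1)
        return f'<a href="https://twitter.com/{name}">@{name}</a>'

    # The callback runs once per match and returns the replacement text.
    return MENTION_RE.sub(to_link, text)


print(linkify("RT @username: lorem ipsum"))
```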
Edit: Twitter also open sourced text libraries for Java and Ruby for matching usernames, hashtags, etc. You could probably look into the code and find the regex patterns they use.
Edit (2): Here is a PHP port of the Twitter Text Library: https://github.com/mzsanford/twitter-text-php#readme
Regular expression to search for specific twitter username
\b matches a word boundary, but @ is not a word character, so if it occurs after a space, the match will fail. Try removing the word boundary there, removing the extra groups, and adding a character set at the end for [.?!] to include the final punctuation, and you get:
[^.?!]*@A_Person\b.*?[^.?!]*[.?!]
You also might consider including a check for the start of the string or the end of the last sentence, otherwise the engine will go through a lot of steps while going through areas without any matches. Perhaps use
(?:^|(?<=[.?!])\s*)
which will match the start of the string, or will lookbehind for [.?!] possibly followed by spaces. Put those together and you get
(?:^|(?<=[.?!])\s*)([^.?!]*@A_Person\b.*?[^.?!]*[.?!])
where the string you want is in the first group (no leading spaces).
https://regex101.com/r/447KsF/3
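A quick sketch of the combined pattern in Python, assuming the re module handles it the same way as the demo above. The sample text is invented for illustration.

```python
import re

SENTENCE_RE = re.compile(
    r"(?:^|(?<=[.?!])\s*)([^.?!]*@A_Person\b.*?[^.?!]*[.?!])"
)

text = "Hello there. I met @A_Person today! Bye."

# findall returns the first group: the sentence without leading spaces.
sentences = SENTENCE_RE.findall(text)
print(sentences)
```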
PHP Get Twitter username from URL (Regex)
preg_match('/https?:\/\/twitter\.com\/(?<name>[^\?]+)\??.*/', 'https://twitter.com/jack?lang=en', $m);
var_dump(trim($m['name']));
$path = parse_url('https://twitter.com/jack?lang=en',PHP_URL_PATH);
var_dump(str_replace('/','', $path));
string(4) "jack"
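The parse_url approach has a close counterpart in Python's standard library. The sketch below is an illustration only: the helper name username_from_url is hypothetical, and unlike the str_replace call above it keeps just the first path segment instead of removing every slash.

```python
from urllib.parse import urlparse


def username_from_url(url):
    """Return the first path segment of a profile URL (hypothetical helper)."""
    path = urlparse(url).path          # e.g. "/jack", query string excluded
    return path.strip("/").split("/")[0]


print(username_from_url("https://twitter.com/jack?lang=en"))
```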