Regex for Twitter Username

Regular expression for twitter username

This should do:
^@?(\w){1,15}$
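
As a minimal sketch, the pattern can be wrapped in a small Python validator. The name is_valid_handle is hypothetical, and the re.ASCII flag is an assumption added here so that \w stays limited to ASCII letters, digits, and underscores.

```python
import re

# Pattern from the answer above; re.ASCII is an assumption added for this sketch
HANDLE_RE = re.compile(r"^@?(\w){1,15}$", re.ASCII)


def is_valid_handle(candidate: str) -> bool:
    """Return True if candidate looks like a handle, with or without a leading @."""
    # fullmatch also rejects a trailing newline, which `$` alone would tolerate
    return HANDLE_RE.fullmatch(candidate) is not None


print(is_valid_handle("@jack"))
print(is_valid_handle("not-a-handle"))
```

Using fullmatch rather than match is a design choice here: it keeps the check strict even if the input still carries a line ending.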

regex for Twitter username

(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9-_]+)

I've used this as it disregards emails.

Here is a sample tweet:

@Hello how are @you doing @my_friend, email @000 me @ whats.up@example.com @shahmirj

Matches:

  • @Hello
  • @you
  • @my_friend
  • @shahmirj

It also works for hashtags: I use the same expression with the @ changed to #.
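
A minimal Python sketch of the same idea follows. It writes the left-hand check as a single negative lookbehind, (?<![A-Za-z0-9_.-]), which is an assumed equivalent of the original expression that avoids nested lookbehinds. The helper name find_tags and its marker parameter are hypothetical.

```python
import re


def find_tags(text: str, marker: str = "@") -> list:
    """Return the names that follow `marker`, skipping ones embedded in emails."""
    pattern = re.compile(
        r"(?<![A-Za-z0-9_.-])"        # not preceded by a name-like character
        + re.escape(marker)           # "@" for mentions, "#" for hashtags
        + r"([A-Za-z]+[A-Za-z0-9_-]+)"
    )
    return pattern.findall(text)


tweet = ("@Hello how are @you doing @my_friend, email @000 me @ "
         "whats.up@example.com @shahmirj")
print(find_tags(tweet))                         # ['Hello', 'you', 'my_friend', 'shahmirj']
print(find_tags("#regex is #fun", marker="#"))  # ['regex', 'fun']
```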

RegEx for twitter usernames

You can assert a whitespace boundary to the left, and assert that no @ follows on the same line to the right.

(?<!\S)@(\w{4,15})\b(?![^@\n]*@)

Explanation

  • (?<!\S) Assert a whitespace boundary to the left
  • @ Match literally
  • (\w{4,15})\b Capture 4-15 word chars followed by a word boundary
  • (?![^@\n]*@) Negative lookahead: assert that no @ occurs further to the right on the same line
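
To see what the lookahead does in practice, here is a small Python sketch. The function name last_mentions and the sample strings are made up for illustration. Because no @ may follow on the same line, only the last mention on each line is returned.

```python
import re

# Pattern from the answer above, unchanged
LAST_MENTION = re.compile(r"(?<!\S)@(\w{4,15})\b(?![^@\n]*@)")


def last_mentions(text: str) -> list:
    """Return the captured names (without @) that have no later @ on their line."""
    return LAST_MENTION.findall(text)


print(last_mentions("thanks @alice and @bobby"))   # ['bobby']
print(last_mentions("@first line\n@second line"))  # ['first', 'second']
```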

Python Regex Twitter username including @

You basically need to put @ inside the capturing group you are returning. However, the pattern is highly cryptic and can be greatly simplified.

(?<![\w.-])@[A-Za-z][\w-]+

Details

  • (?<![\w.-]) - a negative lookbehind that fails the match if, immediately to the left of the current location, there is a word char, or . or -
  • @ - a @ char
  • [A-Za-z] - an ASCII letter
  • [\w-]+ - 1 or more word chars or hyphens.

In Python 3, compile the pattern with the re.ASCII flag to make \w match only ASCII letters, digits, and underscores.
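
The effect of the flag can be sketched as follows. The helper find_mentions, its ascii_only parameter, and the sample text are hypothetical; the pattern is the one given above.

```python
import re

PATTERN = r"(?<![\w.-])@[A-Za-z][\w-]+"


def find_mentions(text: str, ascii_only: bool = False) -> list:
    """Return the mentions, including the leading @."""
    flags = re.ASCII if ascii_only else 0
    return re.findall(PATTERN, text, flags=flags)


text = "mail me@example.com or ping @John_Doe and @Jos\u00e9"
print(find_mentions(text))                   # \w is Unicode-aware by default
print(find_mentions(text, ascii_only=True))  # \w limited to [A-Za-z0-9_]
```

With the default flags the last mention keeps its accented letter; with re.ASCII the match stops just before it.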

Regular Expression in R for twitter username

The ^@.[A-z0-9_].:$ pattern matches the start of the string (^), then an @, then any single character (.), then one character from [A-z0-9_] (the A-z range covers not only ASCII letters but also [, \, ], ^, _ and `), then any single character again, then a : and the end of the string ($). So it can match a string such as @§`‘:.

You can use str_extract_all from stringr like this:

str_extract_all(x, "(?<=@)[^\\s:]+")

If you must check for the : presence, add a lookahead check:

str_extract_all(x, "(?<=@)[^\\s:]+(?=:)")

Details

  • (?<=@) - a location in the string that is immediately preceded by an @ symbol
  • [^\\s:]+ - 1 or more (due to +) chars other than whitespace and :
  • (?=:) - a positive lookahead that requires the presence of : immediately to the right of the current location.
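
For comparison, the same two patterns can be tried in Python, where the backslash does not need doubling inside a raw string. This is only a sketch: the function name names_after_at, its require_colon parameter, and the sample string are made up for illustration.

```python
import re


def names_after_at(text: str, require_colon: bool = False) -> list:
    """Return the runs of non-space, non-colon characters that directly follow an @."""
    pattern = r"(?<=@)[^\s:]+"
    if require_colon:
        pattern += r"(?=:)"  # keep only names immediately followed by ':'
    return re.findall(pattern, text)


sample = "RT @username: lorem ipsum @cjoudrey etc"
print(names_after_at(sample))                      # ['username', 'cjoudrey']
print(names_after_at(sample, require_colon=True))  # ['username']
```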

Regex for twitter usernames but for some matching a specified format - Python

You can use

(?<!\S)@(?!COMPANY|CEO|MEDIA)\b[^@\s]+

The pattern matches:

  • (?<!\S)@ Assert a whitespace boundary to the left, then match @
  • (?!COMPANY|CEO|MEDIA)\b Negative lookahead asserting that none of the alternatives is directly to the right, then a word boundary
  • [^@\s]+ Match 1+ times any char except @ or a whitespace char

In the replacement you could use "@MENTION"

import re

pattern = re.compile(r"(?<!\S)@(?!COMPANY|CEO|MEDIA)\b[^@\s]+")
tests = [
    "hello @CEO said the @user in the @MEDIA",
    "there is a new @EMPLOYEE said the @user",
]
for t in tests:
    # Replace every mention that is not one of the excluded handles
    print(pattern.sub("@MENTION", t))

Output

hello @CEO said the @MENTION in the @MEDIA
there is a new @MENTION said the @MENTION

How to validate a Twitter username using Regex

To validate if a string is a valid Twitter handle:

function validate_username($username)
{
    return preg_match('/^[A-Za-z0-9_]{1,15}$/', $username);
}

To match @username occurrences within a string, for example:

RT @username: lorem ipsum @cjoudrey etc...

use the following:

$string = 'RT @username: lorem ipsum @cjoudrey etc...';
preg_match_all('/@([A-Za-z0-9_]{1,15})/', $string, $usernames);
print_r($usernames);

You can use the latter with preg_replace_callback to linkify usernames in a string.
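
The linkify idea can be sketched in Python with re.sub and a callback, which plays the role that preg_replace_callback plays in PHP. The function name linkify and the https://twitter.com/ link format are assumptions for illustration, and the sketch assumes the input text is already safe to embed in HTML.

```python
import re

# Same character class and length limit as the PHP pattern above
MENTION_RE = re.compile(r"@([A-Za-z0-9_]{1,15})")


def linkify(text: str) -> str:
    """Wrap each @username in an anchor tag pointing at an assumed profile URL."""
    def to_link(match):
        name = match.group(1)
        return f'<a href="https://twitter.com/{name}">@{name}</a>'

    return MENTION_RE.sub(to_link, text)


print(linkify("RT @username: lorem ipsum @cjoudrey etc..."))
```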

Edit: Twitter also open sourced text libraries for Java and Ruby for matching usernames, hashtags, etc. You could look into the code to find the regex patterns they use.

Edit (2): Here is a PHP port of the Twitter Text Library: https://github.com/mzsanford/twitter-text-php#readme

Regular expression to search for specific twitter username

\b matches a word boundary, but @ is not a word character, so if the @ occurs after a space, \b cannot match in front of it and the match fails. Try removing the word boundary there, removing the extra groups, and adding a character set [.?!] at the end to include the final punctuation, and you get:

[^.?!]*@A_Person\b.*?[^.?!]*[.?!]

You also might consider including a check for the start of the string or the end of the last sentence, otherwise the engine will go through a lot of steps while going through areas without any matches. Perhaps use

(?:^|(?<=[.?!])\s*)

which will match the start of the string, or will lookbehind for [.?!] possibly followed by spaces. Put those together and you get

(?:^|(?<=[.?!])\s*)([^.?!]*@A_Person\b.*?[^.?!]*[.?!])

where the string you want is in the first group (no leading spaces).

https://regex101.com/r/447KsF/3
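
A minimal Python sketch of the combined pattern follows. The function name sentence_mentioning and the sample sentences are hypothetical, and the handle is passed in as a parameter instead of being hard-coded as A_Person.

```python
import re


def sentence_mentioning(text: str, handle: str):
    """Return the first sentence that mentions @handle, or None if there is none."""
    pattern = re.compile(
        r"(?:^|(?<=[.?!])\s*)"                               # start of text or end of previous sentence
        r"([^.?!]*@" + re.escape(handle) + r"\b.*?[^.?!]*[.?!])"  # the sentence itself (group 1)
    )
    match = pattern.search(text)
    return match.group(1) if match else None


print(sentence_mentioning("Hello there. I met @A_Person today! Bye.", "A_Person"))
# I met @A_Person today!
```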

PHP Get Twitter username from URL (Regex)

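// Option 1: regex with a named capture group
preg_match('/https?:\/\/twitter\.com\/(?<name>[^\?]+)\??.*/', 'https://twitter.com/jack?lang=en', $m);
var_dump(trim($m['name']));

// Option 2: parse the URL and strip the slashes from its path
$path = parse_url('https://twitter.com/jack?lang=en', PHP_URL_PATH);
var_dump(str_replace('/', '', $path));

// Each var_dump prints:
// string(4) "jack"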
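
For comparison, the URL-parsing approach can be sketched in Python with the standard library. The function name handle_from_url is hypothetical, and taking only the first path segment is a design choice of this sketch, not something the PHP code above does.

```python
from urllib.parse import urlparse


def handle_from_url(url: str) -> str:
    """Return the first path segment of a profile URL, ignoring the query string."""
    path = urlparse(url).path           # e.g. "/jack"
    return path.strip("/").split("/")[0]


print(handle_from_url("https://twitter.com/jack?lang=en"))  # jack
```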

