How to Get Domain from Email

How to select domain name from email address

Assuming that the domain is a single-word domain (one label between the @ and the first period), use:

select (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1))

The inner SUBSTR gets the right part of the email address after @ and the outer SUBSTRING_INDEX will cut off the result at the first period.

Otherwise, if the domain is expected to contain multiple words (several period-separated labels before the TLD), use:

select (SUBSTR(email, INSTR(email, '@') + 1, LENGTH(email) - (INSTR(email, '@') + 1) - LENGTH(SUBSTRING_INDEX(email,'.',-1)))) 

LENGTH(email) - (INSTR(email, '@') + 1) - LENGTH(SUBSTRING_INDEX(email,'.',-1)) gives the length of the domain without the TLD (the .com, .biz, etc. part). SUBSTRING_INDEX with a negative count counts delimiters from right to left, so here it returns the text after the last period, which is the TLD.
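For readers who want to check the two queries outside the database, here is a minimal Python sketch of the same two ideas. The function names and the sample address are made up for illustration, and the sketch assumes the address contains an @ and that its domain contains at least one period; it is not a byte-for-byte port of the MySQL edge-case behavior.

```python
def first_label(email: str) -> str:
    """Roughly mirrors SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1), '.', 1)."""
    after_at = email[email.index("@") + 1:]  # text to the right of the first '@'
    return after_at.split(".", 1)[0]         # keep everything before the first period


def without_tld(email: str) -> str:
    """Roughly mirrors the second query: the domain with the final '.tld' removed."""
    after_at = email[email.index("@") + 1:]  # text to the right of the first '@'
    return after_at.rsplit(".", 1)[0]        # drop everything from the last period on


# Hypothetical sample address, not taken from the original answer.
sample = "user@mail.example.com"
print(first_label(sample))
print(without_tld(sample))
```

Unlike the SQL version, str.index raises ValueError when there is no @ in the input, so callers may want to validate first.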

Regex get domain name from email

[^@] means "match one character that is not an @ sign". That is not what you are looking for. Use a lookbehind (?<=@) for the @ and your lookahead (?=\.) for the period to extract the server name in the middle:

(?<=@)[^.]+(?=\.)

The middle portion [^.]+ means "one or more non-dot characters".
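As a quick way to try the lookbehind/lookahead pattern described here, the following Python sketch applies (?<=@)[^.]+(?=\.) with the standard re module. The helper name and the sample addresses are assumptions for illustration only.

```python
import re

# Lookbehind for '@', one or more non-dot characters, lookahead for a literal '.'.
DOMAIN_LABEL = re.compile(r"(?<=@)[^.]+(?=\.)")


def domain_label(email: str):
    """Return the label between '@' and the next period, or None if there is no match."""
    match = DOMAIN_LABEL.search(email)
    return match.group(0) if match else None


# Hypothetical sample addresses, not taken from the original answer.
print(domain_label("user@example.com"))
print(domain_label("user@localhost"))
```

Because the lookarounds do not consume characters, the match contains only the label itself, without the @ or the period.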


How to extract domain from email address with Pandas

I believe you need to split on @ and select the second value of each list by indexing:

import pandas as pd

df = pd.DataFrame({'email':['','']})

# split each address on '@' and take the second piece
df['domain'] = df['email'].str.split('@').str[1]
# faster solution if there are no NaN values
#df['domain'] = [x.split('@')[1] for x in df['email']]
print(df)
email domain
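The list comprehension in the comment above fails on missing values or on strings without an @. One possible guard, sketched in plain Python with a hypothetical helper name and made-up sample values, is:

```python
def domain_or_none(value):
    """Return the part after the first '@', or None for non-strings and strings without '@'."""
    if not isinstance(value, str) or "@" not in value:
        return None
    return value.split("@")[1]


# Hypothetical sample values: a valid address, a NaN, a None, and a string with no '@'.
emails = ["alice@example.com", float("nan"), None, "no-at-sign"]
domains = [domain_or_none(x) for x in emails]
print(domains)
```

The same helper could be used inside the comprehension over df['email'], at the cost of some of the speed advantage the comment mentions.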

One-liner to extract domain from email address

Not a one-liner, and it only works on Scala 2.13 or later, but this seems very clear to me.

def extractDomain(email: String): Option[String] = email match {
  case s"${_}@${domain}" => Some(domain)
  case _ => None
}

(Note: if there is more than one @ sign, this will just split on the first one.)
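The choice between splitting on the first or the last @ matters for unusual addresses. A minimal Python sketch of both behaviors follows; the function names and the sample address are hypothetical.

```python
def domain_after_first_at(email: str):
    """Split on the first '@'; return None when there is no '@' at all."""
    _local, sep, domain = email.partition("@")
    return domain if sep else None


def domain_after_last_at(email: str):
    """Split on the last '@'; return None when there is no '@' at all."""
    _local, sep, domain = email.rpartition("@")
    return domain if sep else None


# Hypothetical address with two '@' signs, to show where the two approaches differ.
odd = "a@b@example.com"
print(domain_after_first_at(odd))
print(domain_after_last_at(odd))
```

Both helpers return None instead of raising when the input has no @, in the same spirit as the Option result of the Scala version.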

How to get domain from email

>> "".split("@").last
=> ""

Spark: Extract domain from email address in dataframe

You can simply use the built-in regexp_extract function to get the domain name from the email address.

//create an example dataframe
val df = Seq((1, ""),
(2, ""),
(3, ""))
.toDF("id", "email")

//original dataframe
// +---+----------------+
// |id |email |
// +---+----------------+
// |1 | |
// |2 | |
// |3 ||
// +---+----------------+

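//use a regex to get the domain name into a new "domain" column
import org.apache.spark.sql.functions.regexp_extract
df.withColumn("domain", regexp_extract($"email", "(?<=@)[^.]+(?=\\.)", 0))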

// +---+----------------+------+
// |id |email |domain|
// +---+----------------+------+
// |1 | |koko |
// |2 | |fsa |
// |3 ||mon |
// +---+----------------+------+

