How to select domain name from email address
Assuming the domain is a single-word domain such as gmail.com or yahoo.com, use:
select (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1))
The inner SUBSTR gets the part of the email address to the right of the @, and the outer SUBSTRING_INDEX cuts the result off at the first period.
Otherwise, if the domain is expected to contain multiple words, such as mail.yahoo.com, use:
select (SUBSTR(email, INSTR(email, '@') + 1, LENGTH(email) - (INSTR(email, '@') + 1) - LENGTH(SUBSTRING_INDEX(email,'.',-1))))
The expression
LENGTH(email) - (INSTR(email, '@') + 1) - LENGTH(SUBSTRING_INDEX(email,'.',-1))
gives the length of the domain without the TLD (the .com, .biz, etc. part). It finds the TLD by using SUBSTRING_INDEX with a negative count, which counts delimiters from right to left.
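To sanity-check the length arithmetic, here is a rough Python sketch of both queries. The helper names first_label and domain_without_tld are made up for illustration, and the code assumes the address contains an @ followed by at least one period.

```python
def first_label(email: str) -> str:
    """Mirror of SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1), '.', 1)."""
    after_at = email[email.index("@") + 1:]   # SUBSTR(email, INSTR(email, '@') + 1)
    return after_at.split(".", 1)[0]          # cut at the first period


def domain_without_tld(email: str) -> str:
    """Mirror of the second query, keeping MySQL's 1-based positions."""
    at_pos = email.index("@") + 1             # INSTR(email, '@') is 1-based
    tld_len = len(email.rsplit(".", 1)[-1])   # LENGTH(SUBSTRING_INDEX(email, '.', -1))
    length = len(email) - (at_pos + 1) - tld_len
    start = at_pos + 1                        # 1-based start, as passed to SUBSTR
    return email[start - 1:start - 1 + length]


print(first_label("user@gmail.com"))              # gmail
print(domain_without_tld("user@mail.yahoo.com"))  # mail.yahoo
```

One difference to keep in mind: MySQL's LENGTH counts bytes, while Python's len counts characters, so the two only agree for single-byte text.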
Regex get domain name from email
[^@] means "match one symbol that is not an @ sign". That is not what you are looking for. Use a lookbehind (?<=@) for the @ and your lookahead (?=\.) for the \. to extract the server name in the middle:
(?<=@)[^.]+(?=\.)
The middle portion, [^.]+, means "one or more non-dot characters".
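As a quick check of the pattern, here is a minimal Python sketch using the standard re module. The function name server_name is only illustrative, and returning None for non-matching input is an assumption about what the caller wants.

```python
import re

# Lookbehind for "@", then one or more non-dot characters, then a lookahead for ".".
SERVER_NAME = re.compile(r"(?<=@)[^.]+(?=\.)")


def server_name(email: str):
    """Return the label between '@' and the first following period, or None."""
    match = SERVER_NAME.search(email)
    return match.group(0) if match else None


print(server_name("hey@mycorp.com"))      # mycorp
print(server_name("hey@mail.yahoo.com"))  # mail
print(server_name("no-at-sign.com"))      # None
```

Because both the @ and the period sit inside lookarounds, neither ends up in the matched text.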
How to extract domain from email address with Pandas
I believe you need split, then select the second value of each list by indexing:
import pandas as pd

df = pd.DataFrame({'email':['kkk@gmail.com','aa@yahoo.com']})
df['domain'] = df['email'].str.split('@').str[1]
# faster solution if there are no NaN values
#df['domain'] = [x.split('@')[1] for x in df['email']]
print(df)
           email     domain
0  kkk@gmail.com  gmail.com
1   aa@yahoo.com  yahoo.com
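The list comprehension breaks on missing values because NaN is a float and has no split method. Below is a small sketch of a guarded version in plain Python. The function name domain_or_none is hypothetical, and returning None for missing or malformed rows is just one possible choice.

```python
import math


def domain_or_none(value):
    """Return the segment after the first '@', or None for missing or malformed values."""
    if not isinstance(value, str) or "@" not in value:
        return None
    return value.split("@")[1]


emails = ["kkk@gmail.com", "aa@yahoo.com", math.nan, None]
domains = [domain_or_none(x) for x in emails]
print(domains)  # ['gmail.com', 'yahoo.com', None, None]
```

With pandas, the same function could be applied through df['email'].map(domain_or_none), at the cost of some of the speed the plain comprehension offers.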
One-liner to extract domain from email address
Not a one-liner, and it only works on Scala 2.13 or later, but this seems very clear to me.
def extractDomain(email: String): Option[String] = email match {
  case s"${_}@${domain}" => Some(domain)
  case _ => None
}
(Note: if there is more than one @ sign, this will split on the first one.)
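For comparison, here is a rough Python sketch of the same first-@ behavior, next to a variant that splits on the last @ instead. It is not a translation of the Scala pattern match, and the function names are illustrative only.

```python
from typing import Optional


def extract_domain(email: str) -> Optional[str]:
    """Split on the first '@'; return None when there is no '@' at all."""
    _, sep, domain = email.partition("@")
    return domain if sep else None


def extract_domain_last(email: str) -> Optional[str]:
    """Variant that splits on the last '@' instead."""
    _, sep, domain = email.rpartition("@")
    return domain if sep else None


print(extract_domain("hey@mycorp.com"))  # mycorp.com
print(extract_domain("a@b@c.com"))       # b@c.com
print(extract_domain_last("a@b@c.com"))  # c.com
print(extract_domain("no-at-sign"))      # None
```

Which variant is appropriate depends on how addresses with several @ signs should be treated, so it is worth deciding that explicitly.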
How to get domain from email
>> "hey@mycorp.com".split("@").last
=> "mycorp.com"
Spark: Extract domain from email address in dataframe
You can simply use the built-in regexp_extract function to get the domain name from the email address.
import org.apache.spark.sql.functions.regexp_extract
import spark.implicits._ // assumes a SparkSession named spark, as in spark-shell

// create an example dataframe
val df = Seq((1, "ii@koko.com"),
  (2, "lol@fsa.org"),
  (3, "kokojambo@mon.eu"))
  .toDF("id", "email")

// original dataframe
df.show(false)
// output
// +---+----------------+
// |id |email           |
// +---+----------------+
// |1  |ii@koko.com     |
// |2  |lol@fsa.org     |
// |3  |kokojambo@mon.eu|
// +---+----------------+

// use a regex to get the domain name
df.withColumn("domain",
  regexp_extract($"email", "(?<=@)[^.]+(?=\\.)", 0))
  .show(false)
// output
// +---+----------------+------+
// |id |email           |domain|
// +---+----------------+------+
// |1  |ii@koko.com     |koko  |
// |2  |lol@fsa.org     |fsa   |
// |3  |kokojambo@mon.eu|mon   |
// +---+----------------+------+