Remove Domain Extension

$subject = 'just-a.domain.com';
$result = preg_split('/(?=\.[^.]+$)/', $subject);

This produces the following array:

$result[0] == 'just-a.domain';
$result[1] == '.com';
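
For comparison, here is a minimal Python sketch of the same split at the last dot. The helper name split_extension is hypothetical, and it uses str.rpartition rather than a regex lookahead, so treat it as an illustration of the idea, not a port of the PHP above.

```python
from typing import Tuple


def split_extension(host: str) -> Tuple[str, str]:
    """Split a host name at its last dot, keeping the dot with the extension."""
    name, dot, ext = host.rpartition(".")
    if not dot:
        # No dot at all: there is no extension to split off.
        return host, ""
    return name, dot + ext


print(split_extension("just-a.domain.com"))
```

A host without any dot comes back unchanged with an empty extension.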

PHP Remove Domain Name Extension from String

To get the host part of a URL use parse_url:

$host = parse_url($url, PHP_URL_HOST);
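
The same first step can be sketched in Python with urllib.parse from the standard library. The function name host_of is an assumption made for illustration.

```python
from urllib.parse import urlsplit


def host_of(url: str):
    """Return the host part of a URL, or None when the URL has no host."""
    # urlsplit only recognises a host when the URL has a scheme or starts with "//".
    return urlsplit(url).hostname


print(host_of("http://static.facebook.com/some/path"))
```

A bare string such as "static.facebook.com" has no scheme, so this sketch returns None for it instead of guessing.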

And for the rest see my answer to Remove domain extension.

PowerShell: Remove domain extensions in email list

As Fairy points out, you need to be aware of regex metacharacters such as the dot (.), which matches any character unless it is escaped.

I would like to remove all domain extensions.

If that is the case, you should not need to type in each one that you want to remove. You should be able to remove everything after and including the first period that follows the "@".

Since -replace also works on arrays, applying the replacement to each element, you don't need to use ForEach-Object:

(Get-Content $Sourcefile) -replace "(@.+?)\..*$",'$1' | Set-Content $Output

That will match everything from the "@" to the end of the line. It replaces that with just the "@" and what comes before the first period after it.
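
To see what the pattern does outside PowerShell, here is a small Python sketch that applies the same regex. The function name strip_email_extension and the sample addresses are assumptions for illustration.

```python
import re

# Same pattern as above: capture "@" plus the shortest run before the next
# period, then consume the rest of the line.
EMAIL_EXTENSION = re.compile(r"(@.+?)\..*$")


def strip_email_extension(address: str) -> str:
    """Drop everything from the first period after the @ onwards."""
    return EMAIL_EXTENSION.sub(r"\1", address)


for sample in ("user@example.co.uk", "first.last@example.com"):
    print(strip_email_extension(sample))
```

Periods before the "@" are left alone because the match can only start at the "@".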

If you really want to remove only certain domains, then you might be better off keeping a string array and building the regex from it. That makes changes easier and keeps the code clean.

$suffixesToRemove = "com","co.uk","nl","al"
$regex = "\.($(($suffixesToRemove|ForEach-Object{[regex]::Escape($_)}) -join "|"))$"
(Get-Content $Sourcefile) -replace $regex | Set-Content $Output

The computed regex string would look like this:

\.(com|co\.uk|nl|al)$

So it uses an alternation group with the metacharacters escaped.
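
The same "build the regex from a list" idea, sketched in Python. The helper name build_suffix_pattern is hypothetical, and the sketch uses a non-capturing group where the PowerShell version uses a capturing one.

```python
import re


def build_suffix_pattern(suffixes):
    """Build an anchored pattern that matches any one of the given suffixes."""
    # Escape each suffix so the dot in "co.uk" is matched literally.
    alternation = "|".join(re.escape(suffix) for suffix in suffixes)
    return re.compile(r"\.(?:" + alternation + r")$")


pattern = build_suffix_pattern(["com", "co.uk", "nl", "al"])
print(pattern.pattern)
print(pattern.sub("", "user@example.co.uk"))
```

Because the pattern is anchored at the end of the string, the order of the suffixes in the list should not change which one is removed.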

Removing domain extension from string using preg_replace?

You do have to escape the dot, but the pattern also has to be wrapped in delimiters, most commonly slashes, as in /pattern/:

// preg_replace returns the new string; the $ anchor limits it to a trailing ".com"
$host = preg_replace('/\.com$/', '', $host);

Side note: there are simpler ways to do this than reaching for the almighty regular expressions. For example:

$url = explode(".", "http://static.facebook.com");
echo $url[1]; // facebook
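
The explode example relies on the host having exactly three labels. A slightly more defensive sketch, in Python, parses the host first and counts labels from the right. The name second_level_label is hypothetical, and the sketch assumes a single-label extension such as .com.

```python
from urllib.parse import urlsplit


def second_level_label(url: str):
    """Return the label just before the last one in the URL's host, if any."""
    host = urlsplit(url).hostname
    if not host:
        return None
    labels = host.split(".")
    if len(labels) < 2:
        # A single-label host such as "localhost" has no extension to drop.
        return None
    return labels[-2]


print(second_level_label("http://static.facebook.com"))
```

For a two-label extension such as .co.uk this returns "co", so it is only a rough shortcut.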

Regex to retrieve domain.extension from a url

You could achieve the same result with the replace method, but match is somehow more suitable:

console.log(window.location.hostname.match(/[^\s.]+\.[^\s.]+$/)[0]);
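
A hedged Python sketch of the same pattern, with a guard for hosts that do not match. In the JavaScript one-liner, match returns null for a host such as "localhost", and indexing it with [0] would throw. The function name domain_and_extension is an assumption.

```python
import re

# Two dot-separated labels at the very end of the string.
LAST_TWO_LABELS = re.compile(r"[^\s.]+\.[^\s.]+$")


def domain_and_extension(hostname: str):
    """Return the last two labels of a host name, or None if there are fewer."""
    match = LAST_TWO_LABELS.search(hostname)
    return match.group(0) if match else None


print(domain_and_extension("static.facebook.com"))
```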

Extract domain name with no domain extension in Google Sheets

Since you may have only .com or .co.uk at the end of the strings, you may use

=REGEXEXTRACT(A4, "^(.+)\.(?:co\.uk|com)$")

Also, you may remove them at the end with

=REGEXREPLACE(A4, "\.(?:co\.uk|com)$", "")

You may also consider a bit more generic patterns like

=REGEXEXTRACT(A4, "^(.+?)(?:\.co)?\.[^.]+$")
=REGEXREPLACE(A4, "(?:\.co)?\.[^.]+$", "")

Pattern details

  • ^ - start of string
  • (.+) - 1 or more chars other than line break chars, as many as possible
  • (.+?) - 1 or more chars other than line break chars, as few as possible (needed in the more generic patterns because the subsequent pattern is optional)
  • \.(?:co\.uk|com)$ - . and then co.uk or com at the end of the string
  • (?:\.co)?\.[^.]+$ - an optional .co char sequence and then . and 1 or more chars other than a . till the end of the string.
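
To check the two removal patterns outside Google Sheets, here is a small Python sketch using the re module. The function names and sample hosts are assumptions for illustration; the regexes are copied from the formulas above.

```python
import re

# Only the two listed extensions.
SPECIFIC = re.compile(r"\.(?:co\.uk|com)$")
# An optional ".co", then the final dot-separated label.
GENERIC = re.compile(r"(?:\.co)?\.[^.]+$")


def strip_known(host: str) -> str:
    """Remove a trailing .co.uk or .com and leave everything else alone."""
    return SPECIFIC.sub("", host)


def strip_generic(host: str) -> str:
    """Remove the last label, plus a preceding .co if present."""
    return GENERIC.sub("", host)


for sample in ("example.co.uk", "example.com", "example.org"):
    print(sample, strip_known(sample), strip_generic(sample))
```

The specific pattern leaves unknown extensions such as .org in place, while the generic one removes whatever the last label is.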

Extract domain name without suffix or subdomain

You're working with domain names, so you may want to use some tools that were designed to do so:

library(urltools)

df <- data.frame(site=c("Google.com", "yahoo.in", "facebook.com", "badge.net"))

suffix_extract(df$site)
##           host subdomain   domain suffix
## 1   Google.com      <NA>   google    com
## 2     yahoo.in      <NA>    yahoo     in
## 3 facebook.com      <NA> facebook    com
## 4    badge.net      <NA>    badge    net

For @Sotos:

urltools::suffix_extract('www.bankofcyprus.com')
##                   host subdomain       domain suffix
## 1 www.bankofcyprus.com       www bankofcyprus    com
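
To illustrate the kind of split shown above, here is a rough Python sketch that works from a list of known suffixes. The KNOWN_SUFFIXES set is a tiny hard-coded assumption and split_host is a hypothetical name. This is not how urltools is implemented, and a real tool would need a complete, maintained suffix list.

```python
# Illustrative only: a real suffix list is far larger than this.
KNOWN_SUFFIXES = {"com", "net", "in", "co.uk"}


def split_host(host: str):
    """Split a host into (subdomain, domain, suffix) using KNOWN_SUFFIXES."""
    labels = host.lower().split(".")
    # Try the longest candidate suffix first so "co.uk" wins over "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in KNOWN_SUFFIXES:
            if i == 0:
                # The whole host is a suffix; there is no domain part.
                return None, None, candidate
            subdomain = ".".join(labels[:i - 1]) or None
            return subdomain, labels[i - 1], candidate
    # No known suffix matched.
    return None, None, None


print(split_host("www.bankofcyprus.com"))
```

Hosts whose suffix is not in the set come back as three None values instead of a guess.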

Remove protocol, domain name, domain and file extension from URL

Everybody stand back! I know regular expressions!
Try this one:

var my_location = window.location.toString().match(/\/\/[^\/]+\/([^\.]+)/)[1];
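
A Python sketch of the same regex, with a guard for URLs that have no path. The JavaScript one-liner would throw there, because match returns null. The function name path_without_extension and the sample URL are assumptions.

```python
import re

# "//", the host, a slash, then capture everything up to the first dot.
PATH_BEFORE_DOT = re.compile(r"//[^/]+/([^.]+)")


def path_without_extension(url: str):
    """Return the path up to its first dot, or None when there is no path."""
    match = PATH_BEFORE_DOT.search(url)
    return match.group(1) if match else None


print(path_without_extension("https://www.example.com/docs/page.html"))
```

The capture stops at the first dot anywhere after the host, so a dot inside a directory name would cut the result short.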

