How to Get the Domain Name Without Www, Subdomain, and Com/Net/Org/Etc

Get domain name without subdomains using JavaScript?

The following solution extracts a domain name without any subdomains. It doesn't make any assumptions about the URL format, so it should work for any URL. Some domain names have a one-part suffix (.com) and some have two or more parts (.co.uk), so to get an accurate result in all cases we need to parse the hostname using the Public Suffix List, which lists all public domain name suffixes.
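To make the idea concrete, here is a minimal sketch of a suffix-list lookup. The `SUFFIXES` set and the `registrableDomain` function are hypothetical names made up for this illustration, and the set holds only a handful of entries; the real Public Suffix List is far larger and also has wildcard and exception rules that this sketch ignores.

```javascript
// Hypothetical, drastically shortened suffix set (assumption for illustration).
const SUFFIXES = new Set(["com", "net", "org", "uk", "co.uk", "org.uk", "ac.uk"]);

function registrableDomain(hostname) {
  const labels = hostname.toLowerCase().split(".");
  // A bare suffix such as "co.uk" has no registrable domain.
  if (SUFFIXES.has(labels.join("."))) {
    return null;
  }
  // Try the longest candidate suffix first, so "co.uk" wins over "uk".
  for (let i = 1; i < labels.length; i++) {
    if (SUFFIXES.has(labels.slice(i).join("."))) {
      // The domain is the matched suffix plus one label to its left.
      return labels.slice(i - 1).join(".");
    }
  }
  return null; // no known suffix
}

console.log(registrableDomain("one.two.roothost.co.uk")); // "roothost.co.uk"
console.log(registrableDomain("www.example.com"));        // "example.com"
```

In practice a maintained library is the safer choice, because the list changes over time.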


Solution

First, include the Public Suffix List JavaScript library (psl) in a script tag in your HTML. Then, in JavaScript, to get the domain you can call:

var parsed = psl.parse('one.two.roothost.co.uk');
console.log(parsed.domain);

...which will log "roothost.co.uk". To get the name from the current page, you can use location.hostname instead of a static string:

var parsed = psl.parse(location.hostname);
console.log(parsed.domain);

Finally, if you need to parse a domain name directly out of a full URL string, you can use the following:

var url = "http://one.two.roothost.co.uk/page.html";
url = url.split("/")[2]; // Get the hostname
var parsed = psl.parse(url); // Parse the domain
document.getElementById("output").textContent = parsed.domain;
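As an alternative to splitting on "/", the hostname can probably be pulled out more robustly with the standard URL constructor, which also copes with ports and credentials in the URL. This is only a sketch: `hostnameFromUrl` is a hypothetical helper name, and its result would then be passed to psl.parse as in the snippet above.

```javascript
// Hypothetical helper: returns the hostname of an absolute URL, or null
// when the string cannot be parsed as a URL.
function hostnameFromUrl(urlString) {
  try {
    return new URL(urlString).hostname;
  } catch (err) {
    return null;
  }
}

console.log(hostnameFromUrl("http://one.two.roothost.co.uk/page.html")); // "one.two.roothost.co.uk"
console.log(hostnameFromUrl("http://user:pw@example.com:8080/x"));       // "example.com"
```

For the second URL, split("/")[2] would yield "user:pw@example.com:8080", which is not a bare hostname.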

JSFiddle Example (it includes the entire minified library in the jsFiddle, so scroll down!): https://jsfiddle.net/6aqdbL71/2/

Get domain name (not subdomain) in php

Well, you can use parse_url to get the host:

$info = parse_url($url);
$host = $info['host'];

Then, you can do some fancy stuff to get only the host name and the TLD:

$host_names = explode(".", $host);
$bottom_host_name = $host_names[count($host_names)-2] . "." . $host_names[count($host_names)-1];

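Not very elegant, but it should work for single-label TLDs such as .com. (For a host like example.co.uk it returns co.uk, because it always keeps just the last two labels.)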


If you want an explanation, here it goes:

First we grab the host, i.e. everything between the scheme (http://, etc.) and the path, by using parse_url's capabilities to... well... parse URLs. :)

Then we take the host name, and separate it into an array based on where the periods fall, so test.world.hello.myname would become:

array("test", "world", "hello", "myname");

After that, we take the number of elements in the array (4).

Then, we subtract 2 from that count to get the index of the second-to-last string (the hostname, or example, in your example).

Then, we subtract 1 from the count to get the index of the last string (because array keys start at 0), also known as the TLD.

Then we combine those two parts with a period, and you have your base host name.
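The same steps can be sketched in JavaScript for illustration (the answer itself is PHP). `lastTwoLabels` is a hypothetical name, and the second call shows the limitation with two-part suffixes.

```javascript
// Hypothetical helper: keep only the last two dot-separated labels of a host.
function lastTwoLabels(host) {
  const labels = host.split(".");
  if (labels.length < 2) {
    return host; // nothing to trim
  }
  // count - 2 is the second-to-last index, count - 1 is the last index
  return labels[labels.length - 2] + "." + labels[labels.length - 1];
}

console.log(lastTwoLabels("test.world.hello.myname")); // "hello.myname"
console.log(lastTwoLabels("www.example.co.uk"));       // "co.uk" (not "example.co.uk")
```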

Getting domain name without TLD

Group the first part of your 2nd regex into /([^.]+)\.[^.]+$/ and $matches[1] will be php
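For comparison, here is the same pattern sketched in JavaScript. The input "www.php.net" is an assumption, since the question's actual input is not shown here, and `nameWithoutTld` is a hypothetical name.

```javascript
// Hypothetical helper: capture the label directly before the last label.
function nameWithoutTld(host) {
  const match = /([^.]+)\.[^.]+$/.exec(host);
  return match ? match[1] : null;
}

console.log(nameWithoutTld("www.php.net"));   // "php"
console.log(nameWithoutTld("example.co.uk")); // "co" (two-part suffixes are not handled)
```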

How to get domain name from URL

I once had to write such a regex for a company I worked for. The solution was this:

  • Get a list of every ccTLD and gTLD available. Your first stop should be IANA. The list from Mozilla looks great at first sight, but it lacks ac.uk, for example, so it is not really usable for this.
  • Join the list as in the example below. A warning: ordering is important! If org.uk appeared after uk, then example.org.uk could match org instead of example.

Example regex:

([^.]+)\.(com|net|org|info|coop|int|co\.uk|org\.uk|ac\.uk|uk|__and so on__)$

This worked really well and also matched weird, unofficial top-levels like de.com and friends.
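One way to avoid hand-ordering the alternatives is to build the pattern from a list and sort it longest-first. The sketch below does that in JavaScript with a tiny, hypothetical `suffixes` array; `escapeRegex`, `domainRe` and `splitDomain` are names invented for this example, not part of the original solution.

```javascript
// Hypothetical short list; a real one would come from IANA or a similar source.
const suffixes = ["uk", "com", "co.uk", "org", "org.uk", "net", "ac.uk"];

// Escape regex metacharacters (mainly the dots) in each suffix.
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Longest suffixes first, so "org.uk" is tried before "uk".
const alternation = suffixes
  .slice()
  .sort((a, b) => b.length - a.length)
  .map(escapeRegex)
  .join("|");

// The name must be a whole label: preceded by start-of-string or a dot.
const domainRe = new RegExp("(?:^|\\.)([^.]+)\\.(" + alternation + ")$", "i");

function splitDomain(host) {
  const m = domainRe.exec(host);
  return m ? { name: m[1], suffix: m[2] } : null;
}

console.log(splitDomain("example.org.uk"));    // { name: 'example', suffix: 'org.uk' }
console.log(splitDomain("www.example.co.uk")); // { name: 'example', suffix: 'co.uk' }
```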

The upside:

  • Very fast if regex is optimally ordered

The downside of this solution is of course:

  • Handwritten regex which has to be updated manually if ccTLDs change or get added. Tedious job!
  • Very large regex so not very readable.

How to get domain name only using javascript?

Use location.host and cut off subdomains and the TLD:

 var domain = (location.host.match(/([^.]+)\.\w{2,3}(?:\.\w{2})?$/) || [])[1]

Update: as @demix pointed out, this fails for 2- and 3-letter domains. It also won't work for TLDs like aero, jobs, and dozens of others.
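The failure modes are easy to reproduce if the same regex is wrapped in a function that takes the host as a parameter. `naiveDomain` is a hypothetical wrapper added only for this demonstration, and the host names are made-up examples.

```javascript
// Same regex as above, but taking the host as an argument instead of
// reading location.host, so it can be tried on arbitrary strings.
function naiveDomain(host) {
  return (host.match(/([^.]+)\.\w{2,3}(?:\.\w{2})?$/) || [])[1];
}

console.log(naiveDomain("www.example.com")); // "example"
console.log(naiveDomain("example.co.uk"));   // "example"
console.log(naiveDomain("www.abc.de"));      // "www" (abc.de is mistaken for a two-level TLD)
console.log(naiveDomain("example.aero"));    // undefined (4-letter TLD does not match)
```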

The only way around is to know valid TLDs in advance, so here is a more appropriate function:

// http://data.iana.org/TLD/tlds-alpha-by-domain.txt
// kept as an array (no .join()) so that indexOf matches whole labels, not substrings
var TLDs = ["ac", "ad", "ae", "aero", "af", "ag", "ai", "al", "am", "an", "ao", "aq", "ar", "arpa", "as", "asia", "at", "au", "aw", "ax", "az", "ba", "bb", "bd", "be", "bf", "bg", "bh", "bi", "biz", "bj", "bm", "bn", "bo", "br", "bs", "bt", "bv", "bw", "by", "bz", "ca", "cat", "cc", "cd", "cf", "cg", "ch", "ci", "ck", "cl", "cm", "cn", "co", "com", "coop", "cr", "cu", "cv", "cx", "cy", "cz", "de", "dj", "dk", "dm", "do", "dz", "ec", "edu", "ee", "eg", "er", "es", "et", "eu", "fi", "fj", "fk", "fm", "fo", "fr", "ga", "gb", "gd", "ge", "gf", "gg", "gh", "gi", "gl", "gm", "gn", "gov", "gp", "gq", "gr", "gs", "gt", "gu", "gw", "gy", "hk", "hm", "hn", "hr", "ht", "hu", "id", "ie", "il", "im", "in", "info", "int", "io", "iq", "ir", "is", "it", "je", "jm", "jo", "jobs", "jp", "ke", "kg", "kh", "ki", "km", "kn", "kp", "kr", "kw", "ky", "kz", "la", "lb", "lc", "li", "lk", "lr", "ls", "lt", "lu", "lv", "ly", "ma", "mc", "md", "me", "mg", "mh", "mil", "mk", "ml", "mm", "mn", "mo", "mobi", "mp", "mq", "mr", "ms", "mt", "mu", "museum", "mv", "mw", "mx", "my", "mz", "na", "name", "nc", "ne", "net", "nf", "ng", "ni", "nl", "no", "np", "nr", "nu", "nz", "om", "org", "pa", "pe", "pf", "pg", "ph", "pk", "pl", "pm", "pn", "pr", "pro", "ps", "pt", "pw", "py", "qa", "re", "ro", "rs", "ru", "rw", "sa", "sb", "sc", "sd", "se", "sg", "sh", "si", "sj", "sk", "sl", "sm", "sn", "so", "sr", "st", "su", "sv", "sy", "sz", "tc", "td", "tel", "tf", "tg", "th", "tj", "tk", "tl", "tm", "tn", "to", "tp", "tr", "travel", "tt", "tv", "tw", "tz", "ua", "ug", "uk", "us", "uy", "uz", "va", "vc", "ve", "vg", "vi", "vn", "vu", "wf", "ws", "xn--0zwm56d", "xn--11b5bs3a9aj6g", "xn--3e0b707e", "xn--45brj9c", "xn--80akhbyknj4f", "xn--90a3ac", "xn--9t4b11yi5a", "xn--clchc0ea0b2g2a9gcd", "xn--deba0ad", "xn--fiqs8s", "xn--fiqz9s", "xn--fpcrj9c3d", "xn--fzc2c9e2c", "xn--g6w251d", "xn--gecrj9c", "xn--h2brj9c", "xn--hgbk6aj7f53bba", "xn--hlcj6aya9esc7a", "xn--j6w193g", "xn--jxalpdlp", "xn--kgbechtv", "xn--kprw13d",
"xn--kpry57d", "xn--lgbbat1ad8j", "xn--mgbaam7a8h", "xn--mgbayh7gpa", "xn--mgbbh1a71e", "xn--mgbc0a9azcg", "xn--mgberp4a5d4ar", "xn--o3cw4h", "xn--ogbpf8fl", "xn--p1ai", "xn--pgbs0dh", "xn--s9brj9c", "xn--wgbh1c", "xn--wgbl6a", "xn--xkc2al3hye2a", "xn--xkc2dl3a5ee0h", "xn--yfro4i67o", "xn--ygbi2ammx", "xn--zckzah", "xxx", "ye", "yt", "za", "zm", "zw"];

function getDomain(url){
    // despite the parameter name, this expects a host name such as location.host
    var parts = url.split('.');
    // drop a leading 'www' unless it is the domain itself ('www.com')
    if (parts[0] === 'www' && parts[1] !== 'com'){
        parts.shift();
    }
    var ln = parts.length
      , i = ln
      , minLength = parts[parts.length-1].length
      , part;

    // iterate backwards
    while(part = parts[--i]){
        // stop when we find a non-TLD part
        if (i === 0                    // 'asia.com' (last remaining must be the SLD)
            || i < ln-2                // TLDs only span 2 levels
            || part.length < minLength // 'www.cn.com' (valid TLD as second-level domain)
            || TLDs.indexOf(part) < 0  // officially not a TLD
        ){
            return part;
        }
    }
}

getDomain(location.host);

I hope I didn't miss too many corner cases. This should be available in the location object :(

Test cases: http://jsfiddle.net/hqBKd/4/

A list of TLDs can be found here: http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1

PHP Getting Domain Name From Subdomain

Stackoverflow Question Archive:

  • How to get domain name from url?
  • Check if domain equals value?
  • How do I get the base url?


print get_domain("http://somedomain.co.uk"); // outputs 'somedomain.co.uk'

function get_domain($url)
{
    $pieces = parse_url($url);
    $domain = isset($pieces['host']) ? $pieces['host'] : '';
    if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
        return $regs['domain'];
    }
    return false;
}
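For readers working in JavaScript, a rough port of the same idea might look like the sketch below. It reuses the regex from the PHP function, swaps parse_url for the standard URL constructor, and uses the hypothetical name `getDomainFromUrl`. Note the last call: because the suffix part accepts any 2 to 6 letters or dots, a short domain such as ab.com is swallowed into the "TLD" and the subdomain is kept.

```javascript
// Rough JavaScript port of get_domain (hypothetical name, same regex).
function getDomainFromUrl(url) {
  let host;
  try {
    host = new URL(url).hostname;
  } catch (err) {
    return false; // not an absolute URL
  }
  const m = /(?<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i.exec(host);
  return m ? m.groups.domain : false;
}

console.log(getDomainFromUrl("http://somedomain.co.uk"));          // "somedomain.co.uk"
console.log(getDomainFromUrl("http://www.somedomain.co.uk/page")); // "somedomain.co.uk"
console.log(getDomainFromUrl("http://www.ab.com"));                // "www.ab.com"
```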

Get The Current Domain Name With Javascript (Not the path, etc.)

How about:

window.location.hostname

The location object actually has a number of properties referring to different parts of the URL.
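A quick way to see those parts outside a browser is the standard URL object, which exposes the same property names as location. The URL string here is a made-up example.

```javascript
// Made-up URL for illustration; window.location exposes the same properties.
const u = new URL("https://sub.example.com:8080/path/page.html?q=1#top");

console.log(u.protocol); // "https:"
console.log(u.host);     // "sub.example.com:8080" (includes the port)
console.log(u.hostname); // "sub.example.com"      (no port)
console.log(u.port);     // "8080"
console.log(u.pathname); // "/path/page.html"
console.log(u.search);   // "?q=1"
console.log(u.hash);     // "#top"
```

The difference between host and hostname matters when the page is served on a non-default port.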


