Extract part of URL with Regex
I'll list both the regex and the non-regex way. Surprisingly, the regex way seems shorter.
Regex Way
The regex to find bar and boo is /.*\/(.*)\/(.*)$/
which is short, precise and exactly what you need.
Let's put it into practice:
const params = "http://www.sub.domain.tld/foo/bar/boo".match(/.*\/(.*)\/(.*)$/)
This results in,
params;
["http://www.sub.domain.tld/foo/bar/boo","bar","boo"]
The two segments are in the capture groups, so access them as params[1] (bar) and params[2] (boo). params[0] holds the full match.
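As a minimal sketch of the access pattern above, the two capture groups can be destructured directly. The helper name lastTwoSegments is made up for illustration; the null guard is an addition, since match() returns null when the pattern does not match.

```javascript
// Hypothetical helper wrapping the regex from above.
function lastTwoSegments(url) {
  const result = url.match(/.*\/(.*)\/(.*)$/);
  // match() returns null when the pattern does not match,
  // so fall back to an empty array before destructuring.
  const [, secondLast, last] = result || [];
  return { secondLast, last };
}

console.log(lastTwoSegments("http://www.sub.domain.tld/foo/bar/boo"));
// { secondLast: 'bar', last: 'boo' }
console.log(lastTwoSegments("no-slashes-here"));
// { secondLast: undefined, last: undefined }
```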
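Regex Explanation:
.*\/ - greedily matches everything up to and including the second-to-last slash
(.*) - Group 1: the second-to-last segment (bar)
\/ - the last slash
(.*)$ - Group 2: the last segment (boo), up to the end of the string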
Extended Version:
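The regex can be extended to also accept a /bar/boo/ pattern with a trailing slash, like this:
.*\/\b(.*)\/\b(.*)(\/?)$
Here, each \b requires a word boundary right after the slash, so a slash at the very end of the string cannot act as a segment separator, and (\/?) allows an optional trailing slash. Note that the second (.*) is greedy, so when a trailing slash is present it ends up inside the second group.
It can be extended further, but let's keep it simple for now.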
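To illustrate the trailing-slash case, here is a small sketch that compares the extended regex with a variant whose second group is lazy, (.*?), so that the optional slash lands in the third group instead of the second. The lazy variant and the constant names are assumptions added for illustration, not part of the original answer.

```javascript
// Extended regex as given in the text.
const extended = /.*\/\b(.*)\/\b(.*)(\/?)$/;
// Assumed variant: the second group is lazy, so a trailing slash
// is matched by (\/?) rather than swallowed by the second group.
const extendedLazy = /.*\/\b(.*)\/\b(.*?)(\/?)$/;

const withSlash = "http://www.sub.domain.tld/foo/bar/boo/";

console.log(withSlash.match(extended).slice(1));     // [ 'bar', 'boo/', '' ]
console.log(withSlash.match(extendedLazy).slice(1)); // [ 'bar', 'boo', '/' ]
```

Both patterns still return bar and boo for a URL without a trailing slash.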
Non Regex Way
Use native methods like .split():
function getLastParam(str, targetIndex = 1) {
  const arr = str
    .split("/") // split by slash
    .filter(e => e); // remove empty array elements
  return arr[arr.length - targetIndex];
}
Let's test it out quickly for different cases
[
  "http://domain.tld/foo/bar/boo",
  "http://www.domain.tld/foo/bar/boo",
  "http://sub.domain.tld/foo/bar/boo",
  "http://www.sub.domain.tld/foo/bar/boo",
  "http://domain.tld/foo/bar/boo/",
  ".../bar/boo"
].forEach(e => {
  // forEach, not map: we only log and do not need a new array
  console.log({ input: e, output: getLastParam(e, 1) });
});
This yields the following:
{input: "http://domain.tld/foo/bar/boo", output: "boo"}
{input: "http://www.domain.tld/foo/bar/boo", output: "boo"}
{input: "http://sub.domain.tld/foo/bar/boo", output: "boo"}
{input: "http://www.sub.domain.tld/foo/bar/boo", output: "boo"}
{input: "http://domain.tld/foo/bar/boo/", output: "boo"}
{input: ".../bar/boo", output: "boo"}
If you want bar, then use 2 for targetIndex instead. It will get the second-to-last segment, so getLastParam(str, 2) returns bar.
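One thing to keep in mind with the split approach is that the scheme and host are also split into array elements, so a large targetIndex can return something like domain.tld. As a sketch under the assumption that the input is always an absolute URL, the built-in URL class can restrict the lookup to the path. The name getLastPathSegment is hypothetical.

```javascript
// Hypothetical helper: like getLastParam, but only looks at the URL path.
function getLastPathSegment(urlString, targetIndex = 1) {
  // new URL() throws a TypeError for strings that are not absolute URLs,
  // so an input such as ".../bar/boo" is not supported here.
  const { pathname } = new URL(urlString);
  const segments = pathname.split("/").filter(Boolean); // drop empty parts
  return segments[segments.length - targetIndex]; // undefined if out of range
}

console.log(getLastPathSegment("http://domain.tld/foo/bar/boo/", 2)); // bar
console.log(getLastPathSegment("http://domain.tld/foo", 2)); // undefined
```

With the plain split version, the second call would return domain.tld instead of undefined.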
Speed stuff
Here is a small benchmark: http://jsbench.github.io/#a6bcecaa60b7d668636f8f760db34483
getLastParamNormal: 5,203,853 ops/sec
getLastParamRegex: 6,619,590 ops/sec
At these speeds the difference rarely matters in practice, but it's interesting nonetheless.
Regex to extract a part of an URL
You can use
^(?:https?://(?:www\.)?)*(.*)
Details:
^ - start of string
(?:https?://(?:www\.)?)* - zero or more occurrences of
  https?:// - http:// or https://
  (?:www\.)? - an optional sequence of www.
(.*) - Group 1: the rest of the string.
With REGEXEXTRACT, the output value is the text captured with Group 1.
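The same pattern can be tried outside a spreadsheet. The following is a sketch in JavaScript, where the slashes have to be escaped inside a regex literal. The function name stripSchemeAndWww is made up for illustration.

```javascript
// Same pattern as above, with / escaped for a JavaScript regex literal.
const prefixPattern = /^(?:https?:\/\/(?:www\.)?)*(.*)/;

// Hypothetical helper: returns Group 1, the text after the optional prefix.
function stripSchemeAndWww(url) {
  // Every part of the pattern is optional, so match() always succeeds here.
  return url.match(prefixPattern)[1];
}

console.log(stripSchemeAndWww("https://www.example.com/path")); // example.com/path
console.log(stripSchemeAndWww("http://example.com"));           // example.com
console.log(stripSchemeAndWww("www.example.com"));              // www.example.com
```

The last call shows that www. is only stripped when it follows a scheme, because (?:www\.)? sits inside the group that starts with https?://.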
Need to extract part of a url using Regex
I was able to eventually arrive at a solution that was closest to the format that I wanted as required in my question. I was able to do it by combining the solution of @sudhir-bastakoti and @wiktor-stribiżew as each individual answer did not address my question completely.
I am grateful to everyone that answered my question including @kooiinc. I checked out his last answer options and it worked. However, I wanted the answer in a certain format.
const s3bucket = 's3bucket';
const url = 's3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19';
const migrationDataFileS3Key = url.match(new RegExp(String.raw`s3://${s3bucket}/(.*)`))[1];
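One caveat with building a regex from a variable is that characters such as . in the bucket name are treated as regex syntax, and that indexing [1] throws if nothing matches. Below is a hedged sketch of a more defensive variant. The helpers escapeRegExp and extractS3Key are hypothetical names, and the ^ anchor is an added assumption that the URL always starts with the s3:// prefix.

```javascript
// Hypothetical helper: escape characters that are special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns the key after the bucket, or null if no match.
function extractS3Key(url, bucket) {
  const pattern = new RegExp(`^s3://${escapeRegExp(bucket)}/(.*)`);
  const match = url.match(pattern);
  return match ? match[1] : null;
}

const key = extractS3Key(
  "s3://s3bucket/dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19",
  "s3bucket"
);
console.log(key); // dynamodbtablename/05abd315-2e0b-4717-919d-1cc6576ebe19
```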
How to extract a specific URL segment with Regex & C#
You should preferably use URL-related classes to parse a URL, as explained in another answer, because built-in functions are proven and well tested even for corner cases. But since you mentioned that you are limited to a regex-only solution, you can try the following.
Finding the sixth (or Nth) segment can be done with this regex:
(?:([^/]+)/){7}
It matches 6+1 segments (N+1 in general for the Nth segment, where the +1 accounts for the domain part of the URL). Because the capturing group sits inside the repeated group, it retains only the last captured value, which can be accessed as group 1.
Here, ([^/]+) matches one or more characters other than / and captures them in group 1, followed by a /, and the whole thing is matched exactly 7 times.
C# code demo
using System;
using System.Text.RegularExpressions;

// {7} = domain part + 6 segments; group 1 keeps the last repetition.
var pattern = "(?:([^/]+)/){7}";
var match = Regex.Match("/domain.com/segment1/segment2/segment3/segment4/segment5/segment6/segment7/filename.ext", pattern);
Console.WriteLine("Segment: " + match.Groups[1].Value);
match = Regex.Match("http://someother.com/segment1/segment2/segment3/segment4/segment5/segment6/segment7/filename.ext", pattern);
Console.WriteLine("Segment: " + match.Groups[1].Value);
This prints the value of the sixth segment for both inputs:
Segment: segment6
Segment: segment6
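The N+1 rule can be sketched as a small function that builds the quantifier from N. This is written in JavaScript rather than C# purely for illustration, and the name getNthSegment is hypothetical. JavaScript regexes also keep the last repetition in a repeated capturing group, so the behavior carries over.

```javascript
// Hypothetical helper: returns the Nth path segment, or null if there are
// fewer than N segments that are each followed by a slash.
function getNthSegment(url, n) {
  // n + 1 repetitions: one for the domain part, then n segments.
  const pattern = new RegExp(`(?:([^/]+)/){${n + 1}}`);
  const match = url.match(pattern);
  return match ? match[1] : null;
}

const sample =
  "http://someother.com/segment1/segment2/segment3/segment4/segment5/segment6/segment7/filename.ext";

console.log(getNthSegment(sample, 6));  // segment6
console.log(getNthSegment(sample, 1));  // segment1
console.log(getNthSegment(sample, 20)); // null
```

Because every repetition must end with a slash, the final filename.ext part can never be returned by this pattern.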
Extract first /part/ of url with Regex
There is a way to extract the required part using a negative lookbehind and a lazy quantifier. The lookbehind (?<![\/:]) skips the slashes in http://, and (.*?) stops at the next slash:
const [,match] = "http://192.168.15.122:3000/adjusterAnalytics/individual/Xh7HTIgGw1RqnsK2TuJtiUIMahy2".match(/(?<![\/:])\/(.*?)\//);
console.log(match)
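The destructuring above throws a TypeError when the URL has no /part/ to match, because match() returns null. A minimal sketch of a null-safe wrapper follows; the name getFirstPathPart is an assumption. Lookbehind assertions are also unavailable in some older browsers, so this may need a fallback there.

```javascript
// Hypothetical helper: first path part that is enclosed in slashes, or null.
function getFirstPathPart(url) {
  const result = url.match(/(?<![\/:])\/(.*?)\//);
  return result ? result[1] : null;
}

console.log(
  getFirstPathPart(
    "http://192.168.15.122:3000/adjusterAnalytics/individual/Xh7HTIgGw1RqnsK2TuJtiUIMahy2"
  )
); // adjusterAnalytics
console.log(getFirstPathPart("http://192.168.15.122:3000/only")); // null
```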