Recursive matching with regular expressions in Javascript
Annoyingly, Javascript does not provide the PCRE recursive parameter (?R)
, so it is far from easy to deal with the nested issue. It can be done however.
I won't reproduce code, but if you check out Steve Levithan's blog, he has some good articles on the subject. He should do, he is probably the leading authority on RegExp in JS. He wrote XRegExp, which replaces most of the PCRE bits that are missing, there is even a Match Recursive plugin!
Recursive regex pattern in JavaScript
If you're intending on using a regular expression for this task, you can consider the following:
var r = s.replace(/((`(?:``)?)[^`]*\2)|>/g, '$1<')
.replace(/((`(?:``)?)[^`]*\2)|</g, '$1>')
.replace(/`[<>]+/g, '`');
Working Demo
Regex to group recursive pattern in string and return capture group as array
Added a ^ to match beginning of string only. Using lookaheads.
Note that lookaheads doesn't have good browser support.
Also, I don't know/haven't tested the performance characteristics of this versus exec looping. And to be entirely honest, it's not much cleaner looking than split. If you want a one-liner, this'll suffice I guess.
I remember for very long strings on other platforms using RegExs like this didn't do well. But I ended up resorting to other methods to searching anyways.
console.log("Hello:A1,A3,A4,AA04".match(/(^[A-Za-z]+(?=:))|((?=,?)([A-Za-z0-9]+))/g))
Recursive Regex expression to split expression by character
Use String#split
var array = '23;hello;2452;ad34aa;bye'.split(';');document.write('<pre>' + JSON.stringify(array, 0, 4) + '</pre>');
// as mentioned from Avinash Rajvar array2 = '23;hello;2452;ad34aa;bye'.match(/[^;]+/g);document.write('<pre>' + JSON.stringify(array2, 0, 4) + '</pre>');
RegExp to be recursive
This is not so much about recursion as getting all matches. In Javascript you have to make your regular expression global
/([^>]+)/g
This will match all sub strings in your string:
string
string25
string89 (including space at the end)
Or you could easily just split your string with >
delimiter and collect individuals:
yourString.split(">");
Edit
After you've written your desired result I'd suggest you go with @HamZa's solution that uses positive lookahead. And you'll get pairs back.
/(?=([^>]+>[^>]+))[^>]+>/g
Some explanation
Regular expressions parse strings from from left to right iterating over each character (to simplify the process). Positive lookaheads on the other hand don't progress the current parsing position but rather do what they say: they lookahead if their expression is found:
t(?=s)
will match second t
in streets
because it sees that s
is followed by t
. But parsing after this match will continue from t
on.
I hope this explains it a bit.
Actual solution expression
But to explain the actual regular expression it's a rather clever one how it progresses string parsing:
It first has a positive lookahead (it doesn't increment parsing position) to check if at current parsing position there's a pair you're looking for:
(?=([^>]+>[^>]+))
- If lookahead matches such a pair it stores it as a match (hence the inner parentheses)
- Then after the lookahead we have the single string expression
[^>]+>
that doesn't get stored as a match (not within parentheses) but rather takes care that parsing progresses for a single string up to and including the next>
character. - Because this regular expression is global it then starts to do a match all over again but this time from the next character position after
>
character as the previous parsing progressed/incremented/advanced to it.
Javascript parse multiple brackets recursively from a string
String is just a stream of character.
- At first we need a lexical analyzer. It take stream of character and generate token. Example:
Input = `"Hello world" (!)`
Output = Lex(Input) -> ["Hello world", "(" , "!", ")"]
- We can use a recursive function. where we can create a stack on top and return new array.
Tokens = ["Hello world", "(" , "!", ")"]
Result = [ "Hello world", ["!"] ]
let input = `this OR (this AND (this OR (that AND this))) "this is a text" OR this`
function* Lex(characters) {
let token = "", char;
while (char = characters.shift()) switch (char) {
case '(': case ')':
if (token) { yield token; token = '' }
yield char
break
case '"':
let index = characters.indexOf(char);
token = characters.splice(0, index).join('');
case ' ':
if (token) yield token
token = ''
break
default: token += char
}
}
function fnName(tokens) {
let result = [], item;
while (item = tokens.shift()) {
if (item == ')') return result
result.push(item == '(' ? fnName(tokens) : item)
}
return result
}
let tokens = [...Lex(input.split(''))];
console.log(fnName(tokens))
Related Topics
How to Check If an Element Is Really Visible with JavaScript
How to Use Redirect in Version 5 of React-Router-Dom of Reactjs
Caching a Promise Object in Angularjs Service
Checking If Jquery Is Loaded Using JavaScript
Sorting in JavaScript: Should Every Compare Function Have a "Return 0" Statement
New Es6 Syntax for Importing Commonjs/Amd Modules I.E. 'Import Foo = Require('Foo')'
Onclick Not Working on Mobile (Touch)
Basic JavaScript Promise Implementation Attempt
How to Add a Tooltip to an Svg Graphic
Call a Function After Previous Function Is Complete
What Is the Cleanest Way to Get the Progress of Jquery Ajax Request
How to Set Default Homepage in Ff and Chrome via JavaScript
Rails - Could Not Find a JavaScript Runtime
How to Get the Destination Url for the Onbeforeunload Event