Cancel Regex match if timeout
You could spawn a child process that does the regex matching and kill it off if it hasn't completed in 10 seconds. Might be a bit overkill, but it should work.
fork is probably what you should use, if you go down this road.
If you'll forgive my non-pure functions, this code would demonstrate the gist of how you could communicate back and forth between the forked child process and your main process:
index.js
const { fork } = require('child_process');

const processPath = __dirname + '/regex-process.js';
const regexProcess = fork(processPath);
let received = null;

regexProcess.on('message', function(data) {
  console.log('received message from child:', data);
  clearTimeout(timeout);
  received = data;
  regexProcess.kill(); // or however you want to end it. just as an example.
  // you have access to the regex data here.
  // send to a callback, or resolve a promise with the value,
  // so the original calling code can access it as well.
});

const timeoutInMs = 10000;
let timeout = setTimeout(() => {
  if (!received) {
    console.error('regexProcess is still running!');
    regexProcess.kill(); // or however you want to shut it down.
  }
}, timeoutInMs);

regexProcess.send('message to match against');
regex-process.js
function respond(data) {
  process.send(data);
}

function handleMessage(data) {
  console.log('handling message:', data);
  // run your regex calculations in here
  // then respond with the data when it's done.
  // the following is just to emulate
  // a synchronous computational delay
  for (let i = 0; i < 500000000; i++) {
    // spin!
  }
  respond('return regex process data in here');
}

process.on('message', handleMessage);
This might just end up masking the real problem, though. You may want to consider reworking your regex like other posters have suggested.
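If the calling code can afford to block while it waits, the same idea can be sketched without message passing, because spawnSync accepts a timeout option and kills the child itself. This is a swapped-in variant, not the fork approach shown above. The names matchWithTimeout and childSource and the JSON job format are made up for illustration, and the child is started with node -e only so that the sketch fits in one file.

```javascript
const { spawnSync } = require('child_process');

// Source of the child script (hypothetical): it reads a JSON job from stdin,
// runs the regex, and writes the match (or null) to stdout as JSON.
const childSource = `
let input = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => { input += chunk; });
process.stdin.on('end', () => {
  const { pattern, flags, text } = JSON.parse(input);
  const match = new RegExp(pattern, flags).exec(text);
  process.stdout.write(JSON.stringify(match ? Array.from(match) : null));
});
`;

// Runs the regex in a separate Node process and gives up after timeoutInMs.
// Note that this blocks the calling process until the child exits or is killed.
function matchWithTimeout(pattern, flags, text, timeoutInMs) {
  const result = spawnSync(process.execPath, ['-e', childSource], {
    input: JSON.stringify({ pattern, flags, text }),
    encoding: 'utf8',
    timeout: timeoutInMs, // spawnSync kills the child when this elapses
  });
  if (result.error && result.error.code === 'ETIMEDOUT') {
    return { timedOut: true, match: null };
  }
  if (result.error || result.status !== 0) {
    throw result.error || new Error(result.stderr);
  }
  return { timedOut: false, match: JSON.parse(result.stdout) };
}

// A well-behaved pattern returns normally.
console.log(matchWithTimeout('b+', '', 'aabbb', 5000));
// A pattern with nested quantifiers on a non-matching input gets cut off.
console.log(matchWithTimeout('^(a+)+$', '', 'a'.repeat(50) + '!', 500).timedOut);
```

Starting a new process for every match is expensive, so this only makes sense for occasional, untrusted patterns. For frequent matching, the long-lived forked child above is the better fit.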
How to stop regex matching after 1 match without using non-greedy character
The really bad degenerate patterns never match. And if you find a good way of detecting the degenerate cases, you will probably be due a lot of money. You are probably better off with a timeout. In Perl I would use alarm combined with a block eval.
You may also be looking for (*COMMIT) in Perl which prevents backtracking.
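JavaScript regular expressions have, as far as I know, neither backtracking control verbs like (*COMMIT) nor atomic groups. A rough analogue there is to wrap the repeated part in a lookahead and consume it with a backreference, since the engine does not backtrack into a lookahead once it has succeeded. The sketch below is an illustration of that trick under those assumptions, not an equivalent of (*COMMIT); the pattern and the isAllAs helper are made up for the example.

```javascript
// Classic catastrophic pattern, shown only for comparison: the nested
// quantifiers can split a run of 'a's in exponentially many ways once
// the final '$' fails. Do not run it on the long input below.
// const slow = /^(a+)+$/;

// Emulated atomic group: the lookahead captures the longest run of 'a's,
// and the backreference \1 then consumes exactly that text. Because the
// lookahead is never re-entered, there is nothing left to retry.
const guarded = /^(?:(?=(a+))\1)+$/;

function isAllAs(text) {
  return guarded.test(text);
}

console.log(isAllAs('aaaa'));               // true
console.log(isAllAs('a'.repeat(40) + '!')); // false, returned quickly
```

This only removes backtracking inside the guarded group, so a timeout is still worth having for patterns you do not control.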
Lightweight long-running method cancel pattern for Java
I am not aware of such a mechanism. Since you have to track your work in order to be able to perform rollbackWork(), a well-designed object-oriented solution is your best choice anyway, especially if you want to evolve this logic further. Typically, such a scenario can be implemented using the command pattern, which I still find pretty lightweight:
// Task or Command
public interface Command {
    void redo();
    void undo();
}
A scheduler or queue could then take care of executing such task / command implementations, and of rolling them back in order.
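To make the queue idea concrete, here is a minimal sketch, written in JavaScript for consistency with the other examples on this page; the structure maps directly onto the Java Command interface. CommandQueue, runAll, rollback and makeStep are hypothetical names, and undoing in reverse order of execution is an assumption about what "in order" should mean for a rollback.

```javascript
// Hypothetical command factory: each command records what it does in `log`.
function makeStep(name, log, shouldFail = false) {
  return {
    redo() {
      if (shouldFail) throw new Error(`${name} failed`);
      log.push(`do ${name}`);
    },
    undo() {
      log.push(`undo ${name}`);
    },
  };
}

class CommandQueue {
  constructor() {
    this.executed = []; // commands that completed, oldest first
  }

  // Runs the commands in order. If one throws, the commands that already
  // completed are undone (most recent first) and false is returned.
  runAll(commands) {
    try {
      for (const command of commands) {
        command.redo();
        this.executed.push(command);
      }
      return true;
    } catch (err) {
      this.rollback();
      return false;
    }
  }

  // Undoes every completed command, most recent first.
  rollback() {
    while (this.executed.length > 0) {
      this.executed.pop().undo();
    }
  }
}

const log = [];
const queue = new CommandQueue();
const ok = queue.runAll([
  makeStep('A', log),
  makeStep('B', log),
  makeStep('C', log, true), // simulated failure
]);
console.log(ok);  // false
console.log(log); // [ 'do A', 'do B', 'undo B', 'undo A' ]
```

A cancel request could call rollback() directly; how cancellation is signalled to a running command is left out of this sketch.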
Limit a variety of regex patterns to execute by user on the server
Try/catch StackOverflowError, plus wrapping the whole thing in an aggressive timeout (say, 1 second), is almost certainly your best bet. It's also by far the simplest option. As you develop your implementation you'll probably need to catch other exception types as well.
Note that, for the timeout to work, you will need to use an interruptible CharSequence implementation rather than a plain String. I've used this strategy successfully with poorly-written, regex-heavy third party libraries before.
The Q&A linked above should help get you started: Cancelling a long running regex match?
The original approach you suggested – to try to detect "bad" patterns up-front – is a very hard problem to solve indeed. Are you familiar with the halting problem? It's only a little less hard than solving that (which is impossible to solve in the general case).
Regex Negation: Handling conditional if statements that cancel the match if fulfilled
Use
/(?<=(?<!\*)\*\*)\w+(?=\*\*(?!\*))|(?<=(?<!_)__)\w+(?=__(?!_))/gi
See proof.
Explanation
(?<=                     look behind to see if there is:
  (?<!                     look behind to see if there is not:
    \*                       '*'
  )                        end of look-behind
  \*                       '*'
  \*                       '*'
)                        end of look-behind
\w+                      word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
(?=                      look ahead to see if there is:
  \*                       '*'
  \*                       '*'
  (?!                      look ahead to see if there is not:
    \*                       '*'
  )                        end of look-ahead
)                        end of look-ahead
|                        OR
(?<=                     look behind to see if there is:
  (?<!                     look behind to see if there is not:
    _                        '_'
  )                        end of look-behind
  __                       '__'
)                        end of look-behind
\w+                      word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
(?=                      look ahead to see if there is:
  __                       '__'
  (?!                      look ahead to see if there is not:
    _                        '_'
  )                        end of look-ahead
)                        end of look-ahead
JavaScript code:
const string = 'hello world **ant*** lorem **cat** opposum** *** ***antelope*** *rabbit __dog__';
console.log(string.match(/(?<=(?<!\*)\*\*)\w+(?=\*\*(?!\*))|(?<=(?<!_)__)\w+(?=__(?!_))/gi))
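Both alternatives in the pattern have the same shape, so the expression can also be assembled from a list of delimiter characters. The following is a sketch of that idea rather than part of the original answer; escapeRegExp, doubledDelimiterPattern and findEmphasized are names invented for the example, and lookbehind support in the JavaScript engine is assumed.

```javascript
// Escapes regex metacharacters so the delimiter is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Builds one alternative: a word wrapped in exactly two delimiter
// characters on each side, with no third delimiter next to them.
function doubledDelimiterPattern(ch) {
  const c = escapeRegExp(ch);
  return `(?<=(?<!${c})${c}${c})\\w+(?=${c}${c}(?!${c}))`;
}

// Joins the alternatives with | and returns all matches (or an empty array).
function findEmphasized(text, delimiters) {
  const source = delimiters.map(doubledDelimiterPattern).join('|');
  return text.match(new RegExp(source, 'g')) || [];
}

const sample = 'hello world **ant*** lorem **cat** opposum** *** ***antelope*** *rabbit __dog__';
console.log(findEmphasized(sample, ['*', '_'])); // [ 'cat', 'dog' ]
```

Keep in mind that \w also matches the underscore, so words that themselves contain underscores may be matched differently from what you expect with the _ delimiter.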