Converting User Input String to Regular Expression

Converting user input string to regular expression

Use the RegExp object constructor to create a regular expression from a string:

var re = new RegExp("a|b", "i");
// same as
var re = /a|b/i;

Converting user input strings with special characters to regular expression

Let's use the "change" event on a <textarea>: once the user changes the content and clicks outside, we read its value property and construct the composite RegExp object from it. Because the text comes from the element's value at runtime rather than from a string literal in the source, there is no need to double the \ characters.

Copy and paste the following lines into the text area and click outside it.

/ab+c/
/Chapter (\d+)\./
/\d+/
/d(b+)d/

var myTextarea = document.getElementById("ta");
myTextarea.addEventListener("change", function(e) {
  // Strip the leading and trailing "/" from each line, then join the patterns with "|".
  var str = e.currentTarget.value.split(/[\r\n]+/)
                                 .map(s => s.slice(1, -1))
                                 .join("|");
  var rgx = new RegExp(str, "i");
  console.log(`Derived RegExp object: ${rgx}`);
  console.log(`Testing for 'dbbbd': ${rgx.test('dbbbd')}`); // true
  console.log(`Testing for '256': ${rgx.test('256')}`); // true
});

#ta {
  width: 33vw;
  height: 50vh;
  margin-left: 33vw;
}

<textarea id="ta"></textarea>

Python user input as regular expression, how to do it correctly?

Compile user input

I assume that the user input is a string, wherever it comes from your system:

user_input = input("Input regex:")  # check console, it is expecting your input
print("User typed: '{}'. Input type: {}.".format(user_input, type(user_input)))

This means that you need to transform it into a regex, and that is what re.compile is for. If the string you pass to re.compile is not a valid pattern, it raises re.error.

Therefore, you can create a function to check if the input is valid or not. You used the re.escape, so I added a flag to the function to use re.escape or not.

import re

def is_valid_regex(regex_from_user: str, escape: bool) -> bool:
    # Try to compile the input; re.error means the pattern is invalid.
    try:
        if escape:
            re.compile(re.escape(regex_from_user))
        else:
            re.compile(regex_from_user)
        is_valid = True
    except re.error:
        is_valid = False
    return is_valid

print("If you don't use re.escape, the input is valid: {}.".format(is_valid_regex(user_input, escape=False)))
print("If you do use re.escape, the input is valid: {}.".format(is_valid_regex(user_input, escape=True)))

If your user input is: \t+, you will get:

>> If you don't use re.escape, the input is valid: True.
>> If you do use re.escape, the input is valid: True.

However, if your user input is: [\t+, you will get:

>> If you don't use re.escape, the input is valid: False.
>> If you do use re.escape, the input is valid: True.

Notice that it was indeed an invalid regex, however, by using re.escape your regex becomes valid. That is because re.escape escapes all your special characters, treating them as literal characters. So in the case that you have \t+, if you use re.escape you will be looking for a sequence of characters: \, t, + and not for a tab character.
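
To make the difference concrete, here is a minimal sketch that compiles the same input both ways and searches two sample strings. The variable names and the sample strings are made up for illustration; only re.compile, re.escape and search come from the standard library.

```python
import re

text = 'before\t\tafter'    # sample string containing two real tab characters
user_input = r'\t+'         # what the user typed: backslash, t, plus

raw = re.compile(user_input)                 # pattern means "one or more tabs"
escaped = re.compile(re.escape(user_input))  # pattern means the literal text \t+

raw_match = raw.search(text)
escaped_match = escaped.search(text)

# The escaped pattern only matches where the characters \, t, + appear literally.
literal_match = escaped.search(r'a \t+ b')

print(repr(raw_match.group()))      # the two tab characters
print(escaped_match)                # None, the text has no literal backslash
print(repr(literal_match.group()))  # the literal three-character sequence
```

So whether to use re.escape depends on what you want the user to be able to type: a pattern (compile it as is and validate it) or plain text to search for (escape it first).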

Checking your lookup string

Take the string you want to look into.
For example, here is a string where the character between quotes is supposed to be a tab:

string_to_look_in = 'This is a string with a "\t" tab character.'  # \t produces a real tab

You can manually check for tabs by using the repr function.

print(string_to_look_in)
print(repr(string_to_look_in))
>> This is a string with a "    " tab character.
>> 'This is a string with a "\t" tab character.'

Notice that by using repr the \t representation of the tab character gets displayed.

Test script

Here is a script for you to try all these things:

import re

# \t inside the literal produces a real tab character between the quotes.
string_to_look_in = 'This is a string with a "\t" tab character.'
print("String to look into:", string_to_look_in)
print("String to look into:", repr(string_to_look_in), "\n")

user_input = input("Input regex:")  # check console, it is expecting your input

print("\nUser typed: '{}'. Input type: {}.".format(user_input, type(user_input)))


def is_valid_regex(regex_from_user: str, escape: bool) -> bool:
    # Try to compile the input; re.error means the pattern is invalid.
    try:
        if escape:
            re.compile(re.escape(regex_from_user))
        else:
            re.compile(regex_from_user)
        is_valid = True
    except re.error:
        is_valid = False
    return is_valid

print("\nIf you don't use re.escape, the input is valid: {}.".format(is_valid_regex(user_input, escape=False)))
print("If you do use re.escape, the input is valid: {}.".format(is_valid_regex(user_input, escape=True)))

if is_valid_regex(user_input, escape=False):
    regex = re.compile(user_input)
    print("\nRegex compiled as '{}' with type {}.".format(repr(regex), type(regex)))

    matches = regex.findall(string_to_look_in)
    print('Matches found:', matches)

else:
    print('\nThe regex was not valid, so no matches.')

Google Apps Script: properly convert string to regular expression

I'd recommend using a template literal so that you need far fewer backslashes:

const r = new RegExp(`^((?!\\b${keyword}\\b).)*$`, 'i')

Also, you may want to escape the keywords to support special characters. If you are interested, you can see this answer to see how to do it.
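
As an illustration of why escaping the keyword matters, here is the same negative-lookahead pattern sketched in Python rather than Apps Script (in JavaScript you would need your own escaping helper, which is not shown here). The function name and the sample keyword are hypothetical.

```python
import re

def line_without_keyword(keyword):
    # re.escape makes every special character in the keyword literal,
    # so "a.b" only matches the text a.b and not, for example, axb.
    return re.compile(r'^((?!\b' + re.escape(keyword) + r'\b).)*$', re.IGNORECASE)

escaped_rx = line_without_keyword('a.b')
unescaped_rx = re.compile(r'^((?!\ba.b\b).)*$', re.IGNORECASE)

has_keyword = escaped_rx.match('contains a.b inside')      # rejected: keyword present
no_keyword = escaped_rx.match('contains axb inside')       # accepted: "." is literal
false_reject = unescaped_rx.match('contains axb inside')   # rejected: "." matches "x"

print(has_keyword, bool(no_keyword), false_reject)
```

Without escaping, the dot in the keyword acts as a wildcard, so a line that does not contain the keyword is wrongly rejected.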

How to convert text input to regex pattern in PHP

You were close with this line:

$userPattern = "/^" .  preg_quote($_POST['txt1']). "$/";

But you missed the second parameter, which tells preg_quote which delimiter you are using so that it gets escaped too, as explained in http://www.php.net//manual/en/function.preg-quote.php:

$userPattern = "/^" . preg_quote($_POST['txt1'], '/') . "$/";

The delimiter can be "any non-alphanumeric, non-backslash, non-whitespace character", as explained here: http://www.php.net//manual/en/regexp.reference.delimiters.php

Regex - Get string after last /, if that string contains something (user input)

You may use this regex:

\S*/([^/\s]*dog\b\S*)

RegEx Details:

  • \S*/: Match longest match before last /. \S matches any non-whitespace character
  • (: Start capture group
    • [^/\s]*: Match 0 or more of any character that is not / and not a whitespace
    • dog\b: Match text dog ending with a word boundary
    • \S*: Match remaining string till end
  • ): End capture group
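
Since the word to look for comes from the user, the pattern above can be built at runtime. The following is a minimal Python sketch, assuming the keyword should be treated as plain text; build_pattern, find_segment and the sample URLs are made up for this example.

```python
import re

def build_pattern(keyword):
    # Same pattern as above, with the user's keyword escaped and spliced in for "dog".
    return re.compile(r'\S*/([^/\s]*' + re.escape(keyword) + r'\b\S*)')

def find_segment(keyword, text):
    # Return the captured group, or None when nothing matches.
    match = build_pattern(keyword).search(text)
    return match.group(1) if match else None

hit = find_segment('dog', 'see http://example.com/animals/hotdog for details')
miss = find_segment('dog', 'see http://example.com/animals/cat for details')

print(hit)   # hotdog
print(miss)  # None
```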

How to replace a user input in regular expression?

I think you're just missing string.Format():

string pattern = string.Format(@"\b(?!{0})\w+\b", UserInput);

Transform a "Regex" string to actual Regex in Javascript

You can create a regex from a string with the RegExp constructor (documentation on MDN):

The RegExp constructor creates a regular expression object for matching text with a pattern.

const regex2 = new RegExp('^(?:\\d{8}|\\d{11})$');
console.log(regex2); // /^(?:\d{8}|\d{11})$/

