What Is the Equivalent of Regexp_Substr in MySQL

What is the equivalent of REGEXP_SUBSTR in MySQL?

"I didn't find the REGEXP_SUBSTR function in MySQL docs. But I am hoping that it exists.."

Yes, it is supported starting from MySQL 8.0. From the Regular Expressions section of the manual:

REGEXP_SUBSTR(expr, pat[, pos[, occurrence[, match_type]]])

Returns the substring of the string expr that matches the regular expression specified by the pattern pat, NULL if there is no match. If expr or pat is NULL, the return value is NULL.
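
As a quick illustration of the optional pos and occurrence arguments, here are two calls based on the examples in the MySQL manual. The input string is only sample data.

```sql
-- First run of one or more letters in the string
SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');        -- 'abc'

-- Start searching at position 1 and return the third match
SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);  -- 'ghi'
```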

How to use REGEXP_SUBSTR or REGEXP_EXTRACT in MySQL Workbench with database hosted on Google Cloud SQL?

Assuming you just want the first two words, you could use SUBSTRING_INDEX here, which should work on most versions of MySQL:

SELECT SUBSTRING_INDEX('LVLS 45 Beam 1', ' ', 2) AS item
FROM yourTable;

If you also need to assert that the first term consists of capital letters and the second of digits, you could use REGEXP:

SELECT
    CASE WHEN 'LVLS 45 Beam 1' REGEXP '^[A-Z]+ [0-9]+'
         THEN SUBSTRING_INDEX('LVLS 45 Beam 1', ' ', 2)
         ELSE 'no match' END AS item
FROM yourTable;
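
If the Cloud SQL instance runs MySQL 8.0 or later, the same check and extraction can probably be done in one step with REGEXP_SUBSTR. This is a minimal sketch, assuming the same sample string as above. The 'c' match type asks for case-sensitive matching, since with a case-insensitive collation [A-Z] may also match lowercase letters.

```sql
-- Returns the leading "capitals, space, digits" part, or NULL if the string
-- does not start that way. Arguments: expr, pat, pos, occurrence, match_type.
SELECT REGEXP_SUBSTR('LVLS 45 Beam 1', '^[A-Z]+ [0-9]+', 1, 1, 'c') AS item;
-- 'LVLS 45'
```

Wrap the call in COALESCE(..., 'no match') if you want the same fallback value as the CASE expression.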

Use of REGEXP_SUBSTR to get date values from string

Since you are using MySQL 8+, it means you also have access to the REGEXP_REPLACE function, which is suitable for isolating the portion of the string which contains the two dates. In the CTE below, I isolate the date string, then in a subquery on that CTE, I fish out the two dates in separate columns using SUBSTRING_INDEX.

WITH cte AS (
    SELECT
        text,
        -- Backslashes are doubled because the MySQL string parser consumes one
        REGEXP_REPLACE(text, '^.*\\(([0-9]{2}-[0-9]{2}-[0-9]{4} - [0-9]{2}-[0-9]{2}-[0-9]{4})\\).*$', '$1') AS dates
    FROM yourTable
)
SELECT
    text,
    SUBSTRING_INDEX(dates, ' - ', 1) AS first_date,
    SUBSTRING_INDEX(dates, ' - ', -1) AS second_date
FROM cte;

Here is an explanation of the regex pattern used:

^                              from the start of the string
.*                             match any content, until hitting
\(                             '(' which is followed by
(                              (capture what follows)
[0-9]{2}-[0-9]{2}-[0-9]{4}     a single date
 -                             ' - ' (a hyphen with a space on each side)
[0-9]{2}-[0-9]{2}-[0-9]{4}     another single date
)                              (stop capture)
\)                             ')'
.*                             match the remainder of the content
$                              end of the string

Note that the pattern matches the entire input. This is required because REGEXP_REPLACE only replaces the part of the string that the pattern matches, so everything outside the capture group must be part of the match in order to be discarded. Also, note that REGEXP_SUBSTR might have been viable here, but it runs the risk of false positives if a date can appear elsewhere in the text, outside the parentheses.
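
For comparison, here is roughly what the REGEXP_SUBSTR variant could look like. It is only a sketch, assuming the same yourTable and text column as above. It uses the occurrence argument to pick the first and second date-shaped substrings, and it has exactly the weakness described: it does not check that the dates sit inside the parentheses.

```sql
-- Picks the 1st and 2nd dd-dd-dddd substrings found anywhere in the text
SELECT
    text,
    REGEXP_SUBSTR(text, '[0-9]{2}-[0-9]{2}-[0-9]{4}', 1, 1) AS first_date,
    REGEXP_SUBSTR(text, '[0-9]{2}-[0-9]{2}-[0-9]{4}', 1, 2) AS second_date
FROM yourTable;
```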

Looking for REGEXP_REPLACE() alternative on MySQL 5.7.27

This is not a final answer. If someone has a better or closer example for what I asked, go ahead and post an answer. I would like to see it!

What I did was simply use LIKE:

SELECT * FROM yourTable WHERE SUBSTRING_INDEX(product, ' ', 1) LIKE 'VALUE%'

I know this can return wrong rows in some cases, but for my needs it is acceptable.

MySQL REGEXP_SUBSTR() escaping issue?

In MySQL 8.x, which supports ICU regex, you may use:

SELECT Description, REGEXP_SUBSTR(Description, '(?im)(?=\\b(?:[0-9/]+(?:\\.[0-9/]+)?\\s*(?:[X-]|$)|[0-9/\\s]+(?:\\.[0-9/]+)?(?:[CM]?M|["”TH])))[0-9/\\s.]+(?:[CM]?M|["”TH])?(?:\\s*[/X-]\\s*[0-9/\\s.]+(?:[CM]?M|["”TH])?)?(?=[.\\s()]|$)') AS Size FROM tbl_Example

The main points:

  • The flags can be used as inline options, (?mi): m enables multiline mode, in which ^ and $ match the start/end of a line, and i enables case-insensitive mode
  • [$] matches a literal $ character. To match the end-of-line position, move $ out of the character class and use an alternation instead: (?=[\.\s\(\)$]) becomes (?=[.\s()]|$). Also, do not escape what does not have to be escaped
  • To match the fractional part of a number, it is better to use a pattern like (?:\.[0-9/]+)? (it matches an optional sequence of a . followed by one or more digits or / characters)
  • (C|M)? is better written as [CM]? (a character class is more efficient)
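
The [$] point is easy to check in isolation. This is a small sketch, assuming MySQL 8.x and a made-up sample string 'size 10' that is not from the original data. Backslashes are doubled because they sit inside a MySQL string literal.

```sql
SELECT
    -- $ inside the class is a literal dollar sign, so the lookahead needs an
    -- actual character (whitespace or '$') after the digits and fails at the end
    REGEXP_LIKE('size 10', '[0-9]+(?=[\\s$])') AS dollar_in_class,   -- 0
    -- $ as an alternative outside the class is the end-of-string anchor
    REGEXP_LIKE('size 10', '[0-9]+(?=\\s|$)')  AS dollar_as_anchor;  -- 1
```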

