Split Alphanumeric String Between Leading Digits and Trailing Letters

You can use preg_split with a lookbehind and a lookahead:

print_r(preg_split('#(?<=\d)(?=[a-z])#i', "0982asdlkj"));

prints

Array
(
    [0] => 0982
    [1] => asdlkj
)

This yields exactly two parts only if the letter part contains no digits; otherwise the string is split again at every later digit-to-letter boundary.

Update:

Just to clarify what is going on here:

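The regular expression looks at every position in the string. If a digit comes immediately before the position ((?<=\d)) and a letter immediately after it ((?=[a-z])), the pattern matches there and the string is split at that position. Neither assertion consumes any characters, so nothing is removed from the result. The whole pattern is case-insensitive (the i flag).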
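For comparison, the same lookaround idea can be sketched in Python. This is only an illustration, not part of the PHP answer: it assumes Python 3.7 or later, where re.split accepts patterns that match the empty string, and the name BOUNDARY is made up for the example.

```python
import re

# Zero-width split point: a digit behind, a letter ahead (case-insensitive).
# BOUNDARY is a hypothetical name used only for this sketch.
BOUNDARY = re.compile(r"(?<=\d)(?=[a-z])", re.IGNORECASE)

# One digit-to-letter boundary gives two parts.
print(BOUNDARY.split("0982asdlkj"))   # ['0982', 'asdlkj']

# A digit inside the letter part creates a second boundary, so three parts.
print(BOUNDARY.split("0982asd1lkj"))  # ['0982', 'asd1', 'lkj']
```

The second call shows the limitation mentioned above: every digit-to-letter transition is a split point, not just the first one.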

How to split strings into text and number?

I would approach this by using re.match in the following way:

import re

match = re.match(r"([a-z]+)([0-9]+)", 'foofo21', re.I)
if match:
    items = match.groups()
    print(items)  # ('foofo', '21')

Extracting alpha and numeric parts from a column

Yes, it is:

SELECT
  @col:=col1 AS col,
  @num:=REVERSE(CAST(REVERSE(@col) AS UNSIGNED)) AS num,
  SUBSTRING_INDEX(@col, @num, 1) AS word
FROM
  tab1

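This works only if your column contains letters followed by numbers (as you've described). CAST(... AS UNSIGNED) reads only the leading digits of a string, so the value is reversed to bring the digits to the front, cast, and then reversed back. Without the double REVERSE(), CAST() would see a string that starts with letters and return 0.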
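To make the steps of the query easier to follow, here is a rough Python emulation of the same logic. It is a sketch under assumptions, not MySQL itself: the function name split_word_number is hypothetical, and the cast is approximated as "take the leading run of digits, or 0 if there is none".

```python
import re
from typing import Tuple

def split_word_number(value: str) -> Tuple[str, str]:
    """Emulate REVERSE(CAST(REVERSE(col) AS UNSIGNED)) plus SUBSTRING_INDEX."""
    reversed_value = value[::-1]
    # Approximation of CAST(... AS UNSIGNED): keep only the leading digits.
    leading = re.match(r"\d*", reversed_value).group()
    as_unsigned = int(leading) if leading else 0
    # Reverse the number back into its original digit order.
    num = str(as_unsigned)[::-1]
    # SUBSTRING_INDEX(col, num, 1): everything before the first occurrence of num.
    word = value.split(num, 1)[0]
    return word, num

print(split_word_number("foofo21"))  # ('foofo', '21')
print(split_word_number("abc120"))   # ('abc', '12')
```

The second call exposes a limitation of the reverse-and-cast trick in this emulation: the integer conversion drops leading zeros of the reversed string, so a number that ends in zeros comes back shortened.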

Remove leading and trailing numbers from string, while leaving 2 numbers, using sed or awk

You may try this sed:

sed -E 's/^[0-9]+([0-9]{2})|([0-9]{2})[0-9]+$/\1\2/g' file

51word24
anotherword
12yetanother1
62andherese123anotherline43
23andherese123anotherline45
53andherese123anotherline41

Command Details:

  • ^[0-9]+([0-9]{2}): Matches a run of 3 or more digits at the start of the line, capturing its last 2 digits in group #1, and replaces the whole run with those 2 digits.
  • ([0-9]{2})[0-9]+$: Matches a run of 3 or more digits at the end of the line, capturing its first 2 digits in group #2, and replaces the whole run with those 2 digits.
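The two alternatives described above can also be tried out in Python with re.sub, which may help when checking the pattern against your own data. This is a sketch, not the sed answer itself: trim_digits and the sample inputs are hypothetical, and it relies on Python 3.5 or later, where an unmatched group in the replacement expands to an empty string.

```python
import re

# Same alternation as the sed command; re.sub replaces every match,
# which corresponds to the g flag.
PATTERN = re.compile(r"^[0-9]+([0-9]{2})|([0-9]{2})[0-9]+$")

def trim_digits(line: str) -> str:
    # Only one of the two groups takes part in any given match;
    # the other expands to an empty string, so \1\2 is safe.
    return PATTERN.sub(r"\1\2", line)

# Hypothetical inputs, not taken from the original question.
for line in ["1951word2400", "anotherword", "12yetanother1"]:
    print(trim_digits(line))
# 51word24
# anotherword
# 12yetanother1
```

Lines whose leading or trailing digit run is 2 digits or shorter are left unchanged, because both alternatives need at least 3 digits to match.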

Python - Splitting numbers and letters into sub-strings with regular expression

What's wrong with re.findall?

>>> import re
>>> s = '125km'
>>> re.findall(r'[A-Za-z]+|\d+', s)
['125', 'km']

[A-Za-z]+ matches one or more letters, | means "or", and \d+ matches one or more digits.

Alternatively, use re.split with a capturing group, and a list comprehension to drop the empty strings:

>>> [i for i in re.split(r'([A-Za-z]+)', s) if i]
['125', 'km']
>>> [i for i in re.split(r'(\d+)', s) if i]
['125', 'km']
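As a small extension of the re.findall approach, the same pattern also handles strings with several alternating runs. The sketch below wraps it in a helper; the name split_runs and the extra sample strings are assumptions made for illustration.

```python
import re

def split_runs(text):
    # Return every maximal run of letters or of digits, in order.
    # Characters that are neither (spaces, punctuation) are skipped.
    return re.findall(r"[A-Za-z]+|\d+", text)

print(split_runs("125km"))        # ['125', 'km']
print(split_runs("12km 30min"))   # ['12', 'km', '30', 'min']
print(split_runs("abc123def45"))  # ['abc', '123', 'def', '45']
```

Because findall collects matches rather than splitting, there are no empty strings to filter out afterwards.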

