Function to Return Only Alpha-Numeric Characters from String

Warning: Note that English is not restricted to just A-Z.

Try this to remove everything except a-z, A-Z and 0-9:

$result = preg_replace("/[^a-zA-Z0-9]+/", "", $s);

If your definition of alphanumeric includes letters in foreign languages and obsolete scripts then you will need to use the Unicode character classes, for example \p{L} (letters) and \p{N} (numbers) together with the u modifier.

Try this to leave only A-Z:

$result = preg_replace("/[^A-Z]+/", "", $s);

The reason for the warning is that words like résumé contain the letter é, which won't be matched by these patterns. If you want to match a specific list of letters, adjust the regular expression to include those letters. If you want to match all letters, use the appropriate Unicode character classes.
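The difference between the ASCII-only and the Unicode-aware approach can be sketched in Python (used here only for illustration; the answer above is PHP). The function names are hypothetical, and the sketch assumes that "alphanumeric" means whatever Python's str.isalnum() accepts, which includes letters and digits from any script.

```python
import re

def keep_ascii_alnum(s):
    # Remove everything except a-z, A-Z and 0-9.
    return re.sub(r"[^a-zA-Z0-9]+", "", s)

def keep_unicode_alnum(s):
    # str.isalnum() is Unicode-aware, so accented letters survive.
    return "".join(ch for ch in s if ch.isalnum())

text = "r\u00e9sum\u00e9 #1"  # "résumé #1" with precomposed é
print(keep_ascii_alnum(text))    # rsum1
print(keep_unicode_alnum(text))  # résumé1
```

The ASCII version silently drops the é, which is exactly the problem the warning describes.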

Quick way to extract alphanumeric characters from a string in Javascript

There are many ways to do it; the simplest is a basic regular expression with replace:

var str = "123^&*^&*^asdasdsad";
var clean = str.replace(/[^0-9A-Z]+/gi, "");
console.log(str);   // 123^&*^&*^asdasdsad
console.log(clean); // 123asdasdsad

How to keep only alphanumeric characters of a string?

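This might help. Note that it keeps whole words that contain at least one ASCII letter, rather than filtering individual characters:

from string import ascii_letters

def process_text(text):
    # Keep only the whitespace-separated words that contain at least one ASCII letter.
    return ' '.join(word for word in text.split()
                    if any(ch in ascii_letters for ch in word))

s = 'RegExr was created by gskinner34 in the summer of 69.'
print(process_text(s))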

Output :

RegExr was created by gskinner34 in the summer of
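If punctuation attached to the kept words should be stripped as well, a variant along the following lines may work. This is only a sketch: the function name keep_alnum_words is hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

def keep_alnum_words(text):
    words = []
    for token in text.split():
        # Strip every non-alphanumeric character from the token.
        cleaned = re.sub(r"[^A-Za-z0-9]+", "", token)
        # Keep the token only if at least one letter remains.
        if any(ch.isalpha() for ch in cleaned):
            words.append(cleaned)
    return " ".join(words)

print(keep_alnum_words("RegExr was created by gskinner34 in the summer of 69."))
# RegExr was created by gskinner34 in the summer of
print(keep_alnum_words("Hello, world! 42"))
# Hello world
```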

Swift - Getting only AlphaNumeric Characters from String

You can use replacingOccurrences directly with the [^A-Za-z0-9]+ pattern and the .regularExpression option; replacing with an empty string removes all non-overlapping matches from the input string:

import Foundation

let str = "_<$abc$>_"
let pattern = "[^A-Za-z0-9]+"
let result = str.replacingOccurrences(of: pattern, with: "", options: [.regularExpression])
print(result) // => abc

The [^A-Za-z0-9]+ pattern is a negated character class that matches any character except the ones listed in the class, one or more times (due to the + quantifier).

extract only alphanumeric token from text string of certain length and negate only letters or only digits in python

In the pattern that you tried, the part (?![0-9]*$) matches a position where what is directly to the right is not only digits until the end of the string. That holds at every position except those inside the trailing digits of f89 (and the very end of the string).

Then the part [a-zA-Z0-9]+ matches any run of characters allowed by the character class, so it matches all words in the text.

Instead, you can assert that the token is not only digits and has at least 4 characters, then match at least a single digit.

\b(?=[a-zA-Z0-9]{4})(?![0-9]+\b)[a-zA-Z]*[0-9][a-zA-Z0-9]*\b

Explanation

  • \b A word boundary to prevent a partial match
  • (?=[a-zA-Z0-9]{4}) Positive lookahead, assert that at least 4 allowed characters follow
  • (?![0-9]+\b) Negative lookahead, assert that the token is not only digits
  • [a-zA-Z]*[0-9][a-zA-Z0-9]* Match optional letters, one required digit, then any allowed characters
  • \b A word boundary

import re

text = 'latitude 7400 ws9083r f89'
match = re.findall(r'\b(?=[a-zA-Z0-9]{4})(?![0-9]+\b)[a-zA-Z]*[0-9][a-zA-Z0-9]*\b', text)

print(match)

Output

['ws9083r']
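To see which part of the pattern rejects each token, the individual words can be checked one at a time. This is a small sketch; the name TOKEN and the extra sample ab12 are illustrative additions, not part of the original question.

```python
import re

TOKEN = re.compile(r'\b(?=[a-zA-Z0-9]{4})(?![0-9]+\b)[a-zA-Z]*[0-9][a-zA-Z0-9]*\b')

samples = ['latitude', '7400', 'ws9083r', 'f89', 'ab12']
results = {tok: TOKEN.fullmatch(tok) is not None for tok in samples}

for tok, ok in results.items():
    print(tok, ok)
# latitude False  (no digit)
# 7400 False      (only digits)
# ws9083r True
# f89 False       (fewer than 4 characters)
# ab12 True
```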

Stripping everything but alphanumeric chars from a string in Python

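I timed some functions out of curiosity. In these tests I remove non-alphanumeric characters from the string string.printable (part of the built-in string module). Using a compiled '[\W_]+' pattern with pattern.sub('', str) was fastest. These timings appear to come from Python 2; in Python 3, filter() returns a lazy iterator, so the second test would need ''.join(filter(...)) to do comparable work.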

$ python -m timeit -s \
"import string" \
"''.join(ch for ch in string.printable if ch.isalnum())"
10000 loops, best of 3: 57.6 usec per loop

$ python -m timeit -s \
"import string" \
"filter(str.isalnum, string.printable)"
10000 loops, best of 3: 37.9 usec per loop

$ python -m timeit -s \
"import re, string" \
"re.sub('[\W_]', '', string.printable)"
10000 loops, best of 3: 27.5 usec per loop

$ python -m timeit -s \
"import re, string" \
"re.sub('[\W_]+', '', string.printable)"
100000 loops, best of 3: 15 usec per loop

$ python -m timeit -s \
"import re, string; pattern = re.compile('[\W_]+')" \
"pattern.sub('', string.printable)"
100000 loops, best of 3: 11.2 usec per loop
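The fastest variant above can be wrapped in a small helper, sketched below. The names NON_ALNUM, NON_ALNUM_ASCII and strip_non_alnum are hypothetical. One thing to be aware of: on Python 3 str patterns, \W is Unicode-aware by default, so accented letters are kept unless the re.ASCII flag is used.

```python
import re
import string

# Compiled once at import time; [\W_]+ matches runs of non-word characters
# and underscores, i.e. everything that is not a letter or digit.
NON_ALNUM = re.compile(r"[\W_]+")
# With re.ASCII, \W treats only ASCII letters and digits as word characters.
NON_ALNUM_ASCII = re.compile(r"[\W_]+", re.ASCII)

def strip_non_alnum(text, ascii_only=False):
    pattern = NON_ALNUM_ASCII if ascii_only else NON_ALNUM
    return pattern.sub("", text)

print(strip_non_alnum(string.printable))
# 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
print(strip_non_alnum("r\u00e9sum\u00e9!"))                   # résumé
print(strip_non_alnum("r\u00e9sum\u00e9!", ascii_only=True))  # rsum
```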

RegEx for Javascript to allow only alphanumeric

/^[a-z0-9]+$/i

^ Start of string
[a-z0-9] a or b or c or ... z or 0 or 1 or ... 9
+ one or more times (change to * to allow empty string)
$ end of string
/i case-insensitive
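For comparison, the same anchored check can be sketched in Python (the answer above is JavaScript; the function name is_ascii_alnum is hypothetical). re.fullmatch anchors the pattern at both ends, so ^ and $ are not needed, and both letter cases are spelled out in the class instead of using a case-insensitive flag.

```python
import re

def is_ascii_alnum(s):
    # True only if the whole string is one or more ASCII letters or digits.
    return re.fullmatch(r"[A-Za-z0-9]+", s) is not None

print(is_ascii_alnum("abc123"))   # True
print(is_ascii_alnum("abc 123"))  # False
print(is_ascii_alnum(""))         # False
```

Changing + to * would accept the empty string, as in the JavaScript version.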

Update (supporting universal characters)

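If you need this regexp to support characters beyond ASCII, look up the ranges you need in a list of Unicode characters and add them to the character class.

For example: /^([a-zA-Z0-9\u0600-\u06FF\u0660-\u0669\u06F0-\u06F9 _.-]+)$/

This supports Persian. Note that it also allows spaces, underscores, dots and hyphens.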

Create a new string that contains only alphanumeric characters from the old string

  1. Allocate a new char array at least as long as your string, plus one byte for the terminating NUL. Convince yourself that this is enough space.
  2. Loop through the string, copying to the new string only those characters that are alphanumeric. You can't do this portably without including <ctype.h> and using a function/macro from that header (such as isalnum), unless you're going to enumerate all characters that you consider alphanumeric.
  3. Terminate the new string with a NUL.

Find spaces and alphanumeric characters in a string C Language

Some of your && must be replaced by ||, because a character is a digit OR a lower-case letter OR a space OR an upper-case letter; it cannot be all of these at the same time:

check = 1;
for (int i = 0; i < strlen(str); i++)
{
    if (! (((str[i] >= '0') && (str[i] <= '9')) ||
           ((str[i] >= 'a') && (str[i] <= 'z')) ||
           (str[i] == ' ') ||
           ((str[i] >= 'A') && (str[i] <= 'Z')))) {
        check = -1;
        break;
    }
}

