In Python, how to check if a string only contains certain characters?
Final(?) edit
Answer, wrapped up in a function, with annotated interactive session:
>>> import re
>>> def special_match(strg, search=re.compile(r'[^a-z0-9.]').search):
...     return not bool(search(strg))
...
>>> special_match("")
True
>>> special_match("az09.")
True
>>> special_match("az09.\n")
False
# The above test case is to catch out any attempt to use re.match()
# with a `$` instead of `\Z` -- see point (6) below.
>>> special_match("az09.#")
False
>>> special_match("az09.X")
False
>>>
Note: A comparison with re.match() appears further down in this answer. Further timings show that match() would win with much longer strings. However, match() seems to have a much larger overhead than search() when the final answer is True. This is puzzling (perhaps it is the cost of returning a MatchObject instead of None) and may warrant further rummaging.
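A rough way to repeat that comparison yourself is sketched below. The function names (via_search, via_match) and the test strings are made up for illustration, and no timings are claimed here because they depend on the Python version and the machine.

```python
import re
import timeit

# Hypothetical helper names; the two patterns mirror the approaches compared in this answer.
_find_bad = re.compile(r'[^a-z0-9.]').search    # look for one disallowed character
_match_all = re.compile(r'[a-z0-9.]*\Z').match  # match the whole string up to its true end

def via_search(s):
    return not _find_bad(s)

def via_match(s):
    return _match_all(s) is not None

# Time both approaches on a short and a long all-valid string.
for label, text in [("short", "az09."), ("long", "az09." * 2000)]:
    for fn in (via_search, via_match):
        secs = timeit.timeit(lambda: fn(text), number=1000)
        print(label, fn.__name__, secs)
```

Both functions give the same True/False answer for every input; only their speed differs.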
==== Earlier text ====
The [previously] accepted answer could use a few improvements:
(1) Presentation gives the appearance of being the result of an interactive Python session:
reg=re.compile('^[a-z0-9\.]+$')
>>>reg.match('jsdlfjdsf12324..3432jsdflsdf')
True
but match() doesn't return True; it returns a match object on success and None on failure
(2) For use with match(), the ^ at the start of the pattern is redundant, and appears to be slightly slower than the same pattern without the ^
(3) Should foster the habit of using a raw string automatically, without thinking, for any re pattern
(4) The backslash in front of the dot/period is redundant
(5) Slower than the OP's code!
prompt>rem OP's version -- NOTE: OP used raw string!
prompt>\python26\python -mtimeit -s"t='jsdlfjdsf12324..3432jsdflsdf';import re;reg=re.compile(r'[^a-z0-9\.]')" "not bool(reg.search(t))"
1000000 loops, best of 3: 1.43 usec per loop
prompt>rem OP's version w/o backslash
prompt>\python26\python -mtimeit -s"t='jsdlfjdsf12324..3432jsdflsdf';import re;reg=re.compile(r'[^a-z0-9.]')" "not bool(reg.search(t))"
1000000 loops, best of 3: 1.44 usec per loop
prompt>rem cleaned-up version of accepted answer
prompt>\python26\python -mtimeit -s"t='jsdlfjdsf12324..3432jsdflsdf';import re;reg=re.compile(r'[a-z0-9.]+\Z')" "bool(reg.match(t))"
100000 loops, best of 3: 2.07 usec per loop
prompt>rem accepted answer
prompt>\python26\python -mtimeit -s"t='jsdlfjdsf12324..3432jsdflsdf';import re;reg=re.compile('^[a-z0-9\.]+$')" "bool(reg.match(t))"
100000 loops, best of 3: 2.08 usec per loop
(6) Can produce the wrong answer!!
>>> import re
>>> bool(re.compile('^[a-z0-9\.]+$').match('1234\n'))
True # uh-oh
>>> bool(re.compile('^[a-z0-9\.]+\Z').match('1234\n'))
False
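Points (1) and (6) can be checked together in a short script. This is a minimal sketch, assuming Python 3.4 or later so that re.fullmatch() is available; ALLOWED and only_allowed are names invented for the example.

```python
import re

ALLOWED = r'[a-z0-9.]+'

def only_allowed(s):
    # fullmatch() succeeds only if the pattern consumes the entire string,
    # so no explicit end anchor is needed. Compare against None because
    # the match functions return a match object or None, never True/False.
    return re.fullmatch(ALLOWED, s) is not None

# $ also matches just before a trailing newline, so this wrongly accepts '1234\n'.
print(bool(re.match(ALLOWED + r'$', '1234\n')))   # True
# \Z matches only at the very end of the string.
print(bool(re.match(ALLOWED + r'\Z', '1234\n')))  # False
print(only_allowed('1234\n'))                     # False
print(only_allowed('1234'))                       # True
```

Unlike special_match() at the top of this answer, this pattern uses + and therefore rejects the empty string.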
Check if a string contains only given characters
You could use any() with a generator expression:
if any(c not in 'abc' for c in _str): # Don't use str as a name.
    print('Wrong character')
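The same check can be wrapped in a reusable function. The sketch below uses a set comparison instead of any(); only_contains is a hypothetical name, not something from the original answer.

```python
def only_contains(text, allowed):
    """Return True if every character of text appears in allowed."""
    # set(text) <= set(allowed) is a subset test on the distinct characters.
    return set(text) <= set(allowed)

print(only_contains('abcabc', 'abc'))  # True
print(only_contains('abcd', 'abc'))    # False
print(only_contains('', 'abc'))        # True: an empty string has no offending characters
```

Decide explicitly whether the empty string should count as valid for your use case, since both this version and the any() version accept it.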
Check if a string contains only characters from a regular expression
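This reproduces what you want: ^ matches the beginning of the string and $ the end. In between there are repeating (+) characters matched by \w, which is [A-Za-z0-9_] for ASCII text (in Python 3, \w in a str pattern also matches other Unicode word characters unless re.ASCII is set).
legal_characters = r'^\w+$'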
Update
After the modification of your question, this is my suggestion: ^ matches the beginning of the string and $ the end. In between there are repeating (+) elements of [*-]:
legal_characters = '^[*-]+$'
There is no need to escape *- with \.
As pointed out by Maroun Maroun, you can leave out the ^ because match() only matches at the beginning of the string anyway:
legal_characters = '[*-]+$'
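The claim about the redundant ^ holds only for match(). The following sketch shows this with a few sample strings chosen for the example; the variable names are illustrative.

```python
import re

with_caret = re.compile(r'^[*-]+$')
without_caret = re.compile(r'[*-]+$')

for s in ['*-*', 'a*-', '*-a', '']:
    # match() always starts at index 0, so the leading ^ changes nothing.
    assert bool(with_caret.match(s)) == bool(without_caret.match(s))

print(bool(without_caret.match('a*-')))   # False
print(bool(without_caret.search('a*-')))  # True: search() may start later in the string
```

If the pattern is ever used with search() instead of match(), the ^ is needed again.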
Test if string ONLY contains given characters
Assuming the discrepancy in your example is a typo, then this should work:
my_list = ['aba', 'acba', 'caz']
result = [s for s in my_list if not s.strip('abc')]
This results in ['aba', 'acba']. s.strip(characters) returns an empty string if the string being stripped contains nothing but characters from the input. The order of the characters does not matter.
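As a small sketch, the strip() trick can be wrapped in a helper; only_chars is a hypothetical name. One edge case worth knowing is that the empty string passes the test.

```python
def only_chars(s, chars):
    # str.strip(chars) removes leading and trailing characters found in chars.
    # If nothing is left, every character of s was in chars.
    return not s.strip(chars)

my_list = ['aba', 'acba', 'caz', '']
print([s for s in my_list if only_chars(s, 'abc')])  # ['aba', 'acba', '']
```

A disallowed character anywhere in the string stops the stripping from both sides, so it always survives in the result and the check fails as intended.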
How to check if a string contains only specific characters in python?
There are a couple of issues here. First, . and - are meta-characters and need to be escaped with a \ (inside a character class the escape on . is optional, and - needs it only when it sits between two other characters, as it does here). Second, you don't really need the loop here: add a * to indicate any number of these characters, and anchor the regex between ^ and $ to signify that the entire string needs to be made up of these characters:
import re
e = 'p12/5@gmail.com'
p = re.compile(r'^[a-zA-Z0-9@\.\-_]*$')
if p.match(e):
    print('okay')
else:
    print('n')
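An alternative spelling avoids the escapes by putting the hyphen last in the character class. This is a sketch rather than the answerer's code; VALID and is_valid are invented names, and re.fullmatch() requires Python 3.4 or later.

```python
import re

# A hyphen placed last in the class is literal, and a dot needs no escape inside [...].
VALID = re.compile(r'[a-zA-Z0-9@._-]*')

def is_valid(s):
    # fullmatch() requires the whole string to match, replacing the ^...$ anchors.
    return VALID.fullmatch(s) is not None

print(is_valid('p12/5@gmail.com'))  # False: '/' is not in the allowed set
print(is_valid('p125@gmail.com'))   # True
```

Because the pattern uses *, the empty string is accepted, just as with the original ^...*$ version.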