Find USA phone numbers in python script
If you are interested in learning regex, you could take a stab at writing it yourself. It's not quite as hard as it's made out to be. Sites like RegexPal let you enter some test data, then write and test a regular expression against that data. Using RegexPal, try adding some phone numbers in the various formats you expect to find them (with brackets, area codes, etc.), grab a regex cheat sheet and see how far you can get. If nothing else, it will help you read other people's expressions.
Edit:
Here is a modified version of your regex, which should also match 7- and 10-digit phone numbers that lack any hyphens, spaces or dots. I added question marks after the character classes (the []s), which makes the separator each one matches optional. I tested it in RegexPal, but as I'm still learning regex, I'm not sure that it's perfect. Give it a try.
(\d{3}[-\.\s]??\d{3}[-\.\s]??\d{4}|\(\d{3}\)\s*\d{3}[-\.\s]??\d{4}|\d{3}[-\.\s]??\d{4})
It matched the following values in RegexPal:
000-000-0000
000 000 0000
000.000.0000
(000)000-0000
(000)000 0000
(000)000.0000
(000) 000-0000
(000) 000 0000
(000) 000.0000
000-0000
000 0000
000.0000
0000000
0000000000
(000)0000000
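The same check can be repeated outside RegexPal. The following is a minimal sketch, assuming Python's re module and using re.fullmatch so that each sample has to match in its entirety. The names PHONE_RE and SAMPLES are illustrative and not part of the original answer.

```python
import re

# The pattern from the answer, split across lines for readability.
PHONE_RE = re.compile(
    r"(\d{3}[-\.\s]??\d{3}[-\.\s]??\d{4}"
    r"|\(\d{3}\)\s*\d{3}[-\.\s]??\d{4}"
    r"|\d{3}[-\.\s]??\d{4})"
)

# The sample values listed above.
SAMPLES = [
    "000-000-0000", "000 000 0000", "000.000.0000",
    "(000)000-0000", "(000)000 0000", "(000)000.0000",
    "(000) 000-0000", "(000) 000 0000", "(000) 000.0000",
    "000-0000", "000 0000", "000.0000",
    "0000000", "0000000000", "(000)0000000",
]

# Collect every sample that the pattern fails to match as a whole.
unmatched = [s for s in SAMPLES if PHONE_RE.fullmatch(s) is None]
print(unmatched)  # an empty list means every sample matched
```

When searching inside longer text with re.search or re.findall, the pattern has no boundaries of its own, so it can also match part of a longer run of digits. Adding word boundaries or similar guards may be worth considering for that use.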
Python phone number regex
https://docs.python.org/3/library/re.html#re.findall
When a pattern contains more than one capturing group, findall returns a list of tuples, with each tuple holding the groups from one match. You are grouping the whitespace but you're not grouping the actual digits.
Try a regex that groups the digits too:
r"(\+420)?(\s*)?(\d{3})(\s*)?(\d{3})(\s*)?(\d{3})"
E.g.
import re

def detect_numbers(text):
    # Optional +420 prefix, then three groups of three digits with optional whitespace.
    phone_regex = re.compile(r"(\+420)?\s*?(\d{3})\s*?(\d{3})\s*?(\d{3})")
    print(phone_regex.findall(text))

detect_numbers("so I need to match +420 123 123 123, also 123 123 123, also +420123123123 and also 123123123. Can y")
prints:
[('+420', '123', '123', '123'), ('', '123', '123', '123'), ('+420', '123', '123', '123'), ('', '123', '123', '123')]
You could then string-join the group matches to get the numbers, e.g.
import re

def detect_numbers(text):
    phone_regex = re.compile(r"(\+420)?\s*?(\d{3})\s*?(\d{3})\s*?(\d{3})")
    groups = phone_regex.findall(text)
    for g in groups:
        # Each g is a tuple of captured groups; joining them drops the whitespace.
        print("".join(g))

detect_numbers("so I need to match +420 123 123 123, also 123 123 123, also +420123123123 and also 123123123. Can y")
prints:
+420123123123
123123123
+420123123123
123123123
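An alternative to joining the groups is to avoid capturing groups altogether, so that findall returns the whole matched text. This is only a sketch of that variation, not the approach from the answer above. The name extract_numbers is hypothetical, and whitespace is stripped afterwards with re.sub.

```python
import re

# Same pattern, but the prefix group is non-capturing and the digit groups
# are dropped, so findall returns whole matches instead of tuples.
PHONE_RE = re.compile(r"(?:\+420)?\s*?\d{3}\s*?\d{3}\s*?\d{3}")

def extract_numbers(text):
    # Remove internal (and any leading) whitespace from each match.
    return [re.sub(r"\s+", "", m) for m in PHONE_RE.findall(text)]

text = ("so I need to match +420 123 123 123, also 123 123 123, "
        "also +420123123123 and also 123123123. Can y")
print(extract_numbers(text))
# ['+420123123123', '123123123', '+420123123123', '123123123']
```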
python regex - finding phone number
A quick fix for your pattern is
\+?\d+(?:[- \)]+\d+)+
Note the use of the non-capturing group, which keeps re.findall from returning a list of tuples (or only the captured part of each match).
Details
\+? - an optional plus sign (1 or 0 occurrences)
\d+ - 1+ digits
(?: - start of a non-capturing group:
  [- )]+ - 1 or more -, space, or ) chars
  \d+ - 1+ digits
)+ - end of the group, repeated 1 or more times (the whole (?:...) sequence of patterns is quantified this way, so both the symbols and the digits are required at least once, and in that order).
Python demo:
import re
rx = r"\+?\d+(?:[- )]+\d+)+"
s = "+00 0000 0000 is my number and +44-787-77950 was my uk number"
print(re.findall(rx, s))
# => ['+00 0000 0000', '+44-787-77950']
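To see why the group is non-capturing, compare both variants on the same input. This is a small sketch using the demo string from above. The variable names capturing and non_capturing are illustrative.

```python
import re

s = "+00 0000 0000 is my number and +44-787-77950 was my uk number"

capturing = r"\+?\d+([- )]+\d+)+"        # plain parentheses: a capturing group
non_capturing = r"\+?\d+(?:[- )]+\d+)+"  # (?:...): grouping without capturing

# With one capturing group, findall returns only that group's text,
# and a repeated group keeps just its last repetition.
print(re.findall(capturing, s))      # [' 0000', '-77950']

# Without capturing groups, findall returns the whole match.
print(re.findall(non_capturing, s))  # ['+00 0000 0000', '+44-787-77950']
```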
Matching phone numbers, regex
I think you are looking for something like this:
(\(\d{3}\) \d{3}-\d{4})
From the Python docs:
{m}
Specifies that exactly m copies of the previous RE should be
matched; fewer matches cause the entire RE not to match. For example,
a{6} will match exactly six 'a' characters, but not five.
(\(\d\d\d\) \d\d\d-\d\d\d\d)
would also work, but, as you said in your question, is rather repetitive. Your other suggested pattern, (\([0-9]+\) [0-9]+-[0-9]+), gives false positives on input such as (1) 2-3.
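The difference between the two patterns can be checked directly. A minimal sketch follows. The names strict and loose and the sample number (555) 123-4567 are assumptions made for illustration only.

```python
import re

strict = re.compile(r"\(\d{3}\) \d{3}-\d{4}")       # exact digit counts via {m}
loose = re.compile(r"\([0-9]+\) [0-9]+-[0-9]+")     # any number of digits

for candidate in ["(555) 123-4567", "(1) 2-3"]:
    print(
        candidate,
        bool(strict.fullmatch(candidate)),
        bool(loose.fullmatch(candidate)),
    )
# (555) 123-4567 True True
# (1) 2-3 False True
```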
Find a valid phone number using regular expression in python
It matches 123-111-1234 (everything except the first digit). Change your regex to ^\d{3}-\d{3}-\d{4}$ to make sure it only matches the whole input.
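The effect of the anchors can be sketched as follows. The input 1123-111-1234 is an assumption, chosen because it has one extra leading digit as described above. The question's actual input is not shown here.

```python
import re

candidate = "1123-111-1234"  # hypothetical input with one extra leading digit

# Without anchors, search finds a valid-looking number inside the input.
unanchored = re.search(r"\d{3}-\d{3}-\d{4}", candidate)
print(unanchored.group())  # 123-111-1234

# With ^ and $, the whole input has to fit the pattern, so nothing matches.
anchored = re.search(r"^\d{3}-\d{3}-\d{4}$", candidate)
print(anchored)  # None

# re.fullmatch expresses the same intent without writing the anchors.
print(re.fullmatch(r"\d{3}-\d{3}-\d{4}", "123-111-1234") is not None)  # True
```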
Extract phone number using regex with different formats python
You could use
\b(?:03|7[016])[- /]?\d{3} ?\d{3}\b
Explanation
\b - a word boundary
(?:03|7[016]) - match one of 03, 70, 71 or 76
[- /]? - optionally match -, a space or /
\d{3} ?\d{3} - match 6 digits with an optional space after the 3rd digit
\b - a word boundary
For example
import re
regex = r"\b(?:03|7[016])[- /]?\d{3} ?\d{3}\b"
test_str = "Hi my name is marc and my phone number is 03-123456 and i would like 2 bottles of water 0.5L"
matches = re.search(regex, test_str)
if matches:
    print(matches.group())
Output
03-123456
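If the text may contain several numbers, re.findall can collect all of them, because the pattern has no capturing groups. The sample sentence below is made up for illustration and covers each prefix and separator the pattern allows.

```python
import re

regex = r"\b(?:03|7[016])[- /]?\d{3} ?\d{3}\b"

# Hypothetical text: four numbers the pattern should accept,
# plus "0.5L" and a "72" prefix that it should skip.
text = ("Call 03-123456 or 70 123 456, fax 71/123456, "
        "mobile 76123456, not 0.5L or 72 123 456")

found = re.findall(regex, text)
print(found)
# ['03-123456', '70 123 456', '71/123456', '76123456']
```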
Python regular expression for phone numbers
I suggest using this pattern:
(?:\B\+ ?49|\b0)(?: *[(-]? *\d(?:[ \d]*\d)?)? *(?:[)-] *)?\d+ *(?:[/)-] *)?\d+ *(?:[/)-] *)?\d+(?: *- *\d+)?
Note that it is written based on your comment saying that the phone numbers start with +49 or a 0, and on the list of examples you provided. It may be considered a work in progress, since you have not provided more specific rules for phone number extraction.
Pattern details
(?:\B\+ ?49|\b0) - a +, an optional space and 49, or a 0; neither substring can be preceded by a word char
(?: *[(-]? *\d(?:[ \d]*\d)?)? - an optional substring matching 0+ spaces, then an optional ( or -, 0+ spaces, a digit and then an optional sequence of digits/spaces followed by a digit
 *(?:[)-] *)? - 0+ spaces and then an optional sequence of ) or - followed by 0+ spaces
\d+ - 1+ digits
 * - 0+ spaces
(?:[/)-] *)? - an optional sequence of /, ) or - followed by 0+ spaces
\d+ - 1+ digits
 *(?:[/)-] *)? - 0+ spaces and then an optional sequence of /, ) or - followed by 0+ spaces
\d+ - 1+ digits
(?: *- *\d+)? - an optional sequence: 0+ spaces, -, 0+ spaces, 1+ digits.
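A usage sketch for this pattern follows. The two sample numbers are assumptions made up for illustration, since the asker's list of examples is not reproduced here; the name GERMAN_PHONE_RE is hypothetical as well. Because the pattern contains only non-capturing groups, re.findall returns the whole matches.

```python
import re

# The suggested pattern, split across lines for readability.
GERMAN_PHONE_RE = re.compile(
    r"(?:\B\+ ?49|\b0)"
    r"(?: *[(-]? *\d(?:[ \d]*\d)?)?"
    r" *(?:[)-] *)?\d+"
    r" *(?:[/)-] *)?\d+"
    r" *(?:[/)-] *)?\d+"
    r"(?: *- *\d+)?"
)

# Hypothetical input: one number with the +49 prefix, one starting with 0.
text = "Office: +49 30 1234567, mobile: 0171 1234567."

numbers = GERMAN_PHONE_RE.findall(text)
print(numbers)
# ['+49 30 1234567', '0171 1234567']
```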