How to extract numbers from a string in Python?
If you want to extract only positive integers, try the following:
>>> txt = "h3110 23 cat 444.4 rabbit 11 2 dog"
>>> [int(s) for s in txt.split() if s.isdigit()]
[23, 11, 2]
I would argue that this is better than the regex example: you don't need another module, and it's more readable because you don't need to learn (and parse) the regex mini-language.
This will not recognize floats, negative integers, or integers in hexadecimal format. If you can't accept these limitations, jmnas's answer below will do the trick.
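If those limitations matter, a regular expression can also pick up signed integers and decimals. The following is only a minimal sketch, not the answer referenced above; the pattern and the name `extract_numbers` are assumptions made for illustration, and hexadecimal or exponent notation is still not handled.

```python
import re

# Hypothetical pattern: optional sign, then either digits with an optional
# fractional part, or a bare fractional part such as ".5".
NUMBER_RE = re.compile(r"[-+]?(?:\d+(?:\.\d+)?|\.\d+)")

def extract_numbers(text):
    """Return every number found in text as an int or a float."""
    results = []
    for token in NUMBER_RE.findall(text):
        # Tokens containing a dot become floats, everything else ints.
        results.append(float(token) if "." in token else int(token))
    return results

print(extract_numbers("h3110 23 cat 444.4 rabbit -11 2 dog"))
# [3110, 23, 444.4, -11, 2]
```

Unlike the split-based version, this also matches digits embedded in a word (the 3110 in "h3110"), and a hyphen directly before a digit is read as a minus sign.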
Extract Number from String in Python
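You can filter the string by digits using the str.isdigit method. In Python 3, filter returns an iterator, so join the result back into a string before converting it:
>>> int("".join(filter(str.isdigit, str1)))
3158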
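One thing to keep in mind with this approach is that every digit in the string is concatenated into a single number. A small sketch, assuming a hypothetical helper named `digits_only` and made-up input strings:

```python
def digits_only(text):
    """Join every digit character in text and convert the result to int."""
    digits = "".join(filter(str.isdigit, text))
    # int("") would raise ValueError, so return None when there are no digits.
    return int(digits) if digits else None

print(digits_only("abc 31 def 58"))   # 3158
print(digits_only("no digits here"))  # None
```

If the string holds several separate numbers and you want them individually, a split-based or regex-based approach is probably a better fit.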
Is there a better way to extract numbers from a string in Python 3?
Here's one way you can do the regex search that @Barmar suggested:
>>> import re
>>> int(re.search(r"\d+", "V70N-HN")[0])
70
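Note that re.search returns None when nothing matches, so indexing the result with [0] raises a TypeError on a string without digits. A minimal sketch of a guarded version, where `first_int` and its `default` parameter are hypothetical names:

```python
import re

def first_int(text, default=None):
    """Return the first run of digits in text as an int, or default if none."""
    match = re.search(r"\d+", text)
    return int(match[0]) if match else default

print(first_int("V70N-HN"))  # 70
print(first_int("VN-HN"))    # None
```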
Get only numbers from string in Python
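You can use a regex:
import re
just = 'Standard Price:20000'
# findall returns every run of digits; take the first one
price = re.findall(r"\d+", just)[0]
Or split on the colon and take the part after it:
price = just.split(":")[1]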
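The two approaches can disagree when the label itself contains a digit. A short sketch with a made-up input string (`label` is a hypothetical value, not from the question):

```python
import re

label = "Tier 2 Price:20000"  # hypothetical input with a digit in the label

# The regex collects every run of digits, so index 0 is the "2" from the label.
all_numbers = re.findall(r"\d+", label)
# Splitting on the colon keeps only what follows it.
after_colon = label.split(":")[1]

print(all_numbers)       # ['2', '20000']
print(int(after_colon))  # 20000
```

Either way the result is a string, so wrap it in int() if you need a number.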
Extract numbers from an Array which has more than one string element
Use re.search, which extracts the first match of the pattern: 1 or more digits, followed by 3 zeros.
import re
my_array = ['STK72184 4/28/2022 50 from Exchange Balance, 50 from Earning Balance & 10 from Bonus 5000 Regular 10/20/2023 Approved 4/28/2022',
'STK725721 4/27/2022 50 from Exchange Balance, 40 from Earning Balance & 10 from Bonus Balance 5000 Regular 10/19/2023 Approved 4/27/2022',
'STK725721 4/27/2022 50 from Exchange Balance, 40 from Earning Balance & 10 from Bonus Balance 15000 Regular 10/19/2023 Approved 4/27/2022',
'STK722222 4/26/2022 50 from Exchange Balance, 40 from Earning Balance & 10 from Bonus Balance 10000 Regular 10/18/2023 Approved 4/26/2022']
# If you want strings:
nums = [re.search(r'\d+000', s)[0] for s in my_array]
print(nums)
# ['5000', '5000', '15000', '10000']
# If you want integers:
nums = [int(re.search(r'\d+000', s)[0]) for s in my_array]
print(nums)
# [5000, 5000, 15000, 10000]
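Because \d+000 can also match inside a longer token, it may be worth anchoring the pattern with word boundaries. A sketch with a made-up record (`record` is hypothetical and not part of the data above):

```python
import re

# Hypothetical record whose ID happens to contain "000".
record = "STK7250001 4/28/2022 10 from Bonus 5000 Regular"

# Without boundaries, the first match is found inside the ID.
loose = re.search(r"\d+000", record)[0]
# With \b on both sides, only a standalone number ending in 000 matches.
strict = re.search(r"\b\d+000\b", record)[0]

print(loose)   # 725000
print(strict)  # 5000
```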
How to Extract Numbers from String Column in Pandas with decimal?
If you want to match the numbers followed by OZ, you could write the pattern as:
(\d*\.?\d+)\s*OZ\b
Explanation
( opens capture group 1 (the value will be picked up by str.extract)
\d*\.?\d+ matches optional digits, an optional dot, and 1+ digits
) closes group 1
\s*OZ\b matches optional whitespace chars and then OZ, followed by a word boundary
import pandas as pd
data = [
"tld los 16OZ",
"HSJ14 OZ",
"hqk 28.3 OZ",
"rtk .7 OZ",
"ahdd .92OZ",
"aje 0.22 OZ"
]
df = pd.DataFrame(data, columns=["Product"])
df['Numbers'] = df['Product'].str.extract(r'(\d*\.?\d+)\s*OZ\b')
print(df)
Output
        Product Numbers
0  tld los 16OZ      16
1      HSJ14 OZ      14
2   hqk 28.3 OZ    28.3
3     rtk .7 OZ      .7
4    ahdd .92OZ     .92
5   aje 0.22 OZ    0.22
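The values in the Numbers column are strings. The pattern can also be checked outside pandas with the re module, converting each capture to a float. This is just a sketch using the same sample strings; the names `OZ_RE`, `products`, and `values` are assumptions for illustration.

```python
import re

# Same pattern as above, compiled once for reuse.
OZ_RE = re.compile(r"(\d*\.?\d+)\s*OZ\b")

products = [
    "tld los 16OZ",
    "HSJ14 OZ",
    "hqk 28.3 OZ",
    "rtk .7 OZ",
    "ahdd .92OZ",
    "aje 0.22 OZ",
]

# group(1) is the captured number; float() accepts forms such as ".7".
values = [float(OZ_RE.search(p).group(1)) for p in products]
print(values)  # [16.0, 14.0, 28.3, 0.7, 0.92, 0.22]
```

In pandas, something like df['Numbers'].astype(float) should give the equivalent numeric column.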
Extract numbers only from the strings in which a keyword is mentioned
Use a list comprehension with re.search and an if. Note that the second example shows that regex-based search can be quite powerful in pulling out just the patterns you want, so I almost always prefer it to exact string matching (except when performance is critical). Also, I renamed array to lst (this data structure is called a list in Python, and an array in some other languages).
import re
my_lst = ['STK72184 4/28/2022 50 from Exchange Balance, 50 from Earning Balance & 10 from Bonus 25000 Regular 10/20/2023 Approved 4/28/2022',
'STK725721 4/27/2022 50 from Exchange Balance, 40 from Earning Balance & 10 from Bonus Balance 5000 Regular 10/19/2023 Closed 4/27/2022',
'STK725721 4/27/2022 50 from Exchange Balance, 40 from Earning Balance & 10 from Bonus Balance 15000 Regular 10/19/2023 Closed 4/27/2022',
'STK722222 4/26/2022 50 from Exchange Balance, 40 from Earning Balance & 10 from Bonus Balance 10000 Regular 10/18/2023 Approved 4/26/2022']
nums = [int(re.search(r'\d+000', s)[0]) for s in my_lst if re.search(r'Approved', s)]
print(nums)
# [25000, 10000]
nums = [int(re.search(r'\d+000', s)[0]) for s in my_lst if re.search(r'4/2[67]', s)]
print(nums)
# [5000, 15000, 10000]