Finding a span tag with a 'variable'? but no class - Beautiful soup/Python
You can select a <span> element with a specific attribute (such as data-automation) by passing an attrs dict as a keyword argument to .find() or .find_all(). See the documentation.
To find <span> elements where data-automation has any value:
soup.find('span', attrs={'data-automation': True})
Where data-automation has a specific value:
soup.find('span', attrs={'data-automation': 'jobListingDate'})
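As a minimal sketch of both lookups side by side, here is a run against a small made-up snippet. Only the data-automation attribute and the jobListingDate value come from the answer above; the markup, the jobLocation value, and the variable names are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = """
<div>
  <span class="label">Posted</span>
  <span data-automation="jobListingDate">2d ago</span>
  <span data-automation="jobLocation">Sydney</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# True matches any <span> that carries the attribute, whatever its value.
any_value = soup.find_all("span", attrs={"data-automation": True})
any_texts = [span.get_text() for span in any_value]

# A string matches only that exact attribute value.
listing_date = soup.find("span", attrs={"data-automation": "jobListingDate"})
date_text = listing_date.get_text()

print(any_texts)
print(date_text)
```

The attrs dict is needed here because data-automation contains a hyphen, so it cannot be written as a Python keyword argument.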
Python 3 Beautifulsoup: Get span tag value with specific text which is also randomly placed within the html tree
Use a regular expression.
import re
from bs4 import BeautifulSoup
html='''<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Third Party Liability</span>
<span>€756.62</span>
</li>
<li>
<span>Fire & Theft</span>
<span>€15.59</span>
</li>
</ul>
</div>
<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Fire & Theft</span>
<span>€756.62</span>
</li>
<li>
<span>Third Party Liability</span>
<span>€15.59</span>
</li>
</ul>
</div>'''
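soup = BeautifulSoup(html, "html.parser")
for item in soup.find_all(class_="sk-expander-content"):
    # Match spans whose text looks like a euro amount, e.g. €756.62.
    # string= is the current name for the older text= argument.
    for span in item.find_all('span', string=re.compile(r"€\d+\.\d+")):
        print(span.find_previous_sibling('span').text)
        print(span.text)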
Output:
Third Party Liability
€756.62
Fire & Theft
€15.59
Fire & Theft
€756.62
Third Party Liability
€15.59
UPDATE:
If you only want the values from the first matching node, use find() instead of find_all().
import re
from bs4 import BeautifulSoup
html='''<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Third Party Liability</span>
<span>€756.62</span>
</li>
<li>
<span>Fire & Theft</span>
<span>€15.59</span>
</li>
</ul>
</div>
<div class="sk-expander-content" style="display: block;">
<ul>
<li>
<span>Fire & Theft</span>
<span>€756.62</span>
</li>
<li>
<span>Third Party Liability</span>
<span>€15.59</span>
</li>
</ul>
</div>'''
soup = BeautifulSoup(html, "html.parser")
# find() returns only the first sk-expander-content block
for span in soup.find(class_="sk-expander-content").find_all('span', string=re.compile(r"€\d+\.\d+")):
    print(span.find_previous_sibling('span').text)
    print(span.text)
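To make the difference between find() and find_all() concrete, here is a small sketch on a shortened, hypothetical version of the markup above. It assumes the label span always directly precedes the amount span, as in the example, and that your Beautiful Soup version accepts the string= argument (4.4.0 or later). The helper name label_amount_pairs is made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Shortened, hypothetical version of the markup from the answer.
html = """
<div class="sk-expander-content"><ul>
<li><span>Third Party Liability</span><span>€756.62</span></li>
</ul></div>
<div class="sk-expander-content"><ul>
<li><span>Fire &amp; Theft</span><span>€15.59</span></li>
</ul></div>
"""

soup = BeautifulSoup(html, "html.parser")
amount = re.compile(r"€\d+\.\d+")


def label_amount_pairs(block):
    """Return (label, amount) tuples for every amount span inside block."""
    pairs = []
    for span in block.find_all("span", string=amount):
        label = span.find_previous_sibling("span")
        pairs.append((label.get_text(), span.get_text()))
    return pairs


# find() returns only the first matching block.
first_only = label_amount_pairs(soup.find(class_="sk-expander-content"))

# find_all() returns every matching block.
all_blocks = [
    pair
    for block in soup.find_all(class_="sk-expander-content")
    for pair in label_amount_pairs(block)
]

print(first_only)
print(all_blocks)
```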
Get value from first span tag in beautifulsoup
Try using a CSS selector:
from bs4 import BeautifulSoup
# html holds the page markup from the question
soup = BeautifulSoup(html, "html.parser")
print(soup.select_one("td > span").text)
Output:
$39,465,077,974.88
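Since the page markup is not shown in the answer, here is a self-contained sketch of the same selector on a hypothetical table row. The markup, the second cell, and the variable names are assumptions; only the "td > span" selector and the dollar figure come from the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical table row; the real page markup is not shown in the answer.
html = """
<table>
  <tr>
    <td><span>$39,465,077,974.88</span><span>USD</span></td>
    <td><span>$1,234.00</span></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# select_one() returns the first element matching the CSS selector, or None.
first_span = soup.select_one("td > span")
value = first_span.get_text() if first_span is not None else None

# select() returns every match as a list, in document order.
all_values = [span.get_text() for span in soup.select("td > span")]

print(value)
print(all_values)
```

Checking for None before reading the text avoids an AttributeError when the selector matches nothing.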
How to get value inside span tag using beautiful soup 3
You can index the contents list with print 'Contents:', tag.contents[0], or better, just pull the text from each tag:
tags = soup('span')
for tag in tags:
    print('Contents:',tag.text)
Using your link, that gives:
('Contents:', u'100')
('Contents:', u'100')
('Contents:', u'97')
('Contents:', u'95')
('Contents:', u'95')
('Contents:', u'94')
('Contents:', u'93')
('Contents:', u'92')
('Contents:', u'84')
('Contents:', u'78')
('Contents:', u'78')
('Contents:', u'76')
('Contents:', u'69')
('Contents:', u'64')
('Contents:', u'60')
('Contents:', u'58')
('Contents:', u'53')
('Contents:', u'51')
('Contents:', u'49')
('Contents:', u'49')
('Contents:', u'45')
('Contents:', u'45')
('Contents:', u'45')
('Contents:', u'44')
('Contents:', u'39')
('Contents:', u'38')
('Contents:', u'37')
('Contents:', u'35')
('Contents:', u'34')
('Contents:', u'33')
('Contents:', u'32')
('Contents:', u'32')
('Contents:', u'30')
('Contents:', u'29')
('Contents:', u'28')
('Contents:', u'27')
('Contents:', u'21')
('Contents:', u'19')
('Contents:', u'16')
('Contents:', u'16')
('Contents:', u'15')
('Contents:', u'13')
('Contents:', u'13')
('Contents:', u'12')
('Contents:', u'11')
('Contents:', u'9')
('Contents:', u'6')
('Contents:', u'2')
('Contents:', u'1')
('Contents:', u'1')
The u prefix just means you have unicode strings. You can call str(tag.text) if you really want to remove it, or int(tag.text) if you want integers. I would also recommend that you upgrade to bs4.
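For comparison, a rough sketch of the same idea under bs4 and Python 3, where strings print without the u prefix. The markup below is a hypothetical stand-in for the linked page, and the scores name is illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the linked page.
html = "<td><span>100</span></td><td><span>97</span></td><td><span> 95 </span></td>"

soup = BeautifulSoup(html, "html.parser")

# strip=True trims surrounding whitespace before the text is converted to int.
scores = [int(span.get_text(strip=True)) for span in soup.find_all("span")]

print("Contents:", scores)
```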
Beautiful soup - Extract value from span class
The value 5.3% is in a span tag with the class attribute 'red'.
html='''
<h2>
<span class="group_name">All Files</span>
(<span class="cover"><span class="red">5.3%</span></span>
covered at
<span class="cover_strength">
<span class="green"> 545 </span>
</span> hits/line)
</h2>
'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
# print(soup.prettify())  # uncomment to inspect the parsed tree
result = soup.find("span", {"class": "red"})
print(result.text)
Output:
5.3%
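Two equivalent ways to write the same lookup, sketched on the markup from this answer: the class_ keyword and a CSS selector. The variable names and the extra lookup of the 545 figure are additions for illustration.

```python
from bs4 import BeautifulSoup

html = """
<h2>
  <span class="group_name">All Files</span>
  (<span class="cover"><span class="red">5.3%</span></span>
  covered at
  <span class="cover_strength"><span class="green"> 545 </span></span> hits/line)
</h2>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ keyword form (class alone is a reserved word in Python).
by_keyword = soup.find("span", class_="red").get_text()

# CSS selector form: a span.red that is a direct child of span.cover.
by_selector = soup.select_one("span.cover > span.red").get_text()

# The same approach reads the hits figure; strip=True drops the padding spaces.
hits = int(soup.select_one("span.cover_strength > span.green").get_text(strip=True))

print(by_keyword, by_selector, hits)
```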
Extract data from span, using BeautifulSoup, python
Using the find_all method
You can use the find_all() method to collect the text content of the HTML; just substitute your own link.
import requests
from bs4 import BeautifulSoup

# yourUrl is a placeholder; set it to the page you want to scrape
result = requests.get(yourUrl)
src = result.content
soup = BeautifulSoup(src, "html.parser")
tds = soup.find_all("td")
contents = []
for td in tds:
    outer = td.find("span", {"class": "sc-15yy2pl-0 hzgCfk"})
    inner = outer.find("span", {"class": "icon-Caret-down"}) if outer else None
    if inner is not None:  # skip cells that lack the nested spans
        contents.append(inner.text)
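The same loop can be tried offline with a CSS selector. This is only a sketch: the rows below are hypothetical, the class names are copied from the answer, and the text inside the inner span is invented so there is something to collect.

```python
from bs4 import BeautifulSoup

# Hypothetical rows; class names are copied from the answer above.
html = """
<table>
  <tr><td><span class="sc-15yy2pl-0 hzgCfk"><span class="icon-Caret-down">1.25%</span></span></td></tr>
  <tr><td>no spans here</td></tr>
  <tr><td><span class="sc-15yy2pl-0 hzgCfk"><span class="icon-Caret-down">0.40%</span></span></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

contents = []
for td in soup.find_all("td"):
    # Outer span must carry both classes; inner span must be its direct child.
    inner = td.select_one("span.sc-15yy2pl-0.hzgCfk > span.icon-Caret-down")
    if inner is None:
        continue  # this cell does not contain the nested spans
    contents.append(inner.get_text(strip=True))

print(contents)
```

Generated class names such as sc-15yy2pl-0 hzgCfk tend to change when a site is rebuilt, so a selector like this may need updating over time.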