How to get text from span tag in BeautifulSoup
You can use a CSS selector to pull the span you want by matching on its title text:
from bs4 import BeautifulSoup
soup = BeautifulSoup("""<div class="systemRequirementsMainBox">
<div class="systemRequirementsRamContent">
<span title="000 Plus Minimum RAM Requirement">1 GB</span> </div></div>""", "html.parser")
# *= matches any title attribute that contains the substring "RAM"
print(soup.select_one('span[title*="RAM"]').text)
That finds the span whose title attribute contains RAM. It is equivalent to checking "RAM" in span["title"] in Python.
Or use find with re.compile:
import re
print(soup.find("span", title=re.compile("RAM")).text)
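As a minimal sketch of the equivalence described above, the snippet below runs the CSS selector, the regex search, and a plain Python substring check against the same markup. The second span and the lambda-based lookup are additions for illustration and are not part of the original answer.

```python
import re
from bs4 import BeautifulSoup

# Sample markup; the "Direct X" span is added here so there is a non-matching span
html = """<div class="systemRequirementsRamContent">
<span title="000 Plus Minimum Direct X Requirement">DX 9</span>
<span title="000 Plus Minimum RAM Requirement">1 GB</span>
</div>"""
soup = BeautifulSoup(html, "html.parser")

# CSS attribute selector: title contains the substring "RAM"
by_css = soup.select_one('span[title*="RAM"]')
# Regex search on the title attribute
by_regex = soup.find("span", title=re.compile("RAM"))
# Explicit Python check, guarding against spans that have no title
by_func = soup.find("span", title=lambda t: t is not None and "RAM" in t)

texts = [by_css.text, by_regex.text, by_func.text]
print(texts)  # ['1 GB', '1 GB', '1 GB']
```

All three lookups land on the same span, so the choice between them is mostly a matter of taste; the regex form becomes useful once the match needs to be more specific than a plain substring.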
To get all the data:
import requests
from bs4 import BeautifulSoup
r = requests.get("http://www.game-debate.com/games/index.php?g_id=21580&game=000%20Plus").content
soup = BeautifulSoup(r,"lxml")
# The RAM requirement sits in its own container
cont = soup.select_one("div.systemRequirementsRamContent")
ram = cont.select_one("span")
print(ram["title"], ram.text)
# The remaining requirements are spans inside the smaller boxes
for span in soup.select("div.systemRequirementsSmallerBox.sysReqGameSmallBox span"):
    print(span["title"], span.text)
Which will give you:
000 Plus Minimum RAM Requirement 1 GB
000 Plus Minimum Operating System Requirement Win Xp 32
000 Plus Minimum Direct X Requirement DX 9
000 Plus Minimum Hard Disk Drive Space Requirement 500 MB
000 Plus GD Adjusted Operating System Requirement Win Xp 32
000 Plus GD Adjusted Direct X Requirement DX 9
000 Plus GD Adjusted Hard Disk Drive Space Requirement 500 MB
000 Plus Recommended Operating System Requirement Win Xp 32
000 Plus Recommended Hard Disk Drive Space Requirement 500 MB
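If the title/value pairs are needed as data rather than printed lines, one option is to collect them into a dictionary. The sketch below assumes a small hand-written stand-in for the downloaded page (the markup and the untitled span are hypothetical) and uses the span[title] selector so that spans without a title are skipped instead of raising a KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, not the real site markup
html = """
<div class="systemRequirementsRamContent">
  <span title="000 Plus Minimum RAM Requirement">1 GB</span>
</div>
<div class="systemRequirementsSmallerBox sysReqGameSmallBox">
  <span title="000 Plus Minimum Direct X Requirement">DX 9</span>
</div>
<div class="systemRequirementsSmallerBox sysReqGameSmallBox">
  <span>untitled span</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# span[title] only matches spans that actually carry a title attribute
requirements = {
    span["title"]: span.get_text(strip=True)
    for span in soup.select("span[title]")
}
print(requirements)
```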
How to get text from span tag in BeautifulSoup in loop
In your code, product_name_ref is always the same value, because you are selecting from soup, not from caption.
To get the desired information, you can use this example:
from bs4 import BeautifulSoup
txt = '''
<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup</h2>
<span class="reference-number">REF NO. A1400.5</span>
</div>
<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup 2</h2>
<span class="reference-number">REF NO. A1400.5 2</span>
</div>
'''
soup = BeautifulSoup(txt, 'html.parser')
product_new = []
product_ref = []
for product in soup.select('div.product-details'):
    product_new.append(product.h2.get_text(strip=True))
    product_ref.append(product.select_one('span.reference-number').get_text(strip=True))
print(product_new)
print(product_ref)
Prints:
['Weekly Roundup', 'Weekly Roundup 2']
['REF NO. A1400.5', 'REF NO. A1400.5 2']
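The mistake described at the top of this answer can be reproduced in a few lines. This is a sketch with simplified markup (only the reference spans are kept, and the second reference is changed to A1400.6 so the difference is visible): searching from soup inside the loop always returns the first match in the document, while searching from the loop variable returns the span that belongs to the current product.

```python
from bs4 import BeautifulSoup

html = """
<div class="product-details">
  <span class="reference-number">REF NO. A1400.5</span>
</div>
<div class="product-details">
  <span class="reference-number">REF NO. A1400.6</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

wrong, right = [], []
for product in soup.select("div.product-details"):
    # Searching the whole document: always the first span on the page
    wrong.append(soup.select_one("span.reference-number").get_text(strip=True))
    # Searching the current element: the span inside this product
    right.append(product.select_one("span.reference-number").get_text(strip=True))

print(wrong)  # ['REF NO. A1400.5', 'REF NO. A1400.5']
print(right)  # ['REF NO. A1400.5', 'REF NO. A1400.6']
```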
EDIT: To skip products where either tag is missing:
product_new = []
product_ref = []
for product in soup.select('div.product-details'):
    n = product.h2
    r = product.select_one('span.reference-number')
    if n and r:
        product_new.append(n.get_text(strip=True))
        product_ref.append(r.get_text(strip=True))
print(product_new)
print(product_ref)
EDIT 2: To keep only the reference code (the last whitespace-separated token):
from bs4 import BeautifulSoup
txt = '''
<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup</h2>
<span class="reference-number">REF NO. A1400.5</span>
</div>
<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup 2</h2>
<span class="reference-number">REF NO. A1400.6</span>
</div>
'''
soup = BeautifulSoup(txt, 'html.parser')
product_new = []
product_ref = []
for product in soup.select('div.product-details'):
    n = product.h2
    r = product.select_one('span.reference-number')
    if n and r:
        product_new.append(n.get_text(strip=True))
        product_ref.append(r.get_text(strip=True).rsplit(maxsplit=1)[-1])
print(product_new)
print(product_ref)
Prints:
['Weekly Roundup', 'Weekly Roundup 2']
['A1400.5', 'A1400.6']
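Note that rsplit(maxsplit=1)[-1] keeps only the last whitespace-separated token, so a reference that itself contains a space (such as "REF NO. A1400.5 2" from the first example) would be cut down to "2". A possible alternative, sketched here with a hypothetical helper named extract_ref, strips the "REF NO." label with a regular expression and keeps everything after it.

```python
import re

def extract_ref(text):
    """Hypothetical helper: drop a leading 'REF NO.' label, keep the rest."""
    return re.sub(r"^\s*REF\s+NO\.\s*", "", text).strip()

samples = ["REF NO. A1400.5", "REF NO. A1400.6", "REF NO. A1400.5 2"]

last_token = [s.rsplit(maxsplit=1)[-1] for s in samples]
label_stripped = [extract_ref(s) for s in samples]

print(last_token)      # ['A1400.5', 'A1400.6', '2']
print(label_stripped)  # ['A1400.5', 'A1400.6', 'A1400.5 2']
```

Which behavior is correct depends on what the real reference numbers look like; the regex assumes the label is always spelled "REF NO." at the start of the text.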
EDIT 3: To print the two lists side by side:
for a, b in zip(product_new, product_ref):
    print('{:<30} {}'.format(a, b))
Prints:
Weekly Roundup                 A1400.5
Weekly Roundup 2               A1400.6
Get the text from the nested span tag with beautifulsoup
This will work fine:
from bs4 import BeautifulSoup
s = '''<span class="value">401<span class="Suffix">st</span></span>'''
soup = BeautifulSoup(s, 'html.parser')
get_text = soup.find(class_='value')
print(get_text.contents[0])
Output:
401
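Indexing contents[0] relies on the text being the first child of the outer span. As a slightly more defensive sketch (an alternative, not the original answer's method), the outer span's own text can be gathered by searching only its direct children for strings, which leaves out anything inside the nested span regardless of position.

```python
from bs4 import BeautifulSoup

html = '<span class="value">401<span class="Suffix">st</span></span>'
soup = BeautifulSoup(html, "html.parser")
outer = soup.find(class_="value")

# get_text() walks into nested tags, so the suffix is included
full_text = outer.get_text()
# recursive=False restricts the search to direct children;
# string=True keeps only text nodes, skipping the nested <span>
own_text = "".join(outer.find_all(string=True, recursive=False))

print(full_text)  # 401st
print(own_text)   # 401
```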
Can't get text from span with BeautifulSoup
The "problem" is the pc_color CSS class. When you load the page, you need to specify which version of the page you want (PS4/XBOX/PC). This is done with the "platform" cookie (or you can use ps4_color instead of pc_color, for example):
import requests
from bs4 import BeautifulSoup
url = "https://www.futbin.com/players"
cookies = {"platform": "pc"}
soup = BeautifulSoup(requests.get(url, cookies=cookies).content, "html.parser")
result_prices_pc = soup.find_all(
    "span", attrs={"class": "pc_color font-weight-bold"}
)
for price in result_prices_pc:
    print(price.text)
Prints:
0
1.15M
3.75M
1.7M
4.19M
1.81M
351.65K
0
1.66M
98K
1.16M
3M
775K
99K
1.62M
187K
280K
245K
220K
1.03M
395K
100K
185K
864.2K
0
1.95M
540K
0
0
89K
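One detail of the search above is worth spelling out: passing a multi-class string such as "pc_color font-weight-bold" matches the class attribute as an exact string, so the same classes in a different order would be missed. The sketch below uses hand-written markup (hypothetical, not taken from the site) to compare that with a CSS selector, which matches on the individual classes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup illustrating class order; not copied from the real page
html = """
<span class="pc_color font-weight-bold">1.15M</span>
<span class="font-weight-bold pc_color">3.75M</span>
<span class="ps4_color font-weight-bold">1.2M</span>
"""
soup = BeautifulSoup(html, "html.parser")

# Exact string match on the class attribute: order matters
exact = [s.text for s in soup.find_all(
    "span", attrs={"class": "pc_color font-weight-bold"}
)]
# CSS selector: each class is matched independently, order does not matter
by_css = [s.text for s in soup.select("span.pc_color.font-weight-bold")]

print(exact)   # ['1.15M']
print(by_css)  # ['1.15M', '3.75M']
```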