How to Get Text from Span Tag in BeautifulSoup

How to get text from span tag in BeautifulSoup

You can use a CSS selector to pick out the span you want by its title text:

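from bs4 import BeautifulSoup

# The markup is HTML, so an HTML parser is used here rather than "xml".
soup = BeautifulSoup("""<div class="systemRequirementsMainBox">
<div class="systemRequirementsRamContent">
<span title="000 Plus Minimum RAM Requirement">1 GB</span> </div>""", "html.parser")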

print(soup.select_one("span[title*=RAM]").text)

That finds the span whose title attribute contains "RAM". It is equivalent to the Python test "RAM" in span["title"].
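As a quick check of that equivalence, the sketch below runs the `*=` selector and a plain Python substring test over the same markup. The second span (the CPU one) is made up for illustration, and `html.parser` is used so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# Illustrative markup: the CPU span is invented for this example.
html = """
<div class="systemRequirementsRamContent">
  <span title="000 Plus Minimum RAM Requirement">1 GB</span>
  <span title="000 Plus Minimum CPU Requirement">2 GHz</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS attribute selector: title contains the substring "RAM".
by_selector = [s.text for s in soup.select('span[title*="RAM"]')]

# The same filter written as a Python substring test.
by_python = [
    s.text for s in soup.find_all("span", title=True) if "RAM" in s["title"]
]

print(by_selector)  # ['1 GB']
print(by_python)    # ['1 GB']
```

Both approaches are case-sensitive here, so a title containing only "ram" would not match either one.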

Or use find with re.compile:

import re
print(soup.find("span", title=re.compile("RAM")).text)

To get all the data:

import requests
from bs4 import BeautifulSoup

r = requests.get("http://www.game-debate.com/games/index.php?g_id=21580&game=000%20Plus").content

soup = BeautifulSoup(r, "lxml")
# The RAM requirement sits in its own container.
cont = soup.select_one("div.systemRequirementsRamContent")
ram = cont.select_one("span")
print(ram["title"], ram.text)
# The remaining requirements are spans inside the smaller boxes.
for span in soup.select("div.systemRequirementsSmallerBox.sysReqGameSmallBox span"):
    print(span["title"], span.text)

Which will give you:

000 Plus Minimum RAM Requirement 1 GB
000 Plus Minimum Operating System Requirement Win Xp 32
000 Plus Minimum Direct X Requirement DX 9
000 Plus Minimum Hard Disk Drive Space Requirement 500 MB
000 Plus GD Adjusted Operating System Requirement Win Xp 32
000 Plus GD Adjusted Direct X Requirement DX 9
000 Plus GD Adjusted Hard Disk Drive Space Requirement 500 MB
000 Plus Recommended Operating System Requirement Win Xp 32
000 Plus Recommended Hard Disk Drive Space Requirement 500 MB

How to get text from span tag in BeautifulSoup in loop

In your code, product_name_ref is always the same value, because you are selecting from soup, not from caption.
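The question's own code is not reproduced here, so the following is only a sketch of the mistake described above, using made-up markup and a loop variable named `caption` as in the answer. It shows why searching from `soup` inside the loop keeps returning the first match, while searching from the current element does not.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the original page is not shown in the answer.
html = """
<div class="caption"><span class="reference-number">REF 1</span></div>
<div class="caption"><span class="reference-number">REF 2</span></div>
"""
soup = BeautifulSoup(html, "html.parser")

wrong, right = [], []
for caption in soup.select("div.caption"):
    # Searching the whole document always returns the first match.
    wrong.append(soup.select_one("span.reference-number").text)
    # Searching inside the current element returns its own span.
    right.append(caption.select_one("span.reference-number").text)

print(wrong)  # ['REF 1', 'REF 1']
print(right)  # ['REF 1', 'REF 2']
```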

To get the desired information, you can use this example:

from bs4 import BeautifulSoup


txt = '''
<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup</h2>
<span class="reference-number">REF NO. A1400.5</span>
</div>

<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup 2</h2>
<span class="reference-number">REF NO. A1400.5 2</span>
</div>
'''

soup = BeautifulSoup(txt, 'html.parser')

product_new = []
product_ref = []

for product in soup.select('div.product-details'):
    product_new.append(product.h2.get_text(strip=True))
    product_ref.append(product.select_one('span.reference-number').get_text(strip=True))

print(product_new)
print(product_ref)

Prints:

['Weekly Roundup', 'Weekly Roundup 2']
['REF NO. A1400.5', 'REF NO. A1400.5 2']

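EDIT: To skip any product that is missing its name or its reference number, check both tags before appending: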

product_new = []
product_ref = []

for product in soup.select('div.product-details'):
    n = product.h2
    r = product.select_one('span.reference-number')

    if n and r:
        product_new.append(n.get_text(strip=True))
        product_ref.append(r.get_text(strip=True))

print(product_new)
print(product_ref)

EDIT 2: To keep only the reference code, take the last whitespace-separated token of the span text:

from bs4 import BeautifulSoup


txt = '''
<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup</h2>
<span class="reference-number">REF NO. A1400.5</span>
</div>

<div class="product-details">
<h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup 2</h2>
<span class="reference-number">REF NO. A1400.6</span>
</div>
'''

soup = BeautifulSoup(txt, 'html.parser')

product_new = []
product_ref = []

for product in soup.select('div.product-details'):
    n = product.h2
    r = product.select_one('span.reference-number')

    if n and r:
        product_new.append(n.get_text(strip=True))
        product_ref.append(r.get_text(strip=True).rsplit(maxsplit=1)[-1])

print(product_new)
print(product_ref)

Prints:

['Weekly Roundup', 'Weekly Roundup 2']
['A1400.5', 'A1400.6']
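The `rsplit(maxsplit=1)[-1]` call above keeps whatever follows the last run of whitespace. The sketch below wraps it in a helper, `last_token`, which is a name introduced here for illustration only, and tries it on a few strings to show where it does and does not give the reference code.

```python
def last_token(text):
    """Return the last whitespace-separated token of text."""
    return text.rsplit(maxsplit=1)[-1]

print(last_token("REF NO. A1400.5"))    # A1400.5
print(last_token("REF NO. A1400.5 2"))  # 2
print(last_token("A1400.7"))            # A1400.7
```

The second call shows the limitation: with the sample data from the first example ("REF NO. A1400.5 2") only the trailing "2" survives. An empty string would raise IndexError, because `"".rsplit(maxsplit=1)` returns an empty list.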

EDIT 3: To print each name next to its reference in aligned columns:

for a, b in zip(product_new, product_ref):
    print('{:<30} {}'.format(a, b))

Prints:

Weekly Roundup                 A1400.5
Weekly Roundup 2               A1400.6

Get the text from the nested span tag with BeautifulSoup

The first child of the outer span is the text node "401", so contents[0] returns it without the nested suffix:

from bs4 import BeautifulSoup
s = '''<span class="value">401<span class="Suffix">st</span></span>'''
soup = BeautifulSoup(s, 'html.parser')
get_text = soup.find(class_='value')
print(get_text.contents[0])

Output

401
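`contents[0]` relies on the text being the first child. As a slightly more defensive variant (a sketch, not part of the original answer), the direct children can be filtered by type so that only the outer span's own text is kept, wherever it sits among the children.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

s = '<span class="value">401<span class="Suffix">st</span></span>'
soup = BeautifulSoup(s, "html.parser")
value = soup.find(class_="value")

# Keep only the strings that are direct children of the outer span.
own_text = "".join(
    child for child in value.children if isinstance(child, NavigableString)
)

print(own_text)          # 401
print(value.get_text())  # 401st
```

For comparison, `get_text()` also descends into the nested span, which is why it returns "401st".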

Can't get text from span with BeautifulSoup

The "problem" is the pc_color CSS class. When you load the page, you need to specify which version of the page you want (PS4/XBOX/PC). This is done with the "platform" cookie (or you can use ps4_color instead of pc_color, for example):

import requests
from bs4 import BeautifulSoup

url = "https://www.futbin.com/players"
cookies = {"platform": "pc"}
soup = BeautifulSoup(requests.get(url, cookies=cookies).content, "html.parser")


result_prices_pc = soup.find_all(
    "span", attrs={"class": "pc_color font-weight-bold"}
)

for price in result_prices_pc:
    print(price.text)

Prints:

0 
1.15M
3.75M
1.7M
4.19M
1.81M
351.65K
0
1.66M
98K
1.16M
3M
775K
99K
1.62M
187K
280K
245K
220K
1.03M
395K
100K
185K
864.2K
0
1.95M
540K
0
0
89K
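One detail of the `find_all` call above: passing the two class names as a single string matches the class attribute as an exact string, so the order of the classes matters. The sketch below uses made-up markup (not the real futbin.com page) to compare that with a CSS selector, which requires both classes but accepts them in any order.

```python
from bs4 import BeautifulSoup

# Made-up markup: the same two classes, written in a different order.
html = """
<span class="pc_color font-weight-bold">1.15M</span>
<span class="font-weight-bold pc_color">3.75M</span>
"""
soup = BeautifulSoup(html, "html.parser")

# Exact string match on the class attribute: order matters.
exact = [
    s.text
    for s in soup.find_all("span", attrs={"class": "pc_color font-weight-bold"})
]

# CSS selector: both classes are required, in any order.
any_order = [s.text for s in soup.select("span.pc_color.font-weight-bold")]

print(exact)      # ['1.15M']
print(any_order)  # ['1.15M', '3.75M']
```

If the site ever reorders or adds classes on those spans, the selector form is the one that keeps working.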

