How to Get Text from Span Tag in BeautifulSoup

How to get text from span tag in BeautifulSoup

You can use a CSS selector to pick the span you want by its title text:

from bs4 import BeautifulSoup

soup = BeautifulSoup("""<div class="systemRequirementsMainBox">
<div class="systemRequirementsRamContent">
<span title="000 Plus Minimum RAM Requirement">1 GB</span> </div>""", "html.parser")

print(soup.select_one("span[title*=RAM]").text)

This finds the span whose title attribute contains RAM. It is equivalent to the Python test "RAM" in span["title"].

Or use find with re.compile:

import re
print(soup.find("span", title=re.compile("RAM")).text)
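
As a quick check of the equivalence described above, the sketch below runs the CSS selector, a plain Python substring test, and the re.compile lookup against the same snippet. The second span and the html.parser choice are assumptions added for illustration, so that there is something for the filter to exclude and no extra parser dependency is needed.

```python
import re
from bs4 import BeautifulSoup

# Sample markup; the "Direct X" span is added here for illustration only.
html = """<div class="systemRequirementsMainBox">
<div class="systemRequirementsRamContent">
<span title="000 Plus Minimum RAM Requirement">1 GB</span></div>
<div><span title="000 Plus Minimum Direct X Requirement">DX 9</span></div>
</div>"""
soup = BeautifulSoup(html, "html.parser")

# CSS attribute selector: title contains the substring "RAM".
by_css = soup.select_one('span[title*="RAM"]')

# The same test written as a plain Python substring check.
by_python = [s for s in soup.find_all("span") if "RAM" in s.get("title", "")]

# The same test as a regular expression passed to find().
by_regex = soup.find("span", title=re.compile("RAM"))

print(by_css.text)
print(by_python[0].text)
print(by_regex.text)
```

All three lookups land on the same span, so which one to use is mostly a matter of taste. The substring check with .get("title", "") also tolerates spans that have no title attribute at all.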

To get all the data:

import requests
from bs4 import BeautifulSoup

r = requests.get("http://www.game-debate.com/games/index.php?g_id=21580&game=000%20Plus").content

soup = BeautifulSoup(r, "lxml")
# The RAM requirement sits in its own container.
cont = soup.select_one("div.systemRequirementsRamContent")
ram = cont.select_one("span")
print(ram["title"], ram.text)
# The remaining requirements share the two "SmallBox" classes.
for span in soup.select("div.systemRequirementsSmallerBox.sysReqGameSmallBox span"):
    print(span["title"], span.text)

Which will give you:

000 Plus Minimum RAM Requirement 1 GB
000 Plus Minimum Operating System Requirement Win Xp 32
000 Plus Minimum Direct X Requirement DX 9
000 Plus Minimum Hard Disk Drive Space Requirement 500 MB
000 Plus GD Adjusted Operating System Requirement Win Xp 32
000 Plus GD Adjusted Direct X Requirement DX 9
000 Plus GD Adjusted Hard Disk Drive Space Requirement 500 MB
000 Plus Recommended Operating System Requirement Win Xp 32
000 Plus Recommended Hard Disk Drive Space Requirement 500 MB
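
If the title/value pairs are needed for further processing rather than printing, they can be collected into a dictionary. This is a minimal offline sketch: the inline HTML is a hand-written stand-in that assumes the same class names as the page above, and the name `requirements` is illustrative.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the live page (assumption: same class names).
html = """
<div class="systemRequirementsRamContent">
  <span title="000 Plus Minimum RAM Requirement">1 GB</span>
</div>
<div class="systemRequirementsSmallerBox sysReqGameSmallBox">
  <span title="000 Plus Minimum Direct X Requirement">DX 9</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Map each span's title attribute to its visible text.
requirements = {}
for span in soup.select("span[title]"):
    requirements[span["title"]] = span.get_text(strip=True)

print(requirements)
```

Selecting on span[title] skips any span without a title attribute, so the span["title"] lookup inside the loop cannot raise a KeyError.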

How to get text from span tag in BeautifulSoup in loop

In your code, product_name_ref is always the same value, because you are selecting from soup, not from caption.

To get the desired information, you can use this example:

from bs4 import BeautifulSoup


txt = '''
<div class="product-details">
   <h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup</h2>
   <span class="reference-number">REF NO. A1400.5</span>
</div>

<div class="product-details">
   <h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup 2</h2>
   <span class="reference-number">REF NO. A1400.5 2</span>
</div>
'''

soup = BeautifulSoup(txt, 'html.parser')

product_new = []
product_ref = []

for product in soup.select('div.product-details'):
    product_new.append(product.h2.get_text(strip=True))
    product_ref.append(product.select_one('span.reference-number').get_text(strip=True))

print(product_new)
print(product_ref)

Prints:

['Weekly Roundup', 'Weekly Roundup 2']
['REF NO. A1400.5', 'REF NO. A1400.5 2']
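
To see why searching from soup inside the loop repeats the same value, it helps to run both lookups side by side. This sketch reuses the two-product markup from the example; the list names `from_soup` and `from_product` are illustrative only.

```python
from bs4 import BeautifulSoup

txt = '''
<div class="product-details">
   <h2 class="product-name">Weekly Roundup</h2>
   <span class="reference-number">REF NO. A1400.5</span>
</div>
<div class="product-details">
   <h2 class="product-name">Weekly Roundup 2</h2>
   <span class="reference-number">REF NO. A1400.5 2</span>
</div>
'''
soup = BeautifulSoup(txt, 'html.parser')

from_soup = []
from_product = []
for product in soup.select('div.product-details'):
    # Searching the whole document always returns the first match.
    from_soup.append(
        soup.select_one('span.reference-number').get_text(strip=True))
    # Searching the current element returns that product's own span.
    from_product.append(
        product.select_one('span.reference-number').get_text(strip=True))

print(from_soup)
print(from_product)
```

The first list contains the first product's reference twice, while the second list contains one reference per product.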

EDIT: To skip products that are missing a name or a reference number:

product_new = []
product_ref = []

for product in soup.select('div.product-details'):
    n = product.h2
    r = product.select_one('span.reference-number')

    if n and r:
        product_new.append(n.get_text(strip=True))
        product_ref.append(r.get_text(strip=True))

print(product_new)
print(product_ref)

EDIT 2: To keep only the code at the end of the reference number:

from bs4 import BeautifulSoup


txt = '''
<div class="product-details">
   <h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup</h2>
   <span class="reference-number">REF NO. A1400.5</span>
</div>

<div class="product-details">
   <h2 class="product-name" title=" Weekly Roundup"> Weekly Roundup 2</h2>
   <span class="reference-number">REF NO. A1400.6</span>
</div>
'''

soup = BeautifulSoup(txt, 'html.parser')

product_new = []
product_ref = []

for product in soup.select('div.product-details'):
    n = product.h2
    r = product.select_one('span.reference-number')

    if n and r:
        product_new.append(n.get_text(strip=True))
        product_ref.append(r.get_text(strip=True).rsplit(maxsplit=1)[-1])

print(product_new)
print(product_ref)

Prints:

['Weekly Roundup', 'Weekly Roundup 2']
['A1400.5', 'A1400.6']
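
The rsplit(maxsplit=1)[-1] call keeps whatever follows the last run of whitespace. The sketch below shows that behaviour on plain strings, next to a regular-expression alternative. The pattern is an assumption about how the references are formatted ("REF NO." followed by a code), not something the original page guarantees.

```python
import re

def last_token(ref):
    # Keep only the text after the last run of whitespace.
    return ref.rsplit(maxsplit=1)[-1]

def code_after_prefix(ref):
    # Assumed format: "REF NO." followed by a code without spaces.
    match = re.search(r"REF NO\.\s*(\S+)", ref)
    return match.group(1) if match else None

simple = "REF NO. A1400.6"
trailing = "REF NO. A1400.5 2"

print(last_token(simple), code_after_prefix(simple))
print(last_token(trailing), code_after_prefix(trailing))
```

The two approaches agree on "REF NO. A1400.6" but differ when something follows the code: for "REF NO. A1400.5 2" the rsplit version returns the trailing "2", whereas the pattern returns "A1400.5".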

EDIT 3: To print the names and references side by side:

for a, b in zip(product_new, product_ref):
    print('{:<30} {}'.format(a, b))

Prints:

Weekly Roundup                 A1400.5
Weekly Roundup 2               A1400.6

Get the text from the nested span tag with BeautifulSoup

This will work fine:

from bs4 import BeautifulSoup
s = '''<span class="value">401<span class="Suffix">st</span></span>'''
soup = BeautifulSoup(s, 'html.parser')
get_text = soup.find(class_='value')
print(get_text.contents[0])

Output:

401
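
In the snippet above, contents[0] is the outer span's first child, the string 401. That relies on the text coming before the nested span. A slightly more defensive sketch, assuming the same markup, asks only for the outer tag's direct string children and compares the result with get_text(), which includes the nested span as well. The names `outer`, `own_text`, and `full_text` are illustrative.

```python
from bs4 import BeautifulSoup

s = '<span class="value">401<span class="Suffix">st</span></span>'
soup = BeautifulSoup(s, 'html.parser')
outer = soup.find(class_='value')

# First string that is a direct child of the outer span (nested tags skipped).
own_text = outer.find(string=True, recursive=False)

# All text inside the outer span, nested span included.
full_text = outer.get_text()

print(own_text)
print(full_text)
```

Use the first form when only the number is wanted and the second when the suffix should be kept.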

Can't get text from span with BeautifulSoup

The "problem" is the pc_color CSS class. When you load the page, you need to specify which version of the page you want (PS4/XBOX/PC). This is done with the "platform" cookie (alternatively, you can search for ps4_color instead of pc_color, for example):

import requests
from bs4 import BeautifulSoup

url = "https://www.futbin.com/players"
cookies = {"platform": "pc"}
soup = BeautifulSoup(requests.get(url, cookies=cookies).content, "html.parser")


result_prices_pc = soup.find_all(
    "span", attrs={"class": "pc_color font-weight-bold"}
)

for price in result_prices_pc:
    print(price.text)

Prints:

0 
1.15M 
3.75M 
1.7M 
4.19M 
1.81M 
351.65K 
0 
1.66M 
98K 
1.16M 
3M 
775K 
99K 
1.62M 
187K 
280K 
245K 
220K 
1.03M 
395K 
100K 
185K 
864.2K 
0 
1.95M 
540K 
0 
0 
89K
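
One detail of the lookup above is worth spelling out: a class value containing a space is compared against the whole class attribute, so the class order in the markup matters. The offline sketch below contrasts that with a CSS selector, which matches the classes in any order. The inline HTML is a hand-written stand-in with sample values, not the live page.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in markup (assumption: same class names as the page).
html = """
<span class="pc_color font-weight-bold">1.15M</span>
<span class="font-weight-bold pc_color">3.75M</span>
<span class="ps4_color font-weight-bold">0</span>
"""
soup = BeautifulSoup(html, "html.parser")

# Matches only when the class attribute is exactly this string.
exact = soup.find_all("span", attrs={"class": "pc_color font-weight-bold"})

# Matches any span carrying both classes, regardless of their order.
any_order = soup.select("span.pc_color.font-weight-bold")

print([s.text for s in exact])
print([s.text for s in any_order])
```

If the site ever reorders or adds classes on these spans, the selector form keeps working while the exact-string form silently returns nothing.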
