How to Get Rid of the BeautifulSoup User Warning

How to get rid of BeautifulSoup user warning?

The warning message itself tells you the fix. Code like the below does not name a parser, so BeautifulSoup picks the best one installed on the system, and that choice can differ from one machine to another:

BeautifulSoup( ... )

To silence the warning, specify which parser you'd like to use as the second argument, like so:

BeautifulSoup( ..., "html.parser" )

You can also install a third-party parser, such as lxml or html5lib, and pass its name ("lxml" or "html5lib") instead.
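
If you want to prefer a third-party parser but still run on machines where it isn't installed, something like the following sketch may help. The helper name make_soup and the sample markup are made up for this example; BeautifulSoup raises FeatureNotFound when the parser you name is not available, and the helper simply tries the next name.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, parsers=("lxml", "html.parser")):
    """Return (soup, parser_name) using the first parser in `parsers` that is installed.

    make_soup is a hypothetical helper for this example, not part of bs4.
    """
    for name in parsers:
        try:
            # Naming the parser explicitly is what avoids the warning.
            return BeautifulSoup(markup, name), name
        except FeatureNotFound:
            continue  # this parser is not installed; try the next one
    raise RuntimeError("none of the requested parsers is available")


soup, used = make_soup("<p>Hello</p>")  # sample markup, made up for this example
print(used, soup.p.get_text())
```

Since "html.parser" ships with Python, keeping it last in the tuple means the helper should always find something to use. Keep in mind that different parsers can build slightly different trees from the same broken markup.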

BeautifulSoup module error (HTML parser)

You'll have to import BeautifulSoup from the bs4 package:

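import requests
from bs4 import BeautifulSoup  # import from the bs4 package

headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get("https://www.sikayetvar.com/onedio", headers=headers)

# Name the parser explicitly so BeautifulSoup does not have to guess
soup = BeautifulSoup(response.text, "html.parser")
pages = soup.select('div.pagination a')

# Read the text of the second-to-last pagination link as an integer
a = int(pages[-2].text)
print(a)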

lxml / BeautifulSoup parser warning

I had to read lxml's and BeautifulSoup's source code to figure this out.

I'm posting my own answer here, in case someone else may need it in the future.

The fromstring function in question (from lxml.html.soupparser) is defined like so:

def fromstring(data, beautifulsoup=None, makeelement=None, **bsargs):

The **bsargs arguments end up being forwarded to the BeautifulSoup constructor, which is called like so (in another function, _parse):

tree = beautifulsoup(source, **bsargs)

The BeautifulSoup constructor is defined so:

def __init__(self, markup="", features=None, builder=None,
             parse_only=None, from_encoding=None, exclude_encodings=None,
             **kwargs):

Now, back to the warning in the question, which recommends adding the argument "html.parser" to BeautifulSoup's constructor. According to the signature above, that is the argument named features.

Since the fromstring function will pass on named arguments to BeautifulSoup's constructor, we can specify the parser by naming the argument to the fromstring function, like so:

root = fromstring(clean, features='html.parser')

Poof. The warning disappears.
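
If the forwarding is hard to picture, here is a minimal sketch of the same mechanism that runs without lxml or BeautifulSoup installed. FakeSoup and this fromstring are hypothetical stand-ins written for this example; they only mimic the relevant parts of the signatures quoted above, and the real lxml function does more work (it converts the parsed tree into lxml elements).

```python
class FakeSoup:
    """Hypothetical stand-in for BeautifulSoup; it only records its arguments."""

    def __init__(self, markup="", features=None, **kwargs):
        self.markup = markup
        self.features = features  # the parser name ends up here


def fromstring(data, beautifulsoup=None, **bsargs):
    """Hypothetical stand-in for lxml's fromstring: forwards **bsargs unchanged."""
    if beautifulsoup is None:
        beautifulsoup = FakeSoup
    # Any extra keyword argument, such as features=..., is passed straight through.
    return beautifulsoup(data, **bsargs)


tree = fromstring("<p>Hello</p>", features="html.parser")
print(tree.features)  # html.parser
```

Because fromstring itself has no parameter called features, the keyword lands in bsargs and reaches the constructor under the same name, which is why naming it at the call site is enough.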


