lxml error IOError: Error reading file when parsing facebook mobile in a python scraper script
This is your problem:
tree = etree.parse(body)
The documentation says that "source is a filename or file object containing XML data." You have provided a string, so lxml is taking the text of your HTTP response body as the name of the file you wish to open. No such file exists, so you get an IOError.
The error message you get even says "Error reading file" and then gives your XML string as the name of the file it's trying to read, which is a mighty big hint about what's going on.
You probably want etree.XML(), which takes input from a string. Or you could just do tree = etree.parse(res) to read directly from the HTTP response into lxml (the result of opener.open() is a file-like object, and etree.parse() should be perfectly happy to consume it).
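To make the two options concrete, here is a minimal sketch. The XML content is made up for illustration, and io.BytesIO stands in for the file-like object that opener.open() returns, so nothing here touches the network.

```python
import io
from lxml import etree

# Made-up XML standing in for the HTTP response body (assumption).
body = b"<root><item>hello</item></root>"

# Option 1: parse the data directly from a string/bytes value.
# etree.XML returns the root element.
root = etree.XML(body)

# Option 2: give etree.parse a file-like object.
# BytesIO plays the role of the object returned by opener.open().
res = io.BytesIO(body)
tree = etree.parse(res)

print(root.tag)
print(tree.getroot().tag)
```

Both calls read the same document; the only difference is that etree.XML hands back the root element while etree.parse hands back the whole tree.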
IOError passing requests Response.content to lxml.etree.parse()
etree.parse expects a filename, a file-like object, or a URL as its first argument (see help(etree.parse)). It does not expect an XML string. To parse an XML string use
xmlObject = etree.fromstring(r.content)
Note that etree.fromstring returns an lxml.etree._Element. In contrast, etree.parse returns an lxml.etree._ElementTree. Given the _Element, you can obtain the _ElementTree with the getroottree method:
xmlTree = xmlObject.getroottree()
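Put together, the two steps might look like the sketch below. The XML bytes are invented for illustration and take the place of r.content, so the example runs without making a request.

```python
from lxml import etree

# Invented XML standing in for r.content (assumption).
xml_bytes = b"<feed><entry>one</entry><entry>two</entry></feed>"

# fromstring parses the data itself and returns the root element.
xmlObject = etree.fromstring(xml_bytes)

# getroottree returns the tree that contains that element.
xmlTree = xmlObject.getroottree()

print(type(xmlObject).__name__)
print(type(xmlTree).__name__)
print([entry.text for entry in xmlTree.getroot().findall("entry")])
```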
Multithreaded lxml scraper executes without any error or output
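Try changing the condition in your last if statement to if __name__ == '__main__' instead of '__name__'. A comparison against '__name__' never succeeds, so the guarded code never runs, which would explain why the script finishes without any error or output.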
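A minimal sketch of the corrected guard follows. The main function and its return value are hypothetical placeholders for whatever the scraper actually does.

```python
def main():
    # Hypothetical placeholder for the scraper's real work.
    return "scraper ran"


# __name__ is '__main__' only when the file is run as a script,
# so this block is skipped when the module is imported.
if __name__ == '__main__':
    print(main())
```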