mirror of
https://github.com/kennethreitz/python-guide.git
synced 2026-06-05 14:50:19 +00:00
use .content in lxml
This can also be thought of as a bug in lxml. It just occurred to me that maybe I should fix that.
@@ -38,7 +38,10 @@ parse it using the ``html`` module and save the results in ``tree``:
 .. code-block:: python
 
     page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
-    tree = html.fromstring(page.text)
+    tree = html.fromstring(page.content)
+
+(We need to use ``page.content`` rather than ``page.text`` because
+``html.fromstring`` implicitly expects ``bytes`` as input.)
 
 ``tree`` now contains the whole HTML file in a nice tree structure which
 we can go over two different ways: XPath and CSSSelect. In this example, we
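
The change from ``page.text`` to ``page.content`` is about handing the parser raw bytes instead of an already-decoded string. The sketch below is a standard-library-only illustration of one plausible reason this matters; it does not call requests or lxml. The sample HTML, the ``raw``, ``guessed_text`` and ``declared_text`` names, and the assumption that the HTTP headers carried no charset (so the text was decoded as ISO-8859-1) are all hypothetical and do not come from the commit.

```python
# Hypothetical response body: UTF-8 encoded HTML that declares its own
# charset in a <meta> tag (a stand-in for what page.content would hold).
raw = (
    '<html><head><meta charset="utf-8"></head>'
    '<body><p>café</p></body></html>'
).encode('utf-8')

# Assumption: the headers named no charset, so the body was decoded as
# ISO-8859-1 before parsing (a stand-in for what page.text would hold).
guessed_text = raw.decode('iso-8859-1')

# A parser that receives the bytes can honour the declared charset instead.
declared_text = raw.decode('utf-8')

print('café' in guessed_text)   # False: the accented character is garbled
print('café' in declared_text)  # True
```

With bytes as input, the decision about encoding stays with the HTML parser, which can read the document's own declaration rather than rely on a guess made earlier.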