Commit Graph

182 Commits

Author SHA1 Message Date
Angus Dippenaar 2a7d08722d Initialize PyQuery with lxml
PyQuery has the same issue with XML sites that LXML has with Unicode-encoded strings, because it uses LXML to parse the page.
The fix has already been applied to LXML, so we can fix PyQuery by passing it the already-parsed LXML tree.
2018-04-14 21:32:00 +02:00
Angus Dippenaar c21f0784cd Create LXML from raw_html
Create LXML from `self.raw_html` instead of `self.html` to allow LXML to process plain XML pages as per beda42's findings in issue https://github.com/kennethreitz/requests-html/issues/145

I have tested this change with 200 sites and it seems to fix the issue. HTML pages seem to all be working as expected. I haven't run into an issue with any that I've tested.
2018-04-05 13:47:39 +02:00
kennethreitz 122b42a144 cleanup
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-21 07:46:27 -04:00
Ordanis Sanchez a2cc6bfa55 Update HTML.render to use session.browser and close pages automatically 2018-03-20 19:50:04 -04:00
Ordanis Sanchez 9a53202ce5 Extend session close method to shutdown browser 2018-03-20 19:20:20 -04:00
Ordanis Sanchez c279bd3d63 Add browser obj to HTMLSession 2018-03-20 18:47:06 -04:00
kennethreitz ef67e9f96f Merge pull request #141 from oldani/bugfix/issue_135
Catch TypeError on render, add MaxRetries exception
2018-03-20 17:11:53 -04:00
Ordanis Sanchez ff95aded81 Catch TypeError on render, add MaxRetries exception 2018-03-16 12:02:03 -04:00
Ordanis Sanchez 9b21faf291 Update Sessions classes to be passed down to HTML class 2018-03-14 10:31:36 -04:00
Ordanis Sanchez a79e5479de Move next method from BaseParser to HTML class 2018-03-14 10:16:40 -04:00
bonfy 76f2f6434c Add func add_next_symbol to make it possible to append words to the default next-page symbols 2018-03-13 11:11:07 +08:00
kennethreitz 6f8b676ac3 Merge branch 'master' of github.com:kennethreitz/requests-html 2018-03-11 09:44:55 -04:00
shaunpud d55bcfb34f Shorten 2018-03-11 19:53:54 +08:00
kennethreitz bcb0881d15 Merge pull request #126 from frostming/bugfix/links
Fix bugs related to links
2018-03-11 07:37:20 -04:00
Frost Ming af97ddd5f1 Fix bugs related to links
* #121 KeyError of special base tag
* #124 Remove 'mailto:' links from links
2018-03-11 16:26:58 +08:00
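
The second fix listed in the commit above (#124) can be illustrated with a minimal sketch. This is not the code from the commit: the helper name `drop_mailto` and the sample hrefs are hypothetical, and the sketch only assumes that links arrive as plain strings.

```python
from urllib.parse import urlparse


def drop_mailto(hrefs):
    """Return hrefs whose URL scheme is not 'mailto' (hypothetical helper)."""
    # urlparse lowercases the scheme, so 'MAILTO:' is caught as well.
    return [h for h in hrefs if urlparse(h.strip()).scheme != "mailto"]


# Illustrative input only; none of these values come from the repository.
sample = ["https://example.com/a", "mailto:me@example.com", "/about", "MAILTO:x@example.com"]
kept = drop_mailto(sample)
```

Comparing the parsed scheme rather than testing for a string prefix keeps the check insensitive to case and surrounding whitespace.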
miyakogi dc932571ee Pyppeteer's api has been changed
Today I released a new version of pyppeteer (0.0.13).
In that release, `pyppeteer.launch` was changed to a coroutine function.
2018-03-10 14:57:07 +09:00
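
What the change described above means for callers can be sketched without pyppeteer itself. The `launch` function below is a hypothetical stand-in, not the real `pyppeteer.launch`, and `launch_blocking` is an assumed helper name; the point is only that a coroutine function must be awaited or driven by an event loop before it yields a result.

```python
import asyncio


async def launch(**options):
    """Hypothetical stand-in for a launch function that is now a coroutine."""
    await asyncio.sleep(0)  # placeholder for starting a browser process
    return {"options": options, "running": True}


def launch_blocking(**options):
    """Run the coroutine to completion from synchronous code."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(launch(**options))
    finally:
        loop.close()


# Calling launch() directly would only create a coroutine object;
# the event loop has to run it to produce the result.
browser = launch_blocking(headless=True)
```

Synchronous callers need a wrapper of this kind, while code that is already asynchronous can simply `await launch(...)`.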
kennethreitz d9ee89eaf4 Merge branch 'master' of github.com:kennethreitz/requests-html 2018-03-09 10:42:08 -05:00
kennethreitz 3a5a94eb85 cleaning
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-09 10:42:04 -05:00
Andrew Gorcester 14da46f03d Add tests for ._make_absolute() and make them pass. 2018-03-06 16:06:23 -08:00
kennethreitz 89c001a02e Merge branch 'master' of github.com:kennethreitz/requests-html 2018-03-06 11:45:44 -05:00
kennethreitz 6ab1aff41c Merge pull request #101 from oldani/feature/async_support
Feature/async support
2018-03-06 11:45:19 -05:00
Viberring b386dd36f3 fix repeated slot attribute 2018-03-06 19:23:46 +08:00
Ordanis Sanchez c7ba3c17cf Add loop and workers params to AsyncHTMLSession 2018-03-05 15:58:18 -04:00
Ordanis Sanchez ea05c69fe5 Add HTMLResponse hook and mock_browser param 2018-03-05 15:58:18 -04:00
Ordanis Sanchez 23d81af0ef Add AsyncHTMLSession 2018-03-05 15:58:18 -04:00
kennethreitz 34dcd78ba1 fix fix
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 15:28:04 -05:00
kennethreitz e5a7be391b fix
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 15:26:38 -05:00
kennethreitz 72a7e0be69 Merge branch 'master' of github.com:kennethreitz/requests-html
2018-03-04 15:23:54 -05:00
kennethreitz 7ac299f527 Merge branch 'master' of github.com:kennethreitz/requests-html 2018-03-04 15:01:13 -05:00
kennethreitz e87ff8700c self.page = None
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 14:58:46 -05:00
kennethreitz 588c4e69cc #99
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 14:58:19 -05:00
kennethreitz 4ecd13163c #99
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 14:57:35 -05:00
Prodesire 98d19d73f1 when calling element.attrs, using cache if available 2018-03-04 23:05:15 +08:00
yech b1e353d1f1 add args '--no-sandbox'
2018-03-04 14:25:13 +08:00
kennethreitz 90de9b7ac5 improvements
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 10:05:22 -05:00
kennethreitz 3f38a495c0 improvements
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 09:51:12 -05:00
kennethreitz 38f692e5bd .next()
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 09:47:18 -05:00
kennethreitz 90d9bbbc0f better tests
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 08:27:02 -05:00
kennethreitz 6e1938e588 improvements
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 08:24:31 -05:00
kennethreitz 8af172c5ce vast improvements
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 08:17:42 -05:00
kennethreitz cc2632218d use fstrings
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 07:37:42 -05:00
kennethreitz 0d875eb536 more accurate description
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:54:48 -05:00
kennethreitz e509dff888 fixes
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:53:04 -05:00
kennethreitz 63605e37bc fixes
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:52:49 -05:00
kennethreitz 1a286a7919 enhancements
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:40:33 -05:00
kennethreitz 4b7871267e much better approach to #57
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:37:15 -05:00
kennethreitz 7c4de5ed4c evaluate javascript on page
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:33:26 -05:00
kennethreitz 9b2727ddf2 Merge pull request #71 from isudox/feature/optimization
Add typing hints, and remove unnecessary code for constructor.
2018-03-01 10:29:55 -05:00
isudox 8fab96cdff Reset my careless changes. 2018-03-01 22:05:47 +08:00
kennethreitz 64e10e4a37 docs
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 08:24:37 -05:00