Angus Dippenaar
2a7d08722d
Initialize PyQuery with lxml
...
With XML sites, PyQuery has the same issue with Unicode-encoded strings that LXML does, because it uses LXML to parse the page.
The fix has already been applied to LXML, so we can fix PyQuery as well by passing it the already-parsed LXML tree.
2018-04-14 21:32:00 +02:00
Angus Dippenaar
c21f0784cd
Create LXML from raw_html
...
Create LXML from `self.raw_html` instead of `self.html` to allow LXML to process plain XML pages as per beda42's findings in issue https://github.com/kennethreitz/requests-html/issues/145
I have tested this change with 200 sites and it seems to fix the issue. HTML pages all seem to work as expected; I haven't run into an issue with any that I tested.
2018-04-05 13:47:39 +02:00
kennethreitz
122b42a144
cleanup
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-21 07:46:27 -04:00
Ordanis Sanchez
a2cc6bfa55
Update HTML.render to use session.browser and close pages automatically
2018-03-20 19:50:04 -04:00
Ordanis Sanchez
9a53202ce5
Extend session close method to shutdown browser
2018-03-20 19:20:20 -04:00
Ordanis Sanchez
c279bd3d63
Add browser obj to HTMLSession
2018-03-20 18:47:06 -04:00
kennethreitz
ef67e9f96f
Merge pull request #141 from oldani/bugfix/issue_135
...
Catch TypeError on render, add MaxRetries exception
2018-03-20 17:11:53 -04:00
Ordanis Sanchez
ff95aded81
Catch TypeError on render, add MaxRetries exception
2018-03-16 12:02:03 -04:00
Ordanis Sanchez
9b21faf291
Update Sessions classes to be passed down to HTML class
2018-03-14 10:31:36 -04:00
Ordanis Sanchez
a79e5479de
Move next method from BaseParser to HTML class
2018-03-14 10:16:40 -04:00
bonfy
76f2f6434c
Add func add_next_symbol to make it possible to append words to the default next-page symbols
2018-03-13 11:11:07 +08:00
kennethreitz
6f8b676ac3
Merge branch 'master' of github.com:kennethreitz/requests-html
2018-03-11 09:44:55 -04:00
shaunpud
d55bcfb34f
Shorten
2018-03-11 19:53:54 +08:00
kennethreitz
bcb0881d15
Merge pull request #126 from frostming/bugfix/links
...
Fix bugs related to links
2018-03-11 07:37:20 -04:00
Frost Ming
af97ddd5f1
Fix bugs related to links
...
* #121 KeyError of special base tag
* #124 Remove 'mailto:' links out from links
2018-03-11 16:26:58 +08:00
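The two link fixes above can be illustrated with a minimal sketch. It is not the code from this commit: the helper names `base_href` and `drop_mailto` are hypothetical, and the only assumption is that a `<base>` tag may lack an `href` attribute and that `mailto:` entries should not appear among page links.

```python
from urllib.parse import urlsplit


def base_href(base_attrs, default):
    """Return the href of a <base> tag, or `default` if it has none.

    Hypothetical helper: using .get() avoids the KeyError that
    base_attrs['href'] raises for a tag such as <base target="_blank">.
    """
    href = base_attrs.get("href", "").strip()
    return href or default


def drop_mailto(hrefs):
    """Return the hrefs whose scheme is not 'mailto', preserving order."""
    return [h for h in hrefs if urlsplit(h.strip()).scheme != "mailto"]


page_url = "https://example.org/docs/"
resolved_base = base_href({"target": "_blank"}, page_url)
filtered = drop_mailto(
    ["https://example.org/", "mailto:someone@example.org", "/about"]
)
```

Here `resolved_base` falls back to the page URL because the sample tag has no `href`, and `filtered` keeps only the two navigable links.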
miyakogi
dc932571ee
Pyppeteer's api has been changed
...
Today I released a new version of pyppeteer (0.0.13).
In that release, `pyppeteer.launch` was changed to a coroutine function.
2018-03-10 14:57:07 +09:00
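What this change means for callers can be sketched with the standard library alone. The `launch` function below is a hypothetical stand-in, not pyppeteer's; it only shows that once a function becomes a coroutine function, calling it returns a coroutine object that must be awaited to obtain the result.

```python
import asyncio


async def launch():
    """Hypothetical stand-in for a launcher that is now a coroutine function."""
    await asyncio.sleep(0)  # placeholder for real asynchronous start-up work
    return "browser"


async def main():
    # Calling launch() without await only creates a coroutine object.
    pending = launch()
    is_coro = asyncio.iscoroutine(pending)
    # Awaiting the coroutine runs it and yields the actual return value.
    browser = await pending
    return is_coro, browser


is_coro, browser = asyncio.run(main())
```

Code that previously used the return value directly therefore needs an `await` (or an event-loop call such as `run_until_complete`) around the call.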
kennethreitz
d9ee89eaf4
Merge branch 'master' of github.com:kennethreitz/requests-html
2018-03-09 10:42:08 -05:00
kennethreitz
3a5a94eb85
cleaning
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-09 10:42:04 -05:00
Andrew Gorcester
14da46f03d
Add tests for ._make_absolute() and make them pass.
2018-03-06 16:06:23 -08:00
kennethreitz
89c001a02e
Merge branch 'master' of github.com:kennethreitz/requests-html
2018-03-06 11:45:44 -05:00
kennethreitz
6ab1aff41c
Merge pull request #101 from oldani/feature/async_support
...
Feature/async support
2018-03-06 11:45:19 -05:00
Viberring
b386dd36f3
fix repeated slot attribute
2018-03-06 19:23:46 +08:00
Ordanis Sanchez
c7ba3c17cf
Add loop and workers params to AsyncHTMLSession
2018-03-05 15:58:18 -04:00
Ordanis Sanchez
ea05c69fe5
Add HTMLResponse hook and mock_browser param
2018-03-05 15:58:18 -04:00
Ordanis Sanchez
23d81af0ef
Add AsyncHTMLSession
2018-03-05 15:58:18 -04:00
kennethreitz
34dcd78ba1
fix fix
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 15:28:04 -05:00
kennethreitz
e5a7be391b
fix
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 15:26:38 -05:00
kennethreitz
72a7e0be69
Merge branch 'master' of github.com:kennethreitz/requests-html
2018-03-04 15:23:54 -05:00
kennethreitz
7ac299f527
Merge branch 'master' of github.com:kennethreitz/requests-html
2018-03-04 15:01:13 -05:00
kennethreitz
e87ff8700c
self.page = None
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 14:58:46 -05:00
kennethreitz
588c4e69cc
#99
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 14:58:19 -05:00
kennethreitz
4ecd13163c
#99
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-04 14:57:35 -05:00
Prodesire
98d19d73f1
When calling element.attrs, use the cache if available
2018-03-04 23:05:15 +08:00
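The caching idea in this commit can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: `CachedElement`, `_raw_items`, and the `builds` counter are hypothetical names used only to show a property that builds its dictionary once and reuses it afterwards.

```python
class CachedElement:
    """Hypothetical element whose attrs dict is built once, then reused."""

    def __init__(self, raw_items):
        self._raw_items = raw_items  # (name, value) pairs from the parser
        self._attrs = None           # cache, filled on first access
        self.builds = 0              # counts how often the dict is built

    @property
    def attrs(self):
        # Build the dictionary only if no cached copy exists yet.
        if self._attrs is None:
            self.builds += 1
            self._attrs = dict(self._raw_items)
        return self._attrs


el = CachedElement([("class", "a b"), ("id", "main")])
first = el.attrs
second = el.attrs
```

Both accesses return the same dictionary object, so repeated lookups avoid rebuilding it.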
yech
b1e353d1f1
add args '--no-sandbox'
...
add args '--no-sandbox'
2018-03-04 14:25:13 +08:00
kennethreitz
90de9b7ac5
improvements
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 10:05:22 -05:00
kennethreitz
3f38a495c0
improvements
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 09:51:12 -05:00
kennethreitz
38f692e5bd
.next()
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 09:47:18 -05:00
kennethreitz
90d9bbbc0f
better tests
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 08:27:02 -05:00
kennethreitz
6e1938e588
improvements
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 08:24:31 -05:00
kennethreitz
8af172c5ce
vast improvements
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 08:17:42 -05:00
kennethreitz
cc2632218d
use fstrings
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-03 07:37:42 -05:00
kennethreitz
0d875eb536
more accurate description
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:54:48 -05:00
kennethreitz
e509dff888
fixes
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:53:04 -05:00
kennethreitz
63605e37bc
fixes
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:52:49 -05:00
kennethreitz
1a286a7919
enhancements
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:40:33 -05:00
kennethreitz
4b7871267e
much better approach to #57
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:37:15 -05:00
kennethreitz
7c4de5ed4c
evaluate javascript on page
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 12:33:26 -05:00
kennethreitz
9b2727ddf2
Merge pull request #71 from isudox/feature/optimization
...
Add typing hints, and remove unnecessary code for constructor.
2018-03-01 10:29:55 -05:00
isudox
8fab96cdff
Reset my careless changes.
2018-03-01 22:05:47 +08:00
kennethreitz
64e10e4a37
docs
...
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-01 08:24:37 -05:00