439 Commits

Author SHA1 Message Date
Ordanis Sanchez 69dd1cc77f Add asyncsession.run method 2018-09-18 16:59:14 -04:00
Ordanis Sanchez 09c7b683cc Fix r.html.next() for next url 2018-09-18 16:59:14 -04:00
Ordanis Sanchez fc1fabd8dc Fix HTML class to use async iter and render on bare mode 2018-09-18 16:59:14 -04:00
Ordanis Sanchez 85e77d134a Add arender method to HTML 2018-09-18 16:56:55 -04:00
Ordanis Sanchez c12d7c6aca Add async iterator to HTML class 2018-09-18 16:49:22 -04:00
Ordanis Sanchez dd05a02de7 Add HTMLSession.browser runtime exception, AsyncSession an async close method 2018-09-18 16:49:22 -04:00
Ordanis Sanchez 2e460d93c3 Create a base session 2018-09-18 16:49:22 -04:00
Ordanis Sanchez 9cef8a06b9 Fix merge errors on HTMLSession 2018-09-18 16:37:31 -04:00
kennethreitz aba5c8cbb5 Merge pull request #212 from pigna90/add-ignoreHTTPSError-parameter
Added ignoreHTTPErrors parameter
2018-09-18 12:00:00 -04:00
Alessandro Romano 52ddd80824 Merge branch 'master' into add-ignoreHTTPSError-parameter 2018-09-18 17:31:17 +02:00
kennethreitz 7353466355 Merge pull request #153 from clarksun/patch-1
Automatically?
2018-09-18 02:56:27 -04:00
kennethreitz 51afd9e474 Merge pull request #157 from CodeMogul/fixes
Made basic edits
2018-09-18 02:56:12 -04:00
kennethreitz 29acbaabc7 Merge pull request #160 from SN9NV/patch-1
Create LXML from raw_html
2018-09-18 02:54:05 -04:00
kennethreitz f760df2be2 Merge pull request #162 from SN9NV/patch-2
Replace errors when decoding raw_html
2018-09-18 02:52:37 -04:00
kennethreitz a693658f21 Merge pull request #163 from Norbinsh/master
Minor typo fix
2018-09-18 02:52:00 -04:00
kennethreitz 625910e1a5 Merge branch 'master' into master 2018-09-18 02:51:54 -04:00
kennethreitz c47fd127b1 Merge pull request #176 from wasabigeek/patch-1
Docs: Add note to install Linux packages
2018-09-18 02:47:34 -04:00
kennethreitz c6f6858ea0 Merge pull request #200 from meetmangukiya/patch-1
requests_html.py: Typo HTTPSession -> HTMLSession
2018-09-18 02:44:41 -04:00
kennethreitz e37b40e59f Merge pull request #189 from carrionc/patch-2
Multiple chromium tab fix
2018-09-18 02:42:54 -04:00
kennethreitz e05933acfc Merge pull request #191 from montenegrodr/patch-1
fix: typo
2018-09-18 02:42:43 -04:00
kennethreitz 5de8c88a1f Merge pull request #203 from rachmadaniHaryono/master
exclude html files for github linguist
2018-09-18 02:42:34 -04:00
kennethreitz 5d7c859975 Merge pull request #193 from pennyarcade/master
Update requests_html.py
2018-09-18 02:41:02 -04:00
kennethreitz 16c0dbe13d Merge pull request #201 from timotk/patch-1
Fix minor typo
2018-09-18 02:40:16 -04:00
kennethreitz 87b183c7fc Merge pull request #205 from leven-cn/develop
Add "tag" attribute for Element objects
2018-09-18 02:40:06 -04:00
kennethreitz 95a113cfc3 Merge pull request #217 from m9mhmdy/master
Fix a small typo
2018-09-18 02:37:40 -04:00
kennethreitz ee34f3f9ea Update README.rst 2018-09-17 08:07:49 -04:00
kennethreitz d159d2045a Update README.rst 2018-09-17 08:07:22 -04:00
kennethreitz 6190a47eef Update README.rst 2018-09-17 08:06:43 -04:00
m9mhmdy cb6e5fb557 Fix a small typo 2018-08-30 16:25:25 +02:00
Alessandro Romano b1a7acf33a Added ignoreHTTPErrors parameter 2018-08-09 15:03:15 +02:00
Li Yun 1c21f63672 Add "lineno" attribute for Element object 2018-07-04 11:30:59 +08:00
Li Yun 116a4b08eb Add "tag" attribute for Element object 2018-07-04 11:20:08 +08:00
rachmadaniHaryono 96973aaf4e chg: dev: generic html 2018-07-02 21:49:28 +08:00
rachmadaniHaryono 6fef1d8583 new: dev: ignore html count 2018-07-02 21:44:00 +08:00
Timo 71e2571d3a Fix minor typo
lopp -> loop in init docstring of AsyncHTMLSession.
2018-06-25 13:18:57 +02:00
Meet Mangukiya 4db2931ddc requests_html.py: Typo HTTPSession -> HTMLSession 2018-06-24 22:23:47 +05:30
Martin Rotwang 96dbba8fbd Update requests_html.py
e.g. to add a proxy setting
usage: s=Session(browser_args=['--no-sandbox', '--proxy-server=127.0.0.1:9876'])
@see: https://github.com/GoogleChrome/puppeteer/issues/336
2018-06-05 12:39:46 +02:00
Robson D. Montenegro a1c5e6ac8b fix: typo 2018-06-03 21:52:51 +01:00
carrionc 956e60054c Multiple chromium tab fix
Within the render function, the page is rendered through the _async_render function. This function will try to render content by first creating a page, and currently will only close said page if the content is generated. However, if at any point there's a timeout beforehand, the current page isn't closed, and instead _async_render will be called again [as per the # assigned to retries in render()] and end up leaving behind an unused page. This change will enable render to close the "failed" attempt BEFORE opening a new page to try again, and should fix the issue of massive cpu buildup with multiple chromium instances. Sorry if this is messy, it's my first time using git to make a change.
2018-05-30 00:40:37 -04:00
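The "Multiple chromium tab fix" entry above describes closing the page from a failed attempt before retrying. A minimal sketch of that pattern follows, assuming a hypothetical `FakePage` stand-in and a hypothetical `render_with_retries` helper. Neither is the requests-html or pyppeteer API, and this is not the code from the commit itself.

```python
import asyncio


class FakePage:
    """Hypothetical stand-in for a browser tab (not pyppeteer's Page class)."""

    open_pages = 0  # number of tabs currently open

    def __init__(self, fail):
        self.fail = fail
        FakePage.open_pages += 1

    async def load(self):
        # Simulate a render that either times out or returns content.
        if self.fail:
            raise asyncio.TimeoutError("simulated timeout")
        return "<html>ok</html>"

    async def close(self):
        FakePage.open_pages -= 1


async def render_with_retries(outcomes):
    """Make one attempt per entry in `outcomes`; True means that attempt times out."""
    for fail in outcomes:
        page = FakePage(fail)
        try:
            return await page.load()
        except asyncio.TimeoutError:
            continue  # try again with a fresh page
        finally:
            # Runs on success and on timeout, so no tab is left behind.
            await page.close()
    return None
```

Calling `asyncio.run(render_with_retries([True, True, False]))` makes two failing attempts and one successful one. Because the close happens in `finally`, `FakePage.open_pages` is back at zero afterwards.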
Nicholas 81998d84c4 Docs: Add note to install Linux packages
I ran into `pyppeteer.errors.BrowserError: Failed to connect to browser port:` and after a bit of snooping found that some Linux packages needed to be installed on my machine for pyppeteer to run. Suggest to add a note to save others time!
2018-05-07 01:32:59 +08:00
Angus Dippenaar 2a7d08722d Initialize PyQuery with lxml
PyQuery with XML sites also has the same issue that LXML does with unicode encoded strings because it uses LXML to parse the page.
The fix has already been applied to LXML, so we can fix the issue with PyQuery by passing the already parsed LXML into PyQuery.
2018-04-14 21:32:00 +02:00
Shay Elmualem 50c9058d04 Minor typo fix 2018-04-07 22:18:46 +03:00
Angus Dippenaar 05ff6e87ca Replace errors when decoding raw_html
Some websites don't have valid bytes, even when the encoding is specified. I'm not 100% sure if replacing "bad" bytes is the correct way to fix the problem. It seems to fix the issues I've run into with some sites.
2018-04-07 17:15:51 +02:00
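The "Replace errors when decoding raw_html" entry above relies on substituting invalid bytes rather than failing. The standard-library behaviour it refers to can be sketched as below. The helper name `decode_html` is an assumption for illustration and is not taken from requests-html.

```python
def decode_html(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes, substituting U+FFFD for any byte that is invalid in `encoding`."""
    return raw.decode(encoding, errors="replace")


# 0xE9 is 'é' in Latin-1 but is not valid UTF-8 when followed by a space.
raw = b"caf\xe9 <b>ok</b>"

try:
    raw.decode("utf-8")  # strict decoding raises on the bad byte
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

text = decode_html(raw)  # the bad byte becomes the replacement character
```

The trade-off is the one the commit author mentions: the page decodes without an exception, but the original byte is lost and appears as the replacement character.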
Angus Dippenaar c21f0784cd Create LXML from raw_html
Create LXML from `self.raw_html` instead of `self.html` to allow LXML to process plain XML pages as per beda42's findings in issue https://github.com/kennethreitz/requests-html/issues/145

I have tested this change with 200 sites and it seems to fix the issue. HTML pages seem to all be working as expected. I haven't run into an issue with any that I've tested.
2018-04-05 13:47:39 +02:00
Siddhesh Nachane cb55034b42 Made basic fixes
1. Corrected Comments and DocStrings Spell Errors.
2. Added .vscode folder to .gitignore
3. Replaced `i` with place holder `_` (as i is never used)
2018-03-31 22:51:34 +05:30
Sun Wei 132e5cb522 Automatically? 2018-03-26 17:57:50 +08:00
kennethreitz c59480bf15 v0.9.0
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
v0.9.0
2018-03-21 07:50:17 -04:00
kennethreitz ad0ded932a next version
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
v0.8.4
2018-03-21 07:47:28 -04:00
kennethreitz 550d79f4c0 pipfile
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-21 07:47:15 -04:00
kennethreitz 122b42a144 cleanup
Signed-off-by: Kenneth Reitz <me@kennethreitz.org>
2018-03-21 07:46:27 -04:00