public inbox for cygwin@cygwin.com
* pyppeteer error in Python3
From: Dennis Putnam @ 2021-09-23 17:32 UTC (permalink / raw)
  To: cygwin

I'm not sure this is really a Cygwin problem, but I don't know where
else to ask. I'm running a python3 script to extract a web page:

#!/usr/bin/python3

# This script auto-submits Do Not Call complaints

from bs4 import BeautifulSoup
from requests_html import HTMLSession
from urllib.parse import urljoin

print('Starting process')
session=HTMLSession()

def get_all_forms(url):
    """Returns all form tags found on a web page's `url` """
    # GET request
    print("getting page")
    res = session.get(url)
    # for javascript driven website
    print("Running Javascript")
    res.html.render()
    print("parsing url")
    soup = BeautifulSoup(res.html.html, "html.parser")
    return soup.find_all("form")
print(get_all_forms("https://blahblah"))

The result is a traceback when executing 'res.html.render()'.

Traceback (most recent call last):
  File "./donotcall.py", line 23, in <module>
    print(get_all_forms("https://www.donotcall.gov/report.html#step1"))
  File "./donotcall.py", line 19, in get_all_forms
    res.html.render()
  File "/usr/local/lib/python3.8/site-packages/requests_html.py", line 586, in render
    self.browser = self.session.browser  # Automatically create a event loop and browser
  File "/usr/local/lib/python3.8/site-packages/requests_html.py", line 730, in browser
    self._browser = self.loop.run_until_complete(super().browser)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/site-packages/requests_html.py", line 714, in browser
    self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, args=self.__browser_args)
  File "/usr/local/lib/python3.8/site-packages/pyppeteer/launcher.py", line 307, in launch
    return await Launcher(options, **kwargs).launch()
  File "/usr/local/lib/python3.8/site-packages/pyppeteer/launcher.py", line 168, in launch
    self.browserWSEndpoint = get_ws_endpoint(self.url)
  File "/usr/local/lib/python3.8/site-packages/pyppeteer/launcher.py", line 227, in get_ws_endpoint
    raise BrowserError('Browser closed unexpectedly:\n')
pyppeteer.errors.BrowserError: Browser closed unexpectedly:

From what I can find with my searches, it has something to do with
pyppeteer (Chromium) and synchronization. Can someone help me debug
this or point me to a better place to ask? TIA.
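
Since the error message itself carries no detail, one possible first
debugging step is to start the browser binary by hand and capture
whatever it prints. The following is only a rough sketch under stated
assumptions: run_and_capture is a hypothetical helper name (not part
of pyppeteer or requests_html), and the path of the Chromium binary
that pyppeteer launches has to be supplied on the command line. With
no argument it falls back to running the Python interpreter, purely
as a stand-in so the script runs anywhere.

```python
#!/usr/bin/python3
import subprocess
import sys


def run_and_capture(cmd, timeout=30):
    """Run cmd; return (exit_code, combined stdout/stderr text).

    exit_code is None if the program could not be started or did not
    finish within `timeout` seconds.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            timeout=timeout,
            text=True,
        )
    except FileNotFoundError as exc:
        # The binary does not exist at the given path.
        return None, str(exc)
    except subprocess.TimeoutExpired:
        return None, "timed out after %s seconds" % timeout
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Assumption: the arguments are the browser path plus any flags,
    # e.g. "<path-to-chromium> --version". The default below is only a
    # placeholder command that is known to exist.
    cmd = sys.argv[1:] or [sys.executable, "--version"]
    code, output = run_and_capture(cmd)
    print("exit code:", code)
    print(output)
```

If the binary cannot be started at all, or exits at once with a
message about a missing library or an unsupported platform, that
would point at the Chromium install rather than at the script.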


