Re-establish Communication a with Selenium Browser Instance after the Script that Started it Has Exited

Currently editing.

Still getting to the point of re-establishing communication.

Find the chromedriver

In [1]:
import shutil

CHROMEDRIVER = "chromedriver"
chrome_driver = shutil.which(CHROMEDRIVER)
assert chrome_driver, f"Could not find executable '{CHROMEDRIVER}'."
chrome_driver
Out[1]:
'/usr/local/bin/chromedriver'

Create a selenium.webdriver.chrome.service.Service instance.

In [2]:
import selenium.webdriver.chrome.service as selservice

service = selservice.Service(chrome_driver)

My first attempt at using the python3.8 "walrus operator" in a comprehension.

In [3]:
exclusions = ("_", "stop", "start", "send", )
methods = [
    method
    for attr in dir(service)
    if not any(attr.startswith(exclusion) for exclusion in exclusions)
    and callable(method := getattr(service, attr))
]
methods
Out[3]:
[<bound method Service.assert_process_still_running of <selenium.webdriver.chrome.service.Service object at 0x7f61c45c9eb0>>,
 <bound method Service.command_line_args of <selenium.webdriver.chrome.service.Service object at 0x7f61c45c9eb0>>,
 <bound method Service.is_connectable of <selenium.webdriver.chrome.service.Service object at 0x7f61c45c9eb0>>]

Sans assigment expression aka "walrus operator" the code might look like this.

In [4]:
exclusions = ("_", "stop", "start", "send")
values = ( 
    getattr(service, attr)
    for attr in dir(service)
    if not any(attr.startswith(exclusion) for exclusion in exclusions)
    # WET and inefficient to then say 'and callable(getattr(service, attr))' here
)
methods = [value for value in values if callable(value)]
print(methods)

def output_method_calls():
    for method in methods:
        try:
            print({method.__name__: method()})
        except AttributeError:
            print(f"'{method.__name__}' failed")
[<bound method Service.assert_process_still_running of <selenium.webdriver.chrome.service.Service object at 0x7f61c45c9eb0>>, <bound method Service.command_line_args of <selenium.webdriver.chrome.service.Service object at 0x7f61c45c9eb0>>, <bound method Service.is_connectable of <selenium.webdriver.chrome.service.Service object at 0x7f61c45c9eb0>>]

Start the service.

In [5]:
service.start()

Call the methods.

In [6]:
output_method_calls()
{'assert_process_still_running': None}
{'command_line_args': ['--port=52503']}
{'is_connectable': True}
In [7]:
!pgrep chrome
24242

Get a driver aka "browser".

In [8]:
from selenium import webdriver
from IPython.display import display, Image

capabilities = {"chrome.binary": chrome_driver}
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
driver = webdriver.Remote(
    service.service_url, desired_capabilities=capabilities, options=chrome_options
)
driver.get("http://example.com")
screenshot_path = "screenshot.png"
driver.save_screenshot(screenshot_path)
display(Image(screenshot_path))

Check for chromium processes.

They do not automatically stop.

In [9]:
!pgrep chrome
24242
24396
24405
24407
24422
24423
24440
24449

Save meta info about service so that re-connection can be established.

I discovered via pytesting that services and drivers created inside of a function die automatically after the function is called.

If the services and drivers exist in the module then they tend to persist after the script exits.

The chrome processes persist when this code is run in a script in the terminal.

selresus is a Python library I wrote to wrap the services and drivers needed to accomplish persisting chrome instances with which re-communication can be established.

In [3]:
"""Start a Chrome browser."""
from time import sleep
from selresus.serviceutils import get_service
from selresus.driverutils import get_headless_driver, get_headed_driver


def main():
    """If the drivers are created here and not returned then they are closed.
    They must exist inside of __main__ or they are closed. How does that happen?
    """
    service = get_service()
    service.start()
    driver = get_headless_driver(service)
    driver.get("http://example.com")
    return service, driver


if __name__ == "__main__":
    service, driver = main()
    assert all(item in driver.page_source for item in ("<html", "Example"))

The chrome processes do not persist when this code is run in a script in the terminal.

In [20]:
"""Start a Chrome browser."""
from time import sleep
from selresus.serviceutils import get_service
from selresus.driverutils import get_headless_driver, get_headed_driver


def main():
    """If the drivers are created here and not returned then they are closed.
    They must exist inside of __main__ or they are closed. How does that happen?
    """
    service = get_service()
    service.start()
    driver = get_headless_driver(service)
    driver.get("http://example.com")
    assert all(item in driver.page_source for item in ("<html", "Example"))

    # return service, driver


if __name__ == "__main__":
    main()