
How Can I Make A Selenium Script Undetectable Using Geckodriver And Firefox Through Python?

Is there a way to make your Selenium script undetectable in Python using geckodriver? I'm using Selenium for scraping. Are there any protections we need to use so websites can't de

Solution 1:

There are different methods to avoid websites detecting the use of Selenium.

  1. The value of navigator.webdriver is set to true by default when using Selenium. This variable will be present in Chrome as well as Firefox. This variable should be set to "undefined" to avoid detection.

  2. A proxy server can also be used to avoid detection.

  3. Some websites are able to use the state of your browser to determine if you are using Selenium. You can set Selenium to use a custom browser profile to avoid this.

The code below uses all three of these approaches.

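from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Reuse an existing Firefox profile so the session carries normal browser state.
profile = webdriver.FirefoxProfile('C:\\Users\\You\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\something.default-release')

# Route HTTP traffic through a proxy (placeholder host and port).
PROXY_HOST = "12.12.12.123"
PROXY_PORT = "1234"
profile.set_preference("network.proxy.type", 1)  # 1 = manual proxy configuration
profile.set_preference("network.proxy.http", PROXY_HOST)
profile.set_preference("network.proxy.http_port", int(PROXY_PORT))

# Ask Firefox not to expose navigator.webdriver.
profile.set_preference("dom.webdriver.enabled", False)
profile.set_preference('useAutomationExtension', False)
profile.update_preferences()
desired = DesiredCapabilities.FIREFOX

# Selenium 3 style call; Selenium 4 deprecates these keyword arguments in favor of an Options object.
driver = webdriver.Firefox(firefox_profile=profile, desired_capabilities=desired)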
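
For newer Selenium releases, the same preferences can be collected in a plain dictionary and applied to whatever object exposes set_preference (a FirefoxProfile, or a Firefox Options object). This is only a sketch: build_firefox_prefs and apply_prefs are hypothetical helper names, and the network.proxy.ssl entries are an addition not present in the answer above, included on the assumption that HTTPS traffic should go through the same proxy.

```python
def build_firefox_prefs(proxy_host, proxy_port):
    """Return the Firefox preferences from the answer as a dictionary."""
    port = int(proxy_port)
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": proxy_host,
        "network.proxy.http_port": port,
        # Assumption: also proxy HTTPS traffic (not in the original answer).
        "network.proxy.ssl": proxy_host,
        "network.proxy.ssl_port": port,
        "dom.webdriver.enabled": False,
    }


def apply_prefs(target, prefs):
    """Call target.set_preference(name, value) for every preference."""
    for name, value in prefs.items():
        target.set_preference(name, value)
    return target
```

With Selenium 4 installed, a possible usage would be to create selenium.webdriver.firefox.options.Options(), pass it through apply_prefs, and then call webdriver.Firefox(options=options).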

Once the code is run, you will be able to manually check that the browser run by Selenium now has your Firefox history and extensions. You can also type "navigator.webdriver" into the devtools console to check that it is undefined.
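
The same check can be done from the script instead of the devtools console. The sketch below assumes a driver object with Selenium's execute_script method; looks_automated is a hypothetical helper name. Selenium returns None when the JavaScript value is undefined.

```python
def read_webdriver_flag(driver):
    """Return navigator.webdriver as seen by the page (None if undefined)."""
    return driver.execute_script("return navigator.webdriver")


def looks_automated(driver):
    """True only when the browser reports navigator.webdriver === true."""
    return read_webdriver_flag(driver) is True
```

Calling looks_automated(driver) right after creating the driver gives a quick indication of whether the preference took effect in the Firefox version being used.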

Solution 2:

Whether a Selenium-driven Firefox / GeckoDriver session gets detected doesn't depend on any specific GeckoDriver or Firefox version. Websites can inspect the network traffic and identify the browser client, i.e. the web browser, as WebDriver-controlled.

As per the documentation of the WebDriver Interface in the latest editor's draft of WebDriver - W3C Living Document, the webdriver-active flag, which is initially set to false, is set to true when the user agent is under remote control, i.e. when controlled through Selenium.

NavigatorAutomationInformation

Note that the NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.

mixin NavigatorAutomationInformation

So,

webdriver
    Returns true if webdriver-active flag is set, false otherwise.

whereas,

navigator.webdriver
    Defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, for example so that alternate code paths can be triggered during automation.

So, the bottom line is:

Selenium identifies itself


However, some generic approaches to avoid getting detected while web-scraping are as follows:
