16.9.14

Automatically Download Files with Selenium

Selenium itself cannot interact with system-level dialogs that the browser opens, such as the "Save file" dialog. To download PDFs as part of a browser automation run, you need either an additional framework that drives the dialog, or a configuration that makes Firefox download the files automatically and never pop up the "Save file" dialog.

One way is to use Firefox preferences to download the files directly, without going through the dialog box. The code below does just that.

This approach is preferable because it needs no custom or third-party software to handle the dialog boxes separately. Such tools can be tedious to use and difficult to integrate with the already complex setup a Selenium project usually has.


public WebDriver getFireFoxProfileDriverForDownload() {

    String downloadFolderPath = "C:\\All_Downloads\\Java";

    //Creating an object for the profiles
    ProfilesIni allProfiles = new ProfilesIni();

    //Getting the Firefox profile for WebDriver, which was created manually
    FirefoxProfile profile = allProfiles.getProfile("webdriverprofile");

    //getProfile returns null when no profile with that name exists
    if (profile == null) {
        throw new IllegalStateException("Firefox profile 'webdriverprofile' not found");
    }

    //Setting the preferences for download -
    profile.setPreference("browser.download.folderList", 2);
    profile.setPreference("browser.download.dir", downloadFolderPath);
    profile.setPreference("browser.helperApps.neverAsk.saveToDisk", "application/pdf");

    //Prevent Firefox from previewing PDFs -
    profile.setPreference("pdfjs.disabled", true);

    //Disabling the third party PDF viewers -
    profile.setPreference("plugin.scan.plid.all", false);
    profile.setPreference("plugin.scan.Acrobat", "90.0");
    profile.setPreference("plugin.disable_full_page_plugin_for_types", "application/pdf");

    //Instantiating the Firefox browser with this profile
    WebDriver firefoxDriver = new FirefoxDriver(profile);

    //This driver instance can now be used to open Firefox with the above preferences
    return firefoxDriver;

}


Description of the code above -
·        Setting the preferences for download -
The preferences are set with the setPreference method of the FirefoxProfile class. The method takes a key-value pair, where the key is the name of the preference and the value is the setting you want to apply. The names and allowable values of the preferences can be found on Firefox's 'about:config' page: the key is the 'Preference Name' and the value is the 'Value' shown there.
Firefox’s download manager preferences can be set programmatically when instantiating FirefoxDriver. The following preferences are used for this:
  • "browser.download.folderList" - This controls the default folder that files are downloaded to. 0 indicates the Desktop, 1 indicates the system's default downloads location and 2 indicates a custom folder.
  • "browser.download.dir" - This holds the custom destination folder for downloads. It only takes effect if browser.download.folderList is set to 2.
  • "browser.helperApps.neverAsk.saveToDisk" - This stores a comma-separated list of MIME types that are saved to disk without asking which application should open the file.

The MIME type defined here is "application/pdf", which is the type most PDF files are served with. However, if the target PDF file is served with a non-standard MIME type, the “Save file” dialog might still show up. To fix this, the actual MIME type has to be added to the browser.helperApps.neverAsk.saveToDisk preference. It can be determined using either of the following approaches:
  • Upload the file to an online tool such as http://mime.ritey.com/
  • Download the file and inspect its MIME type in Chrome’s developer tools or in a web debugging proxy such as Fiddler or CharlesProxy.
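
As a further option alongside the two approaches above, the MIME type can also be read from the Content-Type response header in plain Java. The following is only a minimal sketch under stated assumptions: the class name MimeTypeProbe and its methods are hypothetical helpers, not part of Selenium, and the HEAD request assumes the file URL is reachable without the browser's login session or cookies, which is often not the case.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of Selenium) for working out the MIME types
// to list in browser.helperApps.neverAsk.saveToDisk.
final class MimeTypeProbe {

    private MimeTypeProbe() {}

    // Strips parameters such as "; charset=binary" and lower-cases the type.
    static String normalize(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return "";
        }
        int semicolon = contentTypeHeader.indexOf(';');
        String type = semicolon >= 0
                ? contentTypeHeader.substring(0, semicolon)
                : contentTypeHeader;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Sends a HEAD request and returns the normalized Content-Type header.
    // Assumption: the URL can be fetched without authentication.
    static String fetch(String fileUrl) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(fileUrl).toURL().openConnection();
        try {
            connection.setRequestMethod("HEAD");
            return normalize(connection.getContentType());
        } finally {
            connection.disconnect();
        }
    }

    // Joins the distinct, non-empty MIME types into the comma-separated
    // value that the preference expects.
    static String toNeverAskValue(String... contentTypes) {
        Set<String> distinct = new LinkedHashSet<>();
        for (String contentType : contentTypes) {
            String normalized = normalize(contentType);
            if (!normalized.isEmpty()) {
                distinct.add(normalized);
            }
        }
        return String.join(",", distinct);
    }
}
```

The string returned by toNeverAskValue can be passed as the value of browser.helperApps.neverAsk.saveToDisk. If the server only serves the file to a logged-in session, the proxy or developer-tools approach is likely to be more reliable than a standalone request.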
·        Prevent Firefox from previewing PDFs -
Since the release of Firefox 19.0, PDF.js has been integrated into Firefox to display PDF files inside the browser. It parses and renders PDFs as HTML5, which in theory could itself be automated with Selenium WebDriver. However, to download PDFs instead of previewing them in Firefox, PDF.js has to be disabled by setting the about:config entry "pdfjs.disabled" to true.

·        Disabling the third party PDF viewers -
Apart from Firefox’s built-in PDF viewer, there might be third party plugins that prevent Firefox from downloading PDFs automatically. If a machine has Adobe Reader installed, the default PDF viewing setting in Firefox might have been switched to Adobe Acrobat without notice.
To avoid previewing PDFs with those plugins, three more about:config entries need to be configured when starting the WebDriver instance.
  • "plugin.scan.plid.all" - This needs to be set to false, so that Firefox won’t scan for and load plugins.
  • "plugin.scan.Acrobat" - This holds the minimum version of Adobe Acrobat that Firefox is allowed to launch. Setting it to a number larger than the currently installed Adobe Acrobat version should do the trick.
  • "plugin.disable_full_page_plugin_for_types" - This needs to be set to "application/pdf". Without it, the PDF gets opened in the browser itself instead of being downloaded.
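
To keep all of the preferences described above in one place, they can be collected into a map before being applied to the profile. This is only an illustrative sketch: the class PdfDownloadPreferences, its build method and the installedAcrobatMajorVersion parameter are hypothetical names introduced here, and the code uses nothing beyond the Java standard library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that gathers the about:config entries discussed above.
final class PdfDownloadPreferences {

    private PdfDownloadPreferences() {}

    static Map<String, Object> build(String downloadDir,
                                     List<String> mimeTypes,
                                     int installedAcrobatMajorVersion) {
        Map<String, Object> prefs = new LinkedHashMap<>();

        // Download manager: 2 selects the custom folder given in browser.download.dir
        prefs.put("browser.download.folderList", 2);
        prefs.put("browser.download.dir", downloadDir);
        prefs.put("browser.helperApps.neverAsk.saveToDisk", String.join(",", mimeTypes));

        // Built-in PDF.js viewer
        prefs.put("pdfjs.disabled", true);

        // Third party PDF viewers: ask for a version above the installed one
        // (assumption: the caller knows the installed Acrobat major version)
        prefs.put("plugin.scan.plid.all", false);
        prefs.put("plugin.scan.Acrobat", (installedAcrobatMajorVersion + 1) + ".0");
        prefs.put("plugin.disable_full_page_plugin_for_types", "application/pdf");

        return prefs;
    }
}
```

Each entry of the returned map can then be handed to setPreference in a loop. Depending on the Selenium version in use, setPreference may only offer typed overloads, in which case the loop has to dispatch on whether the value is a String, an Integer or a Boolean.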