RPA and Web Scraping using Jupyter

Feb 11, 2021


In my previous article, I walked through how to use Python + requests + lxml to scrape stock data. In this article, let's explore using Robotic Process Automation (RPA) in a Jupyter Notebook environment to perform web scraping. Personally, I find Jupyter Notebook + RPA a great combination: the interactive nature of Jupyter Notebook allows for quick iteration and trial and error when developing robots. Another good thing is that all of these tools are open source.

I am going to use xeus-robot, a Jupyter kernel for Robot Framework built on xeus, a native implementation of the Jupyter protocol.



Installation

I assume you already have JupyterLab 3.0 or above installed. To install xeus-robot and its dependencies, follow the instructions and run the following command:

$ conda install -c conda-forge xeus-robot

xeus-robot depends on Robot Framework, a generic open-source automation framework for acceptance testing, acceptance test-driven development (ATDD), and robotic process automation (RPA).
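To get a feel for Robot Framework's keyword-driven syntax before writing a scraper, here is a minimal task you can paste into a notebook cell once the xeus-robot kernel is active. The task name and log message are my own illustrations, not from any official example:

```robotframework
*** Tasks ***
Say Hello
    # Log is a built-in Robot Framework keyword that writes to the log
    Log    Hello from Robot Framework
```

Each cell in a xeus-robot notebook accepts standard Robot Framework sections, so you can iterate on tasks interactively instead of re-running a whole `.robot` file.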


Since I am going to perform web scraping, I need to install SeleniumLibrary for Robot Framework.

$ pip install --upgrade robotframework-seleniumlibrary
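With SeleniumLibrary installed, a scraping task looks something like the sketch below. This is a minimal illustration of my own: it assumes headless Firefox and uses example.com with a `css:h1` locator purely as placeholders, so adapt the URL and locator to the page you actually want to scrape:

```robotframework
*** Settings ***
Library    SeleniumLibrary

*** Tasks ***
Scrape Page Heading
    # headlessfirefox runs the browser without a visible window
    Open Browser    https://example.com    headlessfirefox
    # Get Text returns the visible text of the first matching element
    ${heading}=    Get Text    css:h1
    Log    ${heading}
    [Teardown]    Close Browser
```

`Open Browser`, `Get Text`, and `Close Browser` are standard SeleniumLibrary keywords; the `[Teardown]` setting ensures the browser closes even if a keyword fails mid-task.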

Browser Drivers

I also need to install a web driver for the browser I want to automate. I can use webdrivermanager to install the browser drivers; in this case, I install them for both Firefox and Chrome.

$ pip install webdrivermanager
$ webdrivermanager firefox chrome --linkpath /usr/local/bin

Note that I install the drivers to /usr/local/bin. You can certainly install them to another location, but make sure that location is on your PATH.


Software engineer, Data Science and ML practitioner.