Web Page Scraping and Testing using GraphQL and Playwright

Build a web page scraping and testing service using GraphQL and Playwright.

Photo by Frank Albrecht on Unsplash

Overview

In my previous article, I walked through developing serverless APIs to test web pages at different resolutions using Puppeteer. In this article, let’s use Playwright, a similar library for web browser automation.

Playwright is a library available for Node.js, Python, Java, and .NET that automates Chromium, Firefox, and WebKit with a single API. It is built to enable cross-browser web automation that is evergreen, capable, reliable, and fast.

Setup

Let’s install Playwright and browser binaries for Chromium, Firefox, and WebKit. Playwright requires Python 3.7+.

$ pip install playwright
$ playwright install

Let’s install the required Python libraries. The requirements.txt is shown below. I am going to use FastAPI and graphene to develop the GraphQL APIs.

Pillow
fastapi
playwright
graphene>=2.0
uvicorn

Run pip install -r requirements.txt to install the libraries.

Application

The FastAPI application, main.py,

  • exposes a GraphQL query endpoint that accepts URL, width, and height parameters.
  • uses Playwright to capture the web page at the requested width and height.
  • returns the screenshot as a base64-encoded PNG string.

You can browse to http://localhost:8088 to play around with the API using the GraphiQL playground.

GraphiQL Playground

Let’s develop a Node.js client to test the service. I am going to use the graphql-request library.

const { request, gql } = require('graphql-request');
const fs = require('fs');

// Ask the service for a 1024x768 screenshot of Medium.
const query = gql`
  {
    screenshot(url: "http://www.medium.com", width: 1024, height: 768)
  }
`;
const endpoint = 'http://localhost:8088';

request(endpoint, query)
  .then((data) => {
    console.log(data.screenshot);
    // The API returns the PNG as a base64 string; decode it before writing.
    const buffer = Buffer.from(data.screenshot, 'base64');
    fs.writeFileSync('screenshot.png', buffer);
  })
  .catch((err) => console.error(err));

The JavaScript client asks the API for a screenshot of Medium at a 1024x768 viewport and saves the result as a PNG image.
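
For readers who prefer to stay in Python, the same request can be sketched with only the standard library. The function names below (build_query, build_request, save_screenshot, fetch_screenshot) are hypothetical, and the sketch assumes the service answers a JSON POST at http://localhost:8088 with the usual GraphQL response shape of data and errors.

```python
import base64
import json
import urllib.request
from pathlib import Path

ENDPOINT = "http://localhost:8088"  # same endpoint as the Node.js client


def build_query(url: str, width: int, height: int) -> str:
    """Build the screenshot query with inline arguments, as in the article."""
    # json.dumps quotes the URL; for ordinary URLs a JSON string literal
    # is also a valid GraphQL string literal.
    return f"{{ screenshot(url: {json.dumps(url)}, width: {width}, height: {height}) }}"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Wrap the query in a JSON POST request."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_screenshot(body: dict, path: str) -> int:
    """Decode the base64 PNG from a GraphQL response and write it to disk."""
    if body.get("errors"):
        raise RuntimeError(f"GraphQL errors: {body['errors']}")
    png = base64.b64decode(body["data"]["screenshot"])
    Path(path).write_bytes(png)
    return len(png)  # number of bytes written


def fetch_screenshot(url: str, width: int, height: int, path: str,
                     endpoint: str = ENDPOINT) -> int:
    """Request a screenshot from the running service and save it as a PNG."""
    request = build_request(endpoint, build_query(url, width, height))
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return save_screenshot(body, path)
```

With the service running, fetch_screenshot("http://www.medium.com", 1024, 768, "screenshot.png") should produce the same file as the Node.js client. Calling it in a loop over several width and height pairs is one way to check a page at different resolutions.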

Screenshot in Specific ViewPort

The source code for this article can be found in this repository.

