This example goes over how to load data from webpages using Playwright. One document will be created for each webpage.
Playwright is a Node.js library that provides a high-level API for controlling multiple browser engines, including Chromium, Firefox, and WebKit. You can use Playwright to automate web page interactions, including extracting data from dynamic web pages that require JavaScript to render.
If you want a more lightweight solution, and the webpages you want to load do not require JavaScript to render, you can use the CheerioWebBaseLoader instead.
Setup
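Install Playwright and the LangChain packages that provide the loader using npm, for example: npm install @langchain/community @langchain/core playwright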
Usage
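In its simplest form, the loader is constructed with a single URL, and calling load() resolves to an array of documents for that page. Since one document is created per webpage, loading several pages means creating one loader per URL. The sketch below shows one way to do that. DocumentLike, LoaderLike, and loadAll are hypothetical names introduced for illustration and are not part of LangChain; the factory argument is where a real PlaywrightWebBaseLoader would be constructed.

```typescript
// Minimal structural stand-ins (assumptions) for the document and loader shapes.
interface DocumentLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

interface LoaderLike {
  load(): Promise<DocumentLike[]>;
}

// Creates one loader per URL and loads the pages sequentially,
// preserving the order of the input URLs in the returned array.
async function loadAll(
  urls: string[],
  createLoader: (url: string) => LoaderLike
): Promise<DocumentLike[]> {
  const docs: DocumentLike[] = [];
  for (const url of urls) {
    const loaded = await createLoader(url).load();
    docs.push(...loaded);
  }
  return docs;
}
```

With the real loader installed, this might be called as loadAll(urls, (url) => new PlaywrightWebBaseLoader(url)). Loading sequentially is a deliberate choice in this sketch to keep resource usage predictable; it is not a requirement of the loader.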
Options
Here’s an explanation of the parameters you can pass to the PlaywrightWebBaseLoader constructor using the PlaywrightWebBaseLoaderOptions interface:
- launchOptions: an optional object that specifies additional options to pass to the playwright.chromium.launch() method. This can include options such as the headless flag to launch the browser in headless mode.
- gotoOptions: an optional object that specifies additional options to pass to the page.goto() method. This can include options such as the timeout option to specify the maximum navigation time in milliseconds, or the waitUntil option to specify when to consider the navigation as successful.
- evaluate: an optional function that runs custom logic against the loaded page. This can be useful for extracting data from the page, interacting with page elements, or handling specific HTTP responses. The function should return a Promise that resolves to a string containing the result of the evaluation.
By passing these options to the PlaywrightWebBaseLoader constructor, you can customize the behavior of the loader and use Playwright’s features to scrape and interact with web pages.
Here is a basic example that uses these options:
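A minimal sketch of such a configuration, under stated assumptions: the PageLike and ResponseLike interfaces are local stand-ins for the small subset of Playwright's Page and Response used here, so the snippet type-checks on its own; in a real project those types would come from the playwright package. htmlIfOk is a hypothetical helper name, and the option values are illustrative only.

```typescript
// Local stand-ins (assumptions) for the parts of Playwright's Page and Response used below.
interface PageLike {
  content(): Promise<string>;
}

interface ResponseLike {
  ok(): boolean;
  status(): number;
}

// Custom evaluate function: reject when the navigation response is not OK,
// otherwise resolve to the rendered HTML of the page as a string.
async function htmlIfOk(
  page: PageLike,
  _browser: unknown,
  response: ResponseLike | null
): Promise<string> {
  if (response !== null && !response.ok()) {
    throw new Error(`Navigation failed with HTTP status ${response.status()}`);
  }
  return page.content();
}

const options = {
  // Forwarded to playwright.chromium.launch()
  launchOptions: { headless: true },
  // Forwarded to page.goto(); timeout is in milliseconds
  gotoOptions: { waitUntil: "domcontentloaded" as const, timeout: 30000 },
  // Must return a Promise that resolves to a string
  evaluate: htmlIfOk,
};

// Assumed wiring once the real packages are installed (not executed here):
//   const loader = new PlaywrightWebBaseLoader("https://example.com", options);
//   const docs = await loader.load();
```

Keeping the evaluate function separate from the options object makes it easy to exercise with a stub page before running it against a real browser.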

