ScrapeGraph
This notebook provides a quick overview for getting started with ScrapeGraph tools. For detailed documentation of all ScrapeGraph features and configuration options, head to the API reference.
Overview
Integration details
| Class | Package | Serializable | JS support | Package latest |
|---|---|---|---|---|
| SmartScraperTool | langchain-scrapegraph | ✅ | ❌ | |
| MarkdownifyTool | langchain-scrapegraph | ✅ | ❌ | |
| LocalScraperTool | langchain-scrapegraph | ✅ | ❌ | |
| GetCreditsTool | langchain-scrapegraph | ✅ | ❌ | |
Tool features
| Tool | Purpose | Input | Output |
|---|---|---|---|
| SmartScraperTool | Extract structured data from websites | URL + prompt | JSON |
| MarkdownifyTool | Convert webpages to markdown | URL | Markdown text |
| LocalScraperTool | Extract data from HTML content | HTML + prompt | JSON |
| GetCreditsTool | Check API credits | None | Credit info |
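The input column of the table above can be read as a per-tool payload contract. The following is a minimal sketch of that idea, not the library's own validation: the field names (`website_url`, `user_prompt`, `website_html`), the helper names `build_tool_input` and `run_tool`, and the `langchain_scrapegraph.tools` import path are assumptions made for illustration, so check them against the API reference before relying on them.

```python
from typing import Dict, Tuple

# Required input fields per tool, following the "Tool features" table.
# NOTE: the field names are assumptions for illustration only; the real
# schema is defined by the langchain-scrapegraph package.
REQUIRED_FIELDS: Dict[str, Tuple[str, ...]] = {
    "SmartScraperTool": ("website_url", "user_prompt"),
    "MarkdownifyTool": ("website_url",),
    "LocalScraperTool": ("website_html", "user_prompt"),
    "GetCreditsTool": (),
}


def build_tool_input(tool_name: str, **fields: str) -> Dict[str, str]:
    """Return a payload holding exactly the fields the named tool expects."""
    if tool_name not in REQUIRED_FIELDS:
        raise KeyError(f"Unknown tool: {tool_name}")
    required = REQUIRED_FIELDS[tool_name]
    # Reject payloads with missing or empty required fields.
    missing = [name for name in required if not fields.get(name)]
    if missing:
        raise ValueError(f"{tool_name} is missing: {', '.join(missing)}")
    # Reject fields the tool does not take, to catch typos early.
    unexpected = sorted(set(fields) - set(required))
    if unexpected:
        raise ValueError(f"{tool_name} does not take: {', '.join(unexpected)}")
    return {name: fields[name] for name in required}


def run_tool(tool_name: str, payload: Dict[str, str]):
    """Instantiate the tool and invoke it (needs the package and an API key)."""
    # Imported lazily so build_tool_input works without the package installed.
    # Assumption: the tool classes are exported from langchain_scrapegraph.tools.
    from langchain_scrapegraph import tools

    tool = getattr(tools, tool_name)()
    return tool.invoke(payload)
```

With the package installed and a key configured, a call might look like `run_tool("MarkdownifyTool", build_tool_input("MarkdownifyTool", website_url="https://example.com"))`. Validating the payload first means a missing prompt or URL fails locally instead of spending an API request.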
Setup
The integration requires the langchain-scrapegraph package:
%pip install --quiet -U langchain-scrapegraph
Note: you may need to restart the kernel to use updated packages.
Credentials
You'll need a ScrapeGraph AI API key to use these tools. Get one at scrapegraphai.com.
import getpass
import os
if not os.environ.get("SGAI_API_KEY"):
    os.environ["SGAI_API_KEY"] = getpass.getpass("ScrapeGraph AI API key:\n")
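The interactive prompt above suits a notebook, but a script or CI job usually cannot answer a `getpass` prompt. As a rough sketch under that assumption, the same check can be wrapped in a helper that prompts only when a prompt function is supplied and otherwise fails fast. The name `ensure_api_key` and its `prompt` parameter are hypothetical and not part of the integration; only the `SGAI_API_KEY` variable name comes from the setup step.

```python
from typing import Callable, MutableMapping, Optional

ENV_VAR = "SGAI_API_KEY"  # variable name taken from the credentials step


def ensure_api_key(
    env: MutableMapping[str, str],
    prompt: Optional[Callable[[], str]] = None,
) -> str:
    """Return the API key from env, calling prompt() only if it is missing."""
    key = env.get(ENV_VAR, "").strip()
    if key:
        return key
    if prompt is None:
        # Non-interactive context: fail with a clear message instead of hanging.
        raise RuntimeError(f"{ENV_VAR} is not set and no prompt is available")
    key = prompt().strip()
    if not key:
        raise RuntimeError(f"{ENV_VAR} was left empty")
    # Store the key so later code in the same process can read it from env.
    env[ENV_VAR] = key
    return key
```

In a notebook this could be called as `ensure_api_key(os.environ, lambda: getpass.getpass("ScrapeGraph AI API key:\n"))`, and in a non-interactive job as `ensure_api_key(os.environ)`.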
It's also helpful (but not required) to set up LangSmith for observability:
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()