ScrapeGraph
This notebook provides a quick overview for getting started with ScrapeGraph tools. For detailed documentation of all ScrapeGraph features and configurations, head to the API reference.
For more information about ScrapeGraph AI, visit scrapegraphai.com.
Overview
Integration details
Class | Package | Serializable | JS support | Package latest |
---|---|---|---|---|
SmartScraperTool | langchain-scrapegraph | ✅ | ❌ | |
MarkdownifyTool | langchain-scrapegraph | ✅ | ❌ | |
LocalScraperTool | langchain-scrapegraph | ✅ | ❌ | |
GetCreditsTool | langchain-scrapegraph | ✅ | ❌ | |
Tool features
Tool | Purpose | Input | Output |
---|---|---|---|
SmartScraperTool | Extract structured data from websites | URL + prompt | JSON |
MarkdownifyTool | Convert webpages to markdown | URL | Markdown text |
LocalScraperTool | Extract data from HTML content | HTML + prompt | JSON |
GetCreditsTool | Check API credits | None | Credit info |
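The Input column above can be read as a set of required fields per tool. The sketch below encodes that table as data and checks a payload before it is handed to a tool. The field names (`website_url`, `user_prompt`, `website_html`) and the helper `missing_inputs` are assumptions made for illustration and do not appear in this notebook; check the API reference for the exact argument names.

```python
# Required inputs per tool, following the "Input" column of the table.
# NOTE: the field names are assumed for illustration, not taken from this page.
REQUIRED_INPUTS = {
    "SmartScraperTool": ("website_url", "user_prompt"),
    "MarkdownifyTool": ("website_url",),
    "LocalScraperTool": ("website_html", "user_prompt"),
    "GetCreditsTool": (),  # takes no input
}


def missing_inputs(tool_name: str, payload: dict) -> list:
    """Return the required fields that are absent or empty in `payload`."""
    required = REQUIRED_INPUTS[tool_name]
    return [field for field in required if not payload.get(field)]
```

A payload that passes this check has every field the table lists for that tool; an empty result for `GetCreditsTool` reflects that it needs no input.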
Setup
The integration requires the following packages:
%pip install --quiet -U langchain-scrapegraph
Note: you may need to restart the kernel to use updated packages.
Credentials
You'll need a ScrapeGraph AI API key to use these tools. Get one at scrapegraphai.com.
import getpass
import os

if not os.environ.get("SGAI_API_KEY"):
    os.environ["SGAI_API_KEY"] = getpass.getpass("ScrapeGraph AI API key:\n")
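The interactive prompt suits a notebook, but a script or CI job has nobody to answer it. As a minimal sketch for that case, the helper below reads the same `SGAI_API_KEY` variable and fails fast with a clear message when it is missing. The function name `require_api_key` is hypothetical and not part of langchain-scrapegraph.

```python
import os


def require_api_key(name: str = "SGAI_API_KEY") -> str:
    """Return the key stored in the environment variable `name`.

    Raises RuntimeError instead of prompting, so non-interactive
    runs stop immediately when the key is not configured.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running")
    return value
```

Calling `require_api_key()` at the top of a script surfaces a missing key before any tool is constructed.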
It's also helpful (but not needed) to set up LangSmith for best-in-class observability:
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
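Because LangSmith is optional, it can be convenient to switch tracing on only when a key is actually available. The sketch below sets the same two variables shown above, and leaves the environment untouched otherwise. The helper name `enable_langsmith_tracing` and its `env` parameter are assumptions made for illustration.

```python
import os
from typing import MutableMapping, Optional


def enable_langsmith_tracing(
    api_key: Optional[str],
    env: MutableMapping[str, str] = os.environ,
) -> bool:
    """Set the LangSmith variables in `env` only when a key is supplied.

    Returns True if tracing was enabled, False if `env` was left as-is.
    """
    if not api_key:
        return False
    env["LANGCHAIN_TRACING_V2"] = "true"
    env["LANGCHAIN_API_KEY"] = api_key
    return True
```

Passing a plain dictionary as `env` makes the behavior easy to check without modifying the real process environment.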