Crawlee for Python: Build reliable scrapers in Python

Crawlee: Powerful Web Scraping and Browser Automation Library

Introduction

Crawlee is a robust web scraping and browser automation library for Python. It enables developers to build reliable crawlers quickly and efficiently.

Key Features

Python implementation with type hints
Seamless switching between HTTP and headless browser crawling
Built on Playwright for browser automation
Automatic scaling and proxy management
Support for Chromium, Firefox, and WebKit through Playwright
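
In its simplest form, proxy management means spreading outgoing requests across a pool of proxy URLs. The sketch below illustrates that idea with the Python standard library only. It is not Crawlee's own implementation, and the `ProxyRotator` name and the proxy URLs are hypothetical placeholders.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxy URLs in round-robin order (illustrative only)."""

    def __init__(self, proxy_urls: list[str]) -> None:
        if not proxy_urls:
            raise ValueError('at least one proxy URL is required')
        # cycle() repeats the pool endlessly in its original order.
        self._pool = cycle(proxy_urls)

    def next_proxy(self) -> str:
        # Each call returns the next proxy, wrapping around at the end.
        return next(self._pool)


# Hypothetical proxy addresses, used only to show the rotation order.
rotator = ProxyRotator([
    'http://proxy-1.example.com:8000',
    'http://proxy-2.example.com:8000',
])
picked = [rotator.next_proxy() for _ in range(3)]
print(picked)
```

Round-robin is the most basic policy. A production setup would typically also track failures per proxy and retire the ones that get blocked.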

Use Cases

Web scraping at scale
Browser automation tasks
Data extraction from JavaScript-rendered websites
Maintaining large-scale crawling projects

Teams

Crawlee is developed by the team at Apify, web scraping professionals who use it daily for large-scale data extraction projects.

Getting Started

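To scaffold a new project from a template with the Crawlee CLI:

pipx run crawlee create my-crawler

To add Crawlee with Playwright support to an existing project, install the package and then download the browser binaries:

pip install 'crawlee[playwright]'
playwright install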
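
After installing, it can help to confirm that the packages are visible to the interpreter you plan to run the crawler with. The following is a minimal sketch using only the standard library. The helper name `installed_version` is an illustrative choice, not part of Crawlee.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the two distributions installed by the commands above.
for name in ('crawlee', 'playwright'):
    found = installed_version(name)
    print(f'{name}: {found or "not installed"}')
```

If either line reports "not installed", the most likely cause is that pip installed into a different virtual environment than the one running the script.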

Example Usage

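import asyncio

# Import path as given in the source; newer Crawlee releases may expose
# these names from crawlee.crawlers instead.
from crawlee.playwright_crawler import PlaywrightCrawler, PlaywrightCrawlingContext


async def main() -> None:
    crawler = PlaywrightCrawler(
        max_requests_per_crawl=5,  # cap the crawl at five requests
        headless=False,  # show the browser window while crawling
        browser_type='firefox',
    )

    # The default handler runs for every request without a more specific route.
    @crawler.router.default_handler
    async def request_handler(context: PlaywrightCrawlingContext) -> None:
        # Queue the links found on the current page for crawling.
        await context.enqueue_links()

        # Collect the URL, the title, and the first 100 characters of HTML.
        data = {
            'url': context.request.url,
            'title': await context.page.title(),
            'content': (await context.page.content())[:100],
        }
        # Store the record in the crawler's default dataset.
        await context.push_data(data)

    # Start from the seed URL, then write the collected records to a file.
    await crawler.run(['https://crawlee.dev'])
    await crawler.export_data('results.json')


if __name__ == '__main__':
    asyncio.run(main())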
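
Once the crawl above has written results.json, the records can be post-processed with ordinary Python. The sketch below assumes the export is a JSON array of objects with the `url`, `title`, and `content` keys pushed by the handler. That file layout is an assumption, and the helper names `load_results` and `titles_by_url` are hypothetical.

```python
import json
from pathlib import Path


def load_results(path: str) -> list[dict]:
    """Load exported items, assuming the file holds a JSON array of objects."""
    items = json.loads(Path(path).read_text(encoding='utf-8'))
    if not isinstance(items, list):
        raise ValueError(f'expected a JSON array in {path}')
    return items


def titles_by_url(items: list[dict]) -> dict[str, str]:
    """Map each page URL to its title, skipping items without a URL."""
    return {item['url']: item.get('title', '') for item in items if 'url' in item}


if __name__ == '__main__':
    results_file = Path('results.json')
    # Only summarize when a previous crawl has produced the file.
    if results_file.exists():
        for url, title in titles_by_url(load_results(str(results_file))).items():
            print(f'{title!r} <- {url}')
```

Keeping the loading step separate from the summary makes it easy to swap in a different export format later without touching the analysis code.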

Alternatives to Crawlee for Python

No-Code Scraper

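Octoparse

Easy web scraping for everyone.

Kimono Labs

Create APIs where none exist. With kimono, you can quickly...

Saldor

Saldor extracts the best web data for large language models.

InstantAPI

Instantly turn websites into customizable APIs.

AgentQL

Painless data extraction and web automation.

Nimble API

Seamlessly scrape, parse, and scale web data.

Scraping Fish

The simplest web scraping API that doesn't get blocked.

Bytebot

AI-powered browser automation.

MrScraper

Web data scraping made simple.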
