WebHarvest
Documentation

Comprehensive guides, API references, and tutorials

Documentation Versioning

All documentation is versioned to match specific WebHarvest releases. The current version is v2.2.0. Links to plugin details and API documentation are version-specific to ensure accuracy.

Enhanced Metrics - v2.2.0 Status

✅ Production Metrics: Duration, Elements Processed, Processing Rate, Session History API

⚠️ Simulated (Demo): HTTP Metrics and Plugin Breakdown show example values

Real-time HTTP and plugin tracking via EventBus integration coming in v2.3.0 (Q1 2026). UI is production-ready, data collection in progress.

Getting Started

Your first web scraper in 5 minutes - step-by-step tutorial for beginners.

Settings & CLI NEW

Unified configuration system for CLI and GUI - modern syntax, auto-discovery, file overrides

Bug Fixes v2.2.0 NEW

Critical bugs fixed: Binary downloads, memory leaks, JavaScript support, MIME detection

Beginner Tutorials NEW

Step-by-step guides for common scenarios: form submit, pagination, CSS selectors, file operations

Plugin Reference (v2.2.0)

Complete documentation for all 57 plugins (47 core + 10 extensions)

Architecture

System design, module structure, plugin system, session management

Plugin Development

Create custom plugins with @CorePlugin architecture, automatic discovery, and best practices

IDE Guide

Web-based IDE installation, features, usage, troubleshooting.

Session Management (v2.2)

UUID tracking, lifecycle states, metrics, event system, multi-threading.

Token Tracking (v2.2)

Resource monitoring, quota enforcement, billing integration.

Examples

Ready-to-use configurations organized by category.

API Reference

Core APIs: Session Management, Token Tracking, Client Context, and Service APIs.