WebHarvest
Examples

Ready-to-use scraping configurations

Professional examples organized by category: Web & HTTP, Data Extraction, Data Transformation, Control Flow, and Extension Modules.

How to Use These Examples

  1. Download: Click any example below to download the XML configuration
  2. Edit: Open in WebHarvest IDE or your favorite XML editor
  3. Customize: Update URLs, selectors, and parameters for your needs
  4. Run: Execute via CLI (java -jar webharvest-cli.jar config.xml) or IDE
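
As a rough illustration of what steps 3 and 4 involve, here is a minimal sketch of a configuration. It assumes the element names of the classic Web-Harvest 2.x format (`config`, `var-def`, `http`, `html-to-xml`, `xpath`, `file`); the downloadable examples may use different elements or a namespace. The URL, XPath expression, variable names, and output path are placeholders, not values taken from any bundled example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical config.xml: a sketch assuming classic Web-Harvest 2.x element names -->
<config charset="UTF-8">

    <!-- Customize: replace the placeholder URL with the page you want to fetch -->
    <var-def name="page">
        <html-to-xml>
            <http url="https://example.com/"/>
        </html-to-xml>
    </var-def>

    <!-- Customize: adjust the XPath selector to match the data you need -->
    <var-def name="headings">
        <xpath expression="//h1/text()">
            <var name="page"/>
        </xpath>
    </var-def>

    <!-- Customize: the output path is a placeholder -->
    <file action="write" path="output/headings.txt">
        <var name="headings"/>
    </file>

</config>
```

Saved as config.xml, a file like this would be run with the command from step 4: java -jar webharvest-cli.jar config.xml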

Web & HTTP

HTTP requests, HTML parsing, web scraping

Data Extraction

XPath, XQuery, regex-based data extraction

Data Transformation

JSON, XML, CSV conversion and processing

Control Flow

Loops, conditions, functions, error handling
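
For orientation, a loop with basic error handling might look roughly like the sketch below. It again assumes the classic Web-Harvest 2.x elements (`loop` with `list` and `body`, `try` with `body` and `catch`, `template`); the URL, variable names, and file paths are hypothetical, and the bundled control-flow examples may be structured differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: element names assume the classic Web-Harvest 2.x format -->
<config charset="UTF-8">

    <!-- Collect list items from a placeholder page; fall back to an empty value on failure -->
    <var-def name="items">
        <try>
            <body>
                <xpath expression="//li/text()">
                    <html-to-xml>
                        <http url="https://example.com/list"/>
                    </html-to-xml>
                </xpath>
            </body>
            <catch>
                <!-- Runs only if the request or parsing above throws an error -->
                <empty/>
            </catch>
        </try>
    </var-def>

    <!-- Iterate over the collected items and append each one to a file -->
    <loop item="entry" index="i">
        <list>
            <var name="items"/>
        </list>
        <body>
            <file action="append" path="output/items.txt">
                <template>${i}: ${entry}
</template>
            </file>
        </body>
    </loop>

</config>
```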

Extension Modules

Database, Mail, FTP, Browser automation

Extension modules require separate Maven dependencies; see the plugin documentation.

All Available Examples

Complete list of downloadable configurations

simple_test.xml

Basic HTTP test configuration

advanced_search_example.xml

Advanced search and extraction

amazon_test.xml

Amazon product scraping

api_integration.xml

REST API integration

basic_variables_example.xml

Variable usage examples

canon.xml

XML canonicalization

canon_with_namespace.xml

Namespace handling

crawler.xml

Web crawler with link following

data_processing_pipeline.xml

Complete ETL pipeline

database_plugin_demo.xml

Database plugin example

ecommerce_monitoring.xml

Price and inventory monitoring

flickr.xml

Flickr photo scraping

ftp_plugin_demo.xml

FTP plugin example

functions.xml

Custom function examples

google_images.xml

Google Images scraping

improved_nytimes.xml

New York Times articles

mail_plugin_demo.xml

Mail plugin example

modern_web_scraping.xml

Modern scraping techniques

product_catalog.xml

Product catalog extraction

social_media_analytics.xml

Social analytics

webbrowser_plugin_demo.xml

Browser plugin example

xquery.xml

XQuery transformations

yahoomail.xml

Yahoo Mail integration