Develop, test, and debug scrapers directly in your browser
Built on Monaco Editor with real-time execution, per-tab workspaces, session tracking, and Apple HIG-inspired UX design.
Everything you need for productive scraper development
VS Code's powerful editor engine with XML syntax highlighting, auto-completion, bracket matching, and multi-cursor support.
WebSocket-based live streaming of logs, progress bars, variable values, and execution results as your scraper runs.
Apple HIG-inspired design: each tab is an isolated workspace with dedicated logs, results, variables, and session state.
Built-in session tracking with UUID-based IDs, lifecycle states, duration metrics, HTTP metrics, and plugin breakdown visualization.
Run configurations directly from the editor with instant feedback. Pause execution to inspect variables, resume to continue, or stop to cancel completely.
Standalone executable JAR with embedded Jetty server. No external dependencies beyond a Java 11+ runtime and no complex setup: just run and go.
Get up and running in 3 simple steps
Download the IDE distribution from SourceForge:
wget https://sourceforge.net/projects/web-harvest/files/webhervest/2.2.0/webharvest-ide-2.2.0.jar/download -O webharvest-ide-2.2.0.jar
java -jar webharvest-ide-2.2.0.jar
Start the IDE server:
java -jar webharvest-ide-2.2.0.jar
Or use the provided launcher script:
./start-ide.sh # Linux/macOS
start-ide.bat # Windows
Then open your browser at: http://localhost:8080
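When the IDE is started from a script, it can help to wait until the embedded server answers before opening the browser. The following is a minimal sketch, not part of the distribution: `wait_for_ide` is a hypothetical helper name, and it assumes `curl` is installed and that the root URL returns a successful HTTP status once the server is ready.

```shell
# Sketch: poll the IDE URL until it responds or the attempts run out.
# wait_for_ide is a hypothetical helper, not shipped with the IDE.
wait_for_ide() {
  url=${1:-http://localhost:8080}   # URL from the quick start above
  tries=${2:-30}                    # maximum number of attempts
  delay=${3:-1}                     # seconds to sleep between attempts
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: treat HTTP errors as failure; --max-time: do not hang on a stuck socket
    if curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; then
      echo "IDE is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "IDE did not respond at $url after $tries attempts" >&2
  return 1
}

# Example usage (commented out so that sourcing this file has no side effects):
#   java -jar webharvest-ide-2.2.0.jar &
#   wait_for_ide http://localhost:8080 30 1
```

If the function returns non-zero, the troubleshooting entries for port conflicts and Java version are the first things to check.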
Quick guide to key workflows
Use Ctrl/Cmd + Space for auto-completion
Save with Ctrl/Cmd + S
Run the scraper (F5)
Use <echo> tags to output debug info
Press Ctrl/Cmd + S to save current tab
Configurations are saved to ~/.webharvest/configs/
Ctrl/Cmd + S - Save configuration
Ctrl/Cmd + N - New tab
F5 - Run scraper
Shift + F5 - Stop execution
Ctrl/Cmd + / - Toggle comment
Ctrl/Cmd + Space - Auto-complete
The IDE now includes HTTP Metrics and Plugin Breakdown visualization in the Session panel. Here's what you need to know:
Why simulated? The HTTP metrics and plugin breakdown currently display simulated data. Real-time tracking requires deep EventBus integration with HttpService and all plugin lifecycle events, which involves modifying 50+ classes and extensive testing.
✅ UI is production-ready - All visualizations work perfectly
🔄 Real data coming soon - Foundation complete, integration in progress
How the IDE works under the hood
Customize IDE behavior
Common issues and solutions
Problem: Port 8080 is already in use
Solution:
# Option 1: Find and kill process
lsof -i :8080
kill -9 <PID>
# Option 2: Use different port
java -Dserver.port=9090 -jar webharvest-ide-2.2.0.jar
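As an alternative to killing the process, the two options above can be combined into a small port picker. This is a sketch under assumptions: `port_in_use` and `first_free_port` are hypothetical helper names, and the check relies on `lsof`, which may not see listeners owned by other users when run without root.

```shell
# Sketch: find the first port at or above a starting port with no TCP listener.
# port_in_use and first_free_port are hypothetical helpers, not shipped with the IDE.
port_in_use() {
  # Succeeds if lsof reports a TCP listener on port $1.
  # If lsof is missing, this fails and the port is treated as free.
  lsof -iTCP:"$1" -sTCP:LISTEN -t >/dev/null 2>&1
}

first_free_port() {
  port=${1:-8080}                  # first port to try
  limit=$((port + ${2:-20}))       # give up after this many candidates
  while [ "$port" -lt "$limit" ]; do
    if ! port_in_use "$port"; then
      echo "$port"
      return 0
    fi
    port=$((port + 1))
  done
  echo "no free port found below $limit" >&2
  return 1
}

# Example usage (commented out so that sourcing this file has no side effects):
#   java -Dserver.port="$(first_free_port 8080)" -jar webharvest-ide-2.2.0.jar
```

Remember to open the browser at the port that was actually chosen rather than the default 8080.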
Problem: Click Run but nothing happens
Solution:
Hard refresh the browser (Ctrl/Cmd + Shift + R)
Problem: Old UI/styles showing after update
Solution:
# Hard refresh browser
Ctrl/Cmd + Shift + R
# Or clear cache manually
Settings → Privacy → Clear Browsing Data → Cached Images/Files
Problem: Unsupported class file major version
Solution:
# Check Java version
java -version # Should be 11+
# Install Java 11+ if needed
# Ubuntu/Debian:
sudo apt install openjdk-11-jdk
# macOS (Homebrew):
brew install openjdk@11
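The version check can also be scripted, for example at the top of a launcher. The sketch below is illustrative only: `java_major` and `check_java` are hypothetical helper names. It assumes the usual `java -version` output, which goes to stderr and contains a quoted version string in either the legacy `1.x` scheme or the modern scheme.

```shell
# Sketch: extract the major Java version and compare it against the required 11.
# java_major and check_java are hypothetical helpers, not shipped with the IDE.
java_major() {
  # $1: `java -version` output, e.g.  openjdk version "11.0.20" 2023-07-18
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$ver" in
    "")  return 1 ;;                              # no version string found
    1.*) ver=${ver#1.}; echo "${ver%%.*}" ;;      # legacy scheme: 1.8.0_381 -> 8
    *)   echo "${ver%%[.+-]*}" ;;                 # modern scheme: 11.0.20 -> 11
  esac
}

check_java() {
  # java -version writes to stderr, so merge it into stdout before capturing
  out=$(java -version 2>&1) || { echo "java not found on PATH" >&2; return 1; }
  major=$(java_major "$out") || { echo "could not parse: $out" >&2; return 1; }
  if [ "$major" -ge 11 ]; then
    echo "Java $major detected: OK"
  else
    echo "Java $major detected: the IDE needs 11+" >&2
    return 1
  fi
}

# Example usage (commented out so that sourcing this file has no side effects):
#   check_java && java -jar webharvest-ide-2.2.0.jar
```

Parsing is kept separate from the call to `java` so the version logic can be exercised with sample strings on a machine without Java installed.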
Problem: OutOfMemoryError or slow performance
Solution:
# Increase heap size
java -Xmx2g -jar webharvest-ide-2.2.0.jar
# Or configure in start script
export JAVA_OPTS="-Xmx2g -Xms512m"
./start-ide.sh
Problem: Session panel empty after execution
Solution:
Download WebHarvest IDE and experience professional web scraping development
Java 11+ Required • Apache License 2.0 • 15+ MB Download