Pause/Resume
Execution

Interactive execution control for WebHarvest IDE

Temporarily halt scraper execution and continue later from the same point. Perfect for debugging, resource management, and interactive development workflows.

Key Features

Pause

Request a halt at any time; execution stops at the next plugin boundary with state preserved

Resume

Continue from the exact point where execution paused

Stop

Permanently cancel execution

API Control

Programmatic access via REST endpoints

Quick Start

Using UI Buttons

1. Run any configuration (click the ▶ Run button)
2. During execution, control buttons appear below the progress bar
3. Click ⏸ Pause to halt execution
4. Inspect variables in the Variables panel while paused
5. Click ▶ Resume to continue, or click ⏹ Stop to cancel permanently

Execution Control Buttons

Running State:
┌──────────────────────────────────┐
│ Progress: ████████░░░░░ 60%      │
│ [⏸ Pause]  [⏹ Stop]             │
└──────────────────────────────────┘

Paused State:
┌──────────────────────────────────┐
│ Status: ⏸ Paused                 │
│ [▶ Resume]  [⏹ Stop]            │
└──────────────────────────────────┘

API Reference

POST /api/execution/{id}/pause

Pauses a running execution.

Request

POST /api/execution/abc-123-def-456/pause

Response (Success)

{
  "success": true,
  "executionId": "abc-123-def-456",
  "action": "pause",
  "status": "PAUSED"
}

POST /api/execution/{id}/resume

Resumes a paused execution.

Request

POST /api/execution/abc-123-def-456/resume

POST /api/execution/{id}/stop

Stops execution permanently (cannot resume).

Request

POST /api/execution/abc-123-def-456/stop
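
The three endpoints share one URL pattern, so a client can wrap them in a single helper. The following is a minimal sketch, not part of WebHarvest itself: the names `controlUrl`, `controlExecution`, `baseUrl`, and `fetchImpl` are hypothetical, and it assumes that resume and stop return JSON in the same shape as the documented pause response, which this page does not confirm.

```javascript
// Sketch only: assumes all three endpoints accept a bare POST and answer
// with JSON, as the documented /pause response does.
const ACTIONS = ['pause', 'resume', 'stop'];

// Build the URL for a control action. baseUrl is a hypothetical parameter
// for clients that do not run on the same origin as the IDE.
function controlUrl(executionId, action, baseUrl = '') {
    if (!ACTIONS.includes(action)) {
        throw new Error(`Unknown action: ${action}`);
    }
    return `${baseUrl}/api/execution/${encodeURIComponent(executionId)}/${action}`;
}

// Send the control request and return the parsed JSON body.
// fetchImpl is injectable so the helper can be exercised without a server.
async function controlExecution(executionId, action, { baseUrl = '', fetchImpl = fetch } = {}) {
    const response = await fetchImpl(controlUrl(executionId, action, baseUrl), {
        method: 'POST'
    });
    if (!response.ok) {
        throw new Error(`${action} failed: HTTP ${response.status}`);
    }
    return response.json();
}
```

With this helper, pausing the example execution would read `await controlExecution('abc-123-def-456', 'pause')`. Rejecting unknown actions before any request is sent keeps typos such as `'restart'` from reaching the server.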

Use Cases

1. Interactive Debugging

Workflow

1. Run scraper
2. Pause after HTTP request
3. Inspect HTML in Variables panel
4. Verify XPath selectors work
5. Resume to continue
6. Pause again after next step
7. Iterate until working perfectly

2. Resource Management

Pause expensive scrapers during peak hours:

JavaScript

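// Pause a running execution during business hours (09:00-17:59 local time).
// executionId is the ID of the execution to pause.
async function pauseDuringPeakHours(executionId) {
    const hour = new Date().getHours();

    if (hour >= 9 && hour <= 17) {  // Business hours
        const response = await fetch(`/api/execution/${executionId}/pause`, {
            method: 'POST'
        });
        if (!response.ok) {
            throw new Error(`Pause failed: HTTP ${response.status}`);
        }
        console.log('Paused during peak hours');
    }
}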
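
A scheduler that pauses during peak hours will usually also want to resume afterwards. The sketch below shows one way to decide which action to send. The function name `peakHoursAction` is hypothetical, the 9 to 17 window mirrors the example above, and the status strings are the RUNNING and PAUSED values used elsewhere on this page. How the caller learns the current status is left open, since this page documents no status endpoint.

```javascript
// Sketch: decide which control action, if any, a scheduler should send.
// Returns 'pause', 'resume', or null when nothing needs to change.
function peakHoursAction(hour, status) {
    const isPeak = hour >= 9 && hour <= 17;  // same window as the example above
    if (isPeak && status === 'RUNNING') {
        return 'pause';    // running during peak hours: hold it
    }
    if (!isPeak && status === 'PAUSED') {
        return 'resume';   // paused outside peak hours: let it continue
    }
    return null;           // already in the desired state
}
```

Keeping the decision in a pure function separates the scheduling policy from the HTTP call, so the policy can be checked without a running execution.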

3. Manual Data Review

Pause to manually verify scraped data before continuing:

XML Configuration

<config>
  <!-- Scrape first page -->
  <http url="https://example.com/page1"/>
  <def var="page1">${http}</def>
  
  <!-- PAUSE HERE: Check if page1 looks correct -->
  
  <!-- Continue with processing -->
  <xpath expression="//data">
    <html-to-xml>${page1}</html-to-xml>
  </xpath>
</config>

Technical Implementation

How It Works

Pause mechanism:

1. Sets the paused flag in ExecutionJob
2. Execution thread checks the flag between plugins
3. Thread blocks on the pauseLock monitor
4. WebSocket sends PAUSED status to the frontend

Resume mechanism:

1. Clears the paused flag
2. Notifies the pauseLock monitor
3. Execution thread wakes up and continues
4. WebSocket sends RUNNING status

Stop mechanism:

1. Sets the stopped flag
2. Interrupts the execution thread
3. Session is marked as CANCELLED
4. Resources are cleaned up immediately
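
The flag-and-monitor pattern described above can be illustrated with a small JavaScript analogue. This is not the WebHarvest implementation, which blocks a thread on a monitor; here a Promise stands in for pauseLock, and the names `PauseGate`, `checkpoint`, and `runPlugins` are hypothetical. Status notifications over WebSocket and resource cleanup are left out.

```javascript
// Illustrative analogue only: a Promise plays the role of the pauseLock
// monitor, and "plugins" are plain async functions.
class PauseGate {
    constructor() {
        this.paused = false;
        this.stopped = false;
        this.waiters = [];   // resolvers for checkpoints blocked while paused
    }

    pause() {
        this.paused = true;  // takes effect at the next checkpoint
    }

    resume() {
        this.paused = false;
        this.wakeAll();
    }

    stop() {
        this.stopped = true;
        this.wakeAll();      // a paused run must wake up to observe the stop
    }

    wakeAll() {
        const waiters = this.waiters;
        this.waiters = [];
        waiters.forEach((wake) => wake());
    }

    // Called between plugins: waits while paused, throws once stopped.
    async checkpoint() {
        while (this.paused && !this.stopped) {
            await new Promise((resolve) => this.waiters.push(resolve));
        }
        if (this.stopped) {
            throw new Error('CANCELLED');
        }
    }
}

// Run plugins in order, consulting the gate only between plugins.
async function runPlugins(plugins, gate) {
    const results = [];
    for (const plugin of plugins) {
        await gate.checkpoint();
        results.push(await plugin());
    }
    return results;
}
```

Because the gate is consulted only between plugins, calling `pause()` while a plugin is running lets that plugin finish first. Calling `stop()` wakes a paused run so that it can observe the stopped flag and end instead of waiting indefinitely.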

Current Limitations (v2.2.0)

Basic Implementation

Pause granularity: Between plugins only (not during plugin execution)
Sleep plugin: Completes before pause takes effect
HTTP requests: Current request completes before pause
No persistence: Paused state lost on server restart

Advanced pause/resume with checkpoints is planned for v2.3.0.
