Rollout Collection

The rollout collection module (webgym/environment/) handles trajectory collection through browser interactions.

Module Structure

webgym/environment/
├── async_webgym.py           # Main async rollout environment
├── client.py                 # HTTP client for browser server
├── task_monitor.py           # Task progress monitoring
├── process_isolator.py       # Process isolation for stability
├── pickleable_http_functions.py  # Serializable HTTP operations
├── actions.py                # Action definitions
└── foundry_endpoints_models.py   # API endpoint models

AsyncWebGym

The AsyncWebGym class (async_webgym.py) is the main environment for collecting trajectories.

from webgym.environment.async_webgym import AsyncWebGym

# token, tasks, and retry_config are assumed to be defined by the caller.
env = AsyncWebGym(
    master_port=7000,
    host_ip="localhost",
    cpu_cluster_token=token,
    sampled_tasks=tasks,
    save_path="/path/to/save",
    num_workers=20,
    verbose=True,
    retry_policy=retry_config,
    task_timeout_minutes=20,
    completion_threshold=0.98,
    completion_grace_period=120,
    split='train',
    interaction_mode='coordinates'
)

Key Parameters:

num_workers: Number of concurrent browser instances
task_timeout_minutes: Maximum time allowed per task, in minutes
completion_threshold: Fraction of tasks to complete before killing stragglers (e.g., 0.98)
completion_grace_period: Seconds to wait before killing remaining tasks
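
The interplay between completion_threshold and completion_grace_period can be sketched as follows. This is a minimal illustration, not the AsyncWebGym implementation: StragglerPolicy and should_stop are hypothetical names, and the sketch assumes the grace period starts counting from the moment the completion threshold is first reached.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StragglerPolicy:
    """Hypothetical helper illustrating the cutoff rule; not part of webgym."""

    completion_threshold: float = 0.98      # fraction of tasks that must finish
    completion_grace_period: float = 120.0  # seconds granted to stragglers
    _threshold_hit_at: Optional[float] = None

    def should_stop(self, completed: int, total: int, now: float) -> bool:
        """Return True once stragglers should be killed."""
        if total <= 0:
            return False
        # Below the threshold: keep waiting for tasks to finish.
        if completed / total < self.completion_threshold:
            return False
        # Record the first time the threshold was reached (assumed semantics).
        if self._threshold_hit_at is None:
            self._threshold_hit_at = now
        # Stop only after the grace period has fully elapsed.
        return now - self._threshold_hit_at >= self.completion_grace_period
```

A monitoring loop would call should_stop periodically with the current completion count and a monotonic timestamp such as time.monotonic(), and cancel the remaining tasks once it returns True.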

Evaluation Integration

AsyncWebGym integrates with the Evaluator class (see Models) for reward computation:

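# Compute the reward; the call also returns the evaluation and a blocked flag.
reward_value, evaluation, is_blocked = agent.evaluator.get_verifiable_reward(trajectory)
# The blocked check is also exposed as a separate method.
is_blocked = agent.evaluator.check_if_blocked(trajectory)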