Training Jobs - Veri Documentation

Create Training Job

base_model

string

required

HuggingFace model ID (e.g., Qwen/Qwen3-4B, THUDM/CogVideoX-2b).

dataset_id

string

required

ID of the uploaded dataset to train on.

reward_function_id

string

ID of the uploaded reward function. Required for grpo unless environments is provided instead. Must be omitted for sft_video_gen.

method

string

default:"grpo"

Training method: grpo or sft_video_gen.

output_name

string

required

Name for the output model checkpoint.

hyperparameters

object

Training hyperparameters. The schema depends on the method field.

GRPO hyperparameters (method: grpo)

learning_rate

float

default:"1e-6"

Learning rate for the optimizer.

num_epochs

integer

default:"1"

Number of training epochs.

max_steps

integer

Optional explicit training step limit.

rollouts_per_prompt

integer

default:"8"

Number of completions generated per prompt.

kl_coef

float

default:"0.001"

KL penalty coefficient.

max_prompt_length

integer

default:"1024"

Prompt token cap.

max_response_length

integer

default:"2048"

Completion token cap.

global_batch_size

integer

default:"64"

Global batch size.

seed

integer

default:"42"

Random seed.

Video Gen SFT hyperparameters (method: sft_video_gen)

learning_rate

float

default:"1e-3"

Learning rate for the optimizer.

num_epochs

integer

default:"30"

Number of training epochs.

max_steps

integer

Optional explicit training step limit.

lora_rank

integer

default:"64"

LoRA adapter rank.

lora_alpha

integer

default:"64"

LoRA scaling factor.

resolution_height

integer

default:"480"

Video frame height in pixels.

resolution_width

integer

default:"720"

Video frame width in pixels.

num_frames

integer

default:"49"

Number of video frames per sample.

fps

integer

default:"8"

Frames per second.

batch_size

integer

default:"1"

Per-device batch size.

gradient_accumulation_steps

integer

default:"4"

Gradient accumulation steps.

seed

integer

default:"42"

Random seed.

gpu

object

required

Explicit GPU configuration. Include both type and count.

provider

string

Compute provider: prime_intellect, lambda, runpod. Omit to auto-select the cheapest available.

checkpoint_destination

object

Optional final checkpoint target. Defaults to { "type": "veri" }.

environments

object

OpenReward environments to use as the reward signal. Mutually exclusive with reward_function_id — pick one reward source. Each key is an environment ID, and the value contains configuration for that environment.

{
  "coding_sandbox": { "splits": ["train"] }
}

Requires OpenReward integration to be configured first via PUT /v1/settings/integrations/openreward.
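
The field rules above (required fields, gpu needing both type and count, and the choice between reward_function_id and environments) can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: check_job_request and its messages are hypothetical and not part of any Veri SDK, and the server remains the authority on what it accepts.

```python
# Hypothetical client-side pre-check; the rules mirror the field descriptions above.
REQUIRED_FIELDS = ("base_model", "dataset_id", "output_name", "gpu")


def check_job_request(payload: dict) -> list[str]:
    """Return a list of problems found in a create-job payload (empty if none)."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in payload]

    gpu = payload.get("gpu")
    if isinstance(gpu, dict) and not all(key in gpu for key in ("type", "count")):
        problems.append("gpu must include both type and count")

    method = payload.get("method", "grpo")  # documented default
    has_reward_fn = "reward_function_id" in payload
    has_envs = "environments" in payload

    if method == "grpo":
        if has_reward_fn and has_envs:
            problems.append("reward_function_id and environments are mutually exclusive")
        elif not (has_reward_fn or has_envs):
            problems.append("grpo needs reward_function_id or environments")
    elif method == "sft_video_gen":
        if has_reward_fn:
            problems.append("reward_function_id must be omitted for sft_video_gen")
    else:
        problems.append(f"unknown method: {method}")
    return problems
```

An empty list means the payload passed these local checks only; it does not guarantee that the IDs exist or that the API will accept the job.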

Request (GRPO)

curl -X POST https://api.veri.studio/v1/training_jobs \
  -H "Authorization: Bearer vk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "base_model": "Qwen/Qwen3-4B",
    "dataset_id": "ds_abc123",
    "reward_function_id": "rf_def456",
    "method": "grpo",
    "output_name": "qwen3-4b-math",
    "hyperparameters": {
      "learning_rate": 1e-6,
      "num_epochs": 1,
      "rollouts_per_prompt": 8,
      "max_response_length": 2048,
      "global_batch_size": 64
    },
    "gpu": {
      "type": "A100-80GB",
      "count": 1
    },
    "checkpoint_destination": {
      "type": "veri"
    }
  }'
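
For callers who prefer Python over curl, the same request can be built with the standard library alone. This is a sketch, not an official client: build_create_request and create_job are hypothetical helper names, and the only details taken from this page are the URL, the two headers, and the JSON body.

```python
import json
import urllib.request

API_BASE = "https://api.veri.studio/v1"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /training_jobs request."""
    return urllib.request.Request(
        f"{API_BASE}/training_jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(api_key: str, payload: dict) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_create_request(api_key, payload)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect or log the request before any network call is made.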

Request (GRPO with OpenReward Environment)

curl -X POST https://api.veri.studio/v1/training_jobs \
  -H "Authorization: Bearer vk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "base_model": "Qwen/Qwen3-4B",
    "dataset_id": "ds_abc123",
    "environments": {
      "coding_sandbox": { "splits": ["train"] }
    },
    "method": "grpo",
    "output_name": "qwen3-4b-coding",
    "hyperparameters": {
      "learning_rate": 1e-6,
      "num_epochs": 1,
      "rollouts_per_prompt": 8,
      "max_response_length": 2048
    },
    "gpu": {
      "type": "A100-80GB",
      "count": 1
    }
  }'

When using environments, omit reward_function_id. The environment provides the reward signal via OpenReward’s evaluate endpoint. You must configure your OpenReward API key first via Settings → Integrations.

Request (Video Gen SFT)

curl -X POST https://api.veri.studio/v1/training_jobs \
  -H "Authorization: Bearer vk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "base_model": "THUDM/CogVideoX-2b",
    "dataset_id": "ds_vid123",
    "method": "sft_video_gen",
    "output_name": "cogvideo-custom",
    "hyperparameters": {
      "learning_rate": 1e-3,
      "num_epochs": 30,
      "lora_rank": 64
    },
    "gpu": {
      "type": "A100-80GB",
      "count": 1
    }
  }'
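
The request above sets only three hyperparameters, so the remaining ones fall back to the documented defaults. The sketch below shows one way to preview the resulting configuration locally. It rests on assumptions: resolve_hyperparameters is a hypothetical helper, max_steps is treated as unset (None) when omitted, and rejecting keys that belong to the other method is a client-side choice rather than confirmed server behavior.

```python
# Defaults copied from the hyperparameter listings on this page.
GRPO_DEFAULTS = {
    "learning_rate": 1e-6, "num_epochs": 1, "max_steps": None,
    "rollouts_per_prompt": 8, "kl_coef": 0.001,
    "max_prompt_length": 1024, "max_response_length": 2048,
    "global_batch_size": 64, "seed": 42,
}
SFT_VIDEO_GEN_DEFAULTS = {
    "learning_rate": 1e-3, "num_epochs": 30, "max_steps": None,
    "lora_rank": 64, "lora_alpha": 64,
    "resolution_height": 480, "resolution_width": 720,
    "num_frames": 49, "fps": 8,
    "batch_size": 1, "gradient_accumulation_steps": 4, "seed": 42,
}
DEFAULTS = {"grpo": GRPO_DEFAULTS, "sft_video_gen": SFT_VIDEO_GEN_DEFAULTS}


def resolve_hyperparameters(method: str, overrides: dict) -> dict:
    """Merge caller overrides onto the documented defaults for a method."""
    defaults = DEFAULTS[method]
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        # Assumption: keys from the other method's schema are a caller mistake.
        raise ValueError(f"not valid for {method}: {', '.join(unknown)}")
    return {**defaults, **overrides}
```

For example, resolving the three values sent in the request above yields a full configuration with num_frames at 49 and gradient_accumulation_steps at 4.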

Response

{
  "id": "job_xyz789",
  "object": "training_job",
  "status": "queued",
  "base_model": "Qwen/Qwen3-4B",
  "dataset_id": "ds_abc123",
  "reward_function_id": "rf_def456",
  "method": "grpo",
  "output_name": "qwen3-4b-math",
  "hyperparameters": {
    "learning_rate": 1e-6,
    "num_epochs": 1,
    "max_steps": null,
    "rollouts_per_prompt": 8,
    "kl_coef": 0.001,
    "max_prompt_length": 1024,
    "max_response_length": 2048,
    "global_batch_size": 64,
    "seed": 42
  },
  "dashboard_url": null,
  "download_url": null,
  "error": null,
  "gpu": null,
  "gpu_requested": {
    "type": "A100-80GB",
    "count": 1
  },
  "provider": null,
  "duration_seconds": null,
  "cost_usd": null,
  "started_at": null,
  "completed_at": null,
  "created_at": "2026-04-14T12:10:00Z"
}

List Training Jobs

limit

integer

default:"20"

Maximum number of jobs to return.

after

string

Cursor for pagination. Pass the ID of the last item from the previous page.

status

string

Filter by job status: queued, provisioning, running, completed, failed, cancelled.

method

string

Filter by training method: grpo, sft_video_gen.

Request

curl "https://api.veri.studio/v1/training_jobs?status=running&method=grpo&limit=10" \
  -H "Authorization: Bearer vk_your_api_key"

Response

{
  "object": "list",
  "data": [
    {
      "id": "job_xyz789",
      "object": "training_job",
      "status": "running",
      "base_model": "Qwen/Qwen3-4B",
      "method": "grpo",
      "output_name": "qwen3-4b-math",
      "provider": "prime_intellect",
      "dashboard_url": "https://Training in progress",
      "created_at": "2026-04-14T12:10:00Z",
      "started_at": "2026-04-14T12:11:00Z"
    }
  ],
  "has_more": false
}
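
Because the list endpoint is cursor-paginated, retrieving every job means repeating the call with after set to the last ID of the previous page until has_more is false. A minimal sketch of that loop follows; fetch_page is a hypothetical callable that performs the GET request with the given cursor and returns the decoded JSON body.

```python
from typing import Callable, Iterator, Optional


def iter_jobs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every job, following the `after` cursor until has_more is false."""
    cursor = None
    while True:
        page = fetch_page(cursor)      # cursor is None for the first page
        jobs = page["data"]
        yield from jobs
        # Stop on the last page; the empty-page check guards against looping forever.
        if not page.get("has_more") or not jobs:
            return
        cursor = jobs[-1]["id"]        # ID of the last item becomes the next cursor
```

Passing the HTTP call in as a function keeps the loop independent of any particular HTTP library and easy to exercise with canned pages.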

Get Training Job

Request

curl https://api.veri.studio/v1/training_jobs/job_xyz789 \
  -H "Authorization: Bearer vk_your_api_key"

Response

{
  "id": "job_xyz789",
  "object": "training_job",
  "status": "running",
  "base_model": "Qwen/Qwen3-4B",
  "dataset_id": "ds_abc123",
  "reward_function_id": "rf_def456",
  "method": "grpo",
  "output_name": "qwen3-4b-math",
  "hyperparameters": {
    "learning_rate": 1e-6,
    "num_epochs": 1,
    "max_steps": null,
    "rollouts_per_prompt": 8,
    "kl_coef": 0.001,
    "max_prompt_length": 1024,
    "max_response_length": 2048,
    "global_batch_size": 64,
    "seed": 42
  },
  "dashboard_url": "https://Training in progress",
  "download_url": null,
  "error": null,
  "gpu": {
    "type": "A100-80GB",
    "count": 1
  },
  "gpu_requested": {
    "type": "A100-80GB",
    "count": 1
  },
  "provider": "prime_intellect",
  "cost_usd": null,
  "created_at": "2026-04-14T12:10:00Z",
  "started_at": "2026-04-14T12:11:00Z",
  "completed_at": null
}

Job Statuses

Status	Description
`queued`	Job row created and waiting to be submitted
`provisioning`	Backend accepted the job and compute is starting
`running`	Training is in progress
`completed`	Training finished successfully; checkpoint is available
`failed`	Training failed; check `error.message` and `error.code`
`cancelled`	Job was cancelled by the user
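
Of the statuses above, completed, failed, and cancelled are terminal, so a client that waits on a job can poll until one of them appears. The following is a sketch only: wait_for_job and get_job are hypothetical names, and the 30-second default interval is an assumption, not a documented recommendation.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(get_job: Callable[[], dict], poll_seconds: float = 30.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job reaches a terminal status, then return it."""
    while True:
        job = get_job()                # hypothetical wrapper around Get Training Job
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)            # queued, provisioning, or running: keep waiting
```

The sleep function is injectable so the loop can be run without real delays; production callers would usually also add an overall timeout.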

Get Job Events

Returns the lifecycle event history for a job.

limit

integer

default:"50"

Maximum number of events to return.

Request

curl "https://api.veri.studio/v1/training_jobs/job_xyz789/events?limit=20" \
  -H "Authorization: Bearer vk_your_api_key"

Response

{
  "items": [
    {
      "event_type": "created",
      "timestamp": "2026-04-14T12:10:00Z",
      "metadata": {}
    },
    {
      "event_type": "submitted",
      "timestamp": "2026-04-14T12:10:05Z",
      "metadata": {}
    },
    {
      "event_type": "running",
      "timestamp": "2026-04-14T12:11:00Z",
      "metadata": {}
    }
  ]
}
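
One use of the event history is to measure how long a job spent between lifecycle events, for instance from submission to the start of training. The sketch below assumes timestamps always use the UTC "Z" suffix shown above and sorts the events itself, since the response ordering is not specified here; phase_durations is a hypothetical helper.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def phase_durations(items: list[dict]) -> list[tuple[str, str, float]]:
    """Return (from_event, to_event, seconds) for each consecutive pair of events."""
    ordered = sorted(items, key=lambda e: parse_ts(e["timestamp"]))
    return [
        (a["event_type"], b["event_type"],
         (parse_ts(b["timestamp"]) - parse_ts(a["timestamp"])).total_seconds())
        for a, b in zip(ordered, ordered[1:])
    ]
```

Applied to the sample response, this gives 5 seconds from created to submitted and 55 seconds from submitted to running.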

Get Training Metrics

Returns sampled per-step training metrics (loss, reward, KL, learning rate, and any other numeric value emitted by the trainer). Metrics are sampled at roughly 5-second intervals: the worker throttles status callbacks, so when several steps fall within one window, only the most recent is kept.

keys

string

Comma-separated list of metric keys to return (e.g. loss,reward). When omitted, all available keys are returned.

from_step

integer

Lower bound (inclusive). Pass latest_step + 1 from the previous response for incremental polling.

to_step

integer

Upper bound (inclusive).

max_points

integer

default:"500"

Server-side stride downsample target. Long runs are reduced to at most this many evenly-spaced points.

Request

curl "https://api.veri.studio/v1/training_jobs/job_xyz789/metrics?keys=loss,reward&max_points=500" \
  -H "Authorization: Bearer vk_your_api_key"

Response

{
  "metrics": [
    { "step": 0, "values": { "loss": 1.42, "reward": 0.10 } },
    { "step": 5, "values": { "loss": 1.31, "reward": 0.18 } },
    { "step": 10, "values": { "loss": 1.20, "reward": 0.25 } }
  ],
  "available_keys": ["loss", "reward", "kl", "learning_rate"],
  "latest_step": 1234
}

available_keys always lists every metric key persisted for the job, regardless of any keys filter. latest_step is the highest step recorded for the job, unaffected by stride downsampling, so clients can pass latest_step + 1 as the next from_step when polling.
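
The incremental polling rule from the from_step description (pass latest_step + 1 from the previous response) can be wrapped in a small stateful helper. This is a sketch under assumptions: MetricsCursor is a hypothetical class, fetch stands in for a GET to the metrics endpoint with the given from_step, and a missing or null latest_step is assumed to mean that no metrics exist yet.

```python
from typing import Callable, Optional


class MetricsCursor:
    """Tracks the next from_step across successive metrics calls."""

    def __init__(self, fetch: Callable[[Optional[int]], dict]):
        self._fetch = fetch                      # hypothetical HTTP wrapper
        self._from_step: Optional[int] = None    # None: no lower bound on first call

    def poll(self) -> list[dict]:
        """Fetch only the points recorded since the previous poll."""
        resp = self._fetch(self._from_step)
        latest = resp.get("latest_step")
        if latest is not None:
            self._from_step = latest + 1         # documented polling rule
        return resp["metrics"]
```

Each call to poll() returns only new points, so a dashboard can append them to an existing series without deduplicating.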

Cancel Training Job

Request

curl -X POST https://api.veri.studio/v1/training_jobs/job_xyz789/cancel \
  -H "Authorization: Bearer vk_your_api_key"

Response

{
  "id": "job_xyz789",
  "object": "training_job",
  "status": "cancelled"
}

Only non-terminal jobs can be cancelled. If the job is already completed, failed, or cancelled, the API returns 400.
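
Since a cancel request can race with the job finishing on its own, callers may want to treat the 400 response as an expected outcome rather than a failure. The sketch below does that with the standard library; cancel_job is a hypothetical helper, and reading every 400 as "already terminal" is a simplification, because this page does not say whether other conditions can also produce a 400.

```python
import urllib.error
import urllib.request
from typing import Callable


def cancel_job(job_id: str, api_key: str,
               opener: Callable = urllib.request.urlopen) -> bool:
    """Return True if the cancel was accepted, False if the API answered 400."""
    req = urllib.request.Request(
        f"https://api.veri.studio/v1/training_jobs/{job_id}/cancel",
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    try:
        with opener(req):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 400:   # job already completed, failed, or cancelled
            return False
        raise                 # anything else (401, 404, 5xx) is a real error
```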

Stream Job Logs

Returns job logs as Server-Sent Events (SSE).

Real-time log streaming from training is not yet implemented. Currently this endpoint returns a link to the provider dashboard where logs can be viewed. Full SSE streaming of training progress (step, loss, reward) is planned.

Request

curl -N https://api.veri.studio/v1/training_jobs/job_xyz789/logs \
  -H "Authorization: Bearer vk_your_api_key"

Response (SSE stream)

data: Logs available at: https://Training in progress
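
A client reading this stream needs to extract the payload from each data: line. The following is a minimal SSE reader sketch that handles only data fields; iter_sse_data is a hypothetical helper, and it is deliberately lenient in one respect that differs from the SSE specification, as noted in the comments.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                 # a blank line ends the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.
    if buffer:
        # Lenient: the SSE spec discards an unterminated final event; this keeps it.
        yield "\n".join(buffer)
```

With the current behavior of this endpoint, the first yielded string would be the "Logs available at: ..." message pointing to the provider dashboard.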

Get Model Checkpoint

Returns a download URL for the trained model checkpoint. Only available for jobs with completed status.

Request

curl https://api.veri.studio/v1/training_jobs/job_xyz789/model \
  -H "Authorization: Bearer vk_your_api_key"

Response

{
  "download_url": "https://storage.veri.studio/checkpoints/job_xyz789/model.tar.gz"
}

The same URL is also surfaced as download_url on completed job responses when checkpoint resolution succeeds.
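
Since the URL can arrive through either route, a client can use the job response first and call the model endpoint only when download_url is missing. A sketch of that fallback follows; checkpoint_url is a hypothetical helper, and fetch_model stands in for a GET to the model endpoint that returns the decoded JSON body.

```python
from typing import Callable, Optional


def checkpoint_url(job: dict, fetch_model: Callable[[str], dict]) -> Optional[str]:
    """Return the checkpoint download URL, or None if the job is not completed."""
    if job.get("status") != "completed":
        return None                    # checkpoints exist only for completed jobs
    # Prefer the URL already embedded in the job response when present.
    if job.get("download_url"):
        return job["download_url"]
    # Otherwise ask the model endpoint (fetch_model is a hypothetical wrapper).
    return fetch_model(job["id"])["download_url"]
```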
