Workflow Schema Reference

Complete YAML schema reference for Ploston workflows.

Canonical Example

This example covers the core fields. Step-level options such as when, mcp, and on_missing_tool are described under Steps. Copy it as a starting point for your workflows.

# ─────────────────────────────────────────────────────────────────
# METADATA (required)
# ─────────────────────────────────────────────────────────────────
name: data-pipeline                    # Required: Workflow identifier
version: "1.0.0"                       # Required: Semantic version
description: "Fetch, transform, and validate data"  # Optional

# ─────────────────────────────────────────────────────────────────
# PACKAGES (optional)
# ─────────────────────────────────────────────────────────────────
packages:
  profile: standard                    # minimal | standard | data_science
  additional:                          # Extra packages to allow
    - requests

# ─────────────────────────────────────────────────────────────────
# DEFAULTS (optional)
# ─────────────────────────────────────────────────────────────────
defaults:
  timeout: 30                          # Default step timeout (seconds)
  on_error: fail                       # fail | continue | retry
  retry:                               # Retry config (when on_error: retry)
    max_attempts: 3
    initial_delay: 1.0
    max_delay: 30.0
    backoff_multiplier: 2.0

# ─────────────────────────────────────────────────────────────────
# INPUTS (optional, but usually needed)
# Format: Array of input definitions
# ─────────────────────────────────────────────────────────────────
inputs:
  # Simple syntax: just the name (required, type: string)
  - url

  # With default value (makes it optional)
  - format: "json"

  # Full definition with all options
  - count:
      type: integer                    # string | integer | number | boolean | array | object
      required: false                  # Default: true
      default: 10                      # Default value
      description: "Number of items"   # For documentation
      minimum: 1                       # Validation: minimum value
      maximum: 100                     # Validation: maximum value

  # Enum constraint
  - output_format:
      type: string
      enum: ["json", "csv", "xml"]     # Allowed values
      default: "json"

  # Pattern constraint
  - email:
      type: string
      pattern: "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$"

# ─────────────────────────────────────────────────────────────────
# STEPS (required, at least one)
# ─────────────────────────────────────────────────────────────────
steps:
  # Tool step: calls an MCP tool
  - id: fetch                          # Required: unique step identifier
    tool: http_get                     # MCP tool name
    params:                            # Tool parameters (templates allowed)
      url: "{{ inputs.url }}"
      headers:
        Accept: "application/json"
    timeout: 60                        # Override default timeout
    on_error: retry                    # Override default error handling
    retry:
      max_attempts: 3
      initial_delay: 2.0

  # Code step: runs Python in sandbox
  - id: transform
    code: |
      import json

      # Access previous step output
      data = context.steps['fetch'].output

      # Access inputs
      limit = context.inputs.get('count', 10)

      # Process data
      items = data.get('items', [])[:limit]

      # Return result (available as steps.transform.output)
      return {"items": items, "count": len(items)}

  # Step with dependency
  - id: validate
    depends_on: [transform]            # Wait for these steps first
    code: |
      data = context.steps['transform'].output
      if data['count'] == 0:
          raise ValueError("No items found")
      return {"valid": True, "count": data['count']}

# ─────────────────────────────────────────────────────────────────
# OUTPUTS (optional)
# ─────────────────────────────────────────────────────────────────

# Option 1: Single output (simple)
output: "{{ steps.validate.output }}"

# Option 2: Multiple named outputs (use this OR output, not both)
# outputs:
#   - name: result
#     from_path: steps.validate.output
#     description: "Validation result"
#   - name: item_count
#     value: "{{ steps.transform.output.count }}"
#     description: "Number of items processed"

Top-Level Structure

# Required
name: string          # Workflow identifier (alphanumeric, hyphens)
version: string       # Semantic version (e.g., "1.0", "2.1.3")

# Optional
description: string   # Human-readable description
packages: object      # Python package configuration
defaults: object      # Default step settings

# Schema
inputs: array         # Input definitions (array format)
steps: array          # Step definitions (required, at least one)
outputs: array        # Output definitions (optional)
output: string        # Single output expression (alternative to outputs)

Metadata

`name` (required)

Unique workflow identifier.

Type: string
Pattern: ^[a-zA-Z][a-zA-Z0-9-]*$
Example: data-transform, hello-world
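
The pattern can be checked locally before a workflow is submitted. A minimal sketch in Python, assuming the pattern must match the whole name; the helper `is_valid_workflow_name` is hypothetical and not part of Ploston:

```python
import re

# Pattern copied from the reference entry above.
NAME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9-]*$")


def is_valid_workflow_name(name: str) -> bool:
    """Return True if `name` matches the documented workflow-name pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_workflow_name("data-transform"))  # True
print(is_valid_workflow_name("2fast"))           # False: must start with a letter
print(is_valid_workflow_name("data_transform"))  # False: underscores are not allowed
```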

`version` (required)

Semantic version string.

Type: string
Example: "1.0", "2.1.3"

`description` (optional)

Human-readable description.

Type: string
Example: "Transform and validate JSON data"

Packages Configuration

packages:
  profile: string     # Package profile: minimal | standard | data_science
  additional: array   # Additional packages to install

Profiles

Profile	Packages
`minimal`	json, re, datetime, math
`standard`	minimal + collections, itertools, functools, hashlib, uuid
`data_science`	standard + numpy, pandas (if available)

Defaults

defaults:
  timeout: integer    # Default step timeout (seconds)
  on_error: string    # Error handling: fail | continue | retry
  retry: object       # Retry configuration

Retry Configuration

defaults:
  retry:
    max_attempts: 3           # Maximum retry attempts
    initial_delay: 1.0        # Initial delay (seconds)
    max_delay: 30.0           # Maximum delay (seconds)
    backoff_multiplier: 2.0   # Exponential backoff multiplier
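
To see how these four settings interact, the sketch below computes the wait before each retry using ordinary exponential backoff capped at max_delay. It is an illustration only: the function name `backoff_delays` is hypothetical, and it assumes that max_attempts counts the first try and that no jitter is applied, neither of which this reference confirms.

```python
def backoff_delays(max_attempts: int = 3,
                   initial_delay: float = 1.0,
                   max_delay: float = 30.0,
                   backoff_multiplier: float = 2.0) -> list[float]:
    """Return the delay (seconds) to wait before each retry.

    Assumption: max_attempts includes the first try, so there are
    max_attempts - 1 retries. Ploston may count differently.
    """
    delays = []
    delay = initial_delay
    for _ in range(max(max_attempts - 1, 0)):
        delays.append(min(delay, max_delay))  # never wait longer than max_delay
        delay *= backoff_multiplier           # grow the delay for the next retry
    return delays


print(backoff_delays())                # [1.0, 2.0]
print(backoff_delays(max_attempts=8))  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```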

Inputs

Format: inputs is an array (list) of input definitions.

Ploston supports three syntaxes for input definitions:

Syntax 1: Simple String (Required Input)

inputs:
  - url                    # Required string input named "url"
  - topic                  # Required string input named "topic"

Syntax 2: Name with Default (Optional Input)

inputs:
  - format: "json"         # Optional, defaults to "json"
  - count: 10              # Optional, defaults to 10

Syntax 3: Full Definition (All Options)

inputs:
  - url:
      type: string         # Optional (default: string): string | integer | number | boolean | array | object
      required: true       # Optional: default is true
      default: null        # Optional: default value (makes input optional)
      description: "URL"   # Optional: human-readable description
      enum: [...]          # Optional: allowed values
      pattern: "^https?"   # Optional: regex pattern (strings only)
      minimum: 1           # Optional: minimum value (numbers only)
      maximum: 100         # Optional: maximum value (numbers only)
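
The three syntaxes describe the same kind of object at different levels of detail. The sketch below shows one way to normalize a parsed inputs entry into a uniform definition. It is not Ploston's parser: `normalize_input` is a hypothetical helper, and it assumes that any mapping value is a full definition and that the mere presence of a default key makes an input optional unless required is set explicitly.

```python
from typing import Any


def normalize_input(entry: Any) -> dict:
    """Convert one parsed `inputs` entry into a uniform definition dict."""
    # Syntax 1: bare name -> required string input.
    if isinstance(entry, str):
        return {"name": entry, "type": "string", "required": True}

    if not (isinstance(entry, dict) and len(entry) == 1):
        raise ValueError(f"unsupported input entry: {entry!r}")
    name, value = next(iter(entry.items()))

    # Syntax 3: full definition (assumption: any mapping value is a definition).
    if isinstance(value, dict):
        spec = {"name": name, "type": "string", **value}
        # Required unless stated otherwise or a default is supplied.
        spec.setdefault("required", "default" not in value)
        return spec

    # Syntax 2: name with a default value -> optional input.
    return {"name": name, "default": value, "required": False}


for entry in ["url", {"format": "json"}, {"count": {"type": "integer", "default": 10}}]:
    print(normalize_input(entry))
```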

Input Types

Type	JSON Type	Example	Notes
`string`	string	`"hello"`	Default type if not specified
`integer`	number	`42`	Whole numbers only
`number`	number	`3.14`	Any numeric value
`boolean`	boolean	`true`	true or false
`array`	array	`[1, 2, 3]`	JSON array
`object`	object	`{"key": "value"}`	JSON object

Complete Input Examples

inputs:
  # Simple required inputs
  - url
  - topic

  # With default values
  - format: "json"
  - retries: 3

  # Full definitions
  - count:
      type: integer
      required: false
      default: 10
      description: "Number of items to fetch"
      minimum: 1
      maximum: 100

  - output_format:
      type: string
      enum: ["json", "csv", "xml"]
      default: "json"
      description: "Output format"

  - email:
      type: string
      required: true
      description: "Contact email"
      pattern: "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$"
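
The constraints above (type, enum, pattern, minimum, maximum) can be pictured as a sequence of checks on the supplied value. The following Python sketch illustrates that idea. `validate_input` is a hypothetical helper, the error messages are invented, and the use of `re.search` for pattern matching is an assumption, since this reference does not say whether patterns are anchored implicitly.

```python
import re
from typing import Any

# Python types accepted for each documented input type.
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate_input(value: Any, spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the value passes."""
    type_name = spec.get("type", "string")
    expected = _TYPES.get(type_name)
    if expected is None:
        return [f"unknown type {type_name!r}"]
    # bool is a subclass of int in Python, so reject it for numeric types.
    is_bool = isinstance(value, bool)
    if not isinstance(value, expected) or (is_bool and type_name != "boolean"):
        return [f"expected {type_name}, got {type(value).__name__}"]

    errors = []
    if "enum" in spec and value not in spec["enum"]:
        errors.append(f"{value!r} is not one of {spec['enum']}")
    # Assumption: patterns are searched, so anchors must be written explicitly.
    if type_name == "string" and "pattern" in spec:
        if not re.search(spec["pattern"], value):
            errors.append(f"{value!r} does not match {spec['pattern']}")
    if type_name in ("integer", "number"):
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{value} is below minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            errors.append(f"{value} is above maximum {spec['maximum']}")
    return errors


count_spec = {"type": "integer", "minimum": 1, "maximum": 100}
print(validate_input(10, count_spec))   # []
print(validate_input(250, count_spec))  # ['250 is above maximum 100']
```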

Required vs Optional

Condition	Required?
Simple string syntax (`- url`)	✅ Required
Has `default` value	❌ Optional
`required: true` (explicit)	✅ Required
`required: false` (explicit)	❌ Optional

Steps

steps:
  - id: string          # Step identifier (required)

    # Type (exactly one required)
    tool: string        # MCP tool name
    code: string        # Python code block

    # Tool parameters (tool steps only)
    params: object      # Tool parameters

    # MCP server routing (tool steps only)
    mcp: string         # MCP server hosting the tool

    # Dependencies
    depends_on: array   # List of step IDs to wait for

    # Conditional execution
    when: string        # Jinja2 expression — skip step if falsy

    # Error handling
    timeout: integer    # Step timeout (seconds)
    on_error: string    # Error handling: fail | continue | retry
    retry: object       # Retry configuration

    # Missing tool handling (tool steps only)
    on_missing_tool: string  # "skip" to skip gracefully if tool unavailable

Tool Step

steps:
  - id: fetch
    tool: http_get
    params:
      url: "{{ inputs.url }}"
      headers:
        Authorization: "Bearer {{ inputs.token }}"

Code Step

steps:
  - id: process
    code: |
      import json
      data = json.loads('{{ inputs.data }}')
      result = {"processed": data}

Dependencies

steps:
  - id: step1
    code: |
      result = "first"

  - id: step2
    depends_on: [step1]
    code: |
      result = "second"

  - id: step3
    depends_on: [step1, step2]
    code: |
      result = "third"
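
A depends_on list constrains the order in which steps may run. The sketch below derives one valid order with a topological sort from the Python standard library. It only illustrates the ordering constraint: `execution_order` is a hypothetical helper, and this reference does not describe how Ploston actually schedules steps (for example, whether independent steps run in parallel).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(steps: list[dict]) -> list[str]:
    """Return step IDs in an order that satisfies every depends_on."""
    graph = {step["id"]: set(step.get("depends_on", [])) for step in steps}
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"depends_on references unknown steps: {sorted(unknown)}")
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


steps = [
    {"id": "step1"},
    {"id": "step2", "depends_on": ["step1"]},
    {"id": "step3", "depends_on": ["step1", "step2"]},
]
print(execution_order(steps))  # ['step1', 'step2', 'step3']
```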

Conditional Execution (`when`)

A Jinja2 expression evaluated before the step runs. If it evaluates to a falsy value ("", 0, false, null, or []), the step is skipped. Downstream steps that list a skipped step in depends_on still run normally.

steps:
  - id: notify
    tool: send_email
    when: "{{ steps.validate.output.has_errors }}"
    params:
      to: "{{ inputs.email }}"
      subject: "Validation failed"
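
The skip rule can be illustrated with a small simulation. The sketch below assumes each when expression has already been evaluated to a native value and that steps are listed in dependency order; `simulate` is a hypothetical helper, not Ploston's engine. It shows that a skipped step counts as finished, so a dependent step still runs.

```python
def simulate(steps: list[dict]) -> dict[str, str]:
    """Return the status of each step after applying the `when` rule."""
    statuses: dict[str, str] = {}
    for step in steps:  # assumed to be listed in dependency order
        # Dependencies only need to have finished; "skipped" counts as finished.
        for dep in step.get("depends_on", []):
            if dep not in statuses:
                raise ValueError(f"{step['id']} depends on unfinished step {dep}")
        # `when` holds the already-evaluated expression; absent means always run.
        runs = bool(step.get("when", True))
        statuses[step["id"]] = "completed" if runs else "skipped"
    return statuses


steps = [
    {"id": "validate"},
    {"id": "notify", "depends_on": ["validate"], "when": []},  # falsy, so skipped
    {"id": "report", "depends_on": ["notify"]},                # still runs
]
print(simulate(steps))
# {'validate': 'completed', 'notify': 'skipped', 'report': 'completed'}
```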

MCP Server Routing (`mcp`)

Specifies which MCP server hosts the tool. Required when multiple runners have a server with the same name; otherwise optional, but useful for making routing explicit.

steps:
  - id: list_prs
    tool: list_pull_requests
    mcp: github
    params:
      owner: "{{ inputs.owner }}"
      repo: "{{ inputs.repo }}"

When mcp: and a runner are specified, Ploston internally resolves the canonical tool name as <runner>__<mcp>__<tool>. You do not need to write this format in YAML: use the bare tool name with mcp:, and optionally defaults.runner.
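
For reference, the canonical name is the three parts joined by double underscores. A one-function Python sketch of that format follows; `canonical_tool_name` is a hypothetical helper, and the runner name is borrowed from the defaults.runner example elsewhere on this page.

```python
def canonical_tool_name(runner: str, mcp: str, tool: str) -> str:
    """Join the parts in the documented <runner>__<mcp>__<tool> format."""
    return "__".join((runner, mcp, tool))


print(canonical_tool_name("my-homelab", "github", "list_pull_requests"))
# my-homelab__github__list_pull_requests
```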

Missing Tool Handling (`on_missing_tool`)

When set to skip, the step is skipped gracefully if the referenced tool is not currently available (e.g. runner is disconnected), rather than failing the workflow. Only applies to tool steps.

steps:
  - id: optional_notify
    tool: send_slack_message
    on_missing_tool: skip
    params:
      channel: "#alerts"
      text: "Done"

Defaults: `runner`

Sets the default runner for all tool steps in the workflow, so routing does not have to be repeated on every step when all tools come from one runner. It can be overridden per step via mcp:.

defaults:
  timeout: 30
  on_error: fail
  runner: my-homelab   # all tool steps resolve against this runner by default

Runner disambiguation priority:

1. mcp: + explicit runner arg (future F-072)
2. defaults.runner in workflow YAML
3. Runner name from bridge context (X-Ploston-Runner header)
4. Single-match inference from RunnerRegistry
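
The priority list can be read as a first-match lookup. The Python sketch below assumes the list is ordered from highest to lowest priority. `resolve_runner` and its parameter names are hypothetical, and returning None for an ambiguous lookup is an illustrative choice, not documented behavior.

```python
from typing import Optional, Sequence


def resolve_runner(explicit: Optional[str] = None,
                   workflow_default: Optional[str] = None,
                   bridge_header: Optional[str] = None,
                   registry_matches: Sequence[str] = ()) -> Optional[str]:
    """Pick a runner using the priority order listed above."""
    # Sources 1 to 3: the first one that supplies a name wins.
    for candidate in (explicit, workflow_default, bridge_header):
        if candidate:
            return candidate
    # Source 4: infer only when exactly one registered runner matches.
    if len(registry_matches) == 1:
        return registry_matches[0]
    return None  # ambiguous or unknown; the caller decides how to fail


print(resolve_runner(workflow_default="my-homelab", bridge_header="laptop"))
# my-homelab
```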

Outputs

Single Output

output: "{{ steps.final.output }}"

Multiple Outputs

outputs:
  - name: string        # Output name
    from_path: string   # Path to value (e.g., "steps.process.output.data")
    value: string       # Template expression (alternative to from_path)
    description: string # Human-readable description

Output Examples

outputs:
  - name: result
    from_path: steps.transform.output
    description: Transformed data

  - name: count
    value: "{{ steps.count.output }}"
    description: Number of items processed
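
A from_path value is a dotted path into the workflow's results. As a rough illustration, the sketch below walks such a path through nested dictionaries. `resolve_path` and the shape of the `context` dictionary are assumptions made for the example; the actual resolver and its error behavior are not specified here.

```python
from typing import Any


def resolve_path(context: dict, path: str) -> Any:
    """Walk a dotted path such as 'steps.transform.output' through nested dicts."""
    current: Any = context
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"cannot resolve {key!r} in path {path!r}")
        current = current[key]
    return current


# Hypothetical results for a workflow with a single `transform` step.
context = {"steps": {"transform": {"output": {"items": [1, 2], "count": 2}}}}
print(resolve_path(context, "steps.transform.output.count"))  # 2
```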

Template Expressions

Use Jinja2 templates to reference values:

Expression	Description
`{{ inputs.name }}`	Input value
`{{ steps.id.output }}`	Step result value
`{{ steps.id.output.field }}`	Nested field from step output
`{{ steps.id.status }}`	Step status: `completed`, `failed`, or `skipped`
`{{ steps.id.error }}`	Error object if failed (has `code` and `message` keys)
`{{ steps.id.debug_log }}`	List of strings written via `context.log()`
`{{ value | tojson }}`	JSON encode
`{{ value | default('x') }}`	Default value

Conditional Branching on Step Status

steps:
  - id: call_api
    tool: http_request
    on_error: continue
    params:
      url: "{{ inputs.url }}"

  - id: handle_result
    depends_on: [call_api]
    when: "{{ steps.call_api.status == 'completed' }}"
    code: |
      data = context.steps["call_api"].output
      result = {"processed": True, "count": len(data.get("items", []))}

  - id: handle_error
    depends_on: [call_api]
    when: "{{ steps.call_api.status == 'failed' }}"
    code: |
      err = context.steps["call_api"].error
      result = {"processed": False, "reason": err.get("message", "unknown")}
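
The branching above relies on a failed step still producing a record with a status and an error when on_error is continue. The Python sketch below models that behavior under stated assumptions: `run_step` is a hypothetical stand-in for the engine, only the fail and continue policies are modeled, and using the exception class name as the error code is invented for the example.

```python
from typing import Any, Callable


def run_step(fn: Callable[[], Any], on_error: str = "fail") -> dict:
    """Run one step and return a record with status, output, and error."""
    try:
        return {"status": "completed", "output": fn(), "error": None}
    except Exception as exc:
        if on_error != "continue":
            raise  # on_error: fail stops the workflow here
        # Assumption: the error code is the exception class name.
        return {"status": "failed", "output": None,
                "error": {"code": type(exc).__name__, "message": str(exc)}}


def failing_call() -> dict:
    raise TimeoutError("upstream timed out")


record = run_step(failing_call, on_error="continue")
# Mirrors the two `when` expressions in the YAML above.
branch = "handle_result" if record["status"] == "completed" else "handle_error"
print(branch, "-", record["error"]["message"])  # handle_error - upstream timed out
```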

Complete Example

name: data-pipeline
version: "1.0"
description: Fetch, transform, and validate data

packages:
  profile: standard

defaults:
  timeout: 30
  on_error: fail

inputs:
  - url:
      type: string
      description: API endpoint URL
  - format:
      type: string
      enum: ["json", "csv"]
      default: "json"

steps:
  - id: fetch
    tool: http_get
    params:
      url: "{{ inputs.url }}"
    timeout: 60

  - id: transform
    depends_on: [fetch]
    code: |
      data = {{ steps.fetch.output }}
      result = [item for item in data if item.get("active")]

  - id: format
    depends_on: [transform]
    code: |
      import json
      data = {{ steps.transform.output }}
      result = json.dumps(data, indent=2)

output: "{{ steps.format.output }}"