LLM::Data::Inference

zef:apogee
Revision history for LLM::Data::Inference

0.2.0  2026-04-23T15:57:28+01:00
    - Added auth specifiers to META6.json dependencies
    - Task + JSONTask accept an ordered :@backends fallback chain in
      addition to the legacy single :$backend. The retry loop
      classifies failures into three buckets: abort (HTTP 400 / 401 /
      402 / 403 / 404 — config/account errors where retrying any
      model in the chain won't help), retry-same (connection errors /
      5xx / unclassifiable — likely transient), and advance (timeout /
      429 / empty body / parser failure / other 4xx / content-filter-
      style finish-reason quits — model-specific pathology). Each
      backend gets up to $.max-retries HTTP attempts (the initial
      attempt plus retry-same retries, with exponential backoff and
      jitter capped at 30 s);
      an advance-class error on any attempt short-circuits the budget
      and moves on. Abort-class errors re-raise immediately without
      trying the rest of the chain. See classify-error and the module
      Pod for the full rule table.
    - Task.classify-error(:$error-class, :$error-status,
      :$parser-failed) exposed as a public method for testability and
      for consumers that want to implement the same policy outside
      the retry loop.
    - Telemetry hook payload adds :backend-index, :model-name,
      :error-class, :error-status, so sinks can identify which model
      in the chain served each call and what failed. Existing keys
      (:attempt, :success, :error, :latency-ms, :prompt-tokens,
      :completion-tokens, :total-tokens, :cost, :model-used,
      :provider-id, :finish-reason, :stage) are unchanged.
    - Single-:backend constructor shape is preserved; a one-element
      chain behaves exactly like the pre-fallback Task on the
      retry-same path (connection / 5xx still get $.max-retries
      same-model attempts with exponential backoff).
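
The three-bucket rule described above can be sketched as a small pure function. This is a language-neutral illustration in Python, not the module's Raku implementation: the error-class labels ("timeout", "empty-body", "content-filter") and the order in which the checks are applied are assumptions made for the example, and the module Pod remains the authority on the real rule table.

```python
from typing import Optional

# Statuses the changelog lists as config/account errors.
ABORT_STATUSES = frozenset({400, 401, 402, 403, 404})

# Hypothetical error-class labels; the module's real labels may differ.
ADVANCE_CLASSES = frozenset({"timeout", "empty-body", "content-filter"})


def classify_error(error_class: Optional[str] = None,
                   error_status: Optional[int] = None,
                   parser_failed: bool = False) -> str:
    """Return 'abort', 'retry-same', or 'advance' for one failed attempt."""
    # A parser failure is treated as model-specific: try the next backend.
    if parser_failed:
        return "advance"
    if error_status is not None:
        # Config/account errors: no model in the chain can help.
        if error_status in ABORT_STATUSES:
            return "abort"
        # 429 and every other 4xx: move on to the next backend.
        if 400 <= error_status <= 499:
            return "advance"
        # 5xx: likely transient, retry the same backend.
        if 500 <= error_status <= 599:
            return "retry-same"
    if error_class in ADVANCE_CLASSES:
        return "advance"
    # Connection errors and anything unclassifiable count as transient.
    return "retry-same"
```

Keeping the rule free of I/O is what makes it usable outside the retry loop, which is the stated reason classify-error is public.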

    BACKWARD COMPATIBILITY — one behavioural break:

    Advance-class errors (timeout / 429 / parser failure / empty body
    / content-filter-style finish quits / other 4xx) used to retry
    the same model up to $.max-retries times. They now advance to
    the next backend in the chain, or die immediately with
    "all backend(s) exhausted" if the Task has only one backend.
    Retry-same errors (connection drop / 5xx) are unchanged: they
    are still retried on the same backend up to $.max-retries times
    before advancing.

    Consequence for single-backend callers: a Task that previously
    survived a stochastic parser failure via retry now dies on the
    first parse error. Three mitigations:
      - Preferred: pass :@backends with a fallback model. A chain
        of [primary, primary] is also legal and preserves the exact
        old "try the same model twice" behaviour on advance-class
        failures while keeping retry-same semantics intact.
      - Build the retry loop at the application layer if the same
        model genuinely recovers for your workload.
      - Accept the failure: in practice the old behaviour rarely
        recovered from parser failures (malformed JSON tended to
        repeat), which is the motivation for the change.
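
To make the consequence for single-backend callers concrete, here is a minimal simulation of the chain loop in Python. It is a sketch under stated assumptions, not the module's code: run_chain, backoff_delay, AbortError and ExhaustedError are hypothetical names, each backend is modelled as a callable that returns an already-classified outcome, and the backoff uses a 1 s base with full jitter, where only the 30 s cap comes from the notes above.

```python
import random
import time
from typing import Any, Callable, List, Tuple

Outcome = Tuple[str, Any]  # ("ok", value) or (bucket, None)


class AbortError(RuntimeError):
    """Abort-class failure: the rest of the chain is not tried."""


class ExhaustedError(RuntimeError):
    """Every backend in the chain failed."""


def backoff_delay(retry: int, base: float = 1.0, cap: float = 30.0) -> float:
    # Exponential growth, capped, then jittered (assumed "full jitter").
    return random.uniform(0.0, min(cap, base * 2 ** retry))


def run_chain(backends: List[Callable[[], Outcome]],
              max_retries: int = 3,
              sleep: Callable[[float], None] = time.sleep) -> Tuple[int, Any]:
    """Return (backend_index, value) from the first backend that succeeds."""
    for index, call in enumerate(backends):
        for attempt in range(max_retries):
            outcome, value = call()
            if outcome == "ok":
                return index, value
            if outcome == "abort":
                raise AbortError("abort-class error")
            if outcome == "advance":
                break  # short-circuit this backend's remaining budget
            # retry-same: back off unless the budget is already spent
            if attempt + 1 < max_retries:
                sleep(backoff_delay(attempt))
    raise ExhaustedError("all backend(s) exhausted")


def make_flaky() -> Callable[[], Outcome]:
    """A backend whose first call fails with an advance-class error."""
    calls = {"n": 0}

    def call() -> Outcome:
        calls["n"] += 1
        return ("advance", None) if calls["n"] == 1 else ("ok", "parsed")

    return call
```

With this model, run_chain([make_flaky()]) raises ExhaustedError on the first failure, while passing the same flaky backend twice in the chain gives it a second attempt and succeeds at index 1, which mirrors the [primary, primary] mitigation.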

0.1.3  2026-04-07T20:03:42+01:00
    - Updated readme examples to use generic routing terms

0.1.2  2026-04-07T20:00:40+01:00
    - Removed Windows from CI (upstream Digest::SHA256::Native does not build on Windows)

0.1.1  2026-04-07T19:52:19+01:00
    - Added Windows CI support via MSVC

0.1.0  2026-04-07T18:52:19+01:00
    - Initial release
    - Task: blocking LLM calls with configurable parser and retry
    - JSONTask: JSON extraction from LLM responses with key validation
    - Router: query-based routing using Roaring::Tags
    - PromptBuilder: mustache-style template rendering