
NAME
LLM::Data::ContentTag - Content classification and model routing for LLM data pipelines
SYNOPSIS
use LLM::Data::ContentTag;
# Define classification rules: tag name → trigger keywords
my $classifier = LLM::Data::ContentTag::Classifier.new(
    :rules(%(
        confidential => <secret classified restricted>,
        technical    => <algorithm database server>,
        creative     => <story poem song>,
    )),
    :restricted('confidential'),  # Tags that need a local/unrestricted model
);
# Classify content
my $tags = $classifier.classify('Deploy the database server update.');
say $tags.has-tag('technical'); # True
say $tags.needs-unrestricted-model; # False
say $tags.all-tags; # (technical)
# Classify from metadata
my $tags2 = $classifier.classify-from-metadata(%(
    confidential => True,
    technical    => False,
));
# Route to appropriate backend
my $router = LLM::Data::ContentTag::Router.new(
    :default-backend('cloud-api'),
);
$router.add-route('local-model', 'confidential', 'restricted');
$router.add-route('reasoning-model', 'technical');
say $router.select-backend($tags); # reasoning-model
DESCRIPTION
LLM::Data::ContentTag provides content classification and model routing for LLM data generation pipelines. Tags and rules are fully configurable — no hardcoded categories.
LLM::Data::ContentTag::Tags
Immutable content tag set. Tags are arbitrary string keys with boolean values.
my $t = LLM::Data::ContentTag::Tags.new(
    :tags(%(confidential => True, draft => True)),
    :restricted('confidential'),  # Which tags need an unrestricted model
);
$t.has-tag('confidential'); # True
$t.has-tag('missing'); # False
$t.needs-unrestricted-model; # True (confidential is restricted and true)
$t.all-tags; # List of tag names that are true
$t.to-hash; # Serializable Hash
LLM::Data::ContentTag::Tags.from-hash(%data); # Deserialize
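The relationship between tag values and the restricted list can be sketched as follows. This is only an illustration of the behaviour described above, not the module's source: `SketchTags` is a hypothetical stand-in class, and it assumes that a tag set needs an unrestricted model whenever at least one restricted tag is true.

```raku
# Hypothetical stand-in for illustration; not the module's actual Tags class.
class SketchTags {
    has %.tags;        # tag name => Bool
    has @.restricted;  # names of tags that require an unrestricted model

    # A tag counts as present only when its value is true.
    method has-tag(Str $name --> Bool) {
        so %!tags{$name};
    }

    # Names of all tags whose value is true, sorted for stable output.
    method all-tags(--> List) {
        %!tags.grep(*.value).map(*.key).sort.List;
    }

    # Assumption: one true restricted tag is enough.
    method needs-unrestricted-model(--> Bool) {
        @!restricted.grep({ self.has-tag($_) }).elems > 0;
    }
}

my $sketch = SketchTags.new(
    tags       => %(confidential => True, draft => False),
    restricted => ['confidential'],
);
say $sketch.has-tag('draft');           # False
say $sketch.needs-unrestricted-model;   # True
say $sketch.all-tags;                   # (confidential)
```

A tag that is stored with a false value behaves the same as a missing tag in this sketch, which matches the has-tag('missing') example above.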
LLM::Data::ContentTag::Classifier
Assigns tags to content via configurable keyword rules.
my $c = LLM::Data::ContentTag::Classifier.new(
    :rules(%(
        tag-name => <keyword1 keyword2 keyword3>,
    )),
    :restricted('tag-name'),  # Optional: tags needing unrestricted model
);
$c.classify($content); # Takes a Str, returns Tags; keywords match case-insensitively
$c.classify-from-metadata(%meta); # Takes a metadata hash, returns Tags
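As a rough illustration of keyword-rule classification, the sketch below lowercases the content and each keyword, then marks a tag true when any of its keywords occurs in the text. The helper name `classify-by-keywords` is hypothetical, and substring matching is an assumption; the module may match on word boundaries or use another strategy.

```raku
# Hypothetical helper for illustration; not part of LLM::Data::ContentTag.
# Assumption: a keyword matches when it occurs anywhere in the content.
sub classify-by-keywords(Str $content, %rules --> Hash) {
    my $haystack = $content.lc;   # lowercase once for case-insensitive matching
    my %tags;
    for %rules.kv -> $tag, $keywords {
        # The tag is true when at least one keyword is found.
        %tags{$tag} = so $keywords.list.grep({ $haystack.contains(.lc) });
    }
    %tags;
}

my %sketch-rules =
    technical => <algorithm database server>,
    creative  => <story poem song>;

my %sketch-tags = classify-by-keywords('Deploy the DATABASE server update.', %sketch-rules);
say %sketch-tags<technical>;   # True
say %sketch-tags<creative>;    # False
```

With plain substring matching, a keyword such as "song" would also match inside a longer word, so word-boundary matching may be preferable for short keywords.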
LLM::Data::ContentTag::Router
Maps content tags to backend identifiers. Routes are checked in order — first match wins.
my $r = LLM::Data::ContentTag::Router.new(
    :default-backend('cloud-api'),
);
# Route if content has ANY of these tags
$r.add-route('local-model', 'confidential', 'restricted');
$r.add-route('reasoning-model', 'technical');
$r.select-backend($tags); # Returns a Str: the first matching backend, or the default
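The first-match-wins rule can be illustrated with a small stand-in router. `SketchRouter` is hypothetical and simplified: its `select-backend` takes a plain list of active tag names rather than a Tags object, and it assumes routes are checked in the order they were added.

```raku
# Hypothetical stand-in for illustration; not the module's actual Router.
class SketchRouter {
    has Str $.default-backend is required;
    has @!routes;   # ordered list of (backend => tag names) pairs

    # Register a backend for content carrying ANY of the given tags.
    method add-route(Str $backend, *@tags) {
        @!routes.push: ($backend => @tags.List);
    }

    # Walk the routes in insertion order; the first overlap wins.
    method select-backend(@active-tags --> Str) {
        for @!routes -> $route {
            return $route.key if $route.value (&) @active-tags;
        }
        $!default-backend;
    }
}

my $sketch-router = SketchRouter.new(:default-backend('cloud-api'));
$sketch-router.add-route('local-model', 'confidential', 'restricted');
$sketch-router.add-route('reasoning-model', 'technical');

say $sketch-router.select-backend(['technical']);                  # reasoning-model
say $sketch-router.select-backend(['technical', 'confidential']);  # local-model
say $sketch-router.select-backend(['creative']);                   # cloud-api
```

Because the earlier route wins, content tagged both technical and confidential goes to local-model here, so routes for restricted tags would normally be added first.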
AUTHOR
Matt Doughty <matt@apogee.guru>
COPYRIGHT AND LICENSE
Copyright 2026 Matt Doughty
This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.