DSL::Examples
Raku data package with examples of DSL command translations into programming code.
The DSL examples are suitable for
LLM few-shot training.
The sub llm-example-function provided by
"LLM::Functions", [AAp2],
can be effectively used to create translation functions utilizing those examples.
The utilization of such LLM-translation functions is exemplified below.
It is also demonstrated in the presentation "Robust LLM pipelines (Mathematica, Python, Raku)", [AAv1].
Similar translations, with far fewer computational resources, are achieved with
grammar-based DSL translators; see
"DSL::Translators", [AAp1]. The package
"LLM::Resources", [AAp4], has LLM-graphs
for code generation that utilize the DSL examples of this package.
Installation
From the Zef ecosystem:
zef install DSL::Examples;
From GitHub:
zef install https://github.com/antononcube/Raku-DSL-Examples.git
Usage examples
Get all examples:
use DSL::Examples;
use Data::TypeSystem;
dsl-examples()
==> deduce-type()
# Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 6), Assoc(Atom((Str)), Atom((Str)), 10), Assoc(Atom((Str)), Atom((Str)), 15)]), 4), Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 32), Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 27), Assoc(Atom((Str)), Atom((Str)), 14), Assoc(Atom((Str)), Atom((Str)), 6), Assoc(Atom((Str)), Atom((Str)), 17)]), 7), Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 26), Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 17), Assoc(Atom((Str)), Atom((Str)), 10)]), 4), Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 23), Assoc(Atom((Str)), Atom((Str)), 15), Assoc(Atom((Str)), Atom((Str)), 33), Assoc(Atom((Str)), Atom((Str)), 20)]), 4)]), 4)
Tabulate all translation languages and available workflow examples:
use Data::Translators;
dsl-examples(from => 'English').map({ $_.key X $_.value.keys }).flat(1).map({ <language workflow> Z=> $_ })».Hash.sort.Array
==> to-dataset()
==> to-html(field-names => <language workflow>)
| language | workflow |
|---|---|
| Python | LSAMon |
| Python | QRMon |
| Python | SMRMon |
| Python | pandas |
| R | DataReshaping |
| R | LSAMon |
| R | QRMon |
| R | SMRMon |
| Raku | DataReshaping |
| Raku | LSAMon |
| Raku | SMRMon |
| Raku | TriesWithFrequencies |
| WL | ClCon |
| WL | DataReshaping |
| WL | LSAMon |
| WL | QRMon |
| WL | SMRMon |
| WL | Tabular |
| WL | TriesWithFrequencies |
Note that the language to translate from is specified in the call to dsl-examples with the from argument.
Currently, the package has DSL examples for the from-languages Bulgarian, English, Portuguese, and Russian.
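The type summary and the table above suggest a nested layout: language, then workflow, then a map from DSL commands to code. The following is a minimal sketch, assuming that layout, of how the language-workflow pairs can be listed with plain Raku loops. The hash is a small hand-made stand-in with placeholder entries, not the package's data.

```raku
# Illustrative stand-in for the nested layout: language => workflow => (command => code).
# The entries below are made up for this sketch; they are not the package's data.
my %examples =
    WL     => {
        LSAMon => { 'show the topics' => 'LSAMonEchoTopicsTable[]' },
        QRMon  => { 'placeholder command' => 'placeholder code' },
    },
    Python => {
        LSAMon => { 'placeholder command' => 'placeholder code' },
    };

# Collect one "language<TAB>workflow" row per pair, sorted for stable output
my @rows;
for %examples.keys.sort -> $lang {
    for %examples{$lang}.keys.sort -> $workflow {
        @rows.push: "$lang\t$workflow";
    }
}

.say for @rows;
```

With the real data, the same loops could be run over the result of dsl-examples(from => 'English') instead of the stand-in hash.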
Get the examples for Latent Semantic Analysis (LSA) Monadic pipeline segments in Python:
dsl-examples('Python', 'LSAMon')
==> deduce-type(:tally)
# Assoc(Atom((Str)), Atom((Str)), 15)
Make an LLM example function for translation of LSA workflow building commands:
use LLM::Functions;
my &llm-pipeline-segment = llm-example-function(dsl-examples()<WL><LSAMon>);
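To give an idea of what an example-based LLM function does with the command-code pairs, here is a rough sketch of few-shot prompt assembly in plain Raku. The helper few-shot-prompt is hypothetical, and the "Input:"/"Output:" layout is an assumption of the sketch; it is not taken from the LLM::Functions implementation.

```raku
# Illustrative command => code pairs; in practice they would come from
# dsl-examples()<WL><LSAMon>. The "Input:"/"Output:" labels are an assumption
# of this sketch, not necessarily the format LLM::Functions uses.
my @examples =
    'use the dataset aAbstracts' => 'LSAMonUnit[aAbstracts]',
    'show the topics'            => 'LSAMonEchoTopicsTable[]';

# Hypothetical helper: render the pairs as a few-shot prompt that ends with the new command
sub few-shot-prompt(@pairs, Str $command --> Str) {
    my @shots = @pairs.map(-> $p { "Input: {$p.key}\nOutput: {$p.value}" });
    (|@shots, "Input: $command\nOutput:").join("\n\n")
}

say few-shot-prompt(@examples, 'extract 40 topics');
```

The prompt ends with an open "Output:" line, so the model is nudged to continue with code in the same style as the examples.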
Run the LLM function over a list of DSL commands:
my @commands =
"use the dataset aAbstracts",
"make the document-term matrix without stemming",
"extract 40 topics using the method non-negative matrix factorization",
"show the topics";
@commands
.map({ .&llm-pipeline-segment })
.map({ .subst(/:i Output ':'?/):g })
.join("⟹\n")
# LSAMonUnit[aAbstracts]⟹
# LSAMonMakeDocumentTermMatrix["StemmingRules" -> {}, "StopWords" -> Automatic]⟹
# LSAMonExtractTopics["NumberOfTopics"->40, Method->"NNMF"]⟹
# LSAMonEchoTopicsTable[]
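The post-processing step in the pipeline above can be tried without an LLM. The sketch below applies the same label removal and joining to hard-coded strings; the raw replies are hypothetical, and the extra trim call is an addition of this sketch to drop the space left behind by the removed label.

```raku
# Hypothetical raw LLM replies; some models echo the "Output:" label
my @raw =
    'Output: LSAMonUnit[aAbstracts]',
    'output LSAMonEchoTopicsTable[]',
    'LSAMonExtractTopics["NumberOfTopics"->40, Method->"NNMF"]';

# Remove the label (case-insensitive, optional colon), trim, and chain the segments
my $pipeline = @raw
    .map({ .subst(/:i Output ':'?/, '', :g).trim })
    .join("⟹\n");

say $pipeline;
```

Since the substitution is global and case-insensitive, it would also remove the word "output" if it appeared inside a generated code segment, so a stricter pattern may be preferable for workflows whose function names contain that word.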
Same workflow specified in Bulgarian:
my &llm-pipeline-segment-bg = llm-example-function(dsl-examples(from => 'Bulgarian')<WL><LSAMon>);
my @commands =
"използавай данните aAbstracts",
"направи документ-терм матрицата без да използаваш стъблата на думите",
"намери 40 теми ползвайки методата не-отрицателна матрична факторизация",
"покажи темите";
@commands
.map({ .&llm-pipeline-segment-bg })
.map({ .subst(/:i Output ':'?/):g })
.join("⟹\n")
# LSAMonUnit[aAbstracts]⟹
# LSAMonMakeDocumentTermMatrix["StemmingRules"->{}]⟹
# LSAMonExtractTopics["NumberOfTopics"->40, Method->"NNMF"]⟹
# LSAMonEchoTopicsTable[]
CLI
The package provides the Command Line Interface (CLI) script dsl-examples. Here is its usage message:
dsl-examples --help
# Usage:
# dsl-examples [<lang>] [<workflow>] [--from|--from-lang=<Str>] [-f|--format=<Str>] -- Give DSL examples for specified language and workflow.
# dsl-examples [-l|--to|--lang=<Str>] [-w|--workflow=<Str>] [--from|--from-lang=<Str>] [-f|--format=<Str>]
#
# [<lang>] Language. [default: 'Whatever']
# [<workflow>] Workflow. [default: 'Whatever']
# --from|--from-lang=<Str> Language to translate from. [default: 'English']
# -f|--format=<Str> Format of the result, one of "json" or "raku". [default: 'json']
# -l|--to|--lang=<Str> Language. [default: 'Whatever']
# -w|--workflow=<Str> Workflow. [default: 'Whatever']
Implementation details
There are several ways to organize the DSL examples with respect to the from-languages:
| Type | Comment | Currently used |
|---|---|---|
| Have a separate file for each from-language | Convenient editing and refinement | Yes |
| One file of all examples; from-language is a key for each workflow | Can be produced from the separate files | No |
| Keep English-only DSL examples and use dictionaries of command translations to English | Does not train the LLM directly with the from-language | Dictionaries are kept for reference |
See the Jupyter notebook "DSL-examples-dev.ipynb" for a workflow that translates the English DSL examples into other languages.
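As an illustration of the second option in the table, here is a sketch of how per-from-language data could be merged into a single structure in which the from-language is a key under each workflow. The nesting order and the entries are assumptions made for this sketch; the package's actual file layout may differ.

```raku
# Illustrative per-from-language data: from-language => language => workflow => (command => code).
# The entries are stand-ins for this sketch, not the package's files.
my %by-from =
    English   => { WL => { LSAMon => { 'show the topics' => 'LSAMonEchoTopicsTable[]' } } },
    Bulgarian => { WL => { LSAMon => { 'покажи темите'   => 'LSAMonEchoTopicsTable[]' } } };

# Re-key so that the from-language sits under each workflow:
# language => workflow => from-language => (command => code)
my %combined;
for %by-from.kv -> $from, %langs {
    for %langs.kv -> $lang, %workflows {
        for %workflows.kv -> $workflow, %pairs {
            %combined{$lang}{$workflow}{$from} = %pairs;
        }
    }
}

say %combined<WL><LSAMon>.keys.sort;
```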
References
Packages
[AAp1] Anton Antonov,
DSL::Translators, Raku package,
(2020-2024),
GitHub/antononcube.
[AAp2] Anton Antonov,
LLM::Functions, Raku package,
(2023-2026),
GitHub/antononcube.
[AAp3] Anton Antonov,
LLM::Prompts, Raku package,
(2023-2026),
GitHub/antononcube.
[AAp4] Anton Antonov,
LLM::Resources, Raku package,
(2026),
GitHub/antononcube.
Videos
[AAv1] Anton Antonov,
"Robust LLM pipelines (Mathematica, Python, Raku)",
(2024),
YouTube/AAA4prediction.