DSL::Examples

zef:antononcube

Raku data package with examples of translations of DSL commands into programming code.

The DSL examples are suitable for LLM few-shot training. The sub llm-example-function provided by "LLM::Functions", [AAp2], can be effectively used to create translation functions utilizing those examples.

The use of such LLM-translation functions is exemplified below and in the presentation "Robust LLM pipelines (Mathematica, Python, Raku)", [AAv1].

Similar translations, with far fewer computational resources, are achieved with grammar-based DSL translators; see "DSL::Translators", [AAp1]. The package "LLM::Resources", [AAp4], has LLM-graphs for code generation that utilize the DSL examples of this package.


Installation

From the Zef ecosystem:

zef install DSL::Examples

From GitHub:

zef install https://github.com/antononcube/Raku-DSL-Examples.git

Usage examples

Get all examples:

use DSL::Examples;
use Data::TypeSystem;

dsl-examples()
    ==> deduce-type()
# Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 6), Assoc(Atom((Str)), Atom((Str)), 10), Assoc(Atom((Str)), Atom((Str)), 15)]), 4), Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 32), Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 27), Assoc(Atom((Str)), Atom((Str)), 14), Assoc(Atom((Str)), Atom((Str)), 6), Assoc(Atom((Str)), Atom((Str)), 17)]), 7), Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 26), Assoc(Atom((Str)), Atom((Str)), 20), Assoc(Atom((Str)), Atom((Str)), 17), Assoc(Atom((Str)), Atom((Str)), 10)]), 4), Assoc(Atom((Str)), Tuple([Assoc(Atom((Str)), Atom((Str)), 23), Assoc(Atom((Str)), Atom((Str)), 15), Assoc(Atom((Str)), Atom((Str)), 33), Assoc(Atom((Str)), Atom((Str)), 20)]), 4)]), 4)

Tabulate all translation languages and available workflow examples:

use Data::Translators;
dsl-examples(from => 'English').map({ $_.key X $_.value.keys }).flat(1).map({ <language workflow> Z=> $_ })».Hash.sort.Array
==> to-dataset()
==> to-html(field-names => <language workflow>)
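
| language | workflow |
|----------|----------|
| Python | LSAMon |
| Python | QRMon |
| Python | SMRMon |
| Python | pandas |
| R | DataReshaping |
| R | LSAMon |
| R | QRMon |
| R | SMRMon |
| Raku | DataReshaping |
| Raku | LSAMon |
| Raku | SMRMon |
| Raku | TriesWithFrequencies |
| WL | ClCon |
| WL | DataReshaping |
| WL | LSAMon |
| WL | QRMon |
| WL | SMRMon |
| WL | Tabular |
| WL | TriesWithFrequencies |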

Note that the call to dsl-examples above specifies the language to translate from. Currently, the package has DSL examples for the following from-languages: Bulgarian, English, Portuguese, and Russian.
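
As a minimal sketch, the to-languages available for each from-language can be listed with the same `from` argument. Only 'English' and 'Bulgarian' appear as `from` values in this document; that 'Portuguese' and 'Russian' are accepted with exactly this spelling is an assumption.

```raku
use DSL::Examples;

# Assumption: 'Portuguese' and 'Russian' are valid values of `from`,
# spelled the same way as in the sentence above.
for <Bulgarian English Portuguese Russian> -> $from {
    # The top-level keys are the to-languages (for example Python, R, Raku, WL)
    my @to-langs = dsl-examples(from => $from).keys.sort;
    say "$from => ", @to-langs.join(', ');
}
```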

Get the examples for Latent Semantic Analysis (LSA) Monadic pipeline segments in Python:

dsl-examples('Python', 'LSAMon')
    ==> deduce-type(:tally)
# Assoc(Atom((Str)), Atom((Str)), 15)
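
To see what these examples look like, a few entries can be printed directly. This is a small sketch that assumes each key is a natural language command and each value is the corresponding code; the deduced type above only confirms that both are strings.

```raku
use DSL::Examples;

# Assumption: keys are DSL commands, values are the code they translate to.
my %lsa = dsl-examples('Python', 'LSAMon');

# Print three entries in key order so that repeated runs show the same ones
for %lsa.pairs.sort(*.key).head(3) -> $p {
    say $p.key, "\n    -> ", $p.value;
}
```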

Make an LLM example function for translation of LSA workflow building commands:

use LLM::Functions;
my &llm-pipeline-segment = llm-example-function(dsl-examples()<WL><LSAMon>);

Run the LLM function over a list of DSL commands:

my @commands = 
"use the dataset aAbstracts",
"make the document-term matrix without stemming",
"extract 40 topics using the method non-negative matrix factorization",
"show the topics";

@commands
.map({ .&llm-pipeline-segment })
.map({ .subst(/:i Output ':'?/):g })
.join("⟹\n")
# LSAMonUnit[aAbstracts]⟹
# LSAMonMakeDocumentTermMatrix["StemmingRules" -> {}, "StopWords" -> Automatic]⟹
# LSAMonExtractTopics["NumberOfTopics"->40, Method->"NNMF"]⟹
# LSAMonEchoTopicsTable[]
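
The cleanup step in the pipeline above can be tried without calling an LLM. The sketch below applies the same regex to hypothetical raw replies, some of which echo an "Output" label. The reply strings are made up for illustration, and the extra `.trim` is an addition that is not part of the original pipeline.

```raku
# Hypothetical raw LLM replies; some echo the "Output" label, some do not.
my @raw =
    'Output: LSAMonUnit[aAbstracts]',
    'output LSAMonExtractTopics["NumberOfTopics"->40]',
    'LSAMonEchoTopicsTable[]';

# Remove the label case-insensitively, with or without a colon,
# then trim the whitespace it leaves behind (the trim is an addition).
my @clean = @raw.map({ .subst(/:i Output ':'?/, :g).trim });

say @clean.join("⟹\n");
```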

Same workflow specified in Bulgarian:

my &llm-pipeline-segment-bg = llm-example-function(dsl-examples(from => 'Bulgarian')<WL><LSAMon>);

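# A separate variable name avoids redeclaring @commands in the same scope
my @commands-bg = 
"използавай данните aAbstracts",   # use the data aAbstracts
"направи документ-терм матрицата без да използаваш стъблата на думите",   # make the document-term matrix without using word stems
"намери 40 теми ползвайки методата не-отрицателна матрична факторизация",   # find 40 topics using the method non-negative matrix factorization
"покажи темите";   # show the topics

@commands-bg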
.map({ .&llm-pipeline-segment-bg })
.map({ .subst(/:i Output ':'?/):g })
.join("⟹\n")
# LSAMonUnit[aAbstracts]⟹
# LSAMonMakeDocumentTermMatrix["StemmingRules"->{}]⟹
# LSAMonExtractTopics["NumberOfTopics"->40, Method->"NNMF"]⟹
# LSAMonEchoTopicsTable[]

CLI

The package provides the Command Line Interface (CLI) script dsl-examples. Here is its usage message:

dsl-examples --help
# Usage:
#   dsl-examples [<lang>] [<workflow>] [--from|--from-lang=<Str>] [-f|--format=<Str>] -- Give DSL examples for specified language and workflow.
#   dsl-examples [-l|--to|--lang=<Str>] [-w|--workflow=<Str>] [--from|--from-lang=<Str>] [-f|--format=<Str>]
#   
#     [<lang>]                    Language. [default: 'Whatever']
#     [<workflow>]                Workflow. [default: 'Whatever']
#     --from|--from-lang=<Str>    Language to translate from. [default: 'English']
#     -f|--format=<Str>           Format of the result, one of "json" or "raku". [default: 'json']
#     -l|--to|--lang=<Str>        Language. [default: 'Whatever']
#     -w|--workflow=<Str>         Workflow. [default: 'Whatever']

Implementation details

There are several ways to organize the DSL examples with respect to the from-languages:

| Type | Comment | Currently used |
|------|---------|----------------|
| Have a separate file for each from-language | Convenient editing and refinement | Yes |
| One file of all examples; from-language is a key for each workflow | Can be produced from the separate files | No |
| Keep English-only DSL examples and use dictionaries of command translations to English | Does not train the LLM directly with the from-language | Dictionaries are kept for reference |
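
The second layout can be derived from the first. The following is a rough sketch that assumes each per-language source is a nested hash of the form to-language, workflow, examples. The data and the helper name `merge-by-workflow` are hypothetical and are not part of the package.

```raku
# Hypothetical stand-ins for the per-from-language files;
# the real package data is much larger.
my %per-language =
    English   => { WL => { LSAMon => { 'show the topics' => 'LSAMonEchoTopicsTable[]' } } },
    Bulgarian => { WL => { LSAMon => { 'покажи темите'   => 'LSAMonEchoTopicsTable[]' } } };

# Re-key the data as: to-language => workflow => from-language => examples
sub merge-by-workflow(%sources) {
    my %merged;
    for %sources.kv -> $from, %by-lang {
        for %by-lang.kv -> $lang, %by-workflow {
            for %by-workflow.kv -> $workflow, %examples {
                %merged{$lang}{$workflow}{$from} = %examples;
            }
        }
    }
    return %merged;
}

my %all = merge-by-workflow(%per-language);

# The from-languages now sit under each workflow
say %all<WL><LSAMon>.keys.sort;
```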

See the Jupyter notebook "DSL-examples-dev.ipynb" for a workflow that translates the English DSL examples into other languages.


References

Packages

[AAp1] Anton Antonov, DSL::Translators, Raku package, (2020-2024), GitHub/antononcube.

[AAp2] Anton Antonov, LLM::Functions, Raku package, (2023-2026), GitHub/antononcube.

[AAp3] Anton Antonov, LLM::Prompts, Raku package, (2023-2026), GitHub/antononcube.

[AAp4] Anton Antonov, LLM::Resources, Raku package, (2026), GitHub/antononcube.

Videos

[AAv1] Anton Antonov, "Robust LLM pipelines (Mathematica, Python, Raku)", (2024), YouTube/AAA4prediction.