DSL::Entity::MachineLearning (Raku package)

Raku grammar classes for Machine Learning (ML) entity names.

The package performs entity name recognition using regexes over a set of hashes (dictionaries) that map entity name phrases to entity identifiers (IDs). The hashes are loaded from the package's resource files.

In short, the package has a grammar-resource-based architecture. The same architecture is used in the Domain Specific Language (DSL) entity packages [AAr3-AAr6].

Remark: The associations between entity name phrases and entity IDs in the resource files are expected to change in the future, because of classifier system updates or usage feedback. This is one of the main reasons to use a grammar-resource-based architecture: subsequent package versions can ship better, fuller associations.
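The idea behind the grammar-resource-based architecture can be illustrated with a minimal, self-contained sketch. It is not the package's actual code: the grammar name `EntitySketch`, the hash `%name-to-id`, and the sample phrase-to-ID pairs are all hypothetical stand-ins for the real grammar and resource data. The rule shape (a captured run of words followed by a lookup assertion) is only an assumption suggested by the parse trees shown in the usage examples.

```raku
# Hypothetical stand-in for the data loaded from a resource file:
# lower-cased entity name phrases mapped to entity IDs.
my %name-to-id =
    'decision tree'          => 'DecisionTree',
    'gradient boosted trees' => 'GradientBoostedTrees',
    'random forest'          => 'RandomForest';

grammar EntitySketch {
    token TOP { <entity-name> }

    # Capture one or more whitespace-separated words, then accept the
    # match only if the captured phrase is a key of the lookup hash.
    token entity-name {
        ( [ \w+ ]+ % \h+ )
        <?{ %name-to-id{ $0.Str.lc }:exists }>
    }
}

# Try a known phrase and an unknown one.
for 'Gradient Boosted Trees', 'unknown thing' -> $phrase {
    my $m = EntitySketch.parse($phrase);
    say $m ?? %name-to-id{ $m.Str.lc } !! "no entity for '$phrase'";
}
```

Because the grammar only consults the hash, new or corrected associations can be introduced by changing the data, without touching the grammar rules.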


Installation

zef install https://github.com/antononcube/Raku-DSL-Entity-MachineLearning.git

Usage examples

use DSL::Entity::MachineLearning;
use DSL::Entity::MachineLearning::ResourceAccess;

my $pCOMMAND = DSL::Entity::MachineLearning::Grammar;

say $pCOMMAND.parse('DecisionTree', rule => 'machine-learning-entity-command');
say $pCOMMAND.parse('gradient boosted trees', rule => 'machine-learning-entity-command');
say $pCOMMAND.parse('roc curve', rule => 'machine-learning-entity-command');
# 「DecisionTree」
#  classifier-entity-command => 「DecisionTree」
#   entity-classifier-name => 「DecisionTree」
#    0 => 「DecisionTree」
#     word-value => 「DecisionTree」
# 「gradient boosted trees」
#  classifier-entity-command => 「gradient boosted trees」
#   entity-classifier-name => 「gradient boosted trees」
#    0 => 「gradient boosted trees」
#     word-value => 「gradient」
#     word-value => 「boosted」
#     word-value => 「trees」
# 「roc curve」
#  classifier-measurement-entity-command => 「roc curve」
#   entity-classifier-measurement-name => 「roc curve」
#    0 => 「roc curve」
#     word-value => 「roc」
#     word-value => 「curve」

Command line interface

The package provides a Command Line Interface (CLI) to its functionality:

> ToMachineLearningEntityCode --help 
# Usage:
#   ToMachineLearningEntityCode <command> [--target=<Str>] [--user=<Str>] -- Conversion of (natural) DSL machine learning entity name into code.
#   ToMachineLearningEntityCode <target> <command> [--user=<Str>] -- Both target and command as arguments.
#     <command>         natural language command (DSL commands)
#     --target=<Str>    target language/system/package (defaults to 'WL-System') [default: 'WL-System']
#     --user=<Str>      user identifier (defaults to '') [default: '']
#     <target>          Programming language.

Remark: Currently, the CLI script always returns results in JSON format.

Resource files

The resource files:

  1. "ClassifierNameToEntityID_EN.csv" was derived from the Mathematica function page for Classify, [WRI1].

  2. "ClassifierMeasurementNameToEntityID_EN.csv" was derived using Mathematica's built-in function ClassifierMeasurements, [WRI3]. Some additional associations were added following [WK1].

  3. "ClassifierPropertyNameToEntityID_EN.csv" was derived using Mathematica's built-in function Information, [WRI4].

  4. "ROCFunctionNameToEntityID_EN.csv" uses the names and mappings in [WK1]. (See also the related package [AAr7].)
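As a rough illustration of how a file like these could be turned into a lookup hash, here is a small sketch. The two-column layout (phrase, then entity ID), the sample rows, and the variable names are assumptions made for the example; the actual files and the package's loading code may differ.

```raku
# Hypothetical CSV content standing in for a resource file.
# Assumed layout: entity name phrase, comma, entity ID.
my $csv = q:to/END/;
decision tree,DecisionTree
gradient boosted trees,GradientBoostedTrees
roc curve,ROCCurve
END

# Build a hash from lower-cased phrases to entity IDs,
# skipping empty lines and trimming stray whitespace.
my %phrase-to-id = $csv.lines.grep(*.chars).map({
    my ($phrase, $id) = .split(',', 2);
    $phrase.trim.lc => $id.trim
});

say %phrase-to-id{'roc curve'};
```

Keeping the associations in plain CSV files means they can be reviewed and edited by hand, which is how the translated mappings are described as having been refined.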

The initial Bulgarian versions of the resource files, with the name suffix "_BG.csv", were derived by automatic translation of the corresponding English content. Afterwards, the Bulgarian mappings were reviewed and manually modified.



