P5study

zef:lizmat

NAME

Raku port of Perl's study() built-in

SYNOPSIS

use P5study; # exports study()

DESCRIPTION

This module tries to mimic the behaviour of Perl's study built-in as closely as possible in the Raku Programming Language.

ORIGINAL PERL DOCUMENTATION

study SCALAR
study   Takes extra time to study SCALAR ($_ if unspecified) in
        anticipation of doing many pattern matches on the string before it
        is next modified. This may or may not save time, depending on the
        nature and number of patterns you are searching and the
        distribution of character frequencies in the string to be
        searched; you probably want to compare run times with and without
        it to see which is faster. Those loops that scan for many short
        constant strings (including the constant parts of more complex
        patterns) will benefit most. (The way "study" works is this: a
        linked list of every character in the string to be searched is
        made, so we know, for example, where all the 'k' characters are.
        From each search string, the rarest character is selected, based
        on some static frequency tables constructed from some C programs
        and English text. Only those places that contain this "rarest"
        character are examined.)

        For example, here is a loop that inserts index producing entries
        before any line containing a certain pattern:

            while (<>) {
                study;
                print ".IX foo\n"    if /\bfoo\b/;
                print ".IX bar\n"    if /\bbar\b/;
                print ".IX blurfl\n" if /\bblurfl\b/;
                # ...
                print;
            }

        In searching for "/\bfoo\b/", only locations in $_ that contain
        "f" will be looked at, because "f" is rarer than "o". In general,
        this is a big win except in pathological cases. The only question
        is whether it saves you more time than it took to build the linked
        list in the first place.

        Note that if you have to look for strings that you don't know till
        runtime, you can build an entire loop as a string and "eval" that
        to avoid recompiling all your patterns all the time. Together with
        undefining $/ to input entire files as one record, this can be
        quite fast, often faster than specialized programs like fgrep(1).
        The following scans a list of files (@files) for a list of words
        (@words), and prints out the names of those files that contain a
        match:

            $search = 'while (<>) { study;';
            foreach $word (@words) {
                $search .= "++\$seen{\$ARGV} if /\\b$word\\b/;\n";
            }
            $search .= "}";
            @ARGV = @files;
            undef $/;
            eval $search;        # this screams
            $/ = "\n";        # put back to normal input delimiter
            foreach $file (sort keys(%seen)) {
                print $file, "\n";
            }

PORTING CAVEATS

Currently, study is a no-op in Raku, as it is in more recent versions of Perl.
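
Because study no longer does any work, the technique described in the original documentation may be easier to follow as code. The following is only a rough sketch in Perl, not the algorithm Perl actually used and not part of this module: it records where each character occurs and then checks only the positions of the needle's rarest character. The names build_index, rarest_char and find_all are hypothetical, the %FREQ ranks are made-up stand-ins for the static frequency tables, a hash of arrays replaces the linked list, and the needle is assumed to be a non-empty constant string.

```perl
use strict;
use warnings;

# Hypothetical frequency ranks (lower = rarer). These values are invented
# for illustration; the real tables are not reproduced here.
my %FREQ = (b => 1, f => 2, r => 6, o => 8, a => 9);

# Record every offset at which each character occurs in the string.
sub build_index {
    my ($text) = @_;
    my %positions;
    for my $i (0 .. length($text) - 1) {
        push @{ $positions{ substr($text, $i, 1) } }, $i;
    }
    return \%positions;
}

# Pick the character of the needle with the lowest assumed frequency.
# Characters missing from %FREQ get an arbitrary middle rank of 5.
sub rarest_char {
    my ($needle) = @_;
    my ($best) = sort { ($FREQ{$a} // 5) <=> ($FREQ{$b} // 5) }
                 split //, $needle;
    return $best;
}

# Return all offsets of $needle in $text, examining only the places
# where the needle's rarest character occurs.
sub find_all {
    my ($text, $index, $needle) = @_;
    my $rare   = rarest_char($needle);
    my $offset = index($needle, $rare);   # where the rare char sits in the needle
    my @hits;
    for my $pos (@{ $index->{$rare} // [] }) {
        my $start = $pos - $offset;
        next if $start < 0;
        push @hits, $start
            if substr($text, $start, length $needle) eq $needle;
    }
    return @hits;
}

my $line  = "foo bar food";
my $index = build_index($line);
print join(",", find_all($line, $index, "foo")), "\n";   # prints "0,8"
```

Building the index costs one pass over the string, which is the up-front cost the original documentation warns about. It pays off only when many searches are made against the same unmodified string.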

AUTHOR

Elizabeth Mattijsen liz@raku.rocks

Source can be located at: https://github.com/lizmat/P5study . Comments and Pull Requests are welcome.

COPYRIGHT AND LICENSE

Copyright 2018, 2019, 2020, 2021 Elizabeth Mattijsen

Re-imagined from Perl as part of the CPAN Butterfly Plan.

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.