MoarVM::Bytecode

zef:lizmat

NAME

MoarVM::Bytecode - Provide introspection into MoarVM bytecode

SYNOPSIS

use MoarVM::Bytecode;

my $M = MoarVM::Bytecode.new($filename);  # or letter or IO or Blob

say $M.hll-name;     # most likely "Raku"

say $M.strings[99];  # the 100th string on the string heap

DESCRIPTION

MoarVM::Bytecode provides an object-oriented interface to the MoarVM bytecode format, based on the information provided in docs/bytecode.markdown.

CLASS METHODS

new

my $M = MoarVM::Bytecode.new("c");           # the 6.c setting

my $M = MoarVM::Bytecode.new("foo/bar");     # file as string

my $M = MoarVM::Bytecode.new($filename.IO);  # path as IO object

my $M = MoarVM::Bytecode.new($buf);          # a Buf object

Create an instance of the MoarVM::Bytecode object from a letter (assumed to be a Raku version letter such as "c", "d" or "e"), a filename, an IO::Path or a Buf/Blob object.

files

.say for MoarVM::Bytecode.files;

.say for MoarVM::Bytecode.files(:instantiate);

Returns a sorted list of paths of MoarVM bytecode files that could be found in the installation of the currently running rakudo executable.

Optionally accepts a :instantiate named argument to return a sorted list of instantiated MoarVM::Bytecode objects instead of just paths.
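As a rough sketch of how the two forms might be combined, the following counts the bytecode files that were found and tallies them by HLL name. It uses only the files and hll-name methods described in this document; the variable names are illustrative, and instantiating every file may take a while on a full installation.

```raku
use MoarVM::Bytecode;

# Paths only: nothing is loaded yet
my @paths = MoarVM::Bytecode.files;
say "Found { +@paths } bytecode files";

# Instantiated objects: tally the HLL name of each file
my %count-by-hll;
%count-by-hll{.hll-name}++ for MoarVM::Bytecode.files(:instantiate);

# Report the tally, sorted by HLL name
say "{ .key }: { .value }" for %count-by-hll.sort;
```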

rootdir

my $rootdir = MoarVM::Bytecode.rootdir;

Returns an IO::Path of the root directory of the installation of the currently running rakudo executable.

setting

my $setting = MoarVM::Bytecode.setting;

my $setting = MoarVM::Bytecode.setting("d");

Returns an IO::Path of the bytecode file of the given setting letter. Defaults to the lowest setting that is currently supported.

HELPER SCRIPTS

bceval

$ bceval c '.strings.grep(*.contains("zip"))'
&zip
zip
zip-latest

Helper script to allow simple actions on a MoarVM::Bytecode object from the command line. The first argument indicates the bytecode file to load. The second argument indicates the code to be executed.

The topic $_ is set with the MoarVM::Bytecode object upon entry.

bcinfo

$ bcinfo --help
Usage:
  bin/bcinfo <file> [--filename=<Str>] [--name=<Str>] [--opcode=<Str>] [--header] [--decomp] [--hexdump] [--verbose]

    <file>              filename of bytecode, or setting letter
    --filename=<Str>    select frames with given filename
    --name=<Str>        select frames with given name
    --opcode=<Str>      select frames containing opcode
    --header            show header information
    --decomp            de-compile file / selected frames
    --hexdump           show hexdump of selected frames
    --verbose           be verbose when possible

Produces various types of information about the given bytecode file.

csites

$ csites c 12
 12 $, $, N

Helper script to show the callsite info of the given callsite number in the given bytecode file.

opinfo

$ opinfo if_i unless_i
 24 if_i r(int64),ins (8 bytes)
 25 unless_i r(int64),ins (8 bytes)

$ opinfo 42 666
 42 bindlex_nn str,r(num64) (8 bytes)
666 atpos2d_s w(str),r(obj),r(int64),r(int64) (10 bytes)

Helper script to show the gist of the given op name(s) or number(s).

sheap

$ sheap e 3 4 5
    3 SETTING::src/core.e/core_prologue.rakumod
    4 language_revision_type
    5 lang-meth-call

$ sheap e byte
   42 byte
 2844 bytecode-size

Helper script for browsing the string heap of a given bytecode file (specified by either a setting letter, or a filename of a bytecode file).

String arguments are interpreted as a key to do a .grep on the whole string heap. Numerical arguments are interpreted as indices into the string heap.

Shown are the string index and the string.

INSTANCE METHODS

callsites

.say for $M.callsites[^10];  # show the first 10 callsites

Returns a list of Callsite objects, each of which contains information about the arguments at a given callsite.
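A minimal sketch of walking the first few callsites and describing their arguments, assuming the 6.c setting can be loaded. It relies only on the Callsite and Argument methods listed under SUBCLASSES; the output wording is made up for illustration.

```raku
use MoarVM::Bytecode;

my $M = MoarVM::Bytecode.new("c");   # the 6.c setting

# Describe each argument of the first few callsites
for $M.callsites[^5].kv -> $i, $callsite {
    say "callsite $i ({ $callsite.bytes } bytes)";

    for $callsite.arguments -> $arg {
        # name is the empty string for positional arguments
        my $kind = $arg.name ?? "named " ~ $arg.name !! "positional";
        my $flat = $arg.is-flattened ?? ", flattened" !! "";
        my $lit  = $arg.is-literal   ?? ", literal"   !! "";
        say "  $kind: { $arg.type.^name }$flat$lit";
    }
}
```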

de-compile

Returns a string with the opcodes and their arguments.

extension-ops

.say for $M.extension-ops;  # show all extension ops

Returns a list of NQP extension operators that have been added to this bytecode. Each element consists of an ExtensionOp object.

frames

.say for $M.frames[^10];  # show the first 10 frames on the frame heap

my @frames := $M.frames.reify-all;

Returns a Frames object that serves as a Positional for all of the frames on the frame heap. Since the reification of a Frame object is rather expensive, this is done lazily on each access.

To reify all Frame objects at once, one can call the reify-all method, which also returns a list of the reified Frame objects.
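As an illustration, the list returned by reify-all can be filtered with ordinary list methods. The sketch below picks out frames that have both a name and statement annotations, using only the Frame and Statement methods documented under SUBCLASSES. Reifying every frame of a setting can be slow, so this is better suited to exploration than to hot code.

```raku
use MoarVM::Bytecode;

my $M = MoarVM::Bytecode.new("c");   # the 6.c setting

# reify-all returns a list of Frame objects
my @frames := $M.frames.reify-all;

# Keep frames that are named and have at least one statement
my @named = @frames.grep({ .name && .statements });

# Show index, name and first annotated line for a few of them
for @named.head(5) -> $frame {
    my $first = $frame.statements.head;
    say "{ $frame.index }: { $frame.name } (line { $first.line })";
}
```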

hll-name

say $M.hll-name;     # most likely "Raku"

Returns the HLL (high-level language) name of this bytecode. Most likely "Raku" or "nqp".

op

say $M.op(0x102);     #  102 istype       w(int64),r(obj),r(obj)

say $M.op("istype");  #  102 istype       w(int64),r(obj),r(obj)

Attempts to create an opcode object for the given name or opcode number. Also includes any extension ops that may be defined in the bytecode itself.
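A short sketch of inspecting a few ops by name. The op names are taken from the examples elsewhere in this document, and only the Op methods listed under SUBCLASSES are used; the output layout is illustrative.

```raku
use MoarVM::Bytecode;

my $M = MoarVM::Bytecode.new("c");   # the 6.c setting

# Look up some ops by name and show their basic properties
for <if_i unless_i istype> -> $name {
    my $op = $M.op($name);
    say "{ $op.index } { $op.name }: { $op.bytes } bytes";
    say "  operands: { $op.operands }";
}
```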

opcodes

A Buf with the actual opcodes.

sc-dependencies

.say for $M.sc-dependencies;  # identifiers for Serialization Context

Returns a list of strings of the Serialization Contexts on which this bytecode depends.

strings

.say for $M.strings[^10];  # The first 10 strings on the string heap

Returns a Strings object that serves as a Positional for all of the strings on the string heap.

version

say $M.version;     # most likely 7

Returns the numeric version of this bytecode. Most likely "7".

PRIMITIVES

bytecode

my $b = $M.bytecode;

Returns the Buf with the bytecode.

hexdump

say $M.hexdump($M.string-heap-offset);  # defaults to 256

say $M.hexdump($M.string-heap-offset, 1024);

Returns a hexdump representation of the bytecode from the given byte offset for the given number of bytes (256 by default).

slice

dd $M.slice(0, 8).chrs;     # "MOARVM\r\n"

Returns a List of unsigned 8-bit integers (bytes) from the given offset for the given number of bytes. Basically a shortcut for $M.bytecode[$offset ..^ $offset + $bytes]. The number of bytes defaults to 256 if not specified.

str

say $M.str(76);  # Raku or nqp

Returns the string from the string heap whose index is stored at the given offset in the bytecode.

subbuf

dd $M.subbuf(0, 8).decode;  # "MOARVM\r\n"

Calls subbuf on the bytecode and returns the result. Basically a shortcut for $M.bytecode.subbuf(...).

uint16

my $i = $M.uint16($offset);

Returns the unsigned 16-bit integer value at the given offset in the bytecode.

uint16s

my @values := $M.uint16s($M.string-heap-offset);  # 16 entries

my @values := $M.uint16s($M.string-heap-offset, $entries);

Returns an unsigned 16-bit integer array for the given number of entries at the given offset in the bytecode. The number of entries defaults to 16 if not specified.

uint32

my $i = $M.uint32($offset);

Returns the unsigned 32-bit integer value at the given offset in the bytecode.
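To make "the value at the given offset" concrete, here is a sketch in core Raku only, on a small hand-built buffer rather than a real bytecode file. It assumes little-endian byte order, which is an assumption of this sketch and not something stated in this document. The buffer contents and variable names are hypothetical.

```raku
# A hand-built 12-byte buffer, NOT a real bytecode file:
# 8 signature bytes followed by one little-endian 32-bit value.
my $buf = Buf.new(|"MOARVM\r\n".encode.list, 7, 0, 0, 0);

# The first 8 bytes, decoded as text
dd $buf.subbuf(0, 8).decode;

# The same 8 bytes as a list of byte values turned into characters
dd $buf[0 ..^ 8].chrs;

# Integer values starting at byte offset 8
my $as-uint32 = $buf.read-uint32(8, LittleEndian);
my $as-uint16 = $buf.read-uint16(8, LittleEndian);
say "uint32 at 8: $as-uint32, uint16 at 8: $as-uint16";
```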

uint32s

my @values := $M.uint32s($offset);  # 16 entries

my @values := $M.uint32s($offset, $entries);

Returns an unsigned 32-bit integer array for the given number of entries at the given offset in the bytecode. The number of entries defaults to 16 if not specified.

HEADER SHORTCUTS

The following methods provide shortcuts to the values in the bytecode header. They are explained in the MoarVM documentation.

sc-dependencies-offset, sc-dependencies-entries, extension-ops-offset, extension-ops-entries, frames-data-offset, frames-data-entries, callsites-data-offset, callsites-data-entries, string-heap-offset, string-heap-entries, sc-data-offset, sc-data-length, opcodes-offset, opcodes-length, annotation-data-offset, annotation-data-length, main-entry-frame-index, library-load-frame-index, deserialization-frame-index
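The header shortcuts can be combined with the primitives above. The following sketch locates the string heap from the header and compares the raw bytes with the decoded strings. It assumes the 6.c setting can be loaded and uses only methods named in this document.

```raku
use MoarVM::Bytecode;

my $M = MoarVM::Bytecode.new("c");   # the 6.c setting

# The file signature occupies the first 8 bytes
dd $M.subbuf(0, 8).decode;

# Where the string heap lives, according to the header
my $offset  = $M.string-heap-offset;
my $entries = $M.string-heap-entries;
say "string heap: $entries entries at byte offset $offset";

# Peek at the raw bytes at the start of the string heap
say $M.hexdump($offset, 64);

# Compare with the decoded view of the first few strings
.say for $M.strings[^3];
```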

SUBCLASSES

Instances of these classes are usually created automatically.

Argument

The Argument class provides these methods:

flags

The raw 8-bit bitmap of flags. The following bits have been defined:

is-flattened

Returns 1 if the argument is flattened, else 0.

is-literal

Returns 1 if the argument is a literal value, else 0.

name

The name of the argument if it is a named argument, else the empty string.

type

The type of the argument: possible values are Mu (indicating a HLL object of some kind), or any of the basic native types: str, int, uint or num.

Callsite

The Callsite class provides these methods:

arguments

The list of Argument objects for this callsite, if any.

bytes

The number of bytes this callsite needs.

ExtensionOp

The ExtensionOp class provides these methods:

adverbs

Always an empty Map.

annotation

Always the empty string.

bytes

The number of bytes this opcode uses.

name

The name with which the extension op can be called.

descriptor

An 8-byte Buf with descriptor information.

is-sequence

Always False.

Frame

The Frame class provides these methods:

is-inlineable

Returns a Bool indicating whether this frame is considered to be inlineable.

cuuid

A string representing the compilation unit ID.

de-compile

Returns a string with the opcodes and their arguments of this frame.

flags

A 16-bit unsigned integer bitmap with flags of this frame.

handlers

A list of Handler objects, representing the handlers in this frame.

has-exit-handler

1 if this frame has an exit handler, otherwise 0.

hexdump

Returns a hexdump of the opcodes of this frame. Optionally takes a named argument :highlight, which highlights the bytes of the actual opcodes (excluding any argument bytes following them).

index

A 16-bit unsigned integer indicating the frame index of this frame.

is-thunk

1 if this frame is a thunk (as opposed to a real scope), otherwise 0.

lexicals

A list of Lexical objects, representing the lexicals in this frame.

locals

A list of Local objects, representing the locals in this frame.

name

The name of this frame, if any.

no-outer

1 if this frame has no outer, otherwise 0.

opcodes

A Buf with the actual bytecode of this frame.

outer-index

A 16-bit unsigned integer indicating the frame index of the outer frame.

sc-dependency-index

A 32-bit unsigned integer index into

sc-object-index

A 32-bit unsigned integer index into

statements

A list of Statement objects for this frame, may be empty.

Handler

start-protected-region

end-protected-region

category-mask

action

register-with-block

handler-goto

Lexical

name

The name of this lexical, if any.

type

The type of this lexical.

flags

A 16-bit unsigned integer bitmap for this lexical.

sc-dependency-index

Index into the sc-dependencies list.

sc-object-index

Index into the sc-dependencies list.

Local

name

The name of this local, if any.

type

The type of this local.

Statement

line

The line number of this statement.

offset

The opcode offset of this statement.

Op

all-adverbs

Returns a List of all possible adverbs.

all-ops

Returns a List of all possible ops.

annotation

The annotation of this operation. Currently recognized annotations are:

Absence of an annotation is indicated by the empty string. See also is-sequence.

adverbs

A Map of additional adverb strings.

bytes

my $bytes := $op.bytes($frame, $offset);

The number of bytes this op occupies in memory. Returns 0 if the op has a variable size.

Some ops have a variable size that depends on the callsite in the frame in which they reside. In those cases, one can call the bytes method with the Frame object and the offset in the opcodes of that frame to obtain the number of bytes for that instance.

index

The numerical index of this operation.

is-sequence

True if this op is the start of a sequence of ops that share the same annotation.

name

The name of this operation.

new

my $op = MoarVM::Op.new(0);

my $op = MoarVM::Op.new("no_op");

Returns an instantiated MoarVM::Op object for the given name or opcode number.

not-inlineable

Returns True if the presence of this op prevents the frame to which it belongs from being inlined. Otherwise returns False.

operands

A List of operands, if any.

AUTHOR

Elizabeth Mattijsen liz@raku.rocks

COPYRIGHT AND LICENSE

Copyright 2024 Elizabeth Mattijsen

Source can be located at: https://github.com/lizmat/MoarVM-Bytecode . Comments and Pull Requests are welcome.

If you like this module, or what I’m doing more generally, committing to a small sponsorship would mean a great deal to me!

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.