JSON::Collector

zef:lizmat

NAME

JSON::Collector - collect and process JSON

SYNOPSIS

use JSON::Collector;

my $collector = JSON::Collector.new(
  :todo($todo-dir.IO),
  :done($done-dir.IO)
);
$collector.store($json);

for $collector.unprocessed -> \data {
    # process JSON data
    data.mark-as-processed($type);  # or data.discard
}

DESCRIPTION

The JSON::Collector distribution provides logic to collect JSON data into a temporary storage location awaiting processing. Then, when the data is processed, it can optionally be stored into an archive location.

This is typically intended to handle JSON as generated by push services such as GitHub, SourceHut or Codeberg webhooks.

Although this is typically used for JSON-related data, it can also be used with any other data source; the storage will, however, be JSON based.

JSON::Collector

The JSON::Collector class contains the logic for storing and handling processed data.

new

my $collector = JSON::Collector.new(
  :todo("todo".IO),
  :done("done".IO),
  :pretty, :spacing(2), :sorted-keys
);

The new method takes the following named arguments:

todo

The directory in which items added with the store method are placed. Defaults to "todo" in the current directory. The directory must exist and be writable.

done

Where the data that has been processed should be stored. It can be one of three types of destination: Nil, a Callable or an IO::Path (see mark-as-processed for how each is handled).

Defaults to "done" in the current directory. Any IO::Path (implicitly) specified must be an existing directory and be writable.

pretty / spacing / sorted-keys

These named arguments apply to the way any data added by calling the store method with a data structure (as opposed to a string) will be converted to JSON. The defaults are:

store

$collector.store($string); # or $collector.store(data);

The store method stores the given data in the todo directory. If the given data is a string, then it is assumed to be valid JSON and the string will be stored as-is.

In any other case, the given data will be converted to JSON with the pretty, spacing and sorted-keys settings specified when the collector was created. These can be overridden by specifying the same named arguments in the call to store:

$collector.store(data, :!pretty);

The store method either returns an IO::Path object of the file that was created to store the data, or Nil if something went wrong.

unprocessed

for $collector.unprocessed -> \data {
    # process JSON data
    data.mark-as-processed($type);  # or data.discard
}

The unprocessed method produces a Seq of JSON::Collector::Item objects for data that has not been processed yet.

JSON::Collector::Item

The JSON::Collector::Item class encapsulates an item that was previously stored. It acts as closely as possible like the original data stored, with some additional methods that can be invoked.

collector

The JSON::Collector object that created this JSON::Collector::Item object.

IO

The IO::Path object of the data. After having been successfully processed (with mark-as-processed), then it contains the

data

The actual data structure that was stored. It's generally not needed to call this method, as the object itself will act as the data as much as possible.

mark-as-processed

for $collector.unprocessed -> \item {
    # process JSON data
    item.mark-as-processed($type);
}

The mark-as-processed method will mark the data that was previously stored as processed.

Depending on the value specified with done in the JSON::Collector object, it will do one of three things:

if done is Nil

Then this call is equivalent to calling the discard method, as in removing the item from the "todo" directory.

if done is a Callable

Then the Callable will be called with the JSON::Collector::Item object as the only positional argument. If the return value is True, then the object will be removed from the "todo" directory, otherwise it will be kept there.

if done is an IO::Path

The optional type indication will be used as the name of a subdirectory in the "done" directory in which the data will ultimately be stored. It defaults to the empty string, which means no subdirectory.

Then the JSON file in the "todo" directory will be moved to a sub-directory YYYY/YYYY-MM-DD of the "done" directory (or of "done/$type"), using the current date.
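As an illustration of the resulting layout only, the destination directory for a given type and date could be computed as sketched below. The archive-dir helper is hypothetical and not part of JSON::Collector, the "push" type and the date are made-up example values, and how the module actually builds the path is an assumption.

```raku
# Hypothetical helper (not part of JSON::Collector): derive the
# directory "done[/$type]/YYYY/YYYY-MM-DD" for a given date.
sub archive-dir(
  IO::Path:D $done,
  Str:D      $type = "",
  Date:D     $date = Date.today
) {
    # An empty type means no extra subdirectory
    my $base = $type ?? $done.add($type) !! $done;

    # First the year, then the full ISO date
    $base.add($date.year.Str).add($date.yyyy-mm-dd)
}

# Example values are made up for illustration
say archive-dir("done".IO, "push", Date.new(2026, 1, 31));
say archive-dir("done".IO);   # no type, today's date
```

The helper only computes a path; creating the directory and moving the file are left out.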

discard

for $collector.unprocessed -> \item {
    # process JSON data
    item.discard;
}

Discards the item from the "todo" directory.

THEORY OF OPERATION

Whenever data is stored that is not already JSON, it is first converted to JSON. The JSON is then spurted to a file whose basename is the current nanosecond value, with the extension "tmp". Once the write has completed, the file is (atomically, presumably) renamed to drop the extension.

This means that many threads or processes can store data asynchronously in the same collection. When it comes to processing, it is assumed that only a single process/thread will be calling the unprocessed method. The actual processing of the items may be multi-threaded.
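The write-then-rename approach described above can be sketched in core Raku as follows. This is a minimal illustration under stated assumptions, not the module's actual code: the store-atomically and completed-files names are hypothetical, the nanosecond basename is derived from now here, and treating files without an extension as complete is inferred from the description.

```raku
# Hypothetical helper (not part of JSON::Collector): write the JSON to a
# "tmp" file first, then rename it, so that a reader never picks up a
# partially written file.
sub store-atomically(IO::Path:D $dir, Str:D $json --> IO::Path) {
    # Nanoseconds since the epoch as basename (assumed format)
    my $basename = (now.to-posix.head * 1_000_000_000).Int.Str;
    my $tmp   = $dir.add($basename ~ ".tmp");
    my $final = $dir.add($basename);

    $tmp.spurt($json);     # incomplete data only ever lives in *.tmp
    $tmp.rename($final);   # rename within the same directory
    $final
}

# Hypothetical reader: only files without an extension are complete
sub completed-files(IO::Path:D $dir) {
    $dir.dir.grep({ .f && .extension eq "" }).sort(*.basename)
}

# Try it out in a scratch directory
my $dir = $*TMPDIR.add("collector-sketch-$*PID");
$dir.mkdir;

my $file = store-atomically($dir, '{"answer":42}');
say $file.slurp;
say completed-files($dir).elems;
```

Whether the rename is truly atomic depends on the operating system and file system, which is why the text above hedges on that point. Two writers hitting the same nanosecond value would also collide in this sketch.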

AUTHOR

Elizabeth Mattijsen liz@raku.rocks

Source can be located at: https://codeberg.org/lizmat/actions . Comments and Pull Requests are welcome.

If you like this module, or what I'm doing more generally, committing to a small sponsorship would mean a great deal to me!

COPYRIGHT AND LICENSE

Copyright 2026 Elizabeth Mattijsen

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.