Math::Nearest
Raku package for various algorithm for finding Nearest Neighbors (NNs)
The implementation is tested for correctness against Mathematica's Nearest
.
(See the resource files.)
Features
- Finds both top-k Nearest Neighbors (NNs).
- Finds NNs within a ball with a given radius.
- Can return distances, indexes, and labels for the found NNs.
- The result shape is controlled with the option
:prop
.
- Works with arrays of numbers, arrays of arrays of numbers, and arrays of pairs.
- Utilizes distance functions from "Math::DistanceFunctions".
- Which can be specified by their string names, like, "bray-curtis" or "cosine-distance".
- Allows custom distance functions to be used.
- Currently has two algorithms (simple) Scan and K-Dimensional Tree (KDTree).
Installation
From Zef ecosystem:
zef install Math::Nearest
From GitHub:
zef install https://github.com/antononcube/Raku-Math-Nearest.git
Usage examples
Setup
use Math::Nearest;
use Data::TypeSystem;
use Text::Plot;
# (Any)
Set of points
Make a random set of points:
my @points = ([(^100).rand, (^100).rand] xx 30).unique;
deduce-type(@points);
# Vector(Vector(Atom((Numeric)), 2), 30)
Create the K-dimensional tree object
my &finder = nearest(@points);
say &finder;
# Math::Nearest::Finder(Algorithm::KDimensionalTree(points => 30, distance-function => &euclidean-distance))
Nearest k-neighbors
Use as a search point one from the points set:
my @searchPoint = |@points.head;
# [82.7997137400612 51.815911977937425]
Find 4
nearest neighbors:
my @res = &finder(@searchPoint, 4);
.say for @res;
# [82.7997137400612 51.815911977937425]
# [76.74690048700612 45.30314548421236]
# [93.76468756535048 53.45047592032191]
# [71.62175684954694 55.56128771507127]
Instead of using the "finder" object as a callable (functor) we can use nearest
:
.say for nearest(&finder, @searchPoint, count => 4)
# [82.7997137400612 51.815911977937425]
# [76.74690048700612 45.30314548421236]
# [93.76468756535048 53.45047592032191]
# [71.62175684954694 55.56128771507127]
Find nearest neighbors within a ball with radius 30
:
.say for &finder(@searchPoint, (Whatever, 30))
# [93.76468756535048 53.45047592032191]
# [82.7997137400612 51.815911977937425]
# [68.66354164361587 52.25273427733615]
# [67.1439347976243 51.678609421192775]
# [76.74690048700612 45.30314548421236]
# [71.62175684954694 55.56128771507127]
# [58.6565707908774 67.61014125155556]
# [80.23300662613086 80.49154195793962]
# [82.92996945428985 71.0086509061165]
# [54.30669960433475 57.177447534448724]
Plot
Plot the points, the found nearest neighbors, and the search point:
my @point-char = <* ⏺ ▲>;
say <data nns search> Z=> @point-char;
say text-list-plot(
[@points, @res, [@searchPoint,]],
:@point-char,
x-limit => (0, 100),
y-limit => (0, 100),
width => 60,
height => 20);
# (data => * nns => ⏺ search => ▲)
# ++----------+-----------+----------+-----------+----------++
# + * + 100.00
# | |
# | * |
# +* * * * * * + 80.00
# | * |
# | * * |
# | * |
# + * + 60.00
# | * ** ⏺ ▲ ⏺ |
# | ⏺ |
# + * + 40.00
# |* * |
# | |
# | * |
# + * + 20.00
# | * * |
# | * * |
# + + 0.00
# ++----------+-----------+----------+-----------+----------++
# 0.00 20.00 40.00 60.00 80.00 100.00
TODO
- TODO Implementation
- TODO Implement the Octree nearest neighbors algorithm.
- TODO Make the nearest methods work with strings
- For example, using Hamming distance over a collection of words.
- Requires using the distance function as a comparator for the splitting hyperplanes.
- This means, any objects can be used as long as they provide a distance function.
- TODO Documentation
- DONE Basic usage examples with text plots
- TODO More extensive documentation with a Jupyter notebook
- TODO Corresponding blog post
- MAYBE Corresponding video
References
[AAp1] Anton Antonov, Math::DistanceFunctions Raku package, (2024), GitHub/antononcube.
[AAp2] Anton Antonov, Algorithm::KDimensionalTree Raku package, (2024), GitHub/antononcube.
[AAp3] Anton Antonov, Data::TypeSystem Raku package, (2023), GitHub/antononcube.
[AAp4] Anton Antonov, Text::Plot Raku package, (2022), GitHub/antononcube.