DAWG

zef:slavenskoj
Revision history for DAWG

v0.0.6  2025-06-17
    - Fixed bug in minimization of large DAWGs
    - Assigned stable IDs to all nodes before minimization
    - Used proper caching with node IDs instead of memory addresses
    - Implemented correct recursive minimization that processes nodes bottom-up
    - Added proper cycle detection to handle any circular references
    - Ensured node equivalence checking compares actual structure, not just signatures

  The fixed implementation now consistently preserves all words during minimization, even with datasets of 10,000+ words.
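
  As a rough illustration of the approach described in the entries above, here is a minimal, language-neutral sketch in Python. It is not the module's Raku code: `Node`, `build_trie`, `minimize`, `words_of`, and `count_nodes` are hypothetical names invented for this example. It assumes a plain trie whose nodes carry an ID assigned at creation, a cache keyed by that ID rather than by memory address, bottom-up merging, a guard against cycles, and an equivalence key built from the node's actual structure (final flag plus labelled edges to canonical children).

```python
class Node:
    """Trie node with a stable ID assigned at creation (hypothetical layout)."""
    def __init__(self, node_id):
        self.id = node_id
        self.edges = {}      # character -> child Node
        self.final = False   # True if a word ends at this node


def build_trie(words):
    """Build an unminimized trie, numbering nodes in creation order."""
    counter = 0
    root = Node(counter)
    for word in words:
        node = root
        for ch in word:
            if ch not in node.edges:
                counter += 1
                node.edges[ch] = Node(counter)
            node = node.edges[ch]
        node.final = True
    return root


def minimize(root):
    """Merge structurally identical subtrees, children before parents."""
    register = {}        # structural key -> canonical node
    canonical_of = {}    # node ID -> canonical node (cache keyed by ID)
    in_progress = set()  # IDs on the current recursion path

    def visit(node):
        if node.id in canonical_of:
            return canonical_of[node.id]
        if node.id in in_progress:
            raise ValueError(f"cycle detected at node {node.id}")
        in_progress.add(node.id)
        # Bottom-up: replace every child with its canonical representative.
        for ch in list(node.edges):
            node.edges[ch] = visit(node.edges[ch])
        in_progress.discard(node.id)
        # The key is the full structure, so equal keys mean equal subtrees.
        key = (node.final,
               tuple(sorted((ch, child.id) for ch, child in node.edges.items())))
        canonical = register.setdefault(key, node)
        canonical_of[node.id] = canonical
        return canonical

    return visit(root)


def words_of(node, prefix=""):
    """Yield every word stored below the given node."""
    if node.final:
        yield prefix
    for ch, child in sorted(node.edges.items()):
        yield from words_of(child, prefix + ch)


def count_nodes(root):
    """Count distinct nodes reachable from the root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node.id not in seen:
            seen.add(node.id)
            stack.extend(node.edges.values())
    return len(seen)
```

  With the words "tap", "taps", "top", and "tops", this sketch builds a trie of 8 nodes and minimizes it to 5, and `words_of` returns the same four words before and after.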

v0.0.6  2025-06-17
    - Fixed package structure for proper zef installation
    - Fixed serialization in dictionary example to use binary format
    - Improved error handling for save/load operations

v0.0.5  2025-06-15
    - Tested search function
    - Added spell checking example script

v0.0.4  2025-06-15
    - Date typo fix in Changes file

v0.0.3  2025-06-15
    - README updated
    - License text updated in some files

v0.0.2  2025-06-15
    - Ran additional tests to verify functionality
    - Verified examples
    - Updated META6

v0.0.1  2025-06-06
    - Initial release
    - Core DAWG functionality with add, contains, lookup, and find-prefixes
    - DAWG minimization for space efficiency
    - JSON serialization support
    - Binary serialization with ASCII optimization
    - Memory-mapped file support for zero-copy access
    - Value association with words
    - Node ID support for direct traversal and bounded search algorithms
    - Subtree statistics computation (word count, min/max length, depth)
    - Pattern matching with wildcards (? and *)
    - Fuzzy search with Levenshtein edit distance
    - Spell-check functionality
    - Automatic 7-bit Unicode compression for dictionaries with ≤89 unique characters
    - Separate search modules (DAWG::Search::Pattern and DAWG::Search::Fuzzy)
    - Comprehensive test suite with 125+ tests
    - Full Unicode support
    - Performance optimizations for large datasets
    - Detailed documentation and examples