Lang::JA::Kana

zef:slavenskoj

A comprehensive Raku module for converting between Hiragana and Katakana, including support for historical and obsolete kana characters.

Repository: https://github.com/slavenskoj/raku-lang-ja-kana

Features

Installation

Place the module file in your Raku library path:

mkdir -p lib/Lang/JA
cp Kana.rakumod lib/Lang/JA/

Usage

Basic Usage

use Lang::JA::Kana;

# Using functions
say to-katakana("ひらがな");  # Output: ヒラガナ
say to-hiragana("カタカナ");  # Output: かたかな

Mixed Text

The module gracefully handles mixed text, converting only kana characters:

say to-katakana("Hello こんにちは World");  # Output: Hello コンニチハ World
say to-hiragana("Hello コンニチハ World");  # Output: Hello こんにちは World

Historical and Obsolete Kana

# Historical wi/we sounds (pre-1946 orthography)
say to-katakana("ゐゑ");  # Output: ヰヱ
say to-hiragana("ヰヱ");  # Output: ゐゑ

# VU sound
say to-katakana("ゔぁゔぃゔぇゔぉ");  # Output: ヴァヴィヴェヴォ
say to-hiragana("ヴァヴィヴェヴォ");  # Output: ゔぁゔぃゔぇゔぉ

# Digraph Yori
say to-katakana("ゟ");  # Output: ヿ
say to-hiragana("ヿ");  # Output: ゟ

Modern Extensions for Foreign Sounds

# FA-FO sounds
say to-katakana("ふぁふぃふぇふぉ");  # Output: ファフィフェフォ
say to-hiragana("ファフィフェフォ");  # Output: ふぁふぃふぇふぉ

# TI/DI sounds
say to-katakana("てぃでぃ");  # Output: ティディ
say to-hiragana("ティディ");  # Output: てぃでぃ

# WI-WO sounds for loan words
say to-katakana("うぃうぇうぉ");  # Output: ウィウェウォ
say to-hiragana("ウィウェウォ");  # Output: うぃうぇうぉ

Small Kana

# Small vowels
say to-katakana("ぁぃぅぇぉ");  # Output: ァィゥェォ
say to-hiragana("ァィゥェォ");  # Output: ぁぃぅぇぉ

# Small WA
say to-katakana("ゎ");  # Output: ヮ
say to-hiragana("ヮ");  # Output: ゎ

Hentaigana (Historical Kana Variants)

# Single reading Hentaigana
say hentaigana-to-hiragana("𛀁𛀂𛀃");  # Output: あいう

# Multiple reading Hentaigana (separated by middle dots)
say hentaigana-to-hiragana("𛀒𛀓");  # Output: し・せじ・ぜ

# W-series with historical/modern readings
say hentaigana-to-hiragana("𛁆𛁇𛁈");  # Output: ゐ・いゑ・えを・お

# Complex variants with multiple interpretations
say hentaigana-to-hiragana("𛂬𛂭");  # Output: ふ・ぶ・ぷへ・べ・ぺ

# Mixed text
say hentaigana-to-hiragana("Hello 𛀁𛂚 World");  # Output: Hello あこ・き World

Modern Hiragana to Hentaigana

# Single variant
say hiragana-to-hentaigana("");  # Output: 𛁂

# Multiple variants
say hiragana-to-hentaigana("あ");  # Output: 𛀁・𛄀・𛄁
say hiragana-to-hentaigana("き");  # Output: 𛀈・𛂚・𛂦

# Sound mark splitting (dakuten/handakuten)
say hiragana-to-hentaigana("が");  # Output: 𛀆・𛂥゛
say hiragana-to-hentaigana("ぱ");  # Output: 𛀩・𛂛゜

# Text conversion
say hiragana-to-hentaigana("こんにちは");  # Output: 𛀎・𛂚𛁉𛀥𛀜・𛂫𛀩・𛂛

# Mixed text
say hiragana-to-hentaigana("Hello がき World");  # Output: Hello 𛀆・𛂥゛𛀈・𛂚・𛂦 World

Half-width Katakana

# Half-width to full-width conversion
say to-fullwidth-katakana("ｱｲｳｴｵ");  # Output: アイウエオ
say to-fullwidth-katakana("ｶﾀｶﾅ");   # Output: カタカナ
say to-fullwidth-katakana("Hello ｱｲｳ World");  # Output: Hello アイウ World

# Full-width to half-width conversion
say to-halfwidth-katakana("アイウエオ");  # Output: ｱｲｳｴｵ
say to-halfwidth-katakana("カタカナ");   # Output: ｶﾀｶﾅ
say to-halfwidth-katakana("Hello アイウ World");  # Output: Hello ｱｲｳ World

# Integration with existing functions (half-width automatically converted)
say Hiragana "ｶﾀｶﾅ";  # Output: かたかな
say to-hiragana("ｱｲｳｴｵ");  # Output: あいうえお

# Voiced and semi-voiced combinations
say to-fullwidth-katakana("ｶﾞｷﾞｸﾞｹﾞｺﾞ");  # Output: ガギグゲゴ
say to-fullwidth-katakana("ﾊﾟﾋﾟﾌﾟﾍﾟﾎﾟ");  # Output: パピプペポ
say to-halfwidth-katakana("ザジズゼゾダヂヅデド");  # Output: ｻﾞｼﾞｽﾞｾﾞｿﾞﾀﾞﾁﾞﾂﾞﾃﾞﾄﾞ

# Punctuation and marks
say to-fullwidth-katakana("｡､｢｣ｰ");  # Output: 。、「」ー

Square Katakana (Enclosed Forms)

# Circled Katakana
say decircle-katakana("㋐㋑㋒㋓㋔");  # Output: アイウエオ
say decircle-katakana("㋕㋖㋗㋘㋙");  # Output: カキクケコ

# Square measurement units
say desquare-katakana("㌔㌘㍍");  # Output: キログラムメートル
say desquare-katakana("㍑㌦㍀");  # Output: リットルドルポンド

# Square technical terms
say desquare-katakana("㌲㌹㌾㍗");  # Output: ファラッドヘルツボルトワット

# Square building/location terms
say desquare-katakana("㌀㌱㍇");  # Output: アパートビルマンション

# Mixed text with square Katakana
say desquare-katakana("Price is ㌦100 per ㍑");  # Output: Price is ドル100 per リットル

Reverse Conversions

# Katakana to circled forms
say encircle-katakana("アイウエオ");  # Output: ㋐㋑㋒㋓㋔
say encircle-katakana("カキクケコ");  # Output: ㋕㋖㋗㋘㋙

# Katakana to square forms
say ensquare-katakana("キロ グラム メートル");  # Output: ㌔ ㌘ ㍍
say ensquare-katakana("ドル ポンド");  # Output: ㌦ ㍀
say ensquare-katakana("ワット ヘルツ");  # Output: ㍗ ㌹

# Mixed text conversion
say ensquare-katakana("The price is ドル for ワット");  # Output: The price is ㌦ for ㍗

# Note: Not all characters have circled forms (e.g., ン has no circled equivalent)
say encircle-katakana("アン");  # Output: ㋐ン (only ア gets circled)

API Reference

Functions

to-hiragana(Str $text) returns Str

Converts Katakana characters in the text to Hiragana.

to-katakana(Str $text) returns Str

Converts Hiragana characters in the text to Katakana.
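
For the basic kana range, the two scripts sit a fixed distance apart in Unicode: Hiragana U+3041 to U+3096 lines up with Katakana U+30A1 to U+30F6, an offset of 0x60. The sketch below uses that offset to show why non-kana text passes through unchanged. `to-katakana-sketch` is a hypothetical name, and the module itself may well use lookup tables instead, particularly for characters outside this range such as ゟ.

```raku
# Hypothetical helper, not part of Lang::JA::Kana: shifts basic Hiragana
# code points (U+3041..U+3096) up by 0x60 and leaves everything else alone.
sub to-katakana-sketch(Str $text --> Str) {
    $text.ords.map({ 0x3041 <= $_ <= 0x3096 ?? $_ + 0x60 !! $_ }).map(*.chr).join
}

say to-katakana-sketch("Hello こんにちは World");  # Output: Hello コンニチハ World
```

Subtracting 0x60 from U+30A1 to U+30F6 would give the reverse direction under the same assumption.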

to-fullwidth-katakana(Str $text) returns Str

Converts half-width Katakana characters to full-width Katakana.
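
One way to approximate this conversion without the module is Unicode compatibility normalization, which Raku exposes as `Str.NFKC`. The sketch below is only an illustration under that assumption: `to-fullwidth-sketch` is a hypothetical name, and nothing in this README says the module is implemented this way.

```raku
# Hypothetical helper, not part of Lang::JA::Kana: relies on Unicode NFKC,
# which folds half-width kana to full-width and composes ｶ + ﾞ into ガ.
# Caveat: NFKC also rewrites other compatibility characters (for example
# full-width Latin letters), so it is broader than a kana-only conversion.
sub to-fullwidth-sketch(Str $text --> Str) {
    $text.NFKC.Str
}

say to-fullwidth-sketch("ｶﾞｷﾞｸﾞｹﾞｺﾞ");  # Output: ガギグゲゴ
```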

to-halfwidth-katakana(Str $text) returns Str

Converts full-width Katakana characters to half-width Katakana.

hentaigana-to-hiragana(Str $text) returns Str

Converts Hentaigana (historical kana variants) to modern Hiragana. Characters with multiple possible readings are converted to a list separated by middle dots (・).
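
The middle-dot output format can be pictured as a table lookup followed by a join. The sketch below uses two entries taken from the usage examples in this README; the name `hentaigana-readings-sketch` and the table layout are assumptions, not the module's internals.

```raku
# Two entries copied from the usage examples; the module's real table
# and its internal layout are assumptions here.
my %readings = '𛀒' => <し せ>, '𛀓' => <じ ぜ>;

# Hypothetical helper, not part of Lang::JA::Kana.
sub hentaigana-readings-sketch(Str $text --> Str) {
    $text.comb.map(-> $ch {
        # Known characters become their readings joined by ・; others pass through.
        %readings{$ch}:exists ?? %readings{$ch}.join('・') !! $ch
    }).join
}

say hentaigana-readings-sketch("𛀒𛀓");  # Output: し・せじ・ぜ
```

Because consecutive characters are concatenated without a separator, the boundary between one character's readings and the next is not marked in the output.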

hiragana-to-hentaigana(Str $text) returns Str

Converts modern Hiragana to equivalent Hentaigana variants. Dakuten (゛) and handakuten (゜) are split from characters before conversion. Multiple variants are joined with middle dots (・).

split-sound-marks(Str $char) returns List

Splits dakuten (゛) and handakuten (゜) sound marks from kana characters.
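
The README does not show the shape of the returned list, so the following is a sketch of one plausible approach rather than the module's behavior. It uses canonical decomposition (`Str.NFD`) to separate the base kana from its combining mark. `split-marks-sketch` is a hypothetical name, and returning the base followed by the spacing mark is an assumption.

```raku
# Hypothetical helper, not part of Lang::JA::Kana: uses canonical
# decomposition (NFD) to separate a kana from its combining sound mark,
# then reports the mark in its spacing form (゛ U+309B or ゜ U+309C).
sub split-marks-sketch(Str $char --> List) {
    my @cps  = $char.NFD.list;
    my %mark = 0x3099 => '゛', 0x309A => '゜';
    @cps.map(-> $cp { %mark{$cp} // $cp.chr }).List
}

say split-marks-sketch("が");  # Output: (か ゛)
say split-marks-sketch("ぱ");  # Output: (は ゜)
say split-marks-sketch("か");  # Output: (か)
```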

decircle-katakana(Str $text) returns Str

Converts circled Katakana (㋐-㋾) into their component characters. These represent individual kana syllables enclosed in circles.

desquare-katakana(Str $text) returns Str

Converts square Katakana (㌀-㍗) into their component characters. These represent technical terms, units, and abbreviations in square boxes.

encircle-katakana(Str $text) returns Str

Converts regular Katakana characters into their circled forms (㋐-㋾). Note: Not all kana have circled equivalents (e.g., ン).

ensquare-katakana(Str $text) returns Str

Converts regular Katakana technical terms into their square forms (㌀-㍗). Only predefined technical terms are converted. Processes longer terms first to avoid partial matches.
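
To see why longest-first ordering matters, consider キログラム, which contains both キロ and グラム. The sketch below uses a hypothetical three-entry table (the module's real table is assumed to be larger) and a hypothetical `ensquare-sketch` helper that sorts terms by length before substituting.

```raku
# Hypothetical three-entry table; the module's real table is assumed to be
# much larger. ㌕ (U+3315) is the single-cell form of キログラム.
my %square = 'キロ' => '㌔', 'グラム' => '㌘', 'キログラム' => '㌕';

# Hypothetical helper, not part of Lang::JA::Kana.
sub ensquare-sketch(Str $text --> Str) {
    my $result = $text;
    # Longest terms first, so キログラム is matched before キロ or グラム.
    for %square.keys.sort(-*.chars) -> $term {
        $result .= subst($term, %square{$term}, :g);
    }
    $result
}

say ensquare-sketch("キログラム");   # Output: ㌕
say ensquare-sketch("キロ グラム");  # Output: ㌔ ㌘
```

With shortest-first ordering, the first call would instead yield ㌔㌘, since キロ would be consumed before the longer term was tried.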

Supported Character Sets

Modern Kana

Half-width Katakana

Complete support for half-width Katakana (U+FF65-U+FF9F):

Historical Kana

Hentaigana (Historical Variants)

Complete support for the Unicode Hentaigana block (U+1B000-U+1B12F), including historical variant forms of all kana syllables used in classical Japanese manuscripts. Many Hentaigana characters have multiple possible readings depending on context.

Modern Extensions

Foreign sound combinations for transcribing loan words:

Square Katakana (Enclosed Forms)

Character Conversion Behavior

Testing

Run the included test suite to verify functionality:

raku t/01-basic.t

The test suite covers:

Technical Notes

Installation

zef install Lang::JA::Kana

From Source

git clone https://github.com/slavenskoj/raku-lang-ja-kana.git
cd raku-lang-ja-kana
zef install .

License

Copyright 2025 Danslav Slavenskoj

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests for:

See Also