PublicSuffix

zef:jjatria

NAME

PublicSuffix - Query Mozilla's Public Suffix List

SYNOPSIS

use PublicSuffix;

# The effective TLD of a host name
say public-suffix 'www.example.com';
# OUTPUT: com

# New TLDs are valid public suffixes by default
say public-suffix 'www.example.unknownnewtld';
# OUTPUT: unknownnewtld

# Accept host names in Unicode
say public-suffix 'www.example.香港';
# OUTPUT: 香港

# Accept host names in punycode
say public-suffix 'www.example.xn--j6w193g';
# OUTPUT: xn--j6w193g

# Shortest domain that can be registered
say registrable-domain 'www.example.com';
# OUTPUT: example.com

# Returns a type object if registrable domain is not found
say registrable-domain 'com';
# OUTPUT: (Str)

DESCRIPTION

This module provides functions to query Mozilla's Public Suffix List: a community-maintained list of domain name suffixes. The data in this list can be used to determine the effective top-level domain of a host name, to test whether two hosts share an origin, and to perform other similar validation. It is most commonly used when validating the scope of HTTP cookies to prevent supercookies, but it has a variety of other uses.
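As an illustration of the kind of check described above, the following is a minimal sketch of a helper that tests whether two hosts fall under the same registrable domain. The name same-site is hypothetical and is not part of this module; only registrable-domain comes from PublicSuffix.

```raku
use PublicSuffix;

# Hypothetical helper, not provided by the module: two hosts are
# treated as related only if both have a registrable domain and
# those domains are identical.
sub same-site ( Str $a, Str $b --> Bool ) {
    my $x = registrable-domain $a;
    my $y = registrable-domain $b;

    # registrable-domain returns a type object when nothing is found,
    # so both results must be defined before they are compared
    $x.defined && $y.defined && $x eq $y;
}

say same-site 'www.example.com', 'mail.example.com';
say same-site 'www.example.com', 'www.example.org';
```

Comparing registrable domains rather than public suffixes is a deliberate choice in this sketch: every host under com shares a public suffix, but only hosts under example.com share a registrable domain.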

FUNCTIONS

Host name validation

The functions described below take a host name as their parameter and validate it before processing. When given a malformed or otherwise invalid host name, they will throw an X::PublicSuffix::BadDomain exception with the reason for the failure as its message.

For a domain to be valid, it must be a non-empty string, with a maximum length of 253 octets, and no more than 63 octets per label.
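The length rules above can be sketched in a few lines. This is not the module's own implementation: length-problem is a hypothetical name, the messages are made up, and the sketch assumes the host is given in its ASCII (punycoded) form, where one character is one octet.

```raku
# Hypothetical helper illustrating the length rules only.
# Returns a reason string on failure, or Nil when the checks pass.
sub length-problem ( Str $host ) {
    return 'empty host name' if $host eq '';

    return 'host name longer than 253 octets'
        if $host.encode('utf8').bytes > 253;

    for $host.split('.') -> $label {
        return 'label longer than 63 octets'
            if $label.encode('utf8').bytes > 63;
    }

    Nil;
}

say length-problem('www.example.com') // 'ok';
say length-problem( ( 'a' x 64 ) ~ '.com' ) // 'ok';
```

The real functions may apply further checks beyond these; the sketch covers only the three conditions listed in the paragraph above.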

Domains can be provided as either UTF-8 strings or their ASCII punycoded variants. The returned strings will use the format of the strings provided. In other words, if you provide a UTF-8 string, you will receive a UTF-8 string back, while giving an ASCII string will generate an ASCII string in return.

Providing partially punycoded strings is not supported, and the behaviour of these functions with that input is undefined.
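Since invalid input raises an exception rather than returning a value, callers handling untrusted host names will usually want a CATCH block. The following sketch assumes only what is documented above: the exception class name and the fact that it carries a message. The sample hosts are illustrative.

```raku
use PublicSuffix;

# One valid host, one empty string, and one host with an over-long label
my @hosts = 'www.example.com', '', ( 'a' x 64 ) ~ '.com';

for @hosts -> $host {
    my $suffix = public-suffix $host;
    say "'$host' has public suffix '$suffix'";

    # Handling the exception here ends the current iteration only,
    # so the loop carries on with the next host
    CATCH {
        when X::PublicSuffix::BadDomain {
            say "'$host' was rejected: " ~ .message;
        }
    }
}
```

Matching on the specific exception class lets any unrelated error propagate as usual.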

public-suffix

sub public-suffix ( Str $host ) returns Str

Takes a host name as a Str and returns the public suffix for that host, or the type object if no public suffix is found or if the host is the string representation of an IPv4 or IPv6 address. This function will throw an X::PublicSuffix::BadDomain exception if the host name is not valid.

According to § 3.2 of the URL living standard, the public suffix is "the portion of a host which is included on the Public Suffix List".
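Because the function returns a type object instead of throwing when there is nothing to report, the result should be checked for definedness before use. A minimal sketch using with, where 192.0.2.1 is simply a documentation-range IPv4 address chosen for illustration:

```raku
use PublicSuffix;

for 'www.example.com', '192.0.2.1' -> $host {
    # "with" runs its block only when the value is defined
    with public-suffix($host) -> $suffix {
        say "$host has public suffix $suffix";
    }
    else {
        say "$host has no public suffix";
    }
}
```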

registrable-domain

sub registrable-domain ( Str $host ) returns Str

Takes a host name as a Str and returns the registrable domain for that host, or the type object if no registrable domain is found or if the host is the string representation of an IPv4 or IPv6 address. This function will throw an X::PublicSuffix::BadDomain exception if the host name is not valid.

According to § 3.2 of the URL living standard, the registrable domain of a host is "the most specific public suffix, along with the domain label immediately preceding it, if any".
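To make the two definitions concrete, here is a self-contained toy version of the lookup. It does not reflect how this module is implemented: the three-entry rule set is a hypothetical stand-in for the real list, wildcard and exception rules are ignored, and the toy- function names are invented for the example.

```raku
# Hypothetical, tiny stand-in for the Public Suffix List
my $rules = set <com uk co.uk>;

# The most specific listed suffix, falling back to the last label
# so that unlisted TLDs still count as public suffixes
sub toy-public-suffix ( Str $host --> Str ) {
    my @labels = $host.lc.split('.');

    # Try the longest candidate first, dropping one leading label at a time
    for ^@labels.elems -> $i {
        my $candidate = @labels.skip($i).join('.');
        return $candidate if $rules{$candidate};
    }

    @labels.tail;
}

# The public suffix plus the label immediately preceding it, if any
sub toy-registrable-domain ( Str $host --> Str ) {
    my @labels      = $host.lc.split('.');
    my $suffix-size = toy-public-suffix($host).split('.').elems;

    # No label in front of the suffix: nothing can be registered
    return Str if @labels.elems <= $suffix-size;

    @labels.tail( $suffix-size + 1 ).join('.');
}

say toy-public-suffix      'www.example.co.uk'; # OUTPUT: co.uk
say toy-registrable-domain 'www.example.co.uk'; # OUTPUT: example.co.uk
say toy-registrable-domain 'com';               # OUTPUT: (Str)
```

With this rule set, co.uk wins over uk because the longest candidates are tried first, which is what "most specific" means in the definition quoted above.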

AUTHOR

José Joaquín Atria jjatria@cpan.org

ACKNOWLEDGEMENTS

The code in this distribution takes inspiration from a number of similar Perl libraries. In particular:

In addition to the distributions mentioned above, the API was mostly inspired by the publicsuffixlist Python module by ko-zu.

This module owes a debt of gratitude to their authors and those who have contributed to them, and to their choice to make their code and work publicly available.

Copyright 2022 José Joaquín Atria

This library is free software; you can redistribute it and/or modify it under the Artistic License 2.0.