Lingua::Stopwords

cpan:CHSANCH

NAME

Lingua::Stopwords - Stop words for several languages.

SYNOPSIS

    use Lingua::Stopwords;

    # Get the stopwords list: the first parameter is the ISO language code, the second is the list type
    my $stopwords = get-stopwords('en', 'snowball');

    my $text = "This is a test and it has some stopwords. This is a test for a Perl 6 module to extract them";

    # grep returns a Seq of the words that are not in the stopwords list
    my $text-parsed = $text.subst(/<:punct>/, '', :g).words.grep: { !$stopwords{$_}};

    say $text-parsed.join(' '); # OUTPUT: This test stopwords This test Perl 6 module extract

    # $stopwords is a SetHash, so you can add more words
    $stopwords<Perl>++;

    $text-parsed = $text.subst(/<:punct>/, '', :g).words.grep: { !$stopwords{$_}};

    say $text-parsed.join(' '); # OUTPUT: This test stopwords This test 6 module extract
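
Note that "This" survives in the output above: the lookup is case-sensitive and the capitalized form is not in the list. The following is a minimal sketch of case-insensitive filtering. To keep it runnable without the module, it uses a small hand-written SetHash as a stand-in for the value returned by get-stopwords; that word list and the remove-stopwords helper name are assumptions made for illustration and are not part of the module.

```raku
# Hypothetical stand-in for get-stopwords('en', 'snowball');
# the real list is much longer.
my $stopwords = SetHash.new(<this is a and it has some for to them>);

# Hypothetical helper: strip punctuation, split into words, and drop
# every word whose lowercase form is in the stopwords set.
sub remove-stopwords(Str $text, $stopwords --> Str) {
    my @kept = $text.subst(/<punct>/, '', :g).words.grep({ !$stopwords{.lc} });
    return @kept.join(' ');
}

my $text = "This is a test and it has some stopwords. This is a test for a Perl 6 module to extract them";

say remove-stopwords($text, $stopwords); # OUTPUT: test stopwords test Perl 6 module extract
```

Lowercasing only the lookup key, rather than the whole text, keeps the original capitalization of the words that remain.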

DESCRIPTION

This module provides stop word lists for several languages.

Supported languages

For each language, this module provides stop word lists from different sources. The "all" list type contains every list available for the language.

Language     ISO code   List type
Catalan      ca         ranks-nl
Danish       da         snowball, ranks-nl, all
Dutch        nl         snowball, ranks-nl, all
English      en         snowball, ranks-nl, all
Finnish      fi         snowball, ranks-nl, all
French       fr         snowball, ranks-nl, all
Galician     gl         ranks-nl
German       de         snowball, ranks-nl, all
Hebrew       he         ranks-nl
Hungarian    hu         snowball, ranks-nl, all
Italian      it         snowball, ranks-nl, all
Norwegian    no         snowball, ranks-nl, all
Portuguese   pt         snowball, ranks-nl, all
Russian      ru         snowball, ranks-nl, all
Spanish      es         snowball, ranks-nl, all
Swedish      sv         snowball, ranks-nl, all
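
Since not every language offers every list type, it can help to check a combination before requesting it. The sketch below encodes the table above in two sets; the list-types helper is hypothetical and not part of the module, and how the module itself reacts to an unsupported combination is not covered here.

```raku
# ISO codes that, per the table above, only have a ranks-nl list.
my $ranks-nl-only = set <ca gl he>;

# ISO codes that have the snowball, ranks-nl and all lists.
my $full = set <da nl en fi fr de hu it no pt ru es sv>;

# Hypothetical helper: the list types available for an ISO code,
# or an empty list when the language is not in the table.
sub list-types(Str $lang) {
    return ('ranks-nl',)                   if $lang (elem) $ranks-nl-only;
    return ('snowball', 'ranks-nl', 'all') if $lang (elem) $full;
    return ();
}

say list-types('en'); # OUTPUT: (snowball ranks-nl all)
say list-types('ca'); # OUTPUT: (ranks-nl)
say list-types('xx'); # OUTPUT: ()
```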

REPOSITORY

Fork this module on GitHub: https://github.com/chsanch/perl6-Lingua-Stopwords

BUGS

To report bugs or request features, please use https://github.com/chsanch/perl6-Lingua-Stopwords/issues

AUTHOR

This module was inspired by the Perl 5 module Lingua::Stopwords.

The Snowball stoplists used by this module were created as part of the Snowball project (see http://snowball.tartarus.org).

The Ranks NL stoplists used by this module were created by Ranks NL.

Christian Sánchez <chsanch@cpan.org>

LICENSE

You can use and distribute this module under the terms of the Artistic License 2.0. See the LICENSE file included in this distribution for complete details.

The META6.json file of this distribution may be distributed and modified without restrictions or attribution.