Lingua::Stem::EnBroken(3pm)    User Contributed Perl Documentation    Lingua::Stem::EnBroken(3pm)

NAME
       Lingua::Stem::EnBroken - Porter's stemming algorithm for 'generic' English

SYNOPSIS
           use Lingua::Stem::EnBroken;
           my $stems = Lingua::Stem::EnBroken::stem({
               -words      => $word_list_reference,
               -locale     => 'en',
               -exceptions => $exceptions_hash,
           });

DESCRIPTION
       This routine MIS-applies the Porter Stemming Algorithm to its parameters, returning the
       stemmed words. It is an intentionally broken version of Lingua::Stem::En for people
       needing backwards compatibility with Lingua::Stem 0.30 and Lingua::Stem 0.40. Do not use
       it if you aren't one of those people.

       It is derived from the C program "stemmer.c" as found in freewais and elsewhere, which
       contains these notes:

          Purpose:    Implementation of the Porter stemming algorithm documented
                      in: Porter, M.F., "An Algorithm For Suffix Stripping,"
                      Program 14 (3), July 1980, pp. 130-137.
          Provenance: Written by B. Frakes and C. Cox, 1986.

       I have re-interpreted areas that use Frakes and Cox's "WordSize" function. My version may
       misbehave on short words starting with "y", but I can't think of any examples.

       The step numbers correspond to Frakes and Cox, and are probably in Porter's article (which
       I've not seen). Porter's algorithm still has rough spots (e.g. current/currency, words
       ending in -ings), which I've not attempted to cure, although I have added support for the
       British -ise suffix.

CHANGES
        2003.09.28 -  Documentation fix

        2000.09.14 -  Forked from the Lingua::Stem::En.pm module to provide
                      a backward-compatible, deliberately broken version for
                      people who need behavior consistent with 0.30 and 0.40
                      more than they need accurate stemming.

METHODS
       stem({ -words => \@words, -locale => 'en', -exceptions => \%exceptions });
           Stems the list of words passed in -words using the rules of US English. Returns an
           anonymous array reference to the stemmed words.

           Example:

             my $stemmed_words = Lingua::Stem::EnBroken::stem({
                 -words      => \@words,
                 -locale     => 'en',
                 -exceptions => \%exceptions,
             });
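
           A fuller sketch of the call above, with the inputs filled in. The word list and the
           exceptions hash are illustrative only. The sketch assumes that -exceptions maps a
           lowercased word to the stem that should be returned for it in place of the
           algorithm's result, as exceptions do elsewhere in Lingua::Stem, and that stems come
           back in the same order as the input words.

```perl
use strict;
use warnings;
use Lingua::Stem::EnBroken;

# Illustrative input; any list of English words would do.
my @words = qw(Running jumps currency Sydney);

# Assumption: each key is a lowercased word and each value is the stem
# to return for that word instead of running the stemming rules on it.
my %exceptions = ( sydney => 'sydney' );

my $stems = Lingua::Stem::EnBroken::stem({
    -words      => \@words,
    -locale     => 'en',
    -exceptions => \%exceptions,
});

# Print each input word next to the stem returned at the same index.
for my $i (0 .. $#words) {
    printf "%-10s => %s\n", $words[$i], $stems->[$i];
}
```

           No expected output is shown, because this module deliberately reproduces the
           stemming behavior of the 0.30 and 0.40 releases rather than correct Porter output.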

       stem_caching({ -level => 0|1|2 });
           Sets the level of stem caching.

           '0' means 'no caching'. This is the default level.

           '1' means 'cache per run'. This caches stemming results during a single
               call to 'stem'.

           '2' means 'cache indefinitely'. This caches stemming results until
               either the process exits or the 'clear_stem_cache' method is called.

       clear_stem_cache;
           Clears the cache of stemmed words.
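
           The two caching routines might be combined as in the following sketch, which turns
           on indefinite caching for a series of stem calls and then releases the cache. The
           batches of words are made up for illustration, and calling the routines as fully
           qualified functions is an assumption based on how stem is called in the SYNOPSIS.

```perl
use strict;
use warnings;
use Lingua::Stem::EnBroken;

# Level 2: keep cached stems across separate calls to stem().
Lingua::Stem::EnBroken::stem_caching({ -level => 2 });

# Illustrative batches; the repeated word can be served from the cache.
my @batches = ( [qw(stemming stemmed)], [qw(stemmed stems)] );

my @results;
for my $batch (@batches) {
    my $stems = Lingua::Stem::EnBroken::stem({
        -words  => $batch,
        -locale => 'en',
    });
    push @results, $stems;
    print join(' ', @$stems), "\n";
}

# Release the cached stems and return to the default of no caching.
Lingua::Stem::EnBroken::clear_stem_cache();
Lingua::Stem::EnBroken::stem_caching({ -level => 0 });
```

           Level 2 trades memory for speed in a long-running process, so clearing the cache
           once a batch of work is finished keeps that memory from growing without bound.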

NOTES
       This code is almost entirely derived from the Porter 2.1 module written by Jim Richardson.

SEE ALSO
        Lingua::Stem

AUTHOR
         Jim Richardson, University of Sydney
         jimr AT maths.au or http://www.maths.usyd.edu.au:8000/jimr.html

         Integration in Lingua::Stem by
         Jerilyn Franz, FreeRun Technologies,
         <cpan AT jerilyn.info>

COPYRIGHT
       Jim Richardson, University of Sydney
       Jerilyn Franz, FreeRun Technologies

       This code is freely available under the same terms as Perl.

BUGS
TODO
perl v5.30.3                                2020-08-23                Lingua::Stem::EnBroken(3pm)
