NAME
Lingua::ZH::WordSegmenter - Simplified Chinese Word Segmentation

VERSION
Version 0.01

SYNOPSIS
    use Encode;    # provides encode()
    use Lingua::ZH::WordSegmenter;

    my $segmenter = Lingua::ZH::WordSegmenter->new();
    print encode('gbk', $segmenter->seg($_) );

DESCRIPTION
This is a Perl implementation of simplified Chinese word segmentation.

The segmenter searches for the longest word at each point, scanning from both the left and the right, and chooses the result with the higher frequency product.

The original program is the CPAN module Lingua::ZH::WordSegment (http://search.cpan.org/~chenyr/). I made the following changes:

1) made the interface object oriented;
2) made the internal strings UTF-8;
3) used Sogou's dictionary (http://www.sogou.com/labs/dl/w.html) as the default dictionary.

METHODS
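EXAMPLES
The DESCRIPTION outlines a bidirectional longest-match strategy. The following is a minimal sketch of that idea, not the module's actual code. The dictionary, its frequencies, the unknown-character frequency, and all function names (forward_match, backward_match, score, segment) are hypothetical and exist only for illustration. The sketch also assumes that "frequency product" means the product of word frequencies over the whole input; the module may compare candidates differently.

```perl
use strict;
use warnings;
use utf8;    # the source file contains literal Chinese characters

# Hypothetical toy dictionary: word => relative frequency (made-up values).
my %freq = (
    '研究'   => 0.004,
    '研究生' => 0.002,
    '生命'   => 0.003,
    '命'     => 0.001,
    '起源'   => 0.002,
    '的'     => 0.05,
);
my $MAX_LEN  = 3;       # length of the longest dictionary entry, in characters
my $UNK_FREQ = 1e-6;    # assumed frequency for single characters not in %freq

# Left-to-right pass: at each position take the longest dictionary word,
# falling back to a single character when nothing matches.
sub forward_match {
    my ($text) = @_;
    my @words;
    my $pos = 0;
    my $n   = length $text;
    while ($pos < $n) {
        my $len = $MAX_LEN < $n - $pos ? $MAX_LEN : $n - $pos;
        $len-- while $len > 1 && !exists $freq{ substr($text, $pos, $len) };
        push @words, substr($text, $pos, $len);
        $pos += $len;
    }
    return @words;
}

# Right-to-left pass: the same rule, anchored at the end of the string.
sub backward_match {
    my ($text) = @_;
    my @words;
    my $end = length $text;
    while ($end > 0) {
        my $len = $MAX_LEN < $end ? $MAX_LEN : $end;
        $len-- while $len > 1 && !exists $freq{ substr($text, $end - $len, $len) };
        unshift @words, substr($text, $end - $len, $len);
        $end -= $len;
    }
    return @words;
}

# Product of the word frequencies of one candidate segmentation.
sub score {
    my $product = 1;
    $product *= ($freq{$_} // $UNK_FREQ) for @_;
    return $product;
}

# Keep whichever direction yields the higher frequency product.
sub segment {
    my ($text) = @_;
    my @fwd = forward_match($text);
    my @bwd = backward_match($text);
    return score(@fwd) >= score(@bwd) ? @fwd : @bwd;
}

binmode STDOUT, ':encoding(UTF-8)';
print join(' ', segment('研究生命的起源')), "\n";    # prints: 研究 生命 的 起源
```

With this toy dictionary the forward pass greedily takes '研究生' and is left with the rare single character '命', while the backward pass finds '研究' and '生命'. The backward result has the higher product, so it is the one returned.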
SEE ALSO
Lingua::ZH::WordSegment

AUTHOR
Zhang Jun, "<jzhang533 at gmail.com>"

COPYRIGHT & LICENSE
Copyright 2007 Zhang Jun, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.