20 packages returned for Tags:"tokenizer"

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks...
A .NET class library that makes it easier to parse text. The library tracks the current position within the text, ensures your code never attempts to access a character at an invalid index, and includes many methods that make parsing easier. The library makes your text-parsing code more concise and...
Pre-release version. The API might change later. A lemma is the canonical form of a word. For example, the words "run", "runs", "ran" and "running" can all be lemmatized to "run". XLemmatizer tokenizes and lemmatizes English sentences. How to use: 1) Create a new instance of Lemmatizer 2) Call...
The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks...
WQuery lets you parse and then edit HTML code through a fluent interface, much like the jQuery library. WQuery is part of the Wojdav Bootstrap Mvc package. HTML parsing is based on the WHtmlParser library. For now, WHtmlParser contains some...