Morphological library for developers

Informatic offers morphological modules for a wide range of tasks, from search tools to the analysis of text information: for example, information retrieval systems, analysis systems and electronic document catalogs.

The morphology modules of “Informatic” are based on advanced text processing technologies and on linguistic and mathematical algorithms. They can be used for context search in documents across all word forms, synonym search, grammar checking and spell checking. The modules are also useful for analysing information in databases.

The morphology modules of “Informatic” are successfully used in the systems developed by Samsung, Syngenta, Quantum Art, ALP, Yandex and many others.

Available morphology modules:

Developer’s Tool Kit
 Spelling Check
 Grammar Check
 Russian Morphology (and 7 other languages)
 Russian Thesaurus
 Russian Hyphenation
Indexer for Microsoft

Developer’s Tool Kit

Spelling check

Spelling check (Speller) is a complete spell-checking module that offers suggestions and lets users add new words, with all their word forms, to the user dictionary.

The word-adding module comes in two variants:

• with built-in screen interface

This module generates a list of paradigm hypotheses for the added word, ranked by compliance with Russian grammar. In 90% of cases the right hypothesis is among the first three. The user chooses the right paradigm from the list, and the word is added to the dictionary with all its forms. If a word has a more complicated paradigm, the user may choose it from the rest of the list. The User Dictionary may be linked to other linguistic modules of Informatic.

• without screen interface

This module also generates a list of paradigm hypotheses for the added word, ranked by compliance with Russian grammar. The API returns the list of word forms for each hypothesis and its characteristics (part of speech, for example). The chosen hypothesis can be added to the user dictionary, which is then used in later checks. The User Dictionary may be linked to other linguistic modules of Informatic.
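The workflow described above (rank the hypotheses, inspect their word forms and part of speech, then store the chosen one) can be sketched roughly as follows. This is only an illustration in Python: the names `Hypothesis`, `UserDictionary` and `rank`, as well as the scores, are hypothetical and are not the actual Informatic API, which is a Windows DLL whose interface is not shown here.

```python
from dataclasses import dataclass, field


# Hypothetical data model for illustration; NOT the Informatic API.
@dataclass
class Hypothesis:
    part_of_speech: str
    forms: list[str]
    score: float  # higher means a better fit to the grammar (made-up scale)


@dataclass
class UserDictionary:
    words: set[str] = field(default_factory=set)

    def add(self, hypothesis: Hypothesis) -> None:
        # Store every form of the chosen paradigm so later checks accept them all.
        self.words.update(form.lower() for form in hypothesis.forms)

    def knows(self, word: str) -> bool:
        return word.lower() in self.words


def rank(hypotheses: list[Hypothesis]) -> list[Hypothesis]:
    # Best-scoring hypothesis first.
    return sorted(hypotheses, key=lambda h: h.score, reverse=True)


# Toy hypotheses for a new word "blog"; the scores are invented.
candidates = [
    Hypothesis("verb", ["blog", "blogs", "blogged", "blogging"], 0.4),
    Hypothesis("noun", ["blog", "blogs"], 0.6),
]
ranked = rank(candidates)

dictionary = UserDictionary()
dictionary.add(ranked[0])  # the caller picks a hypothesis; here simply the top one
```

In this sketch only the forms of the selected paradigm become known to the dictionary, so choosing the noun hypothesis leaves "blogging" unknown until the verb paradigm is added as well.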

Available for the following languages:
- Russian;
- Ukrainian;
- English;
- French;
- German;
- Spanish;
- Italian;
- Portuguese.
The product is designed as a dynamic library (*.dll) for Windows.
Prices of morphology modules details

Grammar Check

Grammar Check (Russian Grammar) covers more than 40 Russian grammar rules, including comma punctuation. Testing showed that the module flags more than 50% of widespread grammatical and syntactic errors.

The product is designed as a dynamic library (*.dll) for Windows.

Russian Morphology

Russian Morphology includes several modules for the morphological analysis of Russian words and helps to solve the following problems:

Normalising words to the dictionary form is useful for finding one form of a word by another. This module returns the dictionary form of a word using the main dictionary or a user morphological dictionary of unlimited size. For every input word it also generates the part of speech and a 4-byte numeric hash code, which may be used for text indexing.

Synthesis of all forms of a given word. The module returns all forms of a given word if it is in the main dictionary or in the extended user morphological dictionary of unlimited size.

The library performs exact dictionary-based analysis of known words; the dictionary includes more than 240 000 entries, which generate more than 4 million word forms. For unknown words, the library reliably predicts grammatical characteristics and the paradigm from a set of inflection rules.

Key characteristics of the module:
- the common dictionaries contain more than 240 000 words;
- quick dictionary extension: in 99% of cases the system determines the inflection type of an added word by itself;
- unique word identifiers: each known word gets its own unique identifier, which makes it possible to build a compact index of an arbitrary text corpus that supports search across all word forms.
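To illustrate how per-word identifiers allow a compact index that can be searched by any word form, here is a minimal Python sketch. It assumes a tiny hand-written form-to-lemma table and sequentially numbered identifiers; the table, the numbering and the function names are all hypothetical and do not reflect the library's real dictionary or API.

```python
from collections import defaultdict

# Toy morphological dictionary: word form -> dictionary form (lemma).
FORM_TO_LEMMA = {
    "стол": "стол", "стола": "стол", "столе": "стол", "столом": "стол",
    "книга": "книга", "книги": "книга", "книгу": "книга", "книгой": "книга",
}

# One integer identifier per lemma (hypothetical numbering scheme).
LEMMA_ID = {lemma: i for i, lemma in enumerate(sorted(set(FORM_TO_LEMMA.values())))}


def word_id(form):
    """Return the identifier shared by all forms of a known word, else None."""
    lemma = FORM_TO_LEMMA.get(form.lower())
    return None if lemma is None else LEMMA_ID[lemma]


def build_index(docs):
    """Map each word identifier to the set of documents containing any form of it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            wid = word_id(token)
            if wid is not None:  # unknown words are skipped in this sketch
                index[wid].add(doc_id)
    return index


def search(index, query_form):
    """Find documents containing any form of the queried word."""
    wid = word_id(query_form)
    return sorted(index.get(wid, set())) if wid is not None else []


docs = {
    1: "книги лежат на столе",
    2: "он купил книгу",
    3: "ножка стола",
}
index = build_index(docs)
print(search(index, "книга"))
print(search(index, "столом"))
```

Because the index stores one identifier per word rather than every surface form, a query in any form ("столом") reaches documents that contain other forms ("столе", "стола").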

The library helps to build Russian morphology into information retrieval systems. It supports the full range of morphological analysis for known and unknown Russian words: determining the grammatical characteristics of a word, normalising it to the dictionary form, and generating the required word forms.

The product is designed as a dynamic library (*.dll) for Windows.

Thesaurus

Thesaurus (Russian Thesaurus) displays Russian synonyms, antonyms and related words.

Russian Thesaurus comprises more than 70 000 Russian words and expressions. The thesaurus data consists of synonyms (about 10 000 synonym groups, with about 3.7 synonyms per group on average), antonyms and related words.

Capabilities of the Thesaurus:

• recognises Russian words regardless of their form in the text;

• proposes synonyms and antonyms for each word in the same form as the base word.
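Returning synonyms in the same form as the base word can be thought of as three steps: analyse the input form, look up synonyms of its dictionary form, and synthesise each synonym with the same grammatical tag. The Python sketch below shows this under simplifying assumptions: the three tables, the tag names and the function name are hypothetical toy data, not the module's real dictionaries or interface.

```python
# Toy data only; a real thesaurus and morphology dictionary are far larger.

# Word form -> (dictionary form, grammatical tag).
ANALYSIS = {
    "большой": ("большой", "nom"),
    "большого": ("большой", "gen"),
    "большому": ("большой", "dat"),
}

# Dictionary form -> synonyms (also given as dictionary forms).
SYNONYMS = {"большой": ["крупный", "огромный"]}

# Dictionary form -> {grammatical tag: word form}.
PARADIGMS = {
    "крупный": {"nom": "крупный", "gen": "крупного", "dat": "крупному"},
    "огромный": {"nom": "огромный", "gen": "огромного", "dat": "огромному"},
}


def synonyms_in_same_form(word):
    """Return synonyms of `word`, inflected to match the form of `word`."""
    analysis = ANALYSIS.get(word.lower())
    if analysis is None:
        return []  # unknown word: nothing to propose in this sketch
    lemma, tag = analysis
    result = []
    for synonym in SYNONYMS.get(lemma, []):
        form = PARADIGMS.get(synonym, {}).get(tag)
        if form is not None:
            result.append(form)
    return result


print(synonyms_in_same_form("большого"))
```

The same lookup would serve antonyms if a second table of antonym pairs were added alongside `SYNONYMS`.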

The product is designed as a dynamic library (*.dll) for Windows.

Hyphenation

Hyphenation (Russian Hyphenation).

The module lets you set the hyphenation quality (Books or Newspapers) and the code of the hyphenation symbol. The letter «ё» is preserved.

The product is designed as a dynamic library (*.dll) for Windows.

Microsoft Indexing Services

Russian Indexer for Microsoft

Russian Indexer for Microsoft considerably extends the capabilities of Microsoft Indexing Service and Microsoft SQL Server for working with Russian documents: you can search across all word forms on the basis of morphological analysis. The module is intended for system integrators and developers of applications that use morphological search.

Russian Indexer for Microsoft allows you to:

• build a full-text search index in Microsoft SQL Server that takes Russian morphology into account, which noticeably simplifies administering and using indexes;

• improve the precision, recall and speed of search;

• use effective search on your company's website or online shop.

Using Russian morphology makes it possible to:

• determine delimiters and word forms correctly;

• use a stop-word list.

Advantages of Russian Indexer for Microsoft:

• it is developed according to the Microsoft specification;

• supports the document formats MS Office, XML and HTML;

• the list of formats can be extended;

• works with catalogs (web, document archive);

• works with fields of database tables;

• is supplied with an installer;

• has a topic-oriented stop-word vocabulary;

• has a common dictionary of 280 000 words (4.5 million word forms).