A set of software libraries (a framework) for processing natural-language texts using an associative-ontological approach was developed by specialists of the St. Petersburg Institute of Informatics and Automation of the Russian Academy of Sciences (http://www.spiiras.nw.ru/), Research Automation laboratory (https://sial.iias.spb.su).
The software modules for processing natural-language texts are designed for the following tasks:
- one-time loading of content from Internet sites or continuous monitoring of Internet sites;
- associative text search and thematic classification of texts;
- assessing the quality of texts and filtering out advertising materials and automatically generated texts;
- composing an abstract of a text;
- plotting a graphical map of a specified subject area;
- preprocessing natural-language texts for content search, for building a search index, and for subsequent processing in analytical systems.
The main distinguishing feature of the associative approach is that the search returns documents in which all the words of the search query are connected by semantic links within the document.
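The difference from plain keyword matching can be illustrated with a minimal sketch. It is not the framework's implementation: the text above does not say how semantic links are computed, so the sketch assumes, purely for illustration, that two words are linked when they occur in the same sentence. The function names `sentence_word_sets` and `associative_match` are hypothetical.

```python
import re
from itertools import combinations


def sentence_word_sets(text):
    """Split text into sentences and return the set of lowercase words of each."""
    sentences = re.split(r"[.!?]+", text)
    return [set(re.findall(r"\w+", s.lower())) for s in sentences if s.strip()]


def associative_match(text, query):
    """Return True if all query words are pairwise linked inside the document.

    ASSUMPTION: a "semantic link" is approximated by co-occurrence in one
    sentence. The real framework may define links quite differently.
    """
    words = sorted(set(re.findall(r"\w+", query.lower())))
    if not words:
        return False
    word_sets = sentence_word_sets(text)
    if len(words) == 1:
        return any(words[0] in s for s in word_sets)
    # Every pair of query words must share at least one sentence.
    return all(
        any(a in s and b in s for s in word_sets)
        for a, b in combinations(words, 2)
    )


doc_a = "The framework builds a search index. The index supports queries."
doc_b = "The framework was released. An index appeared elsewhere."

print(associative_match(doc_a, "framework index"))
print(associative_match(doc_b, "framework index"))
```

Both sample documents contain both query words, so a plain keyword search would return both. Under the co-occurrence assumption only `doc_a` matches, because `doc_b` never uses the two words in the same sentence.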
The framework is intended first and foremost for building specialized search systems, news monitoring systems and news aggregators, content monitoring systems, and information-analytical systems.
Internet monitoring makes it possible to identify resources with the required content, as well as the issues that the Internet community considers relevant.
Monitoring of internal document flow makes it possible to promptly add all newly created documents to the search database, so that individual documents, and all documents related to them by topic or by links, can be found quickly.
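The idea of adding documents to the search database as they appear can be sketched with a small incremental inverted index. This is an illustration under stated assumptions, not the framework's API: the class `SearchIndex`, its methods, and the `min_shared` parameter are hypothetical, and "related by topic" is approximated crudely as sharing a minimum number of words.

```python
import re
from collections import defaultdict


class SearchIndex:
    """Hypothetical in-memory index that accepts new documents at any time."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> ids of documents containing it
        self.doc_words = {}               # document id -> set of its words

    def add(self, doc_id, text):
        """Index a new document, or re-index an existing one."""
        # Remove stale postings if the document was indexed before.
        for word in self.doc_words.get(doc_id, set()):
            self.postings[word].discard(doc_id)
        words = set(re.findall(r"\w+", text.lower()))
        self.doc_words[doc_id] = words
        for word in words:
            self.postings[word].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every query word."""
        words = re.findall(r"\w+", query.lower())
        if not words:
            return set()
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result

    def related(self, doc_id, min_shared=2):
        """ASSUMPTION: documents are 'related' if they share >= min_shared words."""
        base = self.doc_words[doc_id]
        return {
            other
            for other, words in self.doc_words.items()
            if other != doc_id and len(base & words) >= min_shared
        }


index = SearchIndex()
index.add("d1", "quarterly budget report for the sales department")
index.add("d2", "sales department budget forecast")
index.add("d3", "office party invitation")

print(index.search("budget sales"))
print(index.related("d1"))
```

A document passed to `add` becomes searchable immediately, without rebuilding the index. A production system would also need stop-word filtering, morphological normalization, and persistent storage, none of which are shown here.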