System for High-Volume Input of Paper Documents
Cognitive Forms is a document capture system designed by Cognitive Technologies, a company headed by Olga A. Uskova (Head of the Cybernetics Department). It is a flagship product in the market of intelligent software solutions. The research team included professors of the Department as well as postgraduate and undergraduate students.

System Overview: The system is a software suite that sets up a processing chain for the high-volume input of standardized documents. The system modules are installed on PCs in a local network. Together these modules form a data processing conveyor that handles the input of up to 70 000 pages per day.

Types of processed documents: Paper documents are used in many areas of life: forms, blanks, questionnaires, invoices, payment orders, policies, contracts, ID papers, etc. Documents are grouped by layout style and, consequently, by the level of processing they require:
- Rigidly structured documents, in which the location and size of the fields are fixed. Such documents are also described as "matching when held up to the light", meaning that overlaid copies line up exactly. Examples: questionnaires, exam tests, blanks, insurance forms, medical insurance claims, tax declarations, etc.
- Non-rigidly structured documents, or "flexible documents", in which the composition of the parts and their horizontal and vertical order are the same, but the parts can differ in size or scale. Examples: invoices, VAT invoices, consignment notes, payment orders, etc.
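The difference between the two document groups can be illustrated with a minimal sketch. It is not the Cognitive Forms implementation: the `Field` class, the pixel coordinates, the tolerance parameter, and the "reading order" comparison are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A fill-in field on a page (hypothetical representation)."""
    name: str
    x: int  # left edge, pixels
    y: int  # top edge, pixels
    w: int  # width, pixels
    h: int  # height, pixels


def matches_rigid(template, page, tol=0):
    """True if every field has the template's position and size (within tol)."""
    if len(template) != len(page):
        return False
    by_name = {f.name: f for f in page}
    for t in template:
        p = by_name.get(t.name)
        if p is None:
            return False
        pairs = ((t.x, p.x), (t.y, p.y), (t.w, p.w), (t.h, p.h))
        if any(abs(a - b) > tol for a, b in pairs):
            return False
    return True


def reading_order(fields):
    """Field names sorted top to bottom, then left to right."""
    return [f.name for f in sorted(fields, key=lambda f: (f.y, f.x))]


def matches_flexible(template, page):
    """True if the same fields appear in the same order; sizes may differ.

    Assumption: comparing reading order is a simplified stand-in for
    "same composition and order of parts".
    """
    return reading_order(template) == reading_order(page)
```

Under these assumptions, a page whose fields are stretched or shifted downward fails `matches_rigid` but can still pass `matches_flexible`, which mirrors the distinction drawn above between forms with fixed fields and documents such as invoices.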
Main technological stages:
- Scanning: Filled-in forms are loaded into the scanner; the result is a batch of TIFF files with CCITT Group 4 compression.
- Sorting and assembly: To use the scanner efficiently, documents are scanned as a single batch. At the sorting stage the batch is split into separate documents by detecting the first page of each one. Completeness is checked by recognizing key fields. A complete document goes on to the recognition stage; an incomplete one goes to the correction stage.
- Sorting correction: An operator determines why the document is incomplete.
- Recognition: The recognition module determines the page type from its graphical appearance, finds the input fields, and recognizes their content. It saves the page type, the input data, and their location in a dedicated database.
- Verification: An operator checks the fields that the system could not recognize.
- Export: At the final stage the document is converted into standard DBF or XML format or passed to an external system.
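The routing between these stages can be sketched as follows. This is an illustrative outline only, not the actual product code: the `Stage` enum, the function names, the `is_first_page` callback, and the use of `None` for an unrecognized field are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    SCANNING = "scanning"
    SORTING = "sorting and assembly"
    SORTING_CORRECTION = "sorting correction"
    RECOGNITION = "recognition"
    VERIFICATION = "verification"
    EXPORT = "export"


def split_batch(pages, is_first_page):
    """Split a scanned batch into documents.

    A new document starts at each page that is_first_page() reports as a
    first page. Pages seen before any first page form their own document.
    """
    documents = []
    for page in pages:
        if is_first_page(page) or not documents:
            documents.append([page])
        else:
            documents[-1].append(page)
    return documents


def after_sorting(is_complete):
    """Complete documents go to recognition, incomplete ones to correction."""
    return Stage.RECOGNITION if is_complete else Stage.SORTING_CORRECTION


def fields_for_operator(recognized):
    """Names of fields the operator must check at the verification stage.

    Assumption: an unrecognized field is stored with the value None.
    """
    return [name for name, value in recognized.items() if value is None]
```

For example, `split_batch(["A1", "a2", "B1"], lambda p: p[0].isupper())` groups the pages into two documents, with the uppercase initial standing in for whatever first-page detector is actually used.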
