SOFTWARE SYSTEM FOR DATA PROCESSING

Name
Daniil
Surname
Vorobyev
Scientific organization
NUST MISIS
Academic degree
Master
Position
Engineer
Scientific discipline
Information technologies
Topic
SOFTWARE SYSTEM FOR DATA PROCESSING
Abstract
Technical solutions are proposed for the structure, functions and organization of data processing tools based on web technologies and cloud resources. We formulate the strategic direction for developing a software system for high-performance processing of large volumes of electronic data for problems of intelligent structural analysis: classification, machine learning and forecasting in quantum physics.
Keywords
Big data, neural network, data mining, data base, VASP
Summary

Recent decades have seen intensive development of automated information systems (AIS), including in areas such as Internet network technologies, ways of storing and representing knowledge, programming languages and tools, artificial intelligence techniques, algorithms, and distributed and cloud computing.

Scientific and technical achievements in the field of artificial intelligence have influenced the formation of new classes of information systems and the transformation of old ones: intelligent information systems, data mining systems, expert systems, decision support systems, etc. Unfortunately, most modern tools are scattered and solve narrow, domain-specific tasks or

development of a "system" that integrates in its structure all of the most advanced tools and approaches, striving to improve the efficiency of using information stored in electronic form.

 

The specific nature of IDPS leaves a significant imprint on the methodology and technology of its development. The technology for creating IDPS has its own essential features and differs from the design and development process of other information and software systems. These differences are largely determined by the fact that IDPS is an intelligent information system (IIS) based on the ideas, principles and techniques of artificial intelligence (AI). Such systems require specialized approaches, methods and technologies that differ considerably from the classical methods of software systems development. However, the ever-increasing interest in artificial intelligence contributes to more widespread adoption and use of intelligent systems in different application areas. The development of artificial intelligence systems (expert systems, data mining systems, translation, machine vision, etc.) is gradually reaching an industrial scale and acquiring the features of an industry. As a result, such systems are beginning to face the same requirements as traditional software products. The specific nature of IDPS lies in the fact that the development and implementation of intelligent information systems is a rather long and laborious process, which has not yet been fully worked out, and often requires new

efficiency and quality of the resulting system is largely determined by the talent and expertise of its developers. In addition, the problem of implementation is also connected with the choice of appropriate development tools, which is a separate challenge. Modern analytical information systems acquire the characteristics of software systems based on data mining algorithms, expert systems, artificial intelligence systems, and machine learning. We single out the following basic features of IDPS, which define it as a tool for expert support:

- Ability to solve a wide range of tasks in a formalized problem area.

- The ability to extract knowledge from data and present it in the form of formal knowledge models.

- Modeling of intellectual property rights mechanisms.

- The use of domain knowledge.

- The use of heuristic methods for solving problems.

- Ability to explain the decision.

- High performance, etc.

Cloud computing technology. Cloud computing refers to a system based on virtual machines that can increase or decrease its performance on demand with minimal delay.

A remote computing service can be called a cloud if it satisfies the requirements formulated by the American NIST (National Institute of Standards and Technology) in Standards Acceleration to Jumpstart Adoption of Cloud Computing (SAJACC). All cloud services are divided into three basic categories:

- Infrastructure as a Service (IaaS). The user is given an unlimited set of virtual machines and sets their configuration; the user can also connect them into a network and thus build a virtual cluster of the required configuration. At this level, the only thing the user does not control is the hardware; everything else is under the user's control.

- Platform as a Service (PaaS). At this level the user is abstracted from virtual machines, networks and clusters. The user works as if with a disk of unlimited size and a computer with unlimited memory and a very fast multi-core processor. In fact, the platform organizes a cluster so that it looks like a single machine.

- Software as a Service (SaaS). Users work with a program remotely. Cloud versions exist of 1C, Photoshop and office programs. The user can configure only their own account in the program and the program settings within their own session.

Data Mining Platform. The developed platform can be classified as SaaS.

The system also includes client-side extension functionality, and it is expected that the system will have a part of the PaaS functionality.

The developed platform has the following features (SaaS BI):

- A workspace available as a service (authentication and authorization tools, uploading and editing of files).

- A library of algorithms and analysis subsystems, including clustering and classification, construction of rules and decision trees, neural networks, genetic algorithms, statistical algorithms, and so forth.

- Tools for automatically constructing collective solutions based on the algorithms.

- Tools for analyzing the effectiveness of training on the data.

- Visualization of analysis results.
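One common way to build a "collective solution" from several algorithms in the library is simple majority voting. The source does not say which combination rule the platform uses, so the sketch below is only an illustration under that assumption; the `Classifier` type, the function name and the toy classifiers are all hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# Hypothetical type: a classifier maps a feature vector to a class label.
Classifier = Callable[[Sequence[float]], Hashable]


def collective_decision(classifiers: Sequence[Classifier],
                        sample: Sequence[float]) -> Hashable:
    """Return the label chosen by most classifiers (simple majority vote).

    Ties are resolved in favour of the label that received a vote first.
    """
    if not classifiers:
        raise ValueError("at least one classifier is required")
    votes = Counter(clf(sample) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy classifiers used only for illustration (not part of the platform).
def by_first_feature(x: Sequence[float]) -> str:
    return "A" if x[0] > 0.5 else "B"


def by_second_feature(x: Sequence[float]) -> str:
    return "A" if x[1] > 0.5 else "B"


def by_sum(x: Sequence[float]) -> str:
    return "A" if sum(x) > 1.0 else "B"


committee = [by_first_feature, by_second_feature, by_sum]
label = collective_decision(committee, [0.9, 0.2])  # two of three vote "A"
```

Weighted voting or stacking would fit the same interface; majority voting is chosen here only because it needs no training data.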

Strategic plans include expanding user capabilities for creating and adjusting individual sections of the library of algorithms (PaaS), including:

- Editor to generate new algorithms based on the meta-language.

- Ability to upload algorithms to public access.

- Ability to build scripts using analysis algorithms.

- Integration with the business processes of users / customers (at the end of the analysis or training, the server will be specifically configured to automatically send data to the customer's server and thus will

- Work with the server via the API (automatic data entry into IDPS).
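The integration item above, where the server sends data to the customer's server once analysis or training finishes, could look roughly like the following sketch. The source does not specify the API, so the callback URL, the field names (`task_id`, `status`, `results`) and the sample values are purely hypothetical. The request is only prepared here, not sent.

```python
import json
from urllib.request import Request


def build_result_callback(callback_url: str, task_id: str,
                          results: dict) -> Request:
    """Prepare (but do not send) a POST request carrying analysis results.

    The payload layout is an assumption made for illustration only.
    """
    body = json.dumps({
        "task_id": task_id,
        "status": "finished",
        "results": results,
    }).encode("utf-8")
    return Request(
        callback_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up example values; customer.example is a placeholder host.
request = build_result_callback(
    "https://customer.example/api/results",
    "task-42",
    {"accuracy": 0.93},
)
payload = json.loads(request.data.decode("utf-8"))
```

Passing the prepared object to `urllib.request.urlopen(request)` would actually deliver it; a real integration would also need authentication and retry handling, which are omitted here.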

Strategic development plan. The intelligent data processing system (IDPS) is a software and hardware system configured to use different classes of recognition, language and

focused on providing services in the field of scientific and applied problems of a variety of processing

The complex integrates high-performance hardware (including high-performance computing servers and extensive disk storage) and efficient software designed for a wide range of data mining applications and allowing the supported features to be configured and extended.

The system supports and implements the concept of public cloud computing and offers the following categories of cloud services:

- Software as a Service (SaaS) - users of the system access specialized software over the Internet through a web interface that runs in any modern browser, and can use it to solve their problems, including resource-intensive ones;

- Infrastructure as a Service (IaaS) - users of the system can rent computing infrastructure resources, such as computing servers and storage systems, for long periods when needed, and can also draw on the expertise of the staff who support the operation and development of the system to solve their problems.

 

IDPS main functions:

- The system provides tools for solving problems in the following areas: statistical data processing (correlation and regression analysis, analysis of variance and covariance, etc.), pattern recognition (classification with training), clustering (classification without training), identification (finding the distinctive features of objects), forecasting (determining trends in the development of processes), and extracting knowledge from data (data mining) and text (text mining). Thanks to the open architecture and API of the system, the set of supported problem classes can be easily extended, including through solver modules from third-party developers.

also HTML, from spreadsheets in Excel and OpenDocument formats, from relational databases, as well as from web applications and services, such as Google spreadsheets, Zoho Creator databases or Yandex services. Furthermore, the data can be entered, modified and edited directly through the system interface. Data can also be exported in different file formats, in particular CSV, XML, HTML, PDF, RTF, Excel, JPEG and PNG. In addition to these, other import / export converters can be integrated into the system to work with specific formats and data sources.

- IDPS contains tools for visualization and graphical representation of the original data and the results of their processing in various forms, including all sorts of graphs and charts, as well as for generating a variety of reports that can be published within the system or exported for later use outside IDPS.

- The system includes interactive tutorials, guides and test modules on intelligent data processing, designed to teach users to solve the relevant problems effectively with the system and to increase their knowledge of data processing models and methods. In addition, IDPS makes it possible to organize e-learning courses on its basis and to conduct automated knowledge testing in different subject areas.

- For each user, IDPS automatically creates a website that functions within the system and is available to other users. On this website, the user can aggregate and publish any materials that do not contradict the rules adopted in the system, and limit access to them when needed. In addition to the content, users can change the design and structure of their sites within certain limits.

- The system supports a variety of communication services for communication between users and for receiving feedback from them, which may include the exchange of electronic messages (including real-time chat) and mailing lists, subscriptions to news and to information on changes occurring in the system (for example, changes on the sites of particular users), thematic forums and discussions, surveys and polls.

management of both the site as a whole and the entities within the site: page layouts, templates,
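The list of functions above mentions export to several file formats and the ability to integrate additional import / export converters. One possible way to organize such pluggable converters is a registry keyed by format name. This is only a sketch under that assumption: the names `EXPORTERS`, `register_exporter` and `export` are hypothetical and do not come from the system itself, and only two simple text formats are shown.

```python
import csv
import io
from html import escape
from typing import Callable, Dict, List

Table = List[List[str]]
Exporter = Callable[[Table], str]

# Hypothetical registry of export converters keyed by format name.
EXPORTERS: Dict[str, Exporter] = {}


def register_exporter(fmt: str) -> Callable[[Exporter], Exporter]:
    """Decorator that adds a converter to the registry under a format name."""
    def decorator(func: Exporter) -> Exporter:
        EXPORTERS[fmt.lower()] = func
        return func
    return decorator


@register_exporter("csv")
def to_csv(table: Table) -> str:
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table)
    return buffer.getvalue()


@register_exporter("html")
def to_html(table: Table) -> str:
    rows = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in table
    )
    return f"<table>{rows}</table>"


def export(table: Table, fmt: str) -> str:
    """Convert a table with the converter registered for the given format."""
    exporter = EXPORTERS.get(fmt.lower())
    if exporter is None:
        raise ValueError(f"unsupported export format: {fmt}")
    return exporter(table)


data = [["x", "y"], ["1", "2"]]
csv_text = export(data, "csv")
html_text = export(data, "HTML")  # format names are case-insensitive
```

A third-party module would add a format by decorating its own function with `register_exporter`, without changing the core `export` function.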

Design solutions for an intelligent knowledge repository and for automating information support of managerial decision-making, together with data mining technology, artificial intelligence methods, models, reporting and electronic databases on various subject areas, will serve as a tool for improving the efficiency of research and innovative educational activities, as well as of work in industry, economics, medicine, and so on.

The practical implementation of the project makes it usable as a self-adjusting, adaptive, open intelligent information system with integrated expert system and data mining functions.