  • Using the mean distance criterion to identify novelty in the data

    The article discusses the specifics of identifying novelty in data and surveys general methods for detecting it. Because noise-free training data is a decisive factor in building high-quality classifiers in supervised machine learning, the article considers a practically important special case of novelty detection: novelty is identified within individual classes of the training data after all outliers have been removed from them. To make the search for novelty more concrete, a geometric interpretation of novelty in the space of object feature values is used.

    Keywords: data, classifier, outliers, novelty, novelty detection, geometric approach, statistical criterion

  • Correction of training samples taking into account errors in measuring object features when constructing classifiers by supervised learning

    The article considers noise in training samples, most of which consists of outliers and novelty. The main causes of outliers in training samples are analyzed, and the essence of the main existing approaches to detecting them is reviewed. Based on the nearest neighbors method, a modified method for comparing generalized distances from objects to classes is proposed. For the main types of metrics used in feature spaces, justified values of the safety factors used in this technique are found. For a programmatic assessment of the quality of a training sample and a well-founded choice of the method for correcting outliers, it is proposed to use the permissible fractions of corrected and removed outliers in the sample. An algorithm for analyzing the presence of outliers in a set of training examples is given, together with an estimate of its complexity as a function of the input length. An algorithm for evaluating and correcting training samples has been developed.

    Keywords: classification problem, classifier, decision function, training sample, precedent, erroneous data, analysis, correction, artificial intelligence, compactness hypothesis, novelty, learning

  • Constructing classifiers for complex objects in multidimensional spaces

    The article is devoted to the topical problem of constructing classifiers for objects represented by points in a multidimensional feature space. The principle of linear normal classification of objects in a multidimensional feature space can be used to build a classifier for sets of complex structure that, in general, cannot be separated by a single hyperplane. In such cases, it is proposed to use a set of hierarchically related normal separating hyperplanes, called a normal hierarchical classifier.

    Keywords: recognition, classification, feature space, the geometric method

  • Linear classification of objects using normal hyperplanes

    The article is devoted to the topical problem of constructing classifiers for objects represented by points in a multidimensional feature space. A variant of the geometric separation of sets by hyperplanes normal to the line connecting the centers of the data sets is presented. This approach to constructing separating planes reduces the number of computational operations performed. The authors' criterion of normal separability yields an exact solution of the normal separation problem that is quite efficient in terms of computational complexity, requiring only a linear search over the points of the sets being separated. The proposed approach to classifying sets in the multidimensional space of their attribute values can serve as a starting point for building computationally efficient classifiers not only for normally separable sets but also for more complex variants. This is the main practical significance of the material presented by the authors.

    Keywords: recognition, classification, feature space, the geometric method
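
The last abstract describes separating two sets by a hyperplane normal to the line between their centers, with a separability check that needs only a linear pass over the points. The sketch below is one possible reading of that idea, not the authors' algorithm. It assumes that "center" means the arithmetic mean (centroid) of each set, that the sets count as normally separable when their projections onto the centroid-to-centroid direction do not overlap, and that the threshold sits midway in the gap. The function names `centroid`, `dot`, `fit_normal_hyperplane` and `classify` are hypothetical.

```python
def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length points."""
    n = len(points)
    dim = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def fit_normal_hyperplane(class_a, class_b):
    """Try to separate two point sets by a hyperplane whose normal is the
    direction from the centroid of class_a to the centroid of class_b.

    Returns (w, threshold) if the projections of the two sets onto w do
    not overlap, otherwise None. One linear pass over each set.
    """
    ca, cb = centroid(class_a), centroid(class_b)
    w = [b - a for a, b in zip(ca, cb)]  # assumed normal direction

    # Largest projection of A and smallest projection of B onto w.
    max_a = max(dot(w, p) for p in class_a)
    min_b = min(dot(w, p) for p in class_b)

    if max_a >= min_b:
        return None  # projections overlap: not separable along w
    # Assumption: place the hyperplane midway in the gap.
    return w, (max_a + min_b) / 2.0


def classify(model, point):
    """Label a point 'A' or 'B' by the side of the hyperplane it falls on."""
    w, threshold = model
    return "B" if dot(w, point) > threshold else "A"
```

When `fit_normal_hyperplane` returns `None`, a single normal hyperplane is not enough under these assumptions. That is the situation in which the preceding abstract proposes a hierarchy of such hyperplanes.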