Special Session on Data Cleansing, Selection and Pre-processing for Machine Learning


25th International Conference on
Knowledge-Based and Intelligent Information & Engineering Systems KES 2021

Szczecin, Poland
8 - 10 September 2021


    The success of machine learning and the usefulness of machine learning models depend on the data used. Training data should be provided in a way that supports optimization of the learning process and an acceptable, or the best possible, level of generalization in the final models. In the traditional approach, data preparation precedes the machine learning stage. However, data preparation can also be integrated with the learning process, serve as an internal mechanism of a given machine learning algorithm, be performed interactively with special tools, or run as batch processing. Data preparation is also crucial given the massive growth in the scale of data.

    Data preparation for machine learning may include a number of activities or processes: data cleansing, noise detection and elimination, data editing, instance selection, noise reduction, outlier elimination, detection of wrong or distorted labels, resampling, feature extraction and selection, and elements of data transformation. Data preparation can also be combined with techniques that address the class imbalance problem.

    Data preparation can be relatively easy or complex, and may rely on simple tools or require complex computations. The tools can also be based on visualization; visual data analytics is on the rise, especially in multi-dimensional business applications. However, addressing the data preparation problem for machine learning may mean having no prior knowledge of how the process should be carried out. Data preparation is also crucial in Big Data scenarios, including data streams.

    We encourage submissions describing very recent and, if possible, unprecedented applications. New theoretical or empirical approaches are also welcome, as are papers focusing on industrial or commercial applications.


    The topics of interest for this session include, but are not limited to:

    • Data selection
    • Data editing
    • Data cleansing
    • Data engineering
    • Feature selection and extraction
    • Instance selection
    • Data normalization
    • Data transformation
    • Data quality
    • Imperfect data
    • Data pre-processing
    • Imbalanced data processing
    • Undersampling
    • Resampling
    • Data visualisation
    • Data validation
    • Data pre-processing technologies
    • Application of intelligent techniques for data cleansing, selection and transformation
    • Other related topics


    All contributions must be of high quality and original, and must not have been previously published or be intended for publication elsewhere.

    The papers will be reviewed by the International Program Committee. The best submissions will be selected for presentation and will be included in the conference proceedings.

    The conference proceedings will be published in Elsevier's Procedia Computer Science open access journal, available in ScienceDirect and submitted to be indexed/abstracted in CPCI (ISI conferences and part of Web of Science), Engineering Index, and Scopus.

    Authors of selected papers may be invited to submit extended versions of their papers for publication as full journal papers, for example in the KES Journal or other journals.

    Submitted papers should be prepared in the Procedia style and should be limited to 10 pages. All papers must be submitted electronically through the PROSE online submission and review system.

    Guidance notes for the preparation of papers are available.


    Submission of papers: 4 May 2021 (extended - strict deadline!)
    Notification of acceptance: 14 May 2021
    Camera-ready paper submission: 28 May 2021

    Conference: 8 - 10 September 2021

    More details are available at the KES 2021 website.


    Antonio J. Tallón-Ballesteros, University of Seville, Spain
    Ireneusz Czarnowski, Gdynia Maritime University, Poland