Scientific Program

Conference Series Ltd invites participants from across the globe to attend the 4th International Conference on Big Data Analysis and Data Mining in Paris, France.

Day 2 :

Keynote Forum

Mikhail Moshkov

King Abdullah University of Science and Technology (KAUST), Saudi Arabia

Keynote: Extensions of Dynamic Programming: Applications for Decision Trees

Time : 09:30-10:00


Mikhail Moshkov has been a professor in the CEMSE Division at King Abdullah University of Science and Technology, Saudi Arabia, since October 1, 2008. He earned a master’s degree from Nizhni Novgorod State University, received his doctorate from Saratov State University, and completed his habilitation at Moscow State University. From 1977 to 2004, Dr. Moshkov was with Nizhni Novgorod State University. From 2003 he worked in Poland at the Institute of Computer Science, University of Silesia, and from 2006 also at the Katowice Institute of Information Technologies. His main areas of research are complexity of algorithms, combinatorial optimization, and machine learning. Dr. Moshkov is the author or coauthor of five research monographs published by Springer.



In the presentation, we consider extensions of the dynamic programming approach to the investigation of decision trees as algorithms for problem solving, as a means of knowledge extraction and representation, and as classifiers which, for a new object given by values of conditional attributes, define a value of the decision attribute. These extensions allow us (i) to describe the set of optimal decision trees, (ii) to count the number of these trees, (iii) to optimize decision trees sequentially relative to different criteria, (iv) to find the set of Pareto optimal points for two criteria, and (v) to describe relationships between two criteria. The applications include the minimization of average depth for decision trees sorting eight elements (a question that had been open since 1968), improvement of upper bounds on the depth of decision trees for diagnosis of 0-1-faults in read-once combinatorial circuits over a monotone basis, existence of totally optimal (with minimum depth and minimum number of nodes) decision trees for Boolean functions, study of the time-memory tradeoff for decision trees for corner point detection, study of relationships between the number and maximum length of decision rules derived from decision trees, and study of the accuracy-size tradeoff for decision trees. The last of these allows us to construct sufficiently small and accurate decision trees for knowledge representation, as well as decision trees that, as classifiers, often outperform decision trees constructed by CART. The end of the presentation is devoted to an introduction to KAUST.
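As a rough illustration of the kind of dynamic programming over subtables that the abstract refers to, the sketch below computes the minimum depth of a decision tree for a small, consistent decision table. It is not taken from the talk: the function name `min_depth`, the table encoding (rows as tuples of attribute values, one decision per row), and the choice of depth as the cost are all assumptions made for the example.

```python
from functools import lru_cache


def min_depth(rows, decisions):
    """Minimum depth of a decision tree for a consistent decision table.

    rows      -- tuple of tuples, one tuple of attribute values per object
    decisions -- tuple with the decision attribute value of each object
    (Hypothetical encoding, chosen only for this illustration.)
    """
    n_attrs = len(rows[0])

    @lru_cache(maxsize=None)
    def solve(subset):
        # A subtable whose rows share one decision needs no further queries.
        if len({decisions[i] for i in subset}) <= 1:
            return 0
        best = None
        for a in range(n_attrs):
            values = {rows[i][a] for i in subset}
            if len(values) < 2:
                continue  # this attribute does not split the subtable
            # Depth is driven by the worst branch below the query on attribute a.
            worst = max(
                solve(frozenset(i for i in subset if rows[i][a] == v))
                for v in values
            )
            if best is None or 1 + worst < best:
                best = 1 + worst
        if best is None:
            raise ValueError("inconsistent table: equal rows, different decisions")
        return best

    return solve(frozenset(range(len(rows))))


# XOR of two binary attributes: both attributes must be queried.
xor_rows = ((0, 0), (0, 1), (1, 0), (1, 1))
print(min_depth(xor_rows, (0, 1, 1, 0)))
```

Each subtable is solved once and cached, so repeated subproblems are shared. Other cost functions, such as the number of nodes, could be handled with the same recursion by changing how branch costs are combined.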

Keynote Forum

Fuad Aleskerov

National Research University Higher School of Economics, Russia

Keynote: Effective choice and ranking of alternatives in search and recommendation problems

Time : 10:00-10:30


Professor Fuad Aleskerov is a leading scientist in mathematics and in multicriterial choice and decision-making theory. He is the Head of the International Laboratory of Decision Choice and Analysis and the Head of the Department of Mathematics for Economics of the National Research University Higher School of Economics (Moscow, Russia). He has published 10 books and many articles in leading academic journals. He is a member of several scientific societies, a member of the editorial boards of several journals, and the founder and head of many conferences and workshops. He has been an invited speaker at numerous international conferences, workshops, and seminars.


The high computational complexity of the most accurate algorithms in search, ranking, and recommendation applications is a critical problem when we deal with large datasets. Even quadratic complexity may be inadmissible. Thus, the task is to develop efficient algorithms through successive reduction of information and through the use of linear algorithms in the first steps.

The problem of whether functions of several variables can be expressed as a superposition of functions of fewer variables was first formulated by Hilbert in 1900 as Hilbert’s thirteenth problem. The answer to this general question for the class of continuous functions was given in 1957 by Arnold and Kolmogorov. For the class of choice functions, this question has been studied only by our team.

A new effective method for search, ranking, and recommendation problems in large datasets is proposed, based on superposition of choice functions. The developed algorithms have low computational complexity, so they can be applied to big data. One of the main features of the method is the ability to identify the set of efficient options when one deals with a large number of options or criteria. Another feature of the method is the ability to adjust its computational complexity. The application of the developed algorithms to the Microsoft LETOR dataset showed 35% higher efficiency compared to standard techniques (for instance, SVM).
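The idea of a superposition of choice functions can be sketched as follows: a cheap, linear-time choice function first shrinks the set of options, and a more expensive one is applied only to what remains. The sketch is not the authors' algorithm; the function names (`threshold_choice`, `pareto_choice`, `superposition`), the particular choice functions, and the sample data are assumptions made for the example.

```python
def threshold_choice(criterion, min_value):
    """Linear-time choice function: keep options whose score on one
    criterion is at least min_value (hypothetical first-stage filter)."""
    def choose(options):
        return [x for x in options if x[criterion] >= min_value]
    return choose


def pareto_choice(options):
    """Quadratic-time choice function: keep options that no other option
    dominates (larger is better on every criterion)."""
    def dominates(a, b):
        return all(p >= q for p, q in zip(a, b)) and any(
            p > q for p, q in zip(a, b)
        )
    return [x for x in options if not any(dominates(y, x) for y in options)]


def superposition(options, *choice_functions):
    """Apply the choice functions one after another, each to the output
    of the previous one."""
    for choose in choice_functions:
        options = choose(options)
    return options


# Options scored on two criteria (made-up numbers).
options = [(5, 1), (4, 4), (1, 5), (3, 3), (0, 0), (2, 1)]
selected = superposition(options, threshold_choice(0, 2), pareto_choice)
print(selected)
```

In this example the threshold stage discards `(1, 5)`, which the Pareto rule alone would have kept, so the order and the parameters of the stages affect both the result and the cost. Tightening or loosening the first stage is one simple way to trade accuracy for computational complexity.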

The proposed methods can be applied, for instance, to the selection of effective options in search and recommendation systems, decision support systems, Internet networks, traffic classification systems, and other relevant fields.

Keynote Forum

Omar M Knio

King Abdullah University of Science and Technology, Saudi Arabia

Keynote: Data enabled approaches to sensitivity analysis, calibration and risk visualization in general circulation models

Time : 10:50-11:20


Omar M Knio completed his PhD at MIT in 1990. He held a postdoctoral position at MIT before joining the Mechanical Engineering Faculty at Johns Hopkins University in 1991. In 2011, he joined the Mechanical Engineering and Materials Science Department at Duke University. In 2013, he joined the AMCS Program at KAUST, where he served as Deputy Director of the SRI Center for Uncertainty Quantification in Computational Science and Engineering. He has co-authored over 100 journal papers and two books.


This talk discusses the exploitation of large databases of model realizations for assessing model sensitivities to uncertain inputs and for calibrating physical parameters. Attention is focused on databases of individual realizations of an ocean general circulation model, built through efficient sampling approaches. We then turn to the use of sampling schemes to build suitable representations of the dependence of the model response on uncertain input data. Non-intrusive spectral projections and regularized regressions are used for this purpose. A Bayesian inference formalism is then applied to update the uncertain inputs based on available measurements or observations. We illustrate the implementation of these techniques through extreme-scale applications, including inference of physical parametrizations and quantitative assessment and visualization of forecast uncertainties.
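To give a sense of what a non-intrusive spectral projection does, the sketch below builds a Legendre polynomial surrogate for a model with a single uncertain input distributed uniformly on [-1, 1], using only evaluations of the model at sample points. It is a toy example under stated assumptions, not the speaker's implementation: the names `legendre`, `nisp_coefficients`, and `surrogate`, the midpoint quadrature, and the stand-in model are all hypothetical.

```python
def legendre(k, x):
    """Legendre polynomial P_k(x) via the three-term recurrence."""
    if k == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, k):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def nisp_coefficients(model, order, n_points=2000):
    """Project model(x), x uniform on [-1, 1], onto P_0..P_order.

    The model is treated as a black box: it is only evaluated at the
    quadrature points (composite midpoint rule, chosen for simplicity).
    """
    xs = [-1.0 + (2 * i + 1) / n_points for i in range(n_points)]
    ys = [model(x) for x in xs]
    # c_k = E[f P_k] / E[P_k^2], with E[P_k^2] = 1 / (2k + 1).
    return [
        (2 * k + 1) * sum(y * legendre(k, x) for x, y in zip(xs, ys)) / n_points
        for k in range(order + 1)
    ]


def surrogate(coeffs, x):
    """Evaluate the polynomial surrogate at x."""
    return sum(c * legendre(k, x) for k, c in enumerate(coeffs))


# Stand-in for an expensive simulation (assumption for the example).
coeffs = nisp_coefficients(lambda x: x * x, order=2)
print(coeffs, surrogate(coeffs, 0.5))
```

Once the coefficients are available, the surrogate can replace the expensive model in sensitivity analysis or inside a Bayesian update, where many evaluations are needed. With several uncertain inputs, sparse quadrature or regularized regression would typically replace the simple one-dimensional rule used here.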