Day 1 :
Indiana University, USA
Time : 10:15-11:00
Thomas Sterling is a Professor of Intelligent Systems Engineering at the Indiana University School of Informatics and Computing. He serves as Chief Scientist and Associate Director of the Center for Research in Extreme Scale Technologies (CREST). Since receiving his PhD from MIT in 1984 as a Hertz Fellow, he has been engaged in research associated with parallel computing system structures and semantics. He is the co-author of 6 books, holds 6 patents, and was the recipient of the 2013 Vanguard Award.
Data analytics in its many forms has rapidly expanded to engage scientific, industrial, and societal application domains. As more problem spaces yield to this expanding genre of computing, the demand for capability continues to grow. Simultaneously, high performance computing (HPC) systems and methods are experiencing significant change in form and function, driven by the asymptotic convergence toward nano-scale semiconductor feature sizes and the end of Moore’s law, even as exascale performance is anticipated in the early years of the next decade. Historically, these two processing domains have been largely independent, but a growing consensus is now driving them together, aligning their respective modalities and catalyzing a synergistic convergence. A major premise of the US Presidential Executive Order that led to the National Strategic Computing Initiative stipulates that the merger of big data and numerically intensive computing be a constituent of the national exascale charter. This presentation will describe the significant shifts in system architecture and operational methodology that will be required to respond simultaneously to the challenges of the end of Moore’s law and to graph processing approaches, potentially dynamic ones, that will augment the more conventional matrix-vector oriented computation. It will discuss the likely importance of dynamic adaptive resource management and task scheduling, essential for dramatic improvements in scalability and efficiency in exascale computing, and how these changes will be applied to knowledge discovery.
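The benefit of dynamic over static task scheduling that the abstract alludes to can be illustrated with a toy sketch (this is not the speaker's runtime system; all function names and the workload are invented for the example). Under irregular task costs, a pull-based dynamic schedule, where each task goes to the worker that frees up first, finishes sooner than a static round-robin assignment:

```python
# Toy illustration of static vs. dynamic task scheduling under
# irregular task costs. Hypothetical names; not an HPC runtime.
import heapq

def static_assignment(costs, n_workers):
    """Round-robin: task i goes to worker i % n_workers."""
    loads = [0.0] * n_workers
    for i, c in enumerate(costs):
        loads[i % n_workers] += c
    return max(loads)  # makespan = finish time of the busiest worker

def dynamic_assignment(costs, n_workers):
    """Pull model: each task goes to the worker that frees up first."""
    finish = [0.0] * n_workers  # current finish time of each worker
    heapq.heapify(finish)
    for c in costs:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + c)
    return max(finish)

# Irregular workload: two expensive tasks among many cheap ones.
costs = [8.0, 8.0] + [1.0] * 12
print(static_assignment(costs, 4))   # 11.0 — two workers get a big task plus cheap ones
print(dynamic_assignment(costs, 4))  # 8.0  — idle workers absorb the cheap tasks
```

The same adaptivity argument, applied at the scale of millions of concurrent tasks with runtime-discovered costs, motivates the dynamic resource management discussed in the talk.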
University of Derby, UK
Time : 11:20-12:05
Fionn Murtagh is Professor of Data Science; his previous work spans Big Data in education, astrophysics, and cosmology. He directed national research funding across many domains, including computing and engineering, energy, nanotechnology, and photonics. He has been Professor of Computer Science, Head of Department, and Head of School at several universities. He was Editor-in-Chief of The Computer Journal for more than 10 years, and is a member of the editorial boards of many other journals.
Geometric data analysis allows for “letting the data speak” and integrates qualitative and quantitative analytics. Its scope and potential are considerable in many fields. The case studies here are large-scale social media analytics, related to an area of social practice and to an area of health and well-being. The survey by Keiding and Louis, “Perils and potentials of self-selected entry to epidemiological studies and surveys”, points to important issues in big data analytics; my contribution is in the discussion part of that paper. Through the geometry and topology of data and information, with the inclusion of context, chronology, and frame-models, we address such issues of sampling and representativity. The case studies discussed in this presentation relate to mental health and to social entertainment events and contexts, the latter involving many millions of Twitter tweets in many languages. Particular consideration is given to the use and implementation of our analytical perspectives. This includes determining the information content of our data clouds, mapping onto Euclidean-distance-endowed semantic factor spaces, and the ultrametric or hierarchical topology that is characteristic of all forms of complex systems.
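The ultrametric topology mentioned at the end of the abstract can be made concrete with a minimal sketch (a toy illustration, not the speaker's software; the data points are invented). The subdominant ultrametric induced by single-linkage hierarchical clustering equals the minimax path distance between points, and it satisfies the strong triangle inequality d(x,z) ≤ max(d(x,y), d(y,z)):

```python
# Toy sketch: the subdominant ultrametric of a point cloud, computed
# as minimax path distances (a Floyd-Warshall variant). Invented data.
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def subdominant_ultrametric(points):
    """All-pairs minimax path distances over the complete graph."""
    n = len(points)
    d = [[euclidean(points[i], points[j]) for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Best path i -> j may route through k: its cost is the
                # largest edge on the path, and we minimize that.
                d[i][j] = min(d[i][j], max(d[i][k], d[k][j]))
    return d

# Two tight clusters far apart.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
u = subdominant_ultrametric(pts)

# Strong (ultrametric) triangle inequality holds for every triple.
for x in range(4):
    for y in range(4):
        for z in range(4):
            assert u[x][z] <= max(u[x][y], u[y][z]) + 1e-9

print(u[0][1], u[0][2])  # within-cluster 1.0 vs. between-cluster 10.0
```

Within-cluster ultrametric distances stay small while all between-cluster distances collapse to the single linkage gap, which is the hierarchical structure the talk exploits.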
InfoCodex AG-Semantic Technologies, Switzerland
Time : 12:05-12:50
Carlo A Trugenberger earned his PhD in Theoretical Physics in 1988 at the Swiss Federal Institute of Technology, Zürich, and his Master’s in Economics in 1997 from Bocconi University, Milan. An international academic career in theoretical physics (MIT, Los Alamos National Laboratory, CERN Geneva, Max Planck Institute Munich) led him to the position of Associate Professor of Theoretical Physics at Geneva University. In 2001, he decided to quit academia and exploit his expertise in information theory, neural networks, and machine intelligence to design an innovative semantic technology, co-founding the company InfoCodex AG-Semantic Technologies, Switzerland.
The majority of big data is unstructured, and of this majority the largest chunk is text. While data mining techniques are well developed and standardized for structured, numerical data, the realm of unstructured data is still largely unexplored. The general focus lies on information extraction, which attempts to retrieve known information from text. The Holy Grail, however, is knowledge discovery, where machines are expected to unearth entirely new facts and relations that were not previously known by any human expert. Indeed, understanding the meaning of text is often considered one of the main characteristics of human intelligence. The ultimate goal of semantic artificial intelligence is to devise software that can understand the meaning of free text, at least in the practical sense of providing new, actionable information condensed out of a body of documents. As a stepping stone on the road to this vision, I will introduce a new approach to drug research: identifying relevant information by employing a self-organizing semantic engine to text-mine large repositories of biomedical research papers, a technique pioneered by Merck with the InfoCodex software. I will describe the methodology and a first successful experiment in the discovery of new biomarkers and phenotypes for diabetes and obesity on the basis of PubMed abstracts, public clinical trials, and Merck internal documents. The reported approach shows much promise and has the potential to fundamentally impact pharmaceutical research, both as a way to shorten the time-to-market of novel drugs and for early recognition of dead ends.
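The co-occurrence principle behind this kind of literature-based discovery can be sketched in a few lines (a toy illustration only, not the InfoCodex engine; the documents and terms below are invented). Terms that appear in the same abstracts as a target concept more often than chance predicts become candidate associations, here ranked by pointwise mutual information:

```python
# Toy sketch of co-occurrence mining for candidate term associations.
# Invented mini-corpus; not the InfoCodex methodology or real PubMed data.
import math
from collections import Counter
from itertools import combinations

docs = [
    {"diabetes", "obesity", "insulin", "biomarker"},
    {"diabetes", "insulin", "glucose"},
    {"obesity", "leptin", "biomarker", "glucose"},
    {"diabetes", "glucose", "biomarker"},
    {"hypertension", "sodium"},
]

n = len(docs)
term_count = Counter(t for d in docs for t in d)
pair_count = Counter(frozenset(p) for d in docs for p in combinations(sorted(d), 2))

def pmi(a, b):
    """Pointwise mutual information of two terms over document co-occurrence."""
    joint = pair_count[frozenset((a, b))] / n
    if joint == 0:
        return float("-inf")
    return math.log2(joint / ((term_count[a] / n) * (term_count[b] / n)))

# Rank candidate associates of the target concept "diabetes" by PMI.
candidates = sorted(
    (t for t in term_count
     if t != "diabetes" and pair_count[frozenset((t, "diabetes"))]),
    key=lambda t: pmi("diabetes", t),
    reverse=True,
)
print(candidates[0])  # "insulin": co-occurs with diabetes far above chance
```

A real system replaces the term sets with a full semantic representation of each document and adds expert validation of the ranked candidates, but the statistical core, surfacing associations stronger than chance, is the same.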