Tutorial Sessions/Invited Talks

All tutorials and invited talks are free to registered attendees of all conferences held at WORLDCOMP'11. Those interested in attending one or more of the tutorials should sign up on site at the conference registration desk in Las Vegas. A complete and current list of WORLDCOMP tutorials can be found here.

In addition to tutorials at other conferences, DMIN'11 aims at providing a set of tutorials dedicated to Data Mining topics. The 2007 key tutorial was given by Prof. Eamonn Keogh on Time Series Clustering. The 2008 key tutorial was presented by Mikhail Golovnya (Senior Scientist, Salford Systems, USA) on Advanced Data Mining Methodologies. DMIN'09 provided four tutorials presented by Prof. Nitesh V. Chawla on Data Mining with Sensitivity to Rare Events and Class Imbalance, Prof. Asim Roy on Autonomous Machine Learning, Dan Steinberg (CEO of Salford Systems) on Advanced Data Mining Methodologies, and Peter Geczy on Emerging Human-Web Interaction Research. DMIN'10 hosted a tutorial presented by Prof. Vladimir Cherkassky on Advanced Methodologies for Learning with Sparse Data. He was a keynote speaker as well (Predictive Data Modeling and the Nature of Scientific Discovery). In addition, we had one tutorial held by Peter Geczy on Web Mining.

DMIN'11 will host the following tutorials/invited talks:

Tutorial A
Speaker:	Gary M. Weiss, Fordham University, USA
Topic:	Smart Phone-Based Sensor Data Mining
Webpage	http://www.cis.fordham.edu/faculty/Gary-Weiss.html
Date & Time	Tuesday, July 19, 6:00-8:30pm (new)
Location	Ballroom 1
Description	Smart phones have exploded in popularity in recent years and are now the most common computing devices, having surpassed personal computers. While smart phones, and other related devices such as tablet computers, now run sophisticated operating systems and include substantial processing power and memory, they are more than computing and communication devices: they are sophisticated sensors. This becomes clear when you realize that these devices typically contain a GPS sensor, acceleration sensor (accelerometer), audio sensor (microphone), image sensor (camera), light sensor, direction sensor (compass), proximity sensor, temperature sensor, and pressure sensor. The availability of these sensors in mass-marketed mobile devices creates exciting new opportunities for data mining and data mining applications. In this tutorial I will survey the data mining applications that can be built using these sensors, the data mining methods used to extract information from these sensors, and the practical and architectural issues that relate to data mining of sensor data from devices with relatively limited resources (e.g., battery life). I will also discuss how sensor data from a population of smart phones can be pooled (crowdsourcing) to provide useful knowledge and interesting applications. This tutorial is intended for anyone interested in the topic, and those from other research areas (e.g., wireless networks) should be able to learn much from the tutorial.
Short Bio	Gary Weiss is a faculty member in the Department of Computer and Information Science at Fordham University. He earned his B.S. degree from Cornell University, his M.S. degree from Stanford University, and his Ph.D. from Rutgers University. Prior to coming to Fordham he worked for over 15 years at AT&T Bell Labs and AT&T Labs. Until recently, his research has focused on how various real-world factors, such as class imbalance, affect the ability to learn from data. This led to several KDD workshops on Utility-Based Data Mining and a special issue of the Data Mining and Knowledge Discovery journal on this topic. For the past two years Dr. Weiss has led a dozen students on the WISDM (Wireless Sensor Data Mining) project. Recent work has focused on mining accelerometer data from smart phones and this has led to publications on cell phone-based activity recognition and cell phone-based biometric identification. Dr. Weiss has published over forty papers in the areas of machine learning and data mining as well as several in the area of expert systems and object-oriented programming.

Tutorial B
Speaker:	Michael Mahoney, Stanford University, USA
Topic:	Geometric Tools for Identifying Structure in Large Social and Information Networks
Webpage	http://cs.stanford.edu/people/mmahoney/
Date & Time	Monday, July 18, 5:45-8:15pm
Location	Platinum Room
Description	Abstract: The tutorial will cover recent algorithmic and statistical work on identifying and exploiting "geometric" structure in large informatics graphs such as large social and information networks. Such tools (e.g., Principal Component Analysis and related non-linear dimensionality reduction methods) are popular in many areas of machine learning and data analysis due to their relatively nice algorithmic properties and their connections with regularization and statistical inference. These tools are not, however, immediately applicable in many large informatics graph applications, since graphs are more combinatorial objects and because of the noise and sparsity patterns of many real-world networks. Recent theoretical and empirical work has begun to remedy this, and in doing so it has already elucidated several surprising and counterintuitive properties of very large networks. Topics include: underlying theoretical ideas; tips to bridge the theory-practice gap; empirical observations; and the usefulness of these tools for such diverse applications as community detection, routing, inference, and visualization.
Audience: This tutorial will provide an opportunity for the data analysis community, including both mathematically oriented researchers and practitioners, to learn about recent algorithmic advances for dealing with very large social and information networks. Many of these algorithmic tools have implicit geometric properties associated with them, and these geometric properties often have implicit statistical properties and consequences that indicate where these tools are more or less useful in real-world applications. As such, this tutorial should be of interest and accessible to a large fraction of the data analysis community, including established researchers who have done work in this or related areas as well as researchers whose interests are not directly in the topic of the tutorial, and graduate students and postdocs as well as junior and more senior researchers. Many of the algorithmic and statistical techniques to be discussed have a strong overlap with seemingly different problems and questions in statistics, optimization, numerical analysis, and machine learning; these connections will be highlighted throughout. Relatedly, many of these questions have been studied by researchers in theoretical computer science, scientific computing, statistics, machine learning, and data analysis; the complementary aspects of these different approaches, including their applicability to solving real-world problems from different application domains, will be emphasized.
Depending on one's background, one can expect to benefit in different ways from the tutorial. In particular: practitioners of machine learning and data analysis should gain just enough insight into the theoretical underpinnings of relevant algorithms to see how and why algorithms work well or fail to work well in real-world settings; application-oriented theorists should gain insight into how the inner workings of algorithms have practical implications for machine learning and data analysis on large networks, as well as learn about interesting theoretical problems raised by recent empirical findings; and knowledgeable members of the data analysis community should gain a broad overview of the area of large-scale graph mining and network analysis, including where data analysis methods with which they are familiar are well-suited or ill-suited.
Short Bio	Michael Mahoney is currently at Stanford University. His research interests focus on algorithmic and statistical aspects of algorithms for large-scale data problems in scientific and Internet applications. Currently, he is working on geometric network analysis; developing approximate computation and regularization methods for large informatics graphs; and applications to community detection, clustering, and information dynamics in large social and information networks. He has also worked on randomized matrix algorithms and applications in genetics and medical imaging. He has been a faculty member at Yale University and a researcher at Yahoo, and his PhD was in computational statistical mechanics at Yale University.

Invited Talks

Invited Talk A
Speaker:	Peter Geczy, AIST, Japan
Topic:	Data Mining and Privacy: Water and Fire?
Date & Time	Tuesday, July 19, 1:20-2:20pm (+ 40 minutes buffer)
Location	Ballroom 1
Description	Data mining research and practice have experienced extraordinary growth over the past decade, and so have privacy concerns. Progress in data mining has been pushing the envelope of reachable depth, information and knowledge extracted from vast amounts of data, increasingly exposing your innermost characteristics, behaviors and habits. Advanced data mining techniques and analytics have been significantly benefiting organizations in both commercial and noncommercial sectors, yet they also provide an unprecedented potential for abuse. Is the interplay of data mining and privacy a conflict in the making? This pertinent matter has been approached in various ways. Privacy-preserving data mining has been tackling the issue from algorithmic and technology angles. Laws and regulations enacted by countries have been addressing the issue from legislative angles. Best practices and codes of conduct instituted by commercial and international bodies have been exploring self-regulatory angles. Bridging data mining and privacy requires an interdisciplinary endeavor. We will concisely survey the status quo and highlight selected promising directions.
Short Bio	Dr. Peter Geczy is a chief scientist at The National Institute of Advanced Industrial Science and Technology (AIST). He also held positions at The Institute of Physical and Chemical Research (RIKEN) and The Research Center for Future Technologies. His interdisciplinary scientific interests encompass domains of data and web mining, human interactions and behavior, social intelligence technologies, privacy, information systems, knowledge management and engineering, artificial intelligence, and adaptable systems. His recent research focus also extends to the spheres of service science, engineering, management, and computing. He received several awards in recognition of his accomplishments. Dr. Geczy has been serving on various professional boards and committees, and has been a distinguished speaker in academia and industry.

Invited Talk B
Speaker:	Nitesh V. Chawla, University of Notre Dame, USA
Topic:	Connecting the dots for personalized healthcare
Webpage	http://www.cse.nd.edu/~nchawla/
Date & Time	Monday, July 18, 1:20-2:20pm (+ 40 minutes buffer)
Location	Ballroom 1
Description	Proactive personalized medicine is expected to bring fundamental changes, offering recommendations of lifestyle adjustments and treatments to avoid diseases a patient is at high risk of developing in the future. Due to common genetic, molecular, environmental, and lifestyle-based individual risk factors, most diseases do not occur in isolation. No matter how unique our medical experiences, chances are that other patients among millions have experienced genetic and environmental risk factors that closely mirror ours. In this talk, I will present our work that builds a comprehensive recommendation system, called CARE (Collaborative Assessment and Recommendation Engine), by pulling in the experience of millions of patients to answer the question. I will also present our work on multi-relational representation of disease networks using both genetic knowledge, based on previously discovered gene-disease associations, and phenotypic data from real patient histories.
Short Bio	Nitesh Chawla is an Assistant Professor in the Department of Computer Science and Engineering at the University of Notre Dame. He directs the Data Inference Analysis and Learning Lab (DIAL) and co-directs the Interdisciplinary Center for Network Science and Applications (iCeNSA) at Notre Dame. His research is primarily focused on machine learning, data mining, and social and dynamic networks. His work has led to applications in various domains including biology, medicine, finance, security, social science, fraud detection, intrusion detection, and text categorization. He is on the editorial board of IEEE Transactions on Systems, Man and Cybernetics Part B. He has received various awards and acknowledgements. He received the NAE FIE New Faculty Fellowship in 2005. His current research is supported by NSF, DOD, NWICG, NIJ, and industry sponsors.

Contact

Robert Stahlbock
General Conference Chair

E-mail: conference-chair@dmin--2011.com

Nikolaos Kourentzes
Programme Chair

E-mail: programme-chair@dmin--2011.com

Philippe Lenca, Gary M. Weiss
Tutorial Chairs

E-mail: tutorial-chair@dmin--2011.com

This website is hosted by the Lancaster Centre for Forecasting at the Department of Management Science at Lancaster University Management School.