Overview
Data mining and machine learning are now used to make decisions and predictions in a wide variety of digital services, ranging from search engines to e-commerce to social media platforms, thereby nurturing the booming digital economy. In these scenarios, the accuracy and efficiency of predictions and decisions are the objectives of optimization, while the potential risks from erroneous predictions and decisions are treated as less important.
Performance-driven machine learning models typically exploit subtle statistical relationships among features, which may be spurious correlations, for explanation, prediction, and decision making. Because spurious correlations can change across data sets and over time, mistakes made by these algorithms may bring tremendous risks, such as a lack of stability, explainability, and fairness in prediction and decision making, limiting their application in high-stakes areas such as healthcare, industrial manufacturing, finance, and the administration of justice.
Causal inference and causal discovery have recently attracted substantial attention in the data mining and machine learning community because they can help distinguish causal relationships from spurious correlations in data. Causal relationships, as a basic and effective tool for explanation, prediction, and decision making, have been utilized in almost all disciplines. Traditionally, causal relationships are identified through interventions or randomized controlled experiments. However, conducting such experiments is often impractical or even impossible due to cost or ethical concerns. Therefore, there has been increasing interest in causal inference and in discovering causal relationships from observational data, and in the past few decades computer scientists have made significant contributions to this field. Moreover, causal relationships are explainable and invariant across data and over time, so they can be used to improve the stability, explainability, and fairness of predictive modelling and decision making.
Inspired by such achievements and following the success of our previous Causal Discovery workshops (CD 2016 - CD 2022), CDPD-2023 continues to serve as a forum for researchers and practitioners in data mining and other disciplines to share their recent research on causal discovery in their respective fields and to explore the possibility of interdisciplinary collaboration in the study of causality. Building on the platform of KDD, this workshop is especially interested in attracting contributions that link data mining and machine learning research with causal discovery, as well as solutions for causal discovery in large-scale data sets.
Papers accepted by the workshop will be published in the Proceedings of Machine Learning Research and/or presented on the workshop day.