Beyond Correlation: A Practical Guide to Causal Inference in Data Science
Causal inference addresses questions that correlation alone cannot answer, such as whether an intervention truly causes an outcome. This session will introduce foundational concepts in causal inference and relevant Python tools. Attendees will leave with an improved intuition for how to use causal inference methods in applied data science.
Many real-world business and policy questions are causal. Did a marketing campaign increase sales? Did a new product feature improve customer retention? Did an environmental toxin make people sick? Traditional machine learning excels at identifying correlations, but establishing causation requires different methods.
This presentation will provide an overview of applied causal inference in data science. The intended audience is students and data professionals who understand the basics of machine learning and want to expand their toolkit beyond traditional predictive models. Attendees will leave with exposure to key concepts and tools in causal inference, and an intuition for how to start applying them in their own practice. A publicly available dataset will serve as a touchstone use case while the audience is guided through two topics:
- Foundational concepts of causal inference: this section will explain key concepts such as counterfactuals, causal graphs, and treatment effects.
- Practical tools for causal inference in Python: this section will provide an overview of key libraries such as DoWhy and EconML.
Emphasis will be placed on the data science perspective and the intersections and differences between causal inference and machine learning.
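To give a flavor of the gap between correlation and a treatment effect, here is a minimal sketch on simulated data. It is not taken from the session materials and does not use the session's dataset. The data-generating process, the function names (`simulate`, `naive_difference`, `adjusted_effect`), and the true effect of 2.0 are all assumptions chosen for illustration. It uses only NumPy rather than DoWhy or EconML, and the adjustment shown is plain linear regression on a single observed confounder.

```python
import numpy as np


def simulate(n=20_000, true_effect=2.0, seed=0):
    """Simulate a hypothetical dataset with one confounder.

    z drives both the chance of being treated and the outcome,
    so treated and untreated units differ before any treatment.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # confounder
    p_treat = 1.0 / (1.0 + np.exp(-1.5 * z))    # treatment is more likely when z is high
    t = rng.binomial(1, p_treat)                # binary treatment indicator
    y = true_effect * t + 3.0 * z + rng.normal(size=n)  # outcome
    return z, t, y


def naive_difference(t, y):
    """Difference in mean outcome between treated and untreated units."""
    return y[t == 1].mean() - y[t == 0].mean()


def adjusted_effect(z, t, y):
    """Coefficient on t from a least-squares fit of y on [1, t, z]."""
    X = np.column_stack([np.ones_like(z), t, z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


if __name__ == "__main__":
    z, t, y = simulate()
    print("naive difference in means:", naive_difference(t, y))
    print("regression-adjusted effect:", adjusted_effect(z, t, y))
```

In this simulation the naive comparison overstates the effect because treated units also tend to have higher `z`, while adjusting for `z` brings the estimate close to the simulated true effect. This works here only because the confounder is observed and the outcome model is linear by construction. Real problems rarely offer either guarantee, which is where causal graphs and dedicated libraries come in.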