

Machine learning to enhance methods of causal inference in epidemiology


Fatemeh Rahimian

Reza Khorshidi

Kazem Rahimi


There are multiple approaches to causal inference. While randomised clinical trials remain the gold standard for establishing causal relationships, there is an increasing demand for better statistical and machine learning methods that can leverage the opportunities arising from wider access to large-scale observational studies (such as electronic health records and mega cohorts). Although several statistical methods of causal inference exist, their underlying assumptions tend to become untenable when signals are not very strong, or when there is no expert knowledge of hypothetical causal mechanisms. Such methods also tend to have limited capacity to capture multiple and non-linear interactions. These limitations have led to a situation where study conclusions, in particular when based on complex and noisy big data, have been highly sensitive to the type of statistical model chosen. More recently, machine learning and deep learning models have proven very effective in learning from big noisy data and capturing complex non-linear interactions. However, with a few exceptions, their application has been largely restricted to outcome classification, automated diagnosis or risk prediction. To be used for causal inference, machine learning and deep learning methods need to be further developed for predicting the effect of interventions, modelling the structural (or causal) relationship between an intervention and an outcome, and answering counterfactual questions.

A DPhil student is sought to join the Oxford Martin programme on Deep Medicine to apply and develop deep learning methods, embedded in epidemiological or alternative study designs, and to test the reliability and usefulness of such models when applied to simple and more complex clinical questions. The primary focus of the work will be on electronic health records of millions of individuals. The projects will initially focus on clinical questions where the causality of associations is well established, and then move to more complex policy and healthcare questions for which no reliable answer exists.


This project would be suitable for a candidate with a strong quantitative background (e.g. MSc in biostatistics, epidemiology or computer science) and an interest in applied research methods that are likely to have a major impact on population health. This project will be part of a new interdisciplinary programme entitled ‘Deep Medicine’ at the George Institute for Global Health. The research team provides expert individual supervision and support from several experienced and enthusiastic researchers with backgrounds in clinical medicine, statistics, epidemiology, computer science and informatics. Further support in grant writing, high-impact scientific publications and career development will be provided.

As well as the specific training detailed above, students will have access to a wide range of seminars and training opportunities through the many research institutes and centres based in Oxford.