Regression models for twin studies: a critical review.
Carlin JB., Gurrin LC., Sterne JA., Morley R., Dwyer T.
Twin studies have long been recognized for their value in learning about the aetiology of disease and specifically for their potential for separating genetic effects from environmental effects. The recent upsurge of interest in life-course epidemiology and the study of developmental influences on later health has provided a new impetus to study twins as a source of unique insights. Twins are of special interest because they provide naturally matched pairs where the confounding effects of a large number of potentially causal factors (such as maternal nutrition or gestation length) may be removed by comparisons between twins who share them. The traditional tool of epidemiological 'risk factor analysis' is the regression model, but it is not straightforward to transfer standard regression methods to twin data, because the analysis needs to reflect the paired structure of the data, which induces correlation between twins. This paper reviews the use of more specialized regression methods for twin data, based on generalized least squares or linear mixed models, and explains the relationship between these methods and the commonly used approach of analysing within-twin-pair difference values. Methods and issues of interpretation are illustrated using an example from a recent study of the association between birth weight and cord blood erythropoietin. We focus on the analysis of continuous outcome measures but review additional complexities that arise with binary outcomes. We recommend the use of a general model that includes separate regression coefficients for within-twin-pair and between-pair effects, and provide guidelines for the interpretation of estimates obtained under this model.