Introduction: In the UK, most people with lung cancer are diagnosed at a late stage, when curative treatment is no longer possible. To aid earlier detection, the socio-demographic and early clinical features predictive of lung cancer need to be identified.

Methods: We studied 12 074 cases of lung cancer and 120 731 controls in a large general practice database. Logistic regression analyses were used to identify the socio-demographic and clinical features associated with cancer up to 2 years before diagnosis. A risk prediction model was developed using variables that were independently associated with lung cancer up to 4 months before diagnosis. Model performance was assessed in an independent dataset of 1 826 293 patients from the same database. Discrimination was assessed by means of a receiver operating characteristic (ROC) curve.

Results: The socio-demographic features independently associated with lung cancer were patients' age, sex, socioeconomic status and smoking history. From 4 to 12 months before diagnosis, the frequency of consultations and symptom records of cough, haemoptysis, dyspnoea, weight loss, lower respiratory tract infections, non-specific chest infections, chest pain, hoarseness, upper respiratory tract infections and chronic obstructive pulmonary disease were also independently predictive of lung cancer. On validation, the model performed well, with an area under the ROC curve of 0.88.

Conclusions: This new model performed substantially better than the current National Institute for Health and Clinical Excellence referral guidelines and all comparable models. By allowing general practitioners to risk-stratify their patients more effectively, it has the potential to predict lung cancer cases early enough to make detection at a curable stage more likely. A clinical trial is needed to quantify the absolute benefits to patients and the cost-effectiveness of this model in practice.
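As a rough illustration of the two quantities named in the Methods, the sketch below shows how a logistic regression model turns a set of predictors into a predicted risk, and how the area under the ROC curve summarises discrimination. It is not the study's model: the function names `logistic_risk` and `roc_auc` are hypothetical, and the scores and labels in the usage lines are invented toy values, not data or coefficients from the study.

```python
import math


def logistic_risk(intercept, coefficients, features):
    """Predicted probability from a logistic model.

    risk = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))
    The coefficients passed in are placeholders, not fitted values.
    """
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have equal length")
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


def roc_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    Equals the probability that a randomly chosen case (label 1) has a
    higher score than a randomly chosen control (label 0); ties count 0.5.
    This O(cases * controls) loop is fine for a toy example only.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy usage with invented values (illustration only).
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
toy_auc = roc_auc(toy_scores, toy_labels)
```

An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is the scale on which the reported validation figure of 0.88 should be read.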
ASJC Scopus subject areas
- Pulmonary and Respiratory Medicine