Automating incidence and prevalence analysis in open cohorts (2024)

Type of publication:
Journal article

Author(s):
Cockburn N.; Hammond B.; Gani I.; Cusworth S.; Acharya A.; Gokhale K.; Thayakaran R.; Crowe F.; Minhas S.; *Smith W.P.; Taylor B.; Nirantharakumar K.; Chandan J.S.;

Citation:
BMC Medical Research Methodology. 24(1) (pp 144), 2024. Date of Publication: 04 Jul 2024.

Abstract:
MOTIVATION: Data is increasingly used for improvement and research in public health, especially administrative data such as that collected in electronic health records. Patients enter and exit these typically open-cohort datasets non-uniformly; this can make simple questions about incidence and prevalence time-consuming to answer and can introduce unnecessary variation between analyses. We therefore developed methods to automate analysis of incidence and prevalence in open-cohort datasets, to improve transparency, productivity and reproducibility of analyses. IMPLEMENTATION: We provide both a code-free set of rules for incidence and prevalence that can be applied to any open cohort, and a Python Command Line Interface implementation of these rules requiring Python 3.9 or later. GENERAL FEATURES: The Command Line Interface is used to calculate incidence and point prevalence time series from open-cohort data. The ruleset can be used in developing other implementations or can be rearranged to form other analytical questions such as period prevalence. AVAILABILITY: The Command Line Interface is freely available from https://github.com/THINKINGGroup/analogy_publication .
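
Illustrative sketch: the abstract does not reproduce the ruleset, so the following Python example only shows the general kind of calculation involved when patients enter and leave observation at different times. It is not the published ruleset or the Command Line Interface from the repository. The `Patient` record, the function names, and the boundary conventions (a diagnosis on or before the start of at-risk time counts as prevalent; the incidence window is half-open) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Tuple


@dataclass
class Patient:
    """Hypothetical record: one row per patient in an open cohort."""
    entry: date                       # first day under observation
    exit: date                        # last day under observation
    diagnosis: Optional[date] = None  # first record of the condition, if any


def point_prevalence(patients: Iterable[Patient], on: date) -> Tuple[int, int]:
    """Return (cases, observed) on a single date.

    Assumed convention: the denominator is everyone under observation
    on that date; the numerator is those already diagnosed by then.
    """
    observed = [p for p in patients if p.entry <= on <= p.exit]
    cases = sum(
        1 for p in observed if p.diagnosis is not None and p.diagnosis <= on
    )
    return cases, len(observed)


def incidence(patients: Iterable[Patient], start: date, end: date) -> Tuple[int, int]:
    """Return (events, person_days) for the half-open window [start, end).

    Assumed convention: a patient contributes time from the later of
    entry and window start until the earliest of exit, window end, and
    diagnosis. Patients diagnosed on or before their at-risk start are
    treated as prevalent and contribute nothing.
    """
    events = 0
    person_days = 0
    for p in patients:
        at_risk_from = max(p.entry, start)
        at_risk_to = min(p.exit, end)
        if p.diagnosis is not None:
            if p.diagnosis <= at_risk_from:
                continue  # prevalent case: never at risk in this window
            if p.diagnosis < at_risk_to:
                events += 1
                at_risk_to = p.diagnosis  # time at risk stops at diagnosis
        days = (at_risk_to - at_risk_from).days
        if days > 0:
            person_days += days
    return events, person_days


# Made-up example cohort (not data from the article).
cohort = [
    Patient(date(2020, 1, 1), date(2020, 12, 31), date(2020, 7, 1)),
    Patient(date(2020, 1, 1), date(2020, 12, 31)),
    Patient(date(2020, 6, 1), date(2021, 6, 1), date(2019, 1, 1)),
    Patient(date(2019, 1, 1), date(2020, 3, 1)),
]

cases, observed = point_prevalence(cohort, date(2020, 8, 1))
events, person_days = incidence(cohort, date(2020, 1, 1), date(2021, 1, 1))
rate_per_1000_person_years = 1000 * events / (person_days / 365.25)
```

Returning raw counts rather than a ratio keeps the numerator and denominator visible, so the same functions could be called repeatedly over a sequence of dates or windows to build a time series.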

Link to full-text [open access - no password required]