An online clinical codes repository to improve validity and reproducibility of medical database research

Frequently asked questions


Go to the 'browse articles' tab in the top menu and select an article from the list. All code lists associated with that article are then displayed, and you can explore and download some or all of them as CSV files. Code lists are published under a Creative Commons Attribution 3.0 Unported License (CC BY 3.0), so anyone can download and freely use a file of all codes associated with an article. This site is still under active development and, in future, we will implement: searching and downloading of codes by disease group, keyword and/or code group; methods for downloading code lists and article metadata in machine-readable form; and an API for downloading code lists programmatically.
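Once a code list has been downloaded as a CSV file, it can be read with standard tools. The following is a minimal sketch in Python, not a description of the site's actual file layout: the column names (`code`, `coding_system`, `description`) and the sample rows are hypothetical, so adjust them to match the header of the file you download.

```python
import csv
import io


def load_code_list(handle):
    """Read a code list CSV into a list of dicts, one per row."""
    return list(csv.DictReader(handle))


def codes_by_system(rows, system_column="coding_system", code_column="code"):
    """Group codes by coding system. The column names are assumptions."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[system_column], []).append(row[code_column])
    return grouped


# Inline sample standing in for a downloaded file; these rows are invented.
sample = io.StringIO(
    "code,coding_system,description\n"
    "EX001,Read,Example diagnosis A\n"
    "EX002,Read,Example diagnosis B\n"
    "EX100,ICD,Example diagnosis C\n"
)
rows = load_code_list(sample)
grouped = codes_by_system(rows)
```

For a real download, replace the inline sample with `open("your_file.csv", newline="")`, where the file name is whatever you saved the download as.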
Uploading clinical codes is a simple procedure which should take only a few minutes. First, you will need to register (login/signup in the menu bar) and then select 'upload codes'. You then enter some article metadata, after which you can upload multiple code lists to that article as delimited text files. When depositing codes from research articles, we request that code lists be uploaded by one of the article authors or by someone with the express permission of the authors.
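If your code list currently lives in a spreadsheet or statistical package, it may help to export it as a clean delimited text file before uploading. The sketch below shows one way to do this in Python. It assumes a tab delimiter and two hypothetical columns (`code` and `description`); the repository's required columns are not specified here, so treat these as placeholders.

```python
import csv
import io


def write_code_list(rows, handle, delimiter="\t"):
    """Write rows to a delimited text file, skipping blank and duplicate codes.

    The 'code' and 'description' column names are assumptions for illustration.
    Returns the number of data rows written.
    """
    writer = csv.DictWriter(
        handle,
        fieldnames=["code", "description"],
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()
    seen = set()
    written = 0
    for row in rows:
        code = row["code"].strip()
        if not code or code in seen:
            continue  # drop empty codes and repeats
        seen.add(code)
        writer.writerow({"code": code, "description": row["description"].strip()})
        written += 1
    return written


# Invented example rows; an in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
count = write_code_list(
    [
        {"code": "EX001", "description": "Example A"},
        {"code": "EX001", "description": "Example A (duplicate)"},
        {"code": " EX002 ", "description": "Example B"},
        {"code": "", "description": "Missing code"},
    ],
    buffer,
)
```

To write a real file, pass `open("my_codes.txt", "w", newline="")` instead of the in-memory buffer; the file name is only an example.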
We recommend that all clinical code lists associated with primary research using Primary Care Databases or other Electronic Medical Record databases be deposited in the repository following acceptance and prior to publication in a peer-reviewed journal, in the same way that DNA sequences are deposited in a database before studies in molecular biology are published. Authors are also invited to upload codes from studies that have already been published. Code lists from other, non-academic publications such as governmental reports and reports by major medical organisations are also eligible for uploading. Furthermore, we are in the process of uploading all code lists for all years of the UK Quality and Outcomes Framework (QOF) Business Rules. If you have any questions about the suitability of your codes for deposition, please contact us.


In healthcare, diagnostic codes group and identify diseases, disorders, symptoms and medical signs in order to define morbidities and mortality. In addition, procedural codes identify specific health interventions, surgical procedures, medical tests and results carried out by medical professionals; pharmaceutical codes identify drugs and medications; while other codes can represent patient characteristics such as occupation, social circumstances, ethnicity and religion. These clinical codes provide standardised and highly expressive means for medical professionals to maintain databases of electronic medical records which can be used in observational medical research. There are currently several clinical coding systems used in different countries and regions. For example, Read codes (named after Dr James Read) are the standard clinical coding system used in General Practice in the United Kingdom while the World Health Organisation provides the International Statistical Classification of Diseases and Related Health Problems (ICD).
Clinical codes are an essential tool for research using Electronic Medical Records (EMRs). The population of interest and their disease conditions, exposures and outcomes are all typically defined using lists of clinical codes. The development of an appropriate and reliable code list is not a simple or quick process, as anyone who has been involved can testify. Access to relevant lists developed for previous studies can greatly speed up this process. Access to historical code lists also allows researchers and clinicians to make incremental improvements to disease and other definitions, building on previous work and avoiding unnecessary replication of it. Detailed knowledge of how a diagnosis, exposure or outcome was defined in a study is also necessary for comparing studies. However, access to historical code lists has previously been extremely limited: academic journals do not routinely require the clinical code lists used in a study to be published alongside the study, and until now there has been no central repository for researchers to deposit code lists in a standardised way. One consequence has been a great deal of 'reinventing the wheel' each time a new EMR-based study is undertaken.