OpenTrials: An Open Database for Clinical Trials
Data gathered in clinical trials are critical to effective medical treatment because they allow doctors to make informed decisions about patient care. Researchers and policymakers also rely on clinical trials when designing research proposals and crafting regulations, respectively. However, only about half of clinical trial results are published. Further, positive results are published more frequently than negative results, which introduces a publication bias into the clinical trial data that is available.
With the aim of increasing access to data and improving transparency in clinical trial processes, Open Knowledge International is developing OpenTrials, an open-access online database of materials from clinical trials worldwide. The Laura and John Arnold Foundation (LJAF) is funding phase one of development for OpenTrials and is working through the Center for Open Science (COS). LJAF seeks to make lives better by identifying areas of society that underperform and then increasing accountability, transparency, and availability of relevant information within these systems. COS shares a similar mission, but with a focus on research, while Open Knowledge International seeks specifically to ensure that knowledge is widely available and accessible across the globe. The OpenTrials project, under the direction of Dr. Benjamin Goldacre, combines the aims of these organizations in the arena of clinical trial research.
How OpenTrials Works
OpenTrials will be built as structured open data and will provide an indexed online location for all documentation associated with clinical trials. It will host these documents in an easily searchable format that ties all relevant documentation to each individual trial. Several types of documents and data will be available for each trial:
- Registry entries
  - Industry registers
  - National registers
- Relevant academic journal articles
- Regulatory documents containing descriptions of a given trial
- Structured data on methods and results (systematic reviewers or other researchers will extract this data as presented on OpenTrials)
- Clinical study reports
- Additional documents
  - Blank consent forms
  - Case report forms
  - Ethical approval paperwork
  - Patient information documents
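The idea of tying every document to its trial can be illustrated with a small data structure. The following is only a minimal sketch and not the actual OpenTrials schema: the class names `Document` and `TrialRecord`, the `kind` labels, and the identifier `TRIAL-0001` are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical document categories, loosely following the list above.
EXPECTED_KINDS = [
    "registry_entry",
    "journal_article",
    "regulatory_document",
    "structured_data",
    "clinical_study_report",
    "additional_document",
]


@dataclass
class Document:
    kind: str    # one of EXPECTED_KINDS
    source: str  # where the document was obtained
    title: str


@dataclass
class TrialRecord:
    trial_id: str
    documents: list = field(default_factory=list)

    def add(self, doc):
        """Attach a document to this trial."""
        self.documents.append(doc)

    def by_kind(self):
        """Group the attached documents by category."""
        grouped = defaultdict(list)
        for doc in self.documents:
            grouped[doc.kind].append(doc)
        return dict(grouped)

    def missing(self, expected_kinds=EXPECTED_KINDS):
        """List the categories for which no document is attached yet."""
        present = {doc.kind for doc in self.documents}
        return [kind for kind in expected_kinds if kind not in present]


# Usage: one trial with two documents attached.
record = TrialRecord("TRIAL-0001")  # hypothetical identifier
record.add(Document("registry_entry", "national register", "Registration"))
record.add(Document("journal_article", "journal", "Main results paper"))
print(record.missing())
```

Keeping every document under a single trial record also makes gaps visible: any category with no attached document can be reported as missing.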
By hosting and linking all documentation associated with an individual trial, OpenTrials will expand on the concept of “threaded publications,” introduced in 1999 by Chalmers and Altman and undertaken in the Linked Reports of Clinical Trials project in 2011. According to Goldacre and Gray (2016), by providing access to a much broader range of threaded information for each trial, OpenTrials hopes to achieve several ends:
- Improve discoverability
- Improve and assist audits of the accessibility of information
- Boost demand for structured data
- Facilitate annotation and research
- Raise standards around open data, specifically in evidence-based medicine
- Tackle inefficiencies in search, research, and data extraction
Funding and Data Sources
While the funding for phase one of OpenTrials is insufficient to manually populate a complete database, during this initial phase the development team hopes to create a functional data schema for the empty database and then begin to populate it through several mechanisms:
- Donations of structured and linked data sets
- Web-scraping (using scripts to automatically assemble data from a wide range of sites) and importing data that is already publicly available
- Manually populating the database for a small subset of trials, both to demonstrate what a fully populated database looks like and to determine the effort required to complete the database
- Providing a means for users of the site to upload missing components of a complete data set (“curated and targeted crowdsourcing”)
Currently, OpenTrials is gathering data from several available sources: ClinicalTrials.gov, the EU Clinical Trials Register, the Health Research Authority, the WHO International Clinical Trials Registry Platform, and PubMed. Additionally, OpenTrials will include risk-of-bias evaluations done by the Cochrane Schizophrenia Group and intends to eventually incorporate systematic review data from Epistemonikos.
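Because the same trial can appear in more than one of these sources, imported records have to be linked to one another. The sketch below shows one simple way to do this, by grouping records that share at least one trial identifier. It is an illustration under stated assumptions rather than the OpenTrials import pipeline: the record layout (`source`, `ids`) and the identifier values are hypothetical.

```python
def merge_records(records):
    """Group source records that share at least one trial identifier.

    Each record is a dict with a "source" name and a collection of "ids".
    Returns a list of groups, each with the union of identifiers and the
    list of sources that contributed to it.
    """
    merged = []
    for rec in records:
        ids = set(rec["ids"])
        sources = [rec["source"]]
        remaining = []
        for group in merged:
            if group["ids"] & ids:
                # Shared identifier: fold the existing group into this one.
                ids |= group["ids"]
                sources = group["sources"] + sources
            else:
                remaining.append(group)
        merged = remaining + [{"ids": ids, "sources": sources}]
    return merged


# Usage with made-up identifiers (not real registrations).
records = [
    {"source": "ClinicalTrials.gov", "ids": ["NCT00000001"]},
    {"source": "EU Clinical Trials Register", "ids": ["2010-000000-00"]},
    {"source": "WHO ICTRP", "ids": ["NCT00000001", "2010-000000-00"]},
    {"source": "PubMed", "ids": ["NCT00000002"]},
]
for group in merge_records(records):
    print(sorted(group["ids"]), group["sources"])
```

In this example the third record carries both identifiers, so it bridges the first two records into a single group, while the fourth record remains a separate trial.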
OpenTrials has two prototypes for presenting data on the site: one for researchers and another for patients. The researcher view highlights basic information about a trial and emphasizes several ways of presenting the available data that allow researchers to assess the risk of bias or to note methodological shortcomings in a trial. It also clearly indicates where data is missing from the database and offers easy links for researchers to upload any relevant information and documents. The patient view aims to provide search options for finding ongoing trials in a given region for a particular condition or drug. A patient would also be able to filter by eligibility requirements to determine participation options.
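The patient-facing search described above amounts to filtering trials by condition, region, recruitment status, and eligibility. The following is a minimal sketch of such a filter, assuming a simplified trial record. The `Trial` fields, the use of an age range as the only eligibility criterion, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    trial_id: str
    condition: str
    country: str
    recruiting: bool
    min_age: int
    max_age: int


def find_trials(trials, condition, country, age):
    """Return recruiting trials for a condition and country that accept this age."""
    wanted = condition.lower()
    return [
        t
        for t in trials
        if t.recruiting
        and t.condition.lower() == wanted
        and t.country == country
        and t.min_age <= age <= t.max_age
    ]


# Usage with made-up trials.
trials = [
    Trial("T1", "Asthma", "UK", True, 18, 65),
    Trial("T2", "Asthma", "UK", False, 18, 65),
    Trial("T3", "Asthma", "France", True, 18, 65),
    Trial("T4", "Asthma", "UK", True, 5, 17),
]
print([t.trial_id for t in find_trials(trials, "asthma", "UK", 30)])
```

Real eligibility criteria are usually written as free text and are far richer than an age range, so a production filter would need structured criteria first.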
Challenges Involved in OpenTrials
Developers of OpenTrials anticipate several difficulties in building this database. One basic technical problem in gathering and presenting available data is that the various sources use different formats and dictionaries. This problem becomes more complicated for drug and condition names, as well as geographical and company names. Inconsistent structured data presents another challenge, requiring developers of the site to determine which data is accurate. However, uncovering such inconsistencies, even in the course of building OpenTrials, is valuable to the medical community.
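Both problems, differing dictionaries and inconsistent structured data, can be illustrated in a few lines. The sketch below maps drug names to a single preferred name and reports fields on which sources disagree. It is an assumption-laden illustration, not the approach OpenTrials uses: the tiny `SYNONYMS` table stands in for a real drug dictionary, and the record fields are hypothetical.

```python
# A tiny stand-in for a real drug dictionary (hypothetical table).
SYNONYMS = {
    "acetaminophen": "paracetamol",
    "tylenol": "paracetamol",
    "albuterol": "salbutamol",
}


def normalize_drug(name):
    """Lowercase, collapse whitespace, and map known synonyms to one name."""
    key = " ".join(name.lower().split())
    return SYNONYMS.get(key, key)


def find_conflicts(records, fields):
    """Return {field: set of values} for fields where the sources disagree."""
    conflicts = {}
    for name in fields:
        values = {rec[name] for rec in records if rec.get(name) is not None}
        if len(values) > 1:
            conflicts[name] = values
    return conflicts


# Usage: two sources describing the same trial (made-up values).
records = [
    {"source": "registry", "drug": normalize_drug("Acetaminophen"), "enrollment": 120},
    {"source": "journal", "drug": normalize_drug("paracetamol"), "enrollment": 150},
]
print(find_conflicts(records, ["drug", "enrollment"]))
```

After normalization the two drug names agree, so only the enrollment figures are reported as a conflict. Deciding which of the conflicting values is accurate still requires human judgment or further evidence.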
Another challenge OpenTrials must confront is the integration of open data principles into the field of medicine. The term “open data” describes a set of principles and practices for making data publicly available and usable. The concept of open data has become well known outside of healthcare but remains almost completely unutilized in clinical trials. Introducing the use of open data to the field of medicine via OpenTrials includes both the technical difficulties of gathering, refining, organizing, and presenting data and the social and political challenges involved in fostering collaborative activity between the different researchers and organizations involved in clinical trials. OpenTrials developers will also have to address any intellectual property and patient privacy issues they encounter as they work to incorporate open data into the field of healthcare research.
The developers also believe the results of phase one will demonstrate the value of a complete, usable database of linked clinical trial materials, which they expect will increase the availability of funding for future phases of the project. In particular, populating the database and developing new features within the site will require significant additional funds.