
Real World Data (RWD) is currently used in clinical research in a variety of forms and settings. Some clinical studies include real world data such as patient-reported outcome (PRO) data, but also data from electronic health records (EHR), claims databases or other patient-generated sources. Real world data is used to find the best study set-up, to enhance patient recruitment, and to assess outcomes and safety. The latter is often done with PRO data, although EHR data are used more and more.
To enhance clinical trials in the future we would like to:
• analyse RWD for every phase III study to find the optimal study design, both for recruitment and for targeting the patients who will benefit most and who are likely to be treated with the specific drug in the future. In most cases these will be the more severely affected patients, as new drugs are often not reimbursed as first-line therapy.
• recruit study patients via pseudonymised data lakes. The physicians treating the patients of interest can be included in the study and others can be left out. This will make the process more efficient and will speed up the recruitment phase.
• map data of PROs and other relevant RWD directly to the study patients.
• copy relevant data recorded in the electronic health records to the eCRF, after consent of the patient of course. The eCRF will be filled automatically once the data have passed a number of automatic checks, directly after the physician has included the patient in the study. The physician only has to complete or correct the data in their own system, after which it will be synchronised with the study database. In this way differences between the electronic health records and the eCRF are kept to a minimum.
• copy clinical study data that was generated outside the electronic health records to those electronic health records, such that the treating physician has a complete overview of the patient's health and relevant findings.
• inform patients and their treating physicians directly about the results of the study that they participated in.
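To make the idea of automatically filling the eCRF a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the field names, the plausibility ranges and the function `prefill_ecrf` are hypothetical and are not taken from any existing EHR or EDC system. Values that pass the checks are copied; everything else is returned as a list for the physician to complete or correct.

```python
# Hypothetical eCRF fields with simple plausibility checks.
# Field names and ranges are assumptions made for this illustration only.
CHECKS = {
    "age_years": lambda v: isinstance(v, int) and 0 <= v <= 120,
    "weight_kg": lambda v: isinstance(v, (int, float)) and 1 <= v <= 400,
    "sex": lambda v: v in {"F", "M", "U"},
}


def prefill_ecrf(ehr_record, consent_given):
    """Return (prefilled eCRF values, fields the physician still has to review)."""
    if not consent_given:
        # Nothing is copied without patient consent.
        raise PermissionError("no patient consent: EHR data must not be copied")
    ecrf, to_review = {}, []
    for name, check in CHECKS.items():
        value = ehr_record.get(name)
        if value is not None and check(value):
            ecrf[name] = value       # passed the automatic check, copy it
        else:
            to_review.append(name)   # missing or implausible, leave to the physician
    return ecrf, to_review


ecrf, to_review = prefill_ecrf(
    {"age_years": 54, "weight_kg": 900, "sex": "F"}, consent_given=True
)
```

In this example the implausible weight is not copied but flagged for review, so the correction happens in the source system rather than in a second copy.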
However, a number of challenges and hurdles have to be overcome before these ambitions can be realised. During the PhUSE Single Day Event in London on 22 May 2019 we discussed a number of them, which are summarized below:
– Privacy and the corresponding security issues are a concern when using RWD. In clinical research, data protection and security are already at a high level, although the risk of hacking requires continuous attention. The protection of electronic health record data must also be ensured, so that caregivers can be confident in the security and proportionality measures taken when these records are used for clinical research.
– Mapping patient data from different sources needs to be addressed, as does the increased privacy risk that comes with it.
– The people collecting the data are often not aware of the analysis challenges that are faced later. Both Srinivas Karri from Oracle Health Sciences and Guy Garrett of Achieve Intelligence indicated that the walls between the silos in clinical research need to be broken down, to improve efficiency and to facilitate the technological improvements needed to use electronic health data directly.
– The data landscape of electronic trials needs to change such that the data are Findable, Accessible, Interoperable and Reusable. These are known as the FAIR principles (see https://www.nature.com/articles/sdata201618).
– Choices need to be made about how RWD will be made available for exploration before studies are started. Access to centralized, controlled and standardized data that is not owned by pharmaceutical companies will probably be the best solution. This will limit duplication of data and also the risk of pharmaceutical companies unknowingly holding multiple copies of the same anonymized patients. A good example was given by Adam Reich of IQVIA, who gave a presentation about the Simulacrum, the simulated database of the National Cancer Registration in England. In this way the analysis can be prepared in advance without the users actually accessing the real RWD.
– RWD is dirty. This is not a problem as long as you take it into account, since you will have enough data to analyse. However, data scientists should be the ones deciding how to analyse and use the data, and analysts now working in clinical research will have to adapt their analysis techniques so that they accept missing and incomplete data.
– There are a number of standards used in RWD which do not match the CDISC standards used in clinical research. They have different structures and, moreover, they use different terminology lists, which makes it difficult to transform the data from one standard to another. Re-mapping data into different structures results in new copies of the data, which is undesirable in terms of data storage and reproducibility. A solution might be temporary restructuring algorithms or macros which act like a looking glass for every different standard. The data can then be analysed as if it were structured according to the specific standard, without being stored as such.
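The "looking glass" idea from the last point can be sketched as a thin view that translates field names and terminology at the moment the data is read, without writing a restructured copy. The Python below is a minimal sketch under assumptions: the source field names (`gender`, `age_in_years`), the code lists and the class `LookingGlass` are hypothetical, and the target names `SEX` and `AGE` are used only as illustrative CDISC-style variable names.

```python
# Illustrative mappings only: source names and codes are hypothetical.
FIELD_MAP = {"gender": "SEX", "age_in_years": "AGE"}
VALUE_MAPS = {"gender": {"female": "F", "male": "M"}}


class LookingGlass:
    """Read-only view presenting source records in a target structure.

    Only a reference to the source is kept; translation happens on every read,
    so no restructured copy of the data is stored.
    """

    def __init__(self, records, field_map, value_maps):
        self._records = records
        self._field_map = field_map
        self._value_maps = value_maps

    def _translate(self, record):
        out = {}
        for source_name, target_name in self._field_map.items():
            if source_name in record:
                value = record[source_name]
                # Map terminology where a code list exists, else pass the value through.
                out[target_name] = self._value_maps.get(source_name, {}).get(value, value)
        return out

    def __iter__(self):
        for record in self._records:
            yield self._translate(record)


source = [{"gender": "female", "age_in_years": 61}]
view = LookingGlass(source, FIELD_MAP, VALUE_MAPS)
rows = list(view)  # translated on the fly; `source` itself is unchanged
```

Because the view holds no data of its own, a record added to or corrected in the source is visible the next time the view is read, and a different standard only needs a different pair of mappings.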
So we still have a long way to go. However, more and more studies and initiatives (like the EHR2EDC project, see https://www.eithealth.eu/ehr2edc) are starting to implement electronic health records in clinical trials.
I am curious about your opinion so feel free to add and comment on this article.
Berber Snoeijer, ClinLine