When I started working in clinical research, 25 years ago, there was already a barrier between data collection, data management and data analysis. Data collection and data management only interacted after the data had been collected, and the interaction between data analysts and data management was limited to the clean data statement. Data collection and data analysis did not interact at all. This way of working still exists today. Data collection, management and analysis are performed by different people, and the level of communication between these groups varies between companies. When I tell people that the mission of ClinLine is to improve efficiency in clinical data processing, I often hear that people involved in the clinical data process are not aware of what other departments are doing, even though they are working in the same process. In many companies, the people collecting and cleaning the data have no interaction with the people who analyse and report the data.
Is that efficient? Yes and no. Yes, because people can focus on their own tasks and are not troubled by all kinds of additional explanations, requests or changes. The tasks are standardised and, as long as the data analysts are involved in the study set-up, the study will likely be performed as required. However, when people do not understand the challenges that the other departments face, some efficiency will be lost. For example, what does the data analyst know of the challenges during data collection? What to do when data is invalid or unavailable at collection time? What to do when the answer cannot be entered into the eCRF? How to handle text entries? The people collecting the data decide what data is valid to use, and the data manager decides afterwards whether the data seems right for analysis. However, all the choices made during data collection and data management might affect the final analysis results.
These challenges are much bigger when real world data (RWD) is used for clinical trials. The most efficient way to use this data for clinical research is a direct link between the patients' health records and the clinical database, without human intervention. Once a patient has signed the informed consent, the required data is then transferred to the clinical study database.
However, electronic health record systems are not designed for clinical studies. Physicians enter data only because regulations require it or because it serves their own interests. They already lose too much time entering data during patient visits, so they are not likely to add entries they consider unnecessary. In addition, depending on the system they use, they will enter a lot of information in free text fields, because that is easier. This additional data often contains a lot of valuable information, but good text mining techniques are required to be able to use it.
There are big differences between the various information systems that are used in clinical practice. Different groups of physicians, for example those working in different hospitals, also follow different protocols, which affects where and how well they register their data. We need a thorough understanding of the registration processes and the systems they use to be able to use their data efficiently and correctly for research.
It is impossible to standardise beforehand all the data that might be needed for research. It is also impossible to make the right choices in data selection and data enrichment without a deep understanding of the challenges and choices that are made for the specific study analysis. Therefore, before a study starts, an anonymous review of the necessary data and the data that is possibly available will give insight into most of the challenges and the corresponding solutions. These have to be discussed and handled in cooperation with those who select, enrich and analyse the data. Most of the issues are then solved before the study starts, and alerts can be specified to identify any new issues.
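As a rough illustration of such a pre-study review, the sketch below checks how often each field needed for the analysis is actually filled in a set of anonymised records, and raises an alert for fields that fall below a chosen threshold. The field names, the 80% threshold and the example records are all made up for this illustration; a real review would be agreed between the people who select, enrich and analyse the data.

```python
from typing import Any

# Hypothetical fields a study analysis might need (assumption, not from a real study).
REQUIRED_FIELDS = ["birth_year", "diagnosis_code", "systolic_bp"]


def is_missing(value: Any) -> bool:
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def review_availability(records, required_fields, min_completeness=0.8):
    """Return per-field completeness (0..1) and a list of alert messages."""
    total = len(records)
    completeness = {}
    alerts = []
    for field in required_fields:
        present = sum(1 for r in records if not is_missing(r.get(field)))
        share = present / total if total else 0.0
        completeness[field] = share
        # The threshold is an illustrative choice, to be agreed per study.
        if share < min_completeness:
            alerts.append(f"{field}: only {share:.0%} of records have a value")
    return completeness, alerts


# Made-up, anonymised example records.
records = [
    {"birth_year": 1950, "diagnosis_code": "I10", "systolic_bp": 150},
    {"birth_year": 1962, "diagnosis_code": "", "systolic_bp": None},
    {"birth_year": 1948, "diagnosis_code": "E11", "systolic_bp": None},
    {"birth_year": 1971, "diagnosis_code": "I10", "systolic_bp": 135},
]

completeness, alerts = review_availability(records, REQUIRED_FIELDS)
for message in alerts:
    print(message)
```

The same check could be rerun on new data during the study, so that the alerts flag issues that were not visible in the initial review.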
So, when using real world data, the process changes from data collection, data management and data analysis to data selection, data enrichment and data analysis. The interaction between the people performing these activities will have to be optimal to get an efficient and fast clinical study data process. The barriers need to come down, and close cooperation will then improve the efficiency of the total clinical data process.