
Will RWD break down the barriers in clinical data processing?

When I started working in clinical research, 25 years ago, there were already barriers between data collection, data management and data analysis. The interaction between data collection and data management occurred after the data was collected, and the interaction between data analysts and data management was limited to the clean-data statement. Data collection and data analysis did not interact at all. Nowadays this way of working still exists. Data collection, management and analysis are performed by different people, and the level of communication between these groups varies between companies. When I tell people that the mission of ClinLine is improving efficiency in clinical data processing, I often hear that people involved in the clinical data process are not aware of what other departments are doing, even though they work in the same process. In many companies, the people collecting and cleaning the data have no interaction with the people who are analysing and reporting the data.

Is that efficient? Yes and no. Yes, because people can focus on their own tasks and are not troubled by all kinds of additional explanations, requests or changes. The tasks are standardised and, as long as the data analysts are involved in the study set-up, the study will likely be performed as required. However, when people do not understand the challenges that the other departments face, there will be an efficiency loss to some extent. For example, what does the data analyst know of the challenges during data collection? What to do when data is invalid or unavailable at collection time? What to do when the answer cannot be entered into the eCRF? How to handle text entries? The people collecting the data decide what data is valid to use, and the data manager afterwards decides whether the data seems right for analysis. However, all the choices made during collection and data management might affect the final analysis results.

These challenges are much bigger when real world data (RWD) is used for clinical trials. The efficient way to use this data directly for clinical research is to have a direct link between the patients’ health records and the clinical database without human interaction. After signing the informed consent, the required data will then be transferred to the clinical study database.

However, electronic health record systems are not designed for clinical studies, and physicians enter data only because regulations require it or because it serves their own interests. Physicians already lose too much time entering data during patient visits, so they are not likely to add entries they consider unnecessary. In addition, depending on the system they use, they will enter a lot of information in free-text fields, because that is easier. It requires good text mining techniques to be able to use this additional data, which often contains a lot of valuable information.
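To make the text mining idea concrete, here is a minimal sketch (in Python) of extracting a structured value from free-text notes. The note texts and the pattern are invented for illustration; real clinical text mining needs far more robust techniques, but the principle of turning free text into analysable data is the same.

```python
import re

# Hypothetical free-text notes as they might appear in an EHR field.
notes = [
    "BP 140/90, patient complains of headache",
    "blood pressure: 120/80 mmHg, no complaints",
    "no vitals recorded today",
]

# A simple pattern for systolic/diastolic readings such as "140/90".
BP_PATTERN = re.compile(r"\b(\d{2,3})\s*/\s*(\d{2,3})\b")

def extract_bp(note: str):
    """Return (systolic, diastolic) if a reading is found, else None."""
    match = BP_PATTERN.search(note)
    if match:
        return int(match.group(1)), int(match.group(2))
    return None

readings = [extract_bp(n) for n in notes]
print(readings)  # [(140, 90), (120, 80), None]
```

Even this toy example shows the analysis consequence: some notes yield no value at all, and the analyst must decide how to handle those gaps.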

There are big differences between the various information systems used in clinical practice. Different groups of physicians, for example those working in different hospitals, will also have different protocols, which affects where and how well they register their data. We need a thorough understanding of the registration processes and the systems physicians use to be able to use their data for research efficiently and correctly.

It is impossible to standardize beforehand all the data that might be needed for research. It is also impossible to make the right choices in data selection and data enrichment without a deep understanding of the challenges and choices that are made for the specific study analysis. Therefore, before a study starts, an anonymous review of the necessary and possibly available data will give insight into most of the challenges and their corresponding solutions. These have to be discussed and handled beforehand in cooperation with those who select, enrich and analyse the data. Most of the issues are then solved before the study starts, and in addition alerts can be specified to identify any new issues.

So, when using real world data, the process will change from data collection, data management and data analysis towards data selection, data enrichment and data analysis, and the interaction between the people performing these activities will have to be optimal to get an efficient and fast clinical study data process. The barriers then need to come down, and close cooperation will improve the efficiency of the total clinical data process.

Feel free to comment or send me an e-mail. More information about our activities can be found on our website.


Building efficient multiple purpose SAS macros

Recently, I designed a reporting SAS macro with the requirements that it be efficient and, of course, have all the other capabilities that the users desired. But what is an efficient SAS macro?

What is efficient?

Does it mean that it is short in code? That it is short in run-time, or that it has a lot of functionality in it? When you look at the overall process that you want to cover, you also need to consider the creation and maintenance time of macros and the total number of macros needed for the complete process. A lot of small macros might take more time to create and maintain than one big macro. But that will also depend on the content of each macro. Maintaining a big macro that is very diverse can take more time and effort than maintaining a number of small macros.

How many macros?

So, at the design stage of each set of macros, a number of considerations are important. First of all, the desired end products must be kept in mind at all times. Based on that, look at the level of overlap between the different outputs that are desired. If the overlap is large, say more than 70%, then creating one multi-purpose macro might be more efficient than creating a macro for every single variation of the output, as it probably means that the functionality for creating the different outputs can be shared. When the overlap is smaller but there are some shared functionalities, creating a set of separately designed, reusable sub-macros is more beneficial: it prevents re-programming and makes the set easier to maintain.

Building blocks

The design of a multi-purpose SAS macro is more difficult than that of a simple macro: you have to ensure that the set-up and functionalities are clear, that nothing is overlooked and that the macro will be easy to maintain afterwards. My advice is to divide the macro into different building blocks according to these functionalities. For example, the calculations and statistics can be separated from the actual reporting. The different logical building blocks are preferably created as separate sub-macros. This brings another efficiency gain, as each block can be handled by a different programmer who is an expert in that specific part.
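The building-block idea can be sketched as follows, here in Python as a stand-in for SAS sub-macros: one block does the calculations, another does the reporting, and a thin top-level "macro" wires them together. All names and the statistics chosen are illustrative only.

```python
# Building blocks for a reporting "macro": calculations and reporting
# are kept strictly separate so each block can be owned by a different
# programmer and maintained independently.
from statistics import mean, stdev

def compute_stats(values):
    """Building block 1: calculations only, no formatting concerns."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}

def render_report(title, stats):
    """Building block 2: reporting only, no calculation concerns."""
    return (f"{title}\n"
            f"  N    = {stats['n']}\n"
            f"  Mean = {stats['mean']:.2f}\n"
            f"  SD   = {stats['sd']:.2f}")

def report(title, values):
    """Top-level 'macro' wiring the building blocks together."""
    return render_report(title, compute_stats(values))

print(report("Systolic blood pressure", [120, 140, 130, 135]))
```

Because the statistics block never touches layout and the reporting block never touches data, a change request for either concern lands in exactly one block.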

The creative process

A good macro design will improve the efficiency around building the macro and maintaining the macro. Designing the macro will take some time, but you will gain a lot by limiting the amount of re-work and necessary changes afterwards. Designing is a really creative process in which you will have to think of all possible input and output requirements, checks and functionalities. It requires a lot of imagination, re-thinking, content knowledge and focus.

The functional specifications

But thinking alone is not enough. The design will have to be written down in clear functional specifications based on the user requirements. When writing and re-reading these, the architect and key users will often come up with additional questions and limitations that need to be addressed as well. Doing that before the actual programming phase will speed up the programming and limit the re-work and changes afterwards.

So, going back to the original question: what is an efficient SAS macro? That will be a macro which

  • is not too diverse,
  • combines functionalities if there is a lot of overlap,
  • consists of different building blocks separating the functional steps in the process,
  • is designed according to well-written functional specifications and
  • shares functionalities with other macros via separately designed sub-macros.

Please feel free to add or share your own ideas around this subject.

Berber Snoeijer, ClinLine

If you would like support on the design of your own SAS macros, you can contact us. See also our website.

Where are we and where do we want to go …

At this moment, Real World Data (RWD) is used in clinical research in a variety of forms and cases. There are clinical studies including real world data, such as PRO (patient reported outcome) data, but also data from Electronic Health Records (EHR), claims databases or other patient-generated data. Real world data is being used for finding the best study set-up, for enhanced recruitment of patients and for outcomes and safety. The latter is often in the form of PRO data, although EHR data are used more and more.

To enhance clinical trials in the future we would like to:

•       analyse RWD for every phase III study to find the optimal study design, both for recruitment and for targeting the patients who will benefit most and who are likely to be treated with the specific drug in the future. In most cases this will be the more severely affected patients, as new drugs are often not reimbursed as first-line therapy

•       recruit study patients via pseudonymised data lakes. The physicians treating the patients of interest can be included in the study and others can be left out, which will improve the efficiency of the process and speed up the recruitment phase.

•       map data of PROs and other relevant RWD directly to the study patients.

•       copy relevant data recorded in the electronic health records to the eCRF, of course after consent of the patient. The eCRF will be filled automatically, after the data has passed a number of automatic checks, directly after the physician has included the patient in the study. The physician only has to complete or change the missing data in his own system, after which it will be synchronised with the study database. In this way differences between the electronic health records and the eCRF are kept to a minimum.

•       copy clinical study data that was generated outside the electronic health records back to those records, such that the treating physician has a complete overview of the patient's health and relevant findings.

•       inform patients and their treating physicians directly about the results of the study that they participated in.
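The "automatic checks before filling the eCRF" step above can be sketched as a small validation gate, here in Python. The field names, the consent flag and the range rule are hypothetical assumptions chosen for illustration, not part of any real eCRF standard.

```python
# Hypothetical sketch: an EHR record is copied into the eCRF only when it
# passes all validation rules; otherwise the failing fields are returned
# so the physician can complete them in his own system first.

def check_record(record):
    """Return the list of fields that failed validation (illustrative rules)."""
    failures = []
    if record.get("consent") is not True:
        failures.append("consent")
    sbp = record.get("systolic_bp")
    if sbp is None or not (60 <= sbp <= 260):
        failures.append("systolic_bp")
    if not record.get("birth_year"):
        failures.append("birth_year")
    return failures

def fill_ecrf(record):
    """Copy the record into the eCRF only when all checks pass."""
    failures = check_record(record)
    if failures:
        return {"status": "needs_completion", "fields": failures}
    return {"status": "filled",
            "data": {k: v for k, v in record.items() if k != "consent"}}

print(fill_ecrf({"consent": True, "systolic_bp": 140, "birth_year": 1970}))
print(fill_ecrf({"consent": True, "systolic_bp": None, "birth_year": 1970}))
```

The point of the gate is that nothing reaches the study database unchecked, which keeps the eCRF and the health record from drifting apart.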

However, there are a number of challenges and hurdles to overcome. During the PhUSE Single Day Event in London on 22 May 2019 we discussed a number of them, which are summarised below:

–       Privacy and the corresponding security issues are a problem when using RWD. In clinical research the protection and security around the data are high, although the risk of hacking continuously needs to be taken care of. But the protection of the data in electronic health records needs to be ensured such that caregivers become confident in the security and proportionality measures that are taken when the records are used for clinical research.

–       Mapping patient data from different sources needs to be taken care of, as does the corresponding increase in privacy loss.

–       Nowadays it is often the case that the people collecting the data are not aware of the analysis challenges that are faced. Both Srinivas Karri from Oracle Health Sciences and Guy Garrett of Achieve Intelligence indicated that the walls of the silos in clinical research need to be broken down to improve efficiency and to facilitate the technological improvements that are necessary to make it possible to use electronic health data directly.

–       The data landscape of electronic trials needs to be changed such that the data is Findable, Accessible, Interoperable and Reusable. These are now known as the FAIR principles.

–       Choices need to be made in how RWD will be made available for exploration before studies are started. Access to centralized, controlled and standardized data without ownership for pharmaceutical companies will probably be the best solution. This will limit the number of duplications of the data and also the risk of pharmaceutical companies unknowingly having multiple copies of the same anonymized patients. A good example of this was given by Adam Reich of IQVIA, who presented the Simulacrum, a simulated database of the National Cancer Registration in England. In this way the analysis can be pre-prepared without the users actually accessing the real RWD.

–       RWD is dirty. This is not a problem as long as you take it into account, since you will have enough data to analyse. However, the data scientist should make the choices on how to analyse and use the data, and analysts now working in clinical research will have to change their analysis techniques so that they accept missing and incomplete data.

–       There are a number of standards used in RWD which do not match the CDISC standards used in clinical research. They have other structures, but moreover they use different terminology lists, which makes it difficult to transform the data from one standard to another. Re-mapping data into different structures will result in new copies of the data, which is not good in terms of data storage and reproducibility. A solution for this might be temporary restructuring algorithms or macros which act like a looking glass for every different standard. The data can then be analysed as if it were structured according to the specific standard, without being stored as such.
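The "looking glass" idea can be illustrated with a read-only view that translates field names and terminology on access, instead of storing a restructured copy. The mappings below are invented for illustration and are not real CDISC or EHR code lists.

```python
# Minimal sketch of a restructuring "looking glass": the view wraps the
# source record and translates structure and terminology only when a
# field is read, so no second copy of the data is ever stored.

SOURCE_TO_TARGET_FIELDS = {"sex_code": "SEX", "birth_date": "BRTHDTC"}
SEX_TERMINOLOGY = {"1": "M", "2": "F"}

class StandardView:
    """Present a source record as if structured in the target standard."""

    def __init__(self, source_record):
        self._source = source_record  # kept by reference, not copied

    def __getitem__(self, target_field):
        # Find the source field that maps onto the requested target field.
        for src, tgt in SOURCE_TO_TARGET_FIELDS.items():
            if tgt == target_field:
                value = self._source[src]
                if target_field == "SEX":
                    value = SEX_TERMINOLOGY[value]  # translate terminology
                return value
        raise KeyError(target_field)

ehr_record = {"sex_code": "2", "birth_date": "1970-01-01"}
view = StandardView(ehr_record)
print(view["SEX"], view["BRTHDTC"])  # F 1970-01-01
```

Because the view holds only a reference to the source record, an analysis can run against the target structure while the single stored copy stays in its original standard.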

So we still have a long way to go. However, there are more and more studies and initiatives (like the EHR2EDC project) starting with the implementation of electronic health records in clinical trials.

I am curious about your opinion so feel free to add and comment on this article.

Berber Snoeijer, ClinLine

Improving efficiency in clinical research with quality procedures: A paradox?

A quality system and its corresponding procedures are often seen as a necessary evil. People have to comply with rules and regulations which they believe slow down their processes and activities. They experience a great part of their work as administration, which is not the occupation they applied for. I actually once had a colleague who quit her job in clinical research because of all these extra registration activities. A quality requirement can be missed when there is a lot of pressure, especially by people who are creators and analysts and less structured by nature. When an audit is planned, a lot of checking and extra work is needed to get everything fixed and complete.

This is contrary to the principles of the regulations and guidelines on which the procedures are based. These principles are intended to improve the efficiency and effectiveness of clinical studies. Being clear about responsibilities will make sure that things are done by the right people at the right moment and are not overlooked. Traceability of data will make sure that you can reproduce and validate your results. Documentation and descriptions of the process will make sure that the processes are easily transferable to other employees and stakeholders. And for clinical research, rules regarding blinding, safety reporting and so on will make sure that the results are sound and that the safety of (future) patients is guaranteed. Standards will make sure that data is easily transferable and that reports can easily be reviewed. These are all measures that every good process engineer would implement to get his company running smoothly and effectively. So what is going wrong here?

Procedures are in most cases written by people who have the drive to create certainty and clarity. However, their pitfall is that they tend to include too many details and create new rules for every problem and exception. This makes procedures extensive and complex. In addition, procedures are in many cases copied from standards or examples which do not reflect the actual activities in the company.

Each company is different: it has different processes, a different size and a different mission and vision. Of course, there are distinctions between CROs and pharma companies, but also within these two subbranches of our industry there is a lot of variation. That means that, while the principles are the same, the procedures should differ between companies. A big company needs more registration than a small company, simply because more people are involved. For a small company, the same requirements can often be secured by means of alternative processes which need less registration but include more responsibility for a particular role.

A pharma company often has one standard for its processes on which the procedures are based, while a CRO has many clients and therefore needs to be more flexible to comply with the requirements of those different clients. What I have seen a number of times is a big pharma company auditing a small CRO. The auditor expects the CRO to have the same detailed procedures as they have themselves, and the CRO, being client focused, changes its processes towards all these expectations. In the end the CRO processes get more and more complex and tend to outgrow the company's means.

So what I am proposing here is to stay flexible without overlooking the rules and regulations and the principles they were based on. While writing the procedures, keep in mind the normal processes of the company and what is needed to make these processes efficient. Keep the procedures lean and mean so that they can be used in daily practice. Then your procedures will support your processes and improve the efficiency instead of reducing it.

I am curious about your opinions, so feel free to add any comment. If you would like to learn more about our quality system improvement programs, just contact us. I am also happy to take an hour to talk and advise you about your situation.

Manage your SAS program as a project

Improvement of efficiency can be achieved at many levels. As a manager of SAS programmers I saw a lot of efficiency loss caused by programmers who write spaghetti programs. That is what I call programs in which modifications regarding the same data process or item are spread throughout the program. This often happens when the programmer adapts an existing program based on feedback or new requirements. Many (inexperienced) programmers tend to make their adaptations at the end of the program, resolving issues that were introduced earlier in the program. This results in a spaghetti of code. These spaghetti programs are very hard to review and are prone to errors and bugs, which costs a lot of extra time in the process of data analysis and reporting.

So how about program planning? Can we make it more structured without being inflexible and without adding too much preparation time? Trying to resolve it with strict guidelines on program structures does not improve the process, since it introduces redundancy in the program and the programming process. Each program has its own topics and focus and will therefore need its own structure. In addition, a good programmer likes to be in control and will be demotivated by strictness and directiveness. They like to start and find their own way to make it the best program possible.

Therefore, a few years ago I started to implement the project management principle of the work breakdown structure (WBS) for SAS programs. Before starting, the programmer prepares a simple scheme in which the functionalities of the program are divided into main functionalities and sub-functionalities. Depending on the complexity of the program, the programmer can decide how many levels of sub-functionality to add and whether or not to create separate macros for a sub-functionality. The WBS for programs will help the programmer work in a structured way when preparing the program. When a change or addition is requested, the programmer changes the program in the right place, thus preventing spaghetti code and errors. As a bonus, these programs are also easier to review, making the total process faster, better documented and easier to re-use.
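A WBS scheme for a program can be as simple as a numbered outline that the program body later mirrors, one unit of code per item. The breakdown below (sketched in Python) is an illustrative example, not a prescribed structure.

```python
# A work breakdown structure for a reporting program: main
# functionalities with their sub-functionalities. A change request is
# matched to one WBS item, so the code change lands in one place
# instead of being patched onto the end of the program.

WBS = {
    "1 import data": ["1.1 read raw file", "1.2 apply type conversions"],
    "2 derive data": ["2.1 compute age", "2.2 flag protocol deviations"],
    "3 report data": ["3.1 summary table", "3.2 listings"],
}

def print_wbs(wbs):
    """Print the breakdown as the plan the programmer starts from."""
    for main, subs in wbs.items():
        print(main)
        for sub in subs:
            print("   ", sub)

print_wbs(WBS)
```

When, say, a new derivation is requested, it is added as item 2.3 and implemented next to the other derivations, which is exactly what keeps the spaghetti out.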

For more information or questions you can contact us or check out our website.

Our Vision and Mission

After a few months of incubation, I am proud to have finalised the vision and mission of ClinLine:

Within the pharmaceutical industry and health care, efficiency in clinical data processing is an aspect that has had less focus and importance than in other industries. Many believe that striving for high efficiency might affect the quality, integrity and/or security of data and processes and, in the end, might compromise the care and health of patients.

However, with the ever-increasing costs of health care, it becomes more and more important to look at the efficiency of our business processes and see how we can use our resources in a more optimal way.

At ClinLine, we are convinced that it is possible to obtain a higher efficiency and at the same time to maintain the quality and integrity of data and processes. This can only be done by combining your business goals and vision with optimal processes, optimal technical architecture, quality requirements and optimal human interaction. 

Only by looking at these different aspects as a complete system can the best, most efficient and effective process be defined and implemented. Our mission is to support research organisations and pharmaceutical companies in this process.