DCL Learning Series

Improving Time to Market for Drug Development: Content Structure and Systems Integration


[Marianne Calilhanna] Hello and welcome to the DCL Learning Series. Today's webinar is titled "Improving Time to Market for Drug Development: Content Structure and Systems Integration." My name is Marianne Calilhanna, I am the VP of Marketing at Data Conversion Laboratory and I'll be your moderator today.

A couple of quick things before we begin: this webinar is being recorded and it will be available in the On-Demand Webinars section of our website. While we're presenting, we invite you to submit questions at any time during this conversation. We'll save 15 minutes at the end to answer any questions that have been submitted.


Technology plays a critical role in the life sciences, where accuracy, traceability, and compliance, along with speed to market, are critical. Improving content and data management, IT systems, compliance, and program management streamlines research and drug development. Data Conversion Laboratory, Court Square Group, and JANA Life Sciences have developed this learning series to address how technology can contribute to your success. This is the third of seven individual webinars, and the other webinar topics are listed here on this slide. Today's webinar builds on our last one about metadata and taxonomies in the drug development life cycle. You can watch a recording of that webinar on our website, and we'll push a link to that page via the chat box in just a moment.


I'm really delighted to introduce our panelists today. We have Howard Shatz, SPL expert and Project Manager at Data Conversion Laboratory. Welcome, Howard. Keith Parent, CEO at Court Square Group. Ron Niland, President, JANA Life Sciences. And do take note of Ron's name. We'll, uh – there's a little Easter egg in our presentation today.

This webinar is brought to you by Data Conversion Laboratory, or DCL, as we are also known. Our mission is to structure the world's content. DCL's services and solutions are all about converting, structuring, and enriching content and data. We are one of the leading providers of XML conversion services and an industry expert with SPL conversion for global pharma companies. If you have complex content and data challenges, we can help. And now I'm going to turn it over to Keith from Court Square Group.


[Keith Parent] Thank you, Marianne. Hi, guys, my name is Keith Parent. I'm the CEO at Court Square. Court Square Group is a company – we have an audit-ready compliant cloud environment which provides qualified hosting for numerous applications within the life science space. Anything that's qualified and validated, it can run in our ARCC cloud. And then we also push and talk about our content management system, which is RegDocs365. So, if these are any of the needs that you would want to have, I'd love to talk with you about it.


[Ron Niland] Hi, my name is Ron Niland; I'm with a company called JANA, Inc. We were founded in 1973, we're a third-generation family-owned company, and we formed a division, JANA Life Sciences, which is focused on just that exclusively. As a company our sweet spot is technical documentation. We do everything from schematics to parts catalogs, to user manuals, maintenance manuals, and procedural documentation, SOPs, work instructions, and what have you. That data and documentation work and sort of helps us to approach companies when it comes to IT systems and operational excellence. And all of our work is done in an ISO-certified manner, when it comes to technical documentation, that is.


[Marianne Calilhanna] OK, we're gonna launch a quick poll before we begin. We're interested in just understanding where you sit in, in your area. So, if you could select, from the options listed below, what functional area-slash-role best describes your situation and your job. The options are research, development, regulatory operations, clinical operations, data, and IT management.

So, I'm gonna leave the poll open just for a few more seconds. I see some votes are still coming in. Um, and I'm going to close the poll and share the results with this group. OK, it seems that most of you have taken part. Thank you.

So. Interesting. Most of you are in the data IT management space. Looks like we have about a tie with development, regulatory, and clinical, and 20% are research, so thank you all for the work you do, in this important area of life sciences. OK, I'm going to close those results, and we're going to get back to the reason you are here. Keith, over to you.


[Keith Parent] Great, thank you, Marianne. So I just want to let the audience know that this is going to be a very conversational type webinar. Ron, Howard and I will go back and forth across the slides, we'll start to talk about the different key concepts. We want to, we really want to talk about content, strategy, management, how we structure things in life cycle management of some of the content. And then kind of an overview of information architecture, what we think about when we're talking about technology road mapping, how we do system alignment, interoperable – operability between systems. And then dataflows and automation. And then collaboration.

We know that this is taking the forefront in a lot of the life science companies we deal with on a very regular basis. And each one of us works with different audiences at a very different set of time. So our goal was to look at these key concepts and kinda talk with you about how they, they go through all the different systems you're going to probably touch over the course of the drug development life cycle. Next, Marianne. So, Ron, you want to pick it up?


[Ron Niland] Sure, yeah, so what we were intending to do was focus more so, if you will, on the aspects of drug development, whether it's pharmaceutical, small molecules, or biotech, larger molecules, and sort of give you that picture, if you will, of a day in the life of someone working on a drug product. So having said that, we're intending to go through this continuum from research all the way through to the idea of launching a product today. Next slide, please.

So on this slide here, we just wanted to give you a sense of information that you're probably all very well aware of. And it has to do with the funnel in terms of what's coming into the pipeline. And what ultimately makes it out the other end as a drug product, a drug approval. And the reason we wanted to show this slide was just to give you a sense of the numbers. And these are rounded numbers, if you will, but are derived from data, for sure. But on the higher end you've got within research tens of thousands of compounds that may be screened and ultimately going into the aspects of pre-clinical and clinical.

You can see it's one in one hundred, one in a thousand kinds of numbers. But the fact of the matter is, when you think about your informational structuring, you probably need to think about that higher end of the spectrum with the numbers. On the right side, you can get a sense of the time scale, and some products, like the Covid vaccine, made it to market within a year or so. But others will take as many as 10 years to get to regulatory approval. And so we wanted to give you a sense of that time, because, you know, the time, and the volume, are things that will impact the structure, meaning you have to build a foundation that's going to last when it comes to your information architecture. Next slide.


[Keith Parent] I think as we go through the slides, Ron, as well, I think just to let people know that that volume of data that, that people deal with, it's going to be ever-changing over the course of the, the drug development life cycle. And that timeframe that we're talking about, if you think about the amount of people that would be working on data, and the amount of turnover that happens, typically, with any, any life science company, one of the reasons why putting in the metadata, the taxonomies and all the structure that we're talking about, is, is for us to really realize that, over the course of time, the more we can stay consistent throughout the life cycle, the better it's going to be for, for you toward the end of that life cycle, when we finally get to that point, where we're putting the labeling together, and all the other things that are gonna happen from the FDA submission perspective.


[Ron Niland] That's a great point, too, Keith. In terms of the turnover in the Bay area for biotech companies, the average turnover can be anywhere from 10 to 30% in a year. So if you're factoring, let's say 20%, the rule of 72 will tell you within 3 to 4 years you can have a whole new workforce with a much smaller company for sure.
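
As a rough check on that arithmetic, here is a minimal sketch. It assumes a constant 20% annual turnover applied to whoever remains each year, which is a simplification for illustration and not data from the webinar. The rule of 72 gives 72 / 20 = 3.6 years; under this assumption, a bit more than half of the original staff would have left by then, and the exact half-way point comes slightly earlier.

```python
import math

def fraction_remaining(annual_turnover: float, years: float) -> float:
    """Share of the original workforce still present after `years`,
    assuming the same turnover rate applies every year (an assumption)."""
    return (1.0 - annual_turnover) ** years

def half_life_years(annual_turnover: float) -> float:
    """Years until half of the original workforce has left."""
    return math.log(0.5) / math.log(1.0 - annual_turnover)

rule_of_72_years = 72 / 20  # rule-of-72 estimate for a 20% annual rate

print(rule_of_72_years)
print(round(fraction_remaining(0.20, rule_of_72_years), 2))
print(round(half_life_years(0.20), 1))
```

Either way the practical point stands: within a few years, a large share of the people who created the early content may no longer be there to explain it.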


[Keith Parent] But I can guarantee you, Ron, that many of the startups that we work with, you know, they go from 10 or 15 people to 100 people in almost the blink of an eye. So, and typically, when you're hiring people, you're hiring them out of the industry, and they already have an idea of using certain applications. So, they'll have in their head an understanding of content from one way. And many times, people are using different applications, or, or talking about the same content differently. And that's where a lot of the times, you get people, you know, naming things differently, and doing stuff like that. So, so our goal here is to really talk to the whole audience about how we can put things in place early on to help later on in the cycle. So that's kind of a step, step forward from that.


[Ron Niland] Yeah, and along those lines, that turnover with, you know, highly seasoned people with deep experience will basically mean they're coming to these even nascent-stage companies with high expectations around information management. Going to your point, Keith, around sort of naming things differently, on this slide, we just wanted to give the group a sense as to what happens, right, with a drug. And, early in research, you've got that chemical compound that may be the identifier, and then as it goes through development, perhaps your finance group is assigning a code, or the program management group, but then you work to get the non-proprietary name, the generic name, and then ultimately a brand name. And this continuum really means that you need to think about understanding how to map names to products or projects as early as possible because there's this sort of fractionation that can happen.


[Keith Parent] I can tell you, when we've done patient repository systems in the past, part of it is always asking what, what pharmaceuticals have the people taken over time? What, what drugs are they taking over the last X number of years? I'm looking at a potential opioid study right now where they're looking at a five-year look-back kind of cycle. When you think about that, people have so many different drugs that they've taken, and they come in with so many different names, whether they're generic or, or prescription drugs. So, having that terminology down, it is something that we really have to get down for us. And that's, that's what we're going to talk about today. Did you want to talk about our Easter egg yet, or what?


[Ron Niland] Oh, yeah, so, I think on this slide here, you can see there's a product. It just so happens that it has my name associated with it: Ron Niland. "Nilandron." Anyway, all right, next slide.

[Keith Parent] Ron is not on Nilandron.


[Ron Niland] So, speaking of Nilandron, on this slide, we just wanted to give you a sense of product names for that compound over time with various aspects of development. And so in this slide you can see the different sort of functional areas, if you will, on the lower end are the systems associated often with them, whether it's a LIMS system in pre-clinical or SAS in data management, whereas up on top, you can see that the compound or the project has taken on different names, as it's gone through this continuum, if you will. So, that was just an illustrative slide. Next, next slide.
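
One way to picture the name mapping described here is a small alias registry that ties every name a product picks up (research compound ID, finance code, generic name, brand name) back to one internal key. This is a minimal sketch, not a description of any system mentioned in the webinar; the class names, the key "P-001", and all of the sample names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class ProductRecord:
    """One product, with every name it has carried during development."""
    product_id: str                                   # hypothetical internal key
    aliases: Dict[str, str] = field(default_factory=dict)  # name type -> name

class NameRegistry:
    """Maps any known name for a product back to a single product ID."""

    def __init__(self) -> None:
        self._products: Dict[str, ProductRecord] = {}
        self._lookup: Dict[str, str] = {}             # normalized name -> product ID

    def register(self, product_id: str, name_type: str, name: str) -> None:
        record = self._products.setdefault(product_id, ProductRecord(product_id))
        record.aliases[name_type] = name
        self._lookup[name.strip().lower()] = product_id

    def resolve(self, name: str) -> Optional[str]:
        """Return the product ID for a name, or None if the name is unknown."""
        return self._lookup.get(name.strip().lower())

# All names below are made up for illustration.
registry = NameRegistry()
registry.register("P-001", "research_compound_id", "CMP-00123")
registry.register("P-001", "finance_code", "FIN-4567")
registry.register("P-001", "generic_name", "examplamide")
registry.register("P-001", "brand_name", "Examplex")

print(registry.resolve("examplex"))
print(registry.resolve(" CMP-00123 "))
print(registry.resolve("unknown-name"))
```

Registering each new name as it is assigned, rather than reconstructing the history later, is what keeps the early research records findable under the eventual brand name.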


[Keith Parent] So, now we wanted to get into an area where we're starting to talk about: how do we categorize data? How do we put together an information architecture? We're starting out, if you'll notice up on the top, the chevrons are gonna kind of mimic that first, that last slide that we had and talk about the continuum of, of the drug development. So, here in the early phases, you know, everybody has a different focus. There are different departments. But everybody understands meetings, and monthly calendars, and budgets, and things like that. So we know that we're going to have some metadata that's going to be focused around those types of things. Next, Marianne.

However, one of the things that we want to start thinking about is, when you get into looking at lots of products, or dealing with multiple projects within a company, some startups may only have 1 or 2 products. So it's a very small group of people that are there. But you may have a large pharmaceutical company or a large biotech, and you may have lots of projects going on. If you think about the, the metadata we, we mentioned in the last webinar, we talked a lot about taxonomies and metadata and how they all kind of work together. Here, we're talking about using where you are in that cycle or what type of project or department you're in as part of the metadata creation. And using an individual contributor, or the project name or the program name that it's under, or even the department, and putting those together as part of the metadata, and those can add together. And what that gives you, it gives you the ability later on to view data or search for data, or pull that data out, specific to very distinct criteria. And our goal is to get as much granularity as we need, or as high-level as we need, so that you can find that data, find that information later on.
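
The idea of combining phase, department, program, project, and contributor into metadata, and then searching at whatever granularity is needed, can be sketched as follows. The field names and sample records are assumptions made for illustration; a real content management system would supply its own schema.

```python
from dataclasses import dataclass
from typing import Iterable, List

@dataclass(frozen=True)
class DocumentMetadata:
    title: str
    phase: str         # where the work sits in the development cycle
    department: str
    program: str
    project: str
    contributor: str

def find_documents(documents: Iterable[DocumentMetadata], **criteria: str) -> List[DocumentMetadata]:
    """Return documents whose metadata matches every supplied criterion."""
    return [
        doc for doc in documents
        if all(getattr(doc, key) == value for key, value in criteria.items())
    ]

# Hypothetical sample records.
documents = [
    DocumentMetadata("Monthly budget review", "research", "finance", "Program A", "Project 1", "j.doe"),
    DocumentMetadata("Assay protocol", "research", "discovery", "Program A", "Project 1", "a.lee"),
    DocumentMetadata("Stability report", "development", "quality", "Program A", "Project 2", "a.lee"),
]

# High-level view: everything under one program.
print(len(find_documents(documents, program="Program A")))

# Granular view: one contributor's documents in one phase.
print([d.title for d in find_documents(documents, phase="research", contributor="a.lee")])
```

Because the criteria add together, the same metadata supports both the broad program view and the narrow per-contributor view without a separate filing structure for each.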


[Ron Niland] Keith, you brought up an interesting point on that subject and that was, you could be in a nascent stage company with just a few projects, but if in fact you're thinking in that future state that you'd like to be partnering with a larger company, just remember they're going to be looking at your data and kicking the tires really hard, and their expectations around the structuring are going to be very, perhaps, different than those of your colleagues. It's just something to consider.


[Keith Parent] Sure. We talked, in the early phase, we're going to have LIMS systems. So a Laboratory Information Management System could be used for lots of areas. It could actually be in the early discovery or it could be actually later on in manufacturing where you have a LIMS system that's actually looking at different parts of the manufacturing process making sure you're using it as part of the quality system. But if you look at this just simple illustration, you're going to see the number of areas that a LIMS system would actually touch and talk to, whether it's stability studies, sample tracking, invoicing, inventory, even just documentation, all of those things take a, take a place within the LIMS system and are going to generate data.

So, when we talked about, in that earlier slide that Ron showed, we talked about all the different compounds and the volume of compounds, this is where these kind of early systems are going to have volumes and volumes and volumes, and being able to knock that, that amount of data down to a smaller number is going to be important, or be able to use your reporting effectively and being able to look at the data the right way is going to be really important. Ron?


[Ron Niland] Yes, so, when it comes to research and that aspect of automation, high throughput screening is a reality. And the fact of the matter is that the volume of data is going to be so significant that you need to think about these compounds of libraries, uh, and at the same time, compound libraries, sorry. And, at the same time, you need to think about these assays, where some are proven, established assays, and others are sort of in the stage of maybe emerging technologies. 

And the fact of the matter is, each, each of those are going to be producing a wealth of information. And the idea here on this slide is just to, again, show you that here, we've got several hundred wells. And the hotspots represent more promising targets. But then the question is, well, where are you storing the information for those, whether they're red, orange, yellow, or green? And the fact of the matter is, you may have to separate between these and make determinations as to what you're going to ultimately store.
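
The storage decision described here can be sketched as a simple triage over one plate. The color cut-offs, the 0 to 1 activity scale, and the choice to keep full records only for red and orange wells are all assumptions for illustration; an actual screening group would set its own thresholds and retention policy.

```python
from typing import Dict, Tuple

# Hypothetical cut-offs on a 0-1 normalized activity scale, hottest first.
THRESHOLDS = [(0.8, "red"), (0.6, "orange"), (0.4, "yellow")]

def classify_well(activity: float) -> str:
    """Assign a heat-map color to one well based on its activity value."""
    for cutoff, color in THRESHOLDS:
        if activity >= cutoff:
            return color
    return "green"

def triage_plate(plate: Dict[str, float], keep: Tuple[str, ...] = ("red", "orange")):
    """Split wells into those stored in full and a per-color count of all wells."""
    stored: Dict[str, dict] = {}
    summary: Dict[str, int] = {}
    for well, activity in plate.items():
        color = classify_well(activity)
        if color in keep:
            stored[well] = {"activity": activity, "color": color}
        summary[color] = summary.get(color, 0) + 1
    return stored, summary

# Four made-up wells standing in for a plate of several hundred.
plate = {"A1": 0.92, "A2": 0.15, "B1": 0.65, "B2": 0.41}
stored, summary = triage_plate(plate)

print(sorted(stored))
print(summary)
```

Keeping a per-color count even for wells that are not stored in full preserves a record that the negatives were screened, which matters if the decision is questioned later.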


[Keith Parent] Yeah. We've done projects in the past – so, so what you're seeing here are just some examples of different things that are out there. But to add a lot of heat maps to different things based on looking at the data. I know we had a lot of people in data management on – they're probably used to using SAS, or R, different things to generate data from clinical trials. And being able to put that data together for the reports to be used in some of the submissions. Early on in the process, there's just so many large volumes of data that they – we're trying to get, get rid of the negatives as fast as possible, so that's where the high throughput screening comes in. Being able to do that kind of work, also, the ability to add AI into this opportunity here and be able to look at nuances between the data, those are going to be other interesting things that are happening. Marianne, next?

This, this slide is actually going to be a build-along slide. What we want to do is, when we start with early research, you've got a lot of different researchers out there doing stuff, they're grabbing their own data, they're putting stuff together, they're, actually, this came from a project we did specifically with a company that said, how do we take it and understand when we go from this early research to when we're going to be making it into a, a compound, or a product that, we're going to be, you know, putting something behind? And the concept here was, you go from the individual researchers, they always go to a research review committee. Once they get to that research review committee, that committee will look at it.

A key area, and probably everybody on this webinar is probably gonna recognize, once finance puts a, puts a number on something, now, all of a sudden, it has, it has reality. So now you go from this whole pre-clinical area where everybody's doing kind of the Wild West, going all over the place to, once it has a finance number, and people are tracking it financially, now, all of a sudden, it's gonna be in development. Now we're looking at it, how are we dealing with that? It crosses lots of different departments. So, now, we start talking about where that departmental structure, and the departmental meta– metadata takes into account where you are in the process, and who's doing what to that data.

So, finding data, using data, using the content across this, whether it's in data– databases or in documents, is going to be a huge ability for us to use, utilize metadata and taxonomies to be able to find that data throughout these departments. And the systems that these departments are going to use, because every single one of these systems or departments is going to be using a different system. There may be some overlap. And I think if you saw on that early slide, we showed how quality is across the board, across all departments. That will be one of those systems where everybody's going to be looking at quality, to see how it's tied together. But they're still going to want to have how it's tied to each of the different individual departments. Next?

So going, going further when we're getting ready to submit, you know, all that work upfront and putting things together gets us to the point where we want to be able to get ready to submit data to the, to the FDA, using reference models, and in this case, the EDM reference model for submission processing. Those are important. Why did the government put some of these reference models out there for us? One of the big problems was that they were getting so much data in over time that it was very hard for the FDA reviewers to be able to look at all that data. By putting it into a very certain format, and they can look at that data, it made life a lot easier for them to be able to go through documents, click on links between documents, and be able to have that data available for them to do their review cycle, and cut down on that review. So taking, you know, a five-year review down to months. And if we look at last, last year, the, the amount of time it took to do the review on all the Covid drugs coming in and being able to get to the, be able to use some of those vaccines, it was astronomical how fast they were able to do that and get that done. Howard, how about some comments on what you get involved with, putting the SPL together? Or the XML for the product label.

[Howard Shatz] Yeah, I was going to comment on, was –

[Keith Parent] You can go back one slide, Marianne.

[Howard Shatz] So, comments that I would make are – yeah, OK, so, that even in the SPL and the, when they started requiring SPL for listing purposes, you can see they had to coordinate with all the different departments, and even also at the regulatory stage, which I'll be talking about in a couple of minutes, they had to do that, but apropos to what Keith was saying about the document management, so, the document, the electronic document management systems allow them to better handle these documents coming in, but also even the taxonomies standardizing on the use of terminology and these things, it allows them to study the documents that they're reviewing.

But they can also compare them more readily to existing documents that are already in their system to see how other companies first roll out a product, for example, handled it and because if you're using different terminologies for active ingredients, you're not going to be able to compare two documents for the same, what is essentially the same active ingredient. So, it requires a great deal of coordination. And it also requires that companies standardize on the terms so that they can better communicate with the different departments within themselves and, and with the FDA.


[Keith Parent] Yeah, and I think, I think to that point, I think one of the biggest things that we're trying to get across as part of this webinar is that if you haven't thought about information architecture, you haven't thought about the metadata or the taxonomies that we introduced last webinar, here's why, here's some of the reasons, and each of your departments are going to do their own thing, if you don't think about a, an overall organizational information architecture. So, I think that's important. Ron, you were going to jump on?


[Ron Niland] Yeah, I just wanted to chime in a little bit here, and that was in terms of the participants in the meeting. There's, there's a large group that represent data management and IT and you're, you're the very group that should be spearheading this kind of an effort. But some of your best allies might be those in program project management, where they're helping individual projects progress, and then perhaps looking at a portfolio too. They need that data aligned so that they can report on it. And so, that's just something to think about in terms of building alliances within the organization, and driving this kind of change, if needed.


[Keith Parent] Next slide, Marianne. So, now, when we look at the ref– different reference models, one of the earliest things that I was involved with was a submission project. And if you look at this, for those of you that have ever done any kind of electronic submission to the FDA, you'll understand that a lot of data is pulled from different systems, whether it's pre-clinical, clinical, manufacturing, we get the labeling data, pharmacovigilance, all that data, gets prepared, and put it into an EDMS system, or an Electronic Document Management System. That document management system, that'll be tied to a system that will handle the eCTD submission format. And from there, you'll have a validator that'll make sure that it's going to pass the rules that get set up for how data will get sent to the FDA.

Then once it goes through that validator, it'll go to the FDA gateway or the EU gateway and be able to be submitted to the proper health authorities. The goal here is to make sure that all along the way, each one of these boxes up until now has been kind of a silo. Each department is siloed. But the reality is, as they go along the continuum, they've got to kind of get together and understand that there's rework to be done if they don't come up with an architecture that's made, that uses common terminology and make sure that we can have ease of use between systems, what'll happen is there's a lot of rework along the way. People copy things all the time.

You're going to hear us talk later on about the concept of a single source of truth. By not having a single source of truth, or not having that data that you can rely on, people will do a lot of rework along the way, whether it's your, your clinical operations folks, your regulatory operations folks, your data management, all along the continuum, they're going to be looking at their own systems. They map the format thing based on the system they're using at that point in time. Our goal is to say: let's think, let's think early on about that, that we are going to be using this data across multiple systems. How do we put it in a place where it's easy to use along systems? By using industry reference models, that will help, and we'll talk about that in a little, in a second. Next, Marianne?

There we go. So when I talk about reference models, we used this slide in one of the earlier presentations, and we really want to talk about the fact that data is a continuum all the way through. So when we talk about looking at it at the sites, when you're at a clinical site, they're usually going to be using an eISF, the Investigator Site File, or looking at data. That's going to go into an eTMF. eTMF is Electronic Trial Master File. That trial master file coordinates all that data from different clinical sites and pulls that all together.

The EDM, the DIA EDM, and eTMF reference models are used specifically to gather data and get things ready for submission, and then once you get into regulatory, and you're doing all the different submission work that you're gonna be doing, whether it's correspondence, commitments with, with the health authorities, it's the other ones, you're going to be looking at RIM, which, RIM stands for Regulatory Information Management. I have been working for the last year on a DIA RIM working group, and we're about to submit the second version of that. I'll talk about that in a second as well as we talk about, we have a distinct slide directly on RIM. Ron, do you want to jump on that at all, or add to that?

[Ron Niland] No, but thank you.

[Keith Parent] OK. Let's go to the next slide.


[Ron Niland] Yeah, so maybe here, I'll jump in. When, when you transition into clinical, then, it's, this is where the rubber meets the road, right? That clinical trial is happening and you've got electronic health records being transferred into the company, but then at the same time, you may have your investigators funneling information directly in too. Your clinical operations staff may be doing site visits, may be going through patient charts; that needs to be factored in. Ancillary data associated with your trials, whether it's ECGs, or bone DEXA scans, are elements that need to, again, be transmitted. Very often, there's an image, imaging component there that needs to be factored in. You've got sponsor warehouses. Now, along these lines with ancillary data, sometimes you have these intermediary parties that are looking at these images, and then doing reviews, and maybe even diagnoses, and there's adjudication that happens, but that may happen in a sponsored data warehouse. And then you've got patient reported outcomes. So you can see, you know, you've got information coming from a multitude of sources, and there's a lot of bidirectional flow as well. Keith?


[Keith Parent] Yeah, we're actually going to be doing another part of the webinar series that we're talking about: How do we deal with some of the issues that you have here? Because one of the biggest problems that we see, is when you're dealing with lots of different hands in the pie and they're, they're moving documents around, some people will scan in documents from some of the sites. There'll be an image file, and it won't have been OCR'ed. So they can't look at the data itself. They can't search on the data anymore. It comes in as just an image file. So our goal is to figure out: how can we make it so that we retain the integrity of the data all the way through and the fidelity of the data as it gets to your sponsor warehouse?


[Ron Niland] And just one last point I'd add there is very often I've found in my 25-plus years of drug development that ancillary data is sort of the Achilles heel of a program, so you need to be sensitive to that integrity, as Keith pointed out.


[Keith Parent] Yeah, so on this slide, we're talking specifically about the eTMF. And the eTMF here really talks about: how do we pull data? And you look at the categories and we have primary categories, subcategories, and the different content types across the board. Our goal here was just to show that, by having the metadata properties, and this, and this is the, the DIA, Drug Information Association, put together an eTMF spec. There have been multiple versions over time. The goal was to be able to look at the different types of content and be able to have metadata assigned to that so you can actually find it and set views up of it and be able to understand how the data flows. Next slide, Marianne.

So when you're looking at this at the site management, you're gonna see right away, there are 1572 documents that you're gonna find at every site because they're gonna have information about the principal investigators. From those documents, you're going to be able to see the principal investigators' CVs. You're going to want to be able to look at the financial disclosure forms. All those things. And if you notice, they're tied back to different metadata elements that are there, so again, by using metadata efficiently, we can make it very easy to find volumes and volumes of documents over the course of a study. And if you think of a study that goes 3, 4, 5 years, and you have a hundred different sites, you're gonna have lots of documents. You're gonna have multiple versions of each one of these documents. So how you manage that, and how you're able to get to that content is gonna be really important.
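
To make the point about versions and site-level documents concrete, here is a minimal sketch that uses metadata to pick the latest version of each document per site and to flag sites with missing document types. The content type labels, site IDs, and file names are illustrative assumptions; they are not taken from the DIA eTMF reference model itself.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple

@dataclass(frozen=True)
class SiteDocument:
    site_id: str
    content_type: str   # illustrative label, e.g. "Form FDA 1572"
    version: int
    filename: str

# Hypothetical list of document types expected at every site.
REQUIRED_TYPES = {"Form FDA 1572", "Investigator CV", "Financial Disclosure Form"}

def latest_versions(documents: Iterable[SiteDocument]) -> Dict[Tuple[str, str], SiteDocument]:
    """Keep only the highest version for each (site, content type) pair."""
    latest: Dict[Tuple[str, str], SiteDocument] = {}
    for doc in documents:
        key = (doc.site_id, doc.content_type)
        if key not in latest or doc.version > latest[key].version:
            latest[key] = doc
    return latest

def missing_by_site(documents: Iterable[SiteDocument], required: Set[str] = REQUIRED_TYPES) -> Dict[str, List[str]]:
    """List the required content types that each site has not yet filed."""
    present: Dict[str, Set[str]] = {}
    for doc in documents:
        present.setdefault(doc.site_id, set()).add(doc.content_type)
    return {
        site: sorted(required - types)
        for site, types in present.items()
        if required - types
    }

# Made-up filings for two sites.
filings = [
    SiteDocument("site-001", "Form FDA 1572", 1, "1572_v1.pdf"),
    SiteDocument("site-001", "Form FDA 1572", 2, "1572_v2.pdf"),
    SiteDocument("site-001", "Investigator CV", 1, "cv_v1.pdf"),
    SiteDocument("site-001", "Financial Disclosure Form", 1, "fdf_v1.pdf"),
    SiteDocument("site-002", "Form FDA 1572", 1, "1572_v1.pdf"),
]

print(latest_versions(filings)[("site-001", "Form FDA 1572")].filename)
print(missing_by_site(filings))
```

With a hundred sites over several years, the same two queries scale without anyone opening individual files, which is the practical payoff of assigning the metadata at filing time.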


[Ron Niland] Now, on this slide, what we wanted to give you a sense of, again for those within data management in particular, is what happens when the data is aggregated across a series of systems. And at the core of a lot of the analyses for data management groups is SAS. So this slide was produced, in fact, by, by SAS, but we thought it was very illustrative of how you might want to approach your informational framework, whether it's data management, or clinical or what have you. The idea is you need access to the information, then you need to be able to manage it. The idea is to manage it in sort of discrete subsets, putting variables in there and doing validation of that data.

With the idea, then, of analyzing the data and developing outputs that may be graphical in orientation, and using things like Spotfire, or Shiny off of R. And the outputs are really what you're looking to drive for inclusion in your submissions, perhaps. And so that could be the summary reports, line listings, graphical outputs. But this, this framework, while it looks simple, it's very, very empowering for the organization. Keith?


[Keith Parent] I think the sad part, Ron, here, is a lot of times, instead of actually doing real work of showing the data, and putting the data together the way you want to, a lot of time is spent in cleaning up data. And our goal would be to say, how can we get data in so that we have less cleanup work and more of actual, you know, representation of the data itself? You know, if we could get to that phase that, that would help cut down drug, drug development dramatically. You know, cleaning data is just such a waste of time, for the most part. Thanks, Marianne.


[Ron Niland] Yeah, so speaking of cleaning the data, so, just focusing a little bit more here on that data management group. And you're getting data in things such as the datasets, which you're looking at in the upper left, and that's effectively something like a simplified Excel worksheet. But the fact of the matter is, the data management group is pulling data from across the organization, and very quickly, this can turn into a big dataset. It's managed within a library system. And then, we saw the visual outputs. But the point I would make here is that, for your organization, you need to think about the idea of a data catalog.

So, what's a data catalog? I mean, we've talked in the past seminars about data dictionaries and getting your individual terms defined. The data catalog is even more than that. Yeah. There's the basic terminology for the organization, or the technical definitions. There's the lineage of data. There's the mapping of the databases, and understanding the relationships between the data and how the data is used and governed, these are all factors that come into play with the data catalog. And so, if there's any one group in an organization that's going to sort of understand the need for this, it's data management. Keith?
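
To make the idea a little more concrete, here is a minimal sketch of what one catalog entry might record and how lineage can be traced from it. The `CatalogEntry` fields, the dataset names, and the source systems are all hypothetical; a real catalog tool would carry far more detail.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    # Hypothetical fields covering the factors mentioned above:
    # definition, source mapping, lineage, and governance.
    name: str
    definition: str
    source_system: str
    derived_from: list = field(default_factory=list)
    steward: str = "unassigned"


def lineage(catalog, name):
    """Walk upstream from `name` and return every ancestor dataset, nearest first."""
    seen = []
    queue = list(catalog[name].derived_from)
    while queue:
        parent = queue.pop(0)
        if parent in seen:
            continue
        seen.append(parent)
        if parent in catalog:
            queue.extend(catalog[parent].derived_from)
    return seen


catalog = {
    "raw_lab_results": CatalogEntry(
        "raw_lab_results", "Lab values as received", "LIMS"),
    "enrollment": CatalogEntry(
        "enrollment", "Subjects enrolled per site", "EDC"),
    "cleaned_lab_results": CatalogEntry(
        "cleaned_lab_results", "Lab values after validation", "SAS",
        derived_from=["raw_lab_results"], steward="data management"),
    "tumor_response_summary": CatalogEntry(
        "tumor_response_summary", "Response by subject", "SAS",
        derived_from=["cleaned_lab_results", "enrollment"],
        steward="data management"),
}

print(lineage(catalog, "tumor_response_summary"))
```

The definitions play the role of the data dictionary from the earlier webinars, while `derived_from` and `steward` are the lineage and governance pieces that go beyond it.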

[Keith Parent] Next, Marianne.


[Ron Niland] OK, yeah, and this is just a simple graphical output of tumor response to show you what may have been generated using Java, and perhaps something like Spotfire again. But the point is this can't happen unless you really have judicious management of the underlying data and the metadata. And at the same time, what you're seeing here is tumor response. So you can imagine, with a clinical trial, that this could be very significant information, and highly material, but especially for smaller companies. So the aspects of security around that are something that you need to think about, and the aspects of even intellectual property sort of come into play here.


[Keith Parent] The other thing I was gonna say, Ron, on this is that I've actually started to work a lot with a lot of AI companies recently, looking at data and being able to use smaller datasets to be able to pull some of that data out or look at different cohorts of data. So I think that the industry is ripe for a lot of changes, particularly when they're looking at data from that perspective. Next. Ron?


[Ron Niland] OK, yeah, so on this slide what we wanted to give you was that sense of the regulatory group, and really how they're such an integral player to the aspect of drug development and acting as a primary interface with a myriad of functions, right, as well as data management. But in the case of regulatory, they're interacting with the finance group and registration, and pricing with the project portfolio group, and sort of understanding individual projects and their progression and getting into the queue, if you will, for submissions, whether it's an IND or an NDA, you're interacting with the clinical and medical elements within research and development, and those groups, then, are interacting with their own sort of vendors, whether it's labs or CROs.

And the same holds true for manufacturing. They, they've got, perhaps, contract manufacturing organizations that they're working with. And the same, again, holds true for marketing and the vendors that may be supporting them with promotional materials. So, it's just something to think about in terms of the complexity, and if you could shift to the next slide, then, Marianne, what you'll see here is that this is a group that's also setting the stage for regulatory strategy, the clinical trial registration, submissions of plans and safety reports to regulatory bodies.

They've been driving the common technical document development and acting as an intermediary with these healthcare health authorities, in terms of questions, answers, commitments, postmarketing commitments, and the like. And so this is a group that is sort of at the center, the wheel, if you will, in terms of that hub and all the different spokes. Keith?


[Keith Parent] But it's also a big part of the cost too, Ron, right? When you get to this point, there's a lot of work that goes into making sure that those submissions go in as clean as possible, and all the work that happens with the health authorities is really important at this point. Let's go to the next slide. I mentioned a little bit earlier that I've been working on a DIA RIM working group, and they're about to publish out the second version of the reference model. And if you think about all the different categories, we kinda showed that on some of the previous slides.

But a lot of submission planning and tracking, the investigational application, the market applications, all the labeling applications, the variations of CMC from a chemistry perspective, the integrations, the KPIs, all the correspondence with the health authorities. All those pieces, we've actually looked at from a process perspective to kinda put together. So if you're lucky, if you're interested in that, a great place to look would be at that RIM reference model that is going to be published, or is being published now. Next slide.

We wanted to take, on this slide itself, we wanted to basically take a holistic view and look at all the way through. Those of you that are familiar with submitting to the FDA, understand the Module 1 through 5 concept and how things fit into that concept. And then how RIMs kind of ties into that, and all concept of once you've submitted, how do we pull it all together? This is an overall slide. Next slide, please Marianne.

And here, again, gets down to the actual modules themselves and how things are tied together. From the eCTD perspective, we wanted to show the electronic Common Technical Document, what fits into each of the different sections, and how these things are kind of tied together, so that when you look at the different departments, and who's going to be working on what, you can see, kind of broken out, how that data has to come along.

And realistically, if you didn't have that metadata structure, or you didn't have something to tie these things together, it would be so difficult to try to pull all this data together, and make things happen. Notice each of the different modules has their own table of contents, so you can look through it, because there's volumes, and volumes, and volumes of data that, that go out here. Those of you that have been in the industry long enough know that when we first used to submit to the FDA, it used to be tractor trailers full of paper. Then it became computer systems. Then it became electronic submissions. So now, we're at a point now where we can actually skinny that down to understanding what data gets sent and how it gets sent. Any comment, Ron?


[Ron Niland] Yeah, it's just, you know, as we looked at those prior illustrations for the functional areas, but especially regulatory, where you can see the bubbles expanding. With each one of those, if you, if you think about it, right, each bubble, or node, then expands to yet another. And the aspect of, then, following this out, that series of nodes to the eCTD is sort of part of that aspect of doing your, your informational architecting, that roadmapping, and sort of trying to diagrammatically, if you will, sort of capture that visualization of the landscape and the data flows. And here, you can see, sort of a node went into, then, that table of contents, as Keith pointed out, which is very informing in terms of your, your informational structuring.

[Keith Parent] All right, next?


[Howard Shatz] OK, this is where DCL typically gets involved. Regulatory has now, has now put together a CCDD, and in order to, part of that approval process is the package insert, the medication guide, whatever other written materials need to be included with the drug that goes to the pharmacist or the healthcare provider, what have you. So, back in 2005, the FDA began requiring that all that information, that text, be included in a Structured Product Labeling file, which is an SPL file.

Now, what you're looking at on the screen is rather intimidating, but we just wanted to give you a feel for what's in there. So we highlighted certain aspects, such as the DUNS number and the labeler name, and that all ends up being shown in this table when you go to the DailyMed or other sites. When you look at the label after all the package inserts, then you'll get the package images and then you'll get this table that summarizes all of the drug product information for that drug. The dosage form, the route of administration, the ingredients, the strength, the moiety, the inactive ingredients, and the packaging, as well as the manufacturing. So all that goes into this SPL file. And we've, at DCL, we've provided SPL since the inception.

Again, it was 2005 when it became required for approval submissions, and then in 2009, it became required for listing submissions and you have to certify these listing submissions every, every year. So, we're, we're, we're fluent and trusted in this field. But when we pull away and look at the biotech and pharma industry from a higher level, we recognize that SPL is U.S.-specific, and focused only on meeting the regulatory requirements of the FDA.

So, you have to use a certain name for the active ingredient, you have to specify the strength of the active ingredient in a certain way. You can't just say it's 10% alcohol for, I'm sorry, 70% alcohol, or 65%. Percentages is not how you express the strength in terms of the drug and you can't use measures like ounces and pounds. You have to use metric measurements. But all this is just a small, small attempt at the FDA to require the standardization across the drugs of a common terminology. But now, uh, the focus is on IDMP. So, if we can go bring up the next slide.
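
The kind of rule Howard describes can be sketched as a small check that accepts a strength written as an amount per amount in metric units and rejects percentages or ounces and pounds. This is only an illustration: the unit list, the text format, and the function name are assumptions for the example, and it does not reproduce the FDA's actual SPL validation rules.

```python
import re

# Illustrative unit list only; not an official or complete vocabulary.
METRIC_UNITS = {"mg", "g", "kg", "ug", "mL", "L"}

# Matches text such as "70 mL / 100 mL": amount, unit, slash, amount, unit.
STRENGTH_PATTERN = re.compile(
    r"^(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*/\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)$"
)


def parse_strength(text):
    """Parse 'amount unit / amount unit'; raise ValueError for anything else."""
    match = STRENGTH_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not an amount-per-amount strength: {text!r}")
    num_value, num_unit, den_value, den_unit = match.groups()
    for unit in (num_unit, den_unit):
        if unit not in METRIC_UNITS:
            raise ValueError(f"non-metric unit {unit!r} in {text!r}")
    return (float(num_value), num_unit, float(den_value), den_unit)


for candidate in ["70 mL / 100 mL", "70%", "2 oz / 1 lb"]:
    try:
        print(candidate, "->", parse_strength(candidate))
    except ValueError as error:
        print(candidate, "-> rejected:", error)
```

Catching "70%" at authoring time, rather than when the file fails validation, is the sort of rework saving the panel returns to later.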

And so, this is an international standard. The FDA's, the SPL is based on an international standard, Health Level Seven, just like IDMP, but the IDMP is much more sophisticated and they have these five main groupings: the substance forms, the dosage– the substances, the dose forms, [MPID,] the PhPID and the units of measurement. These are roles, again, moving towards standardization of the terminology that is being used, so that you can compare apples to apples, not apples to oranges when you're going across, uh, the slide, the systems. Next slide, please.


[Ron Niland] Oh, Howard, can I just maybe add to this, too? The idea behind an IDMP is to enable the regulatory bodies across the world ultimately to be able to compare compounds to compounds. And so, what they really want to understand is like that effectiveness and the risk benefit associated with the product. And they also want to understand that it's the integrity of the product going in, meaning that the blockchain of custody if you will, around all of the source ingredients, and the evolution, if you will, of a product, realizing pharmaceuticals are produced in over, over 50 different countries today coming into the US.


[Howard Shatz] Yeah, and also I want to add that while the FDA, the SPL part is a subset, or a forerunner to IDMP that the FDA developed for its purposes. They are involved with the IDMP; they are in agreement with this need for this larger standardization and they're working on and they've even relayed it. There's a webinar that the FDA put together where they map the current SPL requirements to the IDMP requirements and shows where, yeah, we're getting, we're partially there.

But, basically, IDMP, so, again, SPL was really just a small subset, but IDMP itself is creating a worldwide standard, as Ron was mentioning, and it's, the focus, what I've found, and I think Ron and Keith have found also, is that the companies are focused on getting SPL that validates with the FDA, but they're not getting a sense of the larger picture of why this is going on. And, but this whole process of IDMP will help streamline – it's not just for the benefit of the regulatory agencies, but in fact it will help streamline if the different departments that we've been referring to understand the need to standardize. That will make the, the information flow much more streamlined.


[Keith Parent] Howard, one of the things I find, I found out when we first started talking with that, um, a number of companies, one of the biggest problems that we have from a labeling perspective has been the fact that every time they come to you, sometimes they come in with different versions of the same document that have some of the label information. And it's almost an extra effort on your part, your team's part, to actually have to do a lot of rework of making sure that it's different or where the, the documents came from. So, that's, that's the kind of thing we're talking about with this webinar, is that if you understand that your data itself and you have the proper data that you're using, it saves all that extra rework later on.

[Howard Shatz] Right. So the different time savings are involved there as well.


[Ron Niland] Yeah. So, the functional areas, we just wanted to close it out with manufacturing, and it's not to give it short shrift but rather to say that this is a group, that integrated systems is really, uh, where it's at, right? And increasingly there's that Internet of Things that's helping to drive analytics. And this can go all the way down to understanding how products are moving from one location to another using cellular sensors and, you know, little devices put inside the boxes of a shipment that again, is in transit.

Um, this is a group, manufacturing, that is using the LIMS system. It's using the ERP system. It's, you know, it's taking that data and then conceivably analyzing it near time, near real time, if you will, and helping to understand what is happening with sort of the performance of either production or the delivery of product. And this information, if you've got your, your content and information structured properly, will enable better identification and root cause analysis and minimize defects and waste. So, again, we didn't want to give it short shrift, but the fact of the matter is, we're constrained in time here. But I think that's the last functional area, if I'm not mistaken, Keith.

[Keith Parent] It is, Ron, but one thing I wanted to add was just the fact that even in this, in this area, quality takes such a major part of, of manufacturing. And that you almost have a separate quality department just very focused in on that manufacturing aspect, compared to the quality that you'd look at, whether it's computer systems early on, or some of the other areas of quality in the early research. So, you know, it's really important to understand that all the data that we capture all along the way feed into the systems that are going to be used for manufacturing. And then manufacturing generates that much more data on top of that. We want to be able to tie back to those drugs that were actually manufactured.

[Ron Niland] Yeah –

[Keith Parent] Next, Marianne?


[Ron Niland] Maybe a last quick note on that slide of manufacturing. Manufacturing's involved with the product development all the way from research through. You know, the chemists are producing smaller scale production and then as they go into clinical trial, there's the larger scaling up and then even larger yet with the launch. And so, the data that you're able to sort of generate is going to help the manufacturing group in terms of their forecasting. And, you know, that could make a world of difference. Because, when you're producing, when you're starting to scale up, you could have, you know, millions of dollars involved with the production of product.

[Keith Parent] Ron, want to handle this last slide?


[Ron Niland] Yeah, so today, you know, we were talking about sort of streamlining workflows, and the idea is really to help you to sort of understand what the value of that is, and the issue is, on one hand, reduced reprocessing, rework. And, on the other hand, it's sort of getting the data into the hands of people that can sort of make decisions readily, and with conviction, knowing that they've got data that is true, right? The aspect of that streamlined workflow is an empowering enabler of automation.

And you, as you look at your company today, there's that question, right, of how much automation is there? How much manual process is there? Where do you want to shift from the manual process to the automated process? In order to get there, you need to understand your business process and do that business process mapping. And in doing that mapping, there's that very visual component that I was speaking to earlier. The idea of having just very consistent nomenclature in your company is going to be so key in this journey.


And, you know, just part and parcel here, is the aspect of metadata. The sooner, as Keith had pointed out in the early research slide, you can help that researcher understand their project, their project metadata, their program metadata, their department metadata, the better you're going to be serving the organization downstream, whether it's the pre-clinical group or early development, or, or later in development, with the data management group, that will be reaching all the way back to research for information, for sure.

And again, this data integrity and the nomenclature in particular, are going to help the IT group, the data management group, the program management group, to come together and understand how to develop really informing dashboards in the company based on real good key metrics. And those, those KPIs, and also with, with manufacturing, as we're alluding to.

The idea is really to help you to sort of not just improve your workflow, but to reduce your risk, and at the same time, improve your quality. In doing this, you're going to help the onboarding of staff, as we're alluding to, where there may be high turnover over the course of time with your company, and the onboarding and the training of your staff along these lines of helping them to understand what it means to have good content, and structure, and metadata, will enable you to convince your organization if you, if you're not there already, to embrace the idea of a single source of truth. And that single source –


[Keith Parent] You know, Ron, I can't tell you how many, I can't tell you how many times I've gone into a company, and they started out, and they're trying to do a collaboration project, and they're talking about new people coming on board and things like that. And the biggest complaint that we hear is they can't find anything. Because there's so many, it's so easy now to collaborate across different systems, with, with Teams, and Office 365, and all the different things that they put out there. And the problem is, that we, without putting that information architecture in place, and then metadata in place, you can't find anything, and it just makes everybody's life that much harder.


[Ron Niland] Yeah, and I guess, to this point, the number of parties that are involved, that's the series of stakeholders, or something that you'd be well served to identify, to map out as well, to help you to understand those people that are parties that you need to serve information to. So, do we have some time for questions, Marianne?


[Marianne Calilhanna] Yeah, we have a, just a couple more minutes. And I do want to remind everyone, you know, even if we can't get to your question, do feel free to submit it, and we're happy to follow up after, after this event. But we do have a question. Um, Keith, maybe you can speak to this. When people talk about a reference model from an industry, industry group, how strictly should that model, model be followed?


[Keith Parent] So, that's, that's very dependent on, on the customer themselves, how they want to do it within the company. We typically have, when we set up a reference model with some of the tools that we have, we do configuration workshop. And typically, you are only picking and choosing those reference model components that make sense to you. You only, there may be 50 elements out there, but you may only need 3 or 4 of them. So you don't have to use all of the elements. Because, to be honest with you, it could be overwhelming.

But if you use the ones that are the most applicable to what you need to do, again, we talk about that minimal viable product that it takes to get something going. Just identify which elements make the most sense to your company and use those. You'll always have the other ones there if you want to add them later on. Particularly if you're only going to be doing things in one country, you don't need to worry about multiple countries or if there's only, you know, one dosage, you don't have to worry about multiple dosage. So, those are the kinds of things that you should look at and only use what you need, because otherwise, it becomes overwhelming.


[Marianne Calilhanna] And follow-up question, somewhat, somewhat related. For a small- to mid-size company that might not be effectively managing metadata or document management, can you recommend three areas on which they should first focus?


[Keith Parent] I think number 1, one of the areas they should probably focus on is just, understanding the volumes of documents that they have and how they're classifying them today. When you look at the configuration today and how people are using them, they're gonna find that people probably are using those documents differently than you thought they were. So I think a big thing is just looking at the process maps across the board and how documents and data are used. And from those process maps, you can actually derive some of the metadata you need. You look at how, the ways people search for documents today.

A real big tell is to look at how they named files. Many people will actually use the naming convention of a file as metadata, so the metadata will actually be in the file name. So we, when we go and do a configuration workshop, we're looking at a couple of different things. One is the volume of documents they have out there. How they're classifying things today, and then how they want to be able to find them in the future. And if you can look at those two things, you can actually come up with a good way to classify and identify metadata that they may not even have thought about before. Does that answer the question, Marianne?
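
As a small sketch of what Keith describes, the snippet below pulls metadata back out of file names that follow a naming convention. The convention itself (`study_site_doctype_vN.ext`) and the field names are invented for the example; in practice the pattern would come out of looking at how a team actually names its files.

```python
import re

# Hypothetical convention: <study>_<site>_<doctype>_v<version>.<ext>
FILENAME_PATTERN = re.compile(
    r"^(?P<study>[A-Za-z0-9]+)_(?P<site>site\d+)_"
    r"(?P<doc_type>[A-Za-z0-9]+)_v(?P<version>\d+)\.(?P<ext>[A-Za-z0-9]+)$"
)


def metadata_from_filename(filename):
    """Return the metadata embedded in a file name, or None if it does not fit."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    fields["version"] = int(fields["version"])
    return fields


for name in ["ABC123_site014_1572_v3.pdf", "final report (2).docx"]:
    print(name, "->", metadata_from_filename(name))
```

Files that return `None` are worth a look too, since they show where people have drifted from the convention or never had one.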

[Marianne Calilhanna] Yeah. Ron, did you want to add something?


[Ron Niland] Yeah, please, thanks. What I would suggest for that small- to mid-sized company is to think about whether you have in place an understanding of your technology roadmap, whether it's been visually sort of laid out in terms of the functional areas and the systems being used. And then once you understand the systems, then understand the flows between the systems, and if you capture that in a visual, I think that will be very informing.

And then, then you want to look at sort of key documents that are flowing through those systems and just try to understand that relative mapping. I think another thing to do is to look at your operational, procedural documentation hierarchy and ask yourself, you know, how well are your processes articulated in SOPs and work instructions? And therein should be process maps too that you can sort of draw upon in building out your, your technology roadmap if you haven't gotten one in place.


[Marianne Calilhanna] Great. Thank you. And our next webinar in this series is going to be about systems. So we do invite you all to go to the website, we'll be sending you some e-mail and making sure you can register for that next learning series. Thank you everyone for attending this webinar. The DCL Learning Series comprises webinars, a monthly newsletter and our blog. You can access many other webinars related to content structure, XML, standards, and more from the On-Demand section of our website. We do hope to see you at future events, and have a great day. This concludes today's broadcast.


[Keith Parent] Thanks, Marianne. Thanks, everybody.

[Ron Niland] Thank you.
