DCL Learning Series
Current State of Best Practices & Technology for Life Sciences’ Content and Document Management
[Marianne Calilhanna] Hello, everyone, thank you so much for taking time out of your day to join us, and welcome to the DCL Learning Series. Today's webinar is titled "Current State of Best Practices and Technology for Life Sciences' Content and Document Management." My name is Marianne Calilhanna. I'm the Vice President of Marketing at Data Conversion Laboratory and I'm going to be your moderator today. Just a couple of quick things before we begin. This webinar is being recorded, and it will be available in the on-demand section of our website at dataconversionlaboratory.com.
We invite you to submit questions anytime during our program. You can do that via the box on the control panel, where it states "Questions." We will save 15 minutes at the end of our presentation to answer all your questions. If we don't get to your question, we'll follow up personally after the webinar.
Technology plays a critical role in the life sciences, where accuracy, traceability, and compliance, along with speed-to-market, are critical. Improving content and data management, IT systems, compliance, and program management streamlines research and drug development. Data Conversion Laboratory, Court Square Group, and JANA Life Sciences have developed this learning series to address how technology can contribute to your success. This is the first of seven individual webinars, and the other webinar topics are listed here. Today's webinar sets the high-level stage for this series, and we will cover basic terms and concepts related to content, strategy, operating standards, technology, and regulatory compliance.
I'm delighted to introduce our panelists today. We have Mark Gross, President of Data Conversion Laboratory. Keith Parent, CEO at Court Square Group. Ron Niland, President, JANA Life Sciences. Glenn Emerson, Senior Consultant at JANA Life Sciences.
[Glenn Emerson] Good afternoon, everyone.
[Marianne Calilhanna] This webinar is brought to you by Data Conversion Laboratory, or DCL, as we are also known. Our mission is to structure the world's content. DCL's services and solutions are all about converting, structuring, and enriching content and data. We're one of the leading providers of XML conversion services and an industry expert with SPL conversion for global pharma companies. If you have complex content and data challenges, we can help. Keith, can you tell us a little bit about Court Square Group?
[Keith Parent] Sure. Court Square is a managed service provider, specifically for the Life Science vertical. And we have what we call our Audit Ready Compliant Cloud Environment, where we host numerous content and management platforms.
[Marianne Calilhanna] Ron, can you tell us a little bit about JANA?
[Ron Niland] Sure thing, thank you. JANA is a third-generation, family-owned company. In the area of Technical Services, with our 150-plus technical writers, engineers, and program managers, we basically offer four components, and that is technical documentation that's supported by operational excellence, program management, and IT systems. And we operate within an ISO 9001:2015-certified environment, and we also comply with AS9100D.
[Marianne Calilhanna] So, before we begin, I'm gonna launch a quick poll. We'd like to find out and hear from everyone. Here we go: What is your role in the life sciences industry? So, if you can select what best describes your role, we'll give everyone just a couple of minutes to submit their answers.
I see that people are still voting. A couple of people are still submitting. OK, I'm going to close the poll and just quickly share the results. Can everyone see my screen? We have 20% of you who are in R&D, 10% each in regulatory and clinical operations, 35% in data and IT management, and 25% in corporate management or other. Thank you for taking part. That helps us understand with whom we're speaking.
All right, Glenn, over to you.
[Glenn Emerson] Thank you, Marianne. So content strategy is a buzzword we hear about a lot today, and it means different things depending on whom you ask and what their background is, but there's a good working definition that I like to give as an elevator speech. And that's the quote from Kristina Halvorson, that the goal is to ensure that content (or we can think information or data in this case) serves the business goal, or that it helps the user to complete a task. And a user can be a customer, a user can be an auditor, a user can be one of your researchers or other people who need to complete work to get your product to market. The goal of all of this is really to bring the stakeholders in your organization together and agree on a strategy.
Because typically, what we find in any organization, although the business has clear goals, each operational area in the business may have their own interpretation of those goals, or how they seek to meet those goals. The important thing is to bring those people together, and to hear what the use cases are that are essential to the business, and then understand from each other where the handoffs have to occur, and where there are potential conflicts, roadblocks, maybe different interpretations of the vision, and to seek alignment.
Because what we're really talking about here is turning your information into data that can be automated with computer systems and computer processing, and Mark's going to talk to that in just a moment. In order to begin that journey of structuring data that can work across these different systems and organizations in your company, we first need that high-level agreement and alignment. And with that, I'll turn it over to Mark.
[Mark Gross] OK. All right, thank you, Glenn. And thank you, Keith and Ron, for joining today. A lot of talent online here with us today. And thank you, Marianne, for assembling everyone for this series. So my role today is really to get everybody on the same page. We're defining some terminology that we're going to be using today and in other sessions that are coming along. First, an easy one, you'd think: What is content? I guess the short answer is anything that contains information. So, I think we used to think of it as just documents: published documents, or the working documents, and all the versions in between. But as we grow more dependent on computers to keep track of our lives, it's a lot more than that. I mean voice and video, images, SAS data sets, and all kinds of other stuff that we're using on a daily basis, which form the pieces of the information sets that we're using.
And what's a content type? Well, it's what the name implies. For example, with documents, it might be that a document fits into a particular format, has a particular template, and has particular metadata that you want to collect about that document. And just a quick "What is metadata?" Metadata is just all the information about the document, or the data that you're collecting about a document. So in an e-mail, the metadata might be who sent it, the date, and who you're sending it to, those kinds of terms. In a movie, it might be the director and the producer and the running time, language, things about that movie. In the life sciences, you have specific content types, like clinical summary reports and the informed consent form, and, I think, Keith, you wanted to jump in with just a little bit about that?
[Keith Parent] Yeah, Mark, I was; thanks for the prompt. One of the things that we see a lot in our client base is the use of content types to define different types of documents and have that metadata automatically tagged to those documents. And the usage of that really helps to streamline the process, in that there's a consistency and a quality that you'll get out of your documents, particularly if you're dealing with SOPs or if you're dealing with reports that have to go out to regulatory authorities.
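As a rough illustration of the content-type idea described above, the following Python sketch shows one way a content type could carry its own required metadata and apply it automatically when a document is created. It was not shown in the webinar; the `ContentType` class, the field names, and the sample SOP values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContentType:
    """A document category plus the metadata every instance must carry (hypothetical model)."""
    name: str
    required_fields: list
    defaults: dict = field(default_factory=dict)

    def tag(self, supplied):
        """Merge defaults with supplied values and reject documents with missing metadata."""
        metadata = {**self.defaults, **supplied, "content_type": self.name}
        missing = [f for f in self.required_fields if not metadata.get(f)]
        if missing:
            raise ValueError(f"{self.name} is missing metadata: {', '.join(missing)}")
        return metadata


# Hypothetical content type for standard operating procedures.
SOP = ContentType(
    name="SOP",
    required_fields=["title", "owner", "effective_date", "version"],
    defaults={"version": "1.0"},
)

# The author supplies only what is specific to this document;
# the content type fills in the rest and enforces completeness.
sop_metadata = SOP.tag(
    {"title": "Equipment Cleaning", "owner": "Quality Assurance", "effective_date": "2021-01-15"}
)
```

Because the required fields live on the content type rather than in each author's head, every SOP created this way ends up with the same metadata, which is the consistency benefit Keith describes.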
[Mark Gross] OK, good, thank you. Marianne, next slide, please. So, let's talk about structure for a minute or two. If you look at the two documents there, there's one that a human sees and one that the computer sees. When you look at the document, there's an implied structure there. This is an article: there's a title, there are the names of the people associated with it and who those people are affiliated with, a summary, all kinds of things like that. If you're just dealing with one document, well, you can just type it in your word processor and make it look the way you want, and that's great. But if you're handling many documents, and many authors are producing documents, and you want many people to be using those documents in many places, it helps to add some structure to that. The structure adds consistency and the ability to use the documents downstream. It also lets the computer do the job of making a document look pretty rather than your having to spend time on that.
So, I think that much of the structuring today is done using XML, and this is an example of that. What the computer sees is the same document the human sees, but it's set up in XML, with all the tags that define the structure of that document. You know, there's a lot of stuff there, and it's a little daunting, but conceptually it's telling you that this is a title, and this is the author's name, and this is something else. And by doing that, you're producing a document that a computer can read; the computer can also check that you got it right, and it can format the document according to what the standards are. So if you're producing a journal article, or if you're producing documents that are going someplace else, they can always show up the right way. A little more about structure over here: this is just a closeup of the references section of an article.
And you can see the terms you might want to define. It would be things like the author's name, a journal title, and the page number; it might be dates, it might be other things over there. To the human, when you're reading it, all that is obvious. You know what the person's name is. You know what a journal title is. So why can't the computer figure that out? When you start dealing with international names, and journal titles that don't always look like titles, and abbreviations and things like that, it's not so obvious. So tagging that information in advance and putting that structure in allows things to be done.
Here's an example over here of what that would look like when it's in that XML code we spoke about. If you look here at the names, it's not just the name; we're also splitting apart the surname and the given name. And for the article, we were talking about the source and the date and all those other kinds of things; all that can be included. And we're sort of mixing things here: some of that is text that'll show up in the article, and some of that is the metadata that we spoke about before, like when it was published, and things like that. So, that's sort of an introduction to what these things look like when they're coded up with a structure embedded in the document itself. Marianne, the next slide, please?
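To make the tagged reference concrete, here is a minimal Python sketch that reads one reference and pulls out the surname, given names, source, and year as separate values. The slide's actual markup is not reproduced in this transcript, so the XML below is an invented sample loosely modeled on JATS-style tagging, and the author names and journal are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample reference, loosely modeled on JATS-style markup (an assumption).
REFERENCE_XML = """
<ref id="r1">
  <element-citation publication-type="journal">
    <person-group person-group-type="author">
      <name><surname>Nguyen</surname><given-names>Thi Lan</given-names></name>
      <name><surname>O'Brien</surname><given-names>P. J.</given-names></name>
    </person-group>
    <source>Journal of Example Studies</source>
    <year>2019</year>
    <fpage>101</fpage>
  </element-citation>
</ref>
"""


def parse_reference(xml_text):
    """Turn one tagged reference into a plain dictionary of its parts."""
    ref = ET.fromstring(xml_text.strip())
    citation = ref.find("element-citation")
    authors = [
        {"surname": name.findtext("surname"), "given_names": name.findtext("given-names")}
        for name in citation.iter("name")
    ]
    return {
        "id": ref.get("id"),
        "authors": authors,
        "source": citation.findtext("source"),
        "year": int(citation.findtext("year")),
        "first_page": citation.findtext("fpage"),
    }


reference = parse_reference(REFERENCE_XML)
```

Because the surname and given names are tagged separately, the program never has to guess where a name splits, which is exactly the ambiguity that international names create in untagged text.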
And the last thing I was going to speak about, as a very, very quick introduction, and we'll do a little more with it in following sessions, is metadata, which we spoke about a little bit already, and then a term that's used more and more today, which is taxonomy. So when you're dealing with articles, the metadata is really things like who the author is and what it's about. When you're dealing with forms and other kinds of complex documents that have lots of information, the metadata can actually be larger than the whole document is.
In this case, we're talking about a research template. If you look on the left, the terms are the version, the dates, and the funders; and it's not just the funders, but where the funders are located and what the grant number might have been. All these are information about the document. Sometimes we show a picture of an iceberg: the majority is below the surface, and it can be a lot larger than what you see. And what is a taxonomy? A taxonomy is really a way of further classifying the various information that you're using, mostly the metadata that you're using in an article, and putting it together in a structure so that you can use it.
And why do you need a structure? Because you want to be able to refer to information in the same way across a large group of documents that you might be collecting. Take, for example, a disease name. The one on the right is a classification of diseases, and there are several different classifications of diseases. For a disease, there might be a common name that's used, there might be a clinical name that's used, and there might be some scientific names that are used.
And if you're looking for all the articles relating to a particular disease, you want to make sure everything's collected. So if you have a taxonomy, and your search software and your other software are aware of the taxonomy, they can go and collect all the articles about it, regardless of which terms are being used. And it's not just the term itself; there might be categories within categories. Cancer is certainly a disease, and it's a very large category, but there are many different kinds of cancer, and there are sub-classifications, depending on how far you want to go in or how detailed you want to get. So having taxonomies and classifications like that lets a computer do the heavy lifting of finding exactly what you need. Next slide, please. I think with this, we turn it back over to Glenn, who can talk about some of the information models.
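The taxonomy-aware search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny `TAXONOMY` dictionary, its terms, and the sample document titles are invented for the example and are not taken from any real disease classification or from the webinar slides.

```python
# Invented mini-taxonomy: each term lists alternative names and narrower terms.
TAXONOMY = {
    "cancer": {
        "synonyms": ["malignant neoplasm"],
        "children": ["lung cancer", "breast cancer"],
    },
    "lung cancer": {"synonyms": ["lung carcinoma"], "children": []},
    "breast cancer": {"synonyms": ["breast carcinoma"], "children": []},
}


def expand_terms(term, taxonomy):
    """Return the term, its synonyms, and every narrower term beneath it."""
    terms = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in terms:
            continue
        terms.add(current)
        node = taxonomy.get(current, {})
        terms.update(node.get("synonyms", []))
        stack.extend(node.get("children", []))
    return terms


def search(term, documents, taxonomy):
    """Find documents mentioning the term under any name the taxonomy knows."""
    wanted = {t.lower() for t in expand_terms(term, taxonomy)}
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(t in text.lower() for t in wanted)
    ]


# Invented sample titles; none of them contains the literal word "cancer".
DOCUMENTS = {
    "a": "Outcomes in malignant neoplasm of the lung",
    "b": "Early detection of breast carcinoma",
    "c": "Seasonal influenza vaccination rates",
}

hits = search("cancer", DOCUMENTS, TAXONOMY)
```

A plain keyword search for "cancer" would miss both relevant titles here; expanding the query through synonyms and sub-categories is what lets the software collect them.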
[Glenn Emerson] Thank you, Mark, and Keith. That was really good. So we've talked at the high level about the content strategy, which is sort of the 10,000-foot view of what you want to do with your organization. Mark has taken that and explained how we turn that into structure, with examples of structured information, because, again, the goal here is automation and process improvement through automation. The computers need things very explicit. They don't have judgment. They need explicit data. The issue, though, is that it still is people who have to structure that data. So, there's organizational change and process required. And, in a moment, Keith and Ron are going to talk a little bit about organizational capability, maturity, and governance. The information model is the way that that all fits together. So, if the content strategy is high-level, the information model comes from the ground up.
It starts from what Mark has just described and then introduces the tasks and procedures that people who work with that content need to follow to ensure consistency, and then, from there, it can build to whatever you need for your organization, but it becomes a central governance document. And next slide, please.
So, this really describes how different pieces in the organization fit together, and different organizations have different needs. Today, we hear more about content as a service, for instance, where you're providing a content database: as Mark has described, you've turned your information into data that can be mined through APIs from other systems for automated sharing. A common example is companies that use Salesforce for customer relationship management and mine their knowledge base system for nuggets that the call center people can use during a call. That's an example of content as a service.
We also talk a lot about publishing to multiple channels and media formats from a single source. That's also done through automation. And we talk about having a consistent model and style to enable all of that. And this is all defined in an information model document. It, again, describes how the people working with the information need to work so that the information is consistent enough to achieve these results. And integrators and other third parties who want to build services off of that system can use that as a roadmap. They know where the touchpoints are, and they understand the document object models and the things that they need to mine. Next slide, please.
So, I mentioned governance, too, and I'm gonna just talk to this briefly. One of the popular sayings in the Agile management model, because Agile is such a change in the software development community, is that culture eats strategy for breakfast. You've got people that are used to developing in a waterfall model, and they have to think in terms of iterations and stories. It's a major change. The point is that you're driving an organizational change.
It doesn't happen on its own. You have to have consistent data. And to do that, you have to give people guidelines to follow, but you also have to build those incentives into how you run your organization. And sometimes there's remedial training, sometimes there's coaching and mentoring, but through these, you develop repeatable processes. That's how you get the capability maturity.
And in the 15 years that I've been consulting in the field, one of the key differentiators between organizations that are successful with these digital automation strategies and those that run into frustration is whether or not they've developed organizational maturity. And it doesn't happen just because people went to a class. They need to have a culture that reinforces the work that they're trying to do and guides them very specifically. Without it, you get entropy. People will fall back into tribal wisdom. They'll talk to their neighbor in the next cube: "How did this work for you?" And pretty soon, the little things that they think they know become organizational wisdom. And you start to see a breakdown in the data or inconsistencies in the output. Next slide, please.
So, this leads to objectives and key results. We establish the organization's goals as part of the strategy. Now, we actually get down to: What are we going to achieve, and how are we going to do it? We also need transparency and accountability on these goals. So it helps to have this stuff stated. For instance, if you want an HR process where you reward people, and you've got an information model that states what the objectives are, then you've got clear objectives for establishing performance goals for all your people. It's very open, it's very transparent. There's no mystery as to why one person is performing better than another.
And, you know, as we all know, a goal that doesn't have any measurable outcomes and doesn't have a timeframe is really just a wish. This is how you actually turn it into action. And with that, I'm gonna say next slide, turn it over to Ron and Keith. Thank you.
[Ron Niland] Thank you, Glenn. So just building on the theme of organizational maturity, and realizing we talked about how the pieces need to be sort of fleshed out and how they then come together, I was going to focus, in the next few minutes here, on assessing your organization and its maturity, and give you some thoughts for consideration in terms of helping your company or your institution to evolve.
So one of the things you need to think about is this sort of circular model of leadership, decision-making, people, and processes. And leadership isn't just about individuals. It really is important that you think about individual groups and departments in the organization that can sort of set the tone, if you will, for where you want to go. And it's imperative that the leadership buy into such an initiative, but it needs to be, as Glenn was alluding to, baked into corporate goals and objectives, as well as individual goals and objectives.
When it comes to decision-making and structure, there are aspects here that we need to think about in terms of content strategy and content development, and it really comes down to understanding, sort of, who is doing what and when. So what are the roles of individuals, the decision-makers here, and those that may be playing roles that are supportive, i.e. reviewers of content and data development and submission? When it comes to people, it's important to build your model around– or, to build people around your model. And the idea is to develop a culture of excellence ultimately. And that's a never-ending journey, for sure.
But that's something that, again, as we go into the start of the new year, that we need to think about, as we embark on projects such as this. When it comes to the work processes and the systems, then you have to ask yourself some questions, like, Do we have a content strategy? Do we have data structures? If not, you know, then that's sort of very telling in and of itself.
And all of the work that you do as an organization needs to be sort of reflected in your quality documentation structure. And that, from an ISO standpoint, is not just your policies and procedures, i.e. the SOPs, but also work instructions and enabling documents. Next slide, please.
So, when we're looking at your organization, or YOU are looking at your organization, and you're thinking about how mature it may be, you need to think about different domains. And not just maybe technical writers and writing groups, but also, those that may work in functions, like IT or even project management offices, the PMO.
And on this slide, what you're looking at are basic versus advanced sort of models that, when you look at these, should be very telling as to whether you have an organization that's less or more mature. But, suffice to say, you know, the larger companies such as, you know, the big biotech companies, such as Genentech, or Pfizer, they're working in that very advanced domain. Whereas nascent stage and mid stage companies may be sort of caught on the basic level with some elements, but perhaps very advanced in others. Next slide, please.
So, on this slide, what we're looking at are basically two things: capability mapping and some insight into process mapping. So, with that journey of taking these pieces and fitting them together, the idea is really to understand your organization and the disparate parts. In this case, what we're looking at are a series of functions in an organization: this is Blue Cross Blue Shield, and they've got about a dozen different functions, and their core processes are identified.
Now, these processes can then be rank ordered by priority, or they could be rank ordered by chronological order in a process. But the idea is you want to have this higher-level understanding that then enables you to drill down and create processes around each of these. And that's what we're looking at in the lower half. So, this is a screenshot or two from a system that does business process mapping. This particular system is called MEGA.
And what we're looking at here, on the left side, is a rendering of the backend of MEGA where you're creating processes, and those orange boxes basically represent the first steps in a process. And then, as you drill into them, to the right, you would see an individual process blown out. And that has, in this case, three swim lanes that are broken down by roles, and then you see connections to documents and systems here; systems are denoted with the green boxes. So this is, again, a continuum where capabilities drive the processes that drive the standards that drive the data and the information model. Next slide, please.
On this next slide, what you're looking at is a data dictionary. And we talked about taxonomies, and the taxonomy helps you to understand how to organize your data. The data dictionary is sort of an exercise that may even precede the idea of developing a taxonomy. Here, the idea is to go into your company and organization, and think about it from a process and timeline perspective as to, OK, where are we going with our goals, and the objectives, and the actions. And those then feed into timelines. Well, in the timelines, you've got a series of tasks.
But the question is, what do these tasks really mean? And the idea with the data dictionary is to make it totally unambiguous as to what you mean by a particular task. And so putting a discrete definition around it is imperative. The data dictionary then feeds up into the top here, as you can see, into a few different things. It can run from your protocol measurements on the one hand to statistical operations on the other. But suffice to say, this is an exercise where it's time well spent.
And on this next slide, if you were to advance, we can then see how the pieces come together. So to the left, we've got the data dictionary; in the center, we've got the process mappings; and to the right, you have the project timelines. Within that data dictionary, again, we define the terms we talked about, and identify different data types, whether it's alphanumeric, or currency, or Boolean. And then we understand that this is informing things like the metadata and the taxonomies. At the same time, those terms that you get in that dictionary will be associated with steps in your processes. And then those processes will be broken out by swim lanes. And then you can take the step of documenting your processes and folding that into your procedural documentation, i.e. your SOPs, and at the same time identifying connections to actual systems. And then lastly, the project timelines: take, again, the tasks, and put them into a work breakdown structure, where you're then identifying who's doing what and when, and the interdependencies.
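A data dictionary of the kind described above can be sketched as a small lookup table that pairs each term with a definition and a data type, plus a check that incoming records conform. The Python below is an illustrative sketch only: the three terms, their definitions, and the rule that currency values have at most two decimal places are assumptions made for the example, not content from the slide.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical dictionary entries: one unambiguous definition and one data type per term.
DATA_DICTIONARY = {
    "site_id": {"type": "alphanumeric", "definition": "Identifier assigned to a study site"},
    "visit_cost": {"type": "currency", "definition": "Cost charged for one subject visit"},
    "consent_signed": {"type": "boolean", "definition": "Whether informed consent is on file"},
}


def is_currency(value):
    """Accept finite amounts with at most two decimal places (an assumed rule)."""
    try:
        amount = Decimal(str(value))
    except InvalidOperation:
        return False
    return amount.is_finite() and amount.as_tuple().exponent >= -2


CHECKS = {
    "alphanumeric": lambda v: isinstance(v, str) and v.isalnum(),
    "currency": is_currency,
    "boolean": lambda v: isinstance(v, bool),
}


def validate(record):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, value in record.items():
        entry = DATA_DICTIONARY.get(name)
        if entry is None:
            problems.append(f"{name}: not defined in the data dictionary")
        elif not CHECKS[entry["type"]](value):
            problems.append(f"{name}: expected {entry['type']}, got {value!r}")
    return problems


good = validate({"site_id": "US012", "visit_cost": "125.50", "consent_signed": True})
bad = validate({"site_id": "US-012", "enrolled": 3})
```

Terms that are not in the dictionary are flagged rather than silently accepted, which reflects the point that every task or field should have one agreed, discrete definition before it is used in processes and timelines.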
Next slide, please. And lastly, what I'd like to do in this slide is talk about just some of the other considerations that you need to think about. And those are things like the Identification of Medicinal Products, IDMP, or the General Data Protection Regulation, GDPR, or things like the DIA Reference Model, which fits into things like how you might organize your clinical trial-related information.
Each one of these is either a regulation or a standard that really will be mandating, to some degree, how you approach the development and compilation and submission of data. With IDMP there are no fewer than five ISO standards that come into play.
Whether it's related to the medicinal product on the one hand with ISO 11615, or going all the way deep down into, let's say, the pharmaceutical product where it's 11616, and basically you know, governments around the world, they want to understand where product is being sourced, all the way down to the incipient level of production.
With GDPR, as we know, if a person says they don't want to be in your systems, we need to take them out. And with the DIA reference model, this is going to really guide how a submission goes to FDA. So, that's an overview, and I hope you appreciated it.
[Keith Parent] Let me move on a minute; there we are. Now, I'm going to try to take a lot of the overview material that Mark and Glenn and Ron have spoken about and bring it down to the practical level that we deal with on a very regular basis. So, where are we actually storing that data? Is it in-house, on our servers inside? Is it on somebody's laptop? Is it in a cloud-based system? Is it in a hybrid environment, somewhere in between?
Maybe some of you can see yourselves in the fact that you're storing things in your e-mail systems. That happens all the time, whether it's in a public or a private cloud environment, or an open or closed one. And we'll go into a little bit of that later on. Next slide, Marianne.
What we think about is some of the standards you want to live by. In the life science world, we particularly have to think about things like the development, test, and production environments that we put our systems into. When we're dealing with content and document management systems, are we giving ourselves the ability to validate those environments? And as we're looking at content going from one system to another, have we validated the process of moving that content?
We work with a number of companies that will go out and buy compounds from another corporation. Maybe it's a large pharma that stopped work on a compound. Somebody gets a management team, they put things together, and they want to start. So now they've got to take all the documents on: have you validated the process of moving those around? When you finish up a clinical trial, have you taken the step of archiving that data, or have you done any kind of file or folder locking of the documents or the data that are there, so that nobody else has access to that? You have a responsibility to make sure that nobody can change any of that data over the course of the trial. So that's your job: to make sure that, from a content perspective, you're governing who has access to that, and how they get to that. Next one, Marianne?
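One common way to support the checks described above, whether confirming that a migration copied everything intact or confirming that an archived folder has not changed, is to record a checksum for every file and compare later. The Python sketch below illustrates that idea only; it is not a description of any speaker's system, the function names are hypothetical, and a checksum manifest detects changes rather than preventing them, so it complements folder locking and access control rather than replacing them.

```python
import hashlib
from pathlib import Path


def build_manifest(folder):
    """Map each file's relative path to the SHA-256 digest of its contents."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(folder, manifest):
    """List files that were added, removed, or altered since the manifest was built."""
    current = build_manifest(folder)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

In use, the manifest would be built when the trial data is archived (or before a migration) and stored separately; running `changed_files` against the archive, or against the destination system after a move, should return an empty list if nothing was altered.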
Data governance is a huge area that I think each of the previous speakers spoke a little bit about, and you should take those points in their totality when you think about it. I know that Ron and Glenn and Mark all talked about different pieces that go into corporate governance, or into data governance.
Particularly, when you start to think about how data is going to be used across those silos. Ron put up a very nice picture showing the different departments within a company. When you talk about corporate, legal, regulatory, clinical operations, everybody may call a specific product, or a project, by something different.
So the concept is, how do I put a data governance plan around defining how that data goes across? That's where that data dictionary comes in. That's where some of the content types and some of the metadata that we define come in. We may have a table of metadata that all points to the same type of thing, but it's used differently depending on where it is in the drug development life cycle.
In this case, we also want to talk about security of the data. Do only my internal people have access to it? Everybody's working with a lot of external people now. So, you may have a CMO that you're working with, you may have a CDMO doing some manufacturing for you, you may have a CRO that's gathering data, somebody that's doing an EDC system for you and pulling data in. So, working with external vendors, how is that data going to be transferred to you? And what are you doing with that data?
The granularity of that security: is it set up so that only a certain department can see that? Or are there certain people that can have access into that? Do we have outside consultants that need to look at that? Do I have partners in other companies that need to have access into that data?
And then also inheritance: does the security in my environment get inherited all the way down? Those are a lot of things you have to think about. It's very tough sometimes when people go out and they're using systems that they can get that are real easy to allow other people to have access into it. But then it kind of goes against what they want to do as a corporate governance model.
A zero-trust model means that nobody has access to anything and you have to actually grant them access into that and be able to get into that. Or some people actually open it up wide and then shut things down. It all depends on your organization, how fluent you are in IT, and what you want to do from a security perspective.
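The deny-by-default approach and the inheritance question raised above can be illustrated with a small Python sketch. It assumes a simple model in which access is granted per folder and inherited by everything beneath that folder; the `AccessPolicy` class, the role name, and the folder paths are hypothetical and do not describe any particular product.

```python
from pathlib import PurePosixPath


class AccessPolicy:
    """Deny by default; a grant on a folder is inherited by everything under it."""

    def __init__(self):
        self._grants = {}  # folder path -> set of principals granted read access

    def grant(self, principal, folder):
        """Explicitly allow a principal to read a folder and its contents."""
        self._grants.setdefault(PurePosixPath(folder), set()).add(principal)

    def can_read(self, principal, path):
        """Walk from the item up to the root looking for an explicit grant."""
        target = PurePosixPath(path)
        for folder in [target, *target.parents]:
            if principal in self._grants.get(folder, set()):
                return True
        return False  # no grant found anywhere on the path: access denied


policy = AccessPolicy()
# Hypothetical example: an external CRO monitor may see site data for one trial only.
policy.grant("cro_monitor", "/trials/ABC-001/site_data")

allowed = policy.can_read("cro_monitor", "/trials/ABC-001/site_data/visit1.csv")
denied = policy.can_read("cro_monitor", "/trials/ABC-001/finance/budget.xlsx")
```

Starting from no access and adding narrow grants keeps an external collaborator confined to the data they need; the opposite approach, opening everything and then shutting things down, would require listing every exclusion instead.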
And then, depending on the types of systems you have, whether they're open or closed: an open system is one that's using a public cloud environment. You may have a third-party tool that people have access into, an SFTP site, things like that. Or it could be closed, where it's all within your own corporate environment, and you have governance of those systems by your IT department, by the security group within that department. All of those things are kind of tied together. Next, Marianne?
And we talk again a little bit more about that whole external collaboration aspect. Secure data sharing. How do we make sure that they only get access to the data that they really need to see, and not more than that? Again, part of that is the governance plan you put in place, and the inheritance you might put on the data itself. How do we ensure that when somebody takes the data, they can't use it elsewhere once they're working on another product? Many of you may have been involved in projects where you're working with multiple joint venture partners, and they force you to have a laptop that's specific to that one environment. It makes life very difficult when you're trying to do your day job, and now you've got to work with a different computer just for that.
I know there are a lot of CRO companies that we work with where the CRAs may actually have three or four different laptops, because they're working on different projects, and they have to get through certain systems with the different sponsors they have. It makes doing your work a lot more difficult. Then you may just have outsourced resources. Sometimes, particularly in the regulatory space or in the clinical space, you may need a certain specialist, whether it's somebody that's specific to that therapeutic area or specific to that regulatory area. You may need to have somebody that's a specialist in EU or Health Canada, and you need to have those people brought in on a project.
How do we give them the same governance and access into our data and content? I know that Mark talked a little bit early on about content, content types, things like that. And when you're getting in a lot of documents, like a project we did where we were looking at 1572 documents across, you know, 150 different sites of a clinical trial, and trying to pull the content off of those documents, it's important that you're able to use content as a service. I think Glenn actually talked about that: using APIs to get content out of particular documents. Next, Marianne? Next slide.
There we go. And then, we're talking about technology. And I want you all to realize that this is going to be a learning series, and we're going to dig deeper into each one of these topics that we went over in the first, you know, 20 or 30 minutes of this. We're actually going to dig a lot deeper into metadata, and content types, and things like that. The reference models we have to deal with. So, the DIA is a big one for reference models, particularly around eCTD, anything you're gonna submit to the FDA, or other health authorities.
The eTMF, if you're going to be doing trials, you're gonna have to have some type of eTMF. We're gonna get into that and show how technology can work with that. Can you have compliance built in from the start? Can I already have all the artifacts necessary to sustain an audit? You know, I talked earlier about having an audit-ready environment. The goal here is to make sure that the applications you're going to be using for content management are also audit ready.
How do we incorporate new technology? It seems like every single day, whether it's Google or Microsoft or Amazon, somebody's coming out with a new tool for you to use.
How do we put those into the existing environments and get them to be useful? I know that Microsoft likes to throw new things out all the time, and people get Office 365 and Teams, and all of a sudden, there are 10 million different applications you can add into that. Well, have you talked to your corporate governance? Can you use those? Are they secure enough? Or are you actually giving away the farm, because when somebody on a team can actually allow somebody else access into it, does that go against the corporate governance plan?
Do you have the audit trail built into the documents and the content itself to make sure that you only have the right people accessing that or doing things with that?
And then can I incorporate things like digital and electronic signatures into my environment? And if I do incorporate those, are they actually 21 CFR Part 11 compliant? Can I just take a picture of my signature and drop it into a file? Does that suffice? Or do I need to have something that's going to have a tag that's stored in a database that's incorporated into an actual signature workflow? So those are a lot of things that we have to deal with on a very regular basis. And we're trying to work with companies to understand the technology, what technology does work and what technology doesn't work.
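The difference between a pasted picture of a signature and a signature stored as a record tied to the document can be illustrated with a small Python sketch. This is not a 21 CFR Part 11 implementation and makes no compliance claim; the `SignatureRecord` fields and both function names are assumptions chosen for the example. It only shows the general idea of storing who signed, why, and when, together with a hash of the exact content that was signed.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SignatureRecord:
    """Hypothetical database row created by a signature workflow."""
    signer_id: str
    signer_name: str
    meaning: str           # for example "approval" or "review"
    signed_at: str         # ISO 8601 timestamp in UTC
    document_sha256: str   # fingerprint of the exact bytes that were signed


def sign(document: bytes, signer_id: str, signer_name: str, meaning: str) -> SignatureRecord:
    """Create a record bound to the document content at signing time."""
    return SignatureRecord(
        signer_id=signer_id,
        signer_name=signer_name,
        meaning=meaning,
        signed_at=datetime.now(timezone.utc).isoformat(),
        document_sha256=hashlib.sha256(document).hexdigest(),
    )


def still_valid(record: SignatureRecord, document: bytes) -> bool:
    """A signature no longer applies if the document changed after signing."""
    return record.document_sha256 == hashlib.sha256(document).hexdigest()


# Illustrative use with made-up content and a made-up signer.
protocol_v1 = b"Study protocol, version 1"
record = sign(protocol_v1, "u-1001", "Jane Doe", "approval")
```

An image dropped into a file carries none of this: it can be copied onto any document, and nothing detects a later change to the text. In the sketch, `still_valid(record, protocol_v1)` holds, while the same check against edited content fails.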
I know that Mark had mentioned earlier the different types of content, where you have voice, and video, you may have SAS datasets, you may have a whole bunch of different things. Even your e-mail: when people pull over content from a different company, you may have to take a whole database file of e-mail messages, and then have to convert those into PDFs to deal with. Even the concept of PDF: what is a PDF versus a Word document? And how do I work with that?
Mark had mentioned earlier some of the HTML behind the scenes of some of those documents. How do we take all those things and kind of combine them together from a technology perspective, and make sure we maintain the integrity of those documents?
As we go along, you know, one thing people don't realize is that a lot of times you will see companies you're working with, maybe a CRO, and there are different outside sites that they have for a clinical trial. They're given a form and they print something out, they may scan it in. And then all of a sudden, it's sent to you as an image file. Well, that image file is no longer searchable, because it's now an image. How do we take that and then OCR it? How do we do optical character recognition on that document so we can actually get at the data that's on it?
You may have actually lost fidelity of a document based on how people give information back to you. And you may unwittingly do it yourself when you put something on the scanner and then send it across and don't choose the right configuration to make sure that it actually gets OCRed on your scanner in your own office. Those are things that we try to work with people on, to put processes in place. I know Ron talked about that process mapping and the workflow and the work instructions.
Part of that is to make sure that, from a technology perspective, we're using the technology correctly, so that the fidelity of the document, the content, can be saved as we go through the process. Next, Marianne. And I think that one of the biggest problems that we may have is the integration of multiple applications. We may have, you know, best-of-breed applications across the board for doing different things. In this case, you may have EDC, electronic data capture, and you need to extract that data out, and it comes to you, well, what happens if it comes to you in a whole bunch of HTML files?
What are you going to do with those? How are you going to deal with those? You asked for PDFs and they came a different way. What do we do with that, and how do we rectify that to get the data out of those? eCTD readers in different repositories: eCTD is the electronic common technical document. That's going to be the way the FDA is going to look at your submission. And I have different repositories. Can I drag and drop onto my repository so that I actually have the links between my documents preserved? Those are things that you always have to think about from a technology perspective.
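For the case where an EDC export arrives as HTML files rather than PDFs, one way to get the data back out is to parse the tables directly. The sketch below uses only Python's standard-library `html.parser`; the sample markup, column names, and the `TableExtractor` and `extract_rows` names are invented for illustration, and real exports are likely to need handling for nested tables, merged cells, and vendor-specific layout.

```python
from __future__ import annotations

from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._row: list[str] | None = None    # row being built, if any
        self._cell: list[str] | None = None   # text fragments of current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html: str) -> list[list[str]]:
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample standing in for one exported page.
sample = (
    "<html><body><h1>Visit log</h1><table>"
    "<tr><th>Subject</th><th>Visit</th></tr>"
    "<tr><td>001</td><td> Week 4 </td></tr>"
    "</table></body></html>"
)
rows = extract_rows(sample)
```

Once the rows are recovered as plain lists, they can be loaded into whatever structured store the governance plan calls for, instead of being re-keyed by hand.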
EDMS, electronic document management systems, those different repositories, and how they interact with RIM systems. RIM is regulatory information management. How am I dealing with those regulatory authorities? If I'm only dealing with the US, I've got the FDA. But if I'm dealing with the EU, or Health Canada, or over in Asia-Pac, I've got different health authorities. You may have lots of clinical trials that are happening worldwide, and all that data goes across repositories. You may actually have different aspects of the same data that have to be looked at differently by different health authorities. I'm working right now in a group within DIA putting together a reference model around RIM and how that can work together. And then, obviously, you have quality and SOP management. Ron talked a lot earlier about some of the very specific standards of SOPs and what you want to do around quality documentation, and how those things fit together.
You've gotta make sure that, whether it's your training records or your SOPs, all of those things tie together. You may have a quality issue that then ties back to an SOP that has to be updated. You want to make sure that those applications can be integrated together. Those are all things that we're gonna hit during this DCL Learning Series; we're going to make sure we're hitting some of the details around that. So, if you liked what you heard as each of the speakers talked through the overview, I encourage you all to keep listening in: as we dig deeper, we're gonna get into a lot of practical examples. We're going to talk specifically about how we can get into that content, how we can use that content.
We'll give you some tips on where you can find information about those different reference models, whether it's from RAPS or from DIA or from some other organization that we've come in contact with. We're going to share that information with you and help you get along that pipeline and understand how each of the different departments within your organization can work together. And with that, next one, Marianne.
[Marianne Calilhanna] Hi, everyone. So I'm back and we would like to, um, before we go into the questions section, we're going to launch another poll. And we would like to ask each of you, where do you think processes at your organization could be improved? And you can select all areas that you think apply. That could be operating procedures and work instructions. Automation of business processes, internal communication, external communication, content structure and standardization. So, we're just going to give a couple of minutes. People are selecting the various places where they feel their organization could be improved. And I'm gonna give one more second, and I'm going to close the poll, and then I'm going to share the results.
So, we have a tie: 63% indicate that operating procedures and work instructions, as well as content structure and standardization, are areas where there could be improvement. Similarly, automation of business processes and internal communication are sort of vying for second place. And external communication, about 19% of you share that.
OK, we are now moving into questions and answers, and please, if you have any questions, I invite you to submit them via the questions dialog box in the control panel. You're fortunate to have four experts in, you know, this area to answer anything you'd like. We're going to start with the question: Can you share the correlation between content and data modeling, and its implementation in a structured content management system?
[Glenn Emerson] Is that open to all of us, or did you want to–
[Marianne Calilhanna] Yeah, yeah, please. Please, why don't, I think each of you have something to say, so someone could start.
[Glenn Emerson] I'll start, and I'll be brief. So, the question is, the connection between content modeling, and, I'm sorry, what was the second part?
[Marianne Calilhanna] Um, the correlation between content-slash-data modeling and implementation in a structured CMS.
[Glenn Emerson] Oh, OK. Exactly. So the CMS needs to be configured, first of all, but also, it has to be used by people. And one of the top areas in the poll, one of the two tied areas, was the issue with content structure and standardization. I think implicitly, that speaks as much to the people working with the data as it does to the data itself.
And you need a clear understanding of the goals and objectives. Keith, you know, had quite a bit of detail and examples from the data governance side, and Ron from the process side; you have to understand those. And you have to understand, in your organization, how they affect the different functional areas of the organization, and have alignment. Then you need a plan for people to create that data consistently. That's really the key. Because you're dealing with computers handling data. You're not dealing with people interpreting data. The computers really don't interpret; even with machine intelligence, they're very limited in what they can do. They are as good as what's fed to them.
So if you want your information model and your content management system to line up, you need to look at those goals. So what should the structure be based on those goals, and how will the structure answer the goal? And then how will you develop the people that will create that and manage it in the content management system?
[Keith Parent] Just to add to that, I think it depends on where your data's coming from. If it's clinical trial data, do I have an eTMF set up, do I have an electronic trial master file set up? Do I have all the documents that go along with that, have I gotten that from a CRO, is it coming in to me? Do I have to use a special setup for that? Some of you may already know about the DIA eTMF reference model. The issue is really around, you want to use what works for your company.
And some people go all out, and they use every piece of the reference model. But the reality is, if you only use a small portion of it, it's the portion that works for you. I think it's looking at your data, understanding how you're going to use that, and how you're going to use that across the silos within your company so that you can benefit. I think Glenn talked to that early on, when he first talked about putting your content strategy together. How is the content going to fit into our overall corporate strategy? And I think tying that data and laying out how the data is going to be used is important in adding it into whatever content management system you use.
[Mark Gross] I think it's interesting that we all have different views when we're looking at data and content. And, you know, my view is, nothing starts till you have content. It's very important for systems to be standardized and to be able to get all the content in a way that conforms to the system, but the reality is that you're getting content from lots of different places, and it is what it is.
People didn't always develop content in order for it to be taken apart and pulled together, and moved into a content management system in a particular way, especially if it's information coming in from different companies, from different organizations, and from different periods of time. So, you know, my view has always been, how do you take the content that's out there, and standardize it so it can go into these things? So, on the one hand you want to work with the ideal, but the reality is that you still have to deal with things that are coming in from different places. But it's not a hopeless task: there are procedures to do it, there are ways of handling it. And today with artificial intelligence and machine learning, all these other tools that we really didn't have 10 years ago, there are ways of pulling things together even when they were unstructured to begin with. So there's hope out there.
[Ron Niland] If I could defer...
[Marianne Calilhanna] Please, go ahead.
[Ron Niland] Yeah, I'll defer to the others just so we can get more questions.
[Marianne Calilhanna] OK, kind of a follow-up to this: how much is the industry's dependence on Word a hindrance to achieving operational excellence?
[Ron Niland] Can I answer that one?
[Marianne Calilhanna] Yes you can.
[Ron Niland] Yeah, so, you know, it is interesting when you look at the industry, there's such a reliance on tools such as Outlook and Word, and Word, in particular, for developing content, which we label as documents. In some ways, you know, we had a solution developed long before people started to think about the idea of bringing all this information together. Word on its own isn't bad, but it doesn't necessarily get to the root of the problem. And the problem is understanding the chunks of data that could be repurposed efficiently in the company.
And that's where XML and DITA come into play, and the idea of data modeling. So in some ways, companies that have a lot of Word documents need to go through this exercise of transformation, and I know DCL is an outfit that specializes in this. But, you know, it is a tool, and yet its effectiveness for content management and information modeling is very lacking.
[Marianne Calilhanna] All right. If there's nothing else to add, I have a question here that really is sort of the golden ticket: are there metrics for either costs or savings for implementing these common practices and common standards within an organization?
[Keith Parent] Can I jump ahead? I think there are two things that you have to deal with when you're looking at costs, whether it's cost savings or implementation costs. One is, how much rework are you doing today because you don't have these standards in place?
How much do you have to change data? Or how much data can't you find within your organization? It's very hard to understand lost productivity when you're dealing with products that may have different synonyms within a company and different corporate entities are using different names, particularly if, let's say, a company buys another company, and all of a sudden, they've got to inherit that data, and they've got to translate maybe disparate types of data or content into their system. So, you know, the rule of thumb is really, the faster you can put one of these things in place and start to use it, the more it's going to save you money over time. Whatever cost you put into one of these things, it will save you.
You may choose the wrong system, but it's still going to get you better consistency than not doing anything or using things that cause added rework. So, I think the important part is putting a strategy in place, you know, first, and then looking at a system that fits your needs. Some systems are just way overblown, way too expensive, and there are systems that are more reasonable, that fit more of what your organization's like. But as your organization matures, you may need to go to another system because now you need that next level. You now have put processes in place. You may have put work instructions in place and now need to have some of that structure that wasn't there before. So I think as every organization grows and becomes more mature, there's a place for different applications and content management systems that fit.
[Glenn Emerson] To build on what Keith said, if I may. I agree with everything he said, and I just wanted to build on it a little bit. One of the questions we run into is, where do we find the money to do these things? And what he was saying is you may have processes where you're already losing productivity, and in a content strategy phase, that's one of the things we look at: an ethnographic view of where your inefficiencies are now. So sometimes these things start to pay for themselves. And we look for those opportunities.
So it's not just additional money you're spending on operations; you're actually making operations more efficient. A lot of times, people just accept these lost-productivity things: that's the way things have always been done, so nobody questions it. We want to come in, look at that, and say, yeah, from experience, we see where you can change the way you're doing things, as Keith was pointing out, to build efficiency into the process and change where you're allocating your expenditures.
[Keith Parent] Well, it's interesting because Mark and I actually talked together about a potential project where a number of your clients, Mark, use SPL, and they give you documents upfront so that you guys can put those structured product labels together. What's one of the biggest problems you guys run into right now?
[Mark Gross] Well, often you don't have all the information together, is really what's going on. So we're pulling information from lots of different places. So that is the problem. And to your point, I mean, that's very important, but I think your earlier point was that you're never gonna get it totally right in terms of what's going to happen 10 or 15 or 20 years from now, because things are constantly changing and developing. And I think the biggest mistake is to say, well, I don't have a perfect crystal ball out there, so let me wait another six months, 12 months, 18 months, before I do something. Getting something done, you know, getting that 90% solution or 95% solution, really pays a lot of dividends very quickly. So, you know, to hold off, to get frozen in place, is counter-productive, I think.
[Ron Niland] If I could just add one, maybe, comment or closing response to the question here, if we're talking about the idea of efficiency, in order to understand an organization's efficiency, you need metrics. Metrics need to be associated with processes. Processes need to be fed information. Information needs to come in the form of data. Going back to that aspect of a Word document, a Word document isn't data, right. Only when it gets broken down into these bits and pieces, the chunks, do you know what the data really is that can drive, then, the dashboards and the metrics and the corporate scorecards. And so, when you do that, you've got that effective report card that you can show to your organization of how you're moving that needle of efficiency over time. And this is a journey for sure.
[Marianne Calilhanna] Well, thank you. And, one quote that's always been one of my favorites, to comment on what Mark said: "Done is better than perfect." Well, we've come to the end of the hour. Thank you, everyone. Thank you to our panelists. Thank you to all of you who've taken time out of the day to join us at this webinar. Stay tuned, we have a lot more planned in this series, but the DCL Learning Series comprises more than just webinars. We have a monthly newsletter. We have a blog. You can access this and many other webinars related to content structure, XML standards, and SPL from our website at dataconversionlaboratory.com. We hope to see you at future webinars. Have a great day. This concludes today's broadcast.
[Mark Gross] OK, thanks, everyone.
[Glenn Emerson, Ron Niland] Thanks.
[Keith Parent] Thank you, everybody, take care, people.
[Ron Niland] We're done! I guess that's... perfect!