DCL Learning Series

Bridge the Gap: Integrate AI with Structured Content for Pharma Content Excellence

Marianne Calilhanna

Hello, and welcome to today's webinar. Today's DCL Learning series webinar is titled "Bridge the Gap: Integrate AI with Structured Content for Pharma Content Excellence." My name is Marianne Calilhanna, and I'm the vice president of marketing here at Data Conversion Laboratory. Before we begin, I want you to know that this is being recorded, and we'll make this recording available in the on-demand section of our website at dataconversionlaboratory.com. We'll save some time at the end of this conversation to answer any questions you have. However, as questions come to mind, please feel free to submit them at any time via that little dialogue box that says questions.

All right. I would like to briefly introduce my company, Data Conversion Laboratory, or DCL as we are also known. We are the industry-leading XML conversion provider, and we also provide many services that are adjacent to conversion. Those are the services you see referenced here, things like semantic enrichment of content, migrations to content management systems, and various tools for content analysis. DCL's core mission is transforming content and data into structured formats. We believe that structured content is fundamental to fostering innovation and foundational for your own AI initiatives.

We have a lot to cover today, so I am really excited to introduce today's speakers. My colleague David Turner is here today. David is with us from just outside of Dallas, Texas, and David is DCL's digital transformation consultant. He's an industry veteran in the areas around content management and content structure, and he's particularly adept at demonstrating the business benefits of digital transformation and helping organizations identify ROI to gauge their investments in systems, structure, and semantics. Welcome, David. So nice to see you.

David Turner

Thanks so much, Marianne, and thank you, everybody, for joining. I'd like to actually introduce our other panelist today, which is the great Regina Lynn Preciado. Hi, Regina, welcome. I've got a whole slide here dedicated to you. So tell us a little bit about yourself.

Regina Lynn Preciado

Hi, I'm Regina Preciado. I'm the senior director of content strategy solutions at Content Rules. I've been in structured content strategy a long time. I've been in life sciences a long time, and I'm an expert by this point in structured content authoring, component content management, content reuse and automation, and AI readiness, which is not on the bio. And of course, anyone who wants to follow up with me afterwards, I'll drop everything to talk about dogs with you. So thank you, David and Marianne. I'm so happy to be here.

David Turner

All right. Well, let's go ahead and let you talk a little bit just quickly about who is Content Rules.

Regina Lynn Preciado

Sure. Content Rules is a consultancy dedicated to solving complex content problems. Whatever challenge you have, we've probably seen it in our 30 years of being in business. So that's all I'll say for now.

4:03

David Turner

Outstanding. Well, thank you so much. We're glad to have you here, and everybody we're thankful to have you as well. We're here today to talk about a very popular topic. It's the topic that seems to come up everywhere you go, and that is the topic of AI. So to that end, we've got just a fun little poll to get things started. Crank it up there. All right, here we go. Have you attended either a webinar or a conference session related to AI in the past six weeks? So off and running. I don't get to participate in the poll, but I will say my answer would be yes. Regina, I'm going to ask you the same thing.

Regina Lynn Preciado

Yes.

David Turner

Yes is you as well. Probably both, right? Should put that in there. All right, so let's let that thing poll and go. Marianne, are the results coming in? I can't see them.

Marianne Calilhanna

Yeah, the results are coming in. We're going to give it just a couple more seconds. Interestingly, just before this webinar, I too received another webinar invite for AI in drug development, which I just chuckled at.

David Turner

Got to love it. Got to love it.

Marianne Calilhanna

Okay. Oh, we still have a couple more votes coming in. Okay, the majority have voted. I'm going to close the poll and then I'm going to share the results.

David Turner

All right. Oh, look at that. I'm so surprised. I felt like it was going to be 100% yes because it looks like –

Regina Lynn Preciado

I'm happy they came to ours.

David Turner

What's that?

Regina Lynn Preciado

I'm happy everyone has come to this one because I think this one has particular relevance to an area that other people are not necessarily talking about.

David Turner

Absolutely, absolutely. The idea where we got this is, Marianne had been talking about all these things that were happening with AI in life sciences. We do a lot of work with scholarly publishers and that kind of world, and it seems like so much is happening out there in terms of AI applications, in terms of just the research side of the house, the R&D. But on the side of the house where we really deal a lot, which is the whole content related to the drug development process, which I think Regina is also where you tend to be really, really involved, it seems to be slow at best. I'm not sure. I know some people have probably implemented some small things AI wise, but generally people that I'm talking to say "Yeah, we're really waiting, we're thinking about it. We haven't really done much to implement."

So we thought maybe we should explore some of the reasons why. So to that end, we asked ourselves, why is AI adoption lacking in the content part of the business? So the first thought was, well, maybe it's a lack of really good use cases, but I'm pretty sure that is not the case. In fact, I've got a "no" kind of sign here. I think there are plenty of use cases. First of all, the one thing that we all normally think about is generative AI, or "GenAI," or a lot of people just call it ChatGPT, or Copilot, or whatever the other tools are, and there are plenty of those. Can you talk about maybe some of the basic GenAI use cases that you've been seeing or talking about?

7:58

Regina Lynn Preciado

Yes. In fact, I was just recently talking to some of the companies that are on the forefront, the early adopters, the ones who, when they hear us say AI adoption in content is lagging, are going "What? We've been doing it for a year." There are some really good use cases out there, as we know, such as creating first drafts, and I think we will talk about some of these. And there are some real reasons why you can't just go in, install the tool, and expect to use it. One of the reasons is, the health authorities are still kind of issuing the guidelines on what you can use, so nobody wants to adopt and then be in violation of that. So I don't know, I think –

David Turner

But in terms of just GenAI use cases, I guess we can think about first draft creation, whether it's "create me such and such label" or "create me such and such protocol document." Really, I guess there are use cases around diagrams, first drafts of those.

Regina Lynn Preciado

Diagrams, and then there's just a lot because we do have a lot of repetitive or redundant content throughout an entire submission. So because of that, for humans, we now have a submission structure where there's a lot of summaries because the people reviewing this don't need maybe to know all the details about something. They need the summary. So we have a lot of layers across all the documents today, currently included in a submission, where GenAI is actually helping – right now I think we're using it to create those layers. Ultimately, we'll be able to use it to not have to provide so many layers and copies and redundancies. But I'm getting ahead, I know, of what we're going to talk about today, so somebody stop me.

David Turner

Yeah. So I would think another typical use case, everybody out there is probably there thinking "Well, when I type up my label or my protocol document in my Word document, I get the predictive text that Microsoft is – " Sure. So yeah, there are those use cases, and those things are happening. We're talking about the AI adoption lagging, we're talking about really the bigger huge things that companies are going to invest in. I would say, there's also a lot of use cases that are really beyond generative AI. We've got a couple of screenshots here of a tool that we have here at DCL, that is really built on NLP, it's AI-based, but it's not the GenAI that everybody thinks about.

Harmonizer goes and it looks at all your different pieces of content and identifies where you have duplicate content, potentially reusable content. It can also be used to compare really large groups of content together. So we've seen organizations use this where they say "We've got a certain piece of regulatory text. It's got to be in all of our documents and it's got to be exactly the same across all these documents." This runs through there, and it lets you know "Hey, this one out of 60 does not match up." So I certainly think that's there. I think there are some potential use cases around things like extracting data from a PDF. We've been doing a lot of tests on that, or training a model to look at a label or protocol in a PDF and be able to recognize the sections and pull those out. It's more than just text; it's really putting some context around that text.
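To make the "one out of 60 does not match" check concrete, here is a minimal sketch of the general idea. It is not how Harmonizer works; it uses only Python's standard `difflib`, and the function names, the sample label text, and the document IDs are all invented for illustration. It assumes each document is plain text with blank lines between paragraphs.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so layout differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_mismatches(required: str, documents: dict[str, str],
                    threshold: float = 1.0) -> list[tuple[str, float]]:
    """Return (doc_id, best_similarity) for each document in which no
    paragraph reaches `threshold` similarity to the required text."""
    target = normalize(required)
    flagged = []
    for doc_id, body in documents.items():
        # Treat blank-line-separated paragraphs as candidate passages.
        paragraphs = [normalize(p) for p in re.split(r"\n\s*\n", body) if p.strip()]
        best = max(
            (difflib.SequenceMatcher(None, target, p, autojunk=False).ratio()
             for p in paragraphs),
            default=0.0,
        )
        if best < threshold:
            flagged.append((doc_id, round(best, 3)))
    return flagged


# Hypothetical sample data, invented for illustration.
REQUIRED = "Store below 25 C. Keep out of reach of children."
DOCS = {
    "label_a": "Product overview.\n\nStore below 25 C.  Keep out of reach of children.",
    "label_b": "Product overview.\n\nStore below 30 C. Keep out of reach of children.",
}
mismatches = find_mismatches(REQUIRED, DOCS)
print(mismatches)  # label_b is flagged; label_a is not
```

Lowering `threshold` below 1.0 would tolerate near matches; for regulatory text that must be identical, keeping it at 1.0 and reviewing every flagged document by hand is probably the safer default.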

Regina Lynn Preciado

Yes, one use case –

David Turner

What are some of those things you've heard about?

11:57

Regina Lynn Preciado

Sorry, David. One of the use cases I've worked with is actually in two different places, but some teams that are working on the CMC content, so for the quality module three of the submission. Some of that manufacturing quality data is based on what everyone refers to as external reports, I'm sorry, internal reports. Even though they come to the pharma company from an external source, we call them internal reports because it's not just a pass through to a submission. Data comes in. Traditionally, writers are copying and pasting out of a PDF or maybe a Word document, but often a locked PDF, into the documents that ultimately go for the submission.

And there is a role here for AI, not necessarily generative, but NLP, which is natural language processing, and some NLU, natural language understanding, because the different manufacturers or the different partners that are sending in this data use different systems and they structure their data and information differently. So we're able to throw a system in there to find the types of data that we then need to put in our quality documentation. I think most companies right now still have a human in the loop looking at it, making sure, because of course when our sources are inconsistent, our results cannot be 100% accurate and consistent. But what it can do is make for a much faster process of finding this information, understanding that this information from here is in one place, and from here, it's in another place. It may be sort of one's in US and one's in UK English, but it's the same thing.

The machine can kind of harvest all of this, compare it, package it together so that your expert writers are not having to do that time-consuming part to get to the part where they really bring their expertise to the content to say "Yes, this is what we need in the submission and it needs to go here." Again, that's a CMC example. The similar process happens in safety, in clinical, and other documents. But again, these are some use cases beyond generative AI. I think generative AI is newer and shinier and it's where we're all getting our hands on it to see how this really works, but it's part of a larger ecosystem.

David Turner

Yeah. I've heard some really interesting use cases lately that I don't think would fall in the GenAI bucket, but that I thought would be really useful in this pharma space. One I heard about was an organization looking at AI as kind of a companion or a little tool, a little chat tool to maybe be an assistant to help somebody who's typing up a content operations procedure to be able to stay on track. "Hey, the next thing you need to write is this. The next part you need to do is that." Or in a similar vein, I've heard of an organization taking AI to look at all their different standard operating procedures and compare them to GCP and things like that.

Regina Lynn Preciado

And what people are actually doing to get us more lined up with the procedures we have all agreed and in some cases proven to the regulators that we're doing. Mm-hmm.

David Turner

And then there's the whole idea of, could it predict or recommend reuse? Somebody starts to type something up and it says "Hey, I recognize this paragraph from somewhere. Would you consider using that paragraph and bringing the whole thing up?" instead of having the person write the whole paragraph and create a duplicate.

16:00

Regina Lynn Preciado

And we know that one reason this is important, I was talking about the redundancy and the repetition of information because of different purposes written by different people at different times, reviewed by different people at different times.

David Turner

Yeah.

Regina Lynn Preciado

Every time we use a slightly different word or phrase to mean the same thing, we introduce friction for the reader.

David Turner

Yes.

Regina Lynn Preciado

They go "Oh, no, did that mean the same thing as before? I better double check." Every time we come up with a better way to say something than whoever wrote it a year ago that we've copied and pasted and now we're tweaking, we're actually introducing cost because we're introducing – somebody has to decide, is this the same meaning? And I think about, David, you mentioned labels. Labels are a good example because everybody knows labels eventually get localized for – the printed-out label that comes in the box is localized into many different languages. Every time somebody tweaks the content and introduces different wording, it has to be re-translated or translated differently.

David Turner

Yes.

Regina Lynn Preciado

So there are costs to every copy edit, which is painful for a lot of us who do just naturally want to start with something and copy edit. The other reason for this consistency is findability. If you're a reviewer and you're trying to find – "Okay, I want to go find the details. I clicked a cross-reference, and now I need to go back to where I was and I'm going to find – " and we use different words, you can't find it. So again, AI is very helpful in the searching because through natural language understanding, the AI can start going "All right. Well, over here we said this and over here, we said that." And the person who's doing the search really wants to find that, and it's the same thing.

You'll see this in your companies where people are rolling out internal search engines to search the big blob of inconsistent information that every 100-year-old enterprise has. And you'll see that it's not a hundred percent because the blob it's searching is not a hundred percent, but it's getting us closer. At least we feel like we found something faster and then we have the opportunity to start training it and go "Yeah, that's what I wanted. No, it wasn't." Or "Oh, look, we have 15 versions of this information."

David Turner

Well, we could probably come up with a dozen other cool ideas for this where we could go, and I think about identifying candidates for a drug trial. I think about some applications around drug fraud. I'm going to throw this actually back out to our audience with another quick poll. Marianne, you can throw the poll up there. I think it's coming. There we go. So just out here to the group, are you currently working on some sort of AI initiative related to content? I'd be interested to see what the results are. Again, we're talking about the bigger things, not the fact that Microsoft Word is predicting text in your document, but something a little larger and something requiring a company investment, if you will.

Regina Lynn Preciado

Yeah, an enterprise level or pilot. And I'm thinking too, in life sciences in particular, content is an interesting word because together, data and I'll call it prose, sentences, make up the content. In other industries, there's not quite as much convergence of data is content and content is data, so.

David Turner

Interesting on the poll results. Are you surprised by this?

Regina Lynn Preciado

This aligns with the people that I've talked to in the industry, that there's a lot of companies working on it, looking at it.

20:01

There's some, I don't want to say hesitation, but there is some awareness around "We're working on it, but we can't really launch it into submission documents until we get our full guidance from regulatory bodies." The EMA has put out some guidance. ICH is working on some guidance. The FDA I think recently said "Hey, we're just going to go with what ICH decides." But I have yet to see that firming up of the decision of we are okay and we are in compliance, but that's great to hear that people are really looking at it.

David Turner

All right. So we've talked now about these use cases that are out there, so we don't really think that's why it's lagging. We're now going to talk a little bit about, is it the lack of value? And I think you're going to quickly see that the answer to that is also no. I wanted to bring up just a couple of things that I have seen in some recent webinars and talks and reading that I've done that I think are pretty potent for this topic. The first one here is something from McKinsey, and you can see here is the link down here, but it says that AI could generate $60 billion to $110 billion a year in economic value in this industry.

That's a staggering number to me. I would encourage everybody to check out the report. We'll put it on a document at the end where you can get that link so you don't have to try to quickly type that in there. Another one I saw was from IDC and it said "For every dollar the companies are investing in AI, they're realizing $3.50 in return." So three and a half times what they're investing; that's substantial. And then there's a small group out there that's realizing something like $8 in return. So I think the benefits are huge and very promising. And I think –

Regina Lynn Preciado

Anyone on this call or on this webinar who is working for these AI companies has just gone "Oh man, we need to raise our prices." That's not the point we're trying to make here.

David Turner

It's definitely got value, and I think it's got value not just in terms of financial. I looked at PubMed's most recent reporting year, which was the year 2022. When you run the numbers, basically a new research paper was accepted there on PubMed every 18.4 seconds. So in the time we've been talking, how many research papers have been added? It's just impossible to keep up with that kind of information, and it's just getting more every year. So I think that AI is going to be really, really, really critical. So with that, I'm going to jump to the next slide, which is, so why is it lagging right now? I think you started to hit on some of this before. So what are your thoughts on why these big enterprise investments haven't really progressed a long way yet?
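As a quick check on the scale David describes, the arithmetic can be written out. This is only a back-of-the-envelope sketch: the 18.4-second figure is the one quoted in the talk, the non-leap-year length and the 22-minute elapsed time are assumptions chosen for illustration, and the names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # assumes a non-leap year
SECONDS_PER_PAPER = 18.4                # figure quoted in the talk


def papers_added(elapsed_seconds: float) -> float:
    """Papers added in a time span at one paper per SECONDS_PER_PAPER seconds."""
    return elapsed_seconds / SECONDS_PER_PAPER


papers_per_year = papers_added(SECONDS_PER_YEAR)
papers_in_22_minutes = papers_added(22 * 60)  # roughly the webinar time so far

print(round(papers_per_year))     # 1713913, about 1.7 million a year
print(int(papers_in_22_minutes))  # 71
```

So the quoted rate implies roughly 1.7 million papers a year, or about 70 in the time the speakers had been talking.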

Regina Lynn Preciado

Yeah, so this is interesting because I feel – in my experience, right around the time OpenAI launched ChatGPT to the public and it became the thing and you can't turn around without running into AI anymore, there was a perception that AI could solve everything, and you just install it and you have a solution. Now, a couple years later, that was, what, end of 2022, a couple years later, a lot of the companies, particularly in the content development organizations, the medical writers, the technical writers, and the CMC, labeling, and safety report people, are seeing that actually no,

24:01

it takes some work to get the AI trained and ready and useful. So I think the lagging that we're seeing is that sudden realization that AI shows the worst of our stuff and makes it faster. So every inconsistency, every unconscious or conscious bias in historical content, everything we're using to tell the AI "Go crawl this, and from this, make a summary" it's exposing.

Everyone's done the best they could for a long time, but the best you can do in word processing and document control and basically doing a digital version of handwriting on paper is very limited from a digital perspective and a data perspective. So I think what we're facing in the industry is this realization that just like going to a component-based structured content type of system to facilitate data-driven submissions in the future, all those same things that at the time some companies went "No, we're fine, we're just going to keep being in documents," all those same things kind of get you when you're installing AI. So you're limited to... I think this is so funny that I'm saying "limited to" for this brand new emerging tech. You're limited to...

My GenAI Copilot is sitting here, and it's suggesting some reason it's writing first drafts and it took my stream of consciousness and made it a bullet list. It more usefully maybe went out and extracted some data that I need, or, David, to your point, suggested "Hey, this was written before, why don't you just reuse what was written before?" So in conclusion, I think we're facing that there's a lot of work humans need to do, some of which we have some really good software tools to help us with, to really make the AI solutions viable for content development. Some companies are there already, many companies are on their way. I think the more conversations people are having, the more it's helping everybody, sort of a rising tide, floating all ships situation.

David Turner

Yeah. I think your company has some sort of a saying around this. It's like, if you have crummy content and you invest in expensive technology, you end up with – I don't know exactly how it goes. But you have really expensive crummy content or something like that.

Regina Lynn Preciado

I'm not saying anybody, especially in this webinar, is writing crummy content. It is more that feeling of, at scale, humans working in individual documents even with document sharing and trying to be in a single source compared to a component-based and structured and much more thinking about units of information rather than a full-on paper. You just run into some barriers.

David Turner

So I think if I were to put what you just said into just a couple of bullets, I'd say number one reason is, in some ways the content's not ready. The content, it's in silos in these different groups. It's not consistent in how it's used, it's not been consistently applied. It's very much locked into these documents. There's also this whole idea that this industry depends on accuracy, and for these reasons, GenAI is not accurate enough for really final use. And then I think also one thing you mentioned earlier but I think is also really key is, I think there's some policies and some regulation,

28:00

health authorities, et cetera, that as we await those, companies need to be careful and want to be careful and I'm glad that they're careful. So anyway.

Enough with that; let's move on here and let's talk about what you can do to start working on AI besides wait. And I would say one of the first things that you can do is start preparing your content, maybe start addressing some of these big content issues by moving to a structured content type of a system. Before we get too far, we probably do need to define our terms. So I put together just a quick little "what is structured content" slide, and at that point I'll just be quiet and let the expert talk. Regina, talk to us about what is structured content and then talk about structured content authoring versus structured content management and all those terms.

Regina Lynn Preciado

Yeah. And I keep bringing this up. So when we talk about structured content, we are talking about having individual units of information, which we often call a component. I usually call it a component. You'll hear "topic," you'll hear "unit," you'll hear "building block," but a component of information. Pharma content is structured in the sense that there are templates that are documents with outlines, and the industry has agreed on a lot of standardization to a certain level. Section five of a clinical study report is always about section five of a clinical study report, that kind of thing.

If you didn't have information for a particular section in a label, you just skip that section. You do not take section 16 and move it up to 15. So it's taking that kind of concept much more granular, whereas this is my component of information about, I don't know, a drug-drug interaction, and that's it. There's no extra information, it's just about that. And then it becomes very reusable. So you end up with kind of a library of individual bits of information that you can then assemble into many different outputs. I often work with regulatory content, so I'm often thinking about submissions. I've also worked in MedInfo. It's a great use case for standard response letters, really. Writers can write these things faster. Once you get trained and learn how to do it, you can assemble faster and the machines can really help us. The structured part is, there's some markup that the machines understand in the background that says it's metadata or tags.

I don't know how many folks on this call work in this area, but you can tag the content so that this is a drug-drug interaction, this is about a demographic population group, so that the machines can more easily search, retrieve, and assemble. And in the case of AI, it helps the AI know right away, when you're training it, what each piece is. And then when you start searching and retrieving with the AI, you're not getting the wrong information even if you've searched a term that could be used in many different ways. So that's structured content. I think I get asked a lot "Well, now that we have GenAI, do we need structured content?"
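Here is a small sketch of what Regina describes: components that carry a content type and tags, so that a search for an ambiguous word can be narrowed by metadata before the text is even considered. Everything in it is hypothetical, including the `Component` class, the type and tag names, and the sample sentences. A real component content management system would typically express this through its own schema and query tools rather than a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    component_id: str
    content_type: str          # hypothetical type label, e.g. "drug_drug_interaction"
    text: str
    tags: frozenset = frozenset()


def search(library, term, content_type=None, required_tags=frozenset()):
    """Return components whose text contains `term`, optionally narrowed
    by content type and by tags that must all be present."""
    term = term.lower()
    return [
        c for c in library
        if term in c.text.lower()
        and (content_type is None or c.content_type == content_type)
        and frozenset(required_tags) <= c.tags
    ]


# Hypothetical library, invented for illustration.
LIBRARY = [
    Component("C1", "drug_drug_interaction",
              "Co-administration with Drug B may increase exposure; monitor for this interaction.",
              frozenset({"adult"})),
    Component("C2", "population_subgroup",
              "No interaction between age group and treatment response was observed.",
              frozenset({"pediatric"})),
    Component("C3", "drug_drug_interaction",
              "No clinically relevant interaction with Drug C was identified.",
              frozenset({"pediatric"})),
]

everything = search(LIBRARY, "interaction")
ddi_only = search(LIBRARY, "interaction", content_type="drug_drug_interaction")
ddi_pediatric = search(LIBRARY, "interaction",
                       content_type="drug_drug_interaction",
                       required_tags={"pediatric"})
print([c.component_id for c in ddi_pediatric])
```

The plain text search returns all three components because "interaction" is used in two senses; the metadata filter is what separates the statistical sense in C2 from the drug-drug sense in C1 and C3.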

David Turner

And that is exactly where I was going next. That is probably the single biggest comment that we get, is I get people that'll say to me "Oh, well, are you guys going to be out of business? Because we've got AI now. Doesn't that eliminate the need for structured content?" And I think I know what you're going to say. I think the answer is...

Regina Lynn Preciado

No!

David Turner

No!

Regina Lynn Preciado

It elevates the need for structured content. And here's one example –

David Turner

Absolutely does.

32:00

Regina Lynn Preciado

So we've talked about training and training the large language model because, of course, we're not putting our proprietary intellectual property or our patient information out into public.

David Turner

Right.

Regina Lynn Preciado

Everyone's obviously working inside your own firewalls on your own large language models. Because you can train the AI faster because you're giving the AI structured consistent findable information, you're reducing the incidence of the AI just making stuff up, which the technical term for this is hallucinations. It does require less processing power because your enterprise AI is not having to comb through a blob of who knows what. It knows from your very structured information. The information's more predictable so the machine can make better predictions.

David Turner

The example I use with that is always around Lego bricks. So my boys and I sometimes like to work on these big Lego projects, and we created at Christmastime a whole big Nativity scene. When we started, we had just a big bucket with a bunch of Legos in it, and the time it took to find a gray two-by-two piece, sorting through that, was pretty substantial. But then when we stopped and in the downtime divided everything up basically by size, so we had a big tub of just two-by-two pieces, think of how much faster it was when we said "Hey, we need a gray two-by-two." I could just go to this two-by-two tub and find one that's gray as opposed to looking in a great big bucket there. I think it works the same way with AI. It takes so much more processing power to pull out the right information from this mass as opposed to a nicely structured ordered group.

Regina Lynn Preciado

Yes, and the nice thing is, you can be structuring, tagging, organizing your content starting right now, well, after this call, in preparation for however your company is going to use AI in the content's development and management. I will say a reason. So AI, particularly generative AI is transitory content, right? It's looking at what you had in the past and it's generating something new. If you just do that in your word processing document and you have it in that document, it's still wrapped in that document that has to be managed as a whole document. And if you need to find the information, you have to search within whole documents, find that little area.

It's probably not tagged in any particular way to show "Hey, this is the section about, I don't know, the pediatric study for this." Whereas when you have a structured content ecosystem, every component is, we call it persistent content. This content has a full audit trail. Every change ever made to that unit of information is tracked and has your name on it and the timestamp of when you changed it. You have traceability. If you are reusing content, you can see where that content appears in other outputs like other documents. If you have integrated some data into that content, you can trace, okay, this component pulls data from this source. You don't get that with the generative AI piece. That's more like a chatbot drafting an answer that's good for now or we hope it's good for now,

36:01

or maybe a first draft that you're writing, maybe someday a final draft. So the two kinds of technologies work really well together. The AI technology can help you search through your components to find the one you need to reuse or update. The generative AI can search through your components to generate content for you. But the generative AI is, it's not managing the content, it's not keeping an audit trail, it's not providing traceability, it's not versioning. It's not even reusing, it's sort of creating derivative. It's creating more content. So if you're able –
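The "persistent content" properties Regina lists (an audit trail with author and timestamp, versioning, and where-used traceability) can be pictured with a small sketch. This is an illustration under assumptions, not a description of any particular content management system; the class names, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    text: str
    author: str
    changed_at: datetime


@dataclass
class PersistentComponent:
    component_id: str
    revisions: list = field(default_factory=list)  # full history, oldest first
    used_in: set = field(default_factory=set)      # outputs that reuse this component

    def update(self, text: str, author: str) -> None:
        """Record a new version; earlier versions are never overwritten."""
        self.revisions.append(Revision(text, author, datetime.now(timezone.utc)))

    @property
    def current_text(self) -> str:
        """Latest version; assumes update() has been called at least once."""
        return self.revisions[-1].text

    def audit_trail(self) -> list:
        """Who changed what, and when, for every version."""
        return [(r.author, r.changed_at.isoformat(), r.text) for r in self.revisions]


# Hypothetical usage, invented for illustration.
comp = PersistentComponent("storage-statement")
comp.update("Store below 25 C.", author="writer_a")
comp.update("Store below 25 C. Protect from light.", author="writer_b")
comp.used_in.update({"label_us", "label_eu"})

print(comp.current_text)
print(len(comp.audit_trail()), sorted(comp.used_in))
```

A generative tool can read from a store like this to draft text, but the history and the where-used list live in the store itself, which is the division of labor the speakers are describing.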

David Turner

And it's not publishing content either. It's not really doing any of those kinds of things that really structured content can do. I think the whole concept around the persistent content is so critical, and I think identifiers that you can add to structured content make it just that much more useful that AI can't do on its own. Example I think of there is, when I was in high school, I remember getting called down to the principal's office, and I had no idea what was going on. They said "Well, this is about all those detentions that you've been skipping." And I was like "What?" Well, as it turns out, there was another David Turner. Story of my life.

If you look at LinkedIn, there's four or five Regina Preciados, there's over 4,000 David Turners. So when your GenAI tool is just looking in this big mass of content, knowing who the right David Turner is, it's hard to know those details. But if you can put some identifier on it, we're looking for this particular David Turner with these characteristics, it's going to do much more to ensure that you've got the right piece there. So I think it definitely enhances, I think it helps it go beyond. And as you were saying there, structured content also works closely in conjunction with AI. And we had talked earlier this week about all of these things, which is by no means an exhaustive list, but I guess the point is, structured content can take you to the next level.

Regina Lynn Preciado

Yes, and I think that a reassuring thing is, again, the things we do, the tasks we do to take a team from word processing and thinking in documents to adopting a new mindset about units of information, which again, these documents to a certain extent follow similar outlines. So it's not as new for medical and pharma writers to think about this as it is for some other industries, except that people with the academic and scientific background are just, you're so steeped in academic papers and scientific journal articles and that type of writing. It takes a moment to kind of go "Oh, but we're going to do it this way now."

However, the same things we do to identify content by its type and purpose to create those modules or components of content, to tag them with metadata consistently, this is not making up your own hashtags as you go. This is developing standards that everybody, all the writers, whether they're human or AI or a combination, are following the standards to bring consistency, to make things automatable, to make things findable. It's the same things that we do. So if you're thinking "Oh my gosh, we're working on our AI pilot, we took all of our attention onto AI. We're not going to look at structure or vice versa. We're halfway through our structured content transformation and now this AI is coming in, what do we do?" It's the same things that you do.
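One way to picture "standards, not made-up hashtags" is a controlled vocabulary that every tag is checked against. The sketch below is a minimal illustration; the field names and allowed values in `ALLOWED` are hypothetical and would come from your own metadata standard.

```python
# Hypothetical controlled vocabulary: field name -> allowed values.
ALLOWED = {
    "content_type": {"indication", "dosage", "warning"},
    "audience": {"patient", "clinician"},
}


def validate_metadata(tags: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the tags follow the standard."""
    problems = []
    # Every required field must be present and use an allowed value.
    for field_name, allowed_values in ALLOWED.items():
        if field_name not in tags:
            problems.append(f"missing required field: {field_name}")
        elif tags[field_name] not in allowed_values:
            problems.append(f"invalid value for {field_name}: {tags[field_name]}")
    # Fields outside the standard are flagged rather than silently accepted.
    for field_name in tags:
        if field_name not in ALLOWED:
            problems.append(f"unknown field: {field_name}")
    return problems


print(validate_metadata({"content_type": "dosage", "audience": "patient"}))
print(validate_metadata({"content_type": "Dosing!", "hashtag": "x"}))
```

The same check applies whether the tags were written by a person or proposed by an AI tool.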

39:56

David Turner

Absolutely. So if you haven't heard anything in this presentation yet, we're trying to get across that there are a lot of use cases for AI, and there is a tremendous amount of benefit ready for your organization, but the things that are going to slow you down are when you start looking at your content and the underlying problems with your content. If you can start working on a structured content type of approach to work hand in hand with your AI, it's going to help you take advantage of it and realize those benefits that much faster.

Regina Lynn Preciado

Yeah.

David Turner

So with that in mind, let's talk a little bit about how do you get started with this whole idea of structured content to really start getting you ready for these future AI applications. Are there some tangible things that you can do? I would think one of the first things is, it's a really good idea, as you start out with any plan, to start thinking about strategy. So Regina, talk to us a little bit about structured content strategy, what goes into that, what they should be thinking about as they're starting to align this AI piece into that strategy.

Regina Lynn Preciado

Yeah. Again, your structured content strategy has several components. The content models are about, this type of content serves this purpose and we're going to write it in this way, include this information in this order. The metadata is, these are the tags we're going to put on our content that everybody is going to put on the content because, again, consistency brings you findability, accessibility, interoperability, and reuse. Consistency for humans and machines is everything. You need to have a reuse strategy. People sometimes get very excited about content reuse, and they start reusing all over the place and actually end up tangling their content into virtual knots. Other people don't believe they have reuse because of a 20-year career of copy, paste, and tweak. Or of course, I'm not talking about where you would always generate new content, I'm talking about where you're repeating. So a reuse strategy is about what are we reusing and where.
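The difference between reuse and copy, paste, and tweak can be sketched as documents that reference components by ID instead of holding their own copies. Everything here is hypothetical: the component IDs, the document names, and the sample text are invented for the illustration.

```python
# Hypothetical component store: one source of truth per component ID.
COMPONENTS = {
    "dose-001": "Take one tablet daily.",
    "storage-001": "Store in a cool, dry place.",
}

# Documents reference components by ID instead of copying the text.
DOCUMENTS = {
    "patient-leaflet": ["dose-001", "storage-001"],
    "clinician-summary": ["dose-001"],
}


def assemble(doc_name: str) -> str:
    """Build a document's text from its referenced components."""
    return "\n".join(COMPONENTS[cid] for cid in DOCUMENTS[doc_name])


def where_used(component_id: str) -> list[str]:
    """Report which documents reuse a component (the 'what and where' of reuse)."""
    return sorted(name for name, ids in DOCUMENTS.items() if component_id in ids)


print(where_used("dose-001"))
print(assemble("patient-leaflet"))
```

Because both documents point at `dose-001`, a change to that one component shows up in each of them the next time they are assembled, and `where_used` tells you in advance what a change will touch.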

If you plan ahead, then people have much better adherence to the standard because the standard is clear. Taxonomy is how you're going to organize and classify the content, and then workflows. Getting people working in – people will have their own business process, but the content itself can be tracked through a workflow. Typically, going straight from first draft to final is not enough detail in your workflow. If your components walk through a workflow from, I don't know, draft, to review, to ready, to published, you get a lot of power in your audit trail and in your reporting and in everybody knowing the state of every type of content. And again, starting with this kind of plan, then you're going to move to transforming your existing content and/or writing your new content in this new way.
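A component workflow with an audit trail can be sketched as a small state machine. The state names follow the ones mentioned above; the allowed transitions, the `Component` class, and the user names are assumptions made for this illustration, not a description of any particular content management system.

```python
from dataclasses import dataclass, field

# Hypothetical workflow: each state maps to the states it may move to next.
TRANSITIONS = {
    "draft": {"review"},
    "review": {"draft", "ready"},  # a reviewer can send a component back
    "ready": {"published"},
    "published": set(),
}


@dataclass
class Component:
    component_id: str
    state: str = "draft"
    # Each entry records (from_state, to_state, user).
    audit_trail: list[tuple[str, str, str]] = field(default_factory=list)

    def move_to(self, new_state: str, user: str) -> None:
        """Apply a transition if the workflow allows it, and log who did it."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        self.audit_trail.append((self.state, new_state, user))
        self.state = new_state


component = Component("dose-001")
component.move_to("review", "writer-a")
component.move_to("ready", "reviewer-b")
component.move_to("published", "publisher-c")
print(component.state)
print(component.audit_trail)
```

Skipping a step, such as going straight from draft to published, raises an error, which is what makes the state of every component reportable.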

These are the same things you would do when you're bringing AI into your strategy. Some companies, because the attention has been on AI, have thought about this the other way. They're using pre-existing models maybe from some of the big tech companies like Amazon or Microsoft, or they've got sort of bespoke AI models they're working with through other vendors or even homegrown. But they're looking at the AI strategy. What do we need to clean up? What do we want to get out of it? Remember to start with, what is our business goal? Is it to be faster? Is it to be more accurate?

44:00

Is it to contain cost? Is it all three? And then they're starting with the AI focus doing these things and then kind of looking like, well, you're doing all these things, what a perfect time to structure the content with whatever tools you need to purchase or configure differently to support.

David Turner

Regina probably won't toot her own horn here for her organization, but this is the kind of thing that Content Rules can really work with you on, helping you stop and take a look and think about things like "What is our taxonomy going to be? How can we maximize our taxonomy in light of this AI strategy that we're trying to take on? What is it that we need to do in our metadata that's going to allow us to get the most out of it?" So talking to a good consultant, a good partner like Content Rules, is really a good step and something that you should consider before you jump in headlong. And I think another point, and I captured this from something you had said in an earlier presentation, what makes content better for humans makes it better for machines. Can you elaborate a little on that?

Regina Lynn Preciado

Yeah. This goes back to anytime you're searching for something as a human, you just want some information about something. I don't know, your elbow pain, you're searching elbow pain, it's the middle of the night, you have elbow pain, and you get a whole bunch of search results from the search engine and you don't know which ones to trust. And now there's the GenAI. Google has the generative AI experimental result where they're pulling from different sources and giving you some interesting wrong step-by-step information of things most of the time.

So it did not make content better for humans yet to throw that GenAI on there. But if the content we're searching were structured and tagged and focused, then it would be better for us humans. Even without all the technology, you're handwriting a book and you want humans to be able to find information in the book, not a novel, non-fiction book with information. You do things like a table of contents and index, running heads on the top and the bottom that have – if it's dictionaries, the first word on this page is up here, the last word on this page is up here. We have all kinds of conventions around navigation that are about finding information and keeping information focused, tagged in a meaningful way consistently. That's what automation depends on, consistently. Humans depend on the meaningful.

So again, if nothing else, I want to do this as kind of a reassurance but also kind of a wake-up because I think a lot of people don't understand this, that taking your legacy content and throwing it into these fancy new tools doesn't get you the result, and your technology investment in the new tools, investment of time, money, energy, and fear. Content creators are really worried right now and are needing to be looking at "Well, actually this is taking over some of the more rote repetitive work that I do. Where does my human expertise really come in?" And I'm going to be doing that more than I'm doing these repetitive tasks. Anyway, sorry.

David Turner

So thinking about the way you write in light of these future AI applications, I think, is huge.

48:02

I also suggest people think about accessibility. If you think about accessibility and people with special accessibility needs as you are writing content, whatever content you're writing, it's going to make your content better in whatever format and it's going to make your content better for machines to read. So that's just something to keep in mind. Anyway, we're starting to get close to the end here, so let me just cover a couple of quick things.

Another thing that you can do to get started as you're thinking about an AI implementation, things like that. Content Rules does have an ebook resource here, and you've got the QR code if you want to go to it now. We'll also include this link with a list of resources, and I would say, most people say "All right, start looking at resources." Well, you can start looking at resources. We actually are going to tell you, as you're getting ready now, as you're thinking about this, as you're going about your strategy, definitely start exploring. I'm going to put together a list of white papers and articles. I've seen some out there already, but we're kind of getting close to the end of time, so I'm not going to go into too much detail.

I would encourage you also to start exploring some of the AI life sciences conferences. There's one in London, Digi-Tech Pharma and AI. I think that's at the end of this month. That's one that you might want to consider. There's a Pharma AI summit. It's not specifically for documentation, but that's coming up in September. There's another called AI in Pharma that's in October. I'll put links to all of those in there as well. I know, Regina, you're going to be speaking at DIA again I think this year. Is that right?

Regina Lynn Preciado

I'm chairing a session where, specifically, speakers from Sanofi, Eli Lilly, and Novo Nordisk are sharing their use cases of how they have implemented AI in their medical writing processes in different, actually I think it's all in clinical, but with different levels of what they're doing. And they are being very transparent about the challenges they faced, how they overcame them, and the metrics, the results they're getting. So if you are going to the Drug Information Association global meeting in San Diego next month, starts on June 16th, June 17th, there's a whole track about AI that is content focused. It's been very interesting. I can't wait until this is public and we can talk about it.

David Turner

Yeah, I love it. If you go just look at the – well, I'll include these again in that market, but there's things like accelerating time to market with smart structured content reuse and AI. There's AI and life sciences promise or peril. There's decoding patient journeys at scale through clinical AI. There's a ton of AI related sessions this year. I think you should also start exploring some of the AI tools and technology that are out there. You probably don't have to look for them. Probably somebody is already emailing you and suggesting that you look at them. So I would certainly recommend you do that. And I'll include a couple of resources that I know about. My new favorite one in this space is, you remember the old, I think it was Apple that did it several years ago. It was like, there's an app for that. Remember that whole marketing series?

Regina Lynn Preciado

Mm-hmm.

David Turner

Well, somebody's come up with a website called There's An AI For That. I think it's just theresanaiforthat.com, and it's essentially an AI tool to help you find other AI tools. So I think that's kind of awesome.

51:58

Regina Lynn Preciado

Welcome to the metaverse.

David Turner

Yes. All right, so let me just hit – oh, and if you want these resources afterwards, shoot me an email that says "Hey, I want those resources" and I'll get them packaged up and sent out. It'll probably be the first part of next week. So yeah, just email me, here's my email address, and we'll take it from there. So before we get to questions, just a really quick wrap up here for you. So what we're trying to get across to you is that, yes, that the enterprise scale type AI adoption things have not quite maintained the same pace as what's going on on the research side of the house, but they are coming. There's a lot of great use cases. There's a tremendous amount of upside there.

But the truth is, there are some issues that remain, that have to be resolved, whether that's issues with your content and optimizing that content. Just because you have a data lake, that doesn't mean that it's a useful data lake. And there are going to be some things that are going to change with health authorities. But there are some things that you can do today. You can get started really by implementing structured content, improving your structured content, implementing a structured content strategy, reuse metadata taxonomy. Get things cleaned up, start writing with this in mind, and you'll be ahead of the game once everything is ready to go. Before I jump into questions, any other last-minute comments and summary from you, Regina?

Regina Lynn Preciado

I thought that was a great summary.

David Turner

Excellent. Well, with that then, let's bring Marianne back on. I think we got five minutes here, Marianne.

Marianne Calilhanna

Yeah.

David Turner

What questions did we get?

Marianne Calilhanna

Okay, so regarding structured content, is there an industry standard markup that's being used?

Regina Lynn Preciado

Interesting question. So there are industry standard, I'm using air quotes, I don't know if you're looking at the video, I'm using air quotes, industry standard markup languages for structured content in general.

David Turner

There are a lot of them.

Regina Lynn Preciado

Well-known ones, for just general structured content, are DITA, D-I-T-A; DocBook, doc like document and then book, all one word; and STE I think is the other one, or STS. Oh my gosh, sorry.

David Turner

Yeah, it's the standards, STS.

Regina Lynn Preciado

In pharma, there are some standards coming out of CDISC for markup of all kinds of medical. It's the unified study data model at CDISC that has some markup in different schemas for clinical and medical information. There are a million standards for everything, and we see that FHIR, the F-H-I-R, HL7 FHIR is being adopted as the messaging standard, which is the interoperability for a data-driven submission we're going to send from our company to a health authority.

That seems to be the one that has won the standards game, in part because I think a lot of content people don't worry about APIs and the integration of technology. So that was easy to adopt, whereas the content markup has been harder to adopt. There are different initiatives at ICH and CDISC coming up with some XML markup standards.

55:57

A very easy-breezy way to say this is, almost anything that's marked up in a consistent XML structure can be transformed into almost any other XML structure. Any developers on this call just went like this. But from a content perspective, that's almost true.

David Turner

What I tell people is that there are a lot of delivery standards. This group, for this purpose, you need this. For that purpose, you're going to go to the FDA, you've got to have SPL XML. We know if you're going to do an ePI submission, you need FHIR. You look it up, there are 16 different things on the CDISC website. That doesn't necessarily mean that you want to write your content in all of those different models. A really good way is to figure out a good central, general-purpose model, like DITA, and then you can create basically what you call transforms, where we can then transfer. Okay, so I've written this ePI and I've tagged it this way. So now I press a button and it creates the FHIR XML that's needed. It creates the SPL XML that's needed. It creates the HTML that's needed for our website, so on and so forth.
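The "one central model, many transforms" idea can be sketched with Python's standard library. This is a toy illustration under stated assumptions: the source XML is a made-up, DITA-like fragment rather than a valid DITA topic, and the two outputs are plain HTML and a simple JSON record, standing in for real delivery formats such as SPL or FHIR, whose actual schemas are far more involved. In practice such transforms are often written in XSLT.

```python
import json
import xml.etree.ElementTree as ET

# A tiny, made-up source component in a DITA-like shape (not valid DITA).
SOURCE = """<topic id="dose-001">
  <title>Dosage</title>
  <body><p>Take one tablet daily.</p></body>
</topic>"""


def to_html(topic: ET.Element) -> str:
    """Transform 1: render the component as an HTML section."""
    section = ET.Element("section", {"id": topic.get("id", "")})
    ET.SubElement(section, "h1").text = topic.findtext("title")
    for paragraph in topic.iterfind("body/p"):
        ET.SubElement(section, "p").text = paragraph.text
    return ET.tostring(section, encoding="unicode")


def to_json(topic: ET.Element) -> str:
    """Transform 2: render the same component as a JSON record."""
    return json.dumps({
        "id": topic.get("id"),
        "title": topic.findtext("title"),
        "paragraphs": [p.text for p in topic.iterfind("body/p")],
    })


# Parse the single source once, then "press a button" for each output.
topic = ET.fromstring(SOURCE)
print(to_html(topic))
print(to_json(topic))
```

The content is written and tagged once; each delivery format is just another function over the same source.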

Regina Lynn Preciado

Yes.

David Turner

So if we can answer more questions. Sorry, I'm on one question.

Marianne Calilhanna

Okay. So –

Regina Lynn Preciado

Go ahead.

Marianne Calilhanna

That's okay. I was starting to type a response to someone and it got away from me. Someone asked if this is being recorded. It is being recorded. We'll be sharing it in the on-demand section of our website, and for everyone who's here, we'll be sure to send out a reminder once it's up on the website so you can share it with your colleagues. Okay, let's see if we can answer this one in about one minute. "How can I figure out a content model? My docs are all over the place."

Regina Lynn Preciado

Oh, email me.

David Turner

Oh, that's easy. Call Regina.

Regina Lynn Preciado

That's a long answer. I have some blogs and papers and things I could share with you on that, but I'm also happy to talk about it with you, seriously. So email me. My email is somewhere on this, but it's reginap@contentrules. I'm all over the place. Google me, there's only four of me on LinkedIn. Happy to help.

David Turner

All right.

Marianne Calilhanna

Okay, so –

David Turner

Marianne, I think that brings us up to you then.

Marianne Calilhanna

We are at the end of the hour. Thank you so much for taking time out of your day and staying with us during this conversation that we think is really important. It's really important to distinguish what we mean when we talk about AI and where it fits into different organizations and their workflows. So just to let everyone know, as we wind up, the DCL Learning Series comprises webinars like this, a monthly newsletter, and our blog. You can access many other webinars related to life sciences, pharma, content structure, and XML standards at dataconversionlaboratory.com in the on-demand webinar section. We sure do hope to see you at future webinars. Have a fantastic rest of the day. This concludes today's broadcast.

Regina Lynn Preciado

Thank you, Marianne and David. This was great.