DCL Learning Series
It’s Really Time to Get Serious About Accessibility
Marianne Calilhanna
Hello everyone. Welcome to the DCL Learning Series. My name is Marianne Calilhanna and I'm the VP of Marketing here at Data Conversion Laboratory. Today's webinar is titled "It's Really Time to Get Serious About Accessibility." And this is a Lunch and Learn format in which our speakers today will be talking about where we are in terms of accessibility in publishing workflows. So I'm thrilled to introduce Bill Kasdorf. Bill is Principal of Kasdorf and Associates. He's an expert on accessibility, XML content modeling and specification, standards alignment, and publishing workflows. And if you're here, it's probably because you knew Bill was speaking today.
We also have Mark Gross. Mark is President and Founder of DCL. He established DCL in 1981 and is an industry leader in structuring content and digitization. So before I release it over to Bill and Mark, we are going to launch a quick poll. My colleague Leigh Anne is going to launch that for us. Leigh Anne, could you launch that poll?
So what we would like to understand is, in what areas are you actively working on accessibility within your organization? You can select the answers that apply or that are most closely related to where you are. And we just kind of like to see where you are at the beginning of the conversation, and then we're going to ask a question at the end. Leigh Anne, I do not have visibility into if people are answering. Can you see if –
Leigh Anne Mazure
Yes, there are answers coming in.
Marianne Calilhanna
Great. I'll just leave that up for another few seconds and then we can close the poll and then share the results with everyone, just to sort of gauge where our industry colleagues are on this topic. So Leigh Anne, how about you go ahead and close that poll and share results.
All right, interesting. So the majority of you here today are starting your journey toward making content accessible. Then it looks like the second area is implementing content structure earlier in publishing workflows, auditing internal and external systems, followed by reliable workflows for image descriptions. And the last one is structuring all books in EPUB3, which I'm sure Bill and Mark are going to be speaking about a little bit. Okay, we can close that. And Bill, Mark, I'm going to turn it over to you. Thank you.
Mark Gross
Okay, thank you Marianne. I'm delighted to have Bill here on this program. It's always a pleasure to have you, Bill, and to get a chance to talk to you.
Bill Kasdorf
Pleasure to be here. Looking forward to it.
4:00
Mark Gross
So first, you've been involved in this area for a lot longer than me. I think I've been involved for about eight or nine years, since we first started actively working on accessibility for all kinds of materials. You've been doing it for a while longer and you're very involved in committees and stuff. Maybe just give us a synopsis of what you've been up to lately and what's interesting out there and what you're working on.
Bill Kasdorf
Sure. And I assume you're saying just in the context of accessibility, which even though I do a lot of different things in my consulting, I used to go through my elevator pitch. I said this, this and this and accessibility. And I realized at least a couple years ago, wait a minute, all of that involves accessibility. So I now don't have any consulting engagement that doesn't to some degree have something to do with accessibility. Sometimes it's 100%, like helping to assess a publisher's workflow for their ability to produce accessible books or checking their website for accessibility, blah, blah, blah.
But as you probably know, Mark, I'm involved in a lot of different standards organizations, and one of the things that I find very encouraging is, I'll have to say that I've actually been advocating for accessibility for 20 years, or actually more than 20 years, but for the first couple decades didn't get much traction. It's like, why aren't people paying attention to this? This is really important. But I'll say about maybe three, four or five years ago, suddenly it started to get pushed to the front of people's attention. And particularly for book publishers, nowadays, the European Accessibility Act has got everybody paying attention to it because they can't sell their books in Europe if they aren't available in an accessible format. So that's really important.
But one of the things that's puzzling people about that, and justifiably so, is that that regulation actually doesn't provide much specific information as to, well, what do you mean by accessible? What do you have to do to be accessible? And so, one of the things that I think is really important is that W3C, well, in terms of the work in the W3C, which I'm very involved in, big watershed moment just within the past several weeks actually, or a month ago, is that EPUB 3.3 became a formal W3C Recommendation. And many people aren't aware that EPUB, which has been a standard for eBooks for a long, long time, and EPUB3, even, for many years, hasn't had that status of being a formal international standard until just recently. And it actually is composed of three different parts. One is the usual part, which is how to create an EPUB to make it accessible or how to create an EPUB to make it conform to EPUB 3.3.
There's another document about reading systems, how are reading systems supposed to deal with this? But the third is EPUB Accessibility 1.1. And that's really key because now we have an international specification for how to get accessible.
Now that's not to say that there still aren't open questions. I had a question from a service provider just a couple of weeks ago who asked about, well, what about language tags? Do we have to actually tag when there's a foreign language in the content? Do we actually have to tag that?
8:00
And it's a bit of a gray area, but my answer is yes, you do have to tag it, because it's just fundamentally important that for somebody using a screen reader, when they hit something that's in French, the screen reader needs to know that that's French so they'll pronounce it right. And that's actually possible.
So anyway, that's just an example, but at least now we've got those standards out there. And there are also guidance documents that most people aren't aware of from the W3C. There's a document called "Techniques" that explains how best to accomplish these things. And there's a guide to how to do the metadata, and how the metadata that's built into EPUB, which is the same as schema.org metadata for accessibility, how to express that in ONIX, which is your supply chain metadata. So anyway, lots of progress very recently that I think is making this easier for folks.
Mark Gross
Right. That is an avalanche of progress in the last four or five years. I didn't know about all the things that you mentioned, but that's terrific. Of course, people have to follow these standards and get ahold of those standards and understand them. I think I saw a quote actually, Marianne showed me a quote that Allen Institute says, only, what is it, 2.4% of published scientific research papers are fully accessible. That sounds like a really tiny amount, although I imagine the 'fully' is probably a qualifier there. I guess if something is in EPUB3, it's accessible, but not necessarily fully if not all the information is there. So I think maybe, something that you said about a language tag, if the reader doesn't know what language it's in, it won't necessarily do the right things with it.
Bill Kasdorf
Yeah. And I do have a couple of observations about that because yes, that's a shockingly low percentage, but keep in mind that a lot of published research is pre-prints and it's not actually going through a formal publishing workflow. So increasingly, by the time an article actually gets published, particularly by a major publisher like an Elsevier, Springer, Taylor & Francis, et cetera, typically those are accessible, but they're not EPUBs. If you're talking about journal articles, that's been a confusion that a lot of people have had. And I've been contributing to that because years ago, Atypon, for example, which is one of the leading hosts for journals, provided the ability to create an EPUB for any article that you submit to their system. And a lot of white label systems like Taylor & Francis, that's really Atypon under the hood. So they could be publishing them all as EPUBs, but really just a good HTML file of a journal article properly tagged is accessible.
I was working with the arXiv group a number of weeks ago, a month or so ago. Most people on this call probably know of arXiv, which is the granddaddy of the pre-print servers originally at Los Alamos, now based at Cornell. And it's physics and math, typically more disciplines now than it started, but they want to make sure that their content is accessible. And it's not right now, because it's the author-contributed content. And so, how to make sure that's accessible, that's the new frontier, is getting the accessibility way upstream. But that's so important,
12:00
because if we get the accessibility upstream and the authors do the right thing in the first place, it's really a no-brainer for the publisher to get them accessible. But you're right, it includes things like are there image descriptions and are they any good? Typically, there is a language tag at the beginning of an HTML file that says what language it's in. It's in the header of the HTML. But what they neglect to tag is when there are components down in that file that are in a different language, that's what you have to tag.
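To make the language-tag point concrete, here is a minimal sketch in Python. The sample HTML, the `LangCollector` class, and the `declared_languages` function are all hypothetical illustrations, not part of any W3C tool. The sample declares the document language once on the `html` element and then tags a French phrase inline with its own `lang` attribute, which is the part Bill says tends to be neglected.

```python
from html.parser import HTMLParser

# Hypothetical sample: document language on <html>, plus one inline
# phrase in a different language tagged with its own lang attribute.
SAMPLE = """<html lang="en">
<head><title>Sample article</title></head>
<body>
<p>The committee adopted <span lang="fr">raison d'être</span> as its motto.</p>
</body>
</html>"""


class LangCollector(HTMLParser):
    """Collect (tag, lang) pairs for every element that declares a language."""

    def __init__(self):
        super().__init__()
        self.declared = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "lang":
                self.declared.append((tag, value))


def declared_languages(html_text):
    """Return the language declarations found in html_text, in document order."""
    collector = LangCollector()
    collector.feed(html_text)
    return collector.declared


if __name__ == "__main__":
    for tag, lang in declared_languages(SAMPLE):
        print(f"<{tag}> declares lang={lang}")
```

A file that only reports the `html` element here, even though it contains passages in another language, is the case described above: the header tag is present but the inline tags are missing.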
Mark Gross
And I think you got a key point there. The author's doing the right thing also there. So there's a bit of education going on there and probably sensitivity. I think when we spoke before, we said that most people who are producing the material are not visually impaired and don't have these disabilities. So it's very hard to be aware of what needs to be there. And I mentioned the event and I thought my sensitivity developed when I went to, there's a museum in Israel, it's called the Blind Museum, but it's really called "Dialogues in Darkness," which is an hour spent inside this facility led by a guide who is blind through scenarios of what a person deals with in everyday life. For an hour you're in total darkness. And that really gives you a whole different perspective on how things work.
And so, I think part of it is if people had more sensitivity, they would be worried about this more. And the other is, today's tools really, I think what's different, today's tools allow you to actually create things without that much difficulty. So that's really been, I think that's the two pieces that have changed over the last 20, 25 years. Of course, people are aware of the whole OCR technology we use today was originally built as a reading machine for the blind by Ray Kurzweil –
Bill Kasdorf
Kurzweil. Absolutely.
Mark Gross
Right. Ray Kurzweil in '76 or '77. And there's a great little video of Stevie Wonder, who became a good friend of his, hearing a book read to him for the first time. And it's just like an amazing revelation that this technology now lets him do that. So a lot of this is technology. So OCR started there, but now the machines today can do it, but they need content. They need content that's prepared properly.
Bill Kasdorf
There's an accessibility conference in California every spring. And actually, Stevie Wonder often shows up to that conference. People get really excited when they see Stevie Wonder, but it's a cool thing. But a couple things. One thing is, yes, it's gotten a lot easier. Basically the requirements for accessibility are almost entirely built on web standards that are open, free, patent free, no cost, et cetera. And so I mentioned the HTML files for your journal articles and EPUBs for your books. Basically, the content documents are a bunch of HTML files in the EPUB. So you're still using HTML markup, one way or another. And yes, there are some nuances like ARIA, which helps modify the HTML to a level of specificity that's helpful to screen readers.
16:00
But the other thing is, this is not just about blind people. There are lots of people, one of my friends at Benetech, Michael Johnson, often, I think I have this right, I may have this backwards, but almost every presentation he does, he includes a slide that says there are more blind people than there are redheads and there are more people with dyslexia than there are left-handed people in the world. It may be the reverse of that. I always mix those up, but it's just really, I have a friend, Caroline Desrosiers is an expert in image description work, and she and I were giving a presentation to Council of Science Editors a couple years ago. And I had that slide in my presentation and I said "Now, my colleague Caroline, who's sitting beside me is a redhead and I happen to be a lefty. And you don't think either of those things is unusual, do you?"
And why I seem to be going off on a tangent there is that there's a problem with PDF. Because people think, oh, there's an accessibility standard. Yes, it's called PDF/UA and it makes a PDF more accessible or better on accessibility than it typically otherwise is. But it's nowhere close to being as accessible as an HTML file. And one of the big differences is that for people with low vision, they need to be able to enlarge the font and zoom in. And try doing that with a PDF, and it goes off your screen.
And dyslexic people will often need to change the font or change the colors, et cetera. And all of that you can do in a properly tagged HTML file, and totally impossible with the PDF. And actually, in that arXiv forum that I was mentioning, that arXiv is working on this and they had a forum on accessibility about a month ago. It was unbelievable. We got over 2,000 registrants for that thing. And admittedly, some of them thought it was about access because there's a big confusion. You'll talk about accessibility, they think it's about open access. No, no, no, that's a different subject. But still, we got about 400 people attending.
And one of the good things that they did, Shamsi Brinn was the person at arXiv that was organizing this thing. And she made it a point to recruit speakers that themselves had disabilities, whether low vision, blind, deaf, dyslexic, et cetera, so they could speak about their personal lived experience about dealing with this. So there's one person who's a high-up person at Google, but she does a lot of peer reviewing. And it's like, there are cases where I simply have to decline to peer review because I can't access the damn thing properly. And she's an expert, she needs to be able to do that. But uniformly, they said they really wanted to move off of PDF and just please give me a good HTML file.
Mark Gross
So I'm a lefty also, and when I'm alone, I don't –
Bill Kasdorf
My screen is reversing, but I'm drinking my water with my left hand here.
Mark Gross
Me too. So I'm a lefty too, and my daughter's a lefty, and in many family gatherings, almost half the people will be lefties.
20:00
So it doesn't seem like anything unusual, but when you're at a restaurant with a bunch of people you end up banging your elbow against, you certainly realize that you are in the minority.
Bill Kasdorf
And the world is built for right-handers, right?
Mark Gross
The world is built for right-handed people. We all have to learn to adapt to scissors and notebooks with binders and all those kind of things. But that's a relatively easy adaption compared to low sight. And there's other disability also, you mentioned color. There's a large percentage of population which has some degree of color blindness. I don't think most people would notice that, but blue, green and other colors that look exactly the same. So it really gets us into imagery and stuff. You mentioned changing the colors and fonts, but it's also, colors and images you usually can't do much about. So that becomes –
Bill Kasdorf
There may be a figure where the color is used to communicate something. Well, the red things mean this, and the green things mean that. And somebody with red-green color blindness just can't tell the difference. It's like you just shut them out of being able to understand that figure.
Mark Gross
Right. Or the red line on a subway, or the blue line and all those kind of things. So that's a sensitive –
Bill Kasdorf
A related issue is color contrast. And I mentioned that I do a lot for my clients and sometimes just for a publisher that says "Can you review my website and get me a VPAT" or "Can you review my EPUBs and tell me are they right?" Oftentimes the two – I don't think I've ever done a website accessibility review that didn't have color contrast issues. And that's because pastel, soft colors, small type is fashionable right now. And for accessibility, you need sufficient contrast between the color of the type and the background color. And it has to do with the size of the type. So big type doesn't have to have as much contrast as little tiny type, because it's hard to distinguish.
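The contrast check described here can be sketched in Python. This is a minimal illustration that assumes the review follows the WCAG 2.x contrast-ratio definition and its Level AA thresholds (4.5:1 for normal text, 3:1 for large text); the transcript does not name the standard, and the function names are hypothetical.

```python
def _linear(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) tuple with 0-255 channels."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg, bg):
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg, bg, large_text=False):
    """Assumed WCAG 2.x AA thresholds: 3:1 for large text, 4.5:1 otherwise."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(fg, bg) >= threshold


if __name__ == "__main__":
    mid_gray, white = (136, 136, 136), (255, 255, 255)
    print(round(contrast_ratio(mid_gray, white), 2))
    print(passes_aa(mid_gray, white), passes_aa(mid_gray, white, large_text=True))
```

The `large_text` flag reflects the point about type size: the same mid-gray on white can be acceptable for large headings while failing for small body text.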
Mark Gross
If you had a taste of it, if you take something that was done as a color image and somebody translated it into gray scale, very often you can't tell what needs to be done there because it's sort of like got dropped out.
Bill Kasdorf
That's another question. Those images should have image descriptions anyway. So...
Mark Gross
Right.
Bill Kasdorf
Those aren't just for the blind people, right? Yes. That's why we do them, because some people just can't see that image and you have to tell them what it's trying to convey, but by doing so, it's actually useful to any reader. And I find that for users of scholarly and scientific content and educational content, regular old sighted users like you and me actually find those image descriptions very useful. Because it's like, oh yeah, this is telling me what this author is trying to convey by putting this figure in this book or this article.
Mark Gross
Right. No, I think that's it. So the problem is if the author is doing – somebody's got to put that description together and have enough information there to be useful. So a simple picture, I was going to an exhibit at the Museum of Modern Art last week, and when I was looking at the website and I scrolled over the image, it did come back and say "Picture of woman looking at pictures that have colors in orange and red." So that was sort of useful, but really,
24:00
it becomes an issue when you're dealing with a picture that's trying to convey a lot of information. Graphs and things like that. How are you going to do that, and can you?
Bill Kasdorf
There's two schools of thought on this. I make a distinction between what sector of publishing you're talking about. So in this context, I bet you most of the people on this call probably are in the scholarly publishing side of things. Not all, but many or most. And in that world, it totally makes sense for the author to contribute the image descriptions, to my mind, because the author knows why they've got that image in their article or their book in the first place and what they're trying to convey about it. But what I'd like to stress is that those are draft image descriptions. The authors don't really know how to write a good image description, but they can give you, here's what this image conveys, and then you need to train your editorial staff. Guess what? Editors edit the copy. They just need to know how to edit the image descriptions along with editing the rest of the content.
Mark Gross
That's actually a good point. So it comes to, you're not putting the whole editing burden on the author, but the author knows or should know why they put it in there, or at least be able to say, this isn't really important, so we can drop it.
Bill Kasdorf
And there are publishers that are successfully doing this. University of Michigan Press does this, Taylor & Francis does this, et cetera. So it's not rocket science. Although I was talking to an industry colleague just actually, I think it was beginning of this week or maybe last week, who has just the opposite position. And this came up, I don't know whether I want to open up a can of worms, but it's going to come up in the questions if we don't bring it up, which is, well, what about using AI for image descriptions? Well, I mentioned my friend Caroline Desrosiers, who basically, she's a professional at this, and she basically says, I have never seen an AI-generated image description that's actually a good image description. And the reason is, what people don't appreciate is that there are two factors that really influence how you write the image description. One is the audience and one is the context.
So the context is in this chapter or in this book or in this article, in an architecture textbook, you would describe the Mackinac Bridge very differently than you would in a tourist guidebook. They wouldn't be the same description. It's the same picture of the same damn bridge, but you wouldn't. But the other is the audience. So for example, an image of a human heart, if it's in a high school biology textbook, the image description would be very different than if it's in a cardiology article for a physician. If the audience is a physician, you'll use entirely different words to describe that image. It's still the same image. So it's not easy to get this stuff right.
But anyway, getting back to my friend. He was saying, well, he contends that providing – this is actually pretty clever, actually. I don't think I have his permission to quote him, so I won't identify him. But his position is go ahead and use AI to generate the image description and send it to the author. And they are a scholarly and scientific publisher, mainly science. So most of their authors are scientists.
28:00
And so he said "My strategy is I am going to send them that AI image description because when they look at it, they're going to say, wait a minute, that's not right here. This is what it should say." Whereas if you just say "Please submit the image description," they'll say "I'm too busy for that. Just you deal with that, you're the publisher." So I don't know.
Mark Gross
I think that's a brilliant strategy, actually. And I also want to point out that it took us 26 minutes into this webinar before we mentioned AI. So I think that may be a new record for – every conference I've been to, every session, within five minutes needs to mention. So AI is a great tool, as you mentioned. And here it's a prodding tool because after he stops laughing, you say "I don't want that in there. Here's something else instead." And that'll work well. I want to go back for a minute to, we talked about PDF and HTML.
And I think we're not just throwing PDF and HTML, we're talking about structured content there. Because a lot of content today, and certainly in the scholarly, in the scientific scholarly world more than others, is now being generated in XML, which gives you structure and lets you automatically create the HTML that goes on websites and stuff. So I think maybe we should talk a little bit about the importance of really putting in structure into documents, into these things right at the beginning. And that would probably get off the PDF problem at some point.
Bill Kasdorf
That's a really good point. And yes, you can put structure and reading order in a PDF, that's partly what you do to make a PDF more accessible. But in the world of scholarly publishing, certainly, there is, well, there are two almost identical XML formats that are the lingua franca of scholarly publishing, which is JATS XML and BITS, its book variant. And those are big, rich, virtually universally used XML models. So almost all XML, almost all journal articles are in JATS XML at some point in their life, even if – the JATS XML is not designed to be provided to an end user, it's designed to be transformed to something that is like PDF or an HTML.
But I happen to be participating in a NISO working group called JATS4R. Not the whole working group, actually, I'm in a subgroup that's working on accessibility. And that's, what that group does is develops recommendations for how to use JATS properly for X. And so what this subgroup is doing is how to use existing JATS in its current form to properly convey accessibility in that JATS XML.
And actually, we're almost done with the project. We're at the stage of, in fact, one of my to-dos this week is I have to review our recommendations that we've got drafted by Friday because we're about to finalize that recommendation. So there's an example for you, Mark, that, yeah. And basically what it says is that there's already, there are two elements in JATS. One is called alt-text and one is called long description, or long-desc. So you can have the brief alt text that actually gets attached to the image. And for a complex image, like a graph or a chart or a workflow diagram or whatever, you can actually have a separate, more extensive description that the simple alt text couldn't convey.
32:00
We've always had those in JATS. You just need to use them.
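As a rough illustration of the two JATS elements just mentioned, the Python sketch below scans a simplified JATS-like fragment for graphics that lack alt text. The fragment is hypothetical and not validated against the JATS DTD, it assumes `alt-text` and `long-desc` sit as children of `graphic`, and `graphics_missing_alt_text` is an invented helper, not something from the JATS4R recommendation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified JATS-like fragment. Figure f1 carries both a
# short <alt-text> and a fuller <long-desc>; figure f2 has neither.
SAMPLE = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <fig id="f1">
    <graphic xlink:href="f1.png">
      <alt-text>Placeholder short description of figure 1.</alt-text>
      <long-desc>Placeholder extended description of figure 1.</long-desc>
    </graphic>
  </fig>
  <fig id="f2">
    <graphic xlink:href="f2.png"/>
  </fig>
</article>"""


def graphics_missing_alt_text(jats_xml):
    """Return the ids of <fig> elements whose <graphic> has no non-empty <alt-text>."""
    root = ET.fromstring(jats_xml)
    missing = []
    for fig in root.iter("fig"):
        for graphic in fig.iter("graphic"):
            alt = graphic.find("alt-text")
            if alt is None or not (alt.text or "").strip():
                missing.append(fig.get("id"))
    return missing


if __name__ == "__main__":
    print(graphics_missing_alt_text(SAMPLE))
```

A check like this only confirms that a description exists. Whether the description is any good still takes an editor.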
Mark Gross
We always have those things, it's just, they're not being used in that way. And certainly alt text, I don't know if it's always been there. It's been there for a long time.
Bill Kasdorf
For a long time. Both of those have been there. I think going all the way back to NLM, which is the predecessor of JATS, I think they would have been. But anyway.
Mark Gross
Right. Right. So it's been there for a long time, it just wasn't getting used. But now there's that impetus and awareness. I think that's really a big piece of it. The awareness is going to make a difference. Besides images, the other place of course is tables and math. Those are the things that need description. And that's where XML really comes in as the forerunner to HTML. Let's talk about a table. If a table's an image and you got to describe the whole thing, it's a mess. But if it's XML or it's a table model in HTML, then the machine can read it for you, right?
Bill Kasdorf
Absolutely right. I'm so glad you mentioned that because JATS used to, it still permits two different table models. One is the HTML table model and one is the CALS table model, the OASIS table model. And for some time now, the current version of JATS strongly recommends you use the HTML table model. So if you're doing your JATS right, you've already got HTML in your tables. And guess what? I'll get back to that because there's an interesting example of how to do that right. Well, I'll stick with tables for now and then I'll get to math. The one thing that you need to do is make sure you've got table headers, column headers and row headers, need to get tagged as such.
And here's the example of why that's so important. So say you've got a table of the 50 states and those are the rows. And you've got the columns are different statistics about those states. So if you don't have the column header and the row header tagged, when the screen reader is reading, say, the cell that's in column three, row 11, it'll just say column three, row 11, blah, blah, blah, big long number. But if they're tagged, it'll say, population Michigan, blah, blah, blah. That's night and day more intelligible. It is intelligible and the other isn't. Without the headers, it's basically useless for a screen reader. So you do have to.
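The difference between the two readings can be simulated with a short Python sketch. This is a simplified model of the idea, not how any particular screen reader works; the sample tables use placeholder cell values, and `TableReader` and `announce` are hypothetical names. Column headers are tagged as `th` with `scope="col"` and row headers as `th` with `scope="row"`.

```python
from html.parser import HTMLParser

# Hypothetical sample with tagged headers; cell values are placeholders.
TAGGED = """<table>
<tr><th scope="col">State</th><th scope="col">Population</th><th scope="col">Area</th></tr>
<tr><th scope="row">Michigan</th><td>cell-1-1</td><td>cell-1-2</td></tr>
<tr><th scope="row">Ohio</th><td>cell-2-1</td><td>cell-2-2</td></tr>
</table>"""

# Same content with every cell marked up as plain <td>.
UNTAGGED = TAGGED.replace(' scope="col"', "").replace(' scope="row"', "")
UNTAGGED = UNTAGGED.replace("<th>", "<td>").replace("</th>", "</td>")


class TableReader(HTMLParser):
    """Collect table cells row by row, keeping each cell's tag, scope, and text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._cell = {"tag": tag, "scope": dict(attrs).get("scope"), "text": ""}

    def handle_data(self, data):
        if self._cell is not None:
            self._cell["text"] += data

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._cell["text"] = self._cell["text"].strip()
            self.rows[-1].append(self._cell)
            self._cell = None


def announce(html_text, row, col):
    """Describe one cell (0-based indexes) the way a reader might announce it."""
    reader = TableReader()
    reader.feed(html_text)
    cell = reader.rows[row][col]
    col_header = reader.rows[0][col]
    row_header = reader.rows[row][0]
    if col_header["scope"] == "col" and row_header["scope"] == "row":
        return f'{col_header["text"]}, {row_header["text"]}: {cell["text"]}'
    # Without tagged headers, only positions are available.
    return f"column {col + 1}, row {row + 1}: {cell['text']}"


if __name__ == "__main__":
    print(announce(TAGGED, 1, 1))
    print(announce(UNTAGGED, 1, 1))
```

With the headers tagged, the cell is announced with its column and row labels; with plain `td` cells, only the position can be reported, which is the unintelligible case described above.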
Mark Gross
That's very important. And the structure's in there, because we're talking about a table, the way we first think of a table, it's just got rows and columns, but then there's also, most tables have things combined and spanned, and that becomes very important for the screen readers. By the way, the two table models there really are interchangeable and we can convert from one to the other.
Bill Kasdorf
Of course. You're in the conversion business, so it's like "Hey, we could convert those."
Mark Gross
It's very key to have the table model there and structure so you can do that. The same goes for math and MathML. Math is a blob, not impossible to describe, you can describe it, but if you've done it in MathML or you've done it in one of the standard table models, then the machine will be able to read the math feeds. It makes things possible for people to do without relying on another person to read this stuff to you, which is what we're trying –
Bill Kasdorf
Absolutely. One thing that is not actually recent news,
36:00
but it may be news to many of the people on this call. For many years, publishers have just been putting images of their equations in the books and articles.
Mark Gross
Absolutely.
Bill Kasdorf
And it frustrated the heck out of me, because prior to going independent as a consultant about five, six years ago, I've been consulting for 20 some years, but I was in the context of a data conversion and pre-press firm that will go unnamed because it wasn't DCL, back in the day. But I can tell you that all the equations were MathML at some point in the workflow. And then they made images, and that's what got put in the EPUB or in the HTML. But it's not like there wasn't MathML in the first place. That's how we do math these days. But now, one of the reasons that people resisted, is that for years the browsers and the reading systems based on the browser engines didn't render it properly. And so they needed to put an image in so they could ensure that the equation looked right.
But there's a new generation of math now, MathML, that is in fact publicly available. There's a group called Igalia, a developer group that worked closely with Google so that the Chromium technology now natively renders MathML properly. And that means that the Chrome browser, which is based on Chromium, the Edge browser that's based on Chromium, et cetera, now render math properly. And I think that's true of iBooks as well. I'm not absolutely sure that that's been accomplished yet. So anyway, there is now no reason not to put MathML in your content.
Mark Gross
It did start there. And the issue comes up with older material. And certainly, if you're building an archive, the older material was often not tagged that way, so it's images. But I think the publishers who see it as a long-term asset have, we've got a few very large collections where we've turned all the math into MathML, or LaTeX, those are the two models out there that can do a good job of explaining the math. And that becomes a long-term asset because you can now use it, not just in reading to the blind, but also to be able to display things in larger, smaller format. So it's become like a good business thing, rather than just "let's do accessibility."
Bill Kasdorf
You've raised a very important issue there, Mark, which is all that legacy content that publishers have. And of course, if anybody knows this, you guys know this better than anybody that, for example, for a platform migration where you're going from platform, moving all your content from platform A to platform B, almost always, not only do you have to align with the new platforms' JATS spec, because people often don't realize that JATS XML for Atypon and JATS XML for Silverchair or for HighWire or whatever, are not exactly the same specification. But there's an even bigger problem: all that older content is a mess, typically, because it's tagged differently. So anyway, that needs to get normalized.
40:00
Mark Gross
And that's an area, by the way, where we are using AI effectively to take older material. So sometimes that does help with we're starting to deal a lot of other things and trying to make things economically feasible and you're taking, when you want to go back 50 years in your archives or 100 years in your archives.
Bill Kasdorf
And this is a big deal to publishers as well. There's a misconception in the marketplace that people think "Oh, but you don't have to do that for your back list, do you?" Well, no, you don't. But then you can't sell it in Europe. So yes, you have to convert, you have to make sure that you have an accessible, which basically at this point in time means an accessible EPUB 3.3, or at least an accessible 3.3. And I remember talking to a friend of mine from Taylor & Francis a year or two ago, and he said "OMG, we've got thousands and thousands of crap EPUB2s that we have to upgrade or we're not going to be able to sell it."
And so the big publishers, obviously T&F in the scholarly world, but PRH is a good example for a trade publisher. The world's biggest, or at least the country's biggest. I think they're the world's biggest trade publisher. They're very, very actively working on their back list to just make sure that that all gets upgraded and is accessible because – and I should say, I should clarify that it's as of 2025 is when it takes effect. So it's as of 2025, if the book isn't available in an accessible form, you can't sell it in the EU.
Mark Gross
So this is an example where legislation is good for getting things done, right?
Bill Kasdorf
Yep.
Mark Gross
Let me talk about some other places, and I don't know how much involvement you have, but there's a whole other area of accessibility. We've been talking about visual disabilities. There are also people with auditory disabilities. Actually, the same place that has that Blind Museum, which is called "Dialogues in the Dark," has another one that I've never been through, to replicate deafness. So that's sort of the flip side. We usually think about reading, but more and more you have videos, and a lot of material comes across in lectures, or you just have podcasts. Have you given any thought to what happens with all that?
Bill Kasdorf
Oh, absolutely. And it's become more and more routine for videos to have transcripts, which are obviously key. And I'm sure we've all gotten lots of chuckles from seeing the transcripts that Zoom creates as live transcripts and how much it gets wrong. So please, okay, go ahead and let it do that, but then edit the darn transcript so that it's actually right. That's important. But one thing that was actually an eye-opener to me a few months ago is in conference presentations. I always thought, well, if the content is on the slides, then they can read it. And somebody that I know who works with the deaf community said the problem there is that for many people in the deaf community, particularly people who were born deaf, English is not their first language. ASL is their first language, American Sign Language, or BSL, British Sign Language, for the English language. And to many of them, it is a struggle
44:00
if they have to deal with the English content. So in a case like that, they're really looking for ASL.
Mark Gross
Are there translators to the sign language?
Bill Kasdorf
If you go to an accessibility conference, there will be a little window on the screen where there's somebody signing, but it's not common in other things. I think Cadmore may be making that available for videos. They're the farthest along with accessibility in the video world.
Mark Gross
That's a person that's signing. I'm just wondering if, is there software that'll replicate it? That's definitely an AI application that should be out there.
Bill Kasdorf
That would be cool to see. I have not heard of that existing yet, but yes, because that gets into the AI image generation. It's like, well, theoretically that's possible. I hope somebody's working on that. But that's at the heart of the actors' strike right now, why the actors are freaking out about AI. And particularly, the vast majority of actors are not the famous people that you see on the credits. It's all those people in the background. And the reality, and this is actually happening today, is that they're being required just to get scanned. It's a very elaborate scanning process, but they get scanned so that the studio can then use their image in a crowd scene or in the background in a restaurant or whatever without paying them, because they can use AI to render those people shockingly faithfully in the background of an image. It's like, wow.
So anyway, getting to your suggestion of would it be possible to make AI-based signers? Sure. That may be controversial because people say, no, no, no, no, no. Humans should do that. Well, here's an analogy. There's a controversy in the audiobook space about do you use human narration, which is the gold standard, or do you use AI-based narration? And I tend to be a "both/and" kind of guy. So maybe for your marquee books or your front list or the most important books in your front list, yes, you want to get actual voice actors and do a really good job of that voice. But what about all that backlist we were just talking about? AI narration can get you an audiobook that's serviceable for all that backlist. I wouldn't be dismissive of that.
Mark Gross
Right. This argument and discussion dates back quite a few centuries. I think the same argument went along with the printing press and how it was going to put all the monks out of business, and the industrial revolution. And all the things that eventually ended up being tremendously beneficial to mankind started off with that, but it's going to put all these people out of service. And I would guess this automated sign generation is the same realm as a printing press, in that sense.
Bill Kasdorf
It's a tool.
Mark Gross
It lets you backlist. So that's an interesting – okay.
Bill Kasdorf
And I don't want to go down a rabbit hole on AI right now, but a comment I would make that a lot of people just forget about because they are so panicked when they realize that the generative AI programs like ChatGPT, admittedly,
48:00
they can make stuff up. The technical term is that they hallucinate. And so they say, well then you can't use them. No, you just need to use them properly. And so if they're trained on a known corpus of content, that can help. But even using a large language model, keep in mind, we're talking about publishers. Well, you can edit things. So yes, you should be able to use AI in a way that's useful to you, but then have a human being review the results and check them for accuracy. It's called human-in-the-loop AI. There's this famous case recently, a legal case where a lawyer used AI to write his brief and it made up a bunch of cases.
Well, guess what? It is not difficult to determine that those are fake. Somebody just had to do that. It's like, no, no, there is no such case. So yes, it did a terrible job. It made up stuff. But does that mean you couldn't have found that out? Yes, you could have. He just didn't take the trouble to find that out. And he submitted it to the judge and he got in trouble.
Mark Gross
And I think I've had a few colleagues and friends who have used it in that way. They use it as an initial draft and then correct it. And so there certainly is – and it's going to get better. It's like when I was in college, I could beat the chess program. And I was playing some pretty good competitive chess, and I was beating the chess program. Today, the top chess program cannot be beaten by any human. So it's going to get better. It's going to change. And I think all this is going to be changing for the better. And I think what we're seeing here is just one area of accessibility.
You mentioned the last five years. Just in the last five years there have been tremendous differences. And the technology has gotten better and it makes for good business. We're talking about publishing here, but look at technical publishing and reference manuals and repair manuals: if you get it structured the right way, then machines can read this, and not just for somebody who's visually impaired, but you can put it on a headset and the person would be fixing an airplane and listening to what to do next. So the structuring is going to work all around. We're going to run out of time. Marianne's supposed to get on. Any last thoughts before we turn it over? No, we'll turn it over.
Bill Kasdorf
I was wondering when Marianne was going to shut us up.
Marianne Calilhanna
We're doing fine. Everyone's enthralled, I can tell. We do have a couple questions. So I'm going to throw these out. And then while you're speaking and answering the questions, I will launch our second poll, which is just to assess where our attendees might consider moving forward in terms of accessibility in their editorial workflows. But one question, Bill, you mentioned alt text for images. This person would love to hear more how-tos. How can editors do alt texts in their daily work? What should they be aware of? How much description is appropriate? How do we train editors for this? So if you could speak a little bit to that.
Bill Kasdorf
Those are all very good questions that are very difficult to answer. First of all, there are, if you go to the W3C, w3.org, the website for the W3C,
52:00
the accessibility work in the W3C is incredibly extensive. And so there's all kinds of really good guidance documents there that describe, that can guide you into how to get image descriptions right, what are some examples of good ones, et cetera. And that includes tables and things like that as well. But most fundamentally, it's actually really just understanding the point of it, so that when you as a publisher or your editorial staff for example, really gets into this, it's not as much a matter of a checklist as a matter of thinking, what would a person that can't see the image need to know that a sighted person gets from this image being in this publication?
And so like I said, if it's a publication for physicians, you probably want what are called SMEs, subject matter experts, namely somebody that's familiar with cardiology is writing the image description for that cardiology image. But if it's education, interestingly enough, in education, oftentimes, most oftentimes, and this actually happens in trade too, is the images are commissioned. So often the commissioning of the image actually can serve as a really good starting point for the alt text. Because it's like, here, give me an image that shows this and this and this doing this or that, et cetera. So that's a useful starting point. But as I said, the main thing is to just get a sense of the point of the image and what it's trying to convey and to what audience. And you'll get pretty close to right on your image descriptions. It's an art, not a science at this point.
Marianne Calilhanna
Bill, so I have a question. I love that some of the tools we use are, they easily integrate accessibility. So when I'm posting something on LinkedIn, it's very easy for me to add alt text and the image title, but sometimes I get a little, I get confused or I just question which has more importance. And I think they both always need to be separate, the image title and your image description. Is that true?
Bill Kasdorf
Yeah. One really important principle of an image description, and it goes beyond the title, is that it shouldn't just repeat the caption. That is so frustrating to a person using assistive technology. They've already heard the caption and now they hear the damn thing again. So no, don't repeat whatever you've already got, even in the surrounding text. I was reviewing a medical journal, actually a top medical journal. You'd know the journal. I won't name them. But when they've got articles, they've got images and they've got a separate webpage that's discussing those images. That's why this doctor put these images in this article. Well, guess what? Those were the image descriptions. They didn't need to also do another image description. Yes, the image has to have alt text to be legal HTML, or valid HTML, anyway.
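The two checks Bill describes here (every image needs alt text, and the alt text should not simply repeat the caption) can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a full accessibility audit: the class name `AltTextAudit`, the helper functions `normalize` and `audit`, and the assumption that captions live in `<figcaption>` inside a `<figure>` are all choices made for this example.

```python
from html.parser import HTMLParser


def normalize(text):
    """Collapse whitespace and ignore case so near-identical strings match."""
    return " ".join(text.split()).casefold()


class AltTextAudit(HTMLParser):
    """Hypothetical checker: flags <img> tags with no alt attribute, and
    images inside a <figure> whose alt text duplicates the <figcaption>."""

    def __init__(self):
        super().__init__()
        self.problems = []          # human-readable findings, in document order
        self._in_caption = False    # True while inside <figcaption>
        self._caption_parts = []    # text pieces of the current caption
        self._figure_alts = None    # (src, alt) pairs in the current <figure>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "figure":
            self._figure_alts = []
            self._caption_parts = []
        elif tag == "img":
            src = attrs.get("src", "?")
            if "alt" not in attrs:
                self.problems.append(f"{src}: missing alt attribute")
            elif self._figure_alts is not None:
                # An alt attribute with no value is parsed as None.
                self._figure_alts.append((src, attrs["alt"] or ""))
        elif tag == "figcaption":
            self._in_caption = True

    def handle_data(self, data):
        if self._in_caption:
            self._caption_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "figcaption":
            self._in_caption = False
        elif tag == "figure" and self._figure_alts is not None:
            caption = normalize("".join(self._caption_parts))
            for src, alt in self._figure_alts:
                if caption and normalize(alt) == caption:
                    self.problems.append(f"{src}: alt text repeats the caption")
            self._figure_alts = None


def audit(html):
    """Return a list of findings for an HTML string."""
    parser = AltTextAudit()
    parser.feed(html)
    parser.close()
    return parser.problems
```

A check like this only catches the mechanical failures. Whether a description conveys what a sighted reader gets from the image still needs a person who understands the content.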
56:00
But I'm so glad you brought up tools, Marianne, because what most people don't yet realize is that most manuscripts, either book manuscripts or journal article manuscripts, come to the publisher as Word files. Well, the current version of Microsoft Word has accessibility built into it. Now, is it completely sophisticated accessibility? Well, no, but it's really useful. So just to let people know, on that ribbon at the top of your screen, there's usually a Review tab. You click on "Review," and it'll give you the review ribbon and there's an accessibility section. And under that, it'll give you things such as "Check Accessibility." Well, what that's doing is saying, well, are your headings styled properly? Do your images have alt text? If they don't, it'll give you a little dialogue that says, here, put your alt text in here, so you can at least get more accessible just by using Word properly these days. It leads you by the hand, basically. And PowerPoint does the same thing.
Marianne Calilhanna
Right. And I think also my biggest takeaway, hearing the two of you speak, is that accessibility truly starts with structure.
Bill Kasdorf
Right.
Marianne Calilhanna
You have to have your content structured to achieve true accessibility or even get started with accessibility.
Bill Kasdorf
And one corollary to that is people often don't appreciate how sophisticated assistive technology is. It's amazing technology. So like I said, if you tag something as Spanish, it will speak it properly in Spanish if you've tagged it that way. And if you've got your heading structure, that means that the user can actually jump like you or I would to, oh, wait a minute, I'm looking for that part of the article that I want to look at, or that part of this chapter. They don't have to scroll through the whole damn thing, they can just structurally find their way. If you've got good reading order, a good TOC, good heading markup, et cetera, list markup, all that stuff. Assistive technology is, particularly because of HTML, it's programmed to understand the HTML if you use it right.
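As a rough illustration of the structure assistive technology relies on, the sketch below pulls the heading outline and the `lang` attributes out of an HTML string, then flags places where the heading hierarchy skips a level. It uses only Python's standard library. The names `StructureOutline`, `outline`, and `skipped_levels` are invented for this example, and the "no skipped levels" rule is a common authoring convention used here as an assumption, not something stated in the discussion.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StructureOutline(HTMLParser):
    """Hypothetical helper: records headings and lang attributes in order."""

    def __init__(self):
        super().__init__()
        self.headings = []    # (level, text) pairs, e.g. (2, "Methods")
        self.languages = []   # lang attribute values in document order
        self._level = None    # level of the heading currently being read
        self._text = []       # text pieces of that heading

    def handle_starttag(self, tag, attrs):
        lang = dict(attrs).get("lang")
        if lang:
            self.languages.append(lang)
        if tag in HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._level is not None:
            text = " ".join("".join(self._text).split())
            self.headings.append((self._level, text))
            self._level = None


def outline(html):
    """Parse an HTML string and return the populated StructureOutline."""
    parser = StructureOutline()
    parser.feed(html)
    parser.close()
    return parser


def skipped_levels(headings):
    """Return headings that jump more than one level below the previous one."""
    problems = []
    previous = 0
    for level, text in headings:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems
```

The heading list is essentially what a screen reader user navigates by, and the `lang` values are what let a speech engine switch pronunciation for a Spanish phrase inside an English document.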
Marianne Calilhanna
Well, thank you so much, Bill, for your time. Thank you so much, Mark. I do want to remind everyone that this webinar is being recorded and it will be available on the on-demand section of the DCL website at www.dataconversionlaboratory.com. And as you've heard today, DCL is all about data conversion. We help publishers structure content so that their content is accessible for the visually impaired and just accessible to the world in terms of searching for and finding that content. And the other services that we provide are listed here and all relate back to discovering content, whether it's by people or systems. And we're always here to help and talk through any of your content challenges. That's it for today. We have a lot of other upcoming webinars on our calendar. You can visit our website to learn more. I thank you so much for spending an hour of your day with us. This concludes today's webinar.
Mark Gross
Thank you, Bill. And thank you, everyone.