Keeping Up With Content: Revisiting Metadata to Ensure it Meets Today's Vernacular and Drives Sales
[Marianne Calilhanna] Hello and welcome to the DCL Learning Series. Today's webinar is titled "Keeping up with Content: Revisiting Metadata to Ensure It Meets Today's Vernacular and Drives Sales." My name is Marianne Calilhanna, I'm the Vice President of Marketing here at Data Conversion Laboratory, and I will be your moderator today. A couple of quick things before we begin.
We are recording this webinar and it will be available in the on-demand webinars section of our website at www.dataconversionlaboratory.com. We invite you to submit your questions for our panelists at any time. You may do that in the control panel via the little questions box. And we'll also be monitoring chat if you're having any technical issues. My colleague, Leigh Anne, can help answer those.
So, before we begin, I'd like to provide a short introduction on Data Conversion Laboratory, or DCL as we are also known. Our mission is to structure the world's content. Content can unlock new opportunities for innovation and monetization when it has a foundation of rich structure and metadata.
DCL's services and solutions are all about converting, structuring, and enriching content and data. We're one of the leading providers of XML conversion services, DITA conversion, structured product labeling conversion, and S1000D conversion.
While we're best known for these excellent content conversion services, we also do a lot of work in the related areas that are listed on this slide: semantic enrichment, entity extraction, content re-use analysis, and structured content delivery to industry platforms. If you have complex content and data challenges, we can help.
Today's panelists are two gentlemen who have spent their careers involved with content structure, metadata, and publishing technologies. Christopher Hill is the technical product and project manager at Data Conversion Laboratory, and Joshua Tallent is Director of Sales and Education at Firebrand Technologies. They really have a captivating conversation planned and I'm thrilled to turn it over to them. Joshua…
[Joshua Tallent] Thank you, Marianne. Yeah, I just wanted to, before we get started here, chat a little bit about what Firebrand does as well, since some of you may not be familiar with us. So Firebrand Technologies is a company that helps publishers manage their data and workflows across all their different products. The two main products that I think apply to what we're talking about today that we provide: one is Eloquence On Demand, which is the gold standard in metadata management for publishers, sending data to over 500 trading partners around the world, and doing custom metadata management and requirements for all the publishers and trading partners that we send data for and to. And Eloquence On Alert, which is a title performance monitoring product that watches your products out on Amazon and Barnes and Noble, and other retail sites, and brings back actionable details and intelligence about what's happening to your products, and helps you understand what's going on and how to adapt to the changing situations on the retail side. So, that's what Firebrand does. That's kind of two of the things that we, we help publishers with.
So, yeah, that's, that's who we are. I think we're going to start off with Chris. So Chris, we'll hand it over to you.
[Christopher Hill] Sure. Thanks, Joshua. So we're going to start off with a little bit of a metadata story. And what you're looking at here is the Federal Military Archive in Freiburg, Germany. And this is an archive that contains millions of documents that were left behind after, after World War II. And they have these tower rooms filled with these documents, originally that were stored in paper form, and file folders with basically the only metadata written on the, the folder itself.
So, when this was digitized, it opened up this huge trove of information. But it really has only one way of looking at it built-in, so to speak. So, the metadata provided was provided a long time ago, was hand-written. And it was provided from the context and perspective of the people who were originally putting these into files, right. Well, that context changes over time.
For instance, one of the books that came out of this archive did a lot of research on drug use in the military in Germany at the time, and how that might have impacted Hitler's behavior. Research like that is not very well supported by this metadata, and it becomes really difficult to tease this stuff out, or find new angles on the information, because that metadata has really aged and lives in a context that maybe doesn't fit our modern context.
So, that really brings up an issue that we're going to kind of dance around during this webinar, which is that, you know, metadata is really contextual, and your metadata is not my metadata. So, if we move to the next slide, we'll see that, depending on who you are and what you're looking for, you're going to pick maybe different keywords or different pieces of metadata to think about the objects and the things in your life that you interact with.
So, you see, here's a Japanese home cooking book; there might be hundreds or thousands of different words that different people would come up with to describe this book that might be important to them, and maybe none of that will resonate with my angle on Japanese home cooking. And this all happens because, really, metadata carries the context of who assigns it, right? And that can be regions, disciplines, age groups, ethnicities, time. All of these things help create that context. So you may think you're making great metadata today, but if you don't have a plan to maintain it, that metadata may, like a lot of the metadata in Freiburg, end up becoming stale.
You know, you may have great metadata for an electrical engineer. But if I'm a brain surgeon trying to look at the same information, I may not know any of those terms or use any of those terms when I interact with the information. So you see this all the time, and there are many, many examples. I like to always pick examples out of tech because it moves so fast. You know, all of these terms refer to the same device, basically, and have at one time or another been used by all kinds of different people. And in fact, the same person often switches interchangeably between different ways to describe something. So again, that's just an example from your real life of how metadata could impede a search.
If I pick the wrong term, and you haven't accounted for that in your strategy for categorizing this object, I might not know this object exists, and, obviously, that can have some pretty important ramifications.
So, as kind of a broad start of how do you deal with the contextual nature of metadata, we really have to start looking at metadata and search as products in and of themselves. So, I'm a product manager in my company, and I have a lot of tools I use when I'm developing and maintaining products. Those include typical product management things like writing user stories about various different users, and, and assigning different personas to those. So I have my, my target audience in mind and the different types of target audience in mind.
All of these techniques that we use to actually maintain a product can and should be used to maintain metadata and search. Really, connecting people with information is a critical function, whether you're selling content, publishing documentation, or selling products like books or iPhones, right?
So, somebody has to be thinking of these contexts that users bring, and then building out that strategy around metadata and search. And this will inform your metadata and content strategies.
By thinking about this as a product management problem, and not just a one-shot "Hey, we've got to fill in some keywords to publish this on Amazon," you recognize that this is not a static activity. This is an ongoing and circular activity. If we go to the next slide, I have a little picture of just how you have to keep refreshing this. You have to continue to adapt to the evolving context that the information or the objects you are maintaining live in. A complete metadata strategy really has to take into account the fact that things evolve over time, and has to monitor that so that you know when maybe it's time to update those keywords, or so you know that users are using search terms that maybe they weren't using three years ago.
And again, the, the mobile phone business is a good one as an example of that. People still say cellular phone, but if I look for cellular phone on a website, that's kind of an outdated term, we don't really use that in the technical part of the business. But if your users are still using that term, you want to take that into account, right?
And if you fail to continue to maintain that context, what happens is your content, or the objects you're trying to connect to users, can become invisible. They get lost because, again, only the most diligent person would be able to dig into a complex archive of information and tease out new information.
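As a rough illustration of the point about outdated or alternative terms, here is a minimal Python sketch of a synonym ring that maps whatever term a user types to one preferred term before searching. The term list, the catalog, and the function names are all invented for illustration; they are not taken from the slides or from any particular search product.

```python
# Hypothetical synonym ring: every known variant maps to one preferred term.
# The variants below are illustrative, not the terms shown on the slide.
SYNONYMS = {
    "cellular phone": "mobile phone",
    "cell phone": "mobile phone",
    "smartphone": "mobile phone",
    "mobile phone": "mobile phone",
}


def normalize(term: str) -> str:
    """Map a user's search term to the preferred term, if one is known."""
    key = " ".join(term.lower().split())  # lowercase and collapse whitespace
    return SYNONYMS.get(key, key)


def search(query: str, catalog: dict) -> list:
    """Return the catalog items tagged with the normalized query term."""
    wanted = normalize(query)
    return sorted(item for item, tags in catalog.items() if wanted in tags)


# Invented catalog: item name -> set of metadata tags.
catalog = {
    "Phone repair guide": {"mobile phone", "repair"},
    "Radio handbook": {"radio"},
}

print(search("Cellular  Phone", catalog))
```

The design choice here is that the catalog is tagged once with the preferred term, and the synonym table is the part that gets revisited as users' vocabulary shifts.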
So that's really sort of what we're talking about here. I want to have Joshua really talk about how important this is just for the bottom line. So I'll turn it over to Joshua.
[Joshua Tallent] Yeah. Thanks, Chris. So yeah, it's really important for everyone to understand how important metadata is. My example here that I'll be talking about is book metadata. But I think Chris's broader point, too, is that, even in publishing, data is more than just the book data, more than just the metadata that you send out in your ONIX file. This is about all the data that you create and produce, because there are a lot of things that you have to take into consideration to keep up to date and to maintain properly. So I use this example all the time because I think it shows best how important metadata is for sales and discovery. There was a study by Nielsen, now NPD, back in December 2016. They looked at book sales over the course of an entire year. So we'll go on to the next slide. They looked at sales for books and found that 2.5 million different products or titles had sales within that one-year period.
But on the next slide, they found that of those 2.5 million titles, only 100,000 of them, about 4%, accounted for 86% of total sales. So that's a big deal, right? You want to be in that top 100,000; you want to have your book in that 4% of titles.
Then they started looking at the 100,000 titles, and said, so what actually, what kind of impact does metadata have on those titles? And so they looked at things like basic metadata. So just foundational stuff, title, author, you know, the, the basic territorial rights, the price, things like that. Titles that had basic metadata had 75% higher sales than titles that didn't have all of it. So if you're missing certain pieces of data, you're more likely to have fewer sales, because you're not able to sell your book as well.
On the next slide, for fiction, that was even higher. So for fiction titles, there were 170% higher sales for those titles that had basic product metadata compared to those that did not. And this doesn't even include things like cover images and book descriptions. Those were other things they looked at as well. So if you go to the next slide: products that had cover images had 51% higher sales versus those that didn't. Now what I find interesting is that, of the 100,000 titles that had 86% of total sales for the year of the study, some of them didn't have cover images. I don't understand how you could sell a book without a cover image, but it happens. So, you know, having cover images can have a big impact on sales. And then on the next slide: a book description, author bio, and reviews. Those are three very core pieces of long-form metadata that are sent out by book publishers, and SHOULD be sent out by publishers for their books.
And those titles that had all three of those items had 72% higher sales. And then, to speak to Chris's point as well, keywords – keywords at the time were not really being used very heavily. There was a very small number of publishers who were using them at all, and also only one retailer that was accepting them. And that's almost the case now, because Amazon is basically the only retailer of note that accepts keywords. But titles with keywords had 34% higher sales. Now, this 34% actually ties into a study that we did at Firebrand about a year after that, back in 2017 or so. We did a study on keywords and found that the publishers we were helping with keywords at the time had, on average, about a 32% increase in sales after applying keywords to their products. So, keywords do have a big impact on this.
What's important to remember as well, in addition to the study from Nielsen, is that there are other reasons why your metadata becomes important. You know, back in the early days of publishing, publishers just had some books that they sold, and the data wasn't as important because they weren't selling directly to consumers as much. But today, online sales have become the primary way of selling books, especially in the midst of the pandemic, when we saw a massive increase in online sales. So in just the last 9 or 10 years or so, online sales went from about 20% to about 50% of total sales. And this year, I imagine it's much, much, much higher because of people basically buying almost everything online.
Also, if you go to the next slide: in just Quarter 2 of 2020, as a result of the pandemic, Amazon saw across all of its product lines an increase of 48% in sales. And that is going to affect books as well, obviously. So, having these kinds of changes in how we sell books and changes in how we buy books makes a big difference in how you handle your metadata. If your book product data is not up to date, if it's not being kept up to date in that cycle that Chris talked about, then you're gonna have a hard time selling that book. It's going to be difficult to sell a book well if you don't have good data, as the Nielsen study shows, and as these changes in online versus offline sales are showing as well.
Now, publishers that focus on this kind of work, and publishers that really engage with their data, can have a huge impact on their sales. On the next slide, we have a company that we work with on our Eloquence On Alert product. And they saw a tenfold growth in their sales on Amazon over the course of seven years. They went from $95,000 in sales on Amazon to $933,000 a year in 2019.
And in 2019 alone they had a 42% increase in sales. That was directly a result of their engagement with their data and their data practices in dealing with Amazon and other retailers, and dealing with how their data was working on those products. What's really interesting is that this effect even had an impact for that company during the pandemic. So on the next slide: in April of this year, they had a 63% increase in sales year over year on Amazon.
And in May they had a 107% increase on Amazon year over year. And despite all the issues that publishers were having in the midst of a pandemic, this company actually was making more sales than they had the year before in both of those months. And they still had Amazon buying product from them and getting product delivered, despite the fact that a lot of other products were not being purchased by Amazon and put on the shelves, the digital shelves. So it's very important to remember that the quality of your data, how you work with your product data, all of this is going to have an impact on the sales that you make and on how discoverable your book is. And so, Chris, let's talk a little bit about how discoverability works online, and how website search can help with that.
[Christopher Hill] Sure, thanks, Joshua. So there are a couple of different things we need to speak about when we talk about discoverability and connecting our users with whatever that product is we're producing. Now, one area is website search, which is really YOUR search: the way you expose things, either internally to your own organization and personnel, or, often, externally through your own website to your customers. And there are a couple of angles here.
So one of the things I've noted is that, a lot of times, website search has a lot of limitations. That just comes from the nature of hosting a small site, as opposed to trying to crawl all of the internet. So here's an example I use: think about a pharmaceutical company that is launching a new drug. Let's say I've created a brand new drug, and it's to treat maybe a flu or respiratory virus, to be topical. So, if I've created this new drug, and I just throw it up on my website, what I'm probably going to find is that searches on that site for a lot of the relevant keywords and metadata are going to miss that new drug. And that's because the new drug probably has a very limited amount of information compared to all the older stuff that you've been publishing on the subject on your site.
So when, when a user comes in and uses your search box, you really have to take effort, and somebody has to make the effort to make sure that the newest things show up at the top of those searches. That's what you're trying to promote. And that's where you really take advantage of the capabilities of your search platform.
You may predefine some results for some searches. I want these pages to show up. Or you may actually build out a taxonomy. And that taxonomy will maybe rank, have some built-in ranking abilities when people search for words or phrases related to the taxonomy. And all those things are things that you can control. Now, larger organizations tend to put a lot more effort towards this. But even a small organization, you need to think about that as you publish things to your, your own website.
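To make the two levers just described a little more concrete (predefined results for certain searches, and a taxonomy with built-in ranking), here is a minimal Python sketch of a toy site search. Everything in it is an assumption for illustration: the `PINNED` and `TAXONOMY` tables, the boost value, the URLs, and the word-count scoring are invented, and real search platforms expose these features in their own ways.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    topics: frozenset = frozenset()  # taxonomy topics assigned by an editor


# Hypothetical editorial rule: results predefined by hand for a given query.
PINNED = {"flu treatment": ["/drugs/new-drug"]}
# Hypothetical taxonomy: a query word -> the topic that earns a ranking boost.
TAXONOMY = {"flu": "respiratory-virus"}
TOPIC_BOOST = 5  # arbitrary weight chosen for this sketch


def score(page: Page, query: str) -> int:
    """Count query-word occurrences, plus a boost for taxonomy matches."""
    words = query.lower().split()
    text_words = page.text.lower().split()
    base = sum(text_words.count(w) for w in words)
    boost = sum(TOPIC_BOOST for w in words if TAXONOMY.get(w) in page.topics)
    return base + boost


def site_search(query: str, pages: list) -> list:
    """Return pinned URLs first, then the remaining matches by score."""
    pinned = PINNED.get(query.lower().strip(), [])
    ranked = sorted(pages, key=lambda p: score(p, query), reverse=True)
    organic = [p.url for p in ranked
               if score(p, query) > 0 and p.url not in pinned]
    return pinned + organic


pages = [
    Page("/archive/flu-history", "flu flu flu season history of flu research"),
    Page("/drugs/new-drug", "new drug announcement",
         frozenset({"respiratory-virus"})),
]

print(site_search("flu", pages))
```

In this toy data the new drug page never mentions the word "flu", so plain word counting would miss it entirely; it surfaces only because an editor assigned it a taxonomy topic or pinned it to a query.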
So in one way, you're thinking in one term of, how do I get the newest things to bubble to the top on my website, and that's certainly a strategy you have to plan for. But really, we also have to deal with the reality that, for most websites, users are going to be arriving from Google. And that's just still the reality out there, that most users are not using the site search. And I'm one of those guilty parties, I find.
When I'm on Microsoft's website, and I'm trying to find some problem I'm having in SharePoint or something, and troubleshoot that, I usually get better results, ironically, when I go to Google and search, than if I go, actually, to the SharePoint site and search using their search box.
Again, the challenge is that those individual sites have a very limited context they're working in. And if you don't hit those contexts just right, things can be very hard to find, whereas Google takes a whole different approach. So let's talk a little bit about Google. Google really cares about showing people relevant information. That's what they're looking for. So when you type in some words, it's trying to infer what it is you're looking for. And out of all of the mass of the Internet, it's trying to rank things and find those things that you want and put those at the top.
As such, Google does something that's actually kind of opposite of what happens on your website searches. They really limit the credence they give to the keywords you provide. So you can take and very carefully craft all the different words you think somebody's going to use to search for your product, or your webpage, or whatever. Google's going to look at that. But really, what Google puts most emphasis on is not the keywords provided by you, but rather, it kind of builds its own set of keywords, if you will, internally, based on the content you've launched.
And the reason for that is – there's a lot of reasons for that, but just a very obvious one is that, you know, early, in the early days of website search, you would pack keywords, people would pack keywords in and throw a bunch of keywords on a product that had nothing to do with the product, just so that they could get high search rankings. And Google learned that game really quick. And that's why, one of the reasons why they really don't give a lot of credence to keywords. In fact, if you give too many keywords to Google, they'll start to knock you down because they will detect this keyword packing practice and they'll penalize you for it. So there's a lot of mystery here in Google that evolves over time. I mean they have full-time teams of people working on every aspect of their product 24/7, basically. So it's important to try to understand this.
Now, if we do try to discern what Google is doing, it uses sophisticated secret algorithms, and that's all I can tell you about Google, because most of what they do, they don't tell you directly, this is how we work. If you go to the next slide, basically, that means that we are forced to sort of reverse engineer some of the metadata and figure out what is it that's helping us with Google rankings, and what is it that isn't helping us? And that's sort of the reading of tea leaves that we all have to do.
Fortunately, there are some people who are much better at that than I am, and who have much more time dedicated to that task. So they start to discern this stuff. And if we go to the next slide, we'll get a little bit of the good news and the bad news for you as a publisher, as somebody who's trying to get your information or your products in front of your users. The good news is Google takes care of aligning search terms with content. So you don't have to provide Google with a lot of really great keywords to rank highly in their searches. The bad news is, they're taking care of that same thing, which means we have to figure out, what is it that they're taking into account as they try to align search terms with content.
And the best way that you have to control this is not to provide the keywords embedded into metadata, but to actually, when you're creating content about the product, or the service, or whatever it is you're publishing on the web, you have to make sure that you are associating your brands and the terms that are important to you to the Google terminology through the way you author the content. So, this involves understanding what users are searching for in Google, and making sure to use those terms in your descriptions or your, your actual published information that a user will see. By making those associations, Google will begin to associate the brands and other proprietary terms that they may not know about to the actual keywords that they do know about.
So, if I'm that drug company, I want to make sure that I publish a lot of information about my drug, and I want to make sure that it uses the terms that people are going to use when they search for, say, the condition that my drug treats or, or for other related drugs. I want to make sure I get those associations out there so that Google will crawl them in and build them into their database.
Now, there's a whole bunch of other stuff that Google does as far as how to rank content and decide when to serve you up in a result or not. And it's a big challenge to understand that; again, it's a hugely complex area. Fortunately, there are some tools that I've found out there that have helped even us internally at DCL in our Google rankings. One of my favorites, and this is an unsolicited ad, is from Search Engine Land. For years they have published what they call the Periodic Table of SEO Factors. All they're doing here is giving you a little box for each of the areas that you should consider when you're publishing content for Google to consume. And these are the ways that those particular aspects of your content will affect the search results.
So if you look, the things on the left in green tend to be the very highly important things. And then they have a whole color coding table that I won't go into.
But you can see over on the left, quality is very high. It's up there in the top left. One of the most important things is to make sure quality is high on the content you're publishing. Google has all sorts of artificial intelligence applied to this. They can tell if a machine wrote your content, or if you've just put together a bunch of random sentences in an attempt to game them, and they will knock you for that. But if you have high-quality content, you'll get a lot of results out of that.
So that's just one example of those squares. And then over on the right are some things to avoid. So you'll see stuffing, "Sf," is over on the right under "Toxins." That's really stuffing those keywords that I was talking about earlier. So this tool is written by experts who do that reverse engineering of Google, and then make it available to the public.
It's a very handy tool, and they generally update this every year. I don't think I've seen the 2020 version yet. And who knows with COVID what their publishing schedule is? But I find that this table is incredibly helpful in helping you understand how you can really improve things. Now, fortunately, or unfortunately, Amazon is a very different animal than Google, and fortunately, we have an expert here on Amazon. So I'm gonna let Joshua talk a little bit about how Amazon works, and how you have to change your strategy for that.
[Joshua Tallent] I don't know if I'll call myself an expert, but I try to fit the role the best I can.
[Christopher Hill] Compared to me. Compared to me.
[Joshua Tallent] Yeah. So, this, the search engine capability, you know, Amazon and Google will take a different approach to search and this is important for book publishers, especially, or anyone selling products on Amazon. Because how your website ranks in Google and the benefits, all the things that Chris has been talking about, making high-quality content, making sure you enrich your content with actual, you know, using taxonomies and keywords inside the long-form descriptive copy or the blog posts or whatever – that's important for Google. But Amazon doesn't care about that because Amazon search works differently and has even different motives than, than Google search does. So Google cares about show– showing people relevant information. Amazon – Next, next slide – cares about selling people products.
That's the primary goal of the Amazon search. So, if they want to sell you a product, they want to know what products you're looking for, what kinds of products you're looking for. So, their search engine is built around an understanding of the products themselves and the details about them, which actually means that they don't care about long-form descriptive copy; they care about keywords more than they care about that. So, on the next slide, there are basically four things (there's a little bit more than this, but basically four things) that a book publisher especially needs to be aware of, and needs to focus on, in order to have their products rank well within Amazon search. And those four things are the title of the book (which includes the series name and the subtitle as well), the author of the book, the categories the book has been placed into, and the keywords that have been assigned.
You can do a search on books on Amazon pretty easily, and find books based on keywords and find books based on titles and authors' names. But you'll also see that if you have descriptive copy, say, long-form descriptions and things like that, book reviews and stuff like that, that's not indexed in Amazon search. They don't care about that. They care about the keywords and the other details you can provide to them.
So, this brings up what we understand about, you know, how Amazon search works. This is where long term– long tail queries make the most sense and where long tail queries will be the key to Amazon's search algorithm. So on the next slide, I have an example here of what a long tail query is. So let's, let's say you're searching for books and we're gonna go with the romance idea, though you can do this for anything.
If you do a search for romance versus doing a search for historical romance in Florence, that second option there, historical romance in Florence, is considered a long tail query. You probably have done this yourself. You go on Amazon, or even do it on Google. You type in a couple of words into the search bar, and it doesn't quite give you exactly what you're looking for. So you add a couple of words to it. And you might, you know, kind of tweak what you're searching for, to try to find that product that you're really going for. And you may actually use the browsing or the categorization stuff on the left-hand side of the Amazon page to come up with a better idea: what are they using to filter, and how can I filter the search better? Amazon search takes advantage of these long tail queries to really customize and give you a detailed search results list.
So the number of results you get for a romance search, on the next slide, is a million, right? Million plus titles that will show up in the books category on Amazon for the word romance. But when you search for historical romance in Florence, there's only 90.
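The narrowing effect of a long tail query can be sketched with a toy index in Python. This is only an illustration under stated assumptions: it indexes the four fields named in the talk (title, author, categories, keywords), ignores the description, and requires every query word to match. Amazon's real algorithm is not public, and the books, the stopword list, and the function names here are invented.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    author: str
    categories: list
    keywords: list
    description: str = ""  # deliberately NOT indexed in this toy model


STOPWORDS = {"in", "the", "a", "of"}  # assumed list, for illustration only


def indexed_words(book: Book) -> set:
    """Collect lowercase words from the four indexed fields only."""
    fields = [book.title, book.author, *book.categories, *book.keywords]
    return {w for f in fields for w in f.lower().split()}


def search(query: str, books: list) -> list:
    """Return titles whose indexed fields contain every query word."""
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    return [b.title for b in books
            if all(t in indexed_words(b) for t in terms)]


# Invented catalog for the example.
books = [
    Book("The Tuscan Heir", "A. Writer",
         ["Romance", "Historical Romance"], ["florence", "renaissance italy"]),
    Book("Summer Fling", "B. Writer", ["Romance"], ["beach", "contemporary"],
         description="Not a historical romance in Florence."),
    Book("Highland Vows", "C. Writer",
         ["Romance", "Historical Romance"], ["scotland"]),
]

print(search("romance", books))
print(search("historical romance in Florence", books))
```

The second book mentions Florence only in its description, so it does not match the longer query in this model, which mirrors the point that descriptive copy is not what gets a title into these specific searches.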
So that's a really big deal. And think about it: do you want your book to show up in a search that's that specific? Probably so, if your book is a historical romance in Florence, you know. So long tail queries are really helpful, and that's how people tend to search on Amazon. Amazon themselves say that most people will find products by search. They don't find products by browsing through categories and looking at stuff like that. They really are finding things by searching for them. So this means that your categories and your keywords have a big impact. If we think about those four things, title, author, categories, and keywords, keywords are the key to making your book show up higher in searches. How you build that keywords list is going to be important too, because it's not just about having keywords, but about having relevant keywords that actually matter, and this goes back to Chris's earlier point. You know?
You have to think about the personas, the people who will be looking for your product. Who are those people? What do they feel like? What do they act like? What do they think about? And coming up with those personas can help you understand what keywords to put on your product. We actually have a publisher that we worked with on creating some keywords a couple of years ago, and they were trying to figure out how to make their books show up higher in searches. And in the process of doing a keyword analysis for their titles, they found that while their titles were written for teachers, specifically teachers of students who have autism or are on the autism spectrum, it was actually parents who were writing most of the reviews on the books. They didn't know that parents were a target market, but that's a market that they can really key into, so they wrote their keywords in such a way as to help them target that market better, and they saw an increase in sales as a result.
The goal is to understand your market, and to then use that understanding to help you build up keywords. And there's a lot of practical advice you can find online. There's a really good working paper that the Book Industry Study Group created (I was on the committee that created it) about how to create keywords. There's a lot of practical advice in it: how to go and find the right keywords, what tools to use to do it. I highly recommend that you go check that out. The key really is to think about that persona, to think about the people who are writing the reviews of your books and buying your books. What are they going to search for? And you can do that pretty easily. You go and look at the reviews and say, OK, what are the words, the phrases that people are using to describe my content?
If that's a pretty consistent thing, then use that as a keyword. Amazon limits you, if you're sending out ONIX into Amazon's system into Vendor Central, they'll limit you to 210 bytes of information in that keywords field. You can give them more, but they'll cut, they'll truncate it at 210 bytes or 210 characters, which means that the order your keywords are sent in actually matters, and the best ones need to be at the top.
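The review-mining step described above (looking for the words and phrases reviewers use consistently) can be roughed out in a few lines. This is only a minimal sketch: the function name `common_phrases` and its parameters are hypothetical, the reviews are assumed to already be collected as plain strings, and it simply counts adjacent word pairs rather than doing any real language analysis.

```python
import re
from collections import Counter

def common_phrases(reviews, n=2, top=5):
    """Count the most frequent n-word phrases across a list of review texts.

    Hypothetical helper: 'reviews' is assumed to be a list of plain strings
    gathered by hand or by some other process.
    """
    counts = Counter()
    for text in reviews:
        # Lowercase and keep only letter/apostrophe runs as words.
        words = re.findall(r"[a-z']+", text.lower())
        # Slide an n-word window across the review.
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts.most_common(top)
```

In practice the top results tend to be dominated by filler like "of the", so a person still needs to read the list and pick out the phrases that actually describe the book or its audience.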
But it also means you can play around with that. So over time, you might play around with which keywords. So on the next screen, you know, the, the keywords you send for, say that Japanese cooking cookbook are gonna be really interesting keywords for a cookbook, but you'll also notice there's some other things in here that you may not actually, may not actually see, or understand as a keyword for a cookbook. So, for example, right toward, toward the middle, and I don't have it highlighted, but Marianne pointed this out when we were chatting the other day. There's one called Married a Japanese.
You know, if you married someone who was Japanese, you might want to be able to cook things that are traditional for them. And you might want to understand a little bit more about the culture. And so having that as a keyword might actually help you understand that this book will help you in some ways. So again, this is all about understanding the personas and the people who are buying your books and, and trying to target that metadata toward them.
So, beyond the keyword metadata, there's also the question of what actually gets used, what actually is visible to the consumer. And this is where you get into a whole bunch of problems, because you, as a publisher, will send out a lot of data to different retail stores and, and most of them will ignore most of it. And that's really what it comes down to. So, unfortunately, if you look at, basically anyone besides Amazon, and Barnes and Noble, all that great, rich, descriptive content that you're creating and sending out is going to be ignored, and not going to be shown.
But, it still means that you need to send it, as much as you can, especially Amazon, or Barnes and Noble, will show things like the, the table of contents or the excerpt of the book or all the book reviews. Those kinds of things can be helpful. And I mentioned before, that Nielsen study showed that when you have book reviews, and an author bio, and, and a book description, those are important fields to have for people to know that they want to buy your book. I also would include an excerpt in that as well because I think book excerpts are really important just for, for sales purposes. So, this is, this is really important deta– detail, really important data to consider, and just realize that even if you send it out, it may not always be picked up, but it is going to be important for the, for those sites that do have it for people to be able to see it and know that it's going to work well.
So, beyond the data that you're sending out at the beginning of your process and beyond all the kinda, let me, you know, come up with these ideas when I'm cutting, putting the book out, a big problem that a lot of publishers have is not paying attention to data after the book is published. So after the first six months, or after the first three months, you've gone onto other projects, things, other things are important. And it may be that, you know, some titles, they suffer from that. And some titles will just suffer because they were only beneficial, they were only really going to hit a high sales point at the very beginning, and they never were going to have a lot of sales later. The long tail in publishing is getting longer.
As we extend out publishing processes, we're able to do print on demand. We're able to, you know, e-books are available for basically, forever, you know, books that were published five years ago, 10 years ago, that in some other situations would never be available again, they'd go out of print and we'd never see them again, are still available, and actually, the number of titles that are being sold, the percentage of titles that are being sold that are backlist is going up significantly. So NPD just reported last year, on the next slide, that in 2019 the, the number of titles that – on the next slide there.
There we go. So the number of titles, or percentage of titles that are backlist titles have gone up from 50, what is it, 54%? No, 57% in 2015 to 63% last year and in 2020, due probably to the pandemic, the number's now at 69% according to NPD, just in the first part of 2020. So that's a percentage of all book sales. Basically, 70% of all book sales are backlist titles, something that was published more than a year ago, and that means that your data and the product data about those books is even more important than you might expect.
So the problem that most publishers run into, though, is that as that long tail gets longer, your data has to be updated. And you don't know that it has to be updated, or you don't think about it. So the detail that you give to a, to a retailer, the retail, the metadata you give out to a retailer could be overwritten by any number of places. So on the next slide, you know, it's a, it's a constant process to ensure that your data being sent out to trading partners, whether that's a retailer, whether that's a distributor, a wholesaler, all of those other places can also update your data in other places. So you may send data out, and then, you know, the next day, an older distributor sends out some data that was out of date, or a data aggregator like Bowker's sends out their data, and their data isn't correct.
An author can go and make a change on an Author Central page that has some impact on your book, on the book online. So there's a lot of things that can have an impact. So this is where it's important as a publisher to think about not only the quality of the data and keeping that data up to date, but also re-sending that data on a regular basis. Making sure that you do a data feed on a consistent basis to your trading partners, to make sure they have the most up-to-date feed, the most up-to-date information. And then, thinking about, OK, so what's the process? How do I, how do I update data? Where, how often should I update it? What should I be focused on when I am updating my data? So, a couple of things to think about when you're doing updates to your metadata. One is review quotes and endorsements. It's pretty often that you'll see, OK, so there's a good review for this book that came out more recently. Let me go and put that on, that, on the product data page, and, and send that out.
Keywords are an obvious example. Something to update. You know, if you can do keyword updates every six months on a title, especially titles that are doing well, tweak the keywords, move some things around. Put some keywords that are lower down on your list higher up, and you know, look at reviews and come up with another set of keywords based on reviews and the audience analysis and the personas, maybe again, figure out whether it's actually those people that you thought were going to buy the book that actually did buy the book. You know, a year after you've published the book you're gonna see a lot more about who actually is buying it by reading those reviews than you did before you published it.
BISAC subjects can be updated and, you know, if you're finding your book is in the wrong category, or, kind of needs to be tweaked into a different category then you can do that, book excerpts are great to put out there. You can even change that up if you wanted to put the first chapter at the beginning and maybe, you know, throw in, maybe the preface or something later, there's a lot of ways to kind of change that data. Any change to your metadata is going to help the sites recognize that there's something different, and Amazon especially will pay attention to data changes and will help cycle your book higher in ranking. So you want to be making positive data changes to your titles, so it's important to keep it, keep up to date with that data.
And then author bios, you know authors change, things happen. They, you know, they change jobs. They get a different career in some other way. They write another book that's really important. There are great examples of this where an author wrote, their debut novel and it wasn't really that popular or that powerful, you know, sold a thousand copies. They, they didn't make a lot of money but then they suddenly wrote, you know, they wrote the second book, and all of a sudden, you know, they're a best-selling author of whatever: you know, "Eat, Pray, Love," or whatever, you know. And that, those, those examples are pretty broad. And you can, you can take that, that information and go back to those older titles, update that author bio information, update the keywords, and help drive more people to those older products that people might be interested in, because they, they read the newer book.
So there's a lot of metadata you can update. And I think in addition to updating data, it's important to watch your titles consistently. It's important to watch them in the marketplace, to catch issues, to catch opportunities that may arise. For example, watching sales rank of your book can help you catch titles that suddenly jump up in visibility and sales. Amazon sales rank can change constantly. And over time, the sales rank on a product will have kind of a trend that follows this kinda sawtooth pattern where you'll see a big jump in the sales rank, and then kind of a slow decline, and then a big jump, and a slow decline. Sales rank on Amazon is a factor of not only the sales of the book, but also of the visibility and the number of people going in and looking at the book. So the number of times the product page is loaded. So if you see this kind of uptick and drop-down, uptick and drop down, this is really common.
The real goal here, though, is to, instead of looking at the sawtooth is, on the next slide, to look at the overall trend. So the overall trend of your book is where you're going to see that the sawtooth will tighten up at certain points, and that's an, that's a reason to focus more attention on that product. When you see that sawtooth pattern start to tighten, it means that the sales are doing better. It means things are happening. It means the effect of whatever marketing you're running is, is working really well, so it's great to focus on that and to know when that tightens that I can really focus my attention to my advertising on that product.
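One way to separate the overall trend from the sawtooth is a simple rolling window over daily sales-rank readings. The sketch below is an assumption-laden illustration, not a description of any tool mentioned in the webinar: `rolling_stats` is a hypothetical helper, the ranks are assumed to be one reading per day, and "tightening" is approximated as a shrinking gap between the highest and lowest rank in the window. Keep in mind that a lower rank number is better on Amazon.

```python
def rolling_stats(ranks, window=7):
    """Return (mean, spread) pairs for each full trailing window of ranks.

    mean   -> the overall trend, with the day-to-day sawtooth smoothed out
    spread -> max minus min in the window; a shrinking spread is used here
              as a rough stand-in for the sawtooth "tightening up"
    """
    stats = []
    for i in range(window - 1, len(ranks)):
        chunk = ranks[i - window + 1 : i + 1]
        stats.append((sum(chunk) / window, max(chunk) - min(chunk)))
    return stats
```

Plotting the mean next to promotion dates or calendar events gives a quick way to check whether a campaign coincided with a change in the trend.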
And then on the next slide, in addition to that, it's also very helpful to see the sales rank in conjunction with calendar events or marketing campaigns or things that are happening in the real world. If I ran a Black Friday special on a, on some of my products and dropped the price or something, I want to know, did that have an impact? I want to, you know, looking here, I can see that, you know, Black Friday, the sales were kind of, you know, they kind of picked up compared to how they had been before that. And so that you can see the tightening of that, up and down sawtooth, but then after Christmas it went back to the way it was before. So, my book was selling pretty well during Christmas, was I running advertising for that product during that time? Was there an author that was actively engaged online or something else going on? What was happening? And maybe take that into consideration.
And then, on the next page, another example of this is, you know, looking at things from kind of a long term, year to year view. This is a book, a children's book about Saint Patrick's Day. Obviously, you know, sells pretty well at Saint Patrick's Day, but if you look closely at this, the sales for this book and the sales rank started to jump and started to really tighten up in January. So people were starting to look at this book and think about buying it, or buying it directly. And that happened in January, not in, you know, February or March. That may actually have an impact on your marketing for this, and other products that are similar in the future. So watching, watching the products over time, watching what's happening, that's important too. It's not just about the data that you're sending out, it's about the data that you're bringing back in and being able to react positively to opportunities and to issues as they occur in the marketplace.
All right. So, Chris.
[Christopher Hill] Sure. So, here's a few key takeaways. I think that you might consider your approach for metadata and search should really bring in a product management mindset, really treat that as a distinct thing that, that needs its own attention, its own focus, its own expertise, somebody thinking about it. Even if you're a small organization, and you have very few people, somebody should be tasked with at least periodically reviewing, are our users connecting with our products effectively?
And looking for those trends and those patterns in the data that's available to you from Amazon, or even going through those Amazon reviews. Somebody has to have the job to say, Hey, I'm gonna look at our product page on Amazon, I'm gonna see how it's being presented, is it still being presented correctly? What are, what are the reviews saying? Are there a bunch of parents making reviews for a product that I thought I was selling to teachers, but it turned out I really wasn't, and then how does that impact the product itself?
Obviously, you know, the more limited your resources, the more limited you are there. Understanding, you really have to understand that metadata and search are critical for sales especially when you're in an environment like Amazon. But then also, understanding how other environments, like, Wal-Mart, I noticed, used very little of the data that was being provided to them. That sounds like they basically, just throw it up there. And there's not a lot of sophistication with how your, your product is presented there. So that limits how much you can do there.
Reverse engineer your understanding of Google and Amazon. So even if, you know, I, I search for products a lot on Amazon, I'll search for books on Amazon. But I also often enter those queries into Google. Now, Google will try to understand when I'm trying to buy something. And it can do shopping results. And Google does change the way it treats shopping results from everything else. But I don't always indicate to Google that I'm looking for shopping results. And I'm not always looking for shopping results. Sometimes, I'm looking for good books about a subject. Where is your content being presented? Are you in those Google results, and what can you do, whether it's by blogs or interviews or other published things you can do on the web to really raise those relevancies for Google?
Then finally, you have to consider this an ongoing practice. This is not a one-shot deal where you throw metadata on the content and throw it out to the world. If that content is really going to have that evergreen sort of long tail existence, you really need to think about revisiting this periodically and continuing to improve on the relevance of, of your metadata and your content given the context that your products live in.
[Joshua Tallent] And Chris, I'll, on that point, it's something that I think is important to think about. I get the question, How often should I update my data? How often should I go out and do that? Because of that long tail in publishing, and because of, you know, our backlist titles are selling so much more than frontlist titles, it is important to do that on a regular basis. Some publishers will do it yearly, some will do it more often for books that are selling well. So if you see sales for books that are, you know, that have been perennial bestsellers, you want to make sure those are updated more often. I talked to a publisher a while back, and they said what they do is that they look at the, at the last three years of books that were published in the current month of the year.
So, I look back at the last three years of books that were published in the month of March, and I go back and look at all those titles, and I, you know, update them where necessary, and kind of tweak their data. That at least gets you one, one time a year where you're updating data for every product that you can. At least for the last three years, and then, again, for those bestsellers, you might do it every six months, or every three months. And if you're watching those titles, you'll start to see patterns that evolve. And as you're making changes or doing marketing, or doing advertising of some kind, it's gonna, it's gonna be able to, you know, drive that change in your data as well, but at least once a year it's a good idea to just go back and tweak things and double-check stuff.
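The "current month, last three years" routine that publisher described can be expressed as a small filter over a title list. This is a rough sketch only: `titles_due_for_review` is a hypothetical name, the titles are assumed to be available as (name, publication date) pairs, and "last three years" is read here as books published one to three calendar years ago, which is one interpretation among several.

```python
from datetime import date

def titles_due_for_review(titles, today, years_back=3):
    """Pick titles published in the current calendar month, 1..years_back years ago.

    'titles' is assumed to be an iterable of (name, publication_date) pairs.
    """
    due = []
    for name, pub_date in titles:
        age_in_years = today.year - pub_date.year
        if pub_date.month == today.month and 1 <= age_in_years <= years_back:
            due.append(name)
    return due
```

Run once a month, this touches every title from the last three years once a year; bestsellers would still need their own shorter cycle, such as every three or six months.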
[Christopher Hill] Sounds good.
[Marianne Calilhanna] Thank you both. We do have some questions, so we have about 10 minutes left. I will invite anyone, if you have a question, please feel free to submit it. We're going to try to cover these; if we don't, we will be in touch with you after this webinar. So, Joshua, could you give a little bit more info on that information regarding Amazon's move toward the 250-byte limit?
This person has been testing companies' keywords and found that they are still using between, between around 500 to 800 characters.
[Joshua Tallent] Yeah, so, Amazon has changed their story on this a lot over the last couple of years. Essentially, what they said a couple of years ago, was, We're going to limit the number of keywords that will actually index from what you provide to 250 bytes, 250 characters, essentially.
That was supposedly adopted in Vendor Central, but we had the same reports of publishers who were testing it and saying, Look, you know, we're, we're still seeing keywords in the 500 plus or 800 plus character range still being used by Amazon. They don't seem to have effected any of the changes there. So, OK, so it's not really being effective, but we also had some publishers we were working with who were like, Yeah, it's actually, it definitely is cutting off at that 250-byte limit. It was also, in some cases, we saw book to book. It wasn't necessarily across all of the products, the products of a single publisher. They would have some that were, you know, accepting larger numbers and some that weren't. I think the problem is that Amazon is not consistent in what they announced to the public, or to their vendors versus what they actually are doing on the backend. So there's some aspect of that.
Last year, or earlier this year, actually, they, they changed that number to 210 bytes, they've made that very clear, we're changing it to 210 bytes, we're ignoring spaces and we're ignoring delimiters; if you have semicolons delimiting those keywords, those are ignored altogether. They kind of mash all the keywords into one, one grouping. What's really interesting about that is that, again, despite that, I still heard the same reports of publishers who are seeing that it doesn't actually affect anything. I think your mileage may vary. Your best option and one of the things I would recommend you do is still put the best keywords at the front of the list. The good thing is that Amazon has said that even with the 210-byte limit, they won't cut off. They will truncate what they index, but they won't penalize you for having more. So you can give them up to 2000 characters' worth of keywords if you want.
They'll truncate it themselves to that 210-byte limit and supposedly ignore the rest of it, but even if they don't, you get the benefits. So, extend them as much as you can. But just make sure that your keywords, the best keywords are at the front of the list, so that they are indexed for sure.
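To see which keywords would land inside the 210-byte window as described here, a publisher could run a quick check over a priority-ordered list. What follows is a minimal sketch under stated assumptions: `split_at_byte_budget` is a hypothetical helper, bytes are assumed to mean UTF-8 bytes, spaces are not counted (per the description above), delimiters never enter the count because the keywords are already split into a list, and a keyword that only partly fits is treated as falling outside the window. Amazon's actual behavior may differ, as the reports above suggest.

```python
def split_at_byte_budget(keywords, budget=210):
    """Split a priority-ordered keyword list at an assumed byte budget.

    Returns (within_budget, beyond_budget). Spaces are excluded from the
    count; each keyword is measured in UTF-8 bytes.
    """
    within = []
    beyond = []
    used = 0
    for index, keyword in enumerate(keywords):
        cost = len(keyword.replace(" ", "").encode("utf-8"))
        if used + cost > budget:
            # Everything from here on is past the assumed cut-off point.
            beyond = list(keywords[index:])
            break
        within.append(keyword)
        used += cost
    return within, beyond
```

If a high-value keyword shows up in the second list, that is a signal to move it closer to the front rather than to delete the keywords after it.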
[Marianne Calilhanna] Yeah, and that was, there is a follow-up question about that order of keywords to Amazon. So is it the order you are entering in the search box or the order they are listed in the keywords field?
[Joshua Tallent] So the order that you provide to Amazon is the order that they consider the most important. So whatever's at the top of your keywords list in your ONIX file is what they're gonna consider the most important keywords. And then those can be mixed and matched in whatever way; they have a very complex machine learning algorithm that will take different words from different places in your keyword list, and use it in different ways to return results, the book on the results for any type of search. So it won't matter if you have the right phrase. You don't even have to repeat the same word, if you have Japanese cooking and Japanese recipes, you can change. Just say, Japanese cooking recipes. Amazon will return your book for results for Japanese cooking, and for Japanese recipes, without the word Japanese being duplicated in your keywords list.
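Since repeated words reportedly add nothing, a keyword list can be compacted by dropping any word that has already appeared earlier in the list, which frees up room inside the byte limit. This is an illustrative sketch only: `drop_repeated_words` is a hypothetical helper, and it rests on the description above of how Amazon recombines words, not on any documented interface.

```python
def drop_repeated_words(phrases):
    """Remove words already seen earlier in the list, keeping the original order.

    Comparison is case-insensitive; phrases left empty are dropped.
    """
    seen = set()
    compacted = []
    for phrase in phrases:
        kept = []
        for word in phrase.split():
            key = word.lower()
            if key not in seen:
                seen.add(key)
                kept.append(word)
        if kept:
            compacted.append(" ".join(kept))
    return compacted
```

With the example from the answer above, the phrases "Japanese cooking" and "Japanese recipes" reduce to "Japanese cooking" and "recipes", which is the same set of words as "Japanese cooking recipes".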
[Marianne Calilhanna] That was the next question about, um, your example that the "Japanese" was repeated, so it sounds like you would not recommend repeating "Japanese" in front of all those keywords.
[Joshua Tallent] No. Amazon says you don't have to, because of the way their search works, it will actually return results, regardless.
[Marianne Calilhanna] And what do you know about how Google uses metadata from Amazon? Do they use index descriptions, keywords, BISACs, can you speak to that?
[Joshua Tallent] So, from what I've seen, Amazon does try to scrape Amaz– Google does try to scrape Amazon. The results are not nearly as good as you get on the Amazon search itself. So, you know, you'll see Amazon results show up, but usually, it's things like the title of the book. It's going to be maybe, some book description, if it's there.
The, but the real key for doing any kind of search is to focus on the Amazon search. I would recommend for publishers, or even people who sell other products, to focus on your website as being the key place where people get information about your book, and then point them to Amazon. So try to make your website the hub. Like Chris was talking about earlier, try to get your SEO to get your website, for that book, that book product page, to be the top result on Google. If you can do that, then it won't matter if they're going to end up at Google, at Amazon, or Barnes and Noble, or Indigo in Canada or Waterstones in the UK. You can link out to those locations from your product page, but you also have the opportunity to capture e-mail addresses and to capture their attention and say, hey, we're the publisher of this book, we have these other books that you might like, right? So you want that. You want your website to be the key, if at all possible. So work on that SEO. Ignore whether Google will actually index Amazon properly, but focus on your own website and its SEO instead.
[Marianne Calilhanna] And someone just asked for keywords. They'd been told previously that there is no need to repeat words that are included in the title or description. Is that indeed correct?
[Joshua Tallent] Well, the title, for sure. You don't need to repeat words that are in the title. In the description, if you're doing what Chris was mentioning earlier about using taxonomies to think about how you write your descriptive copy, then those are potentially keywords that you will want to include in your keywords list, because, remember, Amazon does not index your book description for search. So if you have special words that you're using in your marketing copy that are something you're trying to use and to get people to think about this book, or you're using that in your marketing as well, and your advertising, if you're saying this is, you know, so I don't know, I just want to think of a word that would be useful in that case, but you want to, you want to put that in the keywords as well, so that you're not relying on some sort of indexing to happen in the description, which doesn't happen.
[Marianne Calilhanna] Thank you. How important is granular scene level metadata?
So this is a question coming from someone at a media company. They have a video content library, and they're trying to understand what degree of granularity they need for scene-level metadata for, you know, a video content library.
[Joshua Tallent] So, Chris, would you, can you speak to that? Do you have much experience? I'm assuming you have quite a bit of experience on that side.
[Christopher Hill] So, well, really, when you're publishing things, for Google, at least, again, Google is looking for content. Now, Google will try to, for video content specifically, will take into account some of the keywords, but they have their technology now to transcribe. So if you look at, you know, YouTube or other ones, you can get transcriptions. They do a lot of that on the backend for the video to try to determine what at least parts of videos are.
And, again, with keywords, we really don't know how much value they have for video content. It's still a mystery. But I know that, again, Google has the same problem with video that they deal with in other things, which is people will try to put keywords in to rank high, not necessarily to connect a user to a video they actually want to watch. So to fight that again, Google does their own transcriptions, they do things like that. One tip for video that I think could be really useful to, again, try to develop content that can help video, is to actually transcribe the video yourself.
So if you create a transcription of a video, and you publish that alongside the video on the page, that page for the video will then get full advantage of a quality transcription. Because, again, if you rely on Google's transcription, they're gonna miss things like your product name. They're gonna screw that up. You ever look at one of the automated transcriptions? Those keywords are gonna get butchered. So, by publishing a transcript that you've created, that you know is quality, you can really raise the relevance of your videos. So for video publishing, I think that's, that's a good starting place. Again, you can put those keywords in, but it's really questionable about how much you can really impact your rankings just through keywords alone.
[Joshua Tallent] I would add on to that too that I think it's very important for anyone who has chunkable content to consider the future of how chunkable content might actually be beneficial. Publishing used to be serialized in many ways, you know, some of the best novels ever written were serialized in magazines before they were ever published in a book form. And we're going back to serialization. And there's always changes in technologies, always changes and new products, new, uh, new things that people come out with, new ideas. I would recommend for anyone who publishes any kind of content, to try to think about that chunkable content metadata from the beginning, because at some point in the future, somebody's going to come out with some app that everybody loves. And it's going to be doing things that you don't expect. It's going to be, you know, having some impact in the market, and you're, like, all of a sudden, I have to create metadata for all of this stuff that I didn't have before. So, if you think about it, now, you'll be proactive. You can have a better impact in the long term, and be prepared for those future changes.
[Christopher Hill] Yeah, it's taking that product management strategy, you know, really treating this as an activity that, that has importance in and of itself, in an organization. You know, too, I still walk in to many organizations, and you'll find that, you know, the editor has been asked to write keywords. Well, what does the editor know about your marketing strategy? It's probably not the right guy. And if they're good, they may go out and proactively look at how other products are marketed or do marketing activities, because they're trying to do a good job. But really, you need to make that a first-level assignment to somebody, somebody who has skill, who, who is tasked with those roles.
[Marianne Calilhanna] Well, thank you, we've come to the top of the hour. I'd like to thank everyone who's taken this time out of their day to join us. The DCL Learning Series comprises webinars. We also have a monthly newsletter and a blog, so I invite you to go to our website and subscribe. You can also access many other webinars related to content structure, XML, metadata from the on-demand webinar section at our website. We hope to see you at future webinars. Have a great day, and this concludes today's broadcast.