What the Publishing Industry Can Learn from AO3’s Tagging System

Before the likes of E. L. James and Anna Todd, it was hard to imagine fan fiction authors having any sort of influence over the publishing industry. Love them or hate them, the two authors have brought fan fiction into mainstream conversations about the legitimacy (and legality) of transformative works. Notably, Archive of Our Own, a website that hosts transformative works such as fan fiction, received a Hugo Award in 2019. That same year, the website (commonly called AO3) reached the milestone of five million fan works housed on its domain. That’s no small feat for an open source, collaborative project that started in 2007 as a means for fans to defend and preserve the legitimacy of fan fiction. The way AO3 organizes its content has rightfully earned the website a lot of attention. So what can the publishing industry learn from AO3’s tagging infrastructure?

Gretchen McCulloch investigates the ins and outs of tagging on AO3 in an article for Wired. For publishing in the digital age, getting your content heard amongst all the noise is often an arduous task, and one that has no perfect formula. When AO3 began, tagging was still a relatively new feature on the internet. One of AO3’s known predecessors, LiveJournal (livejournal.com), used a relatively basic tagging system to connect users with others who had similar interests. Many platforms use what is called “laissez-faire” tagging, where anything goes. On its own, tagging this way appears to be a rather simple tool: tag your content with whatever is relevant, and the audience you want to reach will find it. Right?

On the contrary, tags that might seem obvious to some are perhaps more opaque to others. For example, think of a show like Parks and Recreation. Imagine you were to look for content relating to that show or its fandom. How do you go about navigating that? Do you search for #parksandrecreation, or #parksandrec? What about its characters? Do you include first names and last? What about side or one-off characters? Some users’ approach to this problem is to simply tag anything and everything. But doing so can be time-consuming, and there’s also no guarantee that you’ll be able to think of every possible related tag.

Some archives have tried to be more like the tried-and-true Dewey Decimal System, with tagging systems that adhere to a stricter set of categories. You will find these systems in many professional databases, and they help cut down on the confusion caused by multiple tags that achieve the same end. However, they have drawbacks as well, such as the effort it takes to build and memorize them. They also tend to fall out of date, since a fixed set of categories is slow to absorb new terms.

AO3 eventually built its tagging system into something much more useful (and arguably more sophisticated) that takes the best qualities of both of the previous methods: tag wrangling. Of course, this isn’t by any means an automated process. AO3 relies on numerous volunteers, who can join during recruitment periods and claim fandoms, to carry out this tagging process. Wrangling involves looking up new tags and finding ways to connect them with applicable existing tags. This means that those who are publishing content online don’t have to know all the different variations of a tag to get their content seen. Likewise, those searching for that content don’t have to comb through every applicable tag to find the diamond in the rough.

Over the years, AO3’s tag wranglers have worked through millions of tags, having wrangled about 2.7 million as of 2019. Wranglers dedicate hours each week to combing through new tags and deciding whether each should be treated as an independent tag, a synonym of another tag, or a subset of an existing tag. AO3’s use of human volunteers over machines for tag wrangling might have begun as a consequence of the lack of suitable technology in 2007. However, it is clear that the humanity behind the process is what makes it so intuitive.
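To make those three decisions concrete, here is a minimal sketch in Python of how wrangled tags might be stored and searched. It is a toy model built on assumptions, not AO3’s actual implementation: the `TagIndex` class, its method names, and the example stories are all hypothetical, and the tags borrow the Parks and Recreation example from earlier (with Leslie Knope standing in for a character tag).

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TagIndex:
    """Toy model of wrangled tags. Hypothetical; not AO3's real data model."""
    synonyms: Dict[str, str] = field(default_factory=dict)  # variant -> canonical tag
    parents: Dict[str, str] = field(default_factory=dict)   # subtag -> broader tag
    works: Dict[str, Set[str]] = field(default_factory=dict)  # canonical tag -> titles

    @staticmethod
    def normalize(tag: str) -> str:
        # Ignore surrounding spaces, a leading '#', and letter case.
        return tag.strip().lstrip("#").casefold()

    def canonical(self, tag: str) -> str:
        # An unknown tag is treated as its own independent tag.
        key = self.normalize(tag)
        return self.synonyms.get(key, key)

    def add_synonym(self, variant: str, canonical_tag: str) -> None:
        self.synonyms[self.normalize(variant)] = self.normalize(canonical_tag)

    def add_subtag(self, child: str, parent: str) -> None:
        self.parents[self.canonical(child)] = self.canonical(parent)

    def tag_work(self, title: str, tag: str) -> None:
        self.works.setdefault(self.canonical(tag), set()).add(title)

    def search(self, tag: str) -> Set[str]:
        # Return works tagged with the target, a synonym of it, or any of its subtags.
        target = self.canonical(tag)
        results: Set[str] = set()
        for tagged, titles in self.works.items():
            current, seen = tagged, set()
            # Walk up the chain of broader tags, guarding against cycles.
            while current is not None and current not in seen:
                if current == target:
                    results |= titles
                    break
                seen.add(current)
                current = self.parents.get(current)
        return results


index = TagIndex()
# Wrangler decisions: two synonyms and one subtag.
index.add_synonym("#parksandrec", "Parks and Recreation")
index.add_synonym("#parksandrecreation", "Parks and Recreation")
index.add_subtag("Leslie Knope", "Parks and Recreation")

# Authors tag however they like.
index.tag_work("Story A", "#parksandrec")
index.tag_work("Story B", "Leslie Knope")
index.tag_work("Story C", "#parksandrecreation")

print(sorted(index.search("Parks and Recreation")))  # ['Story A', 'Story B', 'Story C']
print(sorted(index.search("Leslie Knope")))          # ['Story B']
```

The point of the sketch is that the author and the reader never need to agree on spelling. The wrangler’s decisions live in the two lookup tables, so a search for the show finds stories filed under either hashtag as well as under the character tag.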

In the publishing world, it might not be as necessary to have so many human volunteers supplying the kind of context that fandom can’t do without. However, it stands to reason that AI has drawbacks of its own, and those looking to automate tag wrangling could run into problems with misinterpreted tags. At the very least, it would be interesting to see human tag wranglers working alongside AI to correct any mistakes. Certainly, AO3 is a great example of what publishing can be in terms of organizing and cataloging content in a user-friendly, intuitive way.
