Building the Homosaurus: An International LGBTQ Linked Data Vocabulary

This post was written by Walt Walker, Head Cataloging Librarian at the William H. Hannon Library and member of the Homosaurus Editorial Board. We have featured Walt’s previous work here and here.

The Homosaurus is an international linked data vocabulary of Lesbian, Gay, Bisexual, Transgender, and Queer (LGBTQ) terms. It gives libraries, archives, museums, and other institutions a ready-to-use thesaurus of LGBTQ+ index terms that can enhance the discoverability of their LGBTQ+ resources. The absence of such a thesaurus was seen as a major reason that gay and lesbian materials, and information about LGBTQ+ life and existence, were so rarely indexed. Standard indexing and classification systems do not offer a sufficiently detailed and up-to-date vocabulary to retrieve the wealth of resources available.

The Homosaurus began in 1997 as a Dutch and English gay and lesbian thesaurus primarily used by IHLIA, an LGBT archives in Amsterdam. Eventually, more terms relating to bisexuality, trans*, gender, and intersex concepts were added, but not methodically. About 15 years later, the thesaurus was transformed into a second edition by adding more inclusive terms and making the structure more hierarchical. K.J. Rawson, director of the Digital Transgender Archive (DTA) at the College of the Holy Cross in Worcester, Massachusetts, turned the second edition of the Homosaurus into an online linked data vocabulary. This allows libraries, archives, and digital collections to describe their LGBTQ collections using the same vocabulary, and then to link the index terms (or subject headings) they use to the most recently updated form in the online vocabulary.

In June 2016, I was selected to present my paper “Overcoming the Barriers: Improving Access to LGBTQ+ Content in Collections” at the “Without Borders” ALMS (Archives, Libraries, Museums, & Special Collections) Conference in London, England. There, I met Jack van der Wel from IHLIA and attended his presentation on the Homosaurus. Afterwards, we talked, and he invited me to join the international editorial board he was setting up for the Homosaurus so that we could work together to improve the vocabulary.

Starting in August 2016, seven board members met online through Skype or BlueJeans every 1-2 months. We decided to create an abridged version containing only LGBTQI terms, as there are many non-LGBTQI terms in the original Homosaurus. The abridged version would offer more terms for indexing LGBTQI resources than general thesauri such as Library of Congress Subject Headings (LCSH), and could be used as a supplemental thesaurus in libraries, archives, and digital collections. At first we mostly discussed which terms in the original Homosaurus were not LGBTQ-specific and should be discarded. Eventually we moved on to adding many new terms, including many transgender terms from the Digital Transgender Archive. Many of the terms we discussed related to religion, ethnicity, sexual practices, feminism, family, kinship, social media, law, and different kinds of identity (sexual, gender, butch/femme, etc.).

Over the past 3 years, a few members left the Editorial Board and others joined. We finished our revisions and turned our Word documents into the online linked data vocabulary that you can find at http://homosaurus.org/. Since it launched online on May 1, 2019, we have been publicizing the Homosaurus to increase its use. The Homosaurus has been approved for use as a thesaurus source for subject headings in MARC records, under the source code “homoit”. We have also created some publicity materials (such as the logo on this page), and have added two new members to help us revise and maintain the linked data vocabulary. There is now a page on the website for submitting new terms and suggesting further revisions.
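
For catalogers wondering what the “homoit” source code looks like in practice, here is a minimal sketch in Python. It assumes the usual MARC convention for subject headings from a named thesaurus: a 650 field with second indicator 7 and the source code in subfield $2. The function name `homosaurus_650`, the placeholder heading, and the `example.org` URI are hypothetical and are not taken from the Homosaurus itself; the output is a display-style string, not a binary MARC record.

```python
def homosaurus_650(term, uri=None):
    """Return a display-style MARC 650 field for a Homosaurus heading.

    Hypothetical helper for illustration only; it formats text and does
    not validate the term against the vocabulary.
    """
    # Subfield $a carries the heading itself.
    subfields = [("a", term)]
    # Optional subfield $0 carries a link to the term in the vocabulary.
    if uri:
        subfields.append(("0", uri))
    # Subfield $2 names the source thesaurus; "homoit" is the Homosaurus code.
    subfields.append(("2", "homoit"))
    body = " ".join(f"${code} {value}" for code, value in subfields)
    # "_7": first indicator blank (shown as "_"), second indicator 7,
    # meaning the source of the heading is given in subfield $2.
    return f"650 _7 {body}"


# Placeholder heading, not a real Homosaurus term.
print(homosaurus_650("Placeholder term"))
# Placeholder URI, only to show where a term link would go.
print(homosaurus_650("Placeholder term", uri="http://example.org/term"))
```

A real record would use an actual heading and its link from the online vocabulary in place of the placeholders.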