Learn about the Catalogue of Endangered Languages, the research project which provides the language information you see on the ELP site.
Of the 7,000 or so languages spoken on earth today, nearly half are in danger of being silenced within the next few generations. This is a humanitarian crisis of unprecedented scale and pace: as far as linguists and other scholars know, there has never been a time in human history when language diversity has faced such rapid destruction.
To measure current rates of language endangerment, or even to estimate the number and status of the world’s endangered languages, we need a reliable source of information about the state of languages around the world. The Catalogue of Endangered Languages (ELCat) fulfills that need by providing information about language endangerment and vitality which is trusted and used by researchers, community organizations, policymakers, students, and other audiences worldwide.
The Catalogue is a resource which provides comprehensive, reliable, and up-to-date information about the status of the world’s endangered languages, from the mid-20th century to the present. It provides a freely accessible, global-scale database of language vitality information, which is actively updated. It also gathers information about language vitality from multiple sources, and shares information from as many reliable people and publications as possible, rather than presenting only one source of information.
The Catalogue aims to inform and support scholars, revitalization practitioners, language advocates, educators, policymakers, and the general public by sharing all of this information freely. We believe that sharing information about language vitality and endangerment can help people reverse this global crisis and create positive, proactive responses to language endangerment.
The Catalogue of Endangered Languages (ELCat) is developed, published, and maintained by the University of Hawaiʻi at Mānoa (UHM) Department of Linguistics, under the direction of Dr. Gary Holton. It is published online through the website of the Endangered Languages Project (ELP), which has partnered with UHM and the Catalogue since the site’s launch in 2012.
For more details about the Catalogue, the language information contained in it, and how that information can be used in language documentation and revitalization, research, and policy, see the book Cataloguing the World’s Endangered Languages (2018).
It is important to make clear what the Catalogue does and does not contain, and what it does and does not do. The way language vitality is represented here can be very useful, but it is not the only way of representing or understanding language vitality.
The Catalogue is a database of quantitative information (numbers and measurements). It shares information like estimated number of speakers, degree of intergenerational transmission (whether children are learning a language), and domains of use (what parts of life a language is used in). The Catalogue does not represent the qualitative aspects of language vitality and endangerment (experiences, stories, perceptions, emotions).
Quantitative information, like what’s contained in the Catalogue, can be very useful for looking at broad global patterns of language vitality and endangerment. Using numbers and measurements makes it possible to compare the vitality of thousands of languages across every continent, in a general way. However, numbers cannot represent or describe the importance, meanings, or experiences of language loss or reclamation.
There is a lot of good writing and scholarship about the use of numbers to measure language vitality and endangerment. This body of work addresses both the ways that quantitative research can create incomplete or harmful understandings of language endangerment, and the ways that numbers and statistics can be used to advocate for language revitalization and rights.
The information in the Catalogue is gathered from a wide range of places. The Catalogue project does not conduct firsthand research or fieldwork. Much of the information comes from published sources: books, journal articles, censuses, conference talks, and so on. The Catalogue research team regularly reviews new publications, and adds the information from these publications to the database.
We also gather information directly from individuals and organizations. Through our well-developed global networks, the Catalogue team and ELP are in touch with community organizations, academic institutions, NGOs, revitalization programs, scholars, tribal entities, and others who have firsthand knowledge about a language’s vitality. We invite our network to share information with us about the languages and language communities they are working with.
All of the information in the Catalogue is reviewed by the International Board of Directors, a group of academic linguists who work with and specialize in the languages of specific regions or language families. The public cannot edit the language information on the ELP site directly; they can, however, suggest new information, which will be reviewed by the International Board of Directors. This is to ensure that the information you find in the Catalogue meets widely used standards of academic rigor, and is as reliable as possible for research purposes.
The Catalogue aims to gather and share all available, reliable information about the vitality of each endangered language. This means our language information might look a little different from that of other databases: for each language, we feel it is important to present information from many different sources. This allows users to compare different vitality estimates over time, or see disagreements between sources. The Catalogue does not decide which description is “right”, but it does highlight one source of information (usually the most recent or thorough one) as the first set of information that appears on a language page. To see where information came from, or to compare multiple sources, you can consult the “Language Information by Source” section on each language page.
The Catalogue project also prioritizes ethical and responsible practices in gathering and sharing information. We aim to share only information that is appropriate to disseminate to the public, and our work is an ongoing conversation between the research team, the communities whose languages are being discussed, and the people who use the Catalogue.
If you wish to share more recent or accurate information about the vitality of a particular language, we invite you to contact us.
The system used by the Catalogue of Endangered Languages to assess levels of language vitality is called the Language Endangerment Index (LEI).
The LEI was developed in 2011 by the Catalogue research team at the University of Hawaiʻi at Mānoa. In the early development of the Catalogue project, the research team considered using existing assessment tools such as the EGIDS or UNESCO scale, but for various reasons, felt that these tools did not meet the needs of the Catalogue project (see Lee & Van Way 2016). The purpose of developing the LEI was to be able to better understand and represent patterns of language endangerment at the global level; to better reflect real-world factors in language shift; and to have a more theoretically nuanced tool for measuring language vitality than what was previously available.
There are two layers of measurement which happen within the LEI: the vitality rating, and the “certainty” rating, which refers to the amount of information that was available to create the vitality rating. These are described below.
Vitality rating: The LEI assesses four factors of language vitality:
- Total number of speakers/signers
- Trends in speaker numbers (whether they are increasing, stable, or decreasing)
- Intergenerational transmission (whether the language is being passed between generations)
- Domains of use (which parts of life the language is used in)
Each of these four factors is assigned a numeric rating from 0-5 (0 being least endangered, 5 being most endangered), based on the information presented by a specific source (a person, book, article, website, etc.).
These four numeric ratings are then combined into a weighted average, the overall “vitality rating.” In this measurement system, intergenerational transmission is weighted twice as heavily as each of the three other factors, since language transmission is considered within the field of linguistics to be a crucial component of language vitality.
Certainty rating: If no information is available about a given vitality factor, it is not scored. For example, if a source provides no information about how many people use a language, the vitality rating based on that source will not include a score for speaker/signer numbers.
The factors which are scored determine the “certainty” level of each vitality rating. For example, a label of “100% certain” does not mean that the rating is infallible; it simply means that information was available to score all of the LEI factors. Similarly, a label of “20% certain” means that information was available for only one factor, and that this factor was not intergenerational transmission. Low certainty ratings indicate that the vitality rating is based on sparse information, often because that information is not available to the Catalogue research team.
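The arithmetic behind the two ratings can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the Catalogue’s own implementation: it assumes the vitality rating is a weighted average kept on the same 0 to 5 scale, and that the certainty percentage is the share of the total weight for which a score was available (which matches the “20% certain” and “100% certain” examples above). The function name `lei_rating` and the factor keys are hypothetical; the exact published procedure is described in Lee & Van Way (2016) and may differ in detail.

```python
from typing import Dict, Optional, Tuple

# Assumed weights: intergenerational transmission counts double,
# the three other factors count once each (total weight = 5).
WEIGHTS = {
    "speaker_numbers": 1,
    "speaker_trends": 1,
    "transmission": 2,
    "domains_of_use": 1,
}
MAX_SCORE = 5  # each factor is scored 0 (least endangered) to 5 (most)


def lei_rating(scores: Dict[str, Optional[int]]) -> Tuple[Optional[float], float]:
    """Return (weighted vitality rating, certainty in percent).

    Factors that a source gives no information about are passed as None
    (or simply left out) and are not scored.
    """
    scored = {name: value for name, value in scores.items() if value is not None}
    for name, value in scored.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown factor: {name}")
        if not 0 <= value <= MAX_SCORE:
            raise ValueError(f"{name} must be between 0 and {MAX_SCORE}")

    used_weight = sum(WEIGHTS[name] for name in scored)
    total_weight = sum(WEIGHTS.values())
    certainty = 100 * used_weight / total_weight

    if used_weight == 0:
        # No factor could be scored, so there is no vitality rating.
        return None, certainty

    weighted_sum = sum(WEIGHTS[name] * value for name, value in scored.items())
    return weighted_sum / used_weight, certainty


# All four factors available: full certainty.
full = lei_rating({
    "speaker_numbers": 4,
    "speaker_trends": 3,
    "transmission": 5,
    "domains_of_use": 3,
})

# Only speaker numbers available: one non-transmission factor.
sparse = lei_rating({"speaker_numbers": 4})
```

In this sketch a source that reports only speaker numbers yields a certainty of 20%, while a source that reports only intergenerational transmission would yield 40%, because that factor carries double weight.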
The vitality rating presented for each language is not meant to be the final word. These scores are provided for practical purposes, to give a quick and rough visual indication of a language’s endangerment status. A more in-depth description of the LEI is available in this article:
Lee, N. H., & Van Way, J. (2016). Assessing levels of endangerment in the Catalogue of Endangered Languages (ELCat) using the Language Endangerment Index (LEI). Language in Society, 45(2), 271–292. http://doi.org/10.1017/S0047404515000962
The goals and basic structure of the Catalogue were established in a 2009 workshop which convened roughly 50 academic linguists from around the world, supported by a US National Science Foundation grant titled “Collaborative Research: Endangered Languages Information and Infrastructure Project”. The Catalogue itself was developed during the initial project phase, 2010–2013, under the direction of Dr. Lyle Campbell (University of Hawaiʻi at Mānoa) and Dr. Anthony Aristar and Dr. Helen Aristar-Dry (LINGUIST List at Eastern Michigan University).
In 2012, the first version of the Catalogue was released to the public as part of the new Endangered Languages Project website.
In 2014, the project moved into its second phase, and all research activity and the hosting of the ELP website moved to the University of Hawaiʻi at Mānoa. In 2016, Dr. Campbell retired, and Dr. Gary Holton became Director of the Catalogue.
The information in the Catalogue has been gathered and entered by a team of graduate student researchers from the University of Hawaiʻi at Mānoa and Eastern Michigan University over many years, and reviewed by the International Board of Directors. Since 2011, the research team has included:
- Carolina Aragon
- Liam Archbold
- Russell Barlow
- Anna Belew
- Amy Brunett
- Yen-ling Chen
- Jacob Collard
- Uliana (Kazagasheva) Donahue
- Cole Flottman
- Shirley Gabber
- Katie Butler Gao
- Bryn Hauk
- Raina Heaton
- Joelle Kirtley
- Eve Okura Koller
- Nala Huiying Lee
- Clemens Mayer
- Lwin Moe
- Josiah Murphy
- Colleen O'Brien
- Henry Osborne
- Melody Ann Ross
- Sean Simpson
- Kaori Ueki
- Gregory Vondiziano
- John Van Way
- Stephanie Walla
- Olivia Waring
- Stephanie (Locke) Witkowski
- Brent Woo
- Kristen (Dunkinson) Ikeda Yoza
The information in the Catalogue is reviewed and approved by the Catalogue’s International Board of Directors, a group of academic linguists who specialize in the languages of specific areas or language families. The members of the International Board of Directors are:
- Dr. Gary Holton (Catalogue Director), University of Hawaiʻi at Mānoa
- Dr. Willem Adelaar, Leiden University (Regional Director for South America)
- Dr. Greg Anderson, Living Tongues Institute for Endangered Languages (Regional Director for South Asia)
- Dr. Habib Borjian, Columbia University (Regional Director for the Near East)
- Dr. David Bradley, La Trobe University (Regional Director for East Asia)
- Dr. Matthias Brenzinger, University of Cape Town (Regional Director for Africa)
- Dr. Lyle Campbell, University of Hawaiʻi at Mānoa (Regional Director for the Americas)
- Dr. Verónica Grondona, Eastern Michigan University
- Tracey Herbert, First Peoples’ Cultural Council
- Dr. Brian Joseph, The Ohio State University (Regional Director for Europe)
- Dr. Mary Linn, Smithsonian Institution
- Dr. Bill Palmer, University of Newcastle (Regional Director for the Pacific)
- Dr. Keren Rice, University of Toronto (Regional Director for North America)
- Dr. David Solnit (Regional Director for East and Southeast Asia)
For general information from the Catalogue of Endangered Languages, including this “About” page, cite as:
Catalogue of Endangered Languages. 2024. University of Hawaiʻi at Mānoa. http://www.endangeredlanguages.com
For general information from a language entry (e.g. language names, classification, or vitality status), cite:
"LANGUAGE NAME." Catalogue of Endangered Languages. 2025. University of Hawaiʻi at Mānoa. DATE OF ACCESS. < FULL URL OF SPECIFIC LANGUAGE PAGE >
For example:
"Xipaya." Catalogue of Endangered Languages. 2025. University of Hawaiʻi at Mānoa. Aug. 9, 2025. http://www.endangeredlanguages.com/lang/1001
Almost all information in the Catalogue of Endangered Languages includes a citation of the original source which provided this data (e.g. journal article, book, personal communication, etc.). You can find citation information at the top of the "Language Information by Source" box on each language page; if you wish to reproduce data such as speaker numbers, you may cite the original source provided there.
Dialects vs. Languages
The distinction between a “dialect” and a “language” is highly contested, not only by linguists but by language communities, policymakers, translators, educators, and anyone else who works with languages. Some scholars have questioned whether these terms are even useful to contemporary understandings of language; alternative frameworks such as “languoids” are increasingly used in linguistics and related fields, and the more general term “language variety” has long been a useful way to avoid making this distinction.
Within the academic discipline of linguistics, "languages" and "dialects" are defined in a variety of different ways. The most common linguistic criterion for distinguishing a language from a dialect is mutual intelligibility. If people speak somewhat differently but understand each other reasonably well, then they are considered to speak dialects of the same language. This criterion is difficult to apply in practice, since in many cases speakers will have some previous exposure to, or knowledge of, the other variety. However, this linguistic criterion is not the only one used to distinguish languages from dialects. Sometimes the decision rests on political and social factors. For example, speakers of Swedish and Norwegian understand each other reasonably well, but for political reasons Swedish and Norwegian are considered separate languages.
It is often difficult to make a clear judgment about whether a variety is a “language” or a “dialect”: there are many language varieties that are believed to be independent languages by some scholars, but are considered dialects of a single language by others, even when applying the criterion of mutual intelligibility.
The Catalogue does not claim to determine the difference between dialects and languages; however, this issue does need to be addressed in some way, as the original Catalogue design specifies a catalogue of endangered languages. For that reason, varieties which are unambiguously dialects of another language are not included as separate entries in the Catalogue. For the purposes of the Catalogue of Endangered Languages, since it is an academic project, priority is given to the linguistic criterion of mutual intelligibility of languages, although social and political factors are also important.
In cases where it is contested whether language varieties are dialects of one language, as opposed to closely related languages, the Catalogue includes separate entries for these varieties. These entries generally include comments from the Catalogue’s International Board of Directors, briefly summarizing the academic debate about the variety. As more is learned about these language varieties, it may become possible to gain greater understanding about the congruencies and distinctions between them. Their status as a “language” or “dialect” may simply remain contested, depending on who is talking about them. The differences between “dialects” and “languages” are nuanced and complicated, and dependent on wider issues like political, historical, economic, and sociocultural context.
As the Catalogue continues to grow and evolve, we will periodically revisit our thinking and approach to languages, dialects, varieties, languoids, etc. We welcome your comments and suggestions at: feedback@endangeredlanguages.com
Dormant and Awakening Languages
The Catalogue of Endangered Languages is meant to include all languages which are currently endangered, as well as languages that have lost all their speakers/signers within roughly the past 60 years (since about 1960). This time frame was chosen as a manageable scope for the research project, and reflects the increased availability of language vitality information in the published literature since the 1960s. Languages which currently do not have any (known) speakers/signers are included in the Catalogue under the endangerment category “Dormant,” or sleeping. This terminology is preferred and used by many Indigenous linguists and language advocates, and is increasingly used within academic linguistics, as it correctly indicates that a language with no living speakers/signers can still be “awakened,” or brought back into use.
The Catalogue of Endangered Languages does not use the terms “extinct,” “dead,” “dying,” or other similar terms. These labels are loaded and carry meanings that can be weaponized against language communities and revitalization efforts. Terms such as “language death” and “extinction” incorrectly imply that languages without living speakers/signers cannot be revived (since death is permanent), that their destruction is natural or inevitable (since all living things die), or that the peoples and communities who used these languages are somehow “gone” or “extinct”. These terms can be profoundly harmful and discouraging to communities engaged in, or wishing to engage in, language revitalization.
Languages which have had no known speakers/signers for hundreds or thousands of years are not included in the Catalogue, as this resource is meant to provide information on the current state of language endangerment worldwide. Detailed information about language shift and loss prior to the mid-20th century is beyond the scope of this project, although it is clear that vast numbers of languages have been silenced over the last 500 years.
Languages that are believed to have become dormant since 1960 are included in the Catalogue for a number of reasons. First, it is useful to look at language vitality and loss within the recent past (60 years or so) in order to understand the current state of languages around the world. In addition, there have been numerous cases where a language was believed to have no more living speakers/signers, but then a living speaker/signer came forward years or decades later. This is another reason the Catalogue includes languages which have recently become dormant: there may still be living speakers/signers.
Languages that went through a period of having no known speakers/signers, but which are now being revived and spoken/signed again, are listed in the Catalogue as “Awakening.” This status means that there is a concerted effort underway to bring a language back into use: the language is “waking up” through its speakers/signers, after a period of dormancy.
Finally, we must note that many languages included in the Catalogue have little or no recent information available to researchers about their vitality. The newest vitality information available about some languages may be years, or decades, old. This means that some languages listed as “endangered” may now in fact be dormant, and that some languages listed as “dormant” may now be awakening. Communities around the world are increasingly reclaiming and reviving their languages, and an ever-growing number of languages are being awakened. We hope to be able to update many language entries from the “dormant” to the “awakening” category in the years ahead.
As an online resource, the Catalogue of Endangered Languages is meant to be continuously updated and expanded. The languages in this resource, like all human communities and practices, are constantly changing; our goal is to make new and updated information available as soon as it is known to the Catalogue team.
We do our very best to keep this resource up to date, by constantly verifying and adding new information from publications, news stories, community surveys, and communications from language communities and scholars.
However, the Catalogue is constrained by its access to information about the world’s languages: there is not always an easy way for the research team to learn about changes in a language’s vitality unless someone disseminates that information to the wider world. For that reason, you may see information which is several years old, incomplete, or otherwise not optimal.
If you see information that is missing or outdated, please contact us! We rely on the people who are knowledgeable about a language to help us provide good information. It is important to us to ensure that languages are represented as accurately as possible in the Catalogue, and we are grateful to all who have shared their knowledge to improve the database over the years.
You can see the information in the Catalogue by visiting the language map and search page. If you'd like to download all of the information as a .csv file, visit the "Download the Data" link.
