Half of the world’s languages are dying fast – how you can save yours

- July 4, 2017 in open-education

Languages are a gateway to knowledge. How can digital tools be used to help native language speakers access and contribute knowledge? In this blog, Subhashish Panigrahi shows how endangered languages can be documented and preserved using open standards and tools.

The world’s knowledge that has been accumulated and coded over ages in different languages is valuable for learning about others’ cultures, traditions, and everything about their lives. But not every language is privileged to be a language of knowledge and governance. Almost half of the 6909 living languages of the world will vanish within a century. The most linguistically diverse places, like Papua New Guinea, are also the most dangerous places for languages. Every two weeks a language dies, and with it a wealth of knowledge is lost forever. In my home country, India, alone there are more than 780 languages. The rate at which languages are dying here is extremely high: over 220 languages from India have died in the last 50 years, and 197 languages from the country are identified as endangered by UNESCO.

Word cloud depicting several Indian languages in their native scripts

When these languages die, all the knowledge preserved in them dies with them. Languages that lack tools for everyone to access and contribute knowledge often go out of use. India, for example, is home to the highest number of visually impaired and illiterate people in the entire world: more than 15 million Indians are visually impaired and 30% are illiterate. But few digital accessibility tools exist for either web or mobile, even though there are about 450-465 million internet users and 60% of them are mobile users. In fact, the accessibility tools that do exist for most Indian languages are proprietary and unaffordable.

There have been some efforts by the Indian government, such as through the Central Institute of Indian Languages (CIIL), to grow the 22 officially recognized languages and some indigenous languages. Founded in 1969, CIIL has been working to deepen research on Indian languages, and a program called “Protection and Preservation of Endangered Languages of India” was introduced in 2014 to help CIIL begin several projects specifically for the conservation of endangered languages.

Only 10-30% of India’s population can understand English, which is predominantly the language of the Internet. A recent report published by Google and KPMG states that more than 70% of India’s Internet users trust content in their native language over English. The lack of native language content and the lack of electronic accessibility tools therefore play an important part in stopping a large number of people from accessing information and contributing to the knowledge commons.

When confronted with a problem of this magnitude, there are a few vital things that must be done to preserve and grow dying languages. Creating audio-visual documentation of some of the most important socio-cultural aspects of the language, such as storytelling, folk literature, oral culture and history, is a start. When done by native language speakers, and accompanied by annotations in a widely spoken language such as English or Hindi, it is one way of creating digital resources in a language. These resources can be used to create content and linguistic tools to grow the languages’ reach. Sadly, there is little focus from the central government on many of these languages, but there are some efforts from several organisations to document native languages. There is something that every individual who speaks a less-spoken language, or is in contact with a native speaker of an endangered or indigenous language, can do.

Languages that are dying need digital activism to grow educational and accessibility tools. That can happen when more public and open repositories like dictionaries, pronunciation libraries, and audio-visual content are created.

Wiki Weekend Tirana 2016 (photo: Anxhelo Lushka)

However, not many people know how to contribute in a form that can be used by others to grow resources in a language. Especially in India, contributing to a language is largely skewed towards the notion of producing and promoting literature. But in a country where more than 30% of the population is illiterate and a large number of languages are spoken only (without a written counterpart), it is important that language content be predominantly audio-visual and not just text-based. More importantly, there is a need for openness so that the whole idea of growing languages is not jeopardized by proprietary methods and standards.

There are plenty of ways anyone can contribute to documenting a language, depending on their own skill set.

Every language has a wealth of oral literature, which is the most crucial thing to document for a dying language. Cultural practices such as folk storytelling, folk songs, and other narratives around cooking, local festival celebrations, performing art forms and so on can be documented in audio-visual form. Thanks to cheaper smartphones and an ocean of free and open source software, anyone can now record audio, take pictures and shoot videos in really good quality without spending anything on gear. There are open toolkits that aggregate open source tools, educational resources and sample datasets that one can modify and use for one’s own language.

A home recording setup for the Kathabhidhana project (photo: Subhashish Panigrahi)

In the age of AI and IoT, one can indeed build resources that make one’s language more user friendly. As explained earlier, most of the screen reader software that visually impaired or illiterate people would use does not exist, for lack of good-quality text-to-speech engines. Creating pronunciation libraries of words in a language can help a lot in building both text-to-speech and speech-to-text engines, which can eventually improve screen readers and other electronic accessibility solutions. Cross-language open source tools like LinguaLibre, Kathabhidhana, and Pronuncify help record large numbers of pronunciations. Similarly, for languages with an alphabet, educational resources for language learning can be created with open source tools like Poly and OpenWords. Building these resources might not transform the state of many endangered languages quickly, but it will certainly help to gradually improve the way many people access knowledge in their language.

The work of groundbreaking initiatives like the Global Language Hotspots, by the Living Tongues Institute for Endangered Languages and National Geographic, can be used to start language documentation projects. But it is always recommended to make the work output available with open standards so that others can build solutions on top of existing interventions. However, little information is available about the actual outcomes of government-led activities for endangered language documentation, and especially about whether there is any open access to the published works. The “People’s Linguistic Survey of India” (PLSI), a non-government-led survey, was conducted during 2012-13 under the leadership of Ganesh Devy. A few years back, Gregory Anderson, founder of Living Tongues, and Prof. K. David Harrison, associate professor at Swarthmore College in Pennsylvania, US, discovered a hidden language called Koro spoken in Arunachal Pradesh.

In 2014, Marie Wilcox, the last living speaker of Wukchumni, a North American language, created a dictionary to keep her language alive. Imagine where these languages would have ended up if Anderson, Harrison and Wilcox had not taken these baby steps back then.

Open source in everyday life: How we celebrated the Software Freedom Day in Bengaluru

- October 26, 2016 in free software, india, OK India, Open Software, south east asia

Free and open source software (FOSS) enthusiasts celebrated Software Freedom Day (SFD) on September 17 all across the world. This year, a small group of six of us gathered to celebrate SFD in the Indian city of Bengaluru. The group consisted of open source contributors from communities such as Mozilla, Wikimedia, MediaWiki and OpenStreetMap, as well as users of FOSS solutions. Each participant shared their own story of how they got connected with FOSS and what role it plays in their day-to-day life. The stories ranged from a father who has been trying to introduce his young son to open source while moving back and forth between proprietary and open source software as his job demands, to an OpenStreetMap contributor who truly believes that large-scale contributions can make open source software as robust as proprietary software, and even better because of the freedom that lies in it. All of those who gathered agreed that FOSS has widened their freedom in choosing how they use, share and remix the software they use.

When Software Freedom Day was started in 2004, only 12 teams from different places joined. By 2010 that had grown to a whopping 1000 across the world. About the aim of the celebration, SFD’s official website says, “Our goal in this celebration is to educate the worldwide public about the benefits of using high-quality FOSS in education, in government, at home, and in business – in short, everywhere! The non-profit organization Software Freedom International coordinates SFD at a global level, providing support, giveaways and a point of collaboration, but volunteer teams around the world organize the local SFD events to impact their communities.”

SFD 2016 in Bengaluru (photo: Nima Lama, CC BY-SA 4.0)

The participants in our group bounced both technical and philosophical questions off each other to gauge the actual usage of FOSS in real life and how far we are moving towards adopting openness as a society. All the participants also agreed that there is a significant disconnect in communicating widely about the work that many Indian FOSS and other free knowledge communities are doing. So they planned to meet more regularly at events organized by any of the FOSS communities and to connect with more people through social media and chat groups, so that these interactions grow into an annual event that brings all open communities under one roof.

What are FOSS, Free Software, Open Source, and FLOSS?

Free and open source software (FOSS or F/OSS) and Free/Libre and Open-Source Software (FLOSS) are umbrella terms that cover both free software and open source software. Free software, a term adopted by well-known software freedom advocate Richard Stallman in 1983, has many names: libre software, freedom-respecting software, and software libre are some of them. As defined by the Free Software Foundation, one of the early advocates of software freedom, free software allows users not just to use the software with complete freedom, but also to study, modify, and distribute the software and any adapted versions, in both commercial and non-commercial form. Whether the software can be distributed commercially or non-commercially, however, depends on the particular license it is released under. Creative Commons offers a broad range of free licenses that one can choose for software-related documentation and any other creative work. Similarly, there are several different open licenses for software and for many other works related to software development.

“Open Source” was coined as an alternative to free software in 1998 by the educational advocacy organization Open Source Initiative. Open source software is created collaboratively and made available with its source code, and it gives the user the right to study, change, and distribute the software to anyone and for any purpose.

Supported by several global organizations like Google, Canonical, Free Software Foundation, Joomla, Creative Commons and Linux Journal, Software Freedom Day draws its inspiration from the philosophy developed by people like Richard Stallman, who argues that free software is about freedom, not necessarily about being free of cost, and that it gives users liberty from [proprietary software developers’] unjust power. SFD encourages everyone to gather in their own cities (map of places where SFD was organized this year), educate people around them about free software, promote it on social media (with the hashtag #SFD2016 this year), and even hack with free software, organize hackathons, run free software installation camps, or get creative by flying a drone running free software!


From South Asia, there were 13 celebratory events in India, 8 in Nepal, 1 in Bangladesh and 4 in Sri Lanka. South Asian countries have seen the adoption of both free software and open source software at the individual and organizational levels and by the government. The Free Software Movement of India was founded in Bengaluru, India in 2010 as a national coalition of several regional chapters working to promote and grow the free software movement in India. The Indian government has launched an open data portal, initiated a new policy to adopt open source software, and asked vendors to include open source software applications while making requests for proposals. Similarly, several free and open source communities and organizations operating from the subcontinent also promote free and open source software: Mozilla India, Wikimedia India, Centre for Internet and Society and Open Knowledge India in India; Mozilla Bangladesh, Wikimedia Bangladesh, Bangladesh Open Source Network and Open Knowledge Bangladesh in Bangladesh; Mozilla Nepal, Wikimedians of Nepal and Open Knowledge Nepal in Nepal; Wikimedia Community User Group Pakistan in Pakistan; and Lanka Software Foundation in Sri Lanka.

We promote open source and open Web technologies in the country. We are open to associate/work with existing open source or other community-run, public benefit organizations.
“Internet By The People, Internet For The People” (from Mozilla India wiki)

Mohammad Jahangir Alam, a lecturer from Southern University Bangladesh, argues in a research paper that the use of open source software can help the government save the enormous amounts of money spent on purchasing proprietary software: “A large sum of money of government can be saved if the government uses open source software in different IT sectors of government offices and others sectors, Because the government is providing computers to all educational institute from school to university level and they are using proprietary software. For this reason, the government is to expend a significant amount of many for buying proprietary software to run the computers. Another one is government paying a significant amount of money to the different vendors for buying different types of software to implement e-Governance project. So, the Government can use open source software for implanting projects to minimize the cost of the projects.”

Check out more ideas for celebrating Software Freedom Day, and a few more here, while planning for next year’s Software Freedom Day in your city.

OpenGLAM at Wikimania 2014

- August 27, 2014 in Events/Workshops, Featured, GLAM-Wiki

GLAM in India: 10 tips for successful GLAM projects

- May 22, 2014 in Featured

