‘Because English is not enough’
For the internet to really matter to Indians, it must speak to them in their own language. But this is easier said than done in a country with 22 official languages and over 1,500 mother tongues. Ram Prakash Hanumanthappa, a 34-year-old tech-entrepreneur from Bangalore, was among the first Indians to predict this problem back in his days at IIT Madras and to do something about it. A decade ago, he started work on Quillpad, a technology that would transform whatever you typed on your standard English keyboard into the familiar curlicues of Malayalam or Gujarati. You could now blog in flawless Tamil, quote Amrita Pritam in her original untranslated beauty, and email Diwali wishes to your family in a language close to your heart and theirs. "When I made Quillpad, I wondered why we shouldn't be able to chat with friends and family in a language we've grown up speaking. No one believed then that I could make money with it. Today, as the internet goes to Tier-II towns, Indian language content is starting to matter more than ever," says Hanumanthappa.
"Because English is not enough" – that's the tagline of Quillpad. It's the flagship product of Tachyon Technologies, a 10-man company operating out of a single room in JP Nagar, south Bangalore. Occupying either side of a long table here are young geeks with one eye trained on the Olympics and the other on green boards covered with exponential equations. Although they themselves speak the language of maths ("I multiply matrices for a living," says Hanumanthappa), they have helped script a brighter future for Indic languages on the web. Over 335 million words have so far been typed on Quillpad, and that is not counting the licences they have sold to companies like Yahoo and to the UID project, which uses it for enrolling India's millions.
An annual Quillpad licence for a single language costs several lakh rupees, yet many media websites and content companies have preferred it to the free Google transliteration software available online. The reason is accuracy. Compared with other existing solutions (such as Indic language keyboards that map 50-plus sounds to the 26 letters of the English alphabet, and dictionary-based transliteration tools that are limited to the words they already contain), Quillpad is simple, intuitive and almost always spot on. "The challenge is that Indian languages are phonetically richer. There are three types of na in Tamil, for instance. You cannot expect the user to type in notations every time, so how do you make sure the computer chooses the right na?" asks Hanumanthappa.
He built a mathematical model based on machine learning algorithms that could discern patterns in the usage of characters and learn to avoid errors stemming from phonetic differences, untyped characters like the halanth in Hindi, and superfluous characters like double a's. "Take Indian names and their English transliterations. When we read someone's name — Prashant, for instance — we know the context and intuitively pronounce the 't' and the a's right. We don't worry about disambiguation. This is what our algorithms attempt to do," he explains. Essentially, Hanumanthappa took the corpus of words available online in Unicode for each language and fed them into the statistical model: every time an English language letter is typed, a decision tree is first formed, listing the numerous phonetic probabilities, from which one is picked based on history of usage. All this happens instantaneously, of course. "We also correctly interpret words borrowed from English," Hanumanthappa says.
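The idea of picking one candidate "based on history of usage" can be illustrated with a toy example. The sketch below is not Quillpad's model, and it uses a plain frequency table rather than a decision tree. It only shows how counts gathered from past text can settle which of Tamil's three na characters an English "n" should become. The class name, the "start"/"middle"/"end" context labels and the training counts are all made up for illustration.

```python
from collections import Counter, defaultdict


class CandidatePicker:
    """Pick a native-script character for an ambiguous Roman letter from usage counts."""

    def __init__(self):
        # by_context[(roman, context)][native] = times seen in that context
        self.by_context = defaultdict(Counter)
        # overall[roman][native] = times seen anywhere (fallback when context is new)
        self.overall = defaultdict(Counter)

    def train(self, observations):
        for roman, context, native in observations:
            self.by_context[(roman, context)][native] += 1
            self.overall[roman][native] += 1

    def candidates(self, roman, context):
        """Return (native, probability) pairs, most likely first."""
        counts = self.by_context.get((roman, context)) or self.overall.get(roman)
        if not counts:
            return []
        total = sum(counts.values())
        return [(native, n / total) for native, n in counts.most_common()]

    def pick(self, roman, context):
        ranked = self.candidates(roman, context)
        # With no history at all, leave the typed letter unchanged.
        return ranked[0][0] if ranked else roman


# Hypothetical observations: (typed letter, position in word, Tamil character).
# The counts are invented; a real system would derive them from a large corpus.
OBSERVATIONS = [
    ("n", "start", "ந"), ("n", "start", "ந"), ("n", "start", "ந"),
    ("n", "middle", "ன"), ("n", "middle", "ன"), ("n", "middle", "ண"),
    ("n", "end", "ன"), ("n", "end", "ன"), ("n", "end", "ண"),
]

picker = CandidatePicker()
picker.train(OBSERVATIONS)

print(picker.pick("n", "start"))         # most frequent choice at the start of a word
print(picker.candidates("n", "middle"))  # ranked alternatives with probabilities
```

A position label is a very crude stand-in for context. A model of the kind the article describes would presumably look at the surrounding letters as well, which is where a learned decision tree would come in.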
Tachyon isn't resting on its laurels. Quillpad Touch, a free app for the iPad, combines a simple predictive Hindi keyboard with gestural input, so you can type a letter and, instead of combing the keyboard for maatras, just draw one. There is another piece of technology that's ready for licensing: it promises to transform your cellphone camera into a scanning and transliteration tool. "Imagine if you could see the world in your language: signboards on streets, comic books, newspapers. You could just take a picture of something and the tool will extract all the text from it, convert it to your language, and even put it back where it was in the same font size and colour," Hanumanthappa says. He hopes the technology will be used to digitise Indian-language books and make them searchable.