
MSU software to convert Gujarati books for digital age

If the Maharaja Sayajirao University (MSU) has its way, Gujarati literature will be no further than a click of the mouse. Experts at MSU are developing Optical Character Recognition (OCR) software that could convert images of Gujarati books into text for convenient storage and speedy retrieval.

It will be no mean feat, as the software will have to recognise more than 1,000 different symbols. Jignesh Dholakia, director of the project, told The Indian Express: “OCR software that can convert scanned images of pages in European and English languages into text are easily available. These are fairly accurate as they have to deal with only around 70-80 different symbols and the writing style is also linear. In the case of Indian language books, there are hundreds of different symbols to be recognised and the modifiers for the basic characters can occur on all four sides of the character. The situation becomes even more complex due to the occurrence of similar-looking characters. Gujarati language OCR has to deal with more than 1,000 different symbols of vowels, consonants and conjunctions with different vowel modifiers.”


MSU is part of a pioneering national-level consortium for the development of OCR technology for Indian languages, along with other institutes such as IIT Delhi, Indian Institute of Science (IISc) Bangalore, Indian Statistical Institute (ISI) Kolkata, IIIT Hyderabad and CDAC. MSU is responsible for developing the technology for the Gujarati script. For the first time, a large corpus of annotated images from 25 printed Gujarati books is being prepared for training and testing the OCR system. The study is part of a major project funded by the Ministry of Communications and Information Technology (MCIT), Government of India.

    ... contd.

Best Wishes
By: Parantap Vyas | 18-Jan-2009
Best of luck for Gujarati OCR. We are eagerly waiting for that.