parameshkrishnaa / scl_2018

This project forked from ambaji57/scl_2018

Sanskrit Computational Linguistics Tools

License: GNU General Public License v2.0

scl_2018's Introduction

The distribution contains various tools related to Sanskrit computations developed under the guidance of Amba Kulkarni since 2002.

Pre-requisites:
apache HTTP server
bash
lttoolbox
graphviz
libgdbm (required for hash tables in perl)
gcc/g++
flex
bison
perl
python
java (for Ashtadhyayi simulator)

Perl modules:
Time::Out (0.11 or above)
GDBM_File.pm

OCaml
Ocamlpr4 patch
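
A quick way to check for the prerequisites listed above before running the configure step is sketched below. The command names are assumptions (for example, lt-comp for lttoolbox and dot for graphviz), and the interpreter may be installed as python3 rather than python on some systems. The Apache binary name varies by system, so it is left out. The helper names missing_tools and missing_perl_modules are hypothetical and not part of the distribution. The sketch does not check the Time::Out version.

```shell
#!/bin/sh
# Print each command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Print each Perl module from the argument list that perl cannot load.
# (If perl itself is missing, every module is reported.)
missing_perl_modules() {
    for module in "$@"; do
        perl -M"$module" -e 1 >/dev/null 2>&1 || printf '%s\n' "$module"
    done
}

# Assumed command names for the packages listed above; adjust for your system.
missing_tools bash lt-comp dot gcc g++ flex bison perl python java ocaml
missing_perl_modules Time::Out GDBM_File
```

Anything printed is a candidate for installation; no output means every assumed command and module was found.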

The distribution comes in two forms:
a) Following modules bundled together
   Morph analyser, Morph generator, Sandhi, Sandhi splitter, Anusaaraka Skt-Hnd MT system, transliteration modules, Amarakosha and Sankshepa Ramayana
  (scl.tgz)

To install Sanskrit Computational Linguistics tools

a) tar -xvzf scl.tgz 

b) Copy the appropriate SPEC/spec*.txt file to the scl directory and name it spec.txt.
   Check that the paths in it are correct; if not, make the necessary changes.

c) ./configure  (./configure_server for server version)

d) make all

e) sudo make install (install_server for server version)
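
The steps b) to e) above can be sketched as a small helper that prints the command sequence for either the default or the server version, so it can be reviewed before anything is run. This is only a sketch: the function name scl_install_commands and the file name SPEC/spec_example.txt are hypothetical placeholders, and it assumes the commands are run from inside the extracted scl directory with SPEC as a subdirectory there.

```shell
#!/bin/sh
# Hypothetical helper: print the install commands for a given spec file.
# $1 = spec file under SPEC/ (placeholder; pick the one matching your system)
# $2 = "server" for the server version, anything else for the default build
scl_install_commands() {
    spec_file=$1
    if [ "$2" = "server" ]; then
        configure=./configure_server
        install_target=install_server
    else
        configure=./configure
        install_target=install
    fi
    # One command per line, in the order given in steps b) to e).
    printf '%s\n' \
        "cp $spec_file spec.txt" \
        "$configure" \
        "make all" \
        "sudo make $install_target"
}

# Review the printed commands, then run them from inside the scl directory.
scl_install_commands SPEC/spec_example.txt server
```

Printing instead of executing keeps the sketch safe to try; the paths in spec.txt still need to be checked by hand before running the configure step.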


All the packages are available under the GPL. You should have received a copy of the GPL license with this package.

In case of any queries, please contact [email protected].


-- Amba Kulkarni
15th July 2012
-------------------------------------------------------------------------------

History:
We acknowledge the help of ASR Melkote, who shared their Morphological Analyser resources with us in 2002. These formed our starting point.

Mr. Jain worked on the Sanskrit morphological analyser from 2002 to 2003 towards his M.Tech. thesis at IIIT-H.

As part of her Ph.D. thesis work, Ms. Sheeba contributed to the development of the morphological analyser from 2004 to 2006. Her major contribution was to the handling of subantas and kridantas.

Mr. Anil Gupta contributed to the development of the tinganta analyser between 2006 and 2007, especially with the Dhaturatnakar entries.

Later, from 2004 to 2006, various students at the Rashtriya Sanskrit Vidyapeetham, Tirupati contributed to the development of the Sandhi package. The contributions of Ms. Sivaja Nair, Pankaj Vyasa and Ms. Sushama Vempati deserve special mention.

The University of Hyderabad later supported further development under the University of Potential Excellence scheme from 2006 to 2007.

During 2006-2008, Pawan Goyal of IIT Kanpur worked with Amba Kulkarni on the development of the Ashtadhyayi simulator.

Though Amba Kulkarni worked on various modules at her own pace, the project got a boost when the Technology Development for Indian Languages (TDIL) division of the Ministry of Information and Communication Technology supported the activity in the form of a Consortium of 7 Institutes (2009-13).

The Principal Investigators at the 7 institutes are:
Amba Kulkarni, Department of Sanskrit Studies, University of Hyderabad (Consortium Leader)
Dipti Mishra Sharma, IIIT-H, Hyderabad
Girish Nath Jha, Special Center for Sanskrit, JNU, Delhi
Veeranarayan Pandurangi, JRRSU, Jaipur
Tirumala Kulkarni, PPVP, Bangalore
S. S. Murty, RSVP, Tirupati
Shrinivas Varkhedi, Director, Sanskrit Academy, Hyderabad

Under this project on 'Development of Sanskrit Computational tools and Sanskrit-Hindi Machine Translation system', the following tools have been developed:

a) Morph analyser
b) Morph generator
c) Sandhi
d) Sandhi Splitter
e) Sanskrit-Hindi Machine Translation system (Sampark and Anusaaraka models)
f) Compound Processor

All these modules were developed at the Department of Sanskrit Studies, University of Hyderabad.

Various consortium members have contributed by developing annotated tests for building these modules. In addition, JNU developed a POS tagger and IIIT-H developed a POS tagger and a parser, which are not part of this distribution.

During 2015-17, Amba Kulkarni was awarded a fellowship at the Indian Institute of Advanced Study, Shimla. During this period she improved the parsing algorithms by taking yogyataa into account as a constraint.

The following persons made major contributions to the development of the tools:
a) Dr. Sheeba
b) Dr. Devanand Shukl
c) Mr. Anil Gupta
d) Ms. Bhavani
e) Ms. Gauri
f) Ms. Kiranmayi
g) Mr. Karunakar
h) Dr. Shivaja
i) Dr. Shailaja
j) Dr. Pavankumar Satuluri
k) Dr. Arjun K

In addition, converters and transliteration modules for converting/transliterating from one scheme to another have been developed. The following schemes are covered:
a) Unicode Devanagari (UTF-8)
b) WX
c) Velthuis
d) Itrans
e) SLP
f) Kyoto Harvard

Dr. Sivaja Nair worked on her Ph.D. thesis on the Amarakosha from 2007-2011. The package she developed in the process is also available for distribution.

Dr. Anil Kumar developed the Compound processor as a part of his PhD thesis from 2008-2011.

Dr. Shailaja developed the concordance of three Paninian Dhatuvrttis as a part of her PhD thesis from 2009-13.

Dr. Pavankumar Satuluri developed the compound generator as a part of his PhD thesis (2011-15).

Dr. Arjun developed the Nyaayacitradiipikaa, an analyser for Navya Nyaya expressions, as a part of his PhD thesis (2013-17).

Since 2007, Amba Kulkarni has also been collaborating with Gerard Huet, INRIA. As a result of this collaboration, inter-communication between the Sanskrit Heritage tools and the Anusaaraka tools has become possible.

Finally, I would like to acknowledge Prof. K V Ramkrishnamacharyulu for his guidance throughout the development of these tools.

-------------------------------------------------------------------------------
