Git Product home page Git Product logo

ud_lithuanian's Introduction

UD_Lithuanian

Resourses and documentation for a Lithuanian Universal Dependencies treebank

ud_lithuanian's People

Contributors

olesar avatar

Stargazers

 avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

Forkers

nofenigma

ud_lithuanian's Issues

Participles

alksnis 2.1

@zeman converted adjectival participles in alksnis to VerbForm=Part; their padalyvis to VerbForm=Gdv; their pusdalyvis to VerbForm=Conv; and their būdinys to VerbForm=Ger.

Lithuanian-HSE 2.3

  • VerbForm=Conv, upos="VERB": dirbdamas, atsakydamas, atlikdami, ... (half-participles, Lt. pusdalyvis "a converb denoting an action that is coagential and simultaneous" with the main action,"presents an accompanying, secondary occurrence, usually perceived as the circumstance of the main action") = PartPus in HSE = CNV (converb) in Arkadiev 2010 = same-subject converbs in Menchi ms.
  • VerbForm=Gdv, upos="VERB": siekiant, nepaisant, norint, sakant, likus, ... (indeclinable participles, Lt. padalyvis), = PartPad in HSE = non-agreeing active participles in Arkadiev = CVB (converb) SIM/ANT in Pakerys 2011 = gerunds (different-subject converbs) in Menchi ms., Nau&Arkadiev 2015
  • VerbForm=Ger, upos="VERB": kniedyte, prašyte (manner of action participle, Lt. būdinys) = CVB (echo-converb) in Nau&Arkadiev 2015
  • pasakytina (lemma: pasakyti 'say', m.sg. pasakytinas) (participle of necessity), not marked in Alksnis (Case=Nom|Gender=Fem|Number=Sing|Polarity=Pos|Variant=Short|VerbForm=Part), marked as VerbForm=Part|Voice=Necess in HSE = DPP (debitive (passive) partciple) in Nau&Arkadiev 2015

Verb forms

  1. Use Tense and Aspect to split PastIter, PastSimp, PresHab (Past,Pres,Fut - ok)

Lemmatization of verbs with prefixes

5 types of lemmas:

  • dictionary-like: būti
  • ne-, negation: nebūti
  • be-, continuative: bebūti
  • te-, restrictive-: tebūti
  • also combinations of prefixes: nebebūti

Personal and possessive "plural" pronouns (lemmatization)

It seems that semantika.lt lemmatizes forms of mūs, jūs, mūsų, jūsų as and tu.
I have noticed that this lemmatization is present in Alksnis' datasets, too.

We should be accurate to not forget to fix this in our annotation.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.