Projects

LvAuthor

LvAuthor is a text analysis platform. It creates the XML files used by gnosly.com and English Listening Game.

The main tasks and algorithms are the following:

  • text cleaning: UTF-8 conversion, removal of special characters
  • paragraph splitting
  • period (sentence) splitting
  • phrase analysis: phrase splitting, phrase type determination (incisive, direct, indirect, etc.)
  • word splitting
  • part-of-speech (POS) assignment for each word: automatic disambiguation with human supervision
  • audio matching: the words of the text are located in the audiobook file to determine when each word begins and ends
  • word translation: words are translated automatically
  • English and Italian phrase correlation: each English phrase is matched with the corresponding Italian phrase by comparing the original text of the story with the human translation
  • applying revisions: fixes made by the professor to word translations or phrase correlations are applied
  • final XML file building
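
The first steps of the list above (text cleaning, then paragraph, sentence and word splitting) can be illustrated with a minimal Java sketch. This is not LvAuthor's actual code: the class name TextPipelineSketch and all of its method names are hypothetical, and the splitting relies on the standard java.text.BreakIterator rather than on whatever NLP algorithms the platform really uses.

```java
import java.text.BreakIterator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: names are illustrative, not LvAuthor's API.
final class TextPipelineSketch {

    // Text cleaning: normalize Unicode, unify line endings,
    // and drop control characters other than newline and tab.
    static String clean(String raw) {
        String text = Normalizer.normalize(raw, Normalizer.Form.NFC);
        text = text.replace("\r\n", "\n");
        text = text.replaceAll("[\\p{Cntrl}&&[^\n\t]]", "");
        return text.trim();
    }

    // Paragraph splitting: assumes paragraphs are separated by blank lines.
    static List<String> splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        for (String part : text.split("\\n\\s*\\n")) {
            if (!part.trim().isEmpty()) {
                paragraphs.add(part.trim());
            }
        }
        return paragraphs;
    }

    // Sentence splitting with the JDK's locale-aware sentence iterator.
    static List<String> splitSentences(String paragraph) {
        return segments(BreakIterator.getSentenceInstance(Locale.ENGLISH), paragraph, false);
    }

    // Word splitting: keeps only tokens that start with a letter or digit.
    static List<String> splitWords(String sentence) {
        return segments(BreakIterator.getWordInstance(Locale.ENGLISH), sentence, true);
    }

    // Walks the boundaries reported by the iterator and collects the pieces.
    private static List<String> segments(BreakIterator it, String text, boolean wordsOnly) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end).trim();
            if (piece.isEmpty()) {
                continue; // whitespace between tokens
            }
            if (wordsOnly && !Character.isLetterOrDigit(piece.codePointAt(0))) {
                continue; // punctuation
            }
            out.add(piece);
        }
        return out;
    }

    public static void main(String[] args) {
        String cleaned = clean("Hello world. How are you?\r\n\r\nFine.");
        for (String paragraph : splitParagraphs(cleaned)) {
            for (String sentence : splitSentences(paragraph)) {
                System.out.println(sentence + " -> " + splitWords(sentence));
            }
        }
    }
}
```

Each stage takes the output of the previous one, which mirrors the order of the task list. The later stages (POS assignment, audio matching, translation and phrase correlation) need linguistic resources and human revision, so they are left out of this sketch.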

The technologies used for this app are the following:

  • NetBeans Platform
  • NLP algorithms
  • Ant with custom Ant tasks, XSLT
  • Java
  • PostgreSQL, JPA (Hibernate)
  • Git, Maven

November 30, 2013

Fabrizio Giovannetti