Tutorial Challenge: Multilingual Part-Of-Speech Tagger

The goal of this challenge is straightforward: an HTML page has one text area, where you can post a text. The language of the text should be detected, and then the following parts of speech should be highlighted: verbs in green, nouns in red, adjectives in orange and articles in yellow. The highlighting should work for 5-10 languages of your choice. (The choice of colours is of course not strict, but it has to be the same across languages.)
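The highlighting step can be sketched as follows. This is only a minimal illustration, assuming that a tagger has already produced (word, tag) pairs with Penn Treebank tags; the names COLOURS, coarse_category and highlight are hypothetical, and treating the Penn tag DT (determiner) as "article" is an approximation. For other languages, only the mapping from the language-specific tag set to the four categories would change, while the colour table stays the same.

```python
from html import escape

# Colours from the challenge text; one shared table for every language.
COLOURS = {
    "verb": "green",
    "noun": "red",
    "adjective": "orange",
    "article": "yellow",
}


def coarse_category(penn_tag):
    """Map a Penn Treebank tag to one of the four categories, or None."""
    if penn_tag.startswith("VB"):
        return "verb"
    if penn_tag.startswith("NN"):
        return "noun"
    if penn_tag.startswith("JJ"):
        return "adjective"
    if penn_tag == "DT":  # assumption: DT (determiner) approximates "article"
        return "article"
    return None


def highlight(tagged_tokens):
    """Wrap each highlighted word in a coloured <span>; leave the rest plain."""
    parts = []
    for word, tag in tagged_tokens:
        text = escape(word)  # avoid injecting markup from the input text
        category = coarse_category(tag)
        if category is None:
            parts.append(text)
        else:
            parts.append(
                '<span style="background-color: %s">%s</span>'
                % (COLOURS[category], text)
            )
    return " ".join(parts)
```

Calling highlight([("The", "DT"), ("cat", "NN"), ("sleeps", "VBZ")]) returns three spans coloured yellow, red and green, in that order.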

A mockup can be found here:


Some suggestions for resources that can be used (you are free to use anything else):

  • The connection between Stanford CoreNLP and OLiA is currently only implemented for the English pre-trained model and only for the Penn tag set.
  • In some months, KAIST will produce a NIF adapter for a Korean POS tagger.
  • Not all components required for this task have NIF adapters currently.
Stanford CoreNLP

I created this implementation to provide a reference implementation for NIF 1.0.

Stanford CoreNLP is an NLP tool that combines lemmatization, POS tagging, dependency parsing and many more layers. The tool currently only produces NIF output, but might be extended to read NIF input as well. There is a demo web service available.


Homepage Corenlp.shtml
Additionalparameter None
Status Reference Implementation for NIF 1.0. Provides lemmas, POS tags and also (experimental) syntax trees.

I created this implementation to provide a reference implementation for NIF 1.0.

The Snowball libraries provide basic implementations of stemming algorithms for many languages. This NIF implementation encapsulates the stemmer.

Homepage Snowball.tartarus.org
Status Reference Implementation for NIF 1.0.
Additionalparameter Other languages are available with stemmer=PorterStemmer, stemmer=HungarianStemmer and others listed at: http://lucene.apache.org/java/2_4_0/api/contrib-snowball/index.html

FOX participated in the initial field test before NIF 1.0 and has not yet been updated.
It is best to try the online demo of FOX at http://fox.aksw.org

Currently, FOX only allows POST, so the API cannot be called from within the browser, but only with curl:

curl  -d "type=TEXT&nif=TRUE&task=NER&output=TURTLE&text=My%20favorite%20actress%20is%20Natalie%20Portman!"
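The form-encoded body passed to curl above can also be built programmatically. The following is a minimal sketch using only the Python standard library; the function name build_fox_body is hypothetical, and the service URL is deliberately left out because it is not given here. The resulting string would be sent as the body of a POST request.

```python
from urllib.parse import quote, urlencode


def build_fox_body(text):
    """Return the form-encoded POST body used in the curl call above."""
    params = {
        "type": "TEXT",  # FOX uses 'type' where NIF 1.0 uses 'input-type'
        "nif": "TRUE",
        "task": "NER",
        "output": "TURTLE",
        "text": text,
    }
    # quote_via=quote encodes spaces as %20 (not '+');
    # safe="!" leaves '!' unescaped, as in the curl example.
    return urlencode(params, quote_via=quote, safe="!")
```

Passing the sentence "My favorite actress is Natalie Portman!" reproduces the string given to curl with -d.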

Homepage Fox.aksw.org
Status Implements some features of NIF 1.0, but GET is missing and the ‘input-type’ parameter is called ‘type’.
Demo Index.html
Webserviceurl Api
Nlpdomain Entity Linking

