Tutorial Challenge: Multilingual Part-Of-Speech Tagger

The goal of this challenge is straightforward: an HTML page has a single text area into which you can paste a text. The language of the text should be detected, and then the following parts of speech should be highlighted: verbs in green, nouns in red, adjectives in orange and articles in yellow. The highlighting should work for 5-10 languages of your choice. (The choice of colours is of course not strict, but it has to be the same across languages.)
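To make the highlighting step concrete, here is a minimal sketch in Python. It assumes that language detection and tagging have already happened and that each token arrives with a coarse category label. The labels `VERB`, `NOUN`, `ADJ` and `ART` and the function name `highlight` are made up for this illustration; only the colours come from the description above.

```python
from html import escape

# Colours as given in the challenge description.
# The category labels are assumptions made for this sketch.
COLOURS = {
    "VERB": "green",
    "NOUN": "red",
    "ADJ": "orange",
    "ART": "yellow",
}


def highlight(tagged_tokens):
    """Render (token, category) pairs as HTML.

    Tokens whose category is one of the four target categories are
    wrapped in a coloured <span>; all other tokens are emitted as
    plain (escaped) text.
    """
    parts = []
    for token, category in tagged_tokens:
        text = escape(token)  # never emit raw user input into the page
        colour = COLOURS.get(category)
        if colour is None:
            parts.append(text)
        else:
            parts.append(
                f'<span style="background-color: {colour}">{text}</span>'
            )
    # Joining with single spaces is a simplification; a real page
    # would preserve the original whitespace of the input text.
    return " ".join(parts)


print(highlight([("The", "ART"), ("cat", "NOUN"), ("sleeps", "VERB")]))
```

Because the colour table is keyed on language-independent categories, the same colours apply to every language, as long as each tagger's output is first mapped onto those categories.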

A mockup can be found here:

Code:

Some suggestions of resources that can be used (you are free to use anything else):

  • The connection between Stanford CoreNLP and OLiA is currently only implemented for the English pre-trained model and only for the Penn tag set.
  • In a few months, KAIST will produce a NIF adapter for a Korean POS tagger.
  • Not all components required for this task have NIF adapters currently.
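Where no ontology-based mapping is available for a tagger, a hand-written table can stand in for it. The sketch below is such a stand-in for English Penn Treebank tags. It does not use OLiA or Stanford CoreNLP, and the function name `coarse_category` and the category labels are assumptions made for this example. Note that the Penn tag `DT` covers all determiners, so the sketch additionally checks the word form to single out articles.

```python
# Penn Treebank tag prefixes mapped to the coarse categories of the
# challenge. The category labels are assumptions made for this sketch.
PENN_PREFIXES = (
    ("VB", "VERB"),  # VB, VBD, VBG, VBN, VBP, VBZ
    ("NN", "NOUN"),  # NN, NNS, NNP, NNPS
    ("JJ", "ADJ"),   # JJ, JJR, JJS
)

# Penn tags articles as DT together with other determiners
# ("this", "some", ...), so the word form is checked as well.
ENGLISH_ARTICLES = {"a", "an", "the"}


def coarse_category(token, penn_tag):
    """Return 'VERB', 'NOUN', 'ADJ', 'ART', or None for other tags."""
    if penn_tag == "DT" and token.lower() in ENGLISH_ARTICLES:
        return "ART"
    for prefix, category in PENN_PREFIXES:
        if penn_tag.startswith(prefix):
            return category
    return None


print(coarse_category("The", "DT"))
print(coarse_category("running", "VBG"))
```

Each additional language would need its own table of this kind, which is the work that a shared ontology is meant to save.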