Tutorial Challenge: Semantic Search

According to the Get Involved page, each blog post has to start with a short introduction:
My name is Sebastian and I wrote this challenge to give you a rough template for writing your own challenge. I also think that the problem can be solved easily with NIF, which makes it a good showcase.

The goal of this challenge is to create a semantic search. In this context, that means the following.

For a given text (see below) a user gets a search form and can enter one or several search terms. The search shall return all sentences that have “something to do” with the search term. Additional information should also be shown.

Most of the following requirements should be met:

  • Synonyms should be included, e.g. searching for “USA” returns sentences with “United States”.
  • Some form of normalisation (stemming, lemmatising, stopword removal) should be applied.
  • DBpedia instances that occur in the text and match the search should be shown. They can also be shown to disambiguate the search, e.g. when searching for “Bush” or “Madonna”.
  • Instances that are related or similar to the found DBpedia instances and also occur in the same text should be shown, e.g. Barack Obama is related to United States.
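To make the first two requirements more concrete, here is a minimal sketch in Python of a sentence search with stopword removal and synonym expansion. Everything in it is an assumption made for illustration: the synonym group and the stopword list are hand-made placeholders, the sentence splitter is naive, and the function names (`normalise`, `expand`, `search`) are hypothetical. It does no stemming and does not use NIF or DBpedia; a real solution would probably plug in one of the tools suggested in this post.

```python
import re

# Hypothetical, hand-made synonym groups. A real solution might derive
# them from DBpedia surface forms instead of hard-coding them.
SYNONYM_GROUPS = [
    {"usa", "united states"},
]

# Tiny illustrative stopword list, far from complete.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "to", "and"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normalise(text):
    """Lowercase, keep runs of letters and digits, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def expand(term):
    """Return the normalised term together with its known synonyms."""
    key = " ".join(normalise(term))
    for group in SYNONYM_GROUPS:
        if key in group:
            return set(group)
    return {key}


def contains_phrase(tokens, phrase):
    """True if the token list `phrase` occurs contiguously in `tokens`."""
    n = len(phrase)
    if n == 0:
        return False
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def search(text, term):
    """Return all sentences that contain the term or one of its synonyms."""
    variants = [normalise(v) for v in expand(term)]
    hits = []
    for sentence in split_sentences(text):
        tokens = normalise(sentence)
        if any(contains_phrase(tokens, v) for v in variants):
            hits.append(sentence)
    return hits
```

With a made-up sample text such as "Barack Obama is the president of the United States. He was born in Hawaii. The USA has fifty states.", calling `search(text, "USA")` returns the first and the third sentence. Matching on whole tokens rather than substrings keeps “USA” from matching unrelated words that merely contain those letters.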

Given text

This text should be used:

Mockup

A static mockup, in which only “USA” can be searched, can be found here

Code:

Some suggestions for resources that can be used. You are free to use anything else.

  • Snowball Stemmer
  • FOX
  • Stanford CoreNLP 
  • Pablo Mendes. Some information from the data set was extracted and loaded into a Virtuoso triple store: http://hanne.aksw.org:8890/sparql (graph: http://dbpedia.org/lexicalizations). You can get alternative surface forms for DBpedia instances with SPARQL queries.
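
As a starting point for exploring that graph, the following Python sketch builds a request URL for the endpoint mentioned above. It only assumes the endpoint and graph URIs from this post plus the usual `query` and `format` request parameters of a Virtuoso SPARQL endpoint. The vocabulary used inside the graph is not described here, so the query just lists a few arbitrary triples instead of guessing predicate names; `build_query_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint and graph as given in this post.
ENDPOINT = "http://hanne.aksw.org:8890/sparql"
GRAPH = "http://dbpedia.org/lexicalizations"


def build_query_url(limit=10):
    """Build a GET URL that asks for `limit` arbitrary triples from the graph.

    The predicates used in the graph are unknown here, so the query
    simply lists triples to let you inspect the data first.
    """
    query = (
        f"SELECT ?s ?p ?o WHERE {{ GRAPH <{GRAPH}> {{ ?s ?p ?o }} }} "
        f"LIMIT {int(limit)}"
    )
    params = {
        "query": query,
        # Assumption: the endpoint can return SPARQL JSON results.
        "format": "application/sparql-results+json",
    }
    return ENDPOINT + "?" + urlencode(params)
```

The resulting URL can be opened in a browser or fetched with `urllib.request.urlopen`, provided the endpoint is reachable. Once the predicates in the graph are known, the triple pattern can be narrowed down to the surface forms of a single DBpedia instance.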

 
