The first presentation of the Natural Language Processing session, titled “Comparison of Machine Learning Techniques for Multi-label Genre Classification”, was given by Mathijs Pieters from the University of Groningen. This joint work with Marco Wiering won the SNN best paper award and studies classic and novel text classification techniques with the aim of recognizing the genre of a movie based on its subtitles. A new dataset was constructed and a novel technique combining a histogram with the Word2vec model was introduced. Experiments were performed with six different machine learning techniques that use word information as input, including a combination of long short-term memory networks with convolutional neural networks. The results showed that the more complex methods did not outperform the simpler bag-of-words technique with a multi-layer perceptron, and that the introduced histogram performed best among all methods using Word2vec embeddings.
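The summary above does not spell out how the histogram over Word2vec embeddings was built. The sketch below shows one possible reading, offered only as an assumption: for every embedding dimension, the values of all words in a document are binned into a fixed number of intervals, and the normalized bin counts are concatenated into a fixed-length feature vector. The function name `histogram_features`, the bin count, and the value range are hypothetical and not taken from the paper.

```python
def histogram_features(word_vectors, n_bins=4, lo=-1.0, hi=1.0):
    """Turn a variable-length list of word vectors into one fixed-length vector.

    Assumption (not from the paper): one histogram per embedding dimension,
    with equal-width bins over [lo, hi] and counts normalized by word count.
    """
    if not word_vectors:
        return []
    n_dims = len(word_vectors[0])
    width = (hi - lo) / n_bins
    features = []
    for d in range(n_dims):
        counts = [0] * n_bins
        for vec in word_vectors:
            # Clip to the assumed range so that outliers land in the edge bins.
            clipped = min(max(vec[d], lo), hi)
            idx = min(int((clipped - lo) / width), n_bins - 1)
            counts[idx] += 1
        # Normalize so that documents of different lengths are comparable.
        features.extend(c / len(word_vectors) for c in counts)
    return features


# Two toy 2-dimensional "word vectors" standing in for real Word2vec output.
doc = [[-0.9, 0.1], [0.6, 0.1]]
print(histogram_features(doc))
```

The resulting vector has `n_dims * n_bins` entries regardless of document length, which is what would allow it to be fed to a standard classifier such as a multi-layer perceptron.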

Verna Dankers from the University of Amsterdam then presented two papers, “Modelling the Generation and Retrieval of Word Associations with Word Embeddings” and “Modelling Word Associations and Interactiveness for Describer Agents in Word-Guessing Games”, both co-authored with Aysenur Bilgin and Raquel Fernández. The first presentation described the use of semantic vector spaces for the Location Taboo Game. In this game, a guesser agent has to guess the name of a city from hints provided by a describer agent. The guesser therefore has to learn to associate a hint such as “Great pizza” with, for example, a country first, and then to narrow this down to a specific city as more hints arrive. The results showed that the agent correctly guessed the name of the intended city in around 27% of the games. In the second presentation, Verna Dankers discussed the role of the describer agent in the same game, with the aim of optimizing the describer so that it gives hints informative enough for humans to guess the name of the city. To make the game more complex, the describer is given a list of taboo words that may not be used in the hints. Different approaches were tested in simulation and in an empirical study, and the results showed that humans who knew the name of the city could guess it in half of the cases after a limited number of hints given by the best system.
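To illustrate how a semantic vector space can link hints to cities, the following minimal sketch averages the vectors of the hint words and returns the candidate city with the highest cosine similarity. It is a generic nearest-neighbour retrieval, not the authors' model; the three-dimensional vectors, the `EMBEDDINGS` table, and the `guess_city` function are invented for illustration.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# A real system would load pretrained word embeddings instead.
EMBEDDINGS = {
    "pizza": [0.9, 0.1, 0.0],
    "canals": [0.1, 0.9, 0.1],
    "colosseum": [0.8, 0.0, 0.3],
    "Rome": [0.9, 0.0, 0.2],
    "Amsterdam": [0.1, 0.9, 0.0],
}
CITIES = ["Rome", "Amsterdam"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def guess_city(hints):
    """Return the city closest to the mean vector of the known hint words."""
    known = [EMBEDDINGS[h] for h in hints if h in EMBEDDINGS]
    if not known:
        return None  # no usable hint, so no guess
    mean = [sum(col) / len(known) for col in zip(*known)]
    return max(CITIES, key=lambda city: cosine(mean, EMBEDDINGS[city]))


print(guess_city(["pizza"]))
print(guess_city(["pizza", "colosseum"]))
print(guess_city(["canals"]))
```

Averaging the hints means that each additional hint shifts the query vector, which mirrors the idea of refining a guess as more hints are given.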

The fourth talk was presented by Zoltán Szlávik from IBM in Amsterdam, who discussed his joint work with Nikita Galinkin, Lora Aroyo and Benjamin Timmermans titled “Catch Them If You Can: Malicious Behavior Simulation in Deep Question Answering”. The presentation addressed the problem that arises when question-answering systems are developed using human input and some users act maliciously by entering incorrect information. Although the talk focused on the cultural heritage domain, the presenter argued strongly that, whenever intelligent systems are built in this way, malicious user actions should be detected as early as possible.