Conference paper
Open semantic analysis: The case of word level semantics in Danish
The present research is motivated by the need for accessible and efficient tools for automated semantic analysis in Danish. We are interested in tools that are completely open, so they can be used by a critical public, in public administration, non-governmental organizations and businesses. We describe data-driven models for Danish semantic relatedness, word intrusion and sentiment prediction.
Open Danish corpora were assembled, and unsupervised learning was implemented in two ways: with explicit semantic analysis and with Gensim’s Word2vec model. We evaluate the performance of the two models on three different annotated word datasets. We test how well the semantic representations align with single-word sentiment using supervised learning.
We find that logistic regression and large random forests perform well with Word2vec features.
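As an illustration of the word intrusion task mentioned above, the following is a minimal sketch of one common way to pick the intruder from word vectors: choose the word with the lowest mean cosine similarity to the rest of the set. This is not necessarily the procedure used in the paper. The function names `cosine` and `find_intruder` and the three-dimensional `toy_vectors` are hypothetical and hand-made for the example; in practice the vectors would come from a trained model such as Word2vec.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def find_intruder(words, vectors):
    """Return the word whose mean similarity to the other words is lowest."""
    def mean_similarity(word):
        others = [w for w in words if w != word]
        return sum(cosine(vectors[word], vectors[w]) for w in others) / len(others)

    return min(words, key=mean_similarity)


# Hypothetical toy vectors (assumption: made up for illustration only).
toy_vectors = {
    "æble": [0.9, 0.1, 0.0],     # apple
    "pære": [0.8, 0.2, 0.1],     # pear
    "banan": [0.85, 0.15, 0.05], # banana
    "bil": [0.1, 0.9, 0.3],      # car
}

intruder = find_intruder(list(toy_vectors), toy_vectors)
print(intruder)
```

With these toy vectors the three fruit words point in nearly the same direction, so the sketch selects "bil" as the intruder.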
Language: | English |
---|---|
Year: | 2017 |
Proceedings: | 8th Language and Technology Conference |
Types: | Conference paper |
ORCIDs: | Nielsen, Finn Årup and Hansen, Lars Kai |