Publications related to Problems and Procedures to Make Wordnet Data (Retro)Fit for a Multilingual Dictionary

The Current State of the OBI DICT Project: A Bilingual e-Dictionary of Oracle-Bone Inscriptions with AI Image Recognition

This article reports on the current state of the OBI DICT project, a bilingual e-dictionary of oracle-bone inscriptions (OBI), incorporating artificial intelligence (AI) image recognition technology. It first provides a brief overview of the development of ...

Buro Van Die Wat, 2024

Modeling Structured Data in Attention-based Models

Alireza Mohammadshahi

Natural language processing has experienced significant improvements with the development of Transformer-based models, which employ a self-attention mechanism and pre-training strategies. However, these models still present several obstacles. A notable issue ...

EPFL, 2023

Efficient Text-based Reinforcement Learning by Jointly Leveraging State and Commonsense Graph Representations

Mattia Atzeni, Mrinmaya Sachan

Text-based games (TBGs) have emerged as useful benchmarks for evaluating progress at the intersection of grounded language understanding and reinforcement learning (RL). Recent work has proposed the use of external knowledge to improve the efficiency of RL ...

ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2021

Learning computationally efficient static word and sentence representations

Prakhar Gupta

Most of the Natural Language Processing (NLP) algorithms involve use of distributed vector representations of linguistic units (primarily words and sentences) also known as embeddings in one way or another. These embeddings come in two flavours namely, sta ...

EPFL, 2021

Automatic Call Sign Detection: Matching Air Surveillance Data with Air Traffic Spoken Communications

Petr Motlicek, Amrutha Prasad

Voice communication is the main channel to exchange information between pilots and Air-Traffic Controllers (ATCos). Recently, several projects have explored the employment of speech recognition technology to automatically extract spoken key information suc ...

MDPI, 2021

Approaching Ontology Alignment through Representation Learning to Bridge the Semantic Gap in Engineering Applications

Prodromos Kolyvakis

The current information landscape is characterised by a vast amount of relatively semantically homogeneous, when observed in isolation, data silos that are, however, drastically semantically fragmented when considered as a whole. Within each data silo, inf ...

EPFL, 2020

Dictionary Learning for Two-Dimensional Kendall Shapes

Michaël Unser, Virginie Sophie Uhlmann, Julien René Pierre Fageot, Anna You-Lai Song

We propose a novel sparse dictionary learning method for planar shapes in the sense of Kendall, namely configurations of landmarks in the plane considered up to similitudes. Our shape dictionary method provides a good trade-off between algorithmic simplici ...

SIAM PUBLICATIONS, 2020

Towards semantics-driven modelling and simulation of context-aware manufacturing systems

Damiano Nunzio Arena

Systems modelling and simulation are two important facets for thoroughly and effectively analysing manufacturing processes. The ever-growing complexity of the latter, the increasing amount of knowledge, and the use of Semantic Web techniques adhering meani ...

EPFL, 2019

Deep Micro-Dictionary Learning and Coding Network

Yan Yan, Wei Wang, Wei Xiao

In this paper, we propose a novel Deep Micro-Dictionary Learning and Coding Network (DDLCN). DDLCN has most of the standard deep learning layers (pooling, fully connected, input/output, etc.) but the main difference is that the fundamental convolutional l ...

IEEE, 2019

Word Sense Consistency in Statistical and Neural Machine Translation

Xiao Pu

Different senses of source words must often be rendered by different words in the target language when performing machine translation (MT). Selecting the correct translation of polysemous words can be done based on the contexts of use. However, state-of-th ...

EPFL, 2018

Towards Producing Human-Validated Translation Resources for the Fula language through WordNet Linking

Khalil Mrini, Martin Benjamin

We propose methods to link automatically parsed linguistic data to the WordNet. We apply these methods on a trilingual dictionary in Fula, English and French. Dictionary entry parsing is used to collect the linguistic data. Then we connect it to the Open M ...

ACL Anthology, 2017

On Modeling the Synergy Between Acoustic and Lexical Information for Pronunciation Lexicon Development

Marzieh Razavi

State-of-the-art automatic speech recognition (ASR) and text-to-speech systems require a pronunciation lexicon that maps each word to a sequence of phones. Manual development of lexicons is costly as it needs linguistic knowledge and human expertise. To fa ...

EPFL, 2017

Problems and Procedures to Make Wordnet Data (Retro)Fit for a Multilingual Dictionary

Martin Benjamin

The data compiled through many Wordnet projects can be a rich source of seed information for a multilingual dictionary. However, the original Princeton WordNet was not intended as a dictionary per se, and spawning other languages from it introduces inheren ...

Alexandru Ioan Cuza University of Iasi, 2016

Cross-lingual Linking of Multi-word Entities and their corresponding Acronyms

Maud Ehrmann

This paper reports on an approach and experiments to automatically build a cross-lingual multi-word entity resource. Starting from a collection of millions of acronym/expansion pairs for 22 languages where expansion variants were grouped into monolingual c ...

European Language Resources Association (ELRA), 2016

Closed-Loop Lifecycle Management of Service and Product in the Internet of Things: Semantic Framework for Knowledge Integration

Dimitrios Kyritsis, Min-Jung Yoo

This paper describes our conceptual framework of closed-loop lifecycle information sharing for product-service in the Internet of Things (IoT). The framework is based on the ontology model of product-service and a type of IoT message standard, Open Messagi ...

Mdpi Ag, 2016

Discourse-level features for statistical machine translation

Thomas Meyer

Machine Translation (MT) has progressed tremendously in the past two decades. The rule-based and interlingua approaches have been superseded by statistical models, which learn the most likely translations from large parallel corpora. System design does not ...

EPFL, 2015

Acoustic and Lexical Resource Constrained ASR using Language-Independent Acoustic Model and Language-Dependent Probabilistic Lexical Model

Mathew Magimai Doss, Ramya Rasipuram

One of the key challenges involved in building statistical automatic speech recognition (ASR) systems is modeling the relationship between subword units or “lexical units” and acoustic feature observations. To model this relationship two types of resources ...

2015

Knowledge-based decision support for improving the adaptive product development process

Predrag Spasojevic

The product development process (PDP) is a complex process encompassing many very diverse activities, and involving a fairly big number of actors, spread across different professions, teams and companies. In today's product development, more often than n ...

EPFL, 2015

On Learning Grapheme-to-Phoneme Relationships through the Acoustic Speech Signal

Mathew Magimai Doss, Ramya Rasipuram

Automatic speech recognition (ASR) systems, through use of the phoneme as an intermediary unit representation, split the problem of modeling the relationship between the written form, i.e., the text and the acoustic speech signal into two disjoint processe ...

2014

Multilingual Lexicography with a Focus on Less-Resourced Languages: Data Mining, Expert Input, Crowdsourcing, and Gamification

Martin Benjamin

This paper looks at the challenges that the Kamusi Project faces for acquiring open lexical data for less-resourced languages (LRLs), of a range, depth, and quality that can be useful within Human Language Technology (HLT). These challenges include accessi ...

2014