A new model driven architecture for deep learning-based multimodal lifelog retrieval
Date issued
2018
Publisher
Václav Skala - UNION Agency
Abstract
Taking photos and recording daily life have become routine tasks for many people. The recorded information
has helped to build several applications, such as self-monitoring of activities, memory assistance and long-term
assisted living. This trend, called lifelogging, interests many research communities, including computer vision, machine
learning, human-computer interaction, pervasive computing and multimedia. Great effort has been made
in the acquisition and storage of captured data, but challenges remain in managing, analyzing, indexing,
retrieving, summarizing and visualizing these data. In this work, we present a new model driven
architecture for deep learning-based multimodal lifelog retrieval, summarization and visualization. Our proposed
approach is based on different models integrated into an architecture organized in four phases. The first phase, based on a Convolutional
Neural Network, consists of data preprocessing to discard noisy images. In a second step,
we extract several features to enhance the data description. Then, we generate a semantic segmentation to limit
the search area in order to better control the runtime and the complexity. The second phase consists of analyzing
the query. The third phase, which is based on a Relational Network, aims at retrieving the data matching the query. The
final phase handles diversity-based summarization with k-means, which offers the lifelogger a key-frame concept
and context selection-based visualization.
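
The abstract names k-means as the basis of the diversity-based summarization but gives no details, so the following is only a minimal sketch of how key-frame selection by clustering might look, not the authors' implementation. It assumes each lifelog image is already represented by a numeric feature vector. The function name `select_key_frames`, the farthest-first initialisation and the nearest-to-centroid key-frame rule are all illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(features, centroids):
    """Label each frame with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(f, centroids[j]))
            for f in features]


def select_key_frames(features, k, n_iter=50):
    """Hypothetical helper: cluster frames with k-means and return one
    representative frame index per non-empty cluster, plus the labels."""
    # Farthest-first initialisation (an assumption, chosen for determinism).
    centroids = [list(features[0])]
    while len(centroids) < k:
        far = max(features, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(far))

    for _ in range(n_iter):
        labels = assign(features, centroids)
        updated = []
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:
                # New centroid is the per-dimension mean of the cluster members.
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # Keep the old centroid if the cluster is empty.
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated

    labels = assign(features, centroids)
    key_frames = []
    for j in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == j]
        if idx:
            # Key frame: the cluster member closest to the centroid (assumed rule).
            key_frames.append(min(idx, key=lambda i: sq_dist(features[i], centroids[j])))
    return sorted(key_frames), labels


# Toy data: two visually distinct "moments", three frames each.
frames = [[0, 0], [0.1, 0], [0, 0.1], [10, 10], [10.1, 10], [10, 10.1]]
key_frames, labels = select_key_frames(frames, k=2)
print(key_frames)
```

Picking one frame per cluster is what gives the summary its diversity in this sketch: near-duplicate frames fall into the same cluster and contribute a single representative.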
Subject(s)
multimodality, retrieval, summarization, visualization, convolutional neural network, relational network
Citation
WSCG 2018: poster papers proceedings: 26th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision in co-operation with EUROGRAPHICS Association, p. 8-17.