MULTIMODAL SYSTEMS

Code
80164
ACADEMIC YEAR
2018/2019
CREDITS
6 credits during the 2nd year of 8733 Computer Engineering (LM-32) GENOVA
SCIENTIFIC DISCIPLINARY SECTOR
ING-INF/05
LANGUAGE
Italian
TEACHING LOCATION
GENOVA (Computer Engineering)
SEMESTER
1st Semester

OVERVIEW

This course provides students with foundational conceptual knowledge, methodologies, and tools for designing, implementing, and evaluating computer systems that can capture, represent, and automatically analyze the behavior of their users (e.g., in terms of gesture, movement, facial expressions, speech) and interact with them by generating multisensory feedback (e.g., images, sounds, control of actuators) in real-time.

AIMS AND CONTENT

LEARNING OUTCOMES

The course provides students with the basis for designing and developing human-machine interfaces and advanced software systems based on interaction through multiple sensory channels and on the processing and communication of audio and video content. It covers the design of non-desktop interfaces, including multimodal interfaces for mobile systems (tablets, smartphones), with examples in various application contexts (consumer, health, culture, entertainment), including tutorials on the EyesWeb platform (http://www.infomus.org/eyesweb_eng.php).

The course is usually in Italian with teaching materials in English. When non-Italian students attend, the course is held in English; in this case the teachers can provide, upon request and in additional hours, teaching support specifically for Italian-speaking students who have difficulty with English.

AIMS AND LEARNING OUTCOMES

The course aims at introducing the foundational knowledge needed for designing and developing computer systems that can interact with their users naturally, by exploiting multiple sensory channels. This requires students to know and apply technologies for capturing, representing, and automatically analyzing the behavior of the users – e.g., algorithms for detecting and analyzing gesture, full-body movement, facial expressions, speech – and for generating multisensory feedback (e.g., images, sounds, control of actuators) in real-time. At the end of the course, the student will:

  • Know and understand the motivations for using a multimodal interactive system for a specific application, the logical architectures that describe the major components a multimodal interactive system consists of, the guidelines for designing and developing multimodal interactive systems, the application areas where multimodal interactive systems can be successfully exploited.
  • Know the most relevant devices for capturing data that can characterize the behavior of the users and understand how they work and when and how they can be used.
  • Know the most important techniques for representing and automatically analyzing the behavior of the users and understand how and when to apply them. Techniques receive as input data coming from multiple sensor devices covering multiple sensory channels.
  • Be able to analyze specific use cases in selected application areas to evaluate pros and cons for developing a multimodal interactive system rather than a traditional graphical user interface.
  • Be able to design a multimodal interactive system and to implement its major components, using the development tools presented during the lectures and hands-on sessions of the course.

PREREQUISITES

None. Some basic knowledge of human-machine interaction topics (design, development, and evaluation cycle of traditional user interfaces, interaction design methodologies) is useful, though not strictly required. Basic programming experience is also useful.

TEACHING METHODS

The course includes theoretical and practical lectures (approximately 32h of theoretical lectures and 16h of practical lectures, for a total of 48h). Theoretical lectures introduce the concepts and techniques the course focuses on. Practical lectures consist of hands-on sessions in which students apply the presented concepts and technologies to specific case studies. Students can attend the practical lectures using their own laptops in the classroom. Practical lectures use tools such as the EyesWeb XMI platform for gesture and movement analysis and the MIR Toolbox for Matlab for audio processing.

Lectures are usually in Italian with teaching material in English. When non-Italian students attend, lectures are held in English. In this case, the teacher is available to provide, upon request and in additional hours, teaching support specifically for Italian-speaking students who have difficulty understanding English.

SYLLABUS/CONTENT

1. Introduction to multimodal systems

  • Post-WIMP and multimodal user interfaces 
  • Definition of multimodal system 
  • Motivations for developing multimodal systems
  • Guidelines for designing multimodal systems
  • Frameworks: the W3C Multimodal Interaction Framework, the multi-layered framework for analysis of nonverbal communication

2. Visual modality

  • Video and motion capturing devices
  • Techniques for automatic analysis of full-body movement and gesture: computation of movement features, segmentation of movement streams, gesture recognition, analysis of expressive content in full-body movement and gesture
  • Techniques for automatic analysis of facial expression: face detection, computation of facial features, automatic extraction of Action Units
  • Hands-on using the EyesWeb XMI platform
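
As an illustration of two of the techniques listed above (computation of movement features and segmentation of movement streams), the following is a minimal Python sketch, not the EyesWeb XMI implementation used in the course. It assumes a hypothetical 2D trajectory of a single tracked point, computes speed by finite differences, and segments the stream wherever speed exceeds a threshold. The function names, the sample data, and the threshold value are all assumptions made for this example.

```python
import math


def speeds(positions, dt):
    """Finite-difference speed of a 2D trajectory sampled every dt seconds."""
    return [
        math.hypot(x1 - x0, y1 - y0) / dt
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]


def segment_by_speed(speed_values, threshold):
    """Return (start, end) index pairs of runs where speed exceeds threshold.

    The end index is exclusive, as in Python slices.
    """
    segments, start = [], None
    for i, s in enumerate(speed_values):
        if s > threshold and start is None:
            start = i                      # a movement segment begins
        elif s <= threshold and start is not None:
            segments.append((start, i))    # the segment ends before index i
            start = None
    if start is not None:                  # stream ended while still moving
        segments.append((start, len(speed_values)))
    return segments


# Hypothetical trajectory of one tracked point (metres), sampled at 10 Hz:
# still, then moving for two frames, then still again.
trajectory = [(0.0, 0.0), (0.0, 0.0), (0.3, 0.4),
              (0.6, 0.8), (0.6, 0.8), (0.6, 0.8)]
v = speeds(trajectory, dt=0.1)
movement_segments = segment_by_speed(v, threshold=1.0)
print(v)
print(movement_segments)
```

Real systems typically smooth the trajectory before differentiating and combine several features (e.g., acceleration, quantity of motion) rather than relying on a single speed threshold.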

3. Auditory modality

  • Audio capturing devices
  • Techniques for automatic computation and analysis of audio features: temporal, spectral, and cepstral features
  • Introduction to automatic analysis of speech
  • Introduction to sound and music computing
  • Hands-on using Matlab
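
To make the distinction between temporal and spectral audio features concrete, here is a minimal Python sketch. It is not the MIR Toolbox code used in the hands-on sessions; it uses only the standard library and a naive DFT, which is adequate for a short frame but far too slow for real use. The function names and the synthetic test tone are assumptions for this example, and cepstral features (e.g., MFCCs) are left out for brevity.

```python
import cmath
import math


def rms_energy(frame):
    """Temporal feature: root-mean-square energy of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Temporal feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Spectral feature: magnitude-weighted mean frequency (Hz), naive DFT."""
    n = len(frame)
    weighted_sum = magnitude_sum = 0.0
    for k in range(n // 2 + 1):            # non-negative frequency bins only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(frame)
        )
        magnitude = abs(coeff)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


# Hypothetical test signal: one 256-sample frame of a 1000 Hz sine at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(256)]

rms = rms_energy(frame)
zcr = zero_crossing_rate(frame)
centroid = spectral_centroid(frame, sample_rate)
print(rms, zcr, centroid)
```

For a pure tone that falls exactly on a DFT bin, as here, the centroid should come out close to the tone frequency; for real recordings, frames are usually windowed and the features are tracked over time.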

4. Multimodal fusion

  • Levels of fusion: early fusion model, late fusion model
  • Methods for multimodal fusion: rule-based methods, classification-based methods, and estimation-based methods
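
The difference between the two fusion levels can be sketched as follows. This is a simplified Python illustration under stated assumptions, not a method taken from the course material: early fusion is shown as plain concatenation of per-modality feature vectors (to be fed to a single classifier), and late fusion as a weighted average of per-modality class scores followed by a simple rule. The class labels, scores, weights, and confidence threshold are hypothetical.

```python
def early_fusion(audio_features, video_features):
    """Feature-level (early) fusion: concatenate per-modality feature vectors.

    The fused vector would then be passed to a single classifier.
    """
    return list(audio_features) + list(video_features)


def late_fusion(scores_per_modality, weights):
    """Decision-level (late) fusion: weighted average of class scores.

    Each element of scores_per_modality maps class label -> score and is
    assumed to come from a separate, per-modality classifier.
    """
    total = sum(weights)
    classes = scores_per_modality[0].keys()
    return {
        c: sum(w * s[c] for w, s in zip(weights, scores_per_modality)) / total
        for c in classes
    }


def rule_based_decision(fused_scores, min_confidence=0.6):
    """Simple rule: accept the top class only if its fused score is high."""
    best = max(fused_scores, key=fused_scores.get)
    return best if fused_scores[best] >= min_confidence else "uncertain"


# Hypothetical outputs of an audio and a video classifier for one time window.
audio_scores = {"calm": 0.3, "excited": 0.7}
video_scores = {"calm": 0.2, "excited": 0.8}

fused_vector = early_fusion([0.1, 0.25], [5.0])
fused_scores = late_fusion([audio_scores, video_scores], weights=[1.0, 3.0])
decision = rule_based_decision(fused_scores)
print(fused_vector, fused_scores, decision)
```

Early fusion lets a classifier exploit correlations between modalities but requires synchronized features; late fusion keeps modalities independent, which makes it easier to handle a missing or unreliable sensor.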

5. Case studies

  • Concrete examples of design and development of multimodal systems in selected application scenarios, including museums and cultural heritage, performing arts, education, well-being, and rehabilitation.

RECOMMENDED READING/BIBLIOGRAPHY

Learning material includes PDF copies of the slides presented at the lectures, examples and exercises for the practical lectures (e.g., EyesWeb applications and Matlab scripts), and a collection of scientific papers provided by the teacher. All learning material is in English and is made available on AulaWeb.

TEACHERS AND EXAM BOARD

Office hours: Students can meet with the teacher by appointment. To set up a meeting, send an email to gualtiero.volpe@unige.it or call one of the following telephone numbers: 0103536542 (office), 0102758252 (lab at Casa Paganini).

Exam Board

GUALTIERO VOLPE (President)

RADOSLAW NIEWIADOMSKI

ANTONIO CAMURRI

LESSONS

LESSONS START

Lectures are scheduled in the first term, from September 17th, 2018 to December 19th, 2018. The schedule is as follows: Wednesday 8:00 - 10:00 and Thursday 12:00 - 14:00, room B3 (Via all'Opera Pia, building B).

EXAMS

Exam description

The exam consists of a project and an oral discussion. The project is assigned by the teacher and concerns the design of a multimodal interactive system for a specific application. It may include the development of software modules for audio and video processing that contribute to the analysis of user behavior, a deeper study of specific topics presented during the lectures by means of a targeted bibliographical search, or the analysis of existing solutions, including testing and assessment of algorithms and their performance. The oral discussion consists of a presentation of the project results and may also include questions on topics addressed during the lectures.

Assessment methods

The project assesses the extent to which the student is able to analyze a case study, to evaluate whether developing a multimodal interactive system is appropriate in that context, to design a multimodal interactive system, and to apply the technologies presented during the lectures. The oral discussion assesses the extent to which the student knows and understands the foundational theoretical aspects of multimodal systems (motivations, logical architectures, and guidelines for design and development), as well as the student's understanding of the major data acquisition devices and of the techniques for representation and automated analysis of user behavior. The teacher will assess the quality of the project and of the presentation of the results, the capacity for critical reasoning about the developed project, the correct use of specialized terminology, the depth of the student's knowledge and understanding of the topics addressed in the course, and the ability to present such content properly.

FURTHER INFORMATION

Master's theses are available on the topics presented in the course, addressing the design and development of multimodal interactive systems in areas of interest for the scientific and technological research carried out at the Casa Paganini – InfoMus research center of DIBRIS – University of Genoa (www.casapaganini.org). For students interested in a master's thesis on these topics, the course provides the theoretical and practical knowledge needed to carry out the thesis work.