DATA WAREHOUSING

DATA WAREHOUSING

_
iten
Code
101798
ACADEMIC YEAR
2020/2021
CREDITS
9 credits during the 1st year of 10852 COMPUTER SCIENCE (LM-18) GENOVA
SCIENTIFIC DISCIPLINARY SECTOR
INF/01
LANGUAGE
English
TEACHING LOCATION
GENOVA (COMPUTER SCIENCE )
semester
1° Semester
Teaching materials

AIMS AND CONTENT

LEARNING OUTCOMES

Learning the theoretical, methodological, and technological fundamentals of data management and analysis in decision support systems, with a specific reference to data warehousing architectural and design issues, as well as key elements of data integration and governance, data quality and cleaning, ExtractionTransformation-Loading processes, conceptual, logical, and physical design of data warehouses, storage architectures and scalable parallel processing, use of data warehouses for business reporting and online analytical processing.

AIMS AND LEARNING OUTCOMES

DESCRIBE the principles for data analysis and large-scale data analysis

UNDERSTAND the differences between data management (OLTP) systems and data analysis (OLAP) systems

UNDERSTAND the differences between design issues and methodologies for databases and for datawarehouses

UNDERSTAND the main issues in data quality, data integration, and data governance

UNDERSTAND the main issues in data warehouse design, with specific reference to conceptual design, ROLAP logical design, view selection, physical design and ETL design

UNDERSTAND the main issues in large scale data analysis 

SELECT the most adequate systems and languages for a given analysis context

USE some of the presented systems for data exploration, data reconciliation, data warehouse storage, data reporting and OLAP querying

USE some of the presented systems for construncting a datawarehouse from a given operational dataset and for performing non-trivial analyses on it

SOLVE exercizes related to the design of data warehouses and OLAP queries

PREREQUISITES

Fundamentals of database models, languages, and systems.

  • Entity-relationship (conceptual) model
  • Relational (logical) model
  • Schema normalization
  • Relational algebra & SQL
  • Indexes
  • Transactions

Teaching methods

Class, project, and outside preparation.

SYLLABUS/CONTENT

The course will present the main architectural and design issues related to data management and analysis in data support systems (data warehousing), comparing them with traditional transactional systems.

  • Introduction. IT technologies to support decisions. Differences between OLAP and OLTP. Data warehousing and Data mining. Business Intelligence.
  • Data integration and data quality.
  • Data models for data warehouses. Conceptual data model. Dimensions, measures and hierarchies. Multidimensional data model. ROLPAP models: star and snowflake schemas.
  • Back-end.Architectures. Storage structures and indexes in OLAP. Materialized views. Optimization of OLAP queries.
  • Data warehouse design. Conceptual, logical and physical design.
  • ETL functionalities and approaches.
  • Front-end. OLAP queries and reporting. SQL OLAP extensions. 
  • Large scale data analysis: Hive and SparkSQL

RECOMMENDED READING/BIBLIOGRAPHY

  • ​M. Golfarelli, S. Rizzi. Data Warehouse Design. Mc-Graw Hill 2009. 
  • R. Kimball, M. Ross. The Data Warehouse Toolkit. Wiley, 2013.
  • C. Jensen, T. Bach Pedersen, C. Thomsen. Multidimensional Databases and Data Warehousing. Morgan&Claypool, 2010.
  • A.Vaisman, E. Zimányi. Data Warehouse Systems: Design and Implementation. Springer, 2014

TEACHERS AND EXAM BOARD

Ricevimento: Appointment by email or by Microsoft Teams Office: Valle Puggia – 301

Ricevimento: Appointment by email or by Microsoft Teams Office: Valle Puggia – 327

Exam Board

GIOVANNA GUERRINI (President)

SIMONE MINISI

BARBARA CATANIA

LESSONS

Teaching methods

Class, project, and outside preparation.

EXAMS

Exam description

Written examination, oral examination (including project discussion).

Assessment methods

Details on how to prepare for the examination and the required degree of knowledge for each topic will be provided during the lessons.

During the semester,  some assignments (groupwork) as well as a project will be proposed. The project is mandatory.

The written exam consists of a set of questions and exercizes on basic topics of the course; the goal of this test is to verify the understanding of the main issues addressed during the lessons.

The oral exam consists of an in-depth discussion of the solutions developed by the student for the given project, in order to assess whether the student has reached an appropriate level of knowledge. For students that do not successfully complete the assignments the oral exam will also include theoretical questions and / or practices of the course topics.