## OVERVIEW

The course provides students with the basic skills for extracting knowledge from large data sets.

## AIMS AND CONTENT

LEARNING OUTCOMES

Develop the basic skills for extracting knowledge from large data sets, in particular by acquiring:

- an understanding of the value of data mining in solving real-world problems
- an understanding of the foundational concepts underlying data mining
- an understanding of the algorithms commonly used in data mining tools
- the ability to apply data mining tools to real-world problems

AIMS AND LEARNING OUTCOMES

At the end of the course, students will

- be able to understand and work with the main concepts and techniques of data mining
- be able to independently apply the main data mining techniques to solve real-world problems
- be able to further develop their knowledge of data mining techniques and applications

TEACHING METHODS

Combination of traditional lectures and lab sessions

SYLLABUS/CONTENT

**First part: introduction to data mining and applications in fraud detection**

Introduction to Data Mining, Data Science and Big Data Analytics

Main techniques

The Data Mining Process - CRISP-DM

Seven Classes of Algorithms

Supervised Learning – Classification

Unsupervised Learning – Clustering

Outlier Detection

Regression

Reinforcement Learning

Ranking

Deep Learning

Top ten data mining algorithms

Examples and applications using WEKA

Applications to marketing, finance and medicine

Big Data and Hadoop

The NoSQL paradigm

**Second part: machine learning algorithms for data mining**

Introduction to Data Mining and Machine Learning

Taxonomy of Data Mining Problems

Statistical Inference

Support Vector Machines (extension to kernels)

Support Vector Regression (extension to kernels)

K-means and Spectral Clustering

Decision Trees and Random Forests

Model Selection and Error Estimation

RECOMMENDED READING/BIBLIOGRAPHY

- Aggarwal, C. C. Data Mining: The Textbook. Springer, 2015.
- Shalev-Shwartz, S., and Ben-David, S. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
- Witten, I. H., Frank, E., and Hall, M. A. Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed. (The Morgan Kaufmann Series in Data Management Systems). Morgan Kaufmann, 2011. ISBN-13: 978-0123748560. Available at the CSB di Ingegneria (shelf mark 006.312 WIT); also available online at http://www.sciencedirect.com/science/book/9780123748560
- Phua, C., Lee, V., Smith, K., and Gayler, R. A Comprehensive Survey of Data Mining-based Fraud Detection Research. Computing Research Repository, abs/1009.6119, 2005. Available online at http://arxiv.org/abs/1009.6119
- Cristianini, N., and Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2006. Available at ING and ECO.
- Ng, A., Jordan, M., and Weiss, Y. On Spectral Clustering: Analysis and an Algorithm. NIPS 2001. Also available online at http://papers.nips.cc/paper/2092-on-spectral-clustering-analysis-and-an-algorithm.pdf
- Handouts

## TEACHERS AND EXAM BOARD

**Office hours:** By appointment, arranged by email with Luca Oneto <luca.oneto@unige.it> and Fabrizio Malfanti <fabrizio.malfanti@intelligrate.it>.
For organizational issues, contact Eva Riccomagno by email at <riccomagno@dima.unige.it>.

Exam Board

FABRIZIO MALFANTI (President)

EVA RICCOMAGNO (President)

LUCA ONETO

## LESSONS

LESSONS START

The class will start according to the academic calendar.

## EXAMS

EXAM DESCRIPTION

To take the exam, you must sign up online.

The examination for the first part consists of the discussion of a group project on a topic agreed with the lecturer, and a written examination on which the oral examination may be based.

The examination of the second part consists of the discussion of a project on a topic agreed with the lecturer and developed autonomously by the student.

The final mark is the average of the marks of the two parts, weighted by the number of ECTS credits of each part (3 ECTS each).
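
As a sketch of this rule, writing $m_1$ and $m_2$ for the marks of the first and second part (symbols introduced here for illustration only), the equal credit weights reduce the weighted average to a simple mean:

```latex
\[
m_{\text{final}} = \frac{3\,m_1 + 3\,m_2}{3 + 3} = \frac{m_1 + m_2}{2}
\]
```

For instance, marks of 28 and 26 would give a final mark of 27. How non-integer averages are rounded is not specified here.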

ASSESSMENT METHODS

The exam checks whether the student has learned the methodologies and techniques for extracting knowledge from large data sets, through a small project that requires solving a real-world data mining problem.

Exam schedule

Date | Time | Location | Type | Notes |
---|---|---|---|---|
29/01/2018 | 09:00 | GENOVA | Written + Oral | |
12/02/2018 | 09:00 | GENOVA | Written + Oral | |