distributeddataanalysisandmining's Introduction

DDAM

Project carried out in collaboration with Martina Trigilia, Francesco Santucciu and Michele Andreucci

Distributed Data Analysis and Mining - Spark (Hadoop)

Analysis of the Australia "Rain Tomorrow" dataset.

Tasks:

Data Understanding
Data Preparation
Classification and Clustering
Regression

About the course: "this course aims at teaching the basic theoretical concepts behind the MapReduce distributed computing paradigm, and Hadoop in particular, and at building expertise in the practical usage of high-performance computing tools for data engineering, analysis and mining. In particular, the students will learn how classical data mining algorithms can be applied to Big Data using Hadoop (Spark). Real (and open source) datasets will be used to present examples and to let the students build their own projects".
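The MapReduce paradigm named in the course description can be illustrated with a minimal sketch in plain Python. This is not Spark or Hadoop code and is not taken from the project: the records, the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `rainy_days_per_location`) and the "rainy days per location" task are all hypothetical, chosen only to show the map, shuffle and reduce steps on a single machine.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical records: (location, rained_today).
# Made up for illustration; not rows from the project's dataset.
records = [
    ("Sydney", True),
    ("Perth", False),
    ("Sydney", False),
    ("Perth", True),
    ("Sydney", True),
]


def map_phase(record):
    """Emit one (key, value) pair: location -> 1 if it rained, else 0."""
    location, rained = record
    return (location, 1 if rained else 0)


def shuffle_phase(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values of one key into a single result (here, a sum)."""
    return (key, reduce(lambda a, b: a + b, values, 0))


def rainy_days_per_location(data):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    mapped = [map_phase(r) for r in data]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(rainy_days_per_location(records))
```

In a distributed setting the map and reduce steps would run in parallel on partitions of the data. In Spark, a computation of this shape is typically written with the RDD operations `map` and `reduceByKey`.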


bianchimario / distributeddataanalysisandmining
