High-Performance Data Analytics

Overview

Data-driven science requires handling large volumes of data in a short period of time. Executing efficient workflows is challenging for users and for systems alike. This module introduces concepts, principles, tools, system architectures, techniques, and algorithms for large-scale data analytics using distributed and parallel computing. We will investigate state-of-the-art data processing for such workloads using solutions from High-Performance Computing and Big Data Analytics.

Key information

Name	Value
Contact	Julian Kunkel
Venue	virtual
Time	tba
Language	English
Module	Module B.Inf.1712: Vertiefung Hochleistungsrechnen, Module M.Inf.1236: High-Performance Data Analytics
SWS	4
ECTS	6
Presence time	56 hrs
Independent study	124 hrs

Learning Objectives

Topics cover:

Challenges in high-performance data analytics
Use-cases for large-scale data analytics
Performance models for parallel systems and workload execution
Data models to organize data and (No)SQL solutions for data management
Industry-relevant processing models with tools like Hadoop, Spark, and ParaView
System architectures for processing large data volumes
Relevant algorithms and data structures
Visual Analytics
Parallel and distributed file systems

Guest talks from academia and industry will be incorporated into the teaching to demonstrate the applicability of these topics.

Weekly laboratory practicals and tutorials will guide students in learning the concepts and tools. In the process, students will form a learning community and integrate peer learning into the practicals. Students will have opportunities to present their solutions to the challenging tasks in class, developing presentation skills and gaining confidence in the topics.

Examination

Written (90 min) or oral (approx. 30 min), depending on the number of attendees.

Responsible

Prof. Dr. Julian Kunkel
