Statistical and Biological Physics

Data science for physicists

Lecturers: Fabian Rost and Steffen Rulands

This course of three lectures introduces the handling of large, high-dimensional datasets ("big data"), which have become common in biology, the social sciences and other fields. It covers the following essential data science topics in compressed form:

A brief introduction to R
Efficient manipulation of large datasets
Data visualisation with ggplot2
Dimensionality reduction and clustering

The course is aimed at people who are already familiar with computer programming, though not necessarily with R. Despite the name of the course, non-physicists are most welcome to join.


Source code and data files used in the lectures can be downloaded from GitHub.

Lecture 1 (data wrangling): Slides (.PDF)

Lecture 2 (data visualisation): Slides (.PDF)