Data science for physicists
Lecturers: Fabian Rost and Steffen Rulands
The course consists of 3 lectures and introduces the handling of large, high-dimensional datasets ("big data"), which have become common in biology, the social sciences and other fields. It covers essential topics of data science in compressed form, specifically:
- A brief introduction to R
- Efficient manipulation of large datasets
- Data visualisation with ggplot2
- Dimensionality reduction and clustering
The course is aimed at people who are already familiar with computer programming, though not necessarily with R. Despite the name of the course, non-physicists are most welcome to join.
Materials
Source code and data files used in the lecture can be downloaded from GitHub.
Lecture 1 (data wrangling): Slides (.PDF)
Lecture 2 (data visualisation): Slides (.PDF)
Downloads
- slides_data_wrangling (17 MByte)
- slides_visualisation (6 MByte)