# Introduction to Information Theory (Fall 2020)

**Program:** Bachelor Mathematics, Informatics, AI, Physics & Astronomy (WI, WInf, IN, KI, NS)

**Lecturer:** Michael Walter

**Teaching assistants:** Freek Witteveen, Mees de Vries

**Schedule:** Nov-Dec 2020

**Further information:** DataNose, Canvas

See **here** for the latest edition of this course.

The Canvas course page is the primary source for all course material. This page is offered only for your convenience.

## Course description

*Reading your email on the metro may not strike you as particularly memorable. Isn’t it remarkable, though, that – despite relying on a noisy cellphone signal underground – none of your messages arrive corrupted? In this course you will learn the mathematics of how this is achieved.*

This course will give an introduction to *information theory* – the mathematical theory of information. Ever since its inception, information theory has had a profound impact on society. It underpins important technological developments, from reliable memories to mobile phone standards, and its versatile mathematical toolbox has found use in computer science, machine learning, physics, and even pure mathematics.

Starting from probability theory, we will discuss how to mathematically model *information sources* and *communication channels*, how to optimally *compress* information, and how to design *error-correcting codes* that allow us to reliably communicate over noisy communication channels. We will also see how techniques used in information theory can be applied more generally to *make predictions* from noisy data.

Students interested in writing a thesis in quantum information theory are also encouraged to follow Maris Ozol’s course Introduction to Quantum Computing.

## Recommended prior knowledge

We will discuss rigorous definitions and proofs, hence some “mathematical maturity” will be assumed. We will use *basic probability theory* throughout the course (but will remind you of the most important facts). In addition, part of the homework will require *programming*. You can use the programming language of your choice; examples and solution will be given in Python.

## Lecture notes and video recordings

The video recording are available on Canvas.

- Lecture 1 (Oct 27, 2020): Welcome, Introduction to Information Theory [✍]
- Lecture 2 (Oct 29, 2020): Probability Theory Refresher [✍]
- Lecture 3 (Nov 3, 2020): Entropy, Compression, and a First Look at Symbol Codes [✍]
- Lecture 4 (Nov 5, 2020): Symbol Codes: Lossless Compression, Huffman Algorithm [✍]
- Lecture 5 (Nov 10, 2020): Block Codes: The Source Coding Theorem, its Proof, and Variations [✍]
- Lecture 6 (Nov 12, 2020): Stream Codes: Lempel-Ziv Algorithm [✍]
- Lecture 7 (Nov 17, 2020): Stream Codes: Arithmetic Coding [✍]
- Lecture 8 (Nov 19, 2020): Joint Entropies & Communication over Noisy Channels [✍]
- Lecture 9 (Nov 24, 2020): The Noisy Coding Theorem [✍]
- Lecture 10 (Nov 26, 2020): Proof of the Noisy Coding Theorem [✍]
- Lecture 11 (Dec 1, 2020): Proof of the Converse, Shannon’s Theory vs Practice [✍]
- Lecture 12 (Dec 3, 2020): Reed-Solomon Codes [✍], [▶] on decoding errors at unknown locations
- Student Presentations (Dec 8-9, 2020)
- Lecture 13 (Dec 10, 2020): Message Passing for Decoding and Inference, Outlook [✍]

Main reference is David J. C. MacKay’s textbook *“Information Theory, Inference, and Learning Algorithms”*, Cambridge (2003), which is available online. 📖

## Homework

- Problem set 1 [pdf] [colab]
- Problem set 2 [pdf] [colab]
- Problem set 3 [pdf] [colab]
- Problem set 4 [pdf] [colab]
- Problem set 5 [pdf] [colab]
- Problem set 6 [pdf] [colab]

## Exercises

- Practice set 1 [pdf]
- Practice set 2 [pdf]
- Practice set 3 [pdf]
- Practice set 4 [pdf]
- Practice set 5 [pdf]
- Practice set 6 [pdf]
- Practice set 7 [pdf]
- Practice set 8 [pdf]
- Practice set 9 [pdf]
- Practice set 10 [pdf]
- Practice set 11 [pdf]
- Practice set 12 [pdf]

## Learning objectives

At the end of the course, you will be able to:

- mathematically model memoryless and Markov information sources using probability theory
- mathematically model memoryless information channels using probability theory
- compute basic properties of probability distributions, such as entropies and essential bit contents, using mathematical techniques such as Jensen’s inequality
- define typical sets and estimate their size and probability
- prove and apply Shannon’s source coding theorem
- prove and apply the Kraft inequality for uniquely decodable (UD) codes, its converse for prefix codes, and use it to bound the average length for optimal UD or prefix codes
- design optimal symbol codes using Huffman’s coding algorithm
- analyze and apply the Lempel-Ziv algorithm for stream coding
- analyze and apply arithmetic coding for different probabilistic models
- compute basic properties of joint probability distributions, such as conditional entropies and mutual informations
- prove and apply Shannon’s noisy channel coding theorem
- understand basic properties of block codes, such as their rate and various probabilities of error
- encode Reed-Solomon codes and decode them for erasure errors
- understand maximum likelihood approach to codeword and bitwise decoding
- design and apply message passing algorithms for encoding and inference problems
- independently read about a topic in information theory and present it to your peers

We will check off these objectives as we move along in the course.

## Format

Lectures, exercises classes, self-study, and a short presentation.

## Assessment

The final grade will be determined by the following calculation:

*60% exam grade + 30% homework grade + 10% presentation grade*

In addition, your exam grade alone has to be at least a 5. The same rule applies for the re-sit exam.

There will be one *homework* problem set per week (in total 6), posted on the course homepage by Tuesday.
You must submit your completed homework (on Canvas) on Monday the week after.
We will ask you to collaborate and submit in *groups*.
The solutions will be discussed in the Wednesday exercise class (among other things).
Assignments will be accepted late only if you have extenuating circumstances (such as sickness or family emergency) and provided you confirm with the lecturer before the deadline.
Your problem set with the lowest score will be ignored (this includes any problem set you did not submit).

Instead of having a midterm exam, we would like you to read about a topic in information theory and give a short *presentation* to your peers.
You will present as a group, but everyone should speak for a few minutes.
We will give you many examples (on Canvas) but you are free to pick your own topic (but please confirm it with us).

Everything that we discussed in class and on the homework is in principle in scope for the exam. We recommend that you revisit the homework problems and the exercises as a preparation. You are allowed to bring one self-prepared “cheat sheet” to the exam (A4 paper, hand-written, you can use both sides).