3 July 2020 

We are looking for two new doctoral coworkers (PhD students) for our lab. Please see advertisement of the positions here.


30 June 2020 

We have been granted access to large amounts of computing resources by the Northern German Network for HPC Computing (HLRN).


24 June 2020 

Our paper on learning of higher-order statistics for image encoding (Mousavi, Drefs, Lücke, 2020) has been accepted by the International Conference on Machine Learning, Optimization, and Data Science.


20 May 2020 
We have received top-up funding for the processing of SARS-CoV-2 EM microscopy images within our BMBF project SPAplus.


1 April 2020 
Our project "SPAplus" (collaborative BMBF project, 3 years) has started. We will investigate medical image processing using generative models.


25 March 2020 
Our paper "Phase transition for parameter learning of Hidden Markov Models" (Rau et al.) has been made available on arXiv.


15 March 2020 
Our research activities shift to home offices due to the Corona crisis


4 March 2020 
Our paper "Maximal Causes for Exponential Family Observables" (Mousavi et al.) has been made available on arXiv.


18 Feb 2020 
Our abstract "Optimal Inference of Sound Intensities and Sound Components Using Generative Representations" (Monk, Savin, Lücke) has been accepted for the ASA Conference in Chicago, where it will be presented as a talk.


6 Feb 2020 
Our project proposal "SPAplus" (collaborative BMBF project, 3 years) has been accepted and will be funded.


1 August 2019
Release of the open source software library "ProSper". The library contains a collection of algorithms for probabilistic sparse coding. For the source code see here. For a description see here.


1 July 2019
Our paper "k-Means as a Variational EM Approximation of Gaussian Mixture Models" has been published by Pattern Recognition Letters and is available online (PRLetters, arXiv).


2 April 2019
Our paper "k-Means as a Variational EM Approximation of Gaussian Mixture Models" was accepted for publication by Pattern Recognition Letters.


14-15 March 2019
Jörg Lücke gave a series of three lectures on Generative Machine Learning at the IK 2019 Spring School.


17 Jan 2019
Our paper "STRFs in primary auditory cortex emerge from masking-based statistics of natural sounds" (Sheikh et al.) has been published by PLOS Computational Biology.

16 July 2018
Our paper "Neural Simpletrons - Learning in the Limit of Few Labels with Directed Generative Networks" (Forster et al.) has been published by Neural Computation.

5 July 2018
Our paper "Truncated Variational Sampling for ‘Black Box’ Optimization of Generative Models" has been presented at the LVA/ICA 2018.


3 July 2018
Our paper "Optimal neural inference of stimulus intensities" (Monk et al.) has been published by Nature's Scientific Reports.

24 March 2018
Our paper "Evolutionary Expectation Maximization" (Guiraud et al.) has been accepted for GECCO 2018.

19 March 2018
Our paper "Truncated Variational Sampling for ‘Black Box’ Optimization of Generative Models" (Lücke et al.) has been accepted for LVA/ICA 2018.

5 March 2018
Our paper "Neural Simpletrons - Learning in the Limit of Few Labels with Directed Generative Networks" (Forster et al.) has been accepted by Neural Computation.

22 Dec 2017
Our paper "Can clustering scale sublinearly with its clusters?" (Forster & Lücke) has been accepted for AISTATS 2018.

30 June 2017
Our paper "Discrete Sparse Coding" (Exarchakis & Lücke) has been accepted by Neural Computation.

7 June 2017
Our paper "Models of acetylcholine and dopamine signals differentially improve neural representations" (Holca-Lamarre et al.) has been accepted by the journal Frontiers in Neuroscience.

25 May 2017
Our paper "Binary non-negative matrix deconvolution for audio dictionary learning" (Drgas et al.) has been accepted by the journal IEEE Transactions on Audio, Speech and Language Processing.


Head of lab

Prof. Dr. Jörg Lücke

+49 441 798 5486

+49 441 798-3902

W30 2-201




+49 441 798-

+49 441 798-3902

W30 2-202


Postal Address

Prof. Dr. Jörg Lücke
Arbeitsgruppe Machine Learning
Exzellenzcluster Hearing4all und
Department für Medizinische Physik und Akustik
Fakultät für Medizin und Gesundheitswissenschaften
Carl von Ossietzky Universität Oldenburg
D-26111 Oldenburg

Office Address

Room 201 (2nd floor)
Building W30 (NeSSy)  
Küpkersweg 74
26129 Oldenburg

Illustration of truncated distributions for a GMM in D = 2 dimensions. The figure considers a data point \(\vec{y}^{(n)}\) which lies (for illustrative purposes) to some extend in between some clusters. For C = 8 cluster, the full posterior (aka. the responsibilities)  \(r_c^{(n)} = p(c | \vec{y}^{(n)},Θ)\) are shown in the top-left. Below, a truncated approximation \(q_c^{(n)}\) with |K| = C' = 3 is shown for the same data point. The truncated approximation maintains the C' highest posterior values, sets all others to zero, and renormalizes the distribution to sum to one. The three closest clusters, which correspond to the three highest posterior values, are connected with black lines in the main figure.


In this line of research, we investigate novel variational approaches to decisively improve learning algorithms for clustering. We are interested in the theoretical as well as practical aspects of such approaches.

In practice, variational approaches allow for the derivation of very efficient clustering algorithms for large-scale problems, i.e., when the number of clusters and the number of data points is large, and when the dimensionality of data points is high. Variational approaches can, in this respect, provide a very favorable scaling behavior with the number of clusters. Such a scaling can then be combined with approaches developed by other groups that allow for favorable scaling with the number of data points (e.g. coreset approaches) or the data dimensionality (e.g., by exploiting triangle inequalities).

On the theoretical side, the variational approaches we investigate allow the definition of learning algorithms that interpolate between hard EM clustering approaches (such as k-means) and soft-EM clustering as, e.g., provided by EM for Gaussian mixture models (GMMs). The illustration above shows, for instance, a k-means-like algorithm based on three instead of one 'winning' cluster (this is an instance of the k-means-C' algorithm [Lücke & Forster, 2019] with C' = 3). In the framework of variational EM, k-means is itself obtained as a limit case of the k-means-C' algorithm (namely when C' = 1). As a consequence, k-means (Lloyd's algorithm) can be obtained as an approximation of EM for GMMs without the requirement to take a limit to small or zero cluster variances (which is the usual textbook approach). This implies further, that k-means optimizes a variational lower bound (i.e. a free energy a.k.a. ELBO) of a GMM loglikelihood. The paper [Lücke & Forster, 2019] provides a number of quantitative results on the relation of k-means algorithms to GMM data models including a formula to directly translate the quantization error to a variational lower bound. [Lücke & Forster, 2019] is itself based on more general theoretical results about the family of truncated posteriors as variational distributions [Lücke, 2019].

Based on variational EM interpretations of k-means-like algorithms, accelerations of clustering algorithms can be obtained by defining partial instead of full variational E-steps. Variational lower bounds can then be increased without having to compute the distances of a given data point to all cluster centers. In the paper [Forster & Lücke, 2018] it was shown that such partial E-steps can be defined independently of the number of clusters. As a consequence, the number of required distance evaluations could (for clustering problems with many clusters) be drastically reduced. That the number of strongly reduced distance evaluations translates into algorithms that are in practice also strongly improving the state-of-the-art efficiency was then shown by the paper [Hirschberger et al., 2019]. The paper showed for large-scale problems that (A) the reduction of distance evaluations clearly outweighs to memory overhead and auxiliary computations required to implement the variational EM procedure, and (B) that the variational complexity reduction can be combined with complexity reduction approaches for large data sets (we used coresets) and complexity reduction for large data dimensionality (we used the triangle inequality).

A list of our so-far published clustering algorithms together with their implementations is provided below. Implementations based on python are primarily of interest for the theoretical studies [Lücke & Forster, 2019; Forster & Lücke, 2018] . The python implementations can readily be used in relation to theoretical studies, e.g., on convergence properties or for comparisons with k-means-like algorithms. They are, however, not optimized for execution times or large data sets. The implementations based on C++ were optimized both in terms of memory I/O efficiency and run-time efficiency. Furthermore, their more hardware-near C++ code is supporting fast computation time. The algorithms vc-GMM and vc-k-means are only available as C++ implementations and, most importantly, contain combinations with fast-seeding, fast coreset construction, and efficient computations for high data dimensionalities. These algorithms stand to compete as the fastest clustering algorithms for very large-scale clustering problems.


Algorithm Implementation Reference
k-means-C' Python Code Repository (not optimized*) Lücke & Forster, 2019
var-k-means-X Python Code Repository (not optimized*) Forster & Lücke, 2018
var-k-means-S Python Code Repository (not optimized*)
C++ Code Repository (optimized)
Forster & Lücke, 2018
Hirschberger et al., 2019
var-GMM-X Python Code Repository (not optimized*) Forster & Lücke, 2018
var-GMM-S Python Code Repository (not optimized*)
C++ Code Repository (optimized)
Forster & Lücke, 2018
Hirschberger et al., 2019
vc-GMM C++ Code Repository (optimized) Hirschberger et al., 2019
vc-k-means C++ Code Repository (optimized) Hirschberger et al., 2019

*the Python code is not optimized for real-time execution speed, but for analysis and research only.

References (own publications)

[Hirschberger et al., 2019]
Large Scale Clustering with Variational EM for Gaussian Mixture Models
Florian Hirschberger, Dennis Forster, and Jörg Lücke, arXiv:1810.00803, 2019.

[Lücke & Forster, 2019]
k-means as a variational EM approximation of Gaussian mixture models
Jörg Lücke and Dennis Forster, Pattern Recognition Letters, 2019.

[Lücke, 2019]
Truncated Variational Expectation Maximization
Jörg Lücke, arXiv:1610.03113, 2019.

[Forster & Lücke, 2018]
Can clustering scale sublinearly with its clusters? A variational EM acceleration of GMMs and k-means
Dennis Forster and Jörg Lücke, AISTATS, 2018.

Webmaster (perttra.7pwilts@uoijl.dexuvi) (Changed: 2020-07-08)