Temporal Clustering

Clustering data is a basic task and has gotten immense attention, in particular graph-based clustering, where we are given pairwise relationships between people, say, and we want to group them into well-connected clusters with fewer connections between clusters.

Here is an interesting relatively new twist on this kind of problem [1]: Instead of static data, like in a snapshot of Facebook or similar networks, we are given time-stamped data. This could be the set of messages sent between pairs people together with the sending time. We can model this by replacing the input graph by a series of graphs, one for each time stamp. We want to not only group the people into clusters as before, we would like to also know how these clusters evolve over time. That is, we want to find at each time stamp a clustering for the graph at that time stamp. However, we don’t expect that real clusterings change too much from one time stamp to another, so an additional constraint is that not too many people change clusters from one time stamp to another.

In a new paper [2], we looked at the resulting algorithmic problem of computing such “temporal” clusterings. It turned out to be much more difficult than the static variant, and we found some structure that meant that the clustering at the very first time stamps can affect the clustering much later. We may even have to make some unnatural choices in the first time stamps to be able to obtain a reasonable clustering later on. This lead to some strong lower bounds for running times for algorithms that solve that kind of problem.

The next step would be to examine some of the real-world data, to see to which extent such structure really crops up. I would assume that this kind of structure is rather delicate and so shouldn’t be much of a problem in practice. That is, I’m hopeful that there are some properties of real-world data that help us solve this problem much more efficiently than the lower bounds suggest.

  1. Finding Communities in Dynamic Social Networks, Tantipathananandh, C., Berger-Wolf, T.Y. In: 2011 IEEE 11th International Conference on Data Mining. pp. 1236–1241 (2011).
  2. Cluster Editing in Multi-Layer and Temporal Graphs, Chen, J., Molter, H., Sorge, M., Suchý, O. In: Proceedings of the 29th International Symposium on Algorithms and Computation (ISAAC ’18). pp. 24:1–24:13. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2018).