Analyzing the Communication Clusters in Datacenters

AutorFoerster, Klaus-Tycho; Marette, Thibault; Neumann, Stefan; Plant, Claudia; Sadikaj, Ylli; Schmid, Stefan; Velaj, Yllka
ArtConference Paper
AbstraktDatacenter networks have become a critical infrastructure of our digital society and over the last years, great efforts have been made to better understand the communication patterns inside datacenters. In particular, existing empirical studies showed that datacenter traffic typically features much temporal and spatial structure, and that at any given time, some communication pairs interact much more frequently than others. This paper generalizes this study to communication groups and analyzes how clustered the datacenter traffic is, and how stable these clusters are over time. To this end, we propose a methodology which revolves around a biclustering approach, allowing us to identify groups of racks and servers which communicate frequently over the network. In particular, we consider communication patterns occurring in three different Facebook datacenters: a Web cluster consisting of web servers serving web traffic, a Database cluster which mainly consists of MySQL servers, and a Hadoop cluster. Interestingly, we find that in all three clusters, small groups of racks and servers can produce a large fraction of the network traffic, and we can determine these groups even when considering short snapshots of network traffic. We also show empirically that these clusters are fairly stable across time. Our insights on the size and stability of communication clusters hence uncover an interesting potential for resource optimizations in datacenter infrastructures.
KonferenzWorld Wide Web Conference 2023