As multicore chips scale to higher processor counts, communication
between cores becomes increasingly important.
Indeed, when a single application is split among multiple
cores, which are connected through a relatively slow
network, the amount of communication required has
a substantial effect on performance. Therefore, if the
application can be partitioned in such a way that communication
between threads is minimised, or that placement
on non-uniform networks takes communication into account,
a significant performance gain can be obtained.
To do this effectively, however, the communication streams
inside the application must be known. In this paper, we introduce
a profiling tool for Java that can measure data flows
between methods. It constructs a communication graph,
which combines a traditional call graph with data flow information.
Reservoir sampling reduces the profiling overhead by a factor
of 15. We prove that this reduction comes at the cost of only
a limited decrease in accuracy.
This way, we can quickly estimate communication flows,
which provide the critical information needed for efficient
communication-aware parallelisation.
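
To make the sampling idea concrete, the following is a minimal sketch of
reservoir sampling in Java. It assumes the classic single-pass variant
(often called Algorithm R); the abstract does not state which variant the
tool uses, and the names Reservoir, offer, sample and seen are hypothetical
and chosen only for illustration. The reservoir keeps a uniform random
sample of at most `capacity` events from a stream whose length is not known
in advance, which is what would allow a profiler to record only a bounded
number of data-flow events.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper: uniform sample of at most `capacity` items from a stream. */
final class Reservoir<T> {
    private final int capacity;
    private final List<T> sample;
    private final Random random;
    private long seen = 0; // number of items offered so far

    Reservoir(int capacity, long seed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.sample = new ArrayList<>(capacity);
        this.random = new Random(seed);
    }

    /** Offers one stream item; it is kept with probability capacity / seen. */
    void offer(T item) {
        seen++;
        if (sample.size() < capacity) {
            // The first `capacity` items always fill the reservoir.
            sample.add(item);
            return;
        }
        // Draw j approximately uniformly from [0, seen).
        long j = (long) (random.nextDouble() * seen);
        if (j < capacity) {
            // Replace a slot, evicting a previously sampled item.
            sample.set((int) j, item);
        }
    }

    /** Read-only view of the current sample. */
    List<T> sample() {
        return Collections.unmodifiableList(sample);
    }

    /** Total number of items offered, useful for scaling sampled counts. */
    long seen() {
        return seen;
    }
}

/** Illustrative usage with made-up event labels. */
final class ReservoirDemo {
    public static void main(String[] args) {
        Reservoir<String> reservoir = new Reservoir<>(4, 42L);
        for (int i = 0; i < 100; i++) {
            // Hypothetical data-flow event: producer method -> consumer method.
            reservoir.offer("Producer.write#" + i + " -> Consumer.read");
        }
        System.out.println("events seen:   " + reservoir.seen());
        System.out.println("events sampled: " + reservoir.sample().size());
    }
}
```

Under this scheme, a count observed in the sample can be scaled by
seen() / sample().size() to estimate the corresponding count in the full
stream; how the actual tool derives its estimates is not specified here.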