To exploit the parallelism offered by multicore architectures, an application must be partitioned into smaller units that can be executed on different cores. Partitioning an application without a clear understanding of its run-time behavior is unlikely to produce an efficient result. Communication profiling, that is, investigating the run-time behavior of an application in terms of its memory access patterns, is therefore an essential performance optimization tool for the designer. In this research, the PinComm communication profiler is used to study the data flows between the functions of a WSS decoder, in order to make better clustering and mapping decisions that minimize inter-processor communication. This is achieved by assigning heavily communicating functions to the same processor, or at least to neighboring processors, so that excessive communication overhead between processors is avoided.
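To illustrate the clustering idea, the following minimal sketch greedily merges the two groups of functions that exchange the most data until a target number of clusters remains. It is not PinComm's algorithm or the method used in this work. The function names, byte counts, and the helpers `cluster_functions` and `inter_cluster_traffic` are hypothetical, and the sketch ignores load balancing and processor topology.

```python
from itertools import combinations


def cluster_functions(comm, num_clusters):
    """Greedily merge the two clusters with the highest mutual traffic.

    comm maps (producer, consumer) function pairs to bytes transferred.
    Returns a list of sets, one set of function names per cluster.
    """
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")

    # Start with every function in its own cluster.
    funcs = sorted({f for pair in comm for f in pair})
    clusters = [{f} for f in funcs]

    def traffic(a, b):
        # Bytes exchanged between clusters a and b, in either direction.
        return sum(
            v for (src, dst), v in comm.items()
            if (src in a and dst in b) or (src in b and dst in a)
        )

    while len(clusters) > num_clusters:
        a, b = max(combinations(clusters, 2), key=lambda p: traffic(*p))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a | b)
    return clusters


def inter_cluster_traffic(comm, clusters):
    """Total bytes that cross cluster (processor) boundaries."""
    owner = {f: i for i, c in enumerate(clusters) for f in c}
    return sum(v for (s, d), v in comm.items() if owner[s] != owner[d])


# Hypothetical profile: bytes moved between four made-up functions.
comm = {
    ("parse", "decode"): 900,
    ("decode", "filter"): 50,
    ("filter", "output"): 700,
    ("parse", "output"): 10,
}

clusters = cluster_functions(comm, num_clusters=2)
remaining = inter_cluster_traffic(comm, clusters)
print(clusters, remaining)
```

With this made-up profile, the two heavy producer-consumer pairs end up in the same cluster, so only the light transfers cross a processor boundary. A practical mapping step would also need to weigh computational load per processor and the distance between processors.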