CGRAs consist of a large array of functional units (FUs) interconnected by a mesh-style network. Register files are distributed throughout the array to hold temporary values, and each is accessible only to a subset of the FUs. The FUs execute common word-level operations, including addition, subtraction, and multiplication. CGRA processors accelerate the inner loops of applications by exploiting instruction-level parallelism (ILP) and, in some cases, data-level and task-level parallelism (DLP and TLP). The aim of this tutorial is to give insight into CGRA architectures and their compilation techniques, and to let participants experience first-hand how to map source code onto a CGRA. The tutorial therefore consists of presentations as well as a hands-on session.
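As a rough illustration of the mesh connectivity and the ILP described above, the following Python sketch models a plain 2D mesh and schedules the operations of one loop-body iteration onto a limited number of FUs. It is a toy model under stated assumptions, not the mapping flow used in the tutorial: the function names `mesh_neighbors` and `asap_schedule` are hypothetical, every operation is assumed to take one cycle, and routing, register files, and placement are ignored.

```python
from collections import defaultdict


def mesh_neighbors(rows, cols, r, c):
    """Return the FUs directly connected to FU (r, c) in a 2D mesh.

    Assumption: a plain mesh without wrap-around or diagonal links.
    """
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(y, x) for y, x in candidates if 0 <= y < rows and 0 <= x < cols]


def asap_schedule(ops, deps, num_fus):
    """Place each operation in the earliest cycle where its inputs are
    ready and an FU is still free.

    ops:     operation names, listed in topological order
    deps:    maps an operation to the operations it depends on
    num_fus: number of FUs that can issue in the same cycle
    Assumption: every operation has a latency of one cycle.
    """
    cycle_of = {}
    used = defaultdict(int)  # cycle -> number of FUs already occupied
    for op in ops:
        # Earliest cycle after all predecessors have finished.
        cycle = max((cycle_of[p] + 1 for p in deps.get(op, [])), default=0)
        # Delay the operation while all FUs in that cycle are taken.
        while used[cycle] >= num_fus:
            cycle += 1
        cycle_of[op] = cycle
        used[cycle] += 1
    return cycle_of


# Loop body of: y[i] = (a[i] + b[i]) * (c[i] - d[i])
ops = ["add", "sub", "mul"]
deps = {"mul": ["add", "sub"]}

parallel = asap_schedule(ops, deps, num_fus=16)  # 4x4 array of FUs
serial = asap_schedule(ops, deps, num_fus=1)     # single FU for comparison

print(parallel)  # add and sub share cycle 0, mul follows in cycle 1
print(serial)    # one operation per cycle
print(mesh_neighbors(4, 4, 0, 0))  # a corner FU has only two neighbours
```

With 16 FUs the independent `add` and `sub` issue in the same cycle, so the loop body finishes in two cycles instead of three. A real CGRA compiler must additionally place each operation on a specific FU and route values through the mesh and the distributed register files.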