Compilers for Coarse-Grained Reconfigurable Array (CGRA) architectures suffer from long compilation times and generate code whose quality falls far below the theoretical upper bounds. This article presents a new scheduler, called the Bimodal Modulo Scheduler (BMS), that maps inner loops onto (heterogeneous) CGRAs of the Architecture for Dynamically Reconfigurable Embedded Systems (ADRES) family. BMS significantly outperforms existing schedulers for similar architectures in terms of both generated code quality and compilation time. It achieves this by combining new backtracking schemes with extended and adapted forms of priority functions and cost functions. BMS is evaluated by mapping multimedia and software-defined radio benchmarks onto tuned ADRES instances.