This paper presents a debugging aid for parallel program developers. The tool presented enables programmers to use cyclic debugging techniques for debugging non-deterministic parallel programs running on multiprocessor systems with shared memory. The solution proposed consists of a combination of record/replay with automatic on-the-fly data race detection. This combination enables us to limit the record phase to the more efficient recording of the synchronization operations, and to check for data races (using intrusive methods) during a replayed execution. As the record phase is highly efficient, there is no need to switch it off; tracing can be left on all the time, thereby eliminating the possibility of Heisenbugs.
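
As an illustration of the record/replay idea only, the following minimal Python sketch records the order in which threads acquire a single lock and then forces a second execution to follow that order. It is not the tool described in this paper: the class `TracedLock`, the helper `run`, and the use of small integer thread identifiers are all hypothetical names chosen for the example, and the data race detection that would run during replay is omitted.

```python
import threading


class TracedLock:
    """Hypothetical lock that records, or replays, its acquisition order."""

    def __init__(self, trace=None):
        self._lock = threading.Lock()
        self._cond = threading.Condition()
        self._replaying = trace is not None
        # In record mode the trace starts empty; in replay mode it is fixed.
        self.trace = list(trace) if trace is not None else []
        self._next = 0  # index of the next trace entry to be replayed

    def acquire(self, tid):
        if self._replaying:
            # Block until the recorded trace says it is this thread's turn.
            with self._cond:
                self._cond.wait_for(
                    lambda: self._next < len(self.trace)
                    and self.trace[self._next] == tid
                )
            self._lock.acquire()
        else:
            self._lock.acquire()
            # Appended while holding the lock, so trace order is acquisition order.
            self.trace.append(tid)

    def release(self):
        self._lock.release()
        if self._replaying:
            # Advance the replay position and wake up the waiting threads.
            with self._cond:
                self._next += 1
                self._cond.notify_all()


def run(lock, n_threads=3, rounds=4):
    """Run a small non-deterministic workload and return the observed order."""
    order = []

    def worker(tid):
        for _ in range(rounds):
            lock.acquire(tid)
            order.append(tid)  # shared state, protected by the lock
            lock.release()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


# Record phase: only the synchronization order is logged.
recorder = TracedLock()
recorded = run(recorder)

# Replay phase: the same workload is forced to follow the recorded order.
replayed = run(TracedLock(trace=recorder.trace))
```

In this sketch the record phase adds only one list append per lock acquisition, which is why it can stay enabled, while any heavier analysis would be attached to the replayed run, where the order of synchronization operations is already fixed.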