Accelerator processors allow energy-efficient computation at high performance, especially for computationintensive applications. There exists a plethora of different accelerator architectures, such as GPUs and the Cell Broadband Engine. Each accelerator has its own programming language, but the recently introduced OpenCL language unifies accelerator programming languages. Hereby, OpenCL achieves functional protability, allowing to reduce the development time of kernels. Functional portability however has limited value without performance portability: the possibility to re-use optimized kernels with good performance. This paper investigates the specificity of code optimizations to accelerator architecture and the severity of lack of performance portability.