This poster advocates a statistically rigorous data analysis methodology for reporting Java performance. The key idea is to model the non-determinism in measured performance (caused by JIT compilation, garbage collection, and other sources) by computing confidence intervals. We show that prevalent data analysis techniques can lead to misleading or incorrect conclusions, and we present a toolkit that addresses this problem.
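
As an illustration of the key idea, the following is a minimal sketch of computing a confidence interval for the mean of repeated measurements. It is not the toolkit's code: the class name `ConfidenceInterval`, the method names, and the sample timings are hypothetical, and the critical value (Student's t for small samples) is supplied by the caller because the Java standard library does not provide it.

```java
import java.util.Arrays;

// Illustrative sketch only; names and data are assumptions, not the toolkit's API.
final class ConfidenceInterval {

    /**
     * Returns {lower, upper} bounds of a confidence interval for the mean.
     * criticalValue is the t (or z) quantile for the chosen confidence level.
     */
    static double[] of(double[] samples, double criticalValue) {
        int n = samples.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two measurements");
        }
        double mean = Arrays.stream(samples).average().getAsDouble();
        double sumSq = 0.0;
        for (double x : samples) {
            sumSq += (x - mean) * (x - mean);
        }
        // Sample standard deviation uses n - 1 in the denominator.
        double stdDev = Math.sqrt(sumSq / (n - 1));
        double halfWidth = criticalValue * stdDev / Math.sqrt(n);
        return new double[] { mean - halfWidth, mean + halfWidth };
    }

    /** True if the two intervals share at least one point. */
    static boolean overlap(double[] a, double[] b) {
        return a[0] <= b[1] && b[0] <= a[1];
    }

    public static void main(String[] args) {
        // Hypothetical execution times in seconds, one per JVM invocation.
        double[] baseline = { 10.0, 12.0, 14.0 };
        double[] candidate = { 9.0, 11.0, 13.0 };
        // Student's t critical value: 95% two-sided, 2 degrees of freedom.
        double t95 = 4.303;

        double[] ciBase = of(baseline, t95);
        double[] ciCand = of(candidate, t95);
        System.out.printf("baseline:  [%.3f, %.3f]%n", ciBase[0], ciBase[1]);
        System.out.printf("candidate: [%.3f, %.3f]%n", ciCand[0], ciCand[1]);
        System.out.println("intervals overlap: " + overlap(ciBase, ciCand));
    }
}
```

One common reading, assumed here, is that overlapping intervals do not support the conclusion that one alternative is faster than the other at the chosen confidence level, even when the means differ.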