Generating an FPGA configuration from a description of a digital circuit in a hardware description language takes a long time. The configuration must also be of high quality, so that the FPGA resources are used efficiently, the clock frequency is maximized, and the power consumption is minimized. In this work we present two new packing algorithms that achieve better quality and shorter runtimes than the frequently used AAPack packer. Their partitioning-based methodology allows us to exploit multithreading on commodity hardware. First, we demonstrate the benefits of our fully partitioning-based packer, PartSA. Existing partitioning-based packers have difficulty meeting the cluster size and bandwidth constraints of the functional blocks. We add a fast simulated annealing step after partitioning to solve these problems. PartSA improves total wirelength by 26% while packing large circuits up to 2.3x faster on a four-core CPU. Unfortunately, PartSA cannot be used for architectures without a complete crossbar in the functional blocks. We therefore propose a second packer, MultiPart, which combines the benefits of partitioning-based and seed-based packing. MultiPart packs up to 4x faster while still improving total wirelength by 20%.