This paper proposes several optimizations for lattice-based key-encapsulation mechanisms (KEMs) that use the Number Theoretic Transform (NTT) on the popular ARM Cortex-M4 microcontroller. The improvements take the form of faster code, obtained through more efficient modular reductions, small polynomial multiplications, and more aggressive layer merging in the NTT, as well as reduced stack usage. We test these optimizations in software implementations of Kyber and NewHope, both round 2 candidates in the NIST post-quantum project, and of NewHope-Compact, a recently proposed derivative of NewHope with smaller parameters. Our software is the first implementation of NewHope-Compact on the Cortex-M4 and is faster than previous high-speed implementations of Kyber and NewHope on the same platform. Moreover, it provides a common framework for comparing these algorithms at the same level of optimization. Our results show that NewHope-Compact is the fastest algorithm, followed by Kyber and finally NewHope, which seems to suffer from its large modulus and error distribution for small dimensions.