Deep neural networks, and the language models built on top of them, have seen a lot of hype over the past couple of months, especially the GPT family of models. But is this hype really deserved? What relationships and patterns do these models end up learning? Do they truly reason and internally represent abstract concepts? Most importantly, how does any of this generalize to applications of deep neural networks outside of language modelling? We'll untangle these questions and work toward answers with a bottom-up approach.