Preface

As the volume of data being generated grows, we are entering an era in which data-driven research plays an increasingly important role alongside traditional hypothesis-driven research. It is therefore important to be able to tease out trends and patterns from large volumes of data, often through carefully crafted visualizations. In this guide, we provide an introduction to the process of extracting, processing and eventually visualizing data to identify meaningful trends. In particular, we will work with publicly available COVID-19 statistics (e.g. the number of cases worldwide), using the R programming language to perform data wrangling (with the dplyr package) and data visualization (with the ggplot2 package).

Apart from programming, the guide also discusses other aspects of data visualization, with an emphasis on aesthetics and the proper interpretation of plots. Overall, we hope that the guide will help readers not only to generate plots and work more effectively with data, but also to appreciate the subtleties of interpreting results from graphical plots.

If you find any errors or wish to offer any feedback, feel free to contact me at .

And let’s dive in!