Hierarchical Visualization of High Dimensional Data

Our ability to investigate biological entities has improved over the years, thanks to next-generation sequencing technologies that make data collection ever more efficient. However, better data collection has not necessarily led to better knowledge generation. Across the different fields of biological study, data is often high dimensional, with each entity of interest corresponding to a single dimension in the dataset. Datasets with more than a thousand dimensions are not uncommon, and visualizing them without sacrificing some of the data is challenging.

The focus of this PhD project was the development of methods to support user-driven visual exploration and analysis of high dimensional biological data, specifically in the context of microbial ecology studies. The project aimed to tackle this by structuring the data using combinations of hierarchical visualization methods, along with supporting methods to highlight patterns of interest within the dataset and alternative views to show different aspects of the data. Altogether, this would provide an environment where a user could display a full high dimensional dataset and manipulate and explore it in real time using their own domain knowledge.

The PhD thesis, "Hierarchical Visualization of High Dimensional Data: Interactive Exploration of ’Omics Type Data", was successfully defended in April 2022.

Team

  • Alexander Macquisten (PhD student), Newcastle University
  • Adrian M Smith (external supervisor), Unilever R&D
  • Dr Sara Johansson Fernstad (main supervisor), Newcastle University
  • Prof Nick Holliman (co-supervisor), formerly Newcastle University