Multi-objective Bayesian Optimization with Heuristic Objectives for Biomedical and Molecular Data Analysis Workflows
Overview
Bayesian optimization in bioinformatics has been applied to protein design
Most bioinformatic analyses are unsupervised
A typical workflow involves many steps and parameters
Difficulty with defining objectives makes AutoML challenging to apply in bioinformatics
AutoML approaches construct objectives for a given problem
Motivation for AutoGeneS constructed objectives
Our method automatically infers which objectives are useful to guide optimization
MOBO basics
We build on the random scalarizations approach that returns a subset of the Pareto front
We determine the region of the Pareto front using objective behaviours
Three examples of desirable behaviours
Toy data simulating useful and not useful objectives
Optimizing cofactor in the analysis of Imaging Mass Cytometry (IMC) data
We construct objectives for the clustering workflow using pairs of co-expressed proteins
We construct two meta-objectives using expert annotations
Parameters selected by our method led to clusterings that agree with expert annotations
Quantitative evaluation of performance
Summary
Description:
Watch a 42-minute AutoML Seminar presentation exploring multi-objective Bayesian optimization techniques for biomedical and molecular data analysis workflows, particularly focusing on unsupervised bioinformatics problems. Learn how to tackle hyperparameter optimization challenges when dealing with undefined objectives by using multiple noisy heuristic metrics. Discover a novel method that infers useful heuristic objectives based on domain-specific criteria and adaptively updates scalarization functions through multi-output Gaussian process surrogate functions. Follow along as the speaker demonstrates practical applications in single-cell RNA sequencing and highly multiplexed imaging datasets, showing how this approach effectively handles clustering analyses where traditional optimization methods struggle. Gain insights into how this method successfully identifies biologically meaningful groupings of cells based on expression profiles, evaluates cluster separation, and validates results against expert annotations in Imaging Mass Cytometry data analysis.
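To make the scalarization idea in the description concrete, here is a minimal sketch of plain random scalarization over several heuristic objectives. It is not the speaker's method: there is no Gaussian process surrogate and no adaptive update of the scalarization functions. The objective scores, the weight range, and all function names are hypothetical, chosen only for illustration. Each random draw of strictly positive weights picks the candidate with the best weighted sum, and the picks collected over many draws form a subset of the Pareto front.

```python
import random


def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Indices of the points that no other point dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def random_scalarization_picks(points, n_draws=200, seed=0):
    """Indices chosen by maximizing randomly weighted sums of the objectives.

    Weights are drawn from an assumed range (0.05, 1.0) and normalized, so
    they stay strictly positive; that guarantees every pick is Pareto-optimal.
    """
    rng = random.Random(seed)
    n_obj = len(points[0])
    picks = set()
    for _ in range(n_draws):
        raw = [rng.uniform(0.05, 1.0) for _ in range(n_obj)]
        total = sum(raw)
        weights = [r / total for r in raw]
        best = max(
            range(len(points)),
            key=lambda i: sum(w * v for w, v in zip(weights, points[i])),
        )
        picks.add(best)
    return sorted(picks)


# Hypothetical heuristic scores for five candidate parameter settings,
# e.g. two cluster-quality metrics that partly disagree.
scores = [(0.9, 0.2), (0.7, 0.7), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]

front = pareto_front(scores)
picks = random_scalarization_picks(scores)
print("Pareto front:", front)
print("Picked by random scalarization:", picks)
```

Restricting the distribution the weights are drawn from narrows the picks to one region of the Pareto front, which is the lever the talk describes adjusting based on how each objective behaves.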