Statistical tests for large tree-structured data

Karthik Baharath (University of Nottingham)

Friday 27th March, 2015 15:00-16:00 Maths 204


The Continuum Random Tree (CRT), proposed by Aldous, arises as the (invariant) continuous limit, as the number of vertices grows without bound, of a general class of probability models for tree-structured data. We propose powerful goodness-of-fit tests for data that admit hierarchical, tree-like representations, using two different characterizations of the CRT, relating to a Brownian excursion and to a special class of subtrees. The appropriateness of the tests for binary trees obtained from hierarchical clustering algorithms will be discussed, and the tests will be applied to a dataset of tumour images with the objective of detecting tumour heterogeneity.

