# What Is Persistent Homology?

Persistent homology looks interesting, so let's look into it.

Persistent Homology — a Survey
Herbert Edelsbrunner and John Harer, Article, January 2008, DOI: 10.1090/conm/453/08802

ABSTRACT.

Persistent homology is an algebraic tool for measuring topological features of shapes and functions. It casts the multi-scale organization we frequently observe in nature into a mathematical formalism. Here we give a record of the short history of persistent homology and present its basic concepts. Besides the mathematics we focus on algorithms and mention the various connections to applications, including to biomolecules, biological networks, data analysis, and geometric modeling.

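In plain terms: persistent homology is an algebraic tool for measuring the topological features of shapes and functions, and it gives a mathematical form to the multi-scale organization often observed in nature. The survey records the short history of persistent homology, presents its basic concepts, and, beyond the mathematics, focuses on algorithms and on connections to applications such as biomolecules, biological networks, data analysis, and geometric modeling. (Paraphrased from a Google Translate rendering of the abstract.)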
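To get a feel for what "measuring topological features of a function" means, here is a minimal sketch of the simplest case: 0-dimensional persistence of a 1-D sequence under a sublevel-set filtration. The threshold is raised step by step, each local minimum gives birth to a connected component, and when two components meet the younger one dies. The function name `sublevel_persistence_0d` and the sample values are my own illustration and are not taken from the survey.

```python
def sublevel_persistence_0d(values):
    """Return sorted (birth, death) pairs for the connected components of the
    sublevel sets of a 1-D sequence. The component born at the global minimum
    never dies, so its death is reported as infinity."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = [None] * n   # None means "not yet added to the filtration"
    birth = {}            # root index -> birth value of its component

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the later birth dies.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < values[i]:
                    # Skip zero-persistence pairs (born and killed at once).
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    root = find(order[0])
    pairs.append((birth[root], float("inf")))
    return sorted(pairs)


bars = sublevel_persistence_0d([1.0, 3.0, 0.0, 2.0, 0.5])
print(bars)  # [(0.0, inf), (0.5, 2.0), (1.0, 3.0)]
```

Each pair is one bar of the barcode: a long bar is a feature that persists over many scales, a short bar is likely noise.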

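I wanted to know what is happening in persistent homology lately, so I searched the literature from 2020 onward and found a paper that uses ideas from persistent homology to improve the accuracy of deep-learning-based segmentation.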

A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology
James R. Clough, Nicholas Byrne, Ilkay Oksuz, Veronika A. Zimmer, Julia A. Schnabel, Andrew P. King
Abstract

We introduce a method for training neural networks to perform image or volume segmentation in which prior knowledge about the topology of the segmented object can be explicitly provided and then incorporated into the training process. By using the differentiable properties of persistent homology, a concept used in topological data analysis, we can specify the desired topology of segmented objects in terms of their Betti numbers and then drive the proposed segmentations to contain the specified topological features. Importantly this process does not require any ground-truth labels, just prior knowledge of the topology of the structure being segmented. We demonstrate our approach in four experiments: one on MNIST image denoising and digit recognition, one on left ventricular myocardium segmentation from magnetic resonance imaging data from the UK Biobank, one on the ACDC public challenge dataset and one on placenta segmentation from 3-D ultrasound. We find that embedding explicit prior knowledge in neural network segmentation tasks is most beneficial when the segmentation task is especially challenging and that it can be used in either a semi-supervised or post-processing context to extract a useful training gradient from images without pixelwise labels.

Index Terms—Segmentation, Persistent Homology, Topology, Medical Imaging, Convolutional Neural Networks
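The abstract does not spell out the loss itself, so the following is only a rough sketch of how "specify the Betti number, then push the segmentation toward it" could be scored. I assume the persistence bars of the predicted probability map have already been computed and lie in [0, 1]. The function `topological_loss` and its exact form (reward the longest `target_betti` bars, penalise all the others) are my assumption for illustration, not a formula confirmed by the text quoted above.

```python
def topological_loss(bars, target_betti):
    """Score a barcode against a desired Betti number (hypothetical form).

    bars:         list of (birth, death) pairs with values in [0, 1]
    target_betti: how many features of this dimension we want to see
    """
    lifetimes = sorted((abs(death - birth) for birth, death in bars), reverse=True)
    wanted = lifetimes[:target_betti]
    unwanted = lifetimes[target_betti:]
    # Wanted features: the loss falls to 0 as their lifetime approaches 1.
    # A wanted feature that is absent entirely counts as lifetime 0 (cost 1).
    loss = sum(1.0 - life ** 2 for life in wanted)
    loss += target_betti - len(wanted)
    # Unwanted features: the loss falls to 0 as their lifetime approaches 0.
    loss += sum(life ** 2 for life in unwanted)
    return loss


# One clean feature, exactly as requested: nothing to penalise.
print(topological_loss([(0.0, 1.0)], target_betti=1))                # 0.0
# A spurious second feature with lifetime 0.5 adds 0.5 ** 2.
print(topological_loss([(0.0, 1.0), (0.25, 0.75)], target_betti=1))  # 0.25
```

This only evaluates a number. For training, the births and deaths would have to be differentiable functions of the network output inside an autodiff framework, which is the property the abstract refers to.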

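Presumably, by teaching neural networks mathematics, physics, chemistry, biology, pharmacology, medicine, and so on (that is, supplying these as prior knowledge), it will become possible to build artificial intelligence that surpasses humans. (Science has developed in the order of mathematics, physics, chemistry, biology, and medicine. Before mathematics came logic, that is, philosophy.)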

Let's look at what appears to be the preceding paper by James R. Clough et al.

Explicit topological priors for deep-learning based image segmentation using persistent homology

James R. Clough, Ilkay Oksuz, Nicholas Byrne, Julia A. Schnabel and Andrew P. King
School of Biomedical Engineering & Imaging Sciences, King’s College London, UK

arXiv:1901.10244v1 [cs.CV] 29 Jan 2019

1 Introduction
Image segmentation, the task of assigning a class label to each pixel in an image, is a key problem in computer vision and medical image analysis. The most successful segmentation algorithms now use deep convolutional neural networks (CNN), with recent progress made in combining fine-grained local features with coarse-grained global features, such as in the popular U-net architecture [17]. Such methods allow information from a large spatial neighbourhood to be used in classifying each pixel. However, the loss function is usually one which considers each pixel individually rather than considering higher-level structures collectively.

Explicit topological priors are presumably what gets introduced in order to consider these higher-level structures collectively.

In many applications it is important to correctly capture the topological characteristics of the anatomy in a segmentation result. For example, detecting and counting distinct cells in electron microscopy images requires that neighbouring cells are correctly distinguished. Even very small pixelwise errors, such as incorrectly labelling one pixel in a thin boundary between cells, can cause two distinct cells to appear to merge. In this way significant topological errors can be caused by small pixelwise errors that have little effect on the loss function during training but may have large effects on downstream tasks. Another example is the modelling of blood flow in vessels, which requires accurate determination of vessel connectivity. In this case, small pixelwise errors can have a significant impact on the subsequent modelling task. Finally, when imaging subjects who may have congenital heart defects, the presence or absence of small holes in the walls between two chambers is diagnostically important and can be identified from images, but using current techniques it is difficult to incorporate this relevant information into a segmentation algorithm. For downstream tasks it is important that these holes are correctly segmented but they are frequently missed by current segmentation algorithms as they are insufficiently penalised during training. See Figure 1 for examples of topologically correct and incorrect segmentations of cardiac magnetic resonance images (MRI).
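The "one wrong pixel merges two cells" point is easy to check on a toy mask. Below is a minimal sketch that counts 4-connected foreground components with a flood fill. The 3x5 mask and the function name `count_components` are made up for illustration and have nothing to do with the paper's data.

```python
from collections import deque


def count_components(mask):
    """Count 4-connected foreground (value 1) components in a 2-D list."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Unvisited foreground pixel: flood-fill its whole component.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Two "cells" separated by a one-pixel-wide boundary column.
truth = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]
pred = [row[:] for row in truth]
pred[1][2] = 1  # a single boundary pixel mislabelled as cell

wrong = sum(t != p for t_row, p_row in zip(truth, pred) for t, p in zip(t_row, p_row))
pixel_error = wrong / (len(truth) * len(truth[0]))

print(count_components(truth))  # 2
print(count_components(pred))   # 1
print(pixel_error)              # 1 wrong pixel out of 15
```

The pixelwise error is tiny, yet the number of connected components (the 0th Betti number) drops from 2 to 1, which is exactly the kind of mistake a per-pixel loss barely notices.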

Let's look at one more paper, a different one this time.

Persistent-Homology-based Machine Learning and its Applications – A Survey
Chi Seng Pun et al., arXiv:1811.00252v1 [math.AT] 1 Nov 2018

Abstract
A suitable feature representation that can both preserve the data intrinsic information and reduce data complexity and dimensionality is key to the performance of machine learning models. Deeply rooted in algebraic topology, persistent homology (PH) provides a delicate balance between data simplification and intrinsic structure characterization, and has been applied to various areas successfully. However, the combination of PH and machine learning has been hindered greatly by three challenges, namely topological representation of data, PH-based distance measurements or metrics, and PH-based feature representation. With the development of topological data analysis, progresses have been made on all these three problems, but widely scattered in different literatures.
In this paper, we provide a systematical review of PH and PH-based supervised and unsupervised models from a computational perspective. Our emphasizes are the recent development of mathematical models and tools, including PH softwares and PH-based functions, feature representations, kernels, and similarity models. Essentially, this paper can work as a roadmap for the practical application of PH-based machine learning tools. Further, we consider different topological feature representations in different machine learning models, and investigate their impacts on the protein secondary structure classification.

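In plain terms (first paragraph of the abstract): a feature representation that preserves the intrinsic information of the data while reducing its complexity and dimensionality is key to the performance of machine learning models. Persistent homology (PH), deeply rooted in algebraic topology, strikes a delicate balance between data simplification and characterization of intrinsic structure, and has been applied successfully in many fields. Combining PH with machine learning, however, has been held back by three challenges: topological representation of data, PH-based distance measurements or metrics, and PH-based feature representation. As topological data analysis has developed, progress has been made on all three, but the results are scattered widely across the literature.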
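As a concrete picture of a "PH-based feature representation", here is a minimal sketch of one simple option, a Betti curve: the barcode is sampled on a fixed grid of thresholds, giving a fixed-length vector that an ordinary classifier can take as input. The barcodes, the grid, and the function names are made up for illustration, and this is only one of the many representations such a survey could cover.

```python
def betti_curve(bars, grid):
    """For each threshold t in grid, count the bars alive at t (birth <= t < death)."""
    return [sum(1 for birth, death in bars if birth <= t < death) for t in grid]


def l1_distance(u, v):
    """L1 distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


grid = [0.0, 1.0, 2.0, 3.0]
bars_a = [(0.0, float("inf")), (0.5, 2.0), (1.0, 3.0)]
bars_b = [(0.0, float("inf")), (1.0, 3.0)]

features_a = betti_curve(bars_a, grid)
features_b = betti_curve(bars_b, grid)

print(features_a)                           # [1, 3, 2, 1]
print(features_b)                           # [1, 2, 2, 1]
print(l1_distance(features_a, features_b))  # 1
```

Barcodes with different numbers of bars map to vectors of the same length, which is what makes them usable as input features. The price is that the pairing of births with deaths is lost.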