Two Principles of Geometric Deep Learning
Geometric Deep Learning (GDL) attempts to provide a formal unification for a broad class of existing machine learning problems.
After CNNs exploded in 2012, showing unprecedented prediction accuracy on image classification tasks, a group of researchers from Yann LeCun's team decided to extend that success to other, more exotic domains. Specifically, they started working on generalizing ConvNets to graphs. Their efforts were described in an influential paper.
Since then, Graph Neural Networks have become a hot area of research within the ML community and beyond. Numerous papers have been published explaining how different kinds and flavors of GNNs can be applied to complicated, irregular, high-dimensional, non-grid-like structures (graphs, manifolds, meshes, etc.).
The fervor with which researchers have been diving into the field is explained by the simple fact that most of the data coming to us from the physical world can be represented by some type of irregular structure, specifically by graphs. Therefore, developing and popularizing GNNs can have considerable implications in industry, business, and academia.
- GNNs helped discover novel, unusual, yet highly potent antibiotics.
- Google uses a GNN model for placement optimization and chip design cycle shortening.
- Google Maps leverages graph neural nets to improve travel time predictions.
- DeepMind's GNN-based AlphaFold2 solved protein folding - a challenge scientists had struggled with for about 50 years.
- Amazon, Pinterest, UberEats, and other giants use GNN-powered recommendation systems.
In 2021, M. Bronstein et al. released a comprehensive proto-book on Geometric Deep Learning, which talks about the vast promise of GNNs and describes unified geometric principles for exploiting regularities in data. It also gives us a template for building advanced neural networks for high-dimensional learning.
This article will explain the two main principles of geometric deep learning in simple language. And I'd like to thank Matt Kovtun, a prominent data scientist, for helping me write it.
Euclidean vs Non-Euclidean
In the context of deep learning, non-Euclidean data is particularly challenging to work with as it doesn't allow us to perform many vector space operations.
For example, you can't just add or subtract vertices on a graph, whereas you can absolutely perform those operations with points on a grid.
So, to work with irregular structures, many operations essential to the success of DL models (convolutions, pooling, etc.) must be reinvented.
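To make the contrast concrete, here is a minimal NumPy sketch (the arrays and sizes are invented purely for illustration): pixel patches on a grid can be added and averaged directly, while for a graph all we really have is an adjacency matrix, so a "convolution" has to be re-expressed as aggregation over neighbors.

```python
import numpy as np

# Grid data (e.g., two grayscale image patches): element-wise vector-space
# operations such as addition and averaging are perfectly well defined.
patch_a = np.random.rand(4, 4)
patch_b = np.random.rand(4, 4)
blended = 0.5 * (patch_a + patch_b)  # meaningful on a grid

# Graph data: nodes carry features, but the domain itself is an adjacency
# matrix. "Adding two vertices" has no meaning; we can only aggregate
# features along edges (the core idea behind message passing).
adjacency = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [1, 0, 0]])           # a tiny 3-node graph
node_features = np.random.rand(3, 8)        # one 8-dim feature vector per node

# A graph "convolution" re-invented as neighborhood aggregation:
# each node averages the features of its neighbors.
degree = adjacency.sum(axis=1, keepdims=True)
aggregated = (adjacency @ node_features) / degree
```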
How Do We Tackle High-Dimensional Learning?
Well, we can't simply project high-dimensional data onto a low-dimensional space: doing so would destroy too much of the input's structure and make the model's predictions effectively worthless.
What we can do, however, is exploit the low-dimensional geometric structure of the physical world itself, which is exactly why organically generated datasets exhibit regularities in the first place.
There are two main principles the GDL book uses to tackle the problem of high-dimensional learning - symmetry groups and scale separation.
Symmetry Groups
In GDL, high-dimensional input signals are assumed to live on structured domains, and the structure of a domain determines which transformations can meaningfully be applied to the data.
In GDL, we only use transformations that do not change the structure of the domain. They're known as symmetries, and their properties are described by group theory (a short numerical check follows the list below). Some of these properties are:
- Composability: Applying two symmetries one after another yields another symmetry, so the object's structure is still preserved.
- Invertibility: Every symmetry must have an inverse; otherwise, we'd risk losing information while processing the data.
- Identity: The identity transformation leaves the object unchanged, so it is a symmetry by definition.
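These properties are easy to verify numerically for a familiar symmetry group: rotations of the plane. Below is a small NumPy sketch (the angles are arbitrary choices for illustration).

```python
import numpy as np

def rotation(theta):
    """2-D rotation matrix: a symmetry of the plane (it preserves distances)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

a, b = rotation(0.3), rotation(1.1)

# Composability (closure): composing two rotations gives another rotation.
assert np.allclose(a @ b, rotation(0.3 + 1.1))

# Invertibility: every rotation can be undone, so no information is lost.
assert np.allclose(rotation(0.3) @ rotation(-0.3), np.eye(2))

# Identity: rotating by zero leaves every point unchanged.
assert np.allclose(rotation(0.0), np.eye(2))
```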
There are two kinds of functions that behave predictably under symmetries: invariant and equivariant.
1. Invariant
If we apply a transformation from the symmetry group (to part of the domain or to all of it) and the output doesn't change, the function is invariant. A good example of this is image classification: if we rotate an image of a cat, a trained CNN should still label it "cat."
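A simple way to see invariance in code, without training anything: summing node features over a graph is invariant to relabelling the nodes, which is a natural symmetry of graph domains. The sketch below uses made-up feature sizes purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
node_features = rng.random((5, 3))   # 5 nodes, 3 features each

def readout(x):
    """A permutation-invariant 'readout': sum features over all nodes."""
    return x.sum(axis=0)

# Relabelling the nodes (a symmetry of the graph domain) does not change
# the output, so the function is invariant.
perm = rng.permutation(5)
assert np.allclose(readout(node_features), readout(node_features[perm]))
```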
2. Equivariant
But we might not always want an invariant function, because it can erase the identities of individual elements of the domain. In that case, we turn to an equivariant function, which transforms its output in the same way its input is transformed. In an image segmentation problem, for instance, if we shift part of an image, the predicted segmentation mask shows the same shift.
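Shift equivariance is easy to check numerically for a 1-D circular convolution, which is essentially what a CNN layer applies along each row of an image. In the sketch below (the signal length and kernel are arbitrary choices), shifting the input and then convolving gives the same result as convolving and then shifting the output.

```python
import numpy as np

def circular_conv(x, kernel):
    """1-D circular convolution: a local, shift-equivariant operation."""
    n, k = len(x), len(kernel)
    return np.array([sum(kernel[j] * x[(i + j) % n] for j in range(k))
                     for i in range(n)])

rng = np.random.default_rng(0)
signal = rng.random(16)
kernel = np.array([0.25, 0.5, 0.25])

# Shifting the input shifts the output by the same amount -- the hallmark
# of equivariance (this is why a segmentation mask follows the image).
shift_then_conv = circular_conv(np.roll(signal, 3), kernel)
conv_then_shift = np.roll(circular_conv(signal, kernel), 3)
assert np.allclose(shift_then_conv, conv_then_shift)
```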
Scale Separation
Using symmetries helps shrink the space of candidate functions, but we can't rely on exact symmetries alone, because real-world data is full of distortions and noise.
To fight this, GDL also uses scale separation.
In a nutshell, the book says we should build large-scale operations by composing local ones. That way, errors don't propagate globally: distortions stay contained within limited regions of the input and don't derail the overall prediction.
This mechanism has been part of modern CNNs for a long time, but the proto-book now gives us explicit guidance on how to apply the principle when building deep learning models.
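As a rough illustration of why composing local operations keeps distortions contained, consider the toy NumPy sketch below (the smoothing filter and depth are arbitrary choices, not anything prescribed by the proto-book): after k local steps, a single corrupted input value can only influence outputs within k positions of it.

```python
import numpy as np

def local_smooth(x):
    """One local operation: average each point with its two neighbors."""
    return (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0

def multiscale(x, depth=3):
    """Compose small local operations to build a larger-scale one."""
    for _ in range(depth):
        x = local_smooth(x)
    return x

signal = np.zeros(32)
corrupted = signal.copy()
corrupted[16] = 1.0                 # a localized distortion

diff = np.abs(multiscale(corrupted) - multiscale(signal))
# After 3 local steps the distortion has spread by at most 3 positions
# in each direction; the rest of the output is untouched.
assert np.count_nonzero(diff > 1e-12) <= 2 * 3 + 1
```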
Summing Up
The two main rules of Geometric Deep Learning are:
1. Apply symmetry groups (build with transformations that don't change the structure of the underlying domain).
2. Leverage scale separation (achieve geometric stability by composing local operations, which keeps errors contained within small regions of the domain).
The second principle complements the first: it isn't enough for our functions to be invariant or equivariant; we also need them to act locally, within small neighborhoods of points in the domain.