Nonlinear dimensionality reductionNonlinear dimensionality reduction, also known as manifold learning, refers to various related techniques that aim to project high-dimensional data onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space, or learning the mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa) itself. The techniques described below can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as singular value decomposition and principal component analysis.
Cluster analysisCluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, , information retrieval, bioinformatics, data compression, computer graphics and machine learning.
Spectral clusteringIn multivariate statistics, spectral clustering techniques make use of the spectrum (eigenvalues) of the similarity matrix of the data to perform dimensionality reduction before clustering in fewer dimensions. The similarity matrix is provided as an input and consists of a quantitative assessment of the relative similarity of each pair of points in the dataset. In application to image segmentation, spectral clustering is known as segmentation-based object categorization.
Laplacian matrixIn the mathematical field of graph theory, the Laplacian matrix, also called the graph Laplacian, admittance matrix, Kirchhoff matrix or discrete Laplacian, is a matrix representation of a graph. Named after Pierre-Simon Laplace, the graph Laplacian matrix can be viewed as a matrix form of the negative discrete Laplace operator on a graph approximating the negative continuous Laplacian obtained by the finite difference method. The Laplacian matrix relates to many useful properties of a graph.
Dynamical systemIn mathematics, a dynamical system is a system in which a function describes the time dependence of a point in an ambient space, such as in a parametric curve. Examples include the mathematical models that describe the swinging of a clock pendulum, the flow of water in a pipe, the random motion of particles in the air, and the number of fish each springtime in a lake. The most general definition unifies several concepts in mathematics such as ordinary differential equations and ergodic theory by allowing different choices of the space and how time is measured.
Supervised learningSupervised learning (SL) is a paradigm in machine learning where input objects (for example, a vector of predictor variables) and a desired output value (also known as human-labeled supervisory signal) train a model. The training data is processed, building a function that maps new data on expected output values. An optimal scenario will allow for the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias).
K-means clusteringk-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.
DimensionIn physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it. Thus, a line has a dimension of one (1D) because only one coordinate is needed to specify a point on it - for example, the point at 5 on a number line. A surface, such as the boundary of a cylinder or sphere, has a dimension of two (2D) because two coordinates are needed to specify a point on it - for example, both a latitude and longitude are required to locate a point on the surface of a sphere.
AttractorIn the mathematical field of dynamical systems, an attractor is a set of states toward which a system tends to evolve, for a wide variety of starting conditions of the system. System values that get close enough to the attractor values remain close even if slightly disturbed. In finite-dimensional systems, the evolving variable may be represented algebraically as an n-dimensional vector. The attractor is a region in n-dimensional space.
Whitney embedding theoremIn mathematics, particularly in differential topology, there are two Whitney embedding theorems, named after Hassler Whitney: The strong Whitney embedding theorem states that any smooth real m-dimensional manifold (required also to be Hausdorff and second-countable) can be smoothly embedded in the real 2m-space, \R^{2m}, if m > 0. This is the best linear bound on the smallest-dimensional Euclidean space that all m-dimensional manifolds embed in, as the real projective spaces of dimension m cannot be embedded into real (2m − 1)-space if m is a power of two (as can be seen from a characteristic class argument, also due to Whitney).
Dynamical systems theoryDynamical systems theory is an area of mathematics used to describe the behavior of complex dynamical systems, usually by employing differential equations or difference equations. When differential equations are employed, the theory is called continuous dynamical systems. From a physical point of view, continuous dynamical systems is a generalization of classical mechanics, a generalization where the equations of motion are postulated directly and are not constrained to be Euler–Lagrange equations of a least action principle.
Rössler attractorThe Rössler attractor ˈrɒslər is the attractor for the Rössler system, a system of three non-linear ordinary differential equations originally studied by Otto Rössler in the 1970s. These differential equations define a continuous-time dynamical system that exhibits chaotic dynamics associated with the fractal properties of the attractor. Rössler interpreted it as a formalization of a taffy-pulling machine. Some properties of the Rössler system can be deduced via linear methods such as eigenvectors, but the main features of the system require non-linear methods such as Poincaré maps and bifurcation diagrams.
Dimensionality reductionDimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension. Working in high-dimensional spaces can be undesirable for many reasons; raw data are often sparse as a consequence of the curse of dimensionality, and analyzing the data is usually computationally intractable (hard to control or deal with).
Four-dimensional spaceFour-dimensional space (4D) is the mathematical extension of the concept of three-dimensional space (3D). Three-dimensional space is the simplest possible abstraction of the observation that one needs only three numbers, called dimensions, to describe the sizes or locations of objects in the everyday world. For example, the volume of a rectangular box is found by measuring and multiplying its length, width, and height (often labeled x, y, and z).
Euclidean spaceEuclidean space is the fundamental space of geometry, intended to represent physical space. Originally, that is, in Euclid's Elements, it was the three-dimensional space of Euclidean geometry, but in modern mathematics there are Euclidean spaces of any positive integer dimension n, which are called Euclidean n-spaces when one wants to specify their dimension. For n equal to one or two, they are commonly called respectively Euclidean lines and Euclidean planes.
Linkless embeddingIn topological graph theory, a mathematical discipline, a linkless embedding of an undirected graph is an embedding of the graph into three-dimensional Euclidean space in such a way that no two cycles of the graph are linked. A flat embedding is an embedding with the property that every cycle is the boundary of a topological disk whose interior is disjoint from the graph. A linklessly embeddable graph is a graph that has a linkless or flat embedding; these graphs form a three-dimensional analogue of the planar graphs.
Weak supervisionWeak supervision, also called semi-supervised learning, is a paradigm in machine learning, the relevance and notability of which increased with the advent of large language models due to large amount of data required to train them. It is characterized by using a combination of a small amount of human-labeled data (exclusively used in more expensive and time-consuming supervised learning paradigm), followed by a large amount of unlabeled data (used exclusively in unsupervised learning paradigm).
DBSCANDensity-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. It is a density-based clustering non-parametric algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away).
Nash embedding theoremsThe Nash embedding theorems (or imbedding theorems), named after John Forbes Nash Jr., state that every Riemannian manifold can be isometrically embedded into some Euclidean space. Isometric means preserving the length of every path. For instance, bending but neither stretching nor tearing a page of paper gives an isometric embedding of the page into Euclidean space because curves drawn on the page retain the same arclength however the page is bent.
Hidden attractorIn the bifurcation theory, a bounded oscillation that is born without loss of stability of stationary set is called a hidden oscillation. In nonlinear control theory, the birth of a hidden oscillation in a time-invariant control system with bounded states means crossing a boundary, in the domain of the parameters, where local stability of the stationary states implies global stability (see, e.g. Kalman's conjecture).