Publication

Learning to Represent and Reconstruct 3D Deformable Objects

Related concepts (32)

3D scanner is the process of analyzing a real-world object or environment to collect three dimensional data of its shape and possibly its appearance (e.g. color). The collected data can then be used to construct digital 3D models. A 3D scanner can be based on many different technologies, each with its own limitations, advantages and costs. Many limitations in the kind of objects that can be digitised are still present. For example, optical technology may encounter many difficulties with dark, shiny, reflective or transparent objects.

Computer vision

Computer vision tasks include methods for , , and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the forms of decisions. Understanding in this context means the transformation of visual images (the input to the retina in the human analog) into descriptions of the world that make sense to thought processes and can elicit appropriate action.

Pose (computer vision)

In the fields of computing and computer vision, pose (or spatial pose) represents the position and orientation of an object, usually in three dimensions. Poses are often stored internally as transformation matrices. The term “pose” is largely synonymous with the term “transform”, but a transform may often include scale, whereas pose does not. In computer vision, the pose of an object is often estimated from camera input by the process of pose estimation.

Riemannian manifold

In differential geometry, a Riemannian manifold or Riemannian space (M, g), so called after the German mathematician Bernhard Riemann, is a real, smooth manifold M equipped with a positive-definite inner product gp on the tangent space TpM at each point p. The family gp of inner products is called a Riemannian metric (or Riemannian metric tensor). Riemannian geometry is the study of Riemannian manifolds. A common convention is to take g to be smooth, which means that for any smooth coordinate chart (U, x) on M, the n2 functions are smooth functions.

3D modeling

In 3D computer graphics, 3D modeling is the process of developing a mathematical coordinate-based representation of any surface of an object (inanimate or living) in three dimensions via specialized software by manipulating edges, vertices, and polygons in a simulated 3D space. Three-dimensional (3D) models represent a physical body using a collection of points in 3D space, connected by various geometric entities such as triangles, lines, curved surfaces, etc.

Computer stereo vision

Computer stereo vision is the extraction of 3D information from digital images, such as those obtained by a CCD camera. By comparing information about a scene from two vantage points, 3D information can be extracted by examining the relative positions of objects in the two panels. This is similar to the biological process of stereopsis. In traditional stereo vision, two cameras, displaced horizontally from one another, are used to obtain two differing views on a scene, in a manner similar to human binocular vision.

Shape

A shape or figure is a graphical representation of an object or its external boundary, outline, or external surface, as opposed to other properties such as color, texture, or material type. A plane shape or plane figure is constrained to lie on a plane, in contrast to solid 3D shapes. A two-dimensional shape or two-dimensional figure (also: 2D shape or 2D figure) may lie on a more general curved surface (a non-Euclidean two-dimensional space). Lists of shapes Some simple shapes can be put into broad categories.

Homography (computer vision)

In the field of computer vision, any two images of the same planar surface in space are related by a homography (assuming a pinhole camera model). This has many practical applications, such as , , or camera motion—rotation and translation—between two images. Once camera resectioning has been done from an estimated homography matrix, this information may be used for navigation, or to insert models of 3D objects into an image or video, so that they are rendered with the correct perspective and appear to have been part of the original scene (see Augmented reality).

Pseudo-Riemannian manifold

In differential geometry, a pseudo-Riemannian manifold, also called a semi-Riemannian manifold, is a differentiable manifold with a metric tensor that is everywhere nondegenerate. This is a generalization of a Riemannian manifold in which the requirement of positive-definiteness is relaxed. Every tangent space of a pseudo-Riemannian manifold is a pseudo-Euclidean vector space. A special case used in general relativity is a four-dimensional Lorentzian manifold for modeling spacetime, where tangent vectors can be classified as timelike, null, and spacelike.

Riemannian geometry

Riemannian geometry is the branch of differential geometry that studies Riemannian manifolds, defined as smooth manifolds with a Riemannian metric (an inner product on the tangent space at each point that varies smoothly from point to point). This gives, in particular, local notions of angle, length of curves, surface area and volume. From those, some other global quantities can be derived by integrating local contributions.

Correspondence problem

The correspondence problem refers to the problem of ascertaining which parts of one image correspond to which parts of another image, where differences are due to movement of the camera, the elapse of time, and/or movement of objects in the photos.

Feature (computer vision)

In computer vision and , a feature is a piece of information about the content of an image; typically about whether a certain region of the image has certain properties. Features may be specific structures in the image such as points, edges or objects. Features may also be the result of a general neighborhood operation or feature detection applied to the image. Other examples of features are related to motion in image sequences, or to shapes defined in terms of curves or boundaries between different image regions.

Object detection

Object detection is a computer technology related to computer vision and that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. Well-researched domains of object detection include face detection and pedestrian detection. Object detection has applications in many areas of computer vision, including and video surveillance. It is widely used in computer vision tasks such as , vehicle counting, activity recognition, face detection, face recognition, video object co-segmentation.

Depth map

In 3D computer graphics and computer vision, a depth map is an or that contains information relating to the distance of the surfaces of scene objects from a viewpoint. The term is related (and may be analogous) to depth buffer, Z-buffer, Z-buffering, and Z-depth. The "Z" in these latter terms relates to a convention that the central axis of view of a camera is in the direction of the camera's Z axis, and not to the absolute Z axis of a scene. File:Cubic Structure.jpg|Cubic Structure File:Cubic Frame Stucture and Floor Depth Map.

Machine vision

Machine vision (MV) is the technology and methods used to provide -based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to many technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science.

Transformer (machine learning model)

A transformer is a deep learning architecture that relies on the parallel multi-head attention mechanism. The modern transformer was proposed in the 2017 paper titled 'Attention Is All You Need' by Ashish Vaswani et al., Google Brain team. It is notable for requiring less training time than previous recurrent neural architectures, such as long short-term memory (LSTM), and its later variation has been prevalently adopted for training large language models on large (language) datasets, such as the Wikipedia corpus and Common Crawl, by virtue of the parallelized processing of input sequence.

Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Glossary of Riemannian and metric geometry

This is a glossary of some terms used in Riemannian geometry and metric geometry — it doesn't cover the terminology of differential topology. The following articles may also be useful; they either contain specialised vocabulary or provide more detailed expositions of the definitions given below. Connection Curvature Metric space Riemannian manifold See also: Glossary of general topology Glossary of differential geometry and topology List of differential geometry topics Unless stated otherwise, letters X, Y, Z below denote metric spaces, M, N denote Riemannian manifolds, |xy| or denotes the distance between points x and y in X.

Semantic query

Semantic queries allow for queries and analytics of associative and contextual nature. Semantic queries enable the retrieval of both explicitly and implicitly derived information based on syntactic, semantic and structural information contained in data. They are designed to deliver precise results (possibly the distinctive selection of one single piece of information) or to answer more fuzzy and wide open questions through pattern matching and digital reasoning. Semantic queries work on named graphs, linked data or triples.

Bundle metric

In differential geometry, the notion of a metric tensor can be extended to an arbitrary vector bundle, and to some principal fiber bundles. This metric is often called a bundle metric, or fibre metric. If M is a topological manifold and pi : E → M a vector bundle on M, then a metric on E is a bundle map k : E ×M E → M × R from the fiber product of E with itself to the trivial bundle with fiber R such that the restriction of k to each fibre over M is a nondegenerate bilinear map of vector spaces.