Publication

Real-Time Generative Hand Modeling and Tracking

Related concepts (32)

In the field of gesture recognition and , finger tracking is a high-resolution technique developed in 1969 that is employed to know the consecutive position of the fingers of the user and hence represent objects in 3D. In addition to that, the finger tracking technique is used as a tool of the computer, acting as an external device in our computer, similar to a keyboard and a mouse. The finger tracking system is focused on user-data interaction, where the user interacts with virtual data, by handling through the fingers the volumetric of a 3D object that we want to represent.

Virtual reality

Virtual reality (VR) is a simulated experience that employs pose tracking and 3D near-eye displays to give the user an immersive feel of a virtual world. Applications of virtual reality include entertainment (particularly video games), education (such as medical or military training) and business (such as virtual meetings). Other distinct types of VR-style technology include augmented reality and mixed reality, sometimes referred to as extended reality or XR, although definitions are currently changing due to the nascence of the industry.

Motion capture

Motion capture (sometimes referred as mo-cap or mocap, for short) is the process of recording the movement of objects or people. It is used in military, entertainment, sports, medical applications, and for validation of computer vision and robots. In filmmaking and video game development, it refers to recording actions of human actors and using that information to animate digital character models in 2D or 3D computer animation. When it includes face and fingers or captures subtle expressions, it is often referred to as performance capture.

Immersion (virtual reality)

Immersion into virtual reality (VR) is a perception of being physically present in a non-physical world. The perception is created by surrounding the user of the VR system in images, sound or other stimuli that provide an engrossing total environment. The name is a metaphoric use of the experience of submersion applied to representation, fiction or simulation.

Augmented reality

Augmented reality (AR) is an interactive experience that combines the real world and computer-generated content. The content can span multiple sensory modalities, including visual, auditory, haptic, somatosensory and olfactory. AR can be defined as a system that incorporates three basic features: a combination of real and virtual worlds, real-time interaction, and accurate 3D registration of virtual and real objects. The overlaid sensory information can be constructive (i.e. additive to the natural environment), or destructive (i.

Virtual reality headset

A virtual reality headset (or VR headset) is a head-mounted device that provides virtual reality for the wearer. VR headsets are widely used with VR video games but they are also used in other applications, including simulators and trainers. VR headsets typically include a stereoscopic display (providing separate images for each eye), stereo sound, and sensors like accelerometers and gyroscopes for tracking the pose of the user's head to match the orientation of the virtual camera with the user's eye positions in the real world.

Mixed reality

Mixed reality (MR) is a term used to describe the merging of a real-world environment and a computer-generated one. Physical and virtual objects may co-exist in mixed reality environments and interact in real time. Mixed reality that incorporates haptics has sometimes been referred to as Visuo-haptic mixed reality. In a physics context, the term "interreality system" refers to a virtual reality system coupled with its real-world counterpart.

Single-lens reflex camera

A single-lens reflex camera (SLR) is a camera that typically uses a mirror and prism system (hence "reflex" from the mirror's reflection) that permits the photographer to view through the lens and see exactly what will be captured. With twin lens reflex and rangefinder cameras, the viewed image could be significantly different from the final image. When the shutter button is pressed on most SLRs, the mirror flips out of the light path, allowing light to pass through to the light receptor and the image to be captured.

Gesture recognition

Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. It is a subdiscipline of computer vision. Gestures can originate from any bodily motion or state, but commonly originate from the face or hand. Focuses in the field include emotion recognition from face and hand gesture recognition since they are all expressions. Users can make simple gestures to control or interact with devices without physically touching them.

Depth of field

The depth of field (DOF) is the distance between the nearest and the furthest objects that are in acceptably sharp focus in an image captured with a camera. For cameras that can only focus on one object distance at a time, depth of field is the distance between the nearest and the farthest objects that are in acceptably sharp focus. "Acceptably sharp focus" is defined using a property called the "circle of confusion". The depth of field can be determined by focal length, distance to subject, the acceptable circle of confusion size, and aperture.

Virtual reality sickness

Virtual reality sickness, or VR sickness occurs when exposure to a virtual environment causes symptoms that are similar to motion sickness symptoms. The most common symptoms are general discomfort, eye strain, headache, stomach awareness, nausea, vomiting, pallor, sweating, fatigue, drowsiness, disorientation, and apathy. Other symptoms include postural instability and retching. Common causes are low frame rate, input lag, and the vergence-accommodation-conflict.

Digital single-lens reflex camera

A digital single-lens reflex camera (digital SLR or DSLR) is a digital camera that combines the optics and the mechanisms of a single-lens reflex camera with a solid-state and digitally records the images from the sensor. The reflex design scheme is the primary difference between a DSLR and other digital cameras. In the reflex design, light travels through the lens and then to a mirror that alternates to send the image to either a prism, which shows the image in the optical viewfinder, or the image sensor when the shutter release button is pressed.

Eye tracking

Eye tracking is the process of measuring either the point of gaze (where one is looking) or the motion of an eye relative to the head. An eye tracker is a device for measuring eye positions and eye movement. Eye trackers are used in research on the visual system, in psychology, in psycholinguistics, marketing, as an input device for human-computer interaction, and in product design. In addition, eye trackers are increasingly being used for assistive and rehabilitative applications such as controlling wheelchairs, robotic arms, and prostheses.

Scale-invariant feature transform

The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, , 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving. SIFT keypoints of objects are first extracted from a set of reference images and stored in a database.

Corner detection

Corner detection is an approach used within computer vision systems to extract certain kinds of features and infer the contents of an image. Corner detection is frequently used in motion detection, , video tracking, image mosaicing, panorama stitching, 3D reconstruction and object recognition. Corner detection overlaps with the topic of interest point detection. A corner can be defined as the intersection of two edges. A corner can also be defined as a point for which there are two dominant and different edge directions in a local neighbourhood of the point.

Light field camera

A light field camera, also known as a plenoptic camera, is a camera that captures information about the light field emanating from a scene; that is, the intensity of light in a scene, and also the precise direction that the light rays are traveling in space. This contrasts with conventional cameras, which record only light intensity at various wavelengths. One type uses an array of micro-lenses placed in front of an otherwise conventional image sensor to sense intensity, color, and directional information.

Polygonal modeling

In 3D computer graphics, polygonal modeling is an approach for modeling objects by representing or approximating their surfaces using polygon meshes. Polygonal modeling is well suited to scanline rendering and is therefore the method of choice for real-time computer graphics. Alternate methods of representing 3D objects include NURBS surfaces, subdivision surfaces, and equation-based (implicit surface) representations used in ray tracers. The basic object used in mesh modeling is a vertex, a point in three-dimensional space.

Real-time computing

Real-time computing (RTC) is the computer science term for hardware and software systems subject to a "real-time constraint", for example from event to system response. Real-time programs must guarantee response within specified time constraints, often referred to as "deadlines". Real-time responses are often understood to be in the order of milliseconds, and sometimes microseconds. A system not specified as operating in real time cannot usually guarantee a response within any timeframe, although typical or expected response times may be given.

Camera

A camera is an optical instrument used to capture and store images or videos, either digitally via an electronic , or chemically via a light-sensitive material such as photographic film. As a pivotal technology in the fields of photography and videography, cameras have played a significant role in the progression of visual arts, media, entertainment, surveillance, and scientific research. The invention of the camera dates back to the 19th century and has since evolved with advancements in technology, leading to a vast array of types and models in the 21st century.

Kalman filter

For statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, who was one of the primary developers of its theory.