Publication

Model-based reinforcement learning and navigation in animals and machines

Related concepts (32)

Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Agent-based model

An agent-based model (ABM) is a computational model for simulating the actions and interactions of autonomous agents (both individual and collective entities such as organizations or groups) in order to understand the behavior of a system and what governs its outcomes. It combines elements of game theory, complex systems, emergence, computational sociology, multi-agent systems, and evolutionary programming. Monte Carlo methods are used to understand the stochasticity of these models.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Memory

Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, it would be impossible for language, relationships, or personal identity to develop. Memory loss is usually described as forgetfulness or amnesia. Memory is often understood as an informational processing system with explicit and implicit functioning that is made up of a sensory processor, short-term (or working) memory, and long-term memory.

Episodic memory

Episodic memory is the memory of everyday events (such as times, location geography, associated emotions, and other contextual information) that can be explicitly stated or conjured. It is the collection of past personal experiences that occurred at particular times and places; for example, the party on one's 7th birthday. Along with semantic memory, it comprises the category of explicit memory, one of the two major divisions of long-term memory (the other being implicit memory).

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.

Intelligent agent

In artificial intelligence, an intelligent agent (IA) is an agent acting in an intelligent manner; it perceives its environment, takes actions autonomously in order to achieve goals, and may improve its performance by learning or acquiring knowledge. An intelligent agent may be simple or complex: a thermostat or other control system is considered an example of an intelligent agent, as is a human being, as is any system that meets the definition, such as a firm, a state, or a biome.

Explicit memory

Explicit memory (or declarative memory) is one of the two main types of long-term human memory, the other of which is implicit memory. Explicit memory is the conscious, intentional recollection of factual information, previous experiences, and concepts. This type of memory is dependent upon three processes: acquisition, consolidation, and retrieval. Explicit memory can be divided into two categories: episodic memory, which stores specific personal experiences, and semantic memory, which stores factual information.

Semantic memory

Semantic memory refers to general world knowledge that humans have accumulated throughout their lives. This general knowledge (word meanings, concepts, facts, and ideas) is intertwined in experience and dependent on culture. New concepts are learned by applying knowledge learned from things in the past. Semantic memory is distinct from episodic memory—the memory of experiences and specific events that occur in one's life that can be recreated at any given point.

Autobiographical memory

Autobiographical memory (AM) is a memory system consisting of episodes recollected from an individual's life, based on a combination of episodic (personal experiences and specific objects, people and events experienced at particular time and place) and semantic (general knowledge and facts about the world) memory. It is thus a type of explicit memory. Conway and Pleydell-Pearce (2000) proposed that autobiographical memory is constructed within a self-memory system (SMS), a conceptual model composed of an autobiographical knowledge base and the working self.

Baddeley's model of working memory

Baddeley's model of working memory is a model of human memory proposed by Alan Baddeley and Graham Hitch in 1974, in an attempt to present a more accurate model of primary memory (often referred to as short-term memory). Working memory splits primary memory into multiple components, rather than considering it to be a single, unified construct. Baddeley & Hitch proposed their three-part working memory model as an alternative to the short-term store in Atkinson & Shiffrin's 'multi-store' memory model (1968).

Reinforcement

In reinforcement theory, it is argued that human behavior is a result of "contingent consequences" to human actions. The publication pushes forward the idea that "you get what you reinforce". This means that the right types of reinforcers can change employee behavior for the better, and negative behavior can be weeded out. The model of self-regulation has three main aspects of human behavior, which are self-awareness, self-reflection, and self-regulation. Reinforcements traditionally align with self-regulation.

Repressed memory

Repressed memory is a controversial, and largely scientifically discredited, psychiatric phenomenon which involves an inability to recall autobiographical information, usually of a traumatic or stressful nature. The concept originated in psychoanalytic theory where repression is understood as a defense mechanism that excludes painful experiences and unacceptable impulses from consciousness. Repressed memory is presently considered largely unsupported by research.

Recurrent neural network

A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by the direction of the flow of information between its layers. In contrast to the uni-directional feedforward neural network, it is a bi-directional artificial neural network, meaning that it allows the output from some nodes to affect subsequent input to the same nodes. The ability of RNNs to use internal state (memory) to process arbitrary sequences of inputs makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.
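
The idea of internal state feeding back into later inputs can be sketched in a few lines. The example below is a minimal illustration, assuming a simple Elman-style update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b); the function names `rnn_step` and `run_rnn` and the weight layout are hypothetical choices made for this sketch, not taken from any particular library.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One recurrent update: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b).

    x and h_prev are lists of floats; w_xh and w_hh are lists of rows.
    """
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def run_rnn(inputs, w_xh, w_hh, b):
    """Process a sequence, carrying the hidden state from step to step."""
    h = [0.0] * len(b)  # initial hidden state (assumed to be all zeros)
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b)
        states.append(h)
    return states

if __name__ == "__main__":
    # One input unit, one hidden unit, unit weights (illustrative values only).
    for h in run_rnn([[1.0], [0.0]], [[1.0]], [[1.0]], [0.0]):
        print(h)
```

In this sketch the second input is zero, yet the second hidden state is nonzero because it still depends on the first input through the recurrent weight. That dependence on history is the "memory" described above.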

Meta-learning (computer science)

Meta-learning is a subfield of machine learning where automatic learning algorithms are applied to metadata about machine learning experiments. As of 2017, the term had not found a standard interpretation; however, the main goal is to use such metadata to understand how automatic learning can become flexible in solving learning problems, and hence to improve the performance of existing learning algorithms or to learn (induce) the learning algorithm itself, which gives rise to the alternative term "learning to learn".

Types of artificial neural networks

There are many types of artificial neural networks (ANN). Artificial neural networks are computational models inspired by biological neural networks, and are used to approximate functions that are generally unknown. Particularly, they are inspired by the behaviour of neurons and the electrical signals they convey between input (such as from the eyes or nerve endings in the hand), processing, and output from the brain (such as reacting to light, touch, or heat). The way neurons semantically communicate is an area of ongoing research.

Long-term memory

Long-term memory (LTM) is the stage of the Atkinson–Shiffrin memory model in which informative knowledge is held indefinitely. It is defined in contrast to short-term and working memory, which persist for only about 18 to 30 seconds. LTM is commonly labelled as "explicit memory" (declarative), as well as "episodic memory," "semantic memory," "autobiographical memory," and "implicit memory" (procedural memory). The idea of separate memories for short- and long-term storage originated in the 19th century.

Spatial memory

In cognitive psychology and neuroscience, spatial memory is a form of memory responsible for the recording and recovery of information needed to plan a course to a location and to recall the location of an object or the occurrence of an event. Spatial memory is necessary for orientation in space. Spatial memory can also be divided into egocentric and allocentric spatial memory. A person's spatial memory is required to navigate around a familiar city. A rat's spatial memory is needed to learn the location of food at the end of a maze.

Software agent

In computer science, a software agent or software AI is a computer program that acts for a user or other program in a relationship of agency, which derives from the Latin agere (to do): an agreement to act on one's behalf. Such "action on behalf of" implies the authority to decide which, if any, action is appropriate. Some agents are colloquially known as bots, from robot. They may be embodied, as when execution is paired with a robot body, or as software such as a chatbot executing on a phone (e.g.

Q-learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.
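
As a rough illustration of the update rule, the following is a minimal tabular Q-learning sketch. Everything about the environment is an assumption made for this example: a hypothetical five-state chain in which the agent moves left or right and receives a reward of 1 only on reaching the last state. The hyperparameters (`alpha`, `gamma`, `epsilon`) and the function names are likewise illustrative.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal (toy assumption)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right

def step(state, action):
    """Deterministic chain environment: reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0, max_steps=200):
    """Learn Q(s, a) from sampled transitions only; no environment model is stored."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            nxt, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q

def greedy_policy(q):
    """Best action in each non-terminal state according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent never inspects `step` as a model; it only sees sampled transitions, which is what "model-free" refers to. In this toy chain the learned greedy policy should move right in every non-terminal state.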