Publication

Learning continuous-time working memory tasks with on-policy neural reinforcement learning

Related concepts (31)

Working memory is a cognitive system with a limited capacity that can hold information temporarily. It is important for reasoning and the guidance of decision-making and behavior. Working memory is often used synonymously with short-term memory, but some theorists consider the two forms of memory distinct, assuming that working memory allows for the manipulation of stored information, whereas short-term memory only refers to the short-term storage of information.

Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Memory

Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, it would be impossible for language, relationships, or personal identity to develop. Memory loss is usually described as forgetfulness or amnesia. Memory is often understood as an informational processing system with explicit and implicit functioning that is made up of a sensory processor, short-term (or working) memory, and long-term memory.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Q-learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.

Baddeley's model of working memory

Baddeley's model of working memory is a model of human memory proposed by Alan Baddeley and Graham Hitch in 1974, in an attempt to present a more accurate model of primary memory (often referred to as short-term memory). Working memory splits primary memory into multiple components, rather than considering it to be a single, unified construct. Baddeley & Hitch proposed their three-part working memory model as an alternative to the short-term store in Atkinson & Shiffrin's 'multi-store' memory model (1968).

Short-term memory

Short-term memory (or "primary" or "active memory") is the capacity for holding a small amount of information in an active, readily available state for a short interval. For example, short-term memory holds a phone number that has just been recited. The duration of short-term memory (absent rehearsal or active maintenance) is estimated to be on the order of seconds. The commonly cited capacity of 7 items, found in Miller's Law, has been superseded by 4±1 items. In contrast, long-term memory holds information indefinitely.

Artificial neural network

Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons.

Spatial memory

In cognitive psychology and neuroscience, spatial memory is a form of memory responsible for the recording and recovery of information needed to plan a course to a location and to recall the location of an object or the occurrence of an event. Spatial memory is necessary for orientation in space. Spatial memory can also be divided into egocentric and allocentric spatial memory. A person's spatial memory is required to navigate around a familiar city. A rat's spatial memory is needed to learn the location of food at the end of a maze.

Randomness

In common usage, randomness is the apparent or actual lack of definite pattern or predictability in information. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual random events are, by definition, unpredictable, but if the probability distribution is known, the frequency of different outcomes over repeated events (or "trials") is predictable. For example, when throwing two dice, the outcome of any particular roll is unpredictable, but a sum of 7 will tend to occur twice as often as 4.

Psychophysics

Psychophysics quantitatively investigates the relationship between physical stimuli and the sensations and perceptions they produce. Psychophysics has been described as "the scientific study of the relation between stimulus and sensation" or, more completely, as "the analysis of perceptual processes by studying the effect on a subject's experience or behaviour of systematically varying the properties of a stimulus along one or more physical dimensions". Psychophysics also refers to a general class of methods that can be applied to study a perceptual system.

Evidence

Evidence for a proposition is what supports the proposition. It is usually understood as an indication that the supported proposition is true. What role evidence plays and how it is conceived varies from field to field. In epistemology, evidence is what justifies beliefs or what makes it rational to hold a certain doxastic attitude. For example, a perceptual experience of a tree may act as evidence that justifies the belief that there is a tree. In this role, evidence is usually understood as a private mental state.

Random number generation

Random number generation is a process by which, often by means of a random number generator (RNG), a sequence of numbers or symbols that cannot be reasonably predicted better than by random chance is generated. This means that the particular outcome sequence will contain some patterns detectable in hindsight but unpredictable to foresight. True random number generators can be hardware random-number generators (HRNGs), wherein each generation is a function of the current value of a physical environment's attribute that is constantly changing in a manner that is practically impossible to model.

Decision-making

In psychology, decision-making (also spelled decision making and decisionmaking) is regarded as the cognitive process resulting in the selection of a belief or a course of action among several possible alternative options. It could be either rational or irrational. The decision-making process is a reasoning process based on assumptions of values, preferences and beliefs of the decision-maker. Every decision-making process produces a final choice, which may or may not prompt action.

Memory management unit

A memory management unit (MMU), sometimes called paged memory management unit (PMMU), is a computer hardware unit that examines all memory references on the memory bus, translating these requests, known as virtual memory addresses, into physical addresses in main memory. In modern systems, programs generally have addresses that access the theoretical maximum memory of the computer architecture, 32 or 64 bits. The MMU maps the addresses from each program into separate areas in physical memory, which is generally much smaller than the theoretical maximum.

/dev/random

In Unix-like operating systems, and are s that serve as cryptographically secure pseudorandom number generators. They allow access to environmental noise collected from device drivers and other sources. typically blocked if there was less entropy available than requested; more recently (see below for the differences between operating systems) it usually blocks at startup until sufficient entropy has been gathered, then unblocks permanently.

Interference theory

The interference theory is a theory regarding human memory. Interference occurs in learning. The notion is that memories encoded in long-term memory (LTM) are forgotten and cannot be retrieved into short-term memory (STM) because either memory could interfere with the other. There is an immense number of encoded memories within the storage of LTM. The challenge for memory retrieval is recalling the specific memory and working in the temporary workspace provided in STM.

Randomized algorithm

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random determined by the random bits; thus either the running time, or the output (or both) are random variables.

Random seed

A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator. For a seed to be used in a pseudorandom number generator, it does not need to be random. Because of the nature of number generating algorithms, so long as the original seed is ignored, the rest of the values that the algorithm generates will follow probability distribution in a pseudorandom manner.

Computer network

A computer network is a set of computers sharing resources located on or provided by network nodes. Computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are made up of telecommunication network technologies based on physically wired, optical, and wireless radio-frequency methods that may be arranged in a variety of network topologies. The nodes of a computer network can include personal computers, servers, networking hardware, or other specialized or general-purpose hosts.