Publication

Driving and suppressing the human language network using large language models

Related concepts (32)

The human brain is the central organ of the human nervous system, and with the spinal cord makes up the central nervous system. The brain consists of the cerebrum, the brainstem and the cerebellum. It controls most of the activities of the body, processing, integrating, and coordinating the information it receives from the sense organs, and making decisions as to the instructions sent to the rest of the body. The brain is contained in, and protected by, the skull bones of the head.

Language model

A language model is a probabilistic model of a natural language that can generate probabilities of a series of words, based on text corpora in one or multiple languages it was trained on. Large language models, as their most advanced form, are a combination of feedforward neural networks and transformers. They have superseded recurrent neural network-based models, which had previously superseded the pure statistical models, such as word n-gram language model.

Origin of language

The origin of language (spoken and signed, as well as language-related technological systems such as writing), its relationship with human evolution, and its consequences have been subjects of study for centuries. Scholars wishing to study the origins of language must draw inferences from evidence such as the fossil record, archaeological evidence, contemporary language diversity, studies of language acquisition, and comparisons between human language and systems of communication existing among animals (particularly other primates).

Language

Language is a structured system of communication that consists of grammar and vocabulary. It is the primary means by which humans convey meaning, both in spoken and written forms, and may also be conveyed through sign languages. The vast majority of human languages have developed writing systems that allow for the recording and preservation of the sounds or signs of language. Human language is characterized by its cultural and historical diversity, with significant variations observed between cultures and across time.

Artificial neural network

Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons.

Recurrent neural network

A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by direction of the flow of information between its layers. In contrast to uni-directional feedforward neural network, it is a bi-directional artificial neural network, meaning that it allows the output from some nodes to affect subsequent input to the same nodes. Their ability to use internal state (memory) to process arbitrary sequences of inputs makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.

Constructed language

A constructed language (shortened to a conlang) is a language whose phonology, grammar, and vocabulary, instead of having developed naturally, are consciously devised for some purpose, which may include being devised for a work of fiction. A constructed language may also be referred to as an artificial, planned or invented language, or (in some cases) a fictional language. Planned languages (or engineered languages/engelangs) are languages that have been purposefully designed; they are the result of deliberate, controlling intervention and are thus of a form of language planning.

Transformer (machine learning model)

A transformer is a deep learning architecture that relies on the parallel multi-head attention mechanism. The modern transformer was proposed in the 2017 paper titled 'Attention Is All You Need' by Ashish Vaswani et al., Google Brain team. It is notable for requiring less training time than previous recurrent neural architectures, such as long short-term memory (LSTM), and its later variation has been prevalently adopted for training large language models on large (language) datasets, such as the Wikipedia corpus and Common Crawl, by virtue of the parallelized processing of input sequence.

Large language model

A large language model (LLM) is a language model characterized by its large size. Their size is enabled by AI accelerators, which are able to process vast amounts of text data, mostly scraped from the Internet. The artificial neural networks which are built can contain from tens of millions and up to billions of weights and are (pre-)trained using self-supervised learning and semi-supervised learning. Transformer architecture contributed to faster training.

Brain

A brain is an organ that serves as the center of the nervous system in all vertebrate and most invertebrate animals. It is located in the head, usually close to the sensory organs for senses such as vision. It is the most complex organ in a vertebrate's body. In a human, the cerebral cortex contains approximately 14–16 billion neurons, and the estimated number of neurons in the cerebellum is 55–70 billion. Each neuron is connected by synapses to several thousand other neurons.

Neural network

A neural network can refer to a neural circuit of biological neurons (sometimes also called a biological neural network), a network of artificial neurons or nodes in the case of an artificial neural network. Artificial neural networks are used for solving artificial intelligence (AI) problems; they model connections of biological neurons as weights between nodes. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed.

Cerebral cortex

The cerebral cortex, also known as the cerebral mantle, is the outer layer of neural tissue of the cerebrum of the brain in humans and other mammals. The cerebral cortex mostly consists of the six-layered neocortex, with just 10% consisting of allocortex. It is separated into two cortices, by the longitudinal fissure that divides the cerebrum into the left and right cerebral hemispheres. The two hemispheres are joined beneath the cortex by the corpus callosum. The cerebral cortex is the largest site of neural integration in the central nervous system.

Language acquisition

Language acquisition is the process by which humans acquire the capacity to perceive and comprehend language (in other words, gain the ability to be aware of language and to understand it), as well as to produce and use words and sentences to communicate. Language acquisition involves structures, rules, and representation. The capacity to use language successfully requires one to acquire a range of tools including phonology, morphology, syntax, semantics, and an extensive vocabulary.

Great ape language

Research into great ape language has involved teaching chimpanzees, bonobos, gorillas and orangutans to communicate with humans and each other using sign language, physical tokens, lexigrams, and imitative human speech. Some primatologists argue that the use of these communication methods indicate primate "language" ability, though this depends on one's definition of language. Non-human animals have produced behaviors that resemble human sentence production.

Social network

A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors. The social network perspective provides a set of methods for analyzing the structure of whole social entities as well as a variety of theories explaining the patterns observed in these structures. The study of these structures uses social network analysis to identify local and global patterns, locate influential entities, and examine network dynamics.

Animal language

Animal languages are forms of non-human animal communication that show similarities to human language. Animals communicate through a variety of signs, such as sounds or movements. Signing among animals may be considered complex enough to be a form of language if the inventory of signs is large. The signs are relatively arbitrary, and the animals seem to produce them with a degree of volition (as opposed to relatively automatic conditioned behaviors or unconditioned instincts, usually including facial expressions).

Language death

In linguistics, language death occurs when a language loses its last native speaker. By extension, language extinction is when the language is no longer known, including by second-language speakers, when it becomes known as an extinct language. A related term is linguicide, the death of a language from natural or political causes, and, rarely, glottophagy, the absorption or replacement of a minor language by a major language.

Natural language processing

Natural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.

Cortical homunculus

A cortical homunculus () is a distorted representation of the human body, based on a neurological "map" of the areas and proportions of the human brain dedicated to processing motor functions, or sensory functions, for different parts of the body. Nerve fibresconducting somatosensory information from all over the bodyterminate in various areas of the parietal lobe in the cerebral cortex, forming a representational map of the body. Findings from the 2010s and early 2020s began to call this interpretation into question, and research is ongoing in this field.

Generative pre-trained transformer

Generative pre-trained transformers (GPT) are a type of large language model (LLM) and a prominent framework for generative artificial intelligence. The first GPT was introduced in 2018 by OpenAI. GPT models are artificial neural networks that are based on the transformer architecture, pre-trained on large data sets of unlabelled text, and able to generate novel human-like content. As of 2023, most LLMs have these characteristics and are sometimes referred to broadly as GPTs.