Multi-objective optimizationMulti-objective optimization or Pareto optimization (also known as multi-objective programming, vector optimization, multicriteria optimization, or multiattribute optimization) is an area of multiple-criteria decision making that is concerned with mathematical optimization problems involving more than one objective function to be optimized simultaneously. Multi-objective is a type of vector optimization that has been applied in many fields of science, including engineering, economics and logistics where optimal decisions need to be taken in the presence of trade-offs between two or more conflicting objectives.
Cluster analysisCluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, , information retrieval, bioinformatics, data compression, computer graphics and machine learning.
GenerationA generation refers to all of the people born and living at about the same time, regarded collectively. It can also be described as, "the average period, generally considered to be about 20–30 years, during which children are born and grow up, become adults, and begin to have children." In kinship terminology, it is a structural term designating the parent-child relationship. It is known as biogenesis, reproduction, or procreation in the biological sciences.
Data and information visualizationData and information visualization (data viz or info viz) is the practice of designing and creating easy-to-communicate and easy-to-understand graphic or visual representations of a large amount of complex quantitative and qualitative data and information with the help of static, dynamic or interactive visual items.
Greatest GenerationThe Greatest Generation, also known as the G.I. Generation and the World War II generation, is the Western demographic cohort following the Lost Generation and preceding the Silent Generation. The generation is generally defined as people born from 1901 to 1927. They were shaped by the Great Depression and were the primary generation composing the enlisted forces in World War II. Most people of the Greatest Generation are the parents of the Silent Generation and Baby Boomers, and, in turn, were the children of the Lost Generation.
Simple random sampleIn statistics, a simple random sample (or SRS) is a subset of individuals (a sample) chosen from a larger set (a population) in which a subset of individuals are chosen randomly, all with the same probability. It is a process of selecting a sample in a random way. In SRS, each subset of k individuals has the same probability of being chosen for the sample as any other subset of k individuals. A simple random sample is an unbiased sampling technique. Simple random sampling is a basic type of sampling and can be a component of other more complex sampling methods.
Silent GenerationThe Silent Generation, also known as the Traditionalist Generation, is the Western demographic cohort following the Greatest Generation and preceding the baby boomers. The generation is generally defined as people born from 1928 to 1945. By this definition and U.S. Census data, there were 23 million Silents in the United States as of 2019. In the United States, the Great Depression of the 1930s and World War II in the early-to-mid 1940s caused people to have fewer children and as a result, the generation is comparatively small.
Generation ZGeneration Z (often shortened to Gen Z), colloquially known as zoomers, is the demographic cohort succeeding Millennials and preceding Generation Alpha. Researchers and popular media use the mid-to-late 1990s as starting birth years and the early 2010s as ending birth years. Most members of Generation Z are children of Generation X or younger Baby Boomers. The older members may be the parents of the younger members of Generation Alpha.
Systematic samplingIn survey methodology, systematic sampling is a statistical method involving the selection of elements from an ordered sampling frame. The most common form of systematic sampling is an equiprobability method. In this approach, progression through the list is treated circularly, with a return to the top once the list ends. The sampling starts by selecting an element from the list at random and then every kth element in the frame is selected, where k, is the sampling interval (sometimes known as the skip): this is calculated as: where n is the sample size, and N is the population size.
Stratified samplingIn statistics, stratified sampling is a method of sampling from a population which can be partitioned into subpopulations. In statistical surveys, when subpopulations within an overall population vary, it could be advantageous to sample each subpopulation (stratum) independently. Stratification is the process of dividing members of the population into homogeneous subgroups before sampling. The strata should define a partition of the population.
Sampling (statistics)In statistics, quality assurance, and survey methodology, sampling is the selection of a subset or a statistical sample (termed sample for short) of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians attempt to collect samples that are representative of the population. Sampling has lower costs and faster data collection compared to recording data from the entire population, and thus, it can provide insights in cases where it is infeasible to measure an entire population.
Mathematical optimizationMathematical optimization (alternatively spelled optimisation) or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
Generation XGeneration X (often shortened to Gen X) is the demographic cohort following the baby boomers and preceding the millennials. Researchers and popular media use the mid-to-late 1960s as starting birth years and the late 1970s to early 1980s as ending birth years, with the generation being generally defined as people born from 1965 to 1980. By this definition and U.S. Census data, there are 65.2 million Gen Xers in the United States as of 2019.
Hierarchical clusteringIn data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two categories: Agglomerative: This is a "bottom-up" approach: Each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. Divisive: This is a "top-down" approach: All observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
Generation AlphaGeneration Alpha (Gen Alpha for short) is the demographic cohort succeeding Generation Z. Scientists and popular media use the early 2010s as starting birth years and the early-to-mid 2020s as ending birth years . Named after alpha, the first letter in the Greek alphabet, Generation Alpha is the first to be born entirely in the 21st century and the third millennium. Members of Generation Alpha are mostly children of Millennials, and older Generation Z.
K-means clusteringk-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.
Parallel coordinatesParallel coordinates are a common way of visualizing and analyzing high-dimensional datasets. To show a set of points in an n-dimensional space, a backdrop is drawn consisting of n parallel lines, typically vertical and equally spaced. A point in n-dimensional space is represented as a polyline with vertices on the parallel axes; the position of the vertex on the i-th axis corresponds to the i-th coordinate of the point.
Scientific visualizationScientific visualization (also spelled scientific visualisation) is an interdisciplinary branch of science concerned with the visualization of scientific phenomena. It is also considered a subset of computer graphics, a branch of computer science. The purpose of scientific visualization is to graphically illustrate scientific data to enable scientists to understand, illustrate, and glean insight from their data.
Sampling frameIn statistics, a sampling frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions. Importance of the sampling frame is stressed by Jessen and Salant and Dillman. In many practical situations the frame is a matter of choice to the survey planner, and sometimes a critical one. [...] Some very worthwhile investigations are not undertaken at all because of the lack of an apparent frame; others, because of faulty frames, have ended in a disaster or in cloud of doubt.
Cluster samplingIn statistics, cluster sampling is a sampling plan used when mutually homogeneous yet internally heterogeneous groupings are evident in a statistical population. It is often used in marketing research. In this sampling plan, the total population is divided into these groups (known as clusters) and a simple random sample of the groups is selected. The elements in each cluster are then sampled. If all elements in each sampled cluster are sampled, then this is referred to as a "one-stage" cluster sampling plan.