Q-learning
Q-learning is a model-free reinforcement learning algorithm that learns the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without adaptation. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over all successive steps, starting from the current state.
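As a rough illustration of how Q-learning learns action values without a model of the environment, the following minimal sketch runs a tabular version on a small corridor task. The corridor environment, the function names `q_learning` and `greedy_policy`, and the parameter values (learning rate `alpha`, discount `gamma`, exploration rate `epsilon`) are assumptions chosen for this example and do not come from the text above. The update used is the standard one-step rule Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)].

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0 .. n_states-1. Action 0 moves left (staying put at
    state 0), action 1 moves right. Entering the rightmost state gives
    reward 1 and ends the episode; every other step gives reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # The agent only observes the sampled transition and reward;
            # it never consults a transition model (hence "model-free").
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # The terminal state has value zero, so no bootstrap term there.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` should give a policy that moves right in every non-terminal state of this toy corridor. The entry for the terminal state is never updated and carries no meaning.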
Variable Specific Impulse Magnetoplasma Rocket
The Variable Specific Impulse Magnetoplasma Rocket (VASIMR) is an electrothermal thruster under development for possible use in spacecraft propulsion. It uses radio waves to ionize and heat an inert propellant, forming a plasma, and then uses a magnetic field to confine and accelerate the expanding plasma, generating thrust. It is a plasma propulsion engine, one of several types of spacecraft electric propulsion systems. The VASIMR method for heating plasma was originally developed during nuclear fusion research.