Darwin’s Evolution Theory justifies the fact that Change is the only Constant.
Physical. Biological. Chemical. Evolution is the answer to Extinction. It comes in all forms, shapes, sizes, visuals. It can be tangible and intangible too.
Even though humans are considered to be “The Most Evolved Species” on Planet Earth, there still exists a constant sense of evolution, or rather, a competition to become “The Best” amongst each other, be it in sports, politics, movies, technology, business, or basically any profession.
“What” One Wants. One knows. “Why” One Wants it. That’s also known. But “How” One Achieves it, makes all the difference. Sets apart the Best from the Rest.
This “How” part forms the crux of our advancements. The Methodologies. The Learnings. The Understanding to Implement it. Even take it a step further, as in the case of Artificial Intelligence and Robotics. Humanoids!
“Learning” or “How to Learn a new skill?” has become one of the most fundamental questions for scientists, students, researchers and teachers across the globe. The desire to understand the answer is obvious: if we can understand this, we can enable the human species to do things we might not have thought possible before. Alternatively, we can train machines to do more “human” tasks and create true Artificial Intelligence.
While we don’t have a complete answer to the above question yet, a few things are clear. Irrespective of the skill, we first learn by interacting with the environment. Whether it is an adult learning to drive a car or an infant learning to walk, the learning is based on interaction with the environment. Learning from interaction is the foundational concept underlying all theories of learning and intelligence.
Today, we will explore Reinforcement Learning: goal-oriented learning based on interaction with the environment. Reinforcement Learning is said to be the hope of true Artificial Intelligence, and rightly so, because its potential is immense.
Reinforcement Learning (RL) is a training method based on rewarding desired behaviours and/or punishing undesired ones. It has been adopted in Artificial Intelligence (AI) as a way of directing a learning agent through rewards and penalties rather than through labelled examples. Reinforcement Learning is studied and applied in Operations Research, Information Theory, Game Theory, Control Theory, Simulation-based Optimization, Multi-Agent Systems, Swarm Intelligence, Statistics and Genetic Algorithms.
RL is usually modelled as a Markov Decision Process (MDP).
Fig: The Agent-Environment interaction in a Markov Decision Process.
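The interaction loop in the figure can be sketched in a few lines of Python. This is only an illustration: the `Corridor` environment, its reward of 1 at the goal, and the two policies are hypothetical and not taken from any library. At every step the agent picks an action from the current state, and the environment answers with the next state and a reward.

```python
import random


class Corridor:
    """Hypothetical toy environment: positions 0..4, goal at position 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position.
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=1000):
    """Run one episode and return (total_reward, steps_taken)."""
    state = env.reset()
    total_reward = 0.0
    for t in range(1, max_steps + 1):
        action = policy(state)                   # agent acts on the state
        state, reward, done = env.step(action)   # environment responds
        total_reward += reward
        if done:
            return total_reward, t
    return total_reward, max_steps


def always_right(state):
    return 1


def random_policy(state):
    return random.choice([-1, 1])


print(run_episode(Corridor(), always_right))   # reaches the goal in 4 steps
print(run_episode(Corridor(), random_policy))  # usually takes longer
```

Both policies receive the same feedback from the environment; the only difference is how quickly they collect the reward, which is exactly what a learning algorithm tries to improve.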
HOW TO FORMULATE A BASIC RL PROBLEM?
Some key terms that describe the elements of an RL problem are:
- Environment: The world, physical or simulated, in which the agent operates.
- State: Current situation of the agent.
- Reward: Feedback from the environment.
- Policy: Method to map the agent’s state to actions.
- Value: Expected long-term (cumulative) reward that an agent would receive by taking an action in a particular state.
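To see how these terms fit together, here is a minimal sketch of tabular Q-learning, one common RL algorithm (the article does not prescribe a specific one). The five-cell corridor, the reward of 1 at the goal, and the values of `ALPHA`, `GAMMA` and `EPSILON` are all assumptions chosen for illustration.

```python
import random

N_STATES = 5          # states: corridor positions 0..4, position 4 is the goal
ACTIONS = [-1, +1]    # actions: move left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # illustrative values, not tuned


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Value estimates q[state][action_index], initialised to zero.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy policy: mostly exploit, sometimes explore.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q


q = train()
# Greedy policy derived from the learned values (goal state excluded).
policy = [ACTIONS[0 if q_s[0] > q_s[1] else 1] for q_s in q[:-1]]
print(policy)
```

The table `q` plays the role of the value, the epsilon-greedy rule is the policy, and `step` stands in for the environment that hands back the next state and the reward.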
WHAT IS A MARKOV DECISION PROCESS? (in a gist)
MDPs are a mathematical framework for describing the environment in Reinforcement Learning, and almost all RL problems can be formalized as MDPs. An MDP consists of a finite set of environment States S, a set of possible Actions A(s) in each State, a Real-Valued Reward function R(s) and a Transition Model P(s’ | s, a), the probability of reaching state s’ after taking action a in state s.
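For readers who prefer a formula, the components above can be tied together by the Bellman optimality equation. Two things here are assumptions not stated in the text: a discount factor γ between 0 and 1, and writing the transition model as the conditional probability P(s' | s, a) of landing in s' after taking action a in state s.

```latex
V^{*}(s) = R(s) + \gamma \max_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, V^{*}(s'),
\qquad
\pi^{*}(s) = \arg\max_{a \in A(s)} \sum_{s'} P(s' \mid s, a)\, V^{*}(s')
```

The first equation says that the value of a state is its immediate reward plus the discounted value of the best action available; the second reads the optimal policy off those values.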
MDPs help us make decisions in a Stochastic environment. The goal is to find a policy: a map that gives the optimal action for each State of the Environment.
An MDP policy is somewhat more powerful than simple planning, because it lets you act optimally even if something goes wrong along the way. A simple plan is a fixed sequence of actions that is followed once the best strategy has been found, so it cannot adapt if the agent ends up in an unexpected state.
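As a rough sketch of how a policy can be computed for a small stochastic MDP, the example below uses value iteration, one standard method (the article does not name a specific one). The four-state corridor, the 0.8/0.2 slip probabilities, the rewards, and the discount factor `GAMMA` are all hypothetical.

```python
GAMMA = 0.9                      # discount factor (assumed)
STATES = [0, 1, 2, 3]
GOAL = 3                         # terminal state
ACTIONS = ["left", "right"]
REWARD = {0: 0.0, 1: 0.0, 2: 0.0, 3: 1.0}   # R(s)


def transitions(s, a):
    """Return {next_state: probability}, i.e. P(s' | s, a).

    The intended move succeeds with probability 0.8; otherwise the
    agent slips one cell the opposite way. Walls clamp the position.
    """
    move = 1 if a == "right" else -1
    intended = min(max(s + move, 0), GOAL)
    slipped = min(max(s - move, 0), GOAL)
    probs = {}
    probs[intended] = probs.get(intended, 0.0) + 0.8
    probs[slipped] = probs.get(slipped, 0.0) + 0.2
    return probs


def expected_value(s, a, v):
    return sum(p * v[s2] for s2, p in transitions(s, a).items())


def value_iteration(tol=1e-9):
    v = {s: 0.0 for s in STATES}
    while True:
        new_v = {}
        for s in STATES:
            if s == GOAL:
                new_v[s] = REWARD[s]   # terminal: no future reward
            else:
                best = max(expected_value(s, a, v) for a in ACTIONS)
                new_v[s] = REWARD[s] + GAMMA * best
        if max(abs(new_v[s] - v[s]) for s in STATES) < tol:
            return new_v
        v = new_v


def greedy_policy(v):
    """Pick, for every non-terminal state, the action with the best expected value."""
    return {s: max(ACTIONS, key=lambda a: expected_value(s, a, v))
            for s in STATES if s != GOAL}


values = value_iteration()
print(greedy_policy(values))
```

Because the result assigns an action to every state, the agent still knows what to do after an unlucky slip backwards, whereas a fixed plan such as "right, right, right" would simply carry on as if nothing had happened.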
COMPARISON WITH OTHER MACHINE LEARNING METHODOLOGIES:
Reinforcement Learning belongs to a bigger class of Machine Learning algorithms. Below is a description of the types of Machine Learning methodologies.
Types of Machine Learning:
- Supervised: Task Driven (Regression/Classification).
- Unsupervised: Data-Driven (Clustering).
- Reinforcement: Algorithms learn to react to an Environment.
Let’s see a comparison between RL and others:
- Supervised vs Reinforcement Learning: In supervised learning, there is an external “supervisor” which has knowledge of the environment and shares it with the agent to complete the task. But in some problems there are so many combinations of subtasks the agent can perform to achieve the objective that creating a “supervisor” is almost impractical. For example, in a game of chess the number of possible move sequences is astronomically large, so building a knowledge base of the right move for every position is infeasible. In such problems, it is more feasible to learn from one’s own experience and gain knowledge from it. This is the main difference between Reinforcement Learning and supervised learning. Both involve a mapping between input and output, but in Reinforcement Learning a reward function provides feedback to the agent, whereas in supervised learning the correct outputs are given directly.
- Unsupervised vs Reinforcement Learning: In Reinforcement Learning, there is a mapping from input to output, which is not present in unsupervised learning. In unsupervised learning, the main task is to find the underlying patterns rather than the mapping. For example, if the task is to suggest a news article to a user, an unsupervised learning algorithm will look at articles similar to those the person has previously read and suggest one of them. A Reinforcement Learning algorithm, by contrast, will get constant feedback from the user by suggesting a few news articles and then build a “knowledge graph” of which articles the person will like.
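The news example can be sketched as an epsilon-greedy multi-armed bandit, which is a simplified, single-state form of Reinforcement Learning. Note the assumptions: the categories and click probabilities are invented for illustration, and the sketch keeps a running click-rate estimate per category rather than a full “knowledge graph”.

```python
import random


def recommend(estimates, epsilon, rng):
    """Epsilon-greedy choice of an article category."""
    if rng.random() < epsilon:
        return rng.choice(list(estimates))        # explore
    return max(estimates, key=estimates.get)      # exploit the best so far


def simulate(click_prob, rounds=5000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = {c: 0.0 for c in click_prob}   # estimated click rate
    counts = {c: 0 for c in click_prob}        # times each was suggested
    for _ in range(rounds):
        category = recommend(estimates, epsilon, rng)
        # User feedback: 1 if the suggested article is clicked, else 0.
        reward = 1.0 if rng.random() < click_prob[category] else 0.0
        counts[category] += 1
        # Incremental update of the running average reward.
        estimates[category] += (reward - estimates[category]) / counts[category]
    return estimates, counts


# Hypothetical click probabilities, unknown to the learner.
true_click_prob = {"sports": 0.2, "politics": 0.1, "technology": 0.6}
estimates, counts = simulate(true_click_prob)
print(max(estimates, key=estimates.get))
print(counts)
```

The learner never sees `true_click_prob` directly. It only observes clicks on what it suggested, which is the reward signal that distinguishes this setup from clustering similar articles.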
There is also a fourth type of Machine Learning methodology called Semi-Supervised Learning, which is essentially a combination of supervised and unsupervised learning. Like supervised learning, it learns a direct input-to-output mapping from labelled examples, whereas Reinforcement Learning has no such labelled mapping and learns from reward feedback instead.
RL REAL-WORLD IMPLEMENTATIONS:
Even though we are still in the early stages of Reinforcement Learning, several applications and products are starting to rely on the technology. Companies are beginning to implement RL for problems where sequential decision-making is required and where RL can support human experts or automate the decision-making process. Here are a few:
- Robotics: Reinforcement Learning gives robotics a “framework and a set of tools” for hard-to-engineer behaviours. Since Reinforcement Learning can happen without supervision, it could considerably accelerate progress in robotics.
- Industrial Automation: Thanks to the Reinforcement Learning capabilities from DeepMind, Google was able to reduce energy consumption in its data centres dramatically (DeepMind reported cuts of up to 40% in the energy used for cooling). Bonsai, acquired by Microsoft in 2018, offers a Reinforcement Learning solution to automate and “build intelligence into complex and dynamic systems” in energy, HVAC, manufacturing, automotive and supply chains.
- Predictive Maintenance: Machine Learning has been used in manufacturing for some time, and Reinforcement Learning could make predictive maintenance even better than it is today.
- Game Playing: One of the applications that brought Reinforcement Learning to wide public attention was AlphaGo, a DeepMind program trained with Reinforcement Learning, which in 2016 won against Lee Sedol, one of the world’s best human players in the game Go. Reinforcement Learning is now used to compete in all kinds of games.
- Personalization: Whether it’s the media you consume, the advertising that’s targeted to you or the goods you should purchase next on Amazon, Reinforcement Learning algorithms are increasingly at play behind the scenes to create a stellar customer experience.
The Artificial Intelligence industry is growing rapidly, and today there is a strong Artificial Intelligence presence across the globe. It is inevitable that businesses will sooner or later have to embrace AI; it creates efficiency and can help save money that would otherwise have been lost. In the Health Care industry, Deep Learning in bioinformatics is catching on fast. We can’t go on being blind to the endless possibilities that can be achieved thanks to AI. The reality is that many companies are developing Artificial Intelligence, and it’s time we all jump on the bandwagon.