Deep Reinforcement Learning in Telecommunications

I am back today with a post from the O’Reilly Media Artificial Intelligence newsletter. It matters because it offers a fresh perspective on the real potential of Artificial Intelligence for business applications. We need to look at AI from realistic and pragmatic perspectives, and set aside for a moment the hype around theoretical developments that are surely valuable but sometimes a bit removed from the day-to-day realities of most people.

The telecommunications sector of most economies, more or less developed, is neither a fantasy nor a hyped subject. It is a reality, sometimes even more so in less developed economies, where even the poorest people find a way to own a mobile phone or even a smartphone. So this is an area that naturally attracts a fair deal of attention within the business community of any place or jurisdiction.

What is less often realised or communicated is that the infrastructure supporting telecommunications businesses is fraught with challenges and difficulties. The article I would like to share today introduces the promise of AI in addressing those challenges, and it is worth mentioning that it does so through recent developments within the increasingly popular deep reinforcement learning framework. The article makes for a good read: it is well written, and the material supporting its claims is trustworthy and well targeted. I will provide a link to the main research paper from the article below the paragraphs I chose to share with the readers of this Blog (almost all of the article, which attests to its quality).

The importance of this article is hard to overstate. We may be going through an intense phase of wider implementation of AI techniques and technology in businesses as diverse as Financial Services, Insurance, IT security and so on… But what we should also realise is that the business case is one of operational efficiency as well as of cost reduction in every possible way. We often hear the expression ‘Internet of Things’ (IoT), but we seldom hear what it is about and what its challenges are. Artificial Intelligence techniques might be the essential ingredient that allows IoT to become a reality and deliver on its promise of new business models. In the future we may even forget that it is AI technology and techniques that are running the show behind the scenes. They will be more benevolent ghosts than those of Shakespearean inspiration… Or so we wish.


Making telecommunications infrastructure smarter



A common refrain about artificial intelligence (AI) lauds the ability of machines to pick out hard-to-see patterns from data to make valuable, and more importantly, actionable observations. Strong interest has naturally followed in applying AI to all sorts of promising applications, from potentially life-saving ones like health care diagnostics to the more profit-driven variety like financial investment. Meanwhile, rather ethereal speculation about existential risk and more tangible concern about job dislocation have both run rampant as the topics du jour concerning AI. In the midst of all this, many tend to overlook a more concrete and pretty important, but admittedly less sexy subject—AI’s impact on core infrastructure.


Major shifts usually happen as a confluence of not one, but many trends. That is the case in telecommunications as well. Continuous proliferation and obsolescence of devices means more and more endpoints, often mobile, need to be managed as they come online and offline. Meanwhile, bandwidth usage is exploding as consumers and corporations demand more content and data, and the march toward cloud services continues. All of this means that network infrastructure must have more capacity and be more dynamic and fault tolerant than ever as providers feverishly optimize network configurations and perform load balancing.

One can imagine the complexity of this when delivering connectivity services to a large portion of an entire country’s online presence. To solve this problem, telecommunications companies are looking to borrow principles from people who have solved similar issues at smaller scales—running data centers with software-defined networking (SDN). By developing SDN for a wide area network (WAN), telecom providers can effectively make their networks reprogrammable by abstracting away hardware complexity into software.

Still, even as WANs become software defined, their enormity makes management and operation difficult. While hardware management becomes less complex, running software-defined WANs (SD-WANs) presents highly complex data problems: information flows may exhibit patterns, but they are hard to extract. This is a perfect application for AI algorithms, however. Reinforcement learning is a particularly promising approach that already has history with networking applications. More recently, researchers have combined deep learning with traditional reinforcement learning to develop deep reinforcement learning techniques that will likely find usage in networking.

Here is a watered-down version of how that might work. First, consider traditional reinforcement approaches where an agent learns how to operate within an environment based on the reward it receives for each action it performs while in certain states within that environment. The original implementations were tabular as rewards were assigned to each possible (state, action) pair to essentially form a lookup table. This table then served as the value function that an agent would refer to when following a policy for choosing an action.


Figure 1. Tabular reinforcement learning. Courtesy of Roger Chen.
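
To make the lookup-table idea concrete, here is a minimal sketch of tabular Q-learning in Python. It is not taken from the article: the five-node chain, the cost of -1 per hop, the hyperparameters and the names `step`, `train` and `greedy_policy` are all assumptions chosen purely for illustration.

```python
import random

N_STATES = 5          # hypothetical chain of 5 nodes; node 4 is the destination
ACTIONS = (-1, +1)    # move to the previous node or to the next node

def step(state, action):
    """Toy environment: every hop costs -1; reaching the last node ends the episode."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, -1.0, done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # The lookup table: one value per (state, action) pair.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy policy: mostly follow the table, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: nudge the table entry toward reward + discounted best next value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q

def greedy_policy(q):
    """Read the learned policy off the table for every non-terminal node."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
```

On this toy chain, `greedy_policy(train())` should pick +1 at every node, that is, always move toward the destination. Note that the table holds only 10 entries here, which is what makes the approach workable at this scale.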



Makes sense. At least, if it is practical to test and map out every possible (state, action) permutation in a system. However, consider how much telecommunications infrastructure exists, and consider how many devices are connected to it. Now consider how both infrastructure and devices comprise smaller parts that break down into even smaller parts and so forth, and each piece impacts the state of the environment. Now consider that all that infrastructure, those devices, and their components are constantly changing with each hardware and software update as well as each new device that comes online for the first time. Suddenly, that tabular approach doesn’t seem so friendly, does it?
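
A rough back-of-the-envelope calculation shows why. The sketch below assumes, purely for illustration, that each link in a network can be in one of three load levels and that the agent has four possible actions; the function name `table_entries` and all the numbers are hypothetical.

```python
def table_entries(n_links, states_per_link, n_actions):
    """Cells in a (state, action) lookup table when each link's state varies independently."""
    return (states_per_link ** n_links) * n_actions

# The table grows exponentially with the number of links being tracked.
for n_links in (10, 20, 50):
    print(n_links, table_entries(n_links, states_per_link=3, n_actions=4))
```

With only 10 links the table already needs 236,196 entries, and each additional link multiplies that by three.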

This is where deep neural networks come in. By using many hidden layers to extract features from information describing explored states, Deep Q Network algorithms can learn how to predict outcomes for (state, action) pairs never seen before, superseding tabular approaches for high-dimensional state spaces that are impractical to fully map out by brute force.


Figure 2. Deep reinforcement learning. Courtesy of Roger Chen.
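
The key idea, predicting values for (state, action) pairs never seen during training, can be sketched without a deep network. The example below deliberately swaps in a much simpler linear function approximator in place of the Deep Q Network, and it is not the article's method. The two-link queueing scenario, the reward, the features and every function name are assumptions. Each episode is a single decision, so the Q-learning target reduces to the immediate reward.

```python
import random

ACTIONS = (0, 1)  # hypothetical: 0 = forward on link A, 1 = forward on link B

def features(state, action):
    """Feature vector for a (state, action) pair; state = (queue_a, queue_b)."""
    qa, qb = state
    x = [0.0] * 6
    base = 3 * action  # each action gets its own block of three weights
    x[base:base + 3] = [1.0, qa / 10.0, qb / 10.0]
    return x

def q_value(w, state, action):
    return sum(wi * xi for wi, xi in zip(w, features(state, action)))

def reward(state, action):
    """Toy reward: the negative queue length (delay) on the chosen link."""
    return -float(state[action])

def train(steps=20000, lr=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    w = [0.0] * 6
    # Only states with even queue lengths are ever seen during training.
    train_states = [(a, b) for a in range(0, 10, 2) for b in range(0, 10, 2)]
    for _ in range(steps):
        s = rng.choice(train_states)
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: q_value(w, s, act))
        # One-step episodes: the next state is terminal, so the target is just the reward.
        error = reward(s, a) - q_value(w, s, a)
        x = features(s, a)
        # Semi-gradient update: move the weights, not a table cell.
        w = [wi + lr * error * xi for wi, xi in zip(w, x)]
    return w

def best_action(w, state):
    return max(ACTIONS, key=lambda act: q_value(w, state, act))
```

After training, `best_action(train(), (3, 7))` should choose link A even though the state (3, 7) never appeared in training, because the learned weights generalise across queue lengths. A deep network plays the same role when the relationship between state and value is far less simple than this one.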


Of course, real life is never that easy. WANs feature not only high-dimensional state spaces, but also potentially high-dimensional and continuous action spaces, for which Deep Q Network algorithms falter without additional components, such as also applying deep neural networks to an agent’s action policy. Furthermore, it is nontrivial to determine how an agent should model or represent the hugely complex environment that is telecommunications. To learn more about deep reinforcement learning, here is a great post to start with. These challenges of applying deep reinforcement learning to the telecom industry show only the tip of the iceberg. As always, figuring out how to keep humans in the loop and using active learning will be important to the successful integration of AI into telecommunications as well. Additionally, practical business and regulatory constraints mean deployment will likely first concentrate in WANs with more clearly defined boundaries, such as those used primarily by private enterprises, before extending to more public networks.

Underlying the telecom industry’s adoption of AI is the broader trend toward improving infrastructure resource management with software, data, and algorithms. Where possible and sensible, physical infrastructure will be modeled in software as a system of information flows and will become the playground for AI-driven optimization. This sounds bold, but it is not unprecedented, as technological automation has long been used to help manage infrastructure. The difference today is the kind of knowledge tasks that deep neural nets and other advances have allowed us to automate. One data point that shows this is already underway is DeepMind’s recent achievement in using deep reinforcement learning to reduce the energy used to cool a Google data center by a whopping 40%. Expect to see much more of this type of application in the next few years.


Playing Atari with Deep Reinforcement Learning




We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.

featured image: Packet Routing in Dynamically Changing Networks: a Reinforcement Learning Approach

