Delving into the Deep…Learning: the View from a Summit

October 22, 2015


There’s something big happening in the world of technology. Over the past couple of years there’s been a resurgence of interest in neural networks and excitement over the challenging problems that they can now solve. Folks are talking about Deep Learning… I’ve been keeping an eye on what’s happening in the field to see if there’s anything that we can use to build better tools for researchers here at Mendeley. Recently I went along to Re-Work’s Deep Learning Summit and thought I’d share my experiences of it here.

The Deep Learning Summit

The 300 attendees were mainly a mixture of technology enthusiasts, machine learning practitioners and data scientists, who all share an interest in what can be done uniquely with deep learning. Over two days we heard from around 50 speakers explaining how they are using deep learning. There were lightning talks, normal ones with Q&A and some fireside chats. There were lots of questions, chief amongst them being ‘What is deep learning?’, and some notable highlights.

Some Highlights

Paul Murphy, CEO of Clarify, gave a brief history of deep learning. While neural networks were popular in the 1980s, they went out of fashion for a couple of decades as other techniques were shown to outperform them. They were great at solving toy problems but, beyond a few applications such as automatic handwriting recognition, they had difficulty gaining traction. Yann LeCun, Yoshua Bengio, Geoffrey Hinton and many others persevered with neural network research and, with the increase in computational processing power, more data and a number of algorithmic improvements, have shown that neural networks can now outperform the previous state of the art in several fields including speech recognition, visual object recognition and object detection. Deep Learning was born.

The players on the stage of #DeepLearning - @AndrewYNg @geoff_hinton @ylecun

There was an interesting fireside chat with Ben Medlock (@Ben_Medlock) from SwiftKey, the productivity tool that predicts the next word you’re about to type to save you time.  I love this app.  Ben spoke about the challenges involved in natural language processing and how many of the current syntactic approaches don’t exploit the semantics of the sentences.  This is where deep learning comes in.  Using tools like word2vec, you can compare words in a semantic space and use that to improve text prediction.  He spoke about how they have done some work with recurrent neural networks, possibly the deepest of deep learning, to improve their tools.
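To make the idea of comparing words in a semantic space a bit more concrete, here is a minimal sketch using cosine similarity. The three-dimensional vectors and the `most_similar` helper are made up by me for illustration; real word2vec vectors are learned from large amounts of text and typically have hundreds of dimensions, and this is not a description of how SwiftKey does it.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration only.
# Real word2vec vectors are learned from a corpus, not written by hand.
TOY_VECTORS = {
    "paper":   [0.9, 0.1, 0.0],
    "article": [0.8, 0.2, 0.1],
    "banana":  [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, vectors):
    """Rank every other word by similarity to `word`, closest first."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    for other, score in most_similar("paper", TOY_VECTORS):
        print(f"{other}: {score:.3f}")
```

With these toy numbers, "article" ranks closer to "paper" than "banana" does, which is the kind of signal a text predictor could use alongside purely syntactic cues.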

A lot of the work presented was in the area of vision. This is a field in which deep learning has made consistent advances. Matthew Zeiler presented some impressive demos from Clarifai. They take videos and automatically tag them with concept tags in real time, from a selection of over 10,000 tags. They report that deep learning has significantly improved the quality of results here. It’s available as a service through their API and they already have a number of high-profile customers such as Vimeo, Vodafone and Trivago.

Some early work on Neural Turing Machines also piqued my interest. Alex Graves, from Google DeepMind, told us that modern machine learning is good at finding statistics. For example, you can train a network model to give you the probability of an image given a label, or the probability of some text given some audio, but the techniques behind them don’t tend to generalise very far. To improve them, the usual solution is to make better representations of the data, such as making the representations more sparse or disentangling them into different factors. Neural Turing Machines offer another solution: instead of learning statistical patterns, they learn programs. These programs are made up of the usual things that programming languages provide, like variables, routines, subroutines and indirection. The main advantage is that these machines are good at solving problems where the solutions fit well into algorithmic structures. I assume that this also makes the solutions much more readable, adding some transparency to the typical black box solutions.

Finally, a fun one but certainly with serious science behind it. Koray Kavukcuoglu, also from Google DeepMind, spoke about their agent-based systems that use deep learning to learn how to play Atari games. For him, deep learning takes us from solving narrow AI problems to general AI. He showed that through using reinforcement learning, where agents learn from observations in their (simulated) environments and are not given explicit goals, they trained Deep Q networks (convolutional neural networks) to play a number of Atari games. In several games the agents play at a human-like level and in some cases even go beyond it. They built this using Gorila, Google’s reinforcement learning architecture, designed to help scale up deep learning to be applied to real-world problems.
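For anyone curious about what the Q in Deep Q network refers to, here is a minimal sketch of the classic tabular Q-learning update. This is my own simplification, not DeepMind's code: a Deep Q network replaces the lookup table below with a convolutional network that estimates the values from screen pixels, and the state names, actions and the `alpha` and `gamma` values are assumptions chosen purely for illustration.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """Apply one Q-learning step and return the new estimate.

    q          -- table mapping (state, action) to an estimated long-term reward
    alpha      -- learning rate (illustrative value)
    gamma      -- discount factor for future reward (illustrative value)
    """
    # Value of the best action available from the state we landed in.
    best_next = max(q[(next_state, a)] for a in actions)
    # Target: the reward just observed plus discounted future value.
    target = reward + gamma * best_next
    # Move the current estimate a small step towards the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

if __name__ == "__main__":
    q_table = defaultdict(float)       # unseen pairs start at 0.0
    actions = ["left", "right"]        # hypothetical action set
    new_value = q_learning_update(q_table, "s0", "right", 1.0, "s1", actions)
    print(new_value)
```

The agent is never told how to win; it only sees the reward that follows each action, and repeated updates like this one gradually push the estimates towards actions that pay off in the long run.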


Deep Learning is not just hype, which was one of my worries before going to the summit.  It clearly can solve lots of real-world problems with a level of accuracy that we haven’t seen before.  I’ve kicked off a few hacks and spikes to explore what we can build for Mendeley’s users using some of these techniques.  If we get good results then expect Deep Learning to be powering some of our features soon!

