The winner of the TechCrunch Disrupt 2015 Startup Battlefield, which took place in London in December, was a small startup from England called Jukedeck. They presented a solution that can help you create music for videos. The creating part is up to an algorithm that automatically generates the music according to your preferences. You can specify the type and mood of the music as well as its length. The real news here is that people now have access to machine learning algorithms that can generate music. Forget the music videos part: algorithms are now composing music. This is just one of many examples of the new generation of machine learning algorithms we have seen emerge in the last few years. We are in the early stages of a new era of machine learning.
Creating an algorithm to generate music is not new. Prof. David Cope at the University of California at Santa Cruz experimented with computer music at the end of the 20th century. He specialised in classical music, in particular Johann Sebastian Bach. By 1997 he had refined his algorithm to the point where it could generate compositions good enough to fool the general listener. Maybe not masterpieces, but sufficiently good.
In 2012, Google announced their research into finding cats in videos. It may seem like a stupid problem to solve, but it was actually a breakthrough in computer science. Ever since the introduction of the first computers (ironically called “electronic brains”), people have been fascinated by their capacity to act like a brain. However, computers are far from anything like the brain. If you use a computer to multiply 1,000 three-digit numbers, even an old 1960s computer would be far better than a human. It turns out that what is hard for us humans is easy for computers, and what we find easy is a hard problem for computers. If you ask a two-year-old kid to point to a cat in a picture, it is an easy task. But give that problem to a computer, and it is remarkably difficult to solve.
The fact that problems like image recognition and creative tasks such as composition are hard has not stopped people from working on machine learning, or Artificial Intelligence (AI) in general. In the 1980s there was a promising new wave of AI technology called Neural Networks. These are networks that in some ways try to act like the brain, with nodes and connections between them. The idea was to train the network to get better at specific tasks. Although promising at the time, these networks did not deliver much and became a disappointment. Yet another AI winter followed, which has not been uncommon in the quest for intelligent machines over the years.
Neural Networks may not have worked in the 1980s, but since that time we have seen exponential growth in compute power, storage and bandwidth. Now we have cloud computing and big data. Furthermore, we have video games, and video games need Graphics Processing Units, or GPUs. The same GPUs mean we can build really powerful supercomputers relatively cheaply. This is the adjacent possible for a new era of AI. It turns out that the basic idea of neural networks was not wrong, but the capacity to make it work was not available in the 1980s. It is a really good example of the adjacent possible.
Prof. Cope’s algorithm was a programmatic way to generate a specific type of musical composition. The new type of algorithm we see today, like the one behind the Jukedeck service, works in a totally different way. Machine learning algorithms are trained, not programmed. Part of this is deep learning: algorithms that are fed huge amounts of data and use layers of nodes to try different combinations, strengthening those that work and repeating the process. One class is Recurrent Neural Networks, which seem to be able to solve particular types of problems like speech recognition, understanding handwriting and, surprisingly, composing music.
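To make "trained, not programmed" a little more concrete, here is a minimal sketch in Python. It is not how Jukedeck or any deep learning system works; it is a single artificial node (a perceptron) learning a toy task, logical OR, purely from examples. The task, the names and the parameter values are all assumptions chosen for illustration.

```python
# Toy training data (hypothetical): two inputs and the desired output for logical OR.
EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs is above zero, else stay off (0)."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(examples, epochs=20, rate=0.1):
    """Nudge the connection weights after every wrong answer (perceptron rule)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            # Adjust each connection in the direction that reduces the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


weights, bias = train(EXAMPLES)
predictions = [predict(weights, bias, inputs) for inputs, _ in EXAMPLES]
print(predictions)
```

Nowhere in the code is the rule for OR written down. The node only sees examples and adjusts its connections until its answers match. Deep learning stacks many such nodes in layers and uses far more data, but the basic loop of trying, comparing and adjusting is similar in spirit.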
Even with all the knowledge on machine learning available, creating machine learning software is really hard and requires huge infrastructure. The technology is still very much academic but is starting to produce practical solutions that will open up new possibilities. Big technology vendors are democratising machine learning and offering relatively easy and affordable access to machine learning software. Google has their Prediction API and Amazon has their Machine Learning Services. Access to machine learning systems is now as simple as signing up for a subscription on the web.
So what does this mean? It means that the apps we use will get smarter and work better for us. They will be able to predict our preferences and help with many problems that previously only humans could solve. Services like speech understanding, pattern recognition, personal recommendations, document and image categorisation, fraud detection, and all sorts of creative tasks will be done by software. It will mean a shift in jobs, as software can increasingly take over tasks that only humans were capable of before. At first we will find this scary, but then, as usual, we will get used to it and expect some smartness from all things, including everyday objects like cars, TVs and coffee machines. We will expect to be able to talk to these things and have them talk back.
We are still in the early days of this machine learning renaissance, and we have a lot to learn about what it means for business and for people’s jobs, in particular white-collar jobs. With access to enormous cloud computing services, more and more solutions will appear that try to predict and analyse our behaviour. More and more tasks will become software tasks, and this will change the job market. Companies that want to stay relevant, even non-IT companies, need to think about how software can help them.