AutoML: The Automation of Automation

Machine to Machine: AI makes AI

The next Big Thing in AI is likely to be the use of machine learning to automate machine learning. The idea that routine tasks involved in developing a machine learning solution could be automated makes perfect sense: machine learning development is a repeatable activity with routine processes. Although total automation is improbable at the moment, even partial automation yields significant benefits. As available machine learning experts grow scarcer, the ability to automate some of their time-consuming tasks means that they can spend more time on high-value functions and less on the nitty-gritty of model building and iteration. This will, in theory, free data scientists to work on the vast number of projects envisioned in a ubiquitous AI environment, and make it possible for less proficient practitioners to use machine learning routines without extensive training.

Although automated machine learning (AutoML) is appealing, and startups have emerged with varying degrees of promise, it is clear that this capability is not yet fully developed. There are, however, a number of systems that are suitable now for use in selected environments and for solving specific problems. Among these programs are Google AutoML, DataRobot, Auto-WEKA, TPOT, and auto-sklearn. Many are open source or freely available. Major technology firms, including Google, Microsoft, Salesforce, and Facebook, are also rapidly developing AutoML routines, and this area is being approached with increasing urgency.

Current AutoML programs mainly take care of the highly repetitive tasks that machine learning requires to create and tune models. The chief automation targets today are selection of appropriate machine learning algorithms, tuning of hyperparameters, feature extraction, and iterative modeling. Hyperparameter tuning is particularly significant because it is critical to deep neural networks. Some AutoML routines have already demonstrated advances on manual human performance in some of these areas. Other processes that could support AutoML, such as data cleaning, are aided by a separate body of machine learning tools that could be added to AutoML. In fact, AutoML itself exists in a larger context of applying ML to data science and software development, an effort that shows promise but remains at an early stage.
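To make the first two targets concrete, here is a minimal sketch of combined algorithm selection and hyperparameter tuning by random search, scored with cross-validation. It is an illustration only, not how any of the products named above work internally. The toy data set, the two candidate "algorithms" (a nearest-neighbor vote and a fixed threshold rule), and all function names are assumptions made up for this example, and it uses only the Python standard library.

```python
import random
import statistics

def make_data(n, rng):
    """Toy 1-D binary task (hypothetical): label is 1 when x > 0.5, with 10% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:
            y = 1 - y  # flip the label to simulate noise
        data.append((x, y))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return int(2 * votes > len(nearest))

def threshold_predict(train, x, threshold):
    """Fixed decision rule; ignores the training data entirely."""
    return int(x > threshold)

def cross_val_accuracy(predict, params, data, folds=5):
    """Mean accuracy over interleaved folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [p for j, p in enumerate(data) if j % folds != i]
        correct = sum(predict(train, x, **params) == y for x, y in test)
        scores.append(correct / len(test))
    return statistics.mean(scores)

# Each candidate algorithm is paired with a sampler for its hyperparameters.
SEARCH_SPACE = {
    "knn": (knn_predict, lambda rng: {"k": rng.choice(range(1, 16, 2))}),
    "threshold": (threshold_predict, lambda rng: {"threshold": rng.random()}),
}

def random_search(data, trials_per_algorithm=10, seed=0):
    """Try random hyperparameters for every algorithm and keep the best scorer."""
    rng = random.Random(seed)
    history = []
    for name, (predict, sample_params) in SEARCH_SPACE.items():
        for _ in range(trials_per_algorithm):
            params = sample_params(rng)
            score = cross_val_accuracy(predict, params, data)
            history.append((score, name, params))
    best = max(history, key=lambda t: t[0])
    return best, history

data = make_data(120, random.Random(42))
best, history = random_search(data)
best_score, best_name, best_params = best
print(f"best: {best_name} {best_params} accuracy={best_score:.2f}")
```

Real AutoML systems typically replace the random sampling with smarter strategies and search far larger spaces, but the loop of proposing a configuration, scoring it, and keeping the best is the repetitive work being automated.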

Even with the recent focus on AutoML, the capability of these programs has yet to reach the stage where they could be relied upon to achieve a desired result without human intervention. Data scientists will not lose their jobs in the near future; as others have pointed out, humans are still required to set the objectives and verify the results of any machine learning operation. Transparency is also of the utmost importance in determining whether the model is accurately selecting useful results or has settled upon an illusion of accuracy.

Currently, many AutoML programs have operated successfully in test cases, with problems emerging as the size of the data set rises or the required operation becomes more complicated. An AutoML solution must not only embrace a wide range of ML models and techniques; it must at the same time handle the large amount of data that will be used for testing through many iterations.

AutoML and, indeed, other forms of automated data science are likely to continue to advance. It makes sense that machines should add a layer of thinking about thinking on top of the specific task layer. A machine-driven approach to the automation of automation makes sense, not only in reducing the human component, but also in ensuring that there is capability to meet the demands of an ever-expanding usage of AI in business. Arguably, development of a more autonomous AutoML would be an important step toward Artificial General Intelligence.

Improvement in this area is likely to be swift, due to the urgency of the data scientist shortage at a time when all companies are advised to invest in AI. There is an ambitious DARPA program, Data-Driven Discovery of Models (D3M), aimed at developing techniques that automate machine learning model building from data ingestion to model evaluation. It began in June 2016 and is furthering interest in AutoML approaches. Among AutoML startups, one standout is DataRobot, which has raised $54 million recently, bringing its total funding to $111 million. Meanwhile, there is a growing body of research in academia, as well as within corporate research teams, focusing on how to crack a problem whose solution could yield something like a user-friendly machine learning platform.

