What is HoNCAML¶
HoNCAML (Holistic No Code Automated Machine Learning) is a tool aimed to run automated machine learning pipelines, and specifically focused on finding the best model and hyperparameters for the problem at hand.
Following the no code paradigm, no Python knowledge is needed. There are two ways to define pipelines:
Through the Graphical User Interface
Through YAML configuration files
HoNCAML (Holistic No Code Automated Machine Learning) is a tool aimed to run automated machine learning pipelines, and specifically focused on finding the best model and hyperparameters for the problem at hand.
Pipelines¶
There are three types of provided pipelines.
Train¶
Train a specific model with the hyperparameters specified.
Input: A dataset for the training.
Output: The model object stored to disk.
Predict¶
Use a model to generate predictions for a specific dataset.
Input: A dataset for the test, together with a model object.
Output: A tabular file with the predictions.
Benchmark¶
Search for the best model and hyperparameters for the dataset at hand.
Input: A dataset for the benchmark.
Output: Main output is a configuration file with the best model and hyperparameters, and a tabular file with the results for all configurations tested.
Focus¶
HoNCAML has been designed having the following aspects in mind:
Ease of use
Modularity
Extensibility
Users¶
HoNCAML does not assume any kind of technical knowledge, but at the same time it is designed to be extended by expert people. Therefore, its user base may range from:
Basic users: In terms of programming experience and/or machine learning knowledge. It would be possible for them to get results in an easy way.
Advanced users: It is possible to customize experiments in order to adapt to a specific use case that may be needed by an expert person.
Support¶
Regarding each of the following concepts, HoNCAML supports specific sets of them; nevertheless, due to its nature, extend the library further should be not only feasible, but intuitive.
Data structure¶
For now only data with tabular format is supported. However, HoNCAML provides special preprocessing methods if needed:
Normalization
One hot encoding of categorical features
Problem type¶
At this moment, the following types of problems are supported:
Regression
Classification
Model type¶
Regarding available models, the following are supported:
Sklearn models (ML)
Pytorch models (DL)