ML.NET 2.0 enhances text classification
Microsoft has launched ML.NET 2.0, a new version of its open source, cross-platform machine learning framework for .NET. The upgrade features capabilities for text classification and automated machine learning.
Unveiled November 10, ML.NET 2.0 arrived in tandem with a new version of the ML.NET Model Builder, a visual developer tool for building machine learning models for .NET applications. The Model Builder introduces a text classification scenario that is powered by the ML.NET Text Classification API.
Previewed in June, the Text Classification API enables developers to train custom models to classify raw text data. The Text Classification API uses a pre-trained TorchSharp NAS-BERT model from Microsoft Research and the developer’s own data to fine-tune the model. The Model Builder scenario supports local training on either CPUs or CUDA-compatible GPUs.
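As a rough illustration of that workflow, the following C# sketch fine-tunes a text classifier on a few in-memory samples. It assumes the Microsoft.ML and Microsoft.ML.TorchSharp NuGet packages plus a TorchSharp runtime package such as TorchSharp-cpu are installed. The Review and ReviewPrediction classes, the column names, and the sample sentences are hypothetical and chosen only for this example. A real model would need far more training data.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.TorchSharp;

// Hypothetical input schema for this sketch: raw text plus a string label.
public class Review
{
    public string Text { get; set; } = "";
    public string Sentiment { get; set; } = "";
}

// Hypothetical output schema: the predicted label mapped back to a string.
public class ReviewPrediction
{
    public string PredictedLabel { get; set; } = "";
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext();

        // Tiny illustrative dataset; far too small for a useful model.
        var samples = new[]
        {
            new Review { Text = "This library is a joy to use", Sentiment = "Positive" },
            new Review { Text = "Setup was quick and painless", Sentiment = "Positive" },
            new Review { Text = "The build keeps failing", Sentiment = "Negative" },
            new Review { Text = "Documentation is confusing", Sentiment = "Negative" },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(samples);

        // Map string labels to keys, fine-tune the pre-trained model on the
        // text column, then map the predicted key back to its string value.
        var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label", "Sentiment")
            .Append(mlContext.MulticlassClassification.Trainers.TextClassification(
                labelColumnName: "Label",
                sentence1ColumnName: "Text"))
            .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

        // Training downloads the pre-trained weights on first use.
        var model = pipeline.Fit(data);

        // Classify a single new sentence.
        var engine = mlContext.Model.CreatePredictionEngine<Review, ReviewPrediction>(model);
        var prediction = engine.Predict(new Review { Text = "Works exactly as advertised" });
        Console.WriteLine(prediction.PredictedLabel);
    }
}
```

The trainer is appended to the pipeline like any other ML.NET estimator, so the same pattern should carry over to data loaded from files instead of an in-memory array.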
Also in ML.NET 2.0:
- Preconfigured automated machine learning pipelines for binary classification, multiclass classification, and regression make it easier to get started with machine learning.
- Data preprocessing can be automated using the AutoML Featurizer.
- Developers can choose which trainers are used as part of a training process. They also can choose tuning algorithms used to find optimal hyperparameters.
- Advanced AutoML training options let developers choose trainers and the evaluation metric to optimize.
- A sentence similarity API, using the same underlying TorchSharp NAS-BERT model, calculates a numerical value representing the similarity of two phrases.
Future plans for ML.NET include expanding deep learning coverage and placing greater emphasis on the LightGBM framework for classical machine learning tasks such as regression and classification. The developers behind ML.NET also intend to improve the AutoML API to enable new scenarios and customizations and to simplify machine learning workflows.
Copyright © 2022 IDG Communications, Inc.