Baking Clouds Ltd

Baking Clouds provide tailored IT consultancy services to small and medium-sized companies; we cover all aspects of IT without any hidden costs.

Open-source machine learning frameworks

What is Machine Learning?

Machine learning describes machines that are taught to learn and make decisions by examining large amounts of input data. It makes calculated suggestions and/or predictions based on analyzing this information and performs tasks that are considered to require human intelligence. This includes activities like speech recognition, translation, visual perception, and more.

Machine learning and AI are among the key technology trends of 2019. This article lists some of the open-source machine learning options available that can help you transform your business.

1. TensorFlow


TensorFlow™ is an open source software library for high performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. Originally developed by researchers and engineers from the Google Brain team within Google’s AI organization, it comes with strong support for machine learning and deep learning, and its flexible numerical computation core is used across many other scientific domains.

2. Keras


Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

Use Keras if you need a deep learning library that:

Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).
Supports both convolutional networks and recurrent networks, as well as combinations of the two.
Runs seamlessly on CPU and GPU.

  • User friendliness. Keras is an API designed for human beings, not machines. It puts user experience front and center. Keras follows best practices for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number of user actions required for common use cases, and it provides clear and actionable feedback upon user error.
  • Modularity. A model is understood as a sequence or a graph of standalone, fully-configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, regularization schemes are all standalone modules that you can combine to create new models.
  • Easy extensibility. New modules are simple to add (as new classes and functions), and existing modules provide ample examples. Being able to create new modules easily allows for total expressiveness, making Keras suitable for advanced research.
  • Work with Python. There are no separate model configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and easy to extend.
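The modularity principle above can be illustrated without Keras itself. The following is a minimal plain-Python sketch, not the Keras API: the `Dense`, `ReLU` and `Sequential` classes here are hypothetical stand-ins written for this example, showing how layers and activation functions can be standalone objects that are plugged together into a model.

```python
import random


class Dense:
    """A fully connected layer: y = W x + b, with small random initial weights."""

    def __init__(self, n_in, n_out, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                        for _ in range(n_out)]
        self.bias = [0.0] * n_out

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weights, self.bias)]


class ReLU:
    """An activation function packaged as its own module."""

    def __call__(self, x):
        return [max(0.0, xi) for xi in x]


class Sequential:
    """A model is a list of modules applied one after another."""

    def __init__(self, modules):
        self.modules = list(modules)

    def add(self, module):
        self.modules.append(module)

    def __call__(self, x):
        for module in self.modules:
            x = module(x)
        return x


# Build a model from standalone parts, then extend it with one more module.
model = Sequential([Dense(3, 4, seed=1), ReLU()])
model.add(Dense(4, 2, seed=2))

output = model([1.0, 2.0, 3.0])
print(len(output))  # one value per unit in the last layer
```

Because every module exposes the same call interface, swapping `ReLU` for another activation or inserting an extra layer does not require changes anywhere else. In Keras the same idea extends to optimizers, cost functions and regularizers.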

3. Shogun Toolbox


Shogun is an open-source machine learning library that offers a wide range of efficient and unified machine learning methods.

Shogun is accessible

  • Supports many languages (Python, Octave, R, Java/Scala, Lua, C#, Ruby, etc) and platforms (Linux/Unix, MacOS and Windows) and integrates with their scientific computing environments.

Shogun is state-of-the-art

  • Efficient implementation (from standard to cutting edge algorithms), modern software architecture in C++.
  • Easy combination of multiple data representations, algorithm classes and general purpose tools for rapid prototyping of data pipelines.

Shogun is open source

  • Free software, community-based development and machine learning education.
  • GPLv3 license and working towards BSD compatibility.

4. Accord.NET Framework


Accord.NET is an open-source machine learning framework for .NET, written in C#, that combines machine learning with audio and image processing libraries. It can be used from any .NET language, including F#, an open source, functional-first, general purpose programming language that is particularly suitable for developing the mathematical models at the heart of machine learning algorithms.

Code written in F# is generally very expressive and stays close to the description of the algorithm it implements.

The framework is divided into libraries, which are available through an installer, compressed archives and NuGet packages. They include Accord.Math, Accord.Statistics, Accord.MachineLearning, Accord.Neuro, Accord.Imaging, Accord.Audio, Accord.Vision, Accord.Controls, Accord.Controls.Imaging, Accord.Controls.Audio, Accord.Controls.Vision, etc.
Its features include:

  • A matrix library that works over standard .NET structures, which increases code reusability and lets existing algorithms be changed gradually.
  • More than 40 statistical distributions, as well as hidden Markov models and mixture models.
  • More than 30 hypothesis tests, such as ANOVA, two-sample and multiple-sample tests.
  • More than 38 kernel functions, for use in kernel methods such as kernel support vector machines (SVM), kernel principal component analysis (KPCA) and kernel discriminant analysis (KDA).

5. Torch


Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.

A summary of core features:

  • a powerful N-dimensional array
  • lots of routines for indexing, slicing, transposing, …
  • amazing interface to C, via LuaJIT
  • linear algebra routines
  • neural network, and energy-based models
  • numeric optimization routines
  • Fast and efficient GPU support
  • Embeddable, with ports to iOS and Android backends

6. Theano


Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Theano features:

  • tight integration with NumPy – Use numpy.ndarray in Theano-compiled functions.
  • transparent use of a GPU – Perform data-intensive computations much faster than on a CPU.
  • efficient symbolic differentiation – Theano does your derivatives for functions with one or many inputs.
  • speed and stability optimizations – Get the right answer for log(1+x) even when x is really tiny.
  • dynamic C code generation – Evaluate expressions faster.
  • extensive unit-testing and self-verification – Detect and diagnose many types of errors.
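The log(1+x) point in the list above can be reproduced with the Python standard library alone. This sketch does not use Theano; it relies on `math.log1p`, which computes the same quantity in a numerically stable way, to show why such an optimization matters.

```python
import math

x = 1e-20  # far smaller than double-precision machine epsilon (about 2.2e-16)

# Naive form: 1 + x rounds to exactly 1.0, so the information in x is lost.
naive = math.log(1 + x)

# Stable form: computes log(1 + x) without ever forming 1 + x.
stable = math.log1p(x)

print(naive)   # 0.0
print(stable)  # 1e-20
```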

The actual syntax of Theano expressions is symbolic, which can be off-putting to beginners used to normal software development. Specifically, expressions are first defined in the abstract sense, then compiled, and only later used to make calculations.
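To make the define, compile, evaluate workflow concrete, here is a minimal plain-Python sketch of the idea. It is not Theano code: the `Var`, `Const`, `Add` and `Mul` classes and the `compile_function` helper are hypothetical names invented for this example, and the "compilation" step simply wraps the expression graph in a callable rather than generating C code.

```python
class Var:
    """A named symbolic placeholder; it holds no value until evaluation."""
    def __init__(self, name):
        self.name = name
    def evaluate(self, env):
        return env[self.name]
    def grad(self, wrt):
        return Const(1.0 if self is wrt else 0.0)


class Const:
    def __init__(self, value):
        self.value = value
    def evaluate(self, env):
        return self.value
    def grad(self, wrt):
        return Const(0.0)


class Add:
    def __init__(self, a, b):
        self.a, self.b = a, b
    def evaluate(self, env):
        return self.a.evaluate(env) + self.b.evaluate(env)
    def grad(self, wrt):
        # Sum rule: (a + b)' = a' + b'
        return Add(self.a.grad(wrt), self.b.grad(wrt))


class Mul:
    def __init__(self, a, b):
        self.a, self.b = a, b
    def evaluate(self, env):
        return self.a.evaluate(env) * self.b.evaluate(env)
    def grad(self, wrt):
        # Product rule: (a * b)' = a' * b + a * b'
        return Add(Mul(self.a.grad(wrt), self.b), Mul(self.a, self.b.grad(wrt)))


def compile_function(inputs, expr):
    """Turn an expression graph into a plain Python callable."""
    names = [v.name for v in inputs]
    return lambda *values: expr.evaluate(dict(zip(names, values)))


# 1. Define: nothing is computed here, only an expression graph is built.
x = Var("x")
y = Add(Mul(x, x), Mul(Const(3.0), x))  # y = x^2 + 3x

# 2. Compile: wrap the expression and its symbolic derivative as callables.
f = compile_function([x], y)
df = compile_function([x], y.grad(x))

# 3. Evaluate: only now do actual numbers flow through the graph.
print(f(2.0))   # 10.0
print(df(2.0))  # 7.0
```

Keeping the expression as a graph is what makes symbolic differentiation and optimization possible: the library can inspect and rewrite the whole computation before any number is processed.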

It was specifically designed to handle the types of computation required for large neural network algorithms used in Deep Learning. It was one of the first libraries of its kind (development started in 2007) and is considered an industry standard for Deep Learning research and development.

7. Scikit-learn


Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

Scikit-Learn is characterized by a clean, uniform, and streamlined API, as well as by very useful and complete online documentation. A benefit of this uniformity is that once you understand the basic use and syntax of Scikit-Learn for one type of model, switching to a new model or algorithm is very straightforward.
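As a small illustration of that uniformity, the sketch below trains three different classifiers with exactly the same `fit` and `score` calls. The choice of the bundled iris dataset, the 70/30 split and the hyperparameters are assumptions made for this example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a small built-in dataset and hold out 30% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Three unrelated algorithms, all exposing the same estimator interface.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "support vector machine": SVC(),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                 # same training call for every model
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.2f}")
```

Switching algorithms only means changing the line that constructs the estimator; the training and evaluation code stays the same.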

8. Caffe


Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.

Expressive architecture encourages application and innovation. Models and optimization are defined by configuration without hard-coding. Switch between CPU and GPU by setting a single flag to train on a GPU machine then deploy to commodity clusters or mobile devices.

Extensible code fosters active development. In Caffe’s first year, it was forked by over 1,000 developers and had many significant changes contributed back. Thanks to these contributors, the framework tracks the state of the art in both code and models.

Speed makes Caffe well suited to research experiments and industry deployment. Caffe can process over 60M images per day with a single NVIDIA K40 GPU. That’s roughly 1 ms/image for inference and 4 ms/image for learning, and more recent library versions and hardware are faster still. Its developers believe that Caffe is among the fastest convnet implementations available.
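The per-image and per-day figures can be related with a little arithmetic. The sketch below assumes a single device running continuously at a fixed per-image latency, which is a simplification; the helper name `images_per_day` is made up for this example.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 milliseconds in a day


def images_per_day(ms_per_image):
    """Throughput implied by a fixed per-image latency on one device."""
    return MS_PER_DAY / ms_per_image


inference = images_per_day(1)  # at 1 ms/image
training = images_per_day(4)   # at 4 ms/image

print(f"{inference / 1e6:.1f}M, {training / 1e6:.1f}M")  # 86.4M, 21.6M
```

At 1 ms/image a single GPU could in principle handle about 86M images per day, which is consistent with the quoted "over 60M" once the rounded latency and real-world overhead are taken into account.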

Community: Caffe already powers academic research projects, startup prototypes, and even large-scale industrial applications in vision, speech, and multimedia. You can join the community of brewers on the caffe-users group and GitHub.

9. Microsoft Cognitive Toolkit


The Microsoft Cognitive Toolkit supports loading and saving models in the Open Neural Network Exchange (ONNX) format, which enables you as a developer to run your trained network from Java and C#, often with better overall performance than you can get with Python.

Speed & Scalability

The Microsoft Cognitive Toolkit trains and evaluates deep learning algorithms faster than other available toolkits, scaling efficiently in a range of environments—from a CPU, to GPUs, to multiple machines—while maintaining accuracy.

Commercial-Grade Quality

The Microsoft Cognitive Toolkit is built with sophisticated algorithms and production readers to work reliably with massive datasets. Skype, Cortana, Bing, Xbox, and industry-leading data scientists already use the Microsoft Cognitive Toolkit to develop commercial-grade AI.


The Microsoft Cognitive Toolkit offers the most expressive, easy-to-use architecture available. Working with the languages and networks you know, like C++ and Python, it empowers you to customize any of the built-in training algorithms, or use your own.

10. Amazon Machine Learning


Amazon Machine Learning (AML) is based on the same simple, scalable, dynamic and flexible ML technology that Amazon’s internal data scientists use to build Amazon’s cloud services. AML connects to data stored in Amazon S3, Redshift or RDS, and can run binary classification, multi-class classification or regression on this data to create models.
The key concepts used in Amazon ML are listed below.

  • Datasources: Contain metadata associated with data inputs to Amazon ML.
  • ML models: Generate predictions using the patterns extracted from the input data.
  • Evaluations: Measure the quality of ML models.
  • Batch predictions: Asynchronously generate predictions for multiple input data observations.
  • Real-time predictions: Synchronously generate predictions for individual data observations.

Its key features are:

  • Supports multiple data sources within its system.
  • Allows users to create a data source object from data residing in Amazon Redshift – the data warehouse Platform as a Service.
  • Allows users to create a data source object from data stored in a MySQL database in Amazon RDS.
  • Supports three types of models: binary classification, multi-class classification and regression.


As machine learning expertise and tools evolve, companies can use more cutting-edge applications that help them become more efficient, intelligent, agile and customer-focused. Machine learning helps them transform some of their core business processes and offers huge growth potential for all types of companies.

Machine learning frameworks come with pre-built components that are easy to understand and code. A good ML framework thus reduces the complexity of defining ML models. With these open-source ML frameworks, you can build your ML models easily and quickly.
