Java Tools for Deep Learning, Machine Learning and AI

Why should you use JVM languages like Java, Scala, Clojure or Kotlin to build AI and machine-learning solutions?

Java is one of the most widely used programming languages in the world. Large organizations in the public and private sectors have enormous Java code bases, and rely heavily on the JVM as a compute environment. In particular, much of the open-source big data stack is written for the JVM. This includes Apache Hadoop for distributed data management; Apache Spark as a distributed run-time for fast ETL; Apache Kafka as a message queue; Elasticsearch, Apache Lucene and Apache Solr for search; and Apache Cassandra for data storage, to name a few. The tools below give you powerful ways to apply machine learning on the JVM.

Deep Learning & Neural Networks

Deep learning usually refers to deep artificial neural networks. Neural networks are a type of machine learning algorithm loosely modeled on the neurons in the human brain.
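To make the idea of an artificial neuron concrete, here is a minimal sketch in plain Java of a single neuron trained with the classic perceptron rule to learn the logical AND function. The class and method names are hypothetical and chosen for illustration; none of the libraries listed below is used, and a deep network stacks many such units in layers and trains them differently.

```java
// A single artificial neuron (perceptron). Hypothetical names, standard library only.
class PerceptronSketch {
    final double[] weights;
    double bias;

    PerceptronSketch(int inputs) {
        weights = new double[inputs];
    }

    // Output 1 if the weighted sum of the inputs plus the bias exceeds zero, else 0.
    int predict(double[] x) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * x[i];
        }
        return sum > 0 ? 1 : 0;
    }

    // Perceptron rule: after each wrong prediction, shift the weights toward the target.
    void train(double[][] inputs, int[] targets, int epochs) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int n = 0; n < inputs.length; n++) {
                int error = targets[n] - predict(inputs[n]);
                for (int i = 0; i < weights.length; i++) {
                    weights[i] += error * inputs[n][i];
                }
                bias += error;
            }
        }
    }

    public static void main(String[] args) {
        double[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        int[] and = {0, 0, 0, 1};
        PerceptronSketch neuron = new PerceptronSketch(2);
        neuron.train(inputs, and, 20);
        for (double[] x : inputs) {
            System.out.println(x[0] + " AND " + x[1] + " -> " + neuron.predict(x));
        }
    }
}
```

The neuron starts with all weights at zero and only changes them when it makes a mistake, which is the sense in which it learns from data.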


TensorFlow provides a Java API. While it is not as fully developed as TensorFlow’s Python API, progress is being made. Karl Lessard is leading efforts to adapt TensorFlow to the JVM, and companies such as Facebook are active in the TensorFlow Java SIG that he leads. Those interested can join the SIG or the TensorFlow JVM Gitter channel. The TensorFlow Java code is hosted on GitHub. TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments.


Neuroph is an open-source Java framework for neural networks. Developers can create neural nets with the Neuroph GUI. The Neuroph API documentation also explains how neural networks work.


Apache MXNet has a Java API as well as many other language bindings. It is backed by Carnegie Mellon and Amazon as well as the Apache Software Foundation.

Deep Java Library (DJL)

Deep Java Library is another Java-focused deep learning development tool, introduced by Amazon.


Deeplearning4j is a DSL that allows users to configure neural networks in Java. It was created by the startup Skymind, which shut down in 2019 and no longer offers technical or commercial support.

Computer Vision JSR

Frank Greco of IBM and Zoran Sevarac are leading an effort to define a computer vision API for Java.

Reinforcement Learning


Pathmind’s application applies deep reinforcement learning to simulations, and trains AI decision-making agents that can respond to real events. Its users are industrial engineers and simulation modelers. Pathmind is being used by simulation modelers at Accenture and other global engineering teams to optimize use cases such as reducing carbon emissions in supply chain throughput and coordinating the activity of AGVs and cranes. It augments the control systems of organizations that have physical operations like factories, mines and warehouses, and helps them increase the efficiency and throughput of those operations by as much as 30%.

Machine Learning Model Servers


Seldon is a Java-focused, open-source machine learning model server that integrates with Kubernetes. Its code is available in Seldon’s GitHub repository. Its name references the godfather of psycho-historians, Hari Seldon, of Isaac Asimov’s Foundation series, who uses math to predict the future.


Kubeflow is an open, community-driven project to make it easy to deploy and manage an ML stack on Kubernetes. Kubeflow pipelines are reusable end-to-end ML workflows (including models and data transforms) built using the Kubeflow Pipelines SDK.

Amazon Sagemaker

Amazon Sagemaker is a tool for building, training and deploying machine learning models to production.


MLeap is an open-source project that helps deploy Spark pipelines, including ML models, to production.

Expert Systems

An expert system is also called a rules-based system. The rules are typically if-then statements; that is, if this condition is met, then perform this action. An expert system usually comprises hundreds or thousands of nested if-then statements. Expert systems were a popular form of AI in the 1980s. They are good at modeling static and deterministic relationships, such as the tax code. However, they are also brittle, and they require manual modification, which can be slow and expensive. Unlike machine-learning algorithms, they do not adapt as they are exposed to more data. They can be a useful complement to a machine-learning algorithm, codifying the things that should always happen a certain way.
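The if-then structure described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the Rule class, the fire method and the two tax-like rules are hypothetical, and they are not the API of Drools or of any other rules engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// A toy rules-based system. All names and rules here are made up for illustration.
class RuleSketch {

    // One if-then rule: if the condition holds for the facts, recommend the action.
    static final class Rule {
        final Predicate<Map<String, Double>> condition;
        final String action;

        Rule(Predicate<Map<String, Double>> condition, String action) {
            this.condition = condition;
            this.action = action;
        }
    }

    // Check every rule against the facts and collect the actions whose conditions are met.
    static List<String> fire(List<Rule> rules, Map<String, Double> facts) {
        List<String> actions = new ArrayList<>();
        for (Rule rule : rules) {
            if (rule.condition.test(facts)) {
                actions.add(rule.action);
            }
        }
        return actions;
    }

    // Hand-written rules; the facts map must contain the keys "income" and "dependents".
    static List<Rule> exampleRules() {
        return List.of(
            new Rule(f -> f.get("income") > 100_000, "apply higher rate"),
            new Rule(f -> f.get("dependents") >= 1, "apply dependent credit"));
    }

    public static void main(String[] args) {
        Map<String, Double> facts = Map.of("income", 120_000.0, "dependents", 0.0);
        System.out.println(fire(exampleRules(), facts));
    }
}
```

Every rule here was written by hand, and changing the system's behavior means editing that list, which is the brittleness noted above.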


Drools is a business rules management system backed by Red Hat.



OptaPlanner is an AI constraint solver written in Java. It is a lightweight, embeddable planning engine that includes algorithms such as Tabu Search, Simulated Annealing, Late Acceptance and other metaheuristics, along with very efficient score calculation and other state-of-the-art constraint-solving techniques.
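For readers unfamiliar with metaheuristics, the following is a minimal sketch of simulated annealing in plain Java. It does not use OptaPlanner or reflect its API; the toy problem (splitting weights into two bins so that the totals are as close as possible), the class name, and the temperature settings are all assumptions made for illustration.

```java
import java.util.Random;

// Simulated annealing on a toy two-bin balancing problem. Hypothetical names throughout.
class AnnealingSketch {

    // Score: absolute difference between the total weights of the two bins (lower is better).
    static int cost(int[] weights, boolean[] inSecondBin) {
        int diff = 0;
        for (int i = 0; i < weights.length; i++) {
            diff += inSecondBin[i] ? -weights[i] : weights[i];
        }
        return Math.abs(diff);
    }

    // Returns the best assignment found; the search starts with every item in the first bin.
    static boolean[] solve(int[] weights, int iterations, long seed) {
        Random random = new Random(seed);
        boolean[] current = new boolean[weights.length];
        boolean[] best = current.clone();
        int currentCost = cost(weights, current);
        int bestCost = currentCost;
        double temperature = 10.0; // arbitrary starting temperature

        for (int step = 0; step < iterations; step++) {
            int i = random.nextInt(weights.length);
            current[i] = !current[i]; // move: put one item in the other bin
            int newCost = cost(weights, current);
            int delta = newCost - currentCost;
            // Always accept improvements; accept worse moves with probability exp(-delta / T).
            if (delta <= 0 || random.nextDouble() < Math.exp(-delta / temperature)) {
                currentCost = newCost;
                if (currentCost < bestCost) {
                    bestCost = currentCost;
                    best = current.clone();
                }
            } else {
                current[i] = !current[i]; // reject: undo the move
            }
            temperature = Math.max(0.01, temperature * 0.995); // geometric cooling
        }
        return best;
    }

    public static void main(String[] args) {
        int[] weights = {8, 7, 6, 5, 4};
        boolean[] best = solve(weights, 5000, 42L);
        System.out.println("Difference between bins: " + cost(weights, best));
    }
}
```

Accepting an occasional worse move while the temperature is high is what lets the search escape local optima. A production solver adds much more on top of this, such as incremental score calculation and hard and soft constraints.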

Natural-Language Processing

Natural language processing (NLP) refers to applications that use computer science, AI and computational linguistics to enable interactions between computers and human languages, both spoken and written. It involves programming computers to process large natural language corpora (sets of documents).

Challenges in natural language processing frequently involve natural language understanding (NLU) and natural language generation (NLG), as well as connecting language, machine perception and dialog systems.
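One of the most basic steps in processing a corpus is splitting text into tokens and counting them. The sketch below does this with the Java standard library only; the class and method names are hypothetical, and the toolkits listed in this section provide far more capable tokenizers along with tagging, parsing and other tools.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Naive tokenization and term counting. Hypothetical names, standard library only.
class TokenCountSketch {

    // Lowercase the text, split on anything that is not a letter or digit, and count each token.
    static Map<String, Integer> countTokens(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countTokens("The cat saw the other cat."));
    }
}
```

Splitting on non-alphanumeric characters is a deliberate simplification: it mishandles contractions, hyphenated words and languages that are written without spaces.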


Apache OpenNLP is a machine-learning toolkit for processing natural language; i.e. text. The official website provides API documentation with information on how to use the library.

Stanford CoreNLP

Stanford CoreNLP is the most popular Java natural-language processing framework. It provides various tools for NLP tasks. The official website provides tutorials and documentation with information on how to use this framework.

Machine Learning

Machine learning encompasses a wide range of algorithms that are able to adapt themselves when exposed to data. These include random forests, gradient boosted machines, support-vector machines and others.


SMILE stands for Statistical Machine Intelligence and Learning Engine. SMILE was created by Haifeng Li, and provides fast, scalable machine learning for Java. SMILE uses ND4J to perform scientific computing for large-scale tensor manipulations. It includes algorithms such as support vector machines (SVMs), decision trees, random forests and gradient boosting, among others.


Apache SINGA is an open-source machine-learning library capable of distributed training, with a focus on healthcare applications.

Java Machine Learning Library (Java-ML)

Java-ML is an open source Java framework which provides various machine learning algorithms specifically for programmers. The official website provides API documentation with many code samples and tutorials.


RapidMiner is a data science platform that supports various machine- and deep-learning algorithms through its GUI and Java API. It has a large community, many available tutorials, and extensive documentation.


Weka is a collection of machine learning algorithms that can be applied directly to a dataset, through the Weka GUI or API. The Weka community is large, and provides various tutorials on Weka and on machine learning itself.

MOA (Massive On-line Analysis)

MOA (Massive On-line Analysis) is an open-source framework for mining data streams.
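What sets stream mining apart is that each data point is seen once and then discarded, so statistics must be updated incrementally in constant memory. As a minimal sketch of that style, here is Welford's online algorithm for the running mean and variance in plain Java. The class name is hypothetical and this is not MOA's API.

```java
// Running mean and variance over a stream, using Welford's online algorithm.
class RunningStats {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    // Update the statistics with one new value; no past values are stored.
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    double mean() {
        return mean;
    }

    // Sample variance; defined as 0 until at least two values have been seen.
    double variance() {
        return count > 1 ? m2 / (count - 1) : 0.0;
    }

    public static void main(String[] args) {
        RunningStats stats = new RunningStats();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            stats.add(x);
        }
        System.out.println("mean=" + stats.mean() + " variance=" + stats.variance());
    }
}
```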

Encog Machine Learning Framework

Encog is a Java machine learning framework that supports many machine learning algorithms. It was developed by Jeff Heaton, of Heaton Research. The official website provides documentation and examples.


H2O is a startup providing open-source algorithms such as random forests and gradient boosted models.


The Brown-UMBC Reinforcement Learning and Planning (BURLAP) library is for the use and development of single- or multi-agent planning and learning algorithms, and of domains to accompany them.

Chris Nicholson

Chris Nicholson is the CEO of Pathmind. He previously led communications and recruiting at the Sequoia-backed robo-advisor, FutureAdvisor, which was acquired by BlackRock. In a prior life, Chris spent a decade reporting on tech and finance for The New York Times, Businessweek and Bloomberg, among others.

