
Predictive Maintenance

Stefan Libiseller
January 16th, 2020 · 4 min read

Optimising cost-intensive processes such as maintaining heavy machinery is a perfect application for machine learning. Data can be easily captured by sensors and is therefore usually abundant and of good quality - ideal conditions for accurate predictions. From an economic perspective, data-based decisions can reduce costs while increasing reliability compared to maintenance at fixed intervals. This combination makes predictive maintenance so appealing, especially to manufacturers and public transport companies.

In this blog post, I will explain the fundamentals for developing and implementing a predictive maintenance system.

Step Zero: Data Collection

Data collection is by far the most important task before any data science project can start. The more data is available and the better the quality, the easier it is to develop a predictive maintenance system. There is one truth that keeps showing up in almost every data science project: The best time to start collecting data was years ago, the second best time is now. This applies in particular to predictive maintenance, as the wear and tear of heavy machinery can take a very long time.

I recommend recording all available data - even if it does not seem useful at the time. Storing data has never been cheaper, and everything that was not recorded is lost forever. Also, don’t forget to include records of failures and maintenance information in the data collection process! These labels are vital for distinguishing between healthy and faulty states later on. Ideally, involve the maintenance staff as well as a data scientist in this process to make sure nothing is missed.
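To make the labelling point concrete, here is a minimal sketch of what such an event log could look like. The schema (timestamp, machine id, event type, component) is a hypothetical example, not a standard - the point is simply that failures and maintenance actions get recorded alongside the sensor streams:

```python
import csv
import io

# Hypothetical event log: labels for failures and maintenance actions,
# kept alongside the raw sensor data so healthy/faulty states can be
# distinguished later.
events = [
    {"timestamp": "2019-11-02T14:31:00", "machine_id": "press-07",
     "event": "failure", "component": "hydraulic pump"},
    {"timestamp": "2019-11-03T09:00:00", "machine_id": "press-07",
     "event": "maintenance", "component": "hydraulic pump"},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["timestamp", "machine_id", "event", "component"])
writer.writeheader()
writer.writerows(events)
print(buffer.getvalue())
```

In practice this would be appended to a file or database, but even a simple CSV like this is far better than no labels at all.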

Here is another tip: machines often start making strange noises or show visual wear before breaking down. With deep learning it is also possible to use audio, photos and videos as a data source! Microphones and cameras are fairly cheap sensors with a high information content. When humans notice changes in sound or appearance, it is often too late - a deep learning system, on the other hand, can monitor the machine constantly and often notices even very subtle differences.

Exploratory Data Analysis

First of all, it is fundamental to be familiar with the data before using it to train a machine learning model. Otherwise it's like trying to get from point A to point B in a foreign city without Google Maps. It's just not a good idea.

Exploratory Data Analysis (EDA) means that there is no set goal; instead it's about discovering relationships in the data. To start, it's a good idea to look at the most common measures of a distribution such as range, dispersion and median, plus other metrics depending on the data. It's also important to check for missing values and outliers at this stage.
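As a minimal sketch of this first pass, here is what those checks could look like on a toy sensor series (the readings are made up; the missing value and the outlier are planted on purpose):

```python
import statistics

# Toy sensor series with one missing reading (None) and one outlier.
readings = [20.1, 20.4, 19.8, None, 20.2, 35.0, 20.0]
present = [r for r in readings if r is not None]

summary = {
    "n_missing": sum(r is None for r in readings),
    "min": min(present),
    "max": max(present),
    "median": statistics.median(present),
    "stdev": statistics.stdev(present),
}

# Flag outliers as points more than two standard deviations from the mean
# (a crude rule of thumb; the right threshold depends on the data).
mean = statistics.mean(present)
outliers = [r for r in present if abs(r - mean) > 2 * summary["stdev"]]
print(summary, outliers)
```

With a real dataset you would do this per sensor, but the questions stay the same: how much is missing, what does the distribution look like, and which points don't fit?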

In order to understand the data really well, visualisations are essential. Popular Python libraries for this are Matplotlib and seaborn, but my personal favourite is plotnine. plotnine is an implementation of a grammar of graphics in Python, based on ggplot2, a very popular library in R. I learned to love ggplot2 while working on my Twitter Follower Analysis back in 2017. It keeps otherwise complex plots fairly simple and lets you add layers to refine the plot in steps - perfect for the data exploration phase.

With all plots done, look for and calculate which features (sensors) correlate with the wear process. You might have to tweak the data before patterns emerge, but if they are visible to you, it's a good indicator that the ML model will pick them up as well.
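Correlation can be checked numerically as well as visually. Here is a small sketch with invented readings, where vibration rises with wear but temperature does not; the Pearson coefficient makes that difference explicit:

```python
import math

def pearson(xs, ys):
    # Plain Pearson correlation coefficient, no external libraries.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical readings: vibration rises with wear, temperature does not.
wear        = [0.0, 0.1, 0.2, 0.4, 0.6, 0.9]
vibration   = [1.0, 1.1, 1.3, 1.6, 2.0, 2.6]
temperature = [40.2, 39.9, 40.1, 40.0, 40.3, 39.8]

print(pearson(wear, vibration))    # close to 1: strong candidate feature
print(pearson(wear, temperature))  # close to 0: probably not useful here
```

Keep in mind that Pearson only captures linear relationships - a sensor with a non-linear but real connection to wear can still be valuable, which is exactly why the plots matter.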


Data Preparation

This is the cleanup and data transformation phase. Everything nasty we’ve discovered in the EDA has to be treated now, or our ML model is going to be sick later. Remove or correct outliers, fill in or exclude missing values (being careful not to manipulate the data too much) and exclude unreliable sensors.
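A minimal sketch of such a cleanup, using invented values: implausible readings are treated as sensor glitches, and the resulting gaps are filled with the median - one of the simplest imputation strategies, chosen here because it barely distorts the distribution:

```python
import statistics

# Hypothetical raw series: a dropout (None) and an implausible spike.
raw = [20.1, 20.4, None, 20.2, 999.0, 20.0, 19.9]

# 1. Replace obviously impossible readings with None (sensor glitch).
#    The plausible range 0-100 is an assumption for this toy example.
plausible = [r if r is not None and 0.0 <= r <= 100.0 else None for r in raw]

# 2. Fill the gaps with the median of the remaining values.
median = statistics.median(r for r in plausible if r is not None)
cleaned = [median if r is None else r for r in plausible]
print(cleaned)
```

For time series data, interpolating between neighbouring readings is often a better fit than a global median, but the principle is the same: make deliberate, documented corrections rather than silently feeding bad values to the model.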

It’s also a good idea to normalise the data, especially if the scales and ranges vary widely. ML systems usually learn faster and produce more accurate predictions when the data is normalised.
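For example, min-max scaling brings every sensor into the same [0, 1] range regardless of its original units (the values below are made up to show the scale mismatch):

```python
def min_max(values):
    # Scale values into [0, 1]. A common alternative is z-score
    # standardisation: subtract the mean, divide by the standard deviation.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

temperature = [40.0, 55.0, 70.0, 85.0, 100.0]   # degrees, range ~40-100
vibration   = [0.01, 0.02, 0.05, 0.08, 0.10]    # g, range ~0-0.1

print(min_max(temperature))
print(min_max(vibration))
```

One caveat: compute the scaling parameters (min/max or mean/std) on the training data only, and reuse them at prediction time - otherwise information leaks from the test set into training.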

Feature Engineering

Feature Engineering is an important performance booster. It allows us to bake our previously discovered knowledge of data into the prediction process. The goal of feature engineering is to make the prediction as easy as possible for the ML model. It’s like pre-cutting the food for a child so it has less trouble eating it.

Let me give you an example: An accelerometer on a machine might not be very useful if only its raw data is fed into the ML model. But when transformed to the frequency domain the same data suddenly becomes much more meaningful, revealing oscillation and frequency shifts over time. This new data is much more useful for the model and the prediction therefore becomes better and more reliable.
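The frequency-domain transformation is usually done with an FFT (e.g. numpy.fft); to keep this sketch dependency-free, here is a naive discrete Fourier transform applied to a synthetic 50 Hz vibration signal - the made-up sampling rate and frequency just illustrate the idea:

```python
import cmath
import math

def dft_magnitudes(signal):
    # Naive O(n^2) discrete Fourier transform; in practice use an FFT.
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2)]

# Synthetic accelerometer trace: a 50 Hz oscillation sampled at 400 Hz.
fs = 400
signal = [math.sin(2 * math.pi * 50 * t / fs) for t in range(fs)]

mags = dft_magnitudes(signal)
dominant_hz = max(range(len(mags)), key=mags.__getitem__) * fs / len(signal)
print(dominant_hz)
```

Tracking how this dominant frequency (or the energy in certain bands) drifts over weeks of operation is exactly the kind of feature a model can latch onto, even when the raw waveform looks like noise.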

Another possibility of feature engineering would be to combine multiple meaningful sensors into a single, enhanced condition indicator.
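One simple way to build such an indicator - a sketch with invented baseline values, not a standard formula - is to z-score each sensor against its healthy baseline and average the results:

```python
import statistics

# Hypothetical healthy-state baselines for two sensors.
vib_baseline  = [1.0, 1.1, 0.9, 1.0, 1.0]
temp_baseline = [40.0, 41.0, 39.0, 40.5, 39.5]

def z(value, baseline):
    # How many standard deviations a reading is from its healthy mean.
    return (value - statistics.mean(baseline)) / statistics.stdev(baseline)

def condition_indicator(vibration, temperature):
    # ~0 means healthy; the further above 0, the further from normal.
    return (z(vibration, vib_baseline) + z(temperature, temp_baseline)) / 2

print(condition_indicator(1.0, 40.0))   # near baseline
print(condition_indicator(2.5, 55.0))   # clearly degraded
```

A single indicator like this is also much easier to plot, monitor and set alarm thresholds on than a dozen raw sensor channels.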

Training and Testing

Sadly I can't recommend the best model architecture for predictive maintenance or even if you should use classic ML or deep learning. It really depends on the data and what you are trying to predict.

For deep learning, recurrent neural networks (RNNs), such as LSTMs (Long Short-Term Memory networks), would be my first choice, since sensor data represents a time series. In classical ML, Random Forests and Support Vector Machines (SVMs) have shown good results. For anomaly detection, conditional autoencoders work well.

There are two approaches when it comes to the output of the predictive maintenance model: regression and classification. In remaining useful life (RUL) prediction, the network outputs the expected duration until failure - a regression problem. But in practice the question is more often “What's the probability of failure in the next 30 days?”, which shifts the task from regression to classification.
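The reframing itself is just a relabelling of the training data. Given RUL labels (example values made up here), the classification target is whether the RUL falls within the chosen horizon:

```python
# Turning a remaining-useful-life (RUL) regression target into a
# classification target: "will it fail within the next 30 days?"
HORIZON_DAYS = 30

rul_days = [120, 85, 42, 29, 11, 2]  # regression labels (days to failure)
will_fail_soon = [int(r <= HORIZON_DAYS) for r in rul_days]

print(will_fail_soon)
```

A classifier trained on such labels can then output a probability instead of a hard 0/1, which maps directly onto the maintenance question being asked.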

As with all data science projects, a variety of models with different parameters are usually trained to find the best model and its configuration for the respective task.
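The search over configurations can be as simple as a grid search. In this sketch the parameter names and the scoring function are placeholders - in a real project, train_and_validate would train a model and return its validation score:

```python
import itertools

# Hypothetical search space; the parameter names are illustrative only.
param_grid = {
    "learning_rate": [0.1, 0.01],
    "hidden_units": [32, 64],
}

def train_and_validate(params):
    # Placeholder for real training + validation; the score here is
    # just a dummy so the example runs end to end.
    return -abs(params["learning_rate"] - 0.01) - params["hidden_units"] / 1000

candidates = [dict(zip(param_grid, combo))
              for combo in itertools.product(*param_grid.values())]
best = max(candidates, key=train_and_validate)
print(best)
```

For larger search spaces, random search or Bayesian optimisation usually finds good configurations with far fewer training runs than an exhaustive grid.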


Deployment

The last step is to deploy and test the model in practice. You can read more about the deployment of models in my blog post Deploying PyTorch to Production. If the test phase is successful, a rollout to identical machines is fairly easy.

In addition to my data science skills I also have knowledge of electronics and electrical engineering from my time at the HTL. If you would also like to reduce costs and increase reliability at your company, feel free to contact me!

YouTube: Predictive Maintenance & Monitoring using Machine Learning Talk
Prognostics Data Repository by NASA - A collection of 16 predictive maintenance datasets


© 2020 Stefan Libiseller