AI-driven Predictive Maintenance in the IIoT: It’s complicated
Predictive maintenance is considered a key application for artificial intelligence in the Industrial Internet of Things (IIoT). Sensors capture vibrations, temperatures, currents, or pressure values, and AI models are expected to detect when machines transition from normal operation into a critical state. The goal is to identify failures early and make maintenance predictable. In practice, however, many of these projects fail not because of insufficient computing power or sensor technology, but due to a fundamental data problem.
Why predictive maintenance AI is so difficult to train
Industrial systems are designed to operate reliably. Failures are rare – and from a machine learning perspective, that is exactly the problem. While huge volumes of normal-operation data exist, there are very few labeled examples of real faults or failures. The datasets are highly imbalanced: “normal condition” dominates, “fault condition” is the exception.
For AI models, this creates a high risk of misleading results. A model can achieve high accuracy by predicting “no fault” almost all the time, without actually being able to reliably detect critical states. This class imbalance problem is one of the central obstacles to deploying AI productively in predictive maintenance.
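The accuracy trap can be made concrete with a small sketch. The label counts below (990 normal samples, 10 faults) are invented for illustration and do not come from any real dataset; the "model" is a deliberately useless one that always predicts normal operation.

```python
# Hypothetical labels: 0 = normal operation, 1 = fault.
y_true = [0] * 990 + [1] * 10

# A degenerate "model" that never predicts a fault.
y_pred = [0] * len(y_true)

def accuracy(y_true, y_pred):
    """Share of samples where prediction and label agree."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Share of actual positives (faults) that the model found."""
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy     = {acc:.3f}")  # 0.990
print(f"fault recall = {rec:.3f}")  # 0.000
```

The model is right 99% of the time and still misses every single fault, which is why recall or similar per-class metrics are usually more informative than accuracy on imbalanced data.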
Statistical data preprocessing with SMOTE
A common approach to mitigating this issue is artificially increasing the number of rare fault samples. The best-known method is SMOTE (Synthetic Minority Over-sampling Technique). New fault data points are generated by interpolating between an existing fault sample and one of its nearest neighbors among the other fault samples. This balances the dataset and allows classical machine learning models to be trained more stably.
The advantage of SMOTE lies in its simplicity and low computational cost. The downside is that the generated data is often mathematically plausible but not physically realistic: an interpolated point need not correspond to a state the machine can actually reach. Complex temporal or nonlinear fault patterns can only be represented to a limited extent.
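The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified illustration, not a reference implementation: the function name `smote_sketch`, its parameters, and the sample values are assumptions made for this example. Production code would more likely use an established library such as imbalanced-learn.

```python
import math
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    randomly chosen minority sample and one of its k nearest neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical fault samples: [vibration in mm/s, temperature in deg C].
faults = [[7.1, 92.0], [6.8, 95.5], [7.6, 90.2], [8.0, 97.1]]
new_faults = smote_sketch(faults, n_new=6)
for point in new_faults:
    print([round(v, 2) for v in point])
```

Every synthetic point lies between two real fault samples. That is what makes the method cheap, and it is also the source of the limitation described above: the procedure cannot produce fault behavior that is not already spanned by the existing samples.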
Generative models and GANs
A more advanced approach uses generative AI models, particularly Generative Adversarial Networks (GANs). These models learn the statistical structure of real fault data and generate new synthetic samples that closely resemble the originals. This enables the creation of more realistic fault scenarios, especially for complex sensor data and time series.
GANs can capture complex fault patterns better than purely interpolation-based methods such as SMOTE. At the same time, they are computationally expensive, difficult to validate, and prone to methodological pitfalls, for example when synthetic data unintentionally leaks into test datasets. In addition, their interpretability for industrial users is limited.
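One way to avoid the leakage pitfall is to hold out real test data before any synthetic data is generated, and to feed only the training faults to the generator. The sketch below illustrates that ordering. The names `split_then_augment` and `jitter_generator` are hypothetical, and the jitter function is only a trivial stand-in for a trained GAN, used so that the example stays self-contained.

```python
import random

def jitter_generator(train_faults, n_new, seed=1):
    """Stand-in for a generative model: copies training faults with noise.
    A real setup would train a GAN on train_faults instead."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0, 0.1) for v in rng.choice(train_faults)]
            for _ in range(n_new)]

def split_then_augment(normal, faults, generate, test_fraction=0.25, seed=0):
    rng = random.Random(seed)
    normal, faults = normal[:], faults[:]
    rng.shuffle(normal)
    rng.shuffle(faults)

    # 1. Hold out real samples for testing before any synthesis happens.
    n_test_normal = max(1, int(len(normal) * test_fraction))
    n_test_faults = max(1, int(len(faults) * test_fraction))
    test = ([(x, 0) for x in normal[:n_test_normal]]
            + [(x, 1) for x in faults[:n_test_faults]])
    train_normal = normal[n_test_normal:]
    train_faults = faults[n_test_faults:]

    # 2. The generator only ever sees training faults.
    n_missing = max(0, len(train_normal) - len(train_faults))
    synthetic = generate(train_faults, n_missing)

    # 3. Synthetic samples go into the training set only.
    train = ([(x, 0) for x in train_normal]
             + [(x, 1) for x in train_faults]
             + [(x, 1) for x in synthetic])
    return train, test

# Hypothetical data: [vibration, temperature] per sample.
rng = random.Random(42)
normal = [[rng.gauss(2.0, 0.3), rng.gauss(60.0, 2.0)] for _ in range(40)]
faults = [[rng.gauss(7.0, 0.5), rng.gauss(95.0, 3.0)] for _ in range(8)]
train, test = split_then_augment(normal, faults, jitter_generator)
print(len(train), len(test))
```

With this ordering the test set contains only real measurements that the generator never saw, so the reported detection performance is not inflated by samples derived from the test data.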

Figure: Proposed workflow model using data-generating techniques (Zafat et al., 2025).
Combinations of SMOTE and GANs
A hybrid approach combines both methods. This is exactly where the paper “GenIIoT: Generative Models Aided Proactive Fault Management in Industrial Internet of Things” by Isra Zafat, Arshad Iqbal, Maqbool Khan, Naveed Ahmad, and Mohammed Ali Alshara, published in 2025 in the journal *Information* (MDPI), comes in.
The idea: SMOTE first improves class balance, and GANs then generate more realistic fault patterns. This mixed approach aims to mitigate the weaknesses of both methods. Using several industrial datasets, the authors demonstrate that detection performance across different AI models can be significantly improved, especially for highly imbalanced data.
Alternative strategies beyond synthetic fault data
Not all approaches rely on synthetic data. An alternative strategy is anomaly detection: the model learns only normal operation and flags deviations. This reduces the need for fault labels but increases the risk of false alarms.
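A minimal sketch of this idea, assuming a single temperature sensor and a simple z-score rule: the model "learns" normal operation as a mean and standard deviation and flags readings that deviate too far. The readings, the function names, and the threshold of 3.0 are illustrative assumptions; real systems typically use multivariate or learned models.

```python
import statistics

def fit_normal_profile(normal_readings):
    """Describe normal operation by its mean and sample standard deviation."""
    return statistics.fmean(normal_readings), statistics.stdev(normal_readings)

def is_anomaly(value, profile, threshold=3.0):
    """Flag a reading whose z-score exceeds the threshold."""
    mean, std = profile
    return abs(value - mean) / std > threshold

# Hypothetical temperature readings (deg C) from fault-free operation.
normal_temps = [60.1, 59.8, 60.4, 60.0, 59.7, 60.3, 60.2, 59.9]
profile = fit_normal_profile(normal_temps)

flags = {t: is_anomaly(t, profile) for t in (60.2, 61.5, 75.0)}
print(flags)  # {60.2: False, 61.5: True, 75.0: True}
```

No fault labels were needed, but the reading of 61.5 shows the trade-off: the rule flags it just like the obvious outlier at 75.0, although it may be nothing more than a warm day. Raising the threshold reduces such false alarms at the cost of later detection.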
Other approaches use transfer learning, where models are pre-trained on similar machines or simulations, or hybrid systems that combine AI with rule-based methods and physical models. These approaches are often more robust but more complex to implement.
Many approaches for one fundamental problem
All of the described solution paths respond to the same fundamental issue: industrial reality provides too little fault data to train AI in a classical way. SMOTE, GANs, hybrid approaches, and alternative learning strategies are attempts to close this structural gap. No method is universally superior. What matters are the system context, data quality, safety requirements, and how explainable and maintainable a solution needs to be.
Summary (tl;dr)
- Predictive maintenance often fails due to too little real fault data
- SMOTE is simple, but its synthetic data has limited physical realism
- GANs generate more realistic faults, but are complex and hard to validate
- Hybrid approaches like the one by Zafat et al. combine both methods
- Alternative strategies avoid fault data, but introduce new risks
Further reading
- GenIIoT: Generative Models Aided Proactive Fault Management in IIoT (Zafat et al., 2025). Hybrid SMOTE–GAN approach to improve training for fault management and predictive maintenance in IIoT.
- Review of imbalanced fault diagnosis technology based on generative adversarial networks (Oxford Academic, 2024). Open-access review on class imbalance in fault diagnosis and the role of GANs as a data enhancement approach.
- Generate Synthetic Pump Signals Using Conditional GAN (MathWorks/MATLAB). Practical tutorial on generating synthetic pump signals for PdM using conditional GANs.
- TimeGAN: Time-series Generative Adversarial Networks (NeurIPS 2019 – PDF). Seminal paper on TimeGAN as a generative model for realistic synthetic time series (relevant for PdM sensor data).
- Evaluating Lightweight GAN- and Adapted CTGAN-Based Data Synthesis for Predictive Maintenance (PDF, 2025). Conference paper comparing GAN- and CTGAN-like approaches for PdM sensor data synthesis.
- Fair Synthetic Time Series Data for Predictive Maintenance (DiVA – PDF, 2024). Thesis on generating synthetic PdM time series and evaluating quality and bias.
- Try Anomaly Detection for Predictive Maintenance (KNIME – Blog). Hands-on introduction to anomaly detection as a PdM strategy.











