A Novel Deep Learning Framework for Predictive Fault Diagnosis in Dry-Type Transformers Using Vibration Signal Data Analysis Technique

Trong-Chuong Trinh, Van-Nam Pham, Ngoc-Khoat Nguyen*

Department of Science and Technology, Hanoi University of Industry, Hanoi 100000, Vietnam

Faculty of Automation, School of Electrical and Electronic Engineering (SEEE), Hanoi University of Industry, Hanoi 100000, Vietnam

Faculty of Control and Automation, Electric Power University, Hanoi 100000, Vietnam

Corresponding Author Email: khoatnn@epu.edu.vn

Page: 2133-2142 | DOI: https://doi.org/10.18280/jesa.581013

Received: 1 August 2025 | Revised: 3 September 2025 | Accepted: 21 September 2025 | Available online: 31 October 2025

© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Early and accurate fault diagnosis in power transformers is critical for ensuring grid reliability and operational safety, particularly within dynamic and complex industrial environments. This study introduces a novel deep learning-based framework for predictive fault diagnosis in dry-type transformers, leveraging vibration signals for real-time condition monitoring. The methodology involves transforming raw time-series vibration data into two-dimensional (2D) representations using Gramian Angular Field (GAF) encoding. Specifically, both the Gramian Angular Summation Field (GASF) and Gramian Angular Difference Field (GADF) techniques are applied to capture complementary temporal correlations and dependencies. The encoded images are processed using a dedicated dual-stream Convolutional Neural Network (CNN) architecture designed to perform automated, robust feature extraction. The resulting feature vectors are then fused and utilized to train a suite of five distinct machine learning classifiers: Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR), and a Multi-Layer Perceptron (MLP) feedforward neural network. The proposed system's efficacy was rigorously evaluated across three prevalent fault types common in dry-type transformers: loose windings, loose mounting bolts, and overload conditions. Empirical results demonstrate that the Random Forest classifier exhibited the most robust performance, achieving a peak classification accuracy of 94%. By synergistically integrating Internet of Things (IoT) technologies for data acquisition with sophisticated artificial intelligence for predictive analytics, this approach delivers a scalable, non-invasive, and cost-effective solution that surpasses the diagnostic capabilities of conventional monitoring techniques.
The developed framework not only significantly enhances transformer reliability and reduces maintenance overhead but also establishes a proactive paradigm for risk mitigation and intelligent asset management in modern power systems.

Keywords: 

transformer fault diagnosis, deep learning, vibration analysis, GAF, condition monitoring

1. Introduction

The prevailing transformer monitoring infrastructure in industrial environments remains largely fragmented and functionally constrained. Most systems rely on standalone monitoring and alarm units that track only a limited set of fundamental electrical parameters. This decentralized architecture typically lacks the capacity for holistic data acquisition and advanced analytics, rendering continuous online monitoring and periodic performance assessment infrequent. Consequently, the implementation of automated predictive maintenance scheduling is rare and often technologically prohibitive.

This restricted visibility into transformer health introduces substantial operational vulnerabilities. Without timely fault detection, latent issues may escalate undetected, resulting in costly repairs, prolonged downtime, or catastrophic equipment failure necessitating full unit replacement.

Recent studies have underscored both the opportunities and challenges inherent in applying machine learning (ML) to predictive maintenance, particularly for critical rotating machinery such as bearings, motors, gearboxes, and pumps [1-4]. Significant gaps persist in this field, presenting opportunities to enhance model accuracy and improve the flexibility of future predictive maintenance solutions. For instance, Rahman et al. developed an intelligent anomaly detection method for electric motors using a combination of vibration signals and artificial intelligence algorithms [5]. They trained an unsupervised learning model on two distinct motors, a new test motor and an aged industrial motor. Their comprehensive analysis demonstrated that the model achieved high anomaly detection capability for standardized operating conditions using mapped features. However, a key limitation of their work was the exclusive use of data from normal motor conditions due to a scarcity of data on actual fault conditions.

Another approach, presented by Tahir et al. [6], leveraged statistical features from vibration signals, such as root mean square (RMS), mean, variance, skewness, and kurtosis, for model training, a common practice in many studies. A notable contribution of their work was the introduction of a Mean-based Outlier Detection (MOD) method during the data preprocessing phase. This technique was designed to identify and remove outliers (data samples influenced by external factors rather than actual faults), thereby improving model performance. However, their study did not address the classification of similar fault types with varying degrees of severity.

The rapid advancement of deep learning (DL) has spurred a growing interest among researchers in applying DL methods to address the aforementioned challenges [7-9]. Many DL techniques have already been successfully deployed in fault diagnosis. Convolutional Neural Networks (CNNs), in particular, have emerged as a classic and highly effective architecture for image classification tasks and are considered a good solution to the problem of bearing fault detection [10, 11]. Nevertheless, a key challenge with very deep DL models is the vanishing gradient problem, which can lead to a degradation in performance. To circumvent this issue, Support Vector Machines (SVMs) have been proposed [12]. SVMs, which separate classes by means of maximum-margin hyperplanes, are particularly effective in handling complex, high-dimensional data, leading to the development of a series of robust SVM-based models [13-15].

A particularly promising approach for enhancing fault diagnosis involves encoding complex time-series data into visual representations for subsequent DL analysis. The Gramian Angular Field (GAF) methodology, encompassing the Gramian Angular Summation Field (GASF) and Gramian Angular Difference Field (GADF) techniques, has achieved considerable prominence as an approach for transforming univariate time-series data into a two-dimensional image representation. This technique is valued for its ability to preserve the temporal relationships within the data, thereby enhancing feature extraction by DL models. For example, Shen et al. applied GAF to vibration signals for bearing fault diagnosis, demonstrating marked improvements in classification accuracy when combined with CNNs [16]. Similarly, Bai et al. utilized GAF encoding for fault detection in rotating machinery, highlighting its robustness in capturing subtle patterns in time-series data [17]. The effectiveness of GAF across various fault diagnosis tasks has been further validated by other studies, confirming its utility in transforming complex vibration signals into a format suitable for DL models [18-20].

This study introduces a novel deep learning-based framework for the predictive fault diagnosis of dry-type transformers. The proposed methodology leverages GAF encoding to convert normalized time-series vibration data into GASF and GADF images. These images are subsequently processed by a CNN for automated feature extraction. The extracted features are then concatenated and fed into a series of machine learning models for fault classification, focusing on three common transformer faults: loose windings, loose mounting bolts, and overload conditions. By integrating Internet of Things (IoT) technologies for real-time data acquisition with advanced artificial intelligence (AI) for predictive analytics, this approach seeks to improve transformer reliability, reduce maintenance costs, and mitigate operational risks. This scalable methodology offers a promising solution for industrial applications and holds the potential to significantly outperform conventional fault diagnosis techniques.

The rest of this paper is organized as follows. Section 2 presents the theoretical framework for transformer fault diagnosis. Section 3 proposes a novel method to address this problem. Section 4 presents the experimental results obtained by applying the proposed methodology. Finally, conclusions and future work are provided in Section 5.

2. Theoretical Framework

Regular and systematic monitoring of electrical transformers enables the timely detection and mitigation of potential faults, thereby streamlining maintenance procedures, enhancing operational efficiency, and reducing associated costs. In this context, the development and deployment of intelligent fault diagnosis systems that leverage Internet of Things (IoT) technologies within industrial environments are both necessary and practically justified. Such systems utilize a network of sensors to acquire heterogeneous data streams encompassing electrical, mechanical, and thermal parameters. These data are persistently stored and enriched through integration with intelligent diagnostic software. Subsequently, advanced analytical techniques powered by Artificial Intelligence (AI) are employed to process and interpret the collected information, facilitating early identification of anomalous conditions and latent faults. The fundamental operational architecture of these systems follows a structured sequence, as illustrated in Figure 1.

Figure 1. Process flow of transformer condition monitoring and fault prediction system

  1. Data Collection from Sensors: Vibration sensors are attached to the transformer casing to continuously collect data regarding the transformer's state and operation.
  2. Data Transmission to the Central Control System: Data from the sensors are transmitted to the central control system using network connections. The data are sent to servers or centralized points for analysis and processing.
  3. Data Analysis: At the central control system, data from the sensors are analyzed using machine learning algorithms, artificial intelligence, and data analysis techniques. The objective is to identify trends, patterns, and abnormal parameters that could lead to faults.
  4. Fault Prediction and Warning: Based on the data analysis, the system predicts potential faults or undesirable future situations. When a potential fault is identified, the system generates warnings and notifies managers or technicians for timely intervention.
  5. Maintenance Optimization: The system provides detailed information about predicted faults and the transformer's state. This information helps optimize maintenance schedules, saving time and resources.

3. Methodology

3.1 Data processing and GAF transformation

The Gramian Angular Field (GAF) is a mapping technique that converts 1D waveforms, originally expressed in Cartesian coordinates, into 2D images by way of a polar-coordinate representation. This transformation retains the dynamic features of the original signal in the discrete time domain while also preserving its temporal dependencies. The surface vibration signal of the transformer, once collected, is represented as a 1D time series $x=\left\{x_1, x_2, \ldots, x_n\right\}$, where n is the total number of samples acquired during the measurement process at a specified sampling frequency. To standardize the data's magnitude and nullify variations related to the measurement scale, the complete signal undergoes a normalization procedure to fit within the predefined range [−1,1] using Eq. (1):

$x_i^{\mathrm{norm}}=2 \times \frac{x_i-\min (x)}{\max (x)-\min (x)}-1$     (1)

where:

  • $x_i$ is the signal value at time index $i$;
  • $\min (x)$ and $\max (x)$ are defined as the minimum and maximum amplitudes, respectively, observed across the entire signal range.
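
As a minimal sketch of Eq. (1), the rescaling can be written in a few lines of Python. The function name `minmax_normalize` and the sample values are illustrative assumptions, not part of the original study; the guard against a constant signal is also an addition, since Eq. (1) is undefined when $\max(x)=\min(x)$.

```python
def minmax_normalize(signal):
    """Rescale a 1D signal to the range [-1, 1] following Eq. (1)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # Eq. (1) divides by max(x) - min(x), so a constant signal is rejected.
        raise ValueError("constant signal: max(x) equals min(x)")
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]


# Illustrative values only, not measured vibration data.
raw = [0.2, 0.5, 1.1, 0.8, 0.2]
normalized = minmax_normalize(raw)
```

The smallest input maps to −1 and the largest to 1, which is the domain required by the arccos step of the GAF encoding.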

Signal normalization is first employed as a critical preprocessing stage to attenuate amplitude-scale artifacts and thereby enhance the convergence stability and rate of the deep learning architecture. Subsequent to normalization, the continuous time-series signal is partitioned into a series of fixed-length, overlapping segments via a sliding window protocol, as depicted in Figure 2. This segmentation strategy serves a dual purpose: it facilitates the extraction of localized temporal dynamics from the signal and simultaneously acts as a data augmentation technique to substantially expand the volume of training instances.

Figure 2. Re-sampling process of the signal

Figure 3. The transformation pipeline from the normalized time-series to GASF/GADF representations

To convert one-dimensional vibration time-series into a format suitable for CNNs, each segmented and normalized sample $\tilde{X}=\left\{\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n\right\}$, where $\tilde{x}_i \in[-1,1]$, is transformed into a 2D image by applying the GAF method. Each point is first mapped to polar coordinates, as presented in Eq. (2).

$\left\{\begin{array}{l}\phi_i=\arccos \left(\tilde{x}_i\right),\ -1 \leq \tilde{x}_i \leq 1,\ \tilde{x}_i \in \tilde{X} \\ r_i=\frac{t_i}{N},\ t_i \in \mathbb{N}\end{array}\right.$     (2)

where:

  • $\tilde{x}_i$ is the normalized signal value at time $t_i$;
  • $t_i$ is the timestamp corresponding to each sample;
  • $N$ is a constant used to normalize the radius $r_i$ into a reasonable range within the polar coordinate system.

Based on the polar coordinates, two types of GAFs are computed to encode temporal correlations between signal points:

The Gramian Angular Summation Field (GASF):

$G_S=\left(\begin{array}{ccc}\cos \left(\phi_1+\phi_1\right) & \cdots & \cos \left(\phi_1+\phi_n\right) \\ \cos \left(\phi_2+\phi_1\right) & \cdots & \cos \left(\phi_2+\phi_n\right) \\ \vdots & \ddots & \vdots \\ \cos \left(\phi_n+\phi_1\right) & \cdots & \cos \left(\phi_n+\phi_n\right)\end{array}\right)=\tilde{X}^T \tilde{X}-\left(\sqrt{I-\tilde{X}^2}\right)^T \sqrt{I-\tilde{X}^2}$      (3)

The Gramian Angular Difference Field (GADF):

$G_D=\left(\begin{array}{ccc}\sin \left(\phi_1-\phi_1\right) & \cdots & \sin \left(\phi_1-\phi_n\right) \\ \sin \left(\phi_2-\phi_1\right) & \cdots & \sin \left(\phi_2-\phi_n\right) \\ \vdots & \ddots & \vdots \\ \sin \left(\phi_n-\phi_1\right) & \cdots & \sin \left(\phi_n-\phi_n\right)\end{array}\right)=\left(\sqrt{I-\tilde{X}^2}\right)^T \tilde{X}-\tilde{X}^T \sqrt{I-\tilde{X}^2}$       (4)

where $\tilde{X}$ is the row vector of normalized samples, $I$ is the unit row vector $[1,1, \ldots, 1]$ of length $n$, and the square and square root are applied element-wise.

The resulting GAF images, shown in Figure 3, retain both temporal and amplitude information, making them highly suitable for training CNN models in fault classification tasks.

As a result, the GAF images of the transformer corresponding to four different states are also presented in Table 1.

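Table 1. GAF images of transformer according to the four different states

| Categories | N | E1 | E2 | E3 |
|---|---|---|---|---|
| Transformer vibration | (image) | (image) | (image) | (image) |
| GADF | (image) | (image) | (image) | (image) |
| GASF | (image) | (image) | (image) | (image) |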
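
The GASF and GADF definitions above can be illustrated with a short Python sketch. It assumes the segment has already been scaled to [−1, 1]; the function name `gramian_angular_fields` and the five sample values are hypothetical, and nested lists are used instead of an array library purely to keep the example self-contained.

```python
import math


def gramian_angular_fields(segment):
    """Return (GASF, GADF) as nested lists for a segment scaled to [-1, 1]."""
    # Clip to guard against tiny floating-point excursions outside the arccos domain.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in segment]
    # Entry (i, j) of the summation field is cos(phi_i + phi_j).
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    # Entry (i, j) of the difference field is sin(phi_i - phi_j).
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf


# Illustrative values only, not measured vibration data.
segment = [-1.0, -0.5, 0.0, 0.5, 1.0]
gasf, gadf = gramian_angular_fields(segment)
```

By construction the GASF matrix is symmetric, while the GADF matrix is antisymmetric with a zero diagonal, so the two images carry complementary information about the same segment.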

3.2 A CNN and classification model

The CNN architecture, detailed in Figure 4, was specifically engineered for feature extraction from the image-based representation of the converted vibration signals. This architecture is structurally organized into a sequence of three distinct convolutional blocks, each interspersed with a pooling layer. The final feature maps are then channeled through a flattening layer, which transitions the data into a vector format suitable for processing by the subsequent fully connected layers.

The raw time signal is converted into a 2D image with 1 channel (grayscale) using the Gramian Angular Field (GAF) method, resulting in an image with a size of 2000 × 2000. Then, this 1-channel image is converted into a 3-channel (RGB) image by duplicating the single channel into three identical channels, resulting in an image with a size of 2000 × 2000 × 3. Finally, the image is resized from 2000 × 2000 × 3 to 64 × 64 × 3 to serve as the input for the CNN model as detailed in Table 2.

Figure 4. An illustration of the CNN architecture

The input layer receives RGB images of size 64 × 64 × 3, representing the image-transformed vibration signals.

The first convolutional layer (Conv2D + ReLU) applies 32 filters of size 3 × 3, using the ReLU non-linear function to detect local features such as edges and corners.

This is followed by a MaxPooling2D layer with a 2 × 2 kernel, reducing the spatial dimensions and computational complexity.

The second convolutional layer (Conv2D + ReLU) increases feature depth with 64 filters of size 3 × 3, capturing more abstract patterns.

Another MaxPooling2D (2 × 2) layer further reduces the dimensionality.

The third convolutional layer (Conv2D + ReLU) uses 128 filters of size 3 × 3, enabling the network to learn complex and high-level representations.

A final MaxPooling2D (2 × 2) layer downsamples the spatial information before flattening.

The Flatten layer transforms the 2D feature maps into a 1D feature vector.

Finally, a Dense layer with 256 units and ReLU activation is used to project the learned features into a high-level representation suitable for classification or further analysis.

Table 2. The CNN architecture for feature extraction

| Layer Type | Parameters | Activation Function | Output Dimensions (H × W × C) |
|---|---|---|---|
| Input Layer | RGB image (64 × 64 × 3) | - | 64 × 64 × 3 |
| Conv2D (Layer 1) | 32 filters, 3 × 3 kernel | ReLU | 62 × 62 × 32 |
| MaxPooling2D (Layer 1) | 2 × 2 pool size | - | 31 × 31 × 32 |
| Conv2D (Layer 2) | 64 filters, 3 × 3 kernel | ReLU | 29 × 29 × 64 |
| MaxPooling2D (Layer 2) | 2 × 2 pool size | - | 14 × 14 × 64 |
| Conv2D (Layer 3) | 128 filters, 3 × 3 kernel | ReLU | 12 × 12 × 128 |
| MaxPooling2D (Layer 3) | 2 × 2 pool size | - | 6 × 6 × 128 |
| Flatten | - | - | 4608 (1D vector) |
| Dense Layer | 256 units | ReLU | 256 |
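
The output dimensions listed in Table 2 can be reproduced with simple arithmetic. The sketch below assumes unpadded ("valid") convolutions with stride 1 and 2 × 2 pooling with stride 2 and floor rounding; these settings are not stated explicitly in the text but are the ones consistent with the tabulated sizes. The helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution (assumed setting)."""
    return size - kernel + 1


def pool(size, window=2):
    """Spatial size after non-overlapping pooling with floor rounding (assumed setting)."""
    return size // window


def trace_shapes(height=64, width=64, filters=(32, 64, 128)):
    """Return per-layer (name, H, W, C) tuples and the flattened vector length."""
    shapes = []
    h, w = height, width
    for f in filters:
        h, w = conv_valid(h), conv_valid(w)
        shapes.append(("conv", h, w, f))
        h, w = pool(h), pool(w)
        shapes.append(("pool", h, w, f))
    return shapes, h * w * filters[-1]


shapes, flat_length = trace_shapes()
```

Under these assumptions the spatial size follows 64 → 62 → 31 → 29 → 14 → 12 → 6, and the flattened vector has 6 × 6 × 128 = 4608 elements, matching the table.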

Figure 5. A representation of the overall structure

Two independent CNN branches are designed to process GASF and GADF inputs respectively. Each CNN consists of multiple convolutional and pooling layers, followed by a flattening layer to convert the output feature maps into 1D vectors (see Figure 5):

Branch 1: CNN trained on GASF images

Branch 2: CNN trained on GADF images

Let $\mathrm{f}_{\text{GASF}} \in \mathbb{R}^{d_1}$ and $\mathrm{f}_{\text{GADF}} \in \mathbb{R}^{d_2}$ be the flattened feature vectors extracted from the two branches. The feature vectors are concatenated to form a unified representation:

$\mathrm{f}_{\text{concat}}=\left[\mathrm{f}_{\text{GASF}} ; \mathrm{f}_{\text{GADF}}\right] \in \mathbb{R}^{d_1+d_2}$     (5)

To evaluate the discriminative capability of the fused features, five classical machine learning classifiers are employed in this study, covering both linear and non-linear modeling paradigms with parameters provided in Table 3:

  • Support Vector Machine (SVM): It is highly suitable for classification tasks involving high-dimensional feature vectors and sparse datasets, demonstrating strong performance in scenarios where the number of features significantly exceeds the number of samples.
  • Random Forest (RF): An ensemble-based classifier that combines multiple Decision Trees to improve generalization and robustness against overfitting. RF is well-suited for handling noisy and complex feature distributions.
  • K-Nearest Neighbors (KNN): This algorithm is an instance-based, non-parametric classifier that determines the class assignment of a novel data point by performing a majority vote among the labels of the K data points that are closest in the defined feature space. The KNN is simple yet effective, especially when the feature distribution preserves local structures.
  • Logistic Regression (LR): A linear model that estimates the probability of class membership using the logistic function. It serves as a strong baseline for evaluating feature separability, particularly in linearly separable problems.
  • Multi-Layer Perceptron (MLP): A feedforward artificial neural network that consists of an input layer, one or more hidden layers, and an output layer. The MLP is capable of modeling intricate, non-linear dependencies embedded within the dataset.

The hyperparameters of each classifier are selected by grid search combined with 5-fold cross-validation.

Table 3. Optimal hyperparameters for each classifier

| Classifier | Best Hyperparameters |
|---|---|
| Support Vector Machine (SVM) | 'C': 1, 'gamma': 'scale', 'kernel': 'rbf' |
| Random Forest (RF) | 'max_depth': 10, 'max_features': 'sqrt', 'min_samples_split': 2, 'n_estimators': 200 |
| K-Nearest Neighbors (KNN) | 'algorithm': 'auto', 'n_neighbors': 5, 'weights': 'uniform' |
| Logistic Regression (LR) | 'C': 1, 'max_iter': 100, 'solver': 'lbfgs' |
| Multi-Layer Perceptron (MLP) | 'activation': 'relu', 'hidden_layer_sizes': (100,), 'learning_rate': 'adaptive', 'max_iter': 300, 'solver': 'adam' |

These classifiers were selected for their diverse algorithmic foundations (linear, non-linear, and ensemble), allowing for a comprehensive evaluation of the fused feature set.
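
The selection procedure, grid search with 5-fold cross-validation, can be sketched generically as follows. This is an illustration of the technique rather than the authors' code: the candidate grid values are hypothetical (Table 3 reports only the selected values), folds are contiguous with no shuffling or stratification for brevity, and `fit_and_score` is a placeholder callback that in practice would train a classifier on the training indices and return its validation accuracy.

```python
from itertools import product


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def grid_search(param_grid, n_samples, fit_and_score, k=5):
    """Return the parameter combination with the best mean k-fold score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [fit_and_score(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Hypothetical candidate values; only the selected ones appear in Table 3.
rf_grid = {"n_estimators": [100, 200], "max_depth": [5, 10]}


def placeholder_score(params, train_idx, val_idx):
    # Stand-in for "fit on train_idx, evaluate on val_idx"; not a real model.
    return 0.5 + 0.001 * params["n_estimators"] + 0.01 * params["max_depth"]


best_params, best_score = grid_search(rf_grid, n_samples=20, fit_and_score=placeholder_score)
```

Every combination in the grid is scored on the same five folds, so the comparison between candidates is made on identical data partitions.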

4. Experimental Results and Discussions

4.1 Experimental setup

Automatically diagnosing transformers (TFs) operating under real-world conditions still faces many limitations due to the complexity of the task. TFs are often used to monitor and control production processes in industrial environments. However, identifying and analyzing signals from TFs, such as vibration and noise from operating equipment, is often challenging due to significant interference and fluctuations in surrounding environmental factors. Vibration and noise signals typically contain unwanted disturbances and fluctuations, ranging from external forces such as vibrations from nearby machinery to environmental factors like temperature and humidity variations. This makes signal analysis and processing difficult, especially when aiming to apply automation methods. One of the biggest challenges is selecting appropriate feature values to effectively represent the signal. These feature values need to fully reflect the critical characteristics of the signal while suppressing environmental noise. Selecting suitable features is not only a technical challenge but also requires a deep understanding of both the data and the production process. Once a good set of feature values is obtained, building an effective recognition model becomes more feasible. Such a model can use machine learning and artificial intelligence methods to classify and predict issues in the production process based on the signals collected from TFs. To achieve the best results, however, it is necessary to combine a comprehensive understanding of the production field with sophisticated techniques in signal processing and machine learning.

This paper constructs an experimental model to collect samples from a 3-phase 380 V isolation transformer with a 5 kVA capacity. The transformer's vibration data are collected using a vibration measurement device at a sampling frequency of 2 kHz, stored on an SD card, and transmitted via Wi-Fi. These data are used for training the AI model. The experimental setup is presented in Figure 6 and Figure 7, with the goal of collecting diverse and realistic data. During the experiment, the operating states of the transformer are categorized into four types, one normal state and three fault states, to create diversity and challenges in training the AI model:

  • Normal Operation (N): This is the desired state.
  • Loose Windings in Primary and Secondary Coils (E1): A common issue where loose windings can occur due to large currents causing vibrations, leading to mechanical failure and reduced device lifespan.
  • Loose Mounting Bolts of the Transformer (E2): A dangerous fault where loose bolts can destabilize critical components, causing imbalance and increased vibrations.
  • Overload Fault (E3): When the transformer operates beyond its permissible overload duration, current and temperature can rise suddenly, stressing the transformer parts and increasing vibrations. Overload can cause:
      • Temperature Rise: Uneven expansion and structural changes in transformer parts, creating vibrations.
      • Sudden Increase in Current and Temperature: Mechanical stress on transformer parts, causing vibrations.

Figure 6. Proposed signal processing and classification pipeline

(a) Proposed model

(b) Real devices

Figure 7. Experimental model and equipment positioning on the transformer

To address this issue, an early detection system is needed to perform inspections, maintenance, and repairs to mitigate these phenomena and ensure the safety and performance of the transformer. For this reason, the paper proposes using vibration signals as input for early fault diagnosis. The experimental model and the data collection setup for vibration data are depicted in Figure 7.

In this study, the vibration signal is divided into windows of 2000 points with an overlap ratio of 91%, so consecutive windows are offset by 180 points. At the 2 kHz sampling frequency, each window corresponds to a duration of one second, allowing for the monitoring of system changes in short time frames.
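
A minimal sketch of this sliding-window segmentation is given below. The function name `sliding_windows` is hypothetical, and the input is a placeholder sequence standing in for 5 s of signal sampled at 2 kHz; only the window size (2000 points) and the overlap ratio (91%) come from the text.

```python
def sliding_windows(signal, window=2000, overlap=0.91):
    """Split a 1D signal into fixed-length overlapping windows."""
    # Step between window starts: 2000 * (1 - 0.91) = 180 points.
    step = round(window * (1.0 - overlap))
    if step < 1:
        raise ValueError("overlap too large for the given window size")
    last_start = len(signal) - window
    return [signal[s:s + window] for s in range(0, last_start + 1, step)]


# Placeholder data: 10000 points, i.e. 5 s at 2 kHz (not measured vibration data).
signal = list(range(10000))
windows = sliding_windows(signal)
```

Because the step is much smaller than the window, a short recording yields many training segments; here a 5 s placeholder signal produces 45 windows of one second each.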

4.2 Evaluation metrics

To appraise the performance of the classification models, four key evaluation metrics were employed: Accuracy, the overall rate of correct predictions; Precision, the proportion of positive predictions that were verifiably correct; Recall, the capacity of the model to identify all actual positive instances; and the F1-score. The F1-score, as the harmonic mean of Precision and Recall, was specifically included to offer a consolidated performance measure that mitigates the effects of class imbalance (see Figure 8 and Figure 9).

Figure 8. Performance of the predictive computational frameworks

Figure 9. Confusion matrices of different classification models

  • Accuracy: Calculated as Accuracy $=\frac{T P+T N}{T P+T N+F P+F N}$ where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
  • Precision: Defined as Precision $=\frac{T P}{T P+F P}$, measuring the accuracy of positive predictions.
  • Recall: Computed as Recall $=\frac{T P}{T P+F N}$, indicating the ability to identify all positive instances.
  • F1-score: Given by F1-score $=2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision}+\text{Recall}}$, balancing precision and recall.
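
For a four-class problem, these metrics are typically computed per class in a one-vs-rest fashion and then averaged. The sketch below assumes macro-averaging, which the text does not specify, and uses a hypothetical confusion matrix that is not taken from Figure 9.

```python
def classification_metrics(confusion):
    """Accuracy plus macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                        # true k, predicted otherwise
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical counts for classes N, E1, E2, E3 (rows: true, columns: predicted).
example_confusion = [
    [9, 1, 0, 0],
    [1, 8, 1, 0],
    [0, 0, 10, 0],
    [0, 1, 0, 9],
]
metrics = classification_metrics(example_confusion)
```

With balanced classes, as in this study's dataset, macro-averaged recall coincides with overall accuracy, which is one reason the two columns in a results table can look nearly identical.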

4.3 Results

An accelerometer placed on top of the transformer records vibration signals at a sampling frequency of 2 kHz.

The dataset is divided into four classes. Three of these classes correspond to transformer faults (E1-loose windings, E2-loose mounting bolts, and E3-overload), while one class represents the transformer in normal condition (see Table 4 and Table 5).

Table 4. Sample dataset collected from the experimental model at 50% load

| Type of Error | Sampling Frequency (Hz) | Number of Samples | Time per Sample (mins) | Window Size (points) | Overlap (%) |
|---|---|---|---|---|---|
| N | 2000 | 2 | 11 | 2000 | 91 |
| E1 | 2000 | 2 | 11 | 2000 | 91 |
| E2 | 2000 | 2 | 11 | 2000 | 91 |
| E3 | 2000 | 2 | 11 | 2000 | 91 |

Table 5. Training, validation, and test datasets

| Class | Train (70%) | Validation (15%) | Test (15%) | Total Samples |
|---|---|---|---|---|
| N | 1120 | 240 | 240 | 1600 |
| E1 | 1120 | 240 | 240 | 1600 |
| E2 | 1120 | 240 | 240 | 1600 |
| E3 | 1120 | 240 | 240 | 1600 |
| Total | 4480 | 960 | 960 | 6400 |
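
The per-class 70/15/15 partition in Table 5 can be sketched as follows. Random shuffling within each class and the fixed seed are assumptions, since the text does not state how samples were assigned to each subset; the function name `split_per_class` and the integer placeholders standing in for GAF image samples are hypothetical.

```python
import random


def split_per_class(samples_by_class, train_frac=0.70, val_frac=0.15, seed=0):
    """Split each class into train/validation/test subsets with the same proportions."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)  # assumed: random assignment within each class
        n_train = round(len(shuffled) * train_frac)
        n_val = round(len(shuffled) * val_frac)
        splits["train"] += [(label, s) for s in shuffled[:n_train]]
        splits["val"] += [(label, s) for s in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(label, s) for s in shuffled[n_train + n_val:]]
    return splits


# Placeholder sample identifiers: 1600 per class, as in Table 5.
samples_by_class = {label: list(range(1600)) for label in ("N", "E1", "E2", "E3")}
splits = split_per_class(samples_by_class)
counts = {name: len(items) for name, items in splits.items()}
```

Splitting class by class keeps the subsets balanced, giving 1120, 240, and 240 samples per class and the totals of 4480, 960, and 960 shown in the table.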

Table 6. Performance of machine learning models

| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Random Forest | 0.94 | 0.95 | 0.94 | 0.94 |
| SVM | 0.91 | 0.92 | 0.91 | 0.91 |
| KNN | 0.86 | 0.85 | 0.84 | 0.84 |
| Logistic Regression | 0.70 | 0.75 | 0.70 | 0.70 |
| MLP | 0.68 | 0.69 | 0.70 | 0.69 |

As detailed in Table 6, the performance efficacy of five distinct classification algorithms was rigorously assessed utilizing four fundamental metrics: accuracy, precision, recall, and the F1-score. The Random Forest classifier consistently yielded the highest measured values across these indices, reporting a maximal accuracy of 0.94, a precision of 0.95, and an F1-score of 0.94, suggesting substantial robustness and pronounced predictive validity.

The SVM yielded slightly lower results with an accuracy of 0.91 and an F1-score of 0.91 yet maintained good balance between precision (0.92) and recall (0.91), demonstrating its effectiveness in handling high-dimensional data.

In contrast, the K-Nearest Neighbors (KNN) model showed a moderate performance with an accuracy of 0.86 and F1-score of 0.84, suggesting it is less capable of capturing complex decision boundaries, especially in the presence of noise.

The Logistic Regression and Multi-Layer Perceptron (MLP) models performed significantly worse, with accuracies of 0.70 and 0.68, respectively. While Logistic Regression reached a relatively high precision (0.75), its recall and F1-score were notably lower (0.70), revealing limitations in model generalization. Similarly, MLP exhibited balanced but low performance across all metrics, which may result from underfitting or inadequate network capacity.

5. Conclusion and Future Work

This study successfully demonstrated a novel methodology for fault diagnosis in dry-type transformers by converting vibration signals into GAF images for analysis by a CNN and machine learning classifiers. The proposed framework, particularly when utilizing a Random Forest classifier, achieved 94% accuracy in identifying loose windings, loose mounting bolts, and overload conditions, marking a significant improvement over traditional diagnostic techniques.

The integration of this AI-driven approach with IoT for real-time data acquisition presents a scalable and automated solution for predictive maintenance. This paradigm enhances operational reliability, reduces lifecycle costs, and mitigates the risk of catastrophic failure.

For successful deployment in real-world applications, several key factors need to be considered. The reliability of vibration data depends on the proper placement of sensors on the transformer, which requires careful design to ensure accurate signal collection. Additionally, the process of converting signals into GAF images and analyzing them with models requires significant computational power, meaning that strong edge computing systems are needed for real-time processing at the source. This adds to the system's complexity and cost. Another challenge is ensuring the system can handle variations in operating conditions, such as temperature fluctuations or noise interference, which may affect the quality of the data.

Future research should focus on optimizing sensor placement and improving edge computing algorithms to enhance real-time data processing. Additionally, multi-sensor data fusion and more advanced deep learning architectures should be explored to further increase diagnostic precision. Ultimately, this work contributes to the development of more intelligent and cost-effective asset management strategies for the power industry.

  References

[1] Çınar, Z.M., Nuhu, A.A., Zeeshan, Q., Korhan, O., Asmael, M., Safaei, B. (2020). Machine learning in predictive maintenance towards sustainable smart manufacturing in industry 4.0. Sustainability, 12(19): 8211. https://doi.org/10.3390/SU12198211

[2] Rai, A., Upadhyay, S.H. (2016). A review on signal processing techniques utilized in the fault diagnosis of rolling element bearings. Tribology International, 96: 289-306. https://doi.org/10.1016/j.triboint.2015.12.037

[3] Saufi, S.R., Ahmad, Z.A.B., Leong, M.S., Lim, M.H. (2019). Challenges and opportunities of deep learning models for machinery fault detection and diagnosis: A review. IEEE Access, 7: 122644-122662. https://doi.org/10.1109/ACCESS.2019.2938227

[4] Zhao, Z.B., Wu, J.Y., Li, T.F., Sun, C., Yan, R.Q., Chen, X.F. (2021). Challenges and opportunities of AI-enabled monitoring, diagnosis & prognosis: A review. Chinese Journal of Mechanical Engineering, 34: 56. https://doi.org/10.1186/s10033-021-00570-7

[5] Rahman, T.A.Z., Chek, L.W., Ramli, N. (2022). Intelligent vibration-based anomaly detection for electric motor condition monitoring. In 2022 9th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS2022), Bam, Iran.

[6] Tahir, M.M., Khan, A.Q., Iqbal, N., Hussain, A., Badshah, S. (2016). Enhancing fault classification accuracy of ball bearing using central tendency based time domain features. IEEE Access, 5: 72-83. https://doi.org/10.1109/ACCESS.2016.2608505

[7] Chen, Z.Q., Li, C., Sanchez, R.V. (2015). Gearbox fault identification and classification with convolutional neural networks. Shock and Vibration, 2015: 390134. https://doi.org/10.1155/2015/390134

[8] Zhao, J., Yang, S.P., Li, Q., Liu, Y.Q., Gu, X.H., Liu, W.P. (2021). A new bearing fault diagnosis method based on signal-to-image mapping and convolutional neural network. Measurement, 176: 109088. https://doi.org/10.1016/j.measurement.2021.109088

[9] Gao, Y., Liu, X.Y., Huang, H.Z., Xiang, J.W. (2021). A hybrid of FEM simulations and generative adversarial networks to classify faults in rotor-bearing systems. ISA Transactions, 108: 356-366. https://doi.org/10.1016/j.isatra.2020.08.012

[10] Liu, H., Zhou, J.Z., Xu, Y.H., Zheng, Y., Peng, X.L., Jiang, W. (2018). Unsupervised fault diagnosis of rolling bearings using a deep neural network based on generative adversarial networks. Neurocomputing, 315: 412-424. https://doi.org/10.1016/j.neucom.2018.07.034

[11] Wang, R.X., Jiang, H.K., Li, X.Q., Liu, S.W. (2020). A reinforcement neural architecture search method for rolling bearing fault diagnosis. Measurement, 154: 107417. https://doi.org/10.1016/j.measurement.2019.10741

[12] Almatheel, Y.A., Osman, M. (2021). Bearing element fault diagnosis using support vector machine. In 2020 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE), Khartoum, Sudan, pp. 1-5. https://doi.org/10.1109/ICCCEEE49695.2021.9429590

[13] Zhang, X.Y., Liang, Y.T., Zhou, J.Z., Zang, Y. (2015). A novel bearing fault diagnosis model integrated permutation entropy, ensemble empirical mode decomposition and optimized SVM. Measurement, 69: 164-179. https://doi.org/10.1016/j.measurement.2015.03.017

[14] Huo, D., Kang, Y., Wang, B., Feng, G., Zhang, J., Zhang, H. (2022). Gear fault diagnosis method based on multi-sensor information fusion and VGG. Entropy, 24(11): 1618. https://doi.org/10.3390/e24111618

[15] Pham, V.N., Tran, H.L. (2022). Electrocardiogram (ECG) circuit design and using the random forest for ECG arrhythmia classification. In Advances in Engineering Research and Application. ICERA 2022. Lecture Notes in Networks and Systems, pp. 477-494. https://doi.org/10.1007/978-3-031-22200-9_54

[16] Shen, J.T., Wu, Z., Cao, Y.C., Zhang, Q., Cui, Y.P. (2024). Research on fault diagnosis of rolling bearing based on gramian angular field and lightweight model. Sensors, 24(18): 5952. https://doi.org/10.3390/s24185952

[17] Bai, R.Y., Wang, H.W., Sun, W.L., Shi, Y.X. (2024). Fault diagnosis method for rotating machinery based on SEDenseNet and Gramian Angular Field. Eksploatacja i Niezawodność – Maintenance and Reliability, 26(4): 191445. https://doi.org/10.17531/ein/191445

[18] Zhou, Z.J., Ai, Q.S., Lou, P., Hu, J.M., Yan, J.W. (2024). A novel method for rolling bearing fault diagnosis based on Gramian Angular Field and CNN-ViT. Sensors, 24(12): 3967. https://doi.org/10.3390/s24123967

[19] Yu, S.B., Liu, Z.S., Wang, S.B., Zhang, G.R. (2024). A novel adaptive gramian angle field based intelligent fault diagnosis for motor rolling bearings. Journal of Physics: Conference Series, 2785: 012042. https://doi.org/10.1088/1742-6596/2785/1/012042

[20] Cai, C.Z., Li, R.L., Ma, Q., Gao, H.F. (2023). Bearing fault diagnosis method based on the Gramian angular field and an SE-ResNeXt50 transfer learning model. Insight - Non-Destructive Testing and Condition Monitoring, 65(12): 695-704. https://doi.org/10.1784/insi.2023.65.12.695