Joint processing technology of laser radar and optical image for power distribution

This section evaluates the efficiency of the Multimodal Data Fusion and Hybrid Deep Learning Model (MDF-HDL) for fault classification and localization in power distribution networks. This work uses the ArcGIS Power Line Classification Project33 and the Awesome 3D LiDAR Datasets34 to build the multimodal data fusion system. The ArcGIS dataset provides georeferenced, annotated point clouds of power line environments, supplying a real-world baseline of wires, poles, and background objects for the classification tasks.

Dataset description

The ArcGIS Power Line Classification Project and the Awesome 3D LiDAR Datasets are the two main datasets used in this investigation. There are over 3 million annotated point cloud samples in the ArcGIS collection that depict actual power line settings, complete with cables, poles, and background objects. For applications requiring the location and categorization of faults in power distribution networks, this data offers a realistic geographic baseline.

The Awesome 3D LiDAR Datasets were chosen based on their applicability for urban odometry, localization, and segmentation applications, taking into account experimental parameters like scale, objectives, and sensor type. There are about 2 million high-resolution 3D point clouds in the collection. Both datasets were divided into 25-meter spatial blocks with a maximum of 8,192 points each during data processing. The application of normalization and noise filtering techniques enhanced the quality of the data.
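The block-partitioning step described above can be sketched as follows. This is a minimal NumPy illustration under our own naming (`partition_point_cloud`, `BLOCK_SIZE`, `MAX_POINTS`), not the authors' released code; only the 25 m block size and 8,192-point cap come from the paper.

```python
import numpy as np

BLOCK_SIZE = 25.0    # metres per spatial block (from the preprocessing description)
MAX_POINTS = 8192    # cap on points per block (from the preprocessing description)

def partition_point_cloud(points, block_size=BLOCK_SIZE, max_points=MAX_POINTS, seed=0):
    """Split an (N, 3) point cloud into block_size x block_size ground tiles,
    randomly subsampling any tile that exceeds max_points."""
    rng = np.random.default_rng(seed)
    # Integer grid cell for every point, based on its x/y coordinates.
    cells = np.floor(points[:, :2] / block_size).astype(int)
    blocks = {}
    for cell in np.unique(cells, axis=0):
        mask = np.all(cells == cell, axis=1)
        block = points[mask]
        if len(block) > max_points:
            idx = rng.choice(len(block), max_points, replace=False)
            block = block[idx]
        # Normalise each block to zero mean so coordinates are comparable across blocks.
        blocks[tuple(cell)] = block - block.mean(axis=0)
    return blocks

# Example: 100k synthetic points over a 100 m x 100 m area -> a 4 x 4 grid of blocks.
pts = np.random.default_rng(1).uniform(0, 100, size=(100_000, 3))
blocks = partition_point_cloud(pts)
print(len(blocks), max(len(b) for b in blocks.values()))
```

Noise filtering (e.g. statistical outlier removal) would run before this step; it is omitted here for brevity.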

The dataset was split into 80% training, 10% validation, and 10% testing subsets, partitioned both geographically and temporally to ensure reliable model assessment and avoid overfitting. Stratified sampling guaranteed that each fault category was proportionately represented in every subset, keeping class representation consistent across splits. To further reduce imbalance and improve generalization, data augmentation (random rotations, horizontal and vertical mirroring, and controlled noise injection) was applied to underrepresented fault classes, and a weighted categorical cross-entropy loss was used during training so that minority classes contributed adequately to the optimization process. Together, these measures mitigated class disparity and improved the model's reliability and fairness in fault identification across a range of operating conditions. After feature extraction with CNN layers for image data and 3D convolutional networks or RandLA-Net for LiDAR data, Kalman filtering fused the multimodal features robustly. For classification, Adam-optimized decision trees with early stopping (patience of 8 epochs) were used to avoid overfitting. This description of the dataset and preprocessing workflow supports the reproducibility and reliability of the fault detection results.
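The stratified 80/10/10 split and the inverse-frequency class weights for a weighted categorical cross-entropy can be sketched with scikit-learn. The helper names are ours and the labels below are synthetic; this only illustrates the partitioning and weighting scheme described above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def stratified_80_10_10(X, y, seed=42):
    """80/10/10 stratified split: every fault class keeps its proportion
    in each subset, mirroring the partitioning described in the text."""
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

def class_weights(y):
    """Inverse-frequency weights for a weighted categorical cross-entropy:
    rarer fault classes receive proportionally larger loss weights."""
    classes, counts = np.unique(y, return_counts=True)
    w = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), w.tolist()))

# Toy labels with a 70/20/10 class imbalance.
y = np.array([0] * 700 + [1] * 200 + [2] * 100)
X = np.zeros((len(y), 4))
train, val, test = stratified_80_10_10(X, y)
print(len(train[1]), len(val[1]), len(test[1]))  # 800 100 100
print(class_weights(y))
```

With `stratify=y`, each subset preserves the 70/20/10 class proportions exactly, which is what keeps minority fault classes represented in validation and testing.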

Experimental setup

The experimental setup ensures multimodal data fusion and reproducibility. The study uses the ArcGIS Power Line Classification Project and the Awesome 3D LiDAR Datasets, obtained from ArcGIS Hub and GitHub; the ArcGIS collection contains roughly 3 million point cloud samples and the LiDAR collection roughly 2 million. High-resolution 3D LiDAR point clouds and GIS power line images are included. Data are preprocessed by dividing them into 25-meter blocks with up to 8,192 points per block, followed by noise filtering and normalization. Modality-specific feature extraction uses CNN layers for images and 3D convolutional networks or RandLA-Net for LiDAR. The extracted features are integrated with Kalman filtering for robust multimodal fusion. Decision tree models optimized with the Adam optimizer (learning rate = 0.001, batch size = 64, 50 epochs) perform classification, while GIS mapping with 1 m spatial accuracy handles fault localization. The model achieves 98.9% accuracy, 98.7% precision, 98.3% recall, and a 98.5% F1-score. Experimental results show real-time performance at over 80 frames per second, with an average end-to-end latency of 12.5 ms per sample on a high-performance workstation (NVIDIA RTX 3090 GPU, Intel i9 CPU). Latency rises to 20–35 ms on mid-range GPUs such as the RTX 3060, which still satisfies real-time needs. Efficient deployment with TensorRT and FP16 quantization keeps latency below 100 ms even on low-end GPUs, meeting near-real-time requirements for field applications. Optimization techniques (model pruning, quantization, and asynchronous execution) were implemented, along with a detailed latency breakdown across the preprocessing, feature extraction, fusion, and inference phases, to verify and maintain real-time operation under various hardware constraints. The implementation environment runs Python 3.9, TensorFlow 2.x, NumPy, and the ArcGIS API on an NVIDIA RTX 3090 GPU, Intel i9-10900K CPU, and 64 GB RAM.
The dataset is split into 80% training, 10% validation, and 10% testing sets, with the Adam optimizer used for optimization and early stopping (patience = 8 epochs) to prevent overfitting.
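The early-stopping rule (patience = 8 epochs) can be illustrated with a minimal monitor class. This is our own sketch of the stopping criterion, not the authors' training code; the loss curve below is simulated.

```python
class EarlyStopping:
    """Minimal early-stopping monitor (patience = 8 epochs, as in the text):
    training halts once the validation loss fails to improve for `patience`
    consecutive epochs."""
    def __init__(self, patience=8):
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience

# Simulated validation-loss curve: improves for 5 epochs, then plateaus.
losses = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.45] * 20
stopper = EarlyStopping(patience=8)
stopped_at = next(i for i, loss in enumerate(losses) if stopper.step(loss))
print(stopped_at)  # 12: eight non-improving epochs after the best loss at epoch 4
```

In a Keras pipeline the same behaviour would come from `tf.keras.callbacks.EarlyStopping(patience=8)` passed to `model.fit`.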

Collecting and synchronizing disparate datasets such as ArcGIS power line images and Awesome 3D LiDAR point clouds ensures spatial alignment for consistent feature mapping in multimodal data fusion. Data quality is improved via noise filtering, normalization, and temporal-spatial alignment for each modality. Convolutional neural networks (CNNs) for image data and 3D convolutional networks for LiDAR data extract rich, modality-specific features. Kalman filtering iteratively merges complementary information from each source to improve resilience and accuracy. The fused multimodal feature representation is then fed to Adam-optimized decision trees for accurate fault classification, with GIS mapping used for fault localization. This systematic fusion approach leverages the complementary strengths of the modalities and addresses data heterogeneity, yielding accurate, real-time fault detection in power distribution networks.
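The Kalman fusion step can be sketched as a single variance-weighted measurement update over per-modality feature vectors. The exact filter formulation used in the pipeline is not specified in the text, so the function below is an illustrative assumption with our own naming (`kalman_fuse`): the noisier modality is down-weighted by the per-feature Kalman gain.

```python
import numpy as np

def kalman_fuse(z_img, var_img, z_lidar, var_lidar):
    """One Kalman-style measurement update fusing an image-derived feature
    vector with a LiDAR-derived one, each weighted by its measurement
    variance. The lower-variance modality dominates the fused estimate."""
    gain = var_img / (var_img + var_lidar)      # Kalman gain per feature
    fused = z_img + gain * (z_lidar - z_img)    # corrected estimate
    fused_var = (1 - gain) * var_img            # reduced uncertainty after fusion
    return fused, fused_var

# Toy example: image features are noisier (variance 4.0) than LiDAR (1.0),
# so the fused estimate sits closer to the LiDAR measurement.
z_img = np.array([1.0, 2.0])
z_lidar = np.array([3.0, 4.0])
fused, fused_var = kalman_fuse(z_img, 4.0, z_lidar, 1.0)
print(fused, fused_var)  # [2.6 3.6] 0.8
```

Note that the fused variance (0.8) is smaller than either input variance, which is the sense in which the fusion "improves resilience and accuracy".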

Figure 6 presents a complete data analysis after feature extraction and multimodal fusion. The comparison histogram (Fig. 6a), plotting Feature Value (x-axis) against Frequency (y-axis), shows that the fused feature distribution reduces variance while retaining the important bimodal characteristics of the LiDAR and optical modalities, indicating effective integration of complementary information. The feature variance plot (Fig. 6b), plotting Feature Index (x-axis) against Variance (y-axis), shows that fused features represent samples smoothly and uniformly, confirming the distributional homogeneity attained by fusion. The spectral energy plot (Fig. 6c), plotting Frequency (Hz) against Spectral Energy (dB), shows energy concentrated at lower frequencies, validating that the fusion process suppressed high-frequency noise. PCA results (Fig. 6d), plotting Principal Component (x-axis) against Explained Variance (%), show that the first two components explain virtually all of the variance, demonstrating that most relevant information is retained post-fusion and emphasizing dimensionality reduction efficiency. The feature importance ranking (Fig. 6e), plotting Feature Rank (x-axis) against Importance Score (0–1 scale), shows that fused features obtain the highest importance scores, surpassing the separate modalities and underscoring their influence on classification accuracy. Finally, t-SNE clustering (Fig. 6f), comparing t-SNE Dimension 1 against Dimension 2, shows well-separated clusters, demonstrating that the fused feature space improves inter-class separability and classification over unimodal representations.

Fig. 6
figure 6

Efficiency analysis of feature extraction and fusion. (a) Feature distribution before/after fusion. (b) Feature variance distribution. (c) Spectral energy of fused features. (d) PCA explained variance. (e) Feature importance ranking. (f) t-SNE feature cluster analysis.

The deep learning-based fault classification system achieves an accuracy of 98.91% with a 3-layer DNN architecture, remarkably close to the proposed ceiling of 99.4% shown in Fig. 7a. Subfigure (a) plots Accuracy (%) against Epoch, showing how the model's performance improves over training iterations. The training convergence analysis (Fig. 7b), plotting Loss Value against Epoch, shows that optimization is achieved without considerable overfitting, as the training and validation loss curves converge closely. The neural network training state plot displays the evolution of the gradient magnitude, validation checks, and learning rate scalar (µ), confirming effective validation during training, stable convergence, and appropriate learning rate adaptation. The model's discriminative power is illustrated by the ROC curve (AUC = 0.992, Fig. 7c), plotting False Positive Rate (x-axis) against True Positive Rate (y-axis), and by the precision-recall characteristics (Fig. 7d), affirming that the model maintains good discriminative ability across all operational thresholds.
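A one-vs-rest ROC/AUC evaluation of the kind summarized in Fig. 7c can be reproduced with scikit-learn. The labels and scores below are synthetic stand-ins for the classifier's softmax outputs, so the resulting AUC is illustrative rather than the reported 0.992.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Synthetic one-vs-rest setup for a single fault class (e.g. "Class B"):
# positives get scores shifted upward to mimic a strong classifier.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_score = rng.normal(loc=y_true * 2.5, scale=1.0)

fpr, tpr, _ = roc_curve(y_true, y_score)   # points of the ROC curve
auc = roc_auc_score(y_true, y_score)       # area under that curve
print(round(auc, 3))
```

For the multi-class case, `roc_auc_score(y_true, y_proba, multi_class="ovr")` averages the per-class one-vs-rest AUCs.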

Fig. 7
figure 7

Efficiency analysis of deep learning-based fault classification. (a) Accuracy (b) loss convergence (c) ROC curve (Fault Class B) (d) Precision-recall curve (e) Confusion matrix (f) Classification confidence distribution.

The confusion matrix (Fig. 7e) shows 485 out of 500 samples classified correctly, with only five of the misclassifications involving confusion between fault subtypes B and C. The softmax distributions in Fig. 7f indicate an overwhelming dominance of high-confidence predictions: 89% of all samples were classified with over 95% certainty, affirming that the system can be reliably deployed in industrial settings. Subfigure (f) plots Classification Confidence (%) against Sample Count, showing the distribution of reliability across samples. To ensure visual consistency and analytical clarity, all probability and rate metrics are expressed as percentages (%), and the legends in each subfigure clearly indicate training, validation, and testing outcomes.

Refining the decision tree yields notable performance gains, reducing the misclassification rate by 60–80% across all fault types (Fig. 8a), which plots Fault Type (x-axis) against Misclassification Rate (%) to highlight error reduction across categories. Maximum accuracy is attained at a depth of 5 (Fig. 8b), which plots Tree Depth (levels) against Accuracy (%) to determine the ideal model complexity. The probability distribution of classification confidence exhibits a steep increase (Fig. 8c), which plots Confidence Probability (0–1 scale) against Fault Class to show the degree of classification certainty. Gini impurity analysis reveals a strong negative correlation with decision confidence (Fig. 8d), and the feature importance ranking identifies voltage and current measurements as the most relevant inputs (Fig. 8e), which plots Feature Index against Importance Value (normalized). Finally, the accuracy comparison (Fig. 8f) shows an average absolute improvement of 5–8% across all fault types after refinement, confirming that the optimization process worked effectively. After refinement, the model attains an accuracy between 93% and 98% while retaining an interpretable structure thanks to the decision tree.
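The depth sweep behind Fig. 8b can be sketched with scikit-learn's `DecisionTreeClassifier`. The data here are synthetic, so the best depth found below is illustrative rather than the reported value of 5; the point is the selection procedure, not the number.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the fused feature set (4 fault classes).
X, y = make_classification(n_samples=2000, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25,
                                            random_state=0, stratify=y)

# Sweep tree depth and score each model on the held-out validation set.
scores = {}
for depth in range(1, 11):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_tr, y_tr)
    scores[depth] = tree.score(X_val, y_val)

best_depth = max(scores, key=scores.get)
print(best_depth, round(scores[best_depth], 3))
```

Picking the depth that maximizes validation accuracy is the standard guard against the over-deep trees that drive the Gini-impurity/confidence trade-off shown in Fig. 8d.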

Fig. 8
figure 8

Refinement efficiency analysis. (a) Mis-classification rate (b) tree depth vs. accuracy (c) classification confidences (d) Gini impurity vs. confidence (e) feature importance (f) accuracy comparison.

In addition, the efficiency of the Multimodal Data Fusion and Hybrid Deep Learning Model (MDF-HDL) is compared with existing approaches such as the capsule network (CapsNet)15, Neural Architecture Search (NAS)17, the spatial-temporal recurrence neural network (STRGNN)35, and 1D convolutional neural networks (CNN)36; the obtained results are shown in Table 2.

Table 2 Comparative analysis of MDF-HDL.

Table 2 shows that the MDF-HDL framework outperforms previous techniques across all key performance criteria. At 98.91%, MDF-HDL surpasses CapsNet (95.24%), NAS (96.27%), STRGNN (94.35%), and 1D-CNN (92.98%) in fault classification and localization accuracy. Beyond accuracy, MDF-HDL achieves higher precision (0.9837) and recall (0.9931), yielding an F1-score of 0.9893 and indicating a better balance between false positives and false negatives. This advantage reflects its ability to detect faults under difficult conditions. MDF-HDL also reduces training time (65.1 min) compared to NAS (210.31 min) and CapsNet (120.3 min) while retaining a competitive inference time of 12.5 ms for real-time applications. In terms of data efficiency, MDF-HDL achieves 86% accuracy with 10k samples, surpassing the benchmark models, which range from 64 to 77% under the same data constraints. These results demonstrate that the proposed framework is practical for real-world multimodal fault diagnosis and localization tasks, owing to its state-of-the-art accuracy, faster training convergence, lower computational cost, and efficient use of limited training data.

The claim of low computational complexity, even when integrating several techniques (k-means clustering, deep learning layers, Kalman filtering, and decision trees), has been validated by a thorough computational complexity analysis. The analysis breaks down each component's time and space complexity: k-means clustering runs in \(O(n \times k \times t \times d)\), where \(n\) is the number of samples, \(k\) the number of clusters, \(t\) the iterations, and \(d\) the dimensionality; the deep learning layers are optimized to control complexity by balancing model depth and parameter count; Kalman filtering incurs quadratic complexity \(O(m^2)\) in the fused feature dimension \(m\); and the Adam-optimized decision trees maintain efficient classification with per-sample complexity of approximately \(O(\log n)\). The low computational cost claim is supported by experimental runtime profiling on a realistic hardware setup (NVIDIA RTX 3090 GPU, Intel i9-10900K CPU), which verified an inference time of roughly 12.5 ms per sample. These outcomes demonstrate the system's suitability for realistic, real-time fault detection in complex power distribution networks. The supplementary materials provide full computational details and profiling.
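A per-stage latency breakdown of the kind described above can be collected with a simple timing context manager. The stage bodies below are placeholders for the real preprocessing, feature extraction, fusion, and inference code; only the measurement pattern is the point.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def stage(name):
    """Record the wall-clock duration of one pipeline stage, in milliseconds."""
    t0 = time.perf_counter()
    yield
    timings[name] = (time.perf_counter() - t0) * 1000.0

# Placeholder workloads standing in for the real pipeline stages.
with stage("preprocessing"):
    _ = sum(range(10_000))
with stage("feature_extraction"):
    _ = sum(range(10_000))
with stage("fusion"):
    _ = sum(range(10_000))
with stage("inference"):
    _ = sum(range(10_000))

total_ms = sum(timings.values())
print({k: round(v, 3) for k, v in timings.items()}, round(total_ms, 3))
```

Averaging such per-stage timings over many samples is how an end-to-end figure like 12.5 ms/sample can be attributed to individual phases.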

Statistical results

The benchmark datasets underwent additional tests using 5-fold cross-validation. To portray the consistency of the model, performance metrics (accuracy, precision, recall, and F1-score) are reported as the mean ± standard deviation across all folds. Sensitivity assessments were also conducted on critical hyperparameters, including the learning rate and batch size, to validate the model's stable performance within practical parameter ranges. These additional statistical evaluations address concerns regarding variability and reinforce confidence in the model's application to fault detection in power distribution networks. To further strengthen the statistical validity and interpretability of the evaluation, 95% confidence intervals and error bars were incorporated into all reported metrics: for each of the five cross-validation folds, the mean and standard deviation were computed for accuracy, precision, recall, and F1-score, and the corresponding error bars were plotted to represent variability across folds. The MDF-HDL model achieved an average accuracy of 98.91% ± 0.23% (95% CI: [98.68%, 99.14%]), precision of 98.70% ± 0.27% (95% CI: [98.43%, 98.97%]), recall of 98.30% ± 0.30% (95% CI: [98.00%, 98.60%]), and F1-score of 98.50% ± 0.25% (95% CI: [98.25%, 98.75%]). These confidence intervals and error bars, now reflected in Figs. 7a and 8f, provide a clearer depiction of statistical variability and confirm the robustness and consistency of the proposed model across validation folds.
This refinement ensures a transparent representation of performance stability and reinforces the reliability of the MDF-HDL framework for real-world fault detection and localization tasks. The MDF-HDL and baseline models were also compared using paired t-tests and Wilcoxon signed-rank tests to confirm statistical reliability. Significant performance gains were observed (p < 0.01 and p < 0.05, respectively), indicating that the improvements are statistically significant and not the result of chance.
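The paired t-test and Wilcoxon signed-rank comparison can be sketched with SciPy. The per-fold accuracies below are illustrative values consistent with the reported means; the actual fold-level scores are not published, so these are stand-ins.

```python
import numpy as np
from scipy.stats import ttest_rel, wilcoxon

# Illustrative per-fold accuracies (5 folds) for MDF-HDL vs. CapsNet.
mdf_hdl = np.array([98.70, 99.10, 98.90, 99.00, 98.85])
capsnet = np.array([95.00, 95.60, 95.30, 95.45, 95.20])

# Paired t-test on the fold-wise differences.
t_stat, t_p = ttest_rel(mdf_hdl, capsnet)

# One-sided Wilcoxon signed-rank test: with only 5 folds the smallest
# attainable two-sided p-value is 0.0625, so the directional alternative
# is needed to reach significance at the 0.05 level.
w_stat, w_p = wilcoxon(mdf_hdl - capsnet, alternative="greater")

print(t_p < 0.01, w_p < 0.05)  # True True
```

The Wilcoxon caveat is worth noting in practice: with five paired observations, significance at p < 0.05 is only attainable with a one-sided alternative.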


To guarantee dependability, an experimental uncertainty analysis was conducted using multiple trials and five-fold cross-validation. Performance metrics are reported as mean ± standard deviation (e.g., accuracy: 98.91% ± 0.23%), reflecting the experimental uncertainty arising from data and model variance and demonstrating the transparency and robustness of the reported results.

Ablation study

The ablation study on feature fusion shows that Kalman filtering improves model performance significantly. Without Kalman filtering, the MDF-HDL model achieved 96.23% ± 0.45% accuracy and a 95.89% ± 0.47% F1-score. Using Kalman filtering for multimodal feature integration raised accuracy to 98.91% ± 0.23% and the F1-score to 98.50% ± 0.25%. Kalman filtering is thus essential for merging complementary modality features, improving the precision and reliability of fault detection.

Graph Neural Networks (GNNs) also improved performance over CNN-only feature extraction. The CNN-only model achieved an accuracy of 97.12% ± 0.38% and an F1-score of 96.85% ± 0.40%, whereas the GNN-enhanced model achieved 98.18% ± 0.29% and 97.75% ± 0.31%, respectively. This suggests that the GNN's capacity to capture relational and structural information between features improves fault classification in power grids.

Further investigation showed that decision tree refinement corrects misclassifications in the raw deep neural network (DNN) output. The raw DNN predictions had an accuracy of 97.45% ± 0.42% and an F1-score of 97.10% ± 0.44%. After decision tree post-processing, accuracy and F1-score improved to 98.91% ± 0.23% and 98.50% ± 0.25%, respectively, highlighting the impact of the refinement on classification robustness and error reduction. These results confirm that the MDF-HDL model's integrated approach maximizes fault classification and localization performance.

Limitations

This study acknowledges several critical limitations. First, the evaluation uses only publicly available benchmark datasets, not real-world or field-collected data; sensor noise, fault characteristics, and data gaps can affect model robustness and accuracy in power grid fault detection. Second, high-quality multimodal data, especially LiDAR and optical imagery, are difficult and expensive to acquire, limiting implementation. Obtaining such data for power distribution fault identification presents both technological and economic challenges: centimeter-level accuracy requires expensive sensors, aerial platforms, and precise calibration, and costs rise with larger grid areas and more frequent scans. Environmental variables such as changes in cloud cover, vegetation, and lighting can further reduce data reliability and require extensive preprocessing and alignment. Moreover, synchronization between multimodal sources necessitates specialized equipment and skilled personnel, and massive data volumes demand powerful servers, storage, and GPUs. Combined, these factors lead to scaling problems, lengthy implementation times, and high running expenses. Third, while the MDF-HDL model has low computational complexity, integrating deep learning layers, Kalman filtering, and decision tree refinement may limit scalability and real-time operation in large-scale or resource-limited grid environments. Finally, dataset bias and class distribution constraints may limit the model's applicability to varied geographic and operational contexts. To improve practicality, future research will validate the model on operational grid data, optimize computational efficiency, and address dataset diversity.

Furthermore, although the ArcGIS Power Line Classification Project and the Awesome 3D LiDAR Datasets are comprehensive and well annotated, their labeling procedures and data collection settings may introduce intrinsic biases. Because these datasets mostly cover certain geographic regions under controlled imaging conditions, they may not adequately capture the variability of real-world power distribution networks across changing weather, terrain, and sensor calibration settings. Such dataset-specific biases could leave the model over-optimized for benchmark conditions, limiting its usefulness in unfamiliar situations. To mitigate these effects and ensure a balanced class distribution and broader representation, the study employed stratified sampling, cross-validation, and multimodal data normalization. Further validation on a variety of field-collected datasets is required to confirm the robustness and adaptability of the proposed MDF-HDL architecture in operational grid settings.
