Multi-granularity Signal Processing Method for Digital Twin Power Grids via Graph Representation Learning

ABSTRACT


INTRODUCTION
The power network, a fundamental infrastructure of modern society, comprises the intricate nodal relationships among various transmission devices and acts as a crucial carrier of multifaceted information [1][2][3]. This network's operation hinges on precise analysis and decision-making based on data from these interconnected systems [4]. Signal processing plays a pivotal role in this context by enabling the extraction and refinement of information from these complex data streams, thus facilitating a more robust and dynamic understanding of network behaviors and interactions [5].
Digital twins, as digital representations of physical assets, systems, or processes, have emerged as a groundbreaking technology in the optimization and management of power grids [6,7]. By creating a dynamic digital counterpart of the physical power grid, digital twins allow for real-time monitoring, simulation, and analysis of grid operations [8]. This capability is crucial for identifying potential issues, optimizing grid performance, and supporting decision-making processes. The integration of digital twins in power grids enhances the ability to predict and mitigate problems before they occur, thereby improving the reliability and efficiency of power distribution.
Signal processing, on the other hand, is essential for handling the vast amounts of data generated by power grids [9]. It involves techniques that process and analyze signals to extract useful information, detect anomalies, and enhance the quality of data. In the context of power grids, signal processing can be used to monitor the condition of grid components, detect faults, and optimize the flow of electricity. The combination of signal processing and digital twins provides a powerful toolset for managing the complexity of modern power grids, enabling more precise and adaptive control mechanisms.
Recently, numerous deep learning techniques have emerged to facilitate clustering based on topology, attributes, behaviors, and other aspects [10][11][12]. Algorithms like DeepWalk [13], Node2Vec [14], and LINE [15] have established themselves as classical methods in complex network representation learning, effectively addressing the challenge of preserving local topology. Other studies, such as SDNE [16] and GCN [17], achieve clustering by mapping individual nodes to different levels of granularity based on node topology, while MNRL [18] considers both the topology between nodes and the properties of neighboring nodes. However, current research focuses only on training the corresponding models; it does not use the resulting granularity to simulate digital twins computationally and feed the results back to the real world for construction.
With the application of new-generation technologies such as the industrial Internet of Things and digital twins, it is necessary to achieve interaction between physical space and information space [19][20][21][22][23]. Fuller et al. [24] proposed the innovative concept of digital twin workshops, establishing a five-dimensional digital twin model that realizes the interaction and integration of the physical world and the information world, with the ultimate aim of intelligent manufacturing. Qi et al. [25] integrated digital twin technology with planning and scheduling techniques to create a digital twin-based planning and scheduling system. This system is designed to coordinate and plan workshop production activities, managing and controlling uncertainties from multiple perspectives and in comprehensive ways. Kipf and Welling [26] propose an ultra-short-term prediction of PV power based on a digital twin simulation of PV cells and the surrounding environment, comparing the actual power with the predicted power and then correcting the predicted value to obtain the final prediction result.
This paper proposes a multi-granularity modeling approach for digital twins, focusing on the signal processing techniques that enable the analysis of relationships and interactions within power grids. By leveraging advanced signal processing in graph representation learning, this approach aims to map complex network data into a structured digital twin framework. This enables the dynamic simulation of grid behaviors and interactions at multiple granularities, thereby enhancing the predictability and management of power networks. This methodology underscores the inseparability of physical properties, operational functions, and topological data in creating a responsive and adaptive digital twin system.

SIGNAL PROCESSING-ENHANCED MULTI-GRANULARITY AGGREGATION FRAMEWORK FOR DIGITAL TWIN GRIDS
The intricate relationships between entities within a grid, encompassing operational functions, topological positions, and physical attributes, are fundamentally interconnected [27][28][29]. Isolating these elements during analysis could obscure critical information, potentially impacting overall grid performance. For instance, a seemingly sparse grid network, as depicted in Figure 1, might suggest a lack of connections between distantly located devices such as nodes P1 and P7. However, a deeper signal processing-based analysis reveals that all inter-node information must be considered to ascertain connectivity. Even if nodes like P4 and P5 share no direct operational or topological link, their association via attribute L2 suggests latent connections. Thus, the fusion of attributes, operations, and topology is crucial for transmitting refined digital information to the digital twin, which then maps nodes across various levels using enhanced low-dimensional representation vectors derived from signal processing techniques.

Figure 1. Physical grid case
The framework for multi-granularity aggregation in digital twin grids is detailed in Figure 2. This framework integrates physical devices, digital twins, and application services to realize a comprehensive multi-granularity process. This integration facilitates real-time interactions, a closed-loop feedback mechanism, and lifecycle maintenance of network devices. Signal processing is central to this framework, enabling the precise collection and transformation of data across different layers of grid complexity, from individual unit operations to system-wide interactions.

Figure 2. Digital twin-based multi-granularity aggregation grid framework
In the physical layer, sensors play a pivotal role by gathering and processing data related to energy consumption, structural modifications, operational changes, and business dynamics under the governing principles of power distribution. This data is instantaneously relayed to the digital twin layer, ensuring that each piece of information influences the digital counterpart accurately. The digital twin constructs dynamic models that reflect changes across various granular levels (unit, system, and complex system) by leveraging historical data stored in databases. These models, enriched through advanced signal processing algorithms, facilitate detailed granularity mapping and are essential for the subsequent application service layer, which utilizes graph representation learning to further refine and implement these mappings.
Ultimately, the integration of the digital twin framework with real-time data processing and advanced signal processing techniques enables a comprehensive feedback loop to the physical grid. This ensures that the digital twin not only represents but actively enhances the grid's operational efficiency. The digital twin's ability to predict, categorize, and display information across multiple granularities minimizes the need for manual intervention, reduces personnel costs, and decreases operational expenditures, thereby optimizing the overall management and reliability of the distribution network. This section demonstrates how signal processing is integral to transforming raw data into actionable insights within the digital twin framework, ensuring that every level of the grid benefits from enhanced predictive and analytical capabilities.

SIGNAL PROCESSING IN DIGITAL TWIN GRIDS WITH GRAPH CLUSTERING
We utilize a detailed multi-granularity network representation learning (MNRL) approach that refines the distribution network from its fundamental topology to the execution services, characteristics, and additional information associated with each node. This method not only preserves the original topology at single-device granularity but also integrates a wide range of additional node-specific data. This enrichment significantly enhances the complexity and interconnectedness of the distribution network's relational framework. In doing so, it integrates the attributes, operational functionalities, and topological structures of single-granularity devices.

Figure 3. Multi-granularity graph representation learning
The extensive integration and mapping processes facilitated by the MNRL method are illustrated in Figure 3, highlighting how diverse data layers contribute to a comprehensive and nuanced network representation. This approach ensures a deeper understanding and more robust analysis of the network, enhancing both the precision and utility of the digital twin model.

Formal definition of grid topology
The distribution network is depicted as a complex network system $G = (V, E, A)$, where $V$ represents the set of $n$ nodes at single-device granularity, $E$ is the set of edges between these nodes, and $A$ is the set of attributes for each device. Here, $A \in \mathbb{R}^{n \times M}$ represents the matrix encoding the attributes of all single-device nodes, including services, features, and other information. Each attribute $a_i \in A$ pertains to node $v_i$. The element $e_{ij} = (v_i, v_j)$ indicates the line and network connection between the single-device nodes $v_i$ and $v_j$.
In a given complex distribution network $G = (V, E, A)$, we map $v_i$ and $a_i$ to a low-dimensional vector $y_i$ by a learning function $f: v_i \mapsto y_i \in \mathbb{R}^d$, where $d$ denotes the dimensionality of the representation and $d \ll |V|$. The learned $y_i$ encodes not only the structure of the distribution network but also other information of the nodes. The low-dimensional vectors obtained by network representation learning remain highly consistent with the original high-dimensional network information. For example, if two nodes $v_i$ and $v_j$ are similar in the original network, the vectors $y_i$ and $y_j$ produced by the learning function $f$ remain similar.
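The triple of nodes, edges, and attributes defined above can be held in a small container. The following is a minimal sketch in plain Python, not the authors' implementation; the class name `GridGraph`, its method names, and the toy attribute values are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GridGraph:
    """Toy container for G = (V, E, A); all names here are illustrative."""
    nodes: list        # V: device identifiers at single-device granularity
    edges: set         # E: undirected pairs (v_i, v_j)
    attributes: dict   # A: node id -> attribute vector of length M

    def neighbors(self, v):
        """Return the nodes that share an edge with v, in sorted order."""
        found = set()
        for a, b in self.edges:
            if a == v:
                found.add(b)
            elif b == v:
                found.add(a)
        return sorted(found)

    def attribute_matrix(self):
        """Stack the attribute vectors row by row in node order (n x M)."""
        return [list(self.attributes[v]) for v in self.nodes]


# Hypothetical three-device grid with two attributes per device.
grid = GridGraph(
    nodes=["P1", "P2", "P3"],
    edges={("P1", "P2"), ("P2", "P3")},
    attributes={"P1": [1.0, 0.0], "P2": [0.5, 0.5], "P3": [0.0, 1.0]},
)
```

A learning function in the sense of the text would take each row of `attribute_matrix()` together with `neighbors(v)` and return a vector of much lower dimension than the number of nodes.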
In a complex distribution network, a single device corresponds to a single node, i.e., a single granularity (the essential granularity), whose structure cannot be subdivided. In the physical world, unit devices (unit layer) can be combined into a system (system layer) according to their Euclidean distance. Finally, the individual systems are combined into a complex system (complex system layer). Multi-granularity representation learning aggregates the complex distribution network into a complete low-dimensional representation by learning the unit-level properties of the unit devices, the system-level associations, and the global structure.
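The distance-based combination of unit devices into systems can be sketched as a simple grouping step. The code below is only an illustration under stated assumptions: the single-linkage rule, the `threshold` parameter, and the device coordinates are hypothetical and are not specified in the text.

```python
import math


def group_by_distance(positions, threshold):
    """Merge devices whose Euclidean distance is <= threshold (single linkage)."""
    names = sorted(positions)
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(positions[a], positions[b]) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical device coordinates (unit layer).
positions = {"P1": (0, 0), "P2": (1, 0), "P3": (10, 10), "P4": (10, 11)}
systems = group_by_distance(positions, threshold=2.0)
```

Applying the same function again to system centroids with a larger threshold would give one possible way to form the complex system layer.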

Representation of learning methods
To better represent the complex grid system at multiple granularities and to address the inability to analyze the attributes of individual device nodes, we use the information collected by sensors at single-device granularity and integrate the solution components of each layer to form the optimal solution for the whole problem [30]. We therefore adopt a multi-granularity information fusion method, given in Eq. (1). In Eq. (1), the neighborhood term denotes the nodes adjacent to grid node $v_i$, $a_i$ denotes the attributes of node $v_i$, and the fused term represents the information of the neighboring nodes together with the node's own attribute information.
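As an illustration of fusing a node's own attributes with information from its adjacent nodes, the sketch below concatenates a node's attribute vector with the mean of its neighbors' attribute vectors. Both the mean aggregation and the concatenation are assumptions made for this example; the fusion rule of Eq. (1) may differ. The function name and the toy data are hypothetical.

```python
def fuse_node_information(attributes, adjacency, node):
    """Concatenate a node's attributes with the mean of its neighbors' attributes.

    Mean aggregation and concatenation are assumptions made for this sketch.
    """
    own = list(attributes[node])
    neighbors = adjacency.get(node, [])
    if not neighbors:
        # Isolated node: no neighbor information to aggregate.
        neighbor_mean = [0.0] * len(own)
    else:
        neighbor_mean = [
            sum(attributes[n][k] for n in neighbors) / len(neighbors)
            for k in range(len(own))
        ]
    return own + neighbor_mean


# Hypothetical attribute vectors and adjacency lists.
attributes = {"P1": [1.0, 0.0], "P2": [0.0, 1.0], "P3": [1.0, 1.0]}
adjacency = {"P1": ["P2", "P3"]}
fused = fuse_node_information(attributes, adjacency, "P1")
```

The fused vector is what an encoder would receive as input in place of the raw attribute vector.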
To capture the complementary properties of different granularity hierarchies and to mitigate the effects of noise, we use an autoencoder, a powerful unsupervised model. In a complex grid structure with multiple granularities, the autoencoder fuses information from single-device granularity up to coarse granularity, including execution operations, features, and various additional information. As defined in Eq. (2), the autoencoder contains three layers: the input layer, the hidden layer, and the output layer, as shown in Figure 4.
In Eq. (2), the activation function is applied at each layer, the transformation matrices are those of the $k$-th layer, and $k$ runs over the layers of the encoder and decoder.
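The three-layer structure described above (input, hidden, output) can be illustrated with a forward pass in plain Python. This is a minimal sketch, not the model used in the paper: the sigmoid activation, the layer sizes, the random initialization, and all function names are assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vector, weights, bias):
    """Apply one layer: activation(W x + b), with W given as a list of rows."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weights, bias)
    ]


def init_layer(n_out, n_in, rng):
    """Create a small random weight matrix and a zero bias vector."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def autoencoder_forward(x, encoder, decoder):
    """Encode the input into a hidden vector, then decode it back."""
    hidden = dense(x, *encoder)
    output = dense(hidden, *decoder)
    return hidden, output


def reconstruction_error(x, output):
    """Squared error between the input and the decoder output."""
    return sum((a - b) ** 2 for a, b in zip(x, output))


rng = random.Random(0)
encoder = init_layer(2, 4, rng)   # 4 input features -> 2 hidden units
decoder = init_layer(4, 2, rng)   # 2 hidden units -> 4 reconstructed features
x = [1.0, 0.0, 0.5, 1.0]          # hypothetical fused node information
hidden, output = autoencoder_forward(x, encoder, decoder)
error = reconstruction_error(x, output)
```

Training would adjust the weights to reduce `error` across all nodes; the hidden vector then serves as the low-dimensional representation.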
The unified vector representation obtained by model learning is assumed to obey a prior distribution, taken to be the standard normal distribution $\mathcal{N}(0, 1)$. To make the model trained by the multi-granularity learning method fit the distribution of the collected data, a corresponding loss function is minimized.
To reduce the loss of latent information, we also minimize the autoencoder loss function, which compares the decoder output with the prior knowledge and computes the error between the two.
To portray the data from network devices at the same structural tier, we employ the skip-gram model, which has also been proposed in research on heterogeneous networks and is suitable for various types of device nodes. In our treatment, the context of a node comprises the services the device node performs, its features, and various additional information. In Eq. (9), we define the loss function over the random walks, where $v_i$ is the node information and $y_i$ is the reconstructed information obtained from $v_i$ through representation learning.
Here, $p(v_{i+j} \mid y_i)$ is described using a softmax function, and $B$ represents the size of the generation window.
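The softmax probability and the window-based skip-gram loss can be illustrated as follows. This is a sketch of the standard skip-gram formulation with a full softmax, assuming separate context vectors per node; the function names, the toy embeddings, and the example walk are hypothetical and do not come from the paper.

```python
import math


def softmax_probability(center, context, embeddings, context_vectors):
    """p(context | center) = exp(u_context . y) / sum over all v of exp(u_v . y)."""
    y = embeddings[center]
    scores = {
        node: sum(a * b for a, b in zip(vec, y))
        for node, vec in context_vectors.items()
    }
    peak = max(scores.values())  # subtract the maximum for numerical stability
    exps = {node: math.exp(s - peak) for node, s in scores.items()}
    return exps[context] / sum(exps.values())


def skipgram_loss(walk, window, embeddings, context_vectors):
    """Negative log-likelihood of the nodes within `window` steps of each position."""
    loss = 0.0
    for i, center in enumerate(walk):
        start = max(0, i - window)
        stop = min(len(walk), i + window + 1)
        for j in range(start, stop):
            if j != i:
                p = softmax_probability(center, walk[j], embeddings, context_vectors)
                loss -= math.log(p)
    return loss


# Hypothetical two-dimensional representations and one random walk.
embeddings = {"P1": [1.0, 0.0], "P2": [0.0, 1.0], "P3": [1.0, 1.0]}
context_vectors = {"P1": [0.5, 0.1], "P2": [0.1, 0.5], "P3": [0.4, 0.4]}
walk = ["P1", "P3", "P2", "P3"]
loss = skipgram_loss(walk, window=1, embeddings=embeddings, context_vectors=context_vectors)
```

The `window` argument plays the role of the window size $B$: only nodes at most that many steps away on the walk contribute to the loss.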
Multi-granularity network representation learning fuses multi-granularity information: the decoder learns single-device granularity information and transforms the different kinds of high-level information into unified low-dimensional information, thereby realizing the mapping between different granularities. This solves the problem that, under multiple granularities, single-device entities that are not close in physical space but do communicate at the business level could not otherwise be fused. Optimizing the objective function of the multi-granularity network representation ensures the validity of the computational outcomes.
The multi-granularity learning model preserves multiple levels of granularity information, including the unit-level attributes of device nodes, system-level associations, and the global structure. The method effectively addresses the complementarity between different granularities as well as the high nonlinearity of the data, while preserving information such as the business and attributes of device nodes, and finally forms the complete objective function. The objective function is iterated by gradient descent until it converges. Taken together, this network representation learning can interact in real time, use the collected information accurately and efficiently for model training, and achieve the goal of mapping at different granularities.
In the experiments, the data used in this paper are sourced from real-time data on a provincial grid data platform covering a number of cities. These data include device relationships, various attribute information, and business instruction information for the devices. This data set helps demonstrate a significant inseparability and dependency among device node cluster relationships, individual device relationships, attributes, and businesses, which collectively influence the relationships within a single association of devices. To verify the effectiveness of our proposed multi-granularity modeling approach, we conducted experiments on datasets from different cities, including City 1, City 2, and City 3. We compared our method with four baseline methods: DeepWalk, Node2Vec, LINE, and SDNE.

Device node classification
To demonstrate the performance of our proposed multi-granularity modeling approach, we model subgroups of nodes and obtain a multi-granularity presentation of the nodes. Specifically, we combine device nodes according to their connections and business attributes, from single-device components into system components and finally from system components into complex systems, using an SVM as the classifier. For a comprehensive evaluation of the model, we randomly sampled the dataset and selected 10%, 30%, and 50% of the single-device nodes as the training set, with the remainder as the test set. To verify the correctness of the classification, we used Macro-F1 (Ma-F1) and Micro-F1 (Mi-F1) as validation metrics.
In City 1, as shown in Table 2 and Figure 5, the MNRL method demonstrated superior performance across all training set sizes compared to the other methods (DeepWalk, Node2Vec, LINE, SDNE). Notably, as the size of the training set increased from 10% to 50%, MNRL showed consistent improvements in both Mi-F1 and Ma-F1 scores, rising from 0.6830 and 0.6364 at 10% to 0.7302 and 0.6905 at 50%. This trend suggests that MNRL is effective in handling varying granularities of data and outperforms the baseline methods, which also showed gradual improvements as the training data increased.
In City 2, as shown in Table 3 and Figure 6, similar trends were observed, with MNRL again outperforming the other methods. MNRL's performance peaked at a 50% training set size, with Mi-F1 and Ma-F1 scores of 0.8498 and 0.8215, respectively. Other methods, like LINE and Node2Vec, also showed strong improvements with increased training data. For instance, LINE rose from a Mi-F1 of 0.7248 at 10% to 0.8455 at 50%, demonstrating its effectiveness in more extensive training scenarios.

As shown in Table 4 and Figure 7, City 3 displayed the highest performance metrics across all methods, particularly for MNRL and SDNE. At a 50% training set, MNRL achieved Mi-F1 and Ma-F1 scores of 0.8687 and 0.8607, respectively, giving it the highest Mi-F1. SDNE also performed remarkably well at 50% training, reaching Mi-F1 and Ma-F1 scores of 0.8645 and 0.8635, the latter slightly above MNRL's Ma-F1. This indicates that both MNRL and SDNE are particularly robust on the rich and complex data of City 3.
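For reference, the two metrics reported above can be computed from true and predicted labels as follows. This is a generic sketch of the standard Micro-F1 and Macro-F1 definitions for single-label classification; the function name and the toy labels are hypothetical and unrelated to the reported results.

```python
def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_count, fp_count, fn_count):
        denom = 2 * tp_count + fp_count + fn_count
        return 2 * tp_count / denom if denom else 0.0

    # Macro-F1: unweighted mean of the per-class F1 scores.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro-F1: F1 computed from counts pooled over all classes.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Hypothetical labels for five device nodes.
y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
```

Macro-F1 weights every class equally, so it is more sensitive than Micro-F1 to poor performance on rare device classes.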

Grid link forecast
In this section, we use link prediction to assess the ability of the model to reconstruct the grid. The aim is to predict whether an association exists between two nodes, which is a typical network analysis task. To evaluate our model, we randomly keep 80% of the existing links as positive examples and randomly select the same number of non-existent links as negative examples. We then use an autoencoder for training. Specifically, we rank the positive and negative examples according to the cosine similarity of their node representations. To determine the quality of the ranking, we use accuracy to evaluate the ranked list, with higher values indicating better performance. We validated this on the City 1, City 2, and City 3 datasets, and the results are shown in Figure 8.
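The ranking step can be sketched as follows. The text does not define how accuracy is computed over the ranked list, so this example assumes it is the fraction of true links among the top-k pairs, with k equal to the number of positive examples; that definition, the function names, and the toy embeddings are assumptions.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ranking_accuracy(embeddings, positive_pairs, negative_pairs):
    """Fraction of the top-k ranked pairs that are true links (k = number of positives)."""
    if not positive_pairs:
        raise ValueError("at least one positive pair is required")
    scored = [
        (cosine_similarity(embeddings[u], embeddings[v]), 1)
        for u, v in positive_pairs
    ]
    scored += [
        (cosine_similarity(embeddings[u], embeddings[v]), 0)
        for u, v in negative_pairs
    ]
    # Rank all candidate pairs from most to least similar.
    scored.sort(key=lambda item: item[0], reverse=True)
    k = len(positive_pairs)
    return sum(label for _, label in scored[:k]) / k


# Hypothetical node representations and candidate links.
embeddings = {
    "P1": [1.0, 0.0],
    "P2": [0.9, 0.1],
    "P3": [0.0, 1.0],
    "P4": [0.1, 0.9],
}
positives = [("P1", "P2"), ("P3", "P4")]
negatives = [("P1", "P3"), ("P2", "P4")]
accuracy = ranking_accuracy(embeddings, positives, negatives)
```

In this toy case the two true links have the highest cosine similarity, so both top-ranked pairs are correct.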

CONCLUSION
The experimental results demonstrate the significant potential of the multi-granularity modeling approach in optimizing power grid operations, predictions, and decision-making processes. By leveraging advanced signal processing techniques and graph representation learning, our method effectively integrates and analyzes the attributes and operational data of device nodes across different levels of granularity. This comprehensive integration allows for more accurate and dynamic simulations of grid behaviors and interactions, which are crucial for the real-time management and optimization of power grids.
Firstly, the multi-granularity modeling approach enhances the predictability of grid operations by capturing intricate relationships and dependencies among nodes. This improved accuracy enables more precise forecasting of potential issues and more effective maintenance planning, ensuring the reliability and stability of power grid operations. Secondly, our approach supports more informed decision-making by providing a detailed understanding of the grid's operational status at various granularity levels. By mapping complex network data into a structured digital twin framework, stakeholders can gain insights into both local and system-wide interactions, facilitating better resource allocation and operational adjustments.
Future research can investigate the scalability of our approach in larger and more complex grid systems, integrating additional data sources such as weather conditions and energy market trends to enhance predictive power. Developing real-time processing capabilities and adaptive algorithms will be crucial for dynamic grid management. Enhancing analytical and visualization tools can provide more intuitive insights, and leveraging IoT devices and edge computing can improve data collection and processing efficiency. Lastly, exploring the application of our approach in other domains, such as transportation networks and smart cities, could reveal new opportunities for optimizing complex systems. These future research directions highlight the potential for significant contributions to various real-world applications.

Figure 5. Accuracy of Micro-F1 and Macro-F1 on City 1 data set

Figure 8. Prediction accuracy on City 1, City 2 and City 3 datasets