© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
Accurate prediction of regional-scale thermodynamic energy flow patterns is crucial for understanding surface-atmosphere interactions and optimizing climate model parameterization schemes, playing a vital role in climate simulation and ecological environmental assessment. Traditional physics-guided machine learning (PGML) methods typically embed physical laws as fixed soft constraints in the model, which struggle to adapt to the spatiotemporal heterogeneity of key thermodynamic parameters under complex underlying surfaces and non-uniform conditions. This limitation affects the quantification accuracy of core parameters such as turbulence exchange coefficients and surface impedance. In this paper, we propose a dual-path physics-data hybrid framework that integrates the physical skeleton with data-driven dynamics through a collaborative architecture of differentiable physical kernels and spatiotemporal neural parameterization. The framework constructs a differentiable forward physics simulator based on discretized regional energy balance equations, while utilizing high-dimensional spatiotemporal encoder-decoder networks to adaptively generate grid-scale physical parameter fields from multi-source climate data. Combined with a multi-objective progressive learning strategy, the model achieves pre-training of parameters and physics-enhanced training. A closed-loop integration module is used for energy flow evolution scenario simulation and uncertainty quantification. This model demonstrates significant predictive advantages across various spatial and temporal scales, as well as extreme weather scenarios. For instance, the daily-scale sensible heat flux in homogeneous farmland reduces mean squared error (MSE) by 38.7% compared to traditional PGML and 30.8% compared to Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM), with an accuracy degradation rate of only 13.5%–17.3% under extreme scenarios. The energy conservation error is ≤ 5.7 W·m⁻². Ablation experiments validate the core role of innovative components such as the neural parameterization, while revealing a regional thermodynamic parameterization formula (R² = 0.89) for multi-surface adaptation, uncovering the differences in energy flow driving mechanisms across regions. This study upgrades machine learning from a traditional predictive tool to a mechanism discovery and model enhancement tool, offering new methods for improving regional thermodynamic parameterization schemes and enhancing the reliability of energy flow predictions in extreme scenarios. It holds significant academic value and practical prospects.
regional scale, thermodynamic energy flow, high-dimensional climate data, physics-guided machine learning (PGML), differentiable physics simulation, uncertainty quantification
The accurate characterization and prediction of regional-scale thermodynamic energy flow is a core component in analyzing surface-atmosphere energy exchange mechanisms and optimizing climate model parameterization schemes [1-3]. Its accuracy directly affects the effectiveness of extreme weather warnings and the reliability of ecosystem carbon-water cycle assessments [4, 5]. With the development of remote sensing observation and numerical simulation technologies [6, 7], the accumulation of high-dimensional climate data, such as wind, temperature, humidity, pressure, and surface types, has provided a rich data foundation for capturing the spatiotemporal evolution patterns of energy flow [8, 9]. However, how to deeply integrate the representation ability of high-dimensional data with the basic thermodynamic laws to achieve reliable prediction of energy flow patterns under complex conditions remains a frontier challenge in the fields of surface processes and climate simulation.
PGML, as a mainstream method that integrates data and physical knowledge, is based on the idea of embedding known physical equations into the model to constrain the learning process [10, 11]. However, traditional methods often treat physical laws as fixed constraints, which are difficult to adapt to the spatiotemporal heterogeneity of key parameterization terms under complex underlying surfaces and non-uniform environments [12-15], leading to limited quantification accuracy of core parameters such as turbulence exchange coefficients and surface impedance. In contrast, purely data-driven models, though possessing strong high-dimensional data fitting capabilities [16, 17], often generate unreasonable predictions under extreme scenarios not covered by the training data due to the lack of inherent physical consistency constraints, and their generalization performance is difficult to guarantee. There is still a clear gap in current research: the discoverability of physical structures is insufficient, and it is difficult to extract parameterization forms that adapt to regional characteristics from data [18]; the adaptive learning ability of spatiotemporal heterogeneity of parameterization terms is limited [19]; the association between prediction uncertainty and thermodynamic mechanisms has not been effectively established [20, 21]. These shortcomings constrain the improvement of the accuracy and mechanism explanation ability of regional energy flow prediction models.
To address the above problems, the goal of this research is to propose a hybrid model that integrates high-dimensional climate data with differentiable physical structures, aiming to achieve high-accuracy prediction of regional-scale thermodynamic energy flow patterns while uncovering region-specific parameterization mechanisms with physical interpretability. The core innovative contributions are as follows: First, a dual-path framework of differentiable physical kernels and spatiotemporal neural parameterization is constructed, elevating physical laws from external constraints to an internally discoverable and interpretable structure within the model; second, a multi-objective progressive learning strategy is designed to achieve joint learning of the closure terms of physical equations and regional distribution patterns, enhancing the model's physical reasonableness and simplicity; third, an uncertainty quantification module is integrated to achieve the spatial tracing of prediction uncertainty and establish its relationship with thermodynamic mechanisms, improving the model's reliability and explanatory power.
The structure of this paper is arranged as follows: Chapter 2 provides a detailed explanation of the technical details of the dual-path physics-data hybrid framework, including the design of differentiable physical kernels, spatiotemporal neural parameterization, multi-objective learning strategies, and uncertainty quantification modules; Chapter 3 explains the experimental region, data sources, model configuration, and evaluation metrics; Chapter 4 presents the experimental results, validating the model's prediction accuracy, the effectiveness of innovative components, and the discovery of regional parameterization mechanisms; Chapter 5 discusses the academic value of the study, its comparative advantages over existing research, and its limitations; Chapter 6 summarizes the core conclusions and prospects for future research directions.
2.1 Overall framework design concept
The core logic of the dual-path physics-data hybrid framework is to achieve deep collaborative integration between thermodynamic physical laws and high-dimensional climate data, in order to overcome the adaptation limitations of traditional models under complex conditions. The framework uses the regional energy balance equation as the physical skeleton and integrates thermodynamic conservation laws into the model structure through differentiable modeling, ensuring that all prediction results strictly follow basic physical principles. At the same time, high-dimensional climate data drives the model through a spatiotemporal neural parameterization unit, which adaptively generates spatiotemporally heterogeneous key parameterization terms, accurately capturing the unique characteristics of complex underlying surfaces and non-uniform environments. The complete link of the framework is as follows: inputs include initial and boundary conditions such as surface temperature, radiation forcing, and high-dimensional climate data like wind, temperature, humidity, pressure, and surface types; the core modules consist of differentiable physical kernels and spatiotemporal neural parameterization units, which are collaboratively optimized through gradient backpropagation; outputs include high-precision energy flow spatial distribution predictions, regional thermodynamic parameterization fields, and corresponding uncertainty quantification results. The core innovation of this design lies in elevating physical laws from external fixed constraints in traditional models to discoverable and interpretable structures within the model, achieving an organic unity between physical skeleton and data-driven dynamics. The overall framework architecture is shown in Figure 1.
Figure 1. System flow architecture diagram
2.2 Design of differentiable forward physics simulator
Based on the thermodynamic energy conservation principle, the core control equation for energy balance per unit area of the surface at the regional scale can be expressed as:
${{\text{R}}_{\text{n}}}\text{=H+LE+G}$ (1)
where, ${{R}_{n}}$ is the net radiation flux at the surface, representing the total radiation energy received by the unit area of the surface per unit time; $H$ is the sensible heat flux, describing the sensible heat exchange between the surface and the atmosphere; $LE$ is the latent heat flux, corresponding to the latent heat transfer associated with water vapor phase change; $G$ is the soil heat flux, reflecting the energy conduction process between the surface and the soil. This equation forms the core constraint of the physical simulation, clarifying the basic balance relationship of regional energy flow transfer. To adapt to regional-scale simulation requirements, the finite volume method is used for spatiotemporal discretization. This method leverages the inherent conservation properties of integration to ensure that thermodynamic conservation laws are not violated during discretization. In space, the study area is divided into uniform grids, and the energy balance equation is integrated over each grid unit to obtain the integral form of the control equation at the grid scale; in time, a first-order implicit finite difference scheme is used to discretize the time derivative term, resulting in the discretized control equation:
$\frac{1}{\Delta t} \int_{\Omega_i}\left(T_{t+\Delta t}-T_t\right) \rho c~d \Omega=\int_{\Omega}\left(R_n-H-L E-G\right) d \Omega$ (2)
where, ${{\text{ }\!\!\Omega\!\!\text{ }}_{i}}$ is the control volume of the $i$-th grid unit, $\Delta t$ is the time step, $T$ is the surface temperature, and $\rho$ and $c$ are the density and specific heat capacity of the surface medium, respectively. The numerical solution steps are as follows: first, initialize the initial temperature and boundary fluxes for the grid units; then, substitute the parameterization terms generated by the neural parameterizer to solve for the sensible and latent heat fluxes; and iteratively update the energy balance state of each grid unit until the numerical solution meets the preset convergence criteria, ensuring that the discretized numerical solution maintains physical reasonableness.
To achieve collaborative training of the physical kernel and the neural parameterizer, automatic differentiation is used to convert the full numerical computation process of the physics simulator into a differentiable computational graph. By tracking the gradient propagation paths of each numerical operation, the gradients of the parameterization terms can be backpropagated along the numerical solution chain to the parameter space of the neural parameterizer. Specifically, the core steps of the finite volume method, including integral computation, flux interpolation, and iterative solving, are all encapsulated as differentiable operators to ensure that small changes in the parameterization terms are accurately transmitted and converted into update signals for the neural parameterizer. To balance numerical stability and gradient transmissibility, a step-size adaptation strategy is designed: the time step is dynamically optimized based on the residual size of the current iteration step. When the residual is large, the step size is reduced to improve stability, and when the residual converges, the step size is increased to enhance computational efficiency. At the same time, the QUICK scheme is used as the spatial interpolation scheme, as its inherent higher-order smoothing properties can reduce the gradient discretization error caused by numerical diffusion, ensuring the continuity of gradient propagation. The simulator’s inputs include surface initial temperature, top-of-atmosphere radiation forcing, near-surface meteorological variables, and other initial and boundary conditions. The outputs include energy flow fields such as sensible heat flux, latent heat flux, surface temperature, and state variables. Among them, input variables must satisfy thermodynamic dimensional consistency constraints, such as radiation flux with units of $\text{W}\cdot {{\text{m}}^{-2}}$ and temperature in Kelvin; output energy flow fields must meet the residual threshold requirements of the energy balance equation to ensure the physical validity of the output results. Figure 2 provides an illustrative diagram of the regional energy balance control volume analysis based on the finite volume method, showing the energy input and output terms (Rn, H, LE, G) within the grid unit and the time/space discretized grid structure, in conjunction with the physical meanings of Eqs. (1) and (2).
Figure 2. Regional energy balance control volume analysis diagram based on finite volume method
2.3 Design of spatiotemporal neural parameterizer
The spatiotemporal neural parameterizer adopts an encoder-decoder architecture. Its core function is to adaptively learn from high-dimensional climate data and generate parameterization fields that are fully matched with the physics simulator’s grid, achieving precise characterization of spatiotemporal heterogeneity of key parameters such as turbulence exchange coefficients and surface impedance. Figure 3 provides a detailed structure diagram of the encoder-decoder. It shows how 3D convolution layers extract spatiotemporal features, Bi-GRU captures long-term dependencies, the spatiotemporal attention module (Attention) is connected, and the embedding of physical prior constraints at the end of the network. The encoder part consists of 3 layers of 3D convolution layers and 1 layer of bidirectional gated recurrent units (Bi-GRU). The 3D convolution layers use 5 × 5 × 3 spatiotemporal convolution kernels to extract local spatiotemporal features from wind, temperature, humidity, pressure, and surface type data, while the Bi-GRU units capture long-term dependencies of climate variables along the time dimension. The decoder uses a fusion design of transpose convolution layers and spatiotemporal attention mechanisms. The transpose convolution layers upsample the feature maps to restore the spatial resolution consistent with the physical grid, while the spatiotemporal attention module calculates the correlation weights of different spatiotemporal positions in the feature map to enhance critical information. The attention weight calculation formula is:
$\alpha_{i, j, t}=\frac{exp \left({sim}\left(F_{i, j, t}, F_{i^{\prime}, j^{\prime}, t^{\prime}}\right)\right)}{\sum_{t^l, j^l, t^l} exp \left({sim}\left(F_{i, j, t}, F_{i^{\prime}, j^{\prime}, t^{\prime}}\right)\right)}$ (3)
where, ${{\alpha }_{i,j,t}}$ is the attention weight at grid point (i, j) at time $t$, ${{F}_{i,j,t}}$ is the corresponding feature vector at that position, and $\text{sim}\left( \cdot \right)$ uses cosine similarity for computation. The network output layer directly outputs the parameterization field corresponding to each grid point of the physics simulator, through a combination of the Sigmoid activation function and linear transformation. To ensure the thermodynamic validity of the output parameters, two physical prior constraints are embedded in the network design: first, a constraint on the range of parameter values, by adding hard threshold activations to the output layer, which limits the turbulence exchange coefficient to the range of ${{10}^{-3}}$ to ${{10}^{2}}\,{{\text{m}}^{2}}\cdot {{\text{s}}^{-1}}$ and surface impedance to the range of ${{10}^{2}}$ to ${{10}^{4}}\,\text{s}\cdot {{\text{m}}^{-1}}$; second, spatial smoothness initialization, where the weights of the decoder convolution kernels are initialized to a Gaussian distribution. The spatial smoothing property of the Gaussian kernel guides the spatial continuity of the parameter field, which is consistent with the spatial gradient characteristics of thermodynamic parameters.
The preprocessing of high-dimensional climate data focuses on ensuring thermodynamic consistency, and a full process handling scheme is constructed, including "spatiotemporal matching - missing value filling - physical consistency normalization." Spatiotemporal matching uses bilinear interpolation to resample climate data from different sources and resolutions to a grid scale consistent with the physics simulator, aligning the time dimension to the same time step. Missing value filling discards traditional interpolation methods and adopts a generative filling strategy based on thermodynamic equations. Using the temperature-pressure static balance relationship and the water vapor pressure-temperature saturation relationship as constraints, missing data is filled through least squares optimization, ensuring the filled data satisfies basic thermodynamic associations. The normalization process adopts a variable-specific physical normalization method. For variables with clear physical baselines such as temperature and pressure, normalization is performed based on standard atmospheric state parameters. Specifically, ${{T}_{\text{norm}}}=\frac{T-{{T}_{0}}}{{{T}_{\text{max}}}-{{T}_{0}}}$, where ${{T}_{0}}$ is the standard atmospheric sea-level temperature, and ${{T}_{\text{max}}}$ is the maximum temperature in the study region. This avoids the destruction of thermodynamic variable correlations caused by traditional normalization methods. Feature fusion uses a cross-modal attention fusion module, which dynamically allocates the feature contribution of different variables by calculating the correlation weight between each climate variable and the target parameterization term. Specifically, a variable weight vector $\omega =\left[ {{\omega }_{1}},{{\omega }_{2}},\ldots ,{{\omega }_{n}} \right]$ is learned, where $\mathop{\sum }_{k=1}^{n}{{\omega }_{k}}=1$, to achieve weighted fusion of multivariable features. At the same time, a physical feature extraction submodule is embedded to specifically extract derived physical features, such as wind shear and temperature gradient, which are directly related to turbulence exchange. Feature concatenation enhances the model’s ability to identify key thermodynamic driving factors, providing high-quality feature inputs for the precise learning of parameterization terms.
Figure 3. Detailed structure diagram of the spatiotemporal neural network encoder-decoder
2.4 Multi-objective progressive learning strategy
The multi-objective progressive learning adopts a two-phase progressive process: "supervised pretraining - physics-enhanced training." The core innovation lies in achieving an ordered optimization from "accuracy-first" to "balancing physical consistency and mechanism discovery" through phased target decomposition. The first phase is supervised pretraining, where the core loss function is the MSE between the outputs of the physics simulator and the observed energy flow values, expressed as:
$L_{M S E}=\frac{1}{N} \sum_{i=1}^N\left\|\widehat{Q}_i-Q_i^{o b s}\right\|^2$ (4)
where, $N$ is the number of observation samples, ${{\hat{Q}}_{i}}$ is the energy flow value output by the simulator, and $Q_{i}^{\text{obs}}$ is the corresponding observed value. The AdamW optimizer is selected, with an initial learning rate of ${{10}^{-4}}$ and a weight decay coefficient of ${{10}^{-5}}$. The training termination condition is when the MSE on the validation set decreases by less than ${{10}^{-5}}$ over 10 consecutive epochs. The core goal is to guide the neural parameterizer through data supervision to generate a parameter field that roughly aligns with real-world observations, laying the foundation for subsequent physics-enhanced training. The second phase is physics-enhanced training, where three physical constraint losses are introduced to build the total loss function on top of the MSE loss:
$L_{{total }}=L_{M S E}+\lambda_1 L_{p h y}+\lambda_2 L_{ {sparse }}+\lambda_3 L_{ {disc }}$ (5)
where, ${{\lambda }_{1}},{{\lambda }_{2}},{{\lambda }_{3}}$ are loss weights. The physical residual regularization loss ${{L}_{\text{phy}}}$ combines energy conservation and momentum conservation constraints. The energy conservation residual is obtained by calculating the deviation of the simulated energy flow field from the energy balance equation:
$L_{{energy }}=\frac{1}{M} \sum_{j=1}^M\left|R_{n, j}-\widehat{H}_j-\widehat{L} E_j-\widetilde{G}_j\right|$ (6)
where, $M$ is the number of grid units. The momentum conservation residual is calculated by the deviation of the simulated wind field from the geostrophic balance relationship:
$L_{ {momentum }}=\frac{1}{M} \sum_{j=1}^M\left|f \vec{u}_j-\nabla p_j\right|$ (7)
where, $f$ is the Coriolis parameter, $\vec{u}_j$ is the wind vector, and ${{p}_{j}}$ is the air pressure. Finally, ${{L}_{\text{phy}}}={{L}_{\text{energy}}}+{{L}_{\text{momentum}}}$. The sparsity penalty loss is expressed as:
$L_{{sparse}}=\frac{1}{M} \sum_{j=1}^M\left|K_{\mathrm{h}, j}\right|$ (8)
This loss uses L1 regularization to suppress redundant parameter fluctuations and encourage spatial simplicity in the parameter field, consistent with the smooth spatial gradient characteristics of parameter distribution in thermodynamic processes. The discovery loss ${{L}_{\text{disc}}}$ aims to guide the model to learn an analytically solvable parameterization form. A power function based on wind speed and stability, ${{K}_{h}}=a{{u}^{b}}R_{i}^{c}$, is selected as a simplified analytical function, where $a,b,c$ are the parameters to be fitted, $u$ is wind speed, and ${{R}_{i}}$ is the Richardson number. The loss value is the MSE between the analytical function output and the neural parameterizer output:
$L_{d i s c}=\frac{1}{M} \sum_{j=1}^M\left\|K_{\mathrm{h}, j}^{n e t}-a u_j^b R i_j^c\right\|_2^2$ (9)
To ensure the convergence of the two-stage training process, while avoiding over-suppression of the data-driven adaptive capability by physical constraints, a multidimensional stability guarantee mechanism is designed. The core innovation lies in dynamically adjusting the loss weights and fine-tuning the gradient control. The loss weights follow a progressive incremental strategy: the physical residual regularization loss weight ${{\lambda }_{1}}$ increases linearly from 0.1 at the end of pretraining to 0.5, the sparsity penalty loss weight ${{\lambda }_{2}}$ is fixed at 0.01, and the discovery loss weight ${{\lambda }_{3}}$ increases from 0.05 to 0.2. This setup ensures the model first establishes data fitting capability and gradually strengthens physical consistency and mechanism discovery objectives. Gradient control uses gradient clipping, setting a global gradient norm threshold of 1.0. Gradients exceeding this threshold are clipped to prevent gradient explosion during training, ensuring smooth parameter updates. Learning rate adjustment follows a cosine annealing decay strategy, maintaining the initial learning rate of ${{10}^{-4}}$ during pretraining, and decaying it following a cosine curve during physics-enhanced training, with a minimum value of ${{10}^{-6}}$. This guarantees rapid convergence in the early stage and stability in parameter fine-tuning later. Additionally, after each round of parameter updates, a physical validity check step is added. If the parameter field exceeds the pre-set thermodynamic validity range, the corresponding gradients are decay-corrected to avoid distortion of the physical meaning of parameterization terms, ensuring the model always converges in a physically valid direction during training.
2.5 Closed-loop integrated prediction and uncertainty quantification module
The core logic of the closed-loop integrated prediction is to construct a “prediction-feedback-update” autoregressive operation loop to achieve continuous scenario simulation of short-term thermodynamic energy flow evolution. In the initialization phase, the initial surface temperature, radiative forcing, and other boundary conditions, along with high-dimensional climate data, are input into the model. The neural parameterizer generates the initial parameter field, and the physical simulator outputs the energy flow field and surface state variables at time $t$. When transitioning to the simulation at time $t+1$, the energy flow field and surface temperature at time $t$ are used as new boundary conditions, and the real-time high-dimensional climate data for $t+1$ are input into the model. The neural parameterizer then adapts the parameter field based on the updated inputs, and the physical simulator performs a new round of energy balance calculations under the new parameters and boundary constraints. This closed-loop design dynamically updates input boundary conditions and parameter fields, allowing the simulation process to adaptively capture the spatiotemporal evolution patterns of energy flow. To ensure thermodynamic consistency in the autoregressive process, a dual-check procedure is performed after each simulation step: the energy balance equation residual $\mid {{R}_{n}}-H-LE-G\mid$ is computed. If the residual exceeds the preset threshold of $10\,\text{W}\cdot {{\text{m}}^{-2}}$, the output parameters of the neural parameterizer are recalculated by backtracking and adjusting; simultaneously, the value range and spatial smoothness of the parameter field are checked to ensure the physical validity of the parameterization terms does not distort during the iterative process, thus maintaining the physical validity of continuous simulations.
Uncertainty quantification is implemented using the Monte Carlo Dropout technique. The core innovation is to deeply bind uncertainty quantification with the parameterization process, enabling accurate characterization and spatial tracing of cognitive uncertainty. The specific configuration is as follows: Dropout layers are embedded in the key layers of the encoder and decoder of the spatiotemporal neural parameterizer, with a Dropout rate set to 0.15. During the prediction phase, the Dropout layers are kept active, and 50 independent forward propagations are performed for the same input sample, generating 50 independent parameterization fields and energy flow prediction results. Cognitive uncertainty is quantified through the statistical dispersion characteristics of the parameterization fields. For each grid cell, the uncertainty of the turbulence exchange coefficient is represented by the standard deviation of the 50 sampling results:
$\sigma_{K_{\mathrm{h}}}(i, j)=\sqrt{\frac{1}{s-1} \sum_{s=1}^S\left(K_{\mathrm{h}}^s(i, j)-\bar{K}_{\mathrm{h}}(i, j)\right)^2}$ (10)
where, $S=50$ is the number of samples, $K_{h}^{s}\left( i,j \right)$ is the turbulence exchange coefficient for the $s$-th sample at grid point (i, j), and $\bar{K}_h(i, j)$ is the mean of the 50 samples. Uncertainty tracing is achieved by mapping ${{\text{ }\!\!\sigma\!\!\text{ }}_{{{\text{K}}_{\text{h}}}}}\text{(i, }\!\!~\!\!\text{ j)}$ to the spatial grid of the physical simulator, generating an uncertainty spatial distribution map. High standard deviation regions are identified as "uncertainty hotspot areas." Further analysis is performed to establish the intrinsic relationship between these hotspot regions and thermodynamic features: the proportion of different surface types in the hotspot regions is statistically calculated to determine the contribution of complex surfaces to uncertainty; the wind speed gradient and Richardson number distribution in the hotspot regions are computed to reveal the positive correlation between turbulence exchange mechanisms and uncertainty. This provides data support for locating fuzzy regions in thermodynamic mechanisms and focusing subsequent research efforts.
2.6 Model output definition
The model adopts a dual-output design, with the core advantage being the simultaneous consideration of prediction accuracy and physical interpretability, enabling the coordinated output of “prediction results - physical mechanisms.” The first type of output is a high-precision energy flow spatial distribution prediction map, covering the spatiotemporal distribution of sensible heat flux, latent heat flux, and surface temperature. The output results undergo thermodynamic consistency checks and strictly satisfy the energy balance constraints, and can be directly used for regional energy flow evolution scenario analysis and energy flow forecasting under extreme weather conditions. The second type of output is region-specific parameterization results with clear physical meaning, along with corresponding uncertainty estimates, including the turbulence exchange coefficient field, surface impedance field, and their spatial uncertainty distribution maps. The two types of outputs are tightly thermodynamically linked: the parameterization results are the core physical foundation for energy flow prediction, and their spatiotemporal heterogeneity directly determines the energy flow transport efficiency and distribution pattern. The accuracy of energy flow predictions, in turn, verifies the rationality of the parameterization mechanism. In terms of application value, the energy flow spatial distribution predictions provide high-precision data support for climate simulation and ecosystem assessment; the parameterization results and uncertainty estimates provide direct observational calibration for improving traditional regional thermodynamic parameterization schemes. Particularly, the mechanism analysis of uncertainty hotspot areas can guide targeted field observation experiment design and promote the iterative optimization of regional thermodynamic models.
The study area is located in the northern part of the North China Plain, a typical agricultural-urban fringe zone. This region has a complex underlying surface, including farmland, urban built-up areas, forested areas, and a small amount of mountainous terrain. The spatial heterogeneity of surface cover significantly causes differences in the spatiotemporal distribution of thermodynamic energy flow. Additionally, influenced by the monsoon climate, there are drastic seasonal fluctuations in radiative forcing and meteorological conditions. The evolution of energy flow in this region is both complex and representative, making it suitable for testing the model's adaptability to complex surfaces and heterogeneous conditions. The data system includes three core types of data: (1) High-dimensional climate data come from multi-source fusion products. Surface temperature, surface albedo, and other optical remote sensing data are sourced from the MODIS satellite product (spatial-temporal resolution: 250 m/8 d). Near-surface wind speed, temperature, relative humidity, air pressure, and other meteorological data come from the ERA5 reanalysis product (spatial-temporal resolution: 0.25°/1 h). Land cover data is taken from the GlobeLand30 product (spatial resolution: 30 m). The time span of the data is from January 2020 to December 2022, and after preprocessing, the data is unified to a spatial-temporal resolution of 1 km/ 1 h. (2) Energy flow observation data is sourced from continuous observation records from three eddy covariance stations in the region. The core variables include sensible heat flux, latent heat flux, net radiation, and soil heat flux. The observational data is quality-controlled and used for model training and validation, with data from 2020-2021 serving as the training set and data from 2022 as the independent validation set to ensure the objectivity of the validation process. (3) Auxiliary data includes the digital elevation model (DEM, spatial resolution: 30 m) and land use classification maps, which are used to construct terrain constraints and surface prior information for the parameterization process, enhancing the neural parameterizer’s ability to adapt to regional characteristics.
The model parameters are configured to accurately capture spatiotemporal features while ensuring physical consistency. The encoder of the spatiotemporal neural parameterizer consists of 3 layers of 3D convolution (kernel size: 5 × 5 × 3, stride: 1, padding: 2) and 1 layer of Bi-GRU (256 hidden units). The decoder consists of 3 layers of transposed convolution (matching the encoder's downsampling factor) and 1 layer of spatiotemporal attention module (8 attention heads, feature dimension: 128). The training parameters use the AdamW optimizer, with an initial learning rate of 1e-4 for the pretraining phase, batch size of 32, and 50 epochs. During the physical enhancement training phase, the learning rate decays according to a cosine annealing strategy, with a batch size of 32 and 30 epochs. The uncertainty quantification parameters include a Dropout rate of 0.15 and Monte Carlo sampling with 50 samples. The comparative experiments are divided into two categories: (1) Benchmark models: The traditional PGML model (which uses turbulence exchange coefficients calculated from empirical formulas as fixed parameters for input into the physical simulator) and the pure data-driven model (CNN-LSTM, with the same input-output configuration as our model) are used to verify the prediction accuracy advantages of the proposed model. (2) Ablation experiments: Three groups of comparisons are made by removing the neural parameterizer, removing the physical residual regularization loss, and removing the discovery loss. By comparing the prediction accuracy and physical consistency of different models, the core roles of each innovative component are verified.
4.1 Energy flow prediction accuracy verification
The spatiotemporal scale verification and extreme scenario verification results for energy flow prediction accuracy are shown in Tables 1 and 2. The proposed model demonstrates significant advantages under different spatiotemporal scales and extreme conditions, with the core advantage stemming from the deep integration of physical consistency and data-driven adaptability.
Table 1. Comparison of energy flow prediction accuracy of different models under different spatiotemporal scales and surface types (Units: MSE: W²·m⁻⁴; MAE: W·m⁻²; R: dimensionless)
|
Prediction Scale |
Surface Type |
Model |
Sensible Heat Flux (H) |
Latent Heat Flux (LE) |
Energy Conservation Error (W·m⁻²) |
|
Daily Scale |
Homogeneous Farmland |
Proposed Model |
MSE: 892.3, MAE: 22.1, R: 0.92 |
MSE: 1056.7, MAE: 25.3, R: 0.90 |
4.8 |
|
Traditional PGML |
MSE: 1456.2, MAE: 31.5, R: 0.84 |
MSE: 1689.4, MAE: 34.7, R: 0.82 |
8.6 |
||
|
CNN-LSTM |
MSE: 1289.5, MAE: 28.7, R: 0.87 |
MSE: 1523.6, MAE: 32.1, R: 0.85 |
11.2 |
||
|
Heterogeneous Urban-Rural Fringe |
Proposed Model |
MSE: 1123.5, MAE: 26.4, R: 0.89 |
MSE: 1321.8, MAE: 29.6, R: 0.87 |
5.3 |
|
|
Traditional PGML |
MSE: 1892.7, MAE: 38.2, R: 0.78 |
MSE: 2156.9, MAE: 42.5, R: 0.75 |
9.8 |
||
|
CNN-LSTM |
MSE: 1645.3, MAE: 34.5, R: 0.82 |
MSE: 1987.4, MAE: 39.8, R: 0.79 |
13.5 |
||
|
Weekly Scale |
Homogeneous Farmland |
Proposed Model |
MSE: 987.6, MAE: 24.3, R: 0.90 |
MSE: 1189.2, MAE: 27.8, R: 0.88 |
5.2 |
|
Traditional PGML |
MSE: 1589.3, MAE: 33.8, R: 0.82 |
MSE: 1821.5, MAE: 36.9, R: 0.80 |
9.1 |
||
|
CNN-LSTM |
MSE: 1423.7, MAE: 31.2, R: 0.85 |
MSE: 1698.3, MAE: 34.5, R: 0.83 |
12.1 |
||
|
Heterogeneous Urban-Rural Fringe |
Proposed Model |
MSE: 1256.8, MAE: 28.7, R: 0.87 |
MSE: 1456.9, MAE: 32.4, R: 0.85 |
5.7 |
|
|
Traditional PGML |
MSE: 2034.5, MAE: 40.6, R: 0.76 |
MSE: 2321.7, MAE: 44.8, R: 0.73 |
10.5 |
||
|
CNN-LSTM |
MSE: 1789.2, MAE: 36.9, R: 0.80 |
MSE: 2145.6, MAE: 41.2, R: 0.77 |
14.3 |
Table 2. Comparison of energy flow prediction accuracy of different models under extreme weather scenarios (Units: MSE: W²·m⁻⁴; MAE: W·m⁻²; R: dimensionless)
|
Extreme Scenario |
Model |
Sensible Heat Flux (H) |
Latent Heat Flux (LE) |
Accuracy Degradation Rate (Daily Scale vs. Extreme Scenario, %) |
|
High Temperature (T>35℃) |
Proposed Model |
MSE: 1245.8, MAE: 28.9, R: 0.88 |
MSE: 1489.3, MAE: 33.2, R: 0.85 |
15.2 (H), 17.3 (LE) |
|
Traditional PGML |
MSE: 2134.6, MAE: 42.8, R: 0.75 |
MSE: 2567.8, MAE: 48.6, R: 0.72 |
31.5 (H), 33.7 (LE) |
|
|
CNN-LSTM |
MSE: 1987.4, MAE: 39.6, R: 0.79 |
MSE: 2345.7, MAE: 45.3, R: 0.76 |
36.8 (H), 38.2 (LE) |
|
|
Strong Wind (u>8 m/s) |
Proposed Model |
MSE: 1189.5, MAE: 27.6, R: 0.89 |
MSE: 1398.6, MAE: 31.5, R: 0.86 |
13.5 (H), 15.7 (LE) |
|
Traditional PGML |
MSE: 1987.3, MAE: 39.5, R: 0.77 |
MSE: 2289.5, MAE: 44.7, R: 0.74 |
28.4 (H), 30.2 (LE) |
|
|
CNN-LSTM |
MSE: 1823.6, MAE: 36.8, R: 0.81 |
MSE: 2134.8, MAE: 41.9, R: 0.78 |
33.1 (H), 34.5 (LE) |
From Table 1, it can be seen that on the daily scale, the proposed model reduces the MSE of sensible heat flux by 38.7% compared to the traditional PGML and by 30.8% compared to CNN-LSTM for homogeneous farmland. For the heterogeneous urban-rural fringe, the MSE of sensible heat flux is reduced by 40.6% compared to traditional PGML and by 29.7% compared to CNN-LSTM. Although the weekly scale prediction accuracy slightly decreases, the advantage remains stable. Spatially, the accuracy improvement for heterogeneous surfaces is significantly higher than for homogeneous surfaces, indicating that the neural parameterizer effectively captures the heterogeneity of the underlying surface. Regarding energy conservation error, the proposed model consistently stays below 5.7 W·m⁻², far better than the 8.6~10.5 W·m⁻² of traditional PGML and 11.2~14.3 W·m⁻² of CNN-LSTM, which verifies the role of embedded physical consistency in ensuring the reliability of predictions.
The extreme scenario validation results in Table 2 show that the accuracy degradation rate of the proposed model under high temperature and strong wind conditions is only 13.5%~17.3%, far lower than the 28.4%~33.7% for traditional PGML and 33.1%~38.2% for CNN-LSTM. In the high-temperature scenario, traditional PGML fails to adapt to the strong convection conditions due to fixed turbulence exchange coefficients, leading to a sharp increase in MSE for sensible heat flux prediction. CNN-LSTM, lacking physical constraints, exhibits numerical extrapolation bias. In contrast, the proposed model dynamically adjusts the turbulence exchange coefficients using the neural parameterizer and maintains energy balance under physical conservation constraints, effectively controlling prediction bias. In the strong wind scenario, the strict adherence to momentum conservation further reduces the deviation in energy flow simulation, verifying the core role of embedded physical consistency in enhancing model generalization capability.
4.2 Innovation component effectiveness analysis
The effectiveness verification results of the innovative components based on the ablation experiments are shown in Table 3. The synergistic effect of the neural parameterizer, multi-objective learning strategy, and uncertainty quantification module is the key to ensuring model performance. The independent contributions of each component are significant and complementary.
The effectiveness of the neural parameterizer is significant: when removed, the MSE of sensible heat flux increases by 75.7%, the MSE of latent heat flux increases by 69.3%, and the energy conservation error doubles, with the parameter reasonability coefficient dropping from 0.87 to 0.65. This indicates that fixed empirical parameters cannot adapt to the spatiotemporal heterogeneity of complex surfaces, whereas the neural parameterizer, driven by high-dimensional climate data, can precisely generate parameter fields that match regional features, providing core support for high-precision simulations. In the multi-objective learning strategy, removing physical residual regularization increases the energy conservation error from 4.8 to 8.5 W·m⁻², and the parameter reasonability coefficient drops by 0.11, verifying its key role in strengthening physical consistency. Removing the discovery loss decreases the parameter reasonability coefficient by 0.15, indicating that this loss term effectively guides the model to learn a simplified and physically intuitive parameterization form. The effectiveness of the uncertainty quantification module is verified by the confidence interval coverage, with the model achieving a coverage rate of nearly 94%, far exceeding the approximately 82% coverage rate of the model without the neural parameterizer. This demonstrates its ability to accurately depict cognitive uncertainty. The CRPS metric shows that the uncertainty quantification accuracy of the model is over 35% higher than that of the model without the neural parameterizer, and the uncertainty hotspot areas are primarily concentrated in regions with complex underlying surfaces such as urban-rural fringe zones and mountain-plain transitions, as well as areas with sparse observational data, which aligns highly with regional thermodynamic characteristics, achieving precise association between uncertainty and mechanisms.
Table 3. Impact of each component on model performance in ablation experiments (Units: MSE: W²·m⁻⁴; MAE: W·m⁻²; R: dimensionless; energy conservation error: W·m⁻²; Confidence interval coverage: %; CRPS: W·m⁻²)
|
Model Configuration |
Sensible Heat Flux (H) |
Latent Heat Flux (LE) |
Energy Conservation Error (W·m⁻²) |
Parameter Reasonability Coefficient |
Confidence Interval Coverage (H/LE, %) |
CRPS (H/LE) |
|
Proposed Model (All Components) |
MSE: 892.3, MAE: 22.1, R: 0.92 |
MSE: 1056.7, MAE: 25.3, R: 0.90 |
4.8 |
0.87 |
94.2 / 93.8 |
18.5 / 21.3 |
|
No Neural Parameterizer (Fixed Empirical Parameters) |
MSE: 1567.8, MAE: 33.2, R: 0.83 |
MSE: 1789.5, MAE: 36.8, R: 0.81 |
9.2 |
0.65 |
82.5 / 81.7 |
28.7 / 32.4 |
|
No Physical Residual Regularization |
MSE: 1089.4, MAE: 25.8, R: 0.89 |
MSE: 1267.8, MAE: 28.9, R: 0.87 |
8.5 |
0.76 |
90.3 / 89.5 |
22.3 / 25.6 |
|
No Discovery Loss |
MSE: 987.6, MAE: 24.5, R: 0.90 |
MSE: 1145.3, MAE: 27.2, R: 0.88 |
5.3 |
0.72 |
93.5 / 92.9 |
19.8 / 22.7 |
|
No Uncertainty Quantification Module |
MSE: 895.7, MAE: 22.3, R: 0.91 |
MSE: 1063.2, MAE: 25.6, R: 0.89 |
4.9 |
0.86 |
- / - |
- / - |
4.3 Discovery results of regional thermodynamic parameterization mechanism
The discovery results of the regional thermodynamic parameterization mechanism are the core innovation of this study, including the spatial distribution characteristics of parameterization terms, the derivable parameterization relationships, and the core thermodynamic driving mechanisms. The related results are shown in Table 4 and Table 5. The corresponding Figure 4 displays the regional turbulence exchange coefficient field ${{K}_{h}}$, surface impedance field ${{r}_{s}}$, and the corresponding uncertainty spatial distribution map generated by the model inversion. It highlights the parameter gradient changes in heterogeneous regions such as the urban-rural fringe.
Table 4 shows that the spatial distribution characteristics of parameterization terms are significant and consistent with thermodynamic principles: The turbulence exchange coefficient ${{K}_{h}}$ follows the distribution pattern of waterbody > bare land > forest > farmland > urban built-up area, which is negatively correlated with surface roughness—water bodies have smooth surfaces and low momentum exchange resistance, yielding the largest ${{K}_{h}}$; urban areas with dense buildings have high roughness, resulting in the smallest ${{K}_{h}}$. The surface impedance ${{r}_{s}}$, in contrast, shows an opposite trend, with urban built-up areas having the largest ${{r}_{s}}$ due to low vegetation cover and weak evaporation, while water bodies have the smallest ${{r}_{s}}$. Correlation analysis shows that ${{K}_{h}}$ is significantly positively correlated with both wind speed and temperature gradient, with coefficients higher than 0.85 for bare land and water, indicating that turbulence exchange in these regions is primarily driven by wind speed. For farmland and forest, the correlation with temperature gradient is higher, reflecting the important contribution of thermal convection to turbulence exchange. These distribution characteristics are consistent with the physical understanding of regional thermodynamic processes.
Figure 4. Regional spatial distribution heatmap
Table 4. Statistical results of turbulence exchange coefficient and surface impedance for different surface types (Units: ${{K}_{h}}$: m²·s⁻¹; ${{r}_{s}}$: s·m⁻¹)
|
Surface Type |
Turbulence Exchange Coefficient (Mean ± Standard Deviation) |
Surface Impedance (Mean ± Standard Deviation) |
Correlation Coefficient with Wind Speed (${{K}_{h}}-u$) |
Correlation Coefficient with Temperature Gradient (${{K}_{h}}-\nabla T$) |
|
Forest |
0.82 ± 0.15 |
856 ± 123 |
0.78 |
0.65 |
|
Farmland |
0.65 ± 0.12 |
623 ± 98 |
0.82 |
0.71 |
|
Urban Built-up Area |
0.48 ± 0.18 |
1258 ± 215 |
0.69 |
0.58 |
|
Bare Land |
0.95 ± 0.21 |
489 ± 76 |
0.85 |
0.78 |
|
Waterbody |
1.23 ± 0.25 |
325 ± 58 |
0.88 |
0.62 |
Table 5. Comparison of fit performance between newly discovered parameterization relationships and existing empirical formulas
|
Parameterization Relationship Type |
Expression |
Fit Quality (R²) |
Root Mean Square Error (RMSE, m²·s⁻¹) |
Applicable Surface Types |
|
New Discovered Relationship (This Model) |
$\text{Kh=0}\text{.087}{{\text{u}}^{\text{1}\text{.23}}}\text{R}{{\text{i}}^{\text{-0}\text{.32}}}$ |
0.89 |
0.078 |
All Regions (Multi-Surface Mix) |
|
Businger-Dyer Formula (Traditional) |
$\text{Kh=0}\text{.094uR}{{\text{i}}^{\text{-0}\text{.5}}}$ |
0.72 |
0.156 |
Homogeneous Farmland |
|
Zhang Formula (Improved Traditional) |
${{\text{K}}_{\text{h}}}\text{=0}\text{.075}{{\text{u}}^{\text{1}\text{.1}}}\text{R}{{\text{i}}^{\text{-0}\text{.4}}}$ |
0.78 |
0.123 |
Forest / Farmland Mix |
Figure 5. Fit comparison of newly discovered parameterization relationship with existing empirical formulas
Table 5 shows that the newly discovered parameterization relationship has significant advantages: its fit quality ${{R}^{2}}$ reaches 0.89, improving by 23.6% compared to the traditional Businger-Dyer formula and by 14.1% compared to the improved Zhang formula. The RMSE is only 0.078 m²·s⁻¹, more than 50% lower than the traditional formulas. Traditional formulas are mostly suitable for single homogeneous surfaces, while the new relationship, constrained by discovery loss, integrates the thermodynamic characteristics of multiple surface types and can adapt to the turbulence exchange laws of different regions. Combined with energy flow prediction results and parameter distribution characteristics, the core driving mechanisms of regional energy flow evolution can be clearly identified: In the plain farmland region, the turbulence exchange coefficient has a high correlation with the temperature gradient, and energy flow transmission is dominated by thermal convection, with latent heat flux accounting for more than 60%. In the urban built-up area, the large surface impedance and weak turbulence exchange lead to a significant increase in sensible heat flux, which accounts for 55%~65%, forming an "urban heat island" related energy flow distribution pattern. In the mountain-plain transition zone, the surface heterogeneity causes dramatic fluctuations in the turbulence exchange coefficient, making it a hotspot area for uncertainty in energy flow prediction. In this region, energy flow transmission is jointly dominated by terrain dynamics and thermal effects. Corresponding to Table 5, Figure 5 displays the comparison of fit performance between the newly discovered parameterization relationship, the traditional Businger-Dyer formula, and observational data. The X-axis represents the physical driving variables, and the Y-axis represents the turbulence exchange coefficient, showing the data point distribution and the fitting curves of different formulas.
The core academic breakthrough of the dual-path physical-data hybrid framework lies in achieving the paradigm shift from fixed physical constraints to discoverable physical structures in PGML, providing a new paradigm for the role of machine learning in thermodynamic research. Traditional PGML embeds physical laws as external static constraints in models, essentially still limited to data fitting and surrogate prediction. In contrast, this framework transforms physical laws into inherently optimizable structures through the collaborative design of a differentiable physical kernel and spatiotemporal neural parameterizer, allowing machine learning to actively participate in the discovery of thermodynamic parameterization mechanisms. This shift not only improves prediction accuracy but also achieves a leap from "black-box prediction" to "mechanism discovery." The newly discovered regional-adaptive turbulence exchange coefficient relationship compensates for the inability of traditional empirical formulas to adapt to complex surface mixed regions and provides direct theoretical support and data for improving the parameterization system of regional thermodynamic energy balance models. It deepens our understanding of the coexistence of multiple driving mechanisms of energy flow transmission under complex surface conditions.
Compared with existing studies, this framework shows significant advantages in the synergistic improvement of physical consistency, generalization, and interpretability. Compared to traditional PGML models, this framework, through the adaptive learning of the neural parameterizer, eliminates the reliance on fixed empirical parameters, achieving more than 30% improvement in prediction accuracy in heterogeneous surface regions and a 50% reduction in accuracy degradation rate under extreme scenarios, highlighting the enhancement of physical models by data-driven adaptive capability. Compared to purely data-driven models, the embedded thermodynamic conservation constraints fundamentally avoid non-physical predictions under extreme conditions, with energy conservation errors controlled within 5 W·m⁻², showing significantly better physical consistency than existing models. Furthermore, the design of the framework in terms of high-dimensional climate data fusion efficiency and uncertainty traceability is unique: the cross-modal attention fusion module achieves precise weighting of multi-source climate variables, enhancing the identification efficiency of key thermodynamic driving factors; Monte Carlo Dropout, deeply integrated with the parameterization process, enables uncertainty quantification to directly trace back to the thermodynamic mechanism complexity of specific grid units, providing clear guidance for future research focus.
This study still has some limitations, which point out directions for future improvements. The parameterization accuracy in extremely complex terrain regions still has room for improvement, as the influence of terrain dynamics on turbulence exchange has not been fully captured, leading to less precise spatial heterogeneity of parameterization terms. In long-term autoregressive predictions, the cumulative effect of energy balance residuals causes prediction bias to gradually increase over time, affecting the reliability of long-term simulations. The preset analytical function form in the discovery loss has prior dependencies, which may limit the discovery of more complex parameterization mechanisms. To address these issues, future improvements can focus on three aspects: (1) Introducing a terrain feature extraction module based on graph neural networks, which will integrate terrain dynamic parameters to improve the driving factor system for parameterization terms. (2) Designing a residual correction mechanism, which will adjust the boundary conditions and parameter fields of long-term simulations through real-time observational data feedback, helping to suppress the accumulation of errors. (3) Using Bayesian optimization combined with symbolic regression methods to build a search mechanism for analytical functions without prior constraints, enhancing the model's ability to discover complex thermodynamic mechanisms. Additionally, the framework can be expanded to global-scale thermodynamic energy flow simulations, further testing its generalization ability and application value.
This study proposed a dual-path physics-data hybrid framework that achieves deep synergistic integration of high-dimensional climate data and thermodynamic physical laws. The core innovation lies in constructing a collaborative architecture of differentiable physical kernels and spatiotemporal neural parameterizers, which upgrades physical laws from traditional external fixed constraints to an internally discoverable and interpretable structure within the model. This effectively solved the challenges of unknown thermodynamic parameterizations and spatial heterogeneity adaptation under complex underlying surface conditions. The integration of multi-objective progressive learning strategies and closed-loop uncertainty quantification modules further strengthened the model's physical consistency and mechanism discovery capabilities, forming a complete technological system of "physical skeleton-data flesh-mechanism discovery."
Experimental results show that this model outperforms traditional PGML and CNN-LSTM in both spatiotemporal scale and extreme scenario validations. The MSE of sensible heat flux in heterogeneous urban-rural transition zones at the daily scale is reduced by 40.6% compared to traditional PGML. The advantage remains stable at the weekly scale, and the accuracy degradation rate under high temperature and strong wind conditions is far lower than that of the comparison models. Ablation experiments show that after removing the neural parameterizer, the MSE of sensible heat flux increases by 75.7%. The physical residual regularization ensures energy conservation, and the collaborative components enhance model performance. The discovery of regional thermodynamic mechanisms shows that the turbulence exchange coefficient follows the order: water > bare land > forest > farmland > urban built-up area. The newly discovered parameterization relationship has a goodness of fit (R²) of 0.89, significantly improving compared to traditional formulas, and clarifies the regional differences in energy flow drivers, such as thermal convection dominance in plain farmland and sensible heat dominance in urban built-up areas.
The academic contribution of this study lies in providing a new method for regional thermodynamic energy flow pattern prediction, advancing the paradigm shift in machine learning from alternative prediction to mechanism discovery and model enhancement in the field of thermodynamics. The practical value is reflected in how the discovered regional parameterization mechanism provides direct data support and theoretical basis for improving traditional climate model parameterization schemes. This is of great significance for enhancing climate simulation reliability, optimizing extreme weather warnings, and ecological environment assessments. Future research can further expand the model's scale adaptability, improve parameterization systems under complex terrain conditions, and promote its application in thermodynamic system simulations across larger areas.
This work was funded by the Jiangxi Provincial Department of Science and Technology (Grant Nos.: S2021ZXXMC0272, S2023ZXXMC0288); the Project of Jiangxi Provincial Department of Education (Grant No.: GJJ2405305); and the Science and Technology Project of Jiangxi Provincial Department of Transportation (Grant Nos.: 2025YB032, 2024ZG015, 2024YB009, 2024YB008, 2022H0026).
[1] Fisher, J.B., Melton, F., Middleton, E., Hain, C., Anderson, M., Allen, R., Wood, E.F. (2017). The future of evapotranspiration: Global requirements for ecosystem functioning, carbon and climate feedbacks, agricultural management, and water resources. Water Resources Research, 53(4): 2618-2626. https://doi.org/10.1002/2016WR020175
[2] Kashinath, K., Mustafa, M., Albert, A., Wu, J.L., Jiang, C., Esmaeilzadeh, S., Prabhat, N. (2021). Physics-informed machine learning: Case studies for weather and climate modelling. Philosophical Transactions of the Royal Society A, 379(2194): 20200093. https://doi.org/10.1098/rsta.2020.0093
[3] Watt-Meyer, O., Brenowitz, N.D., Clark, S.K., Henn, B., Kwa, A., McGibbon, J., Bretherton, C.S. (2024). Neural network parameterization of subgrid-scale physics from a realistic geography global storm-resolving simulation. Journal of Advances in Modeling Earth Systems, 16(2): e2023MS003668. https://doi.org/10.1029/2023MS003668
[4] Dueben, P.D., Bauer, P. (2018). Challenges and design choices for global weather and climate models based on machine learning. Geoscientific Model Development, 11(10): 3999-4009. https://doi.org/10.5194/gmd-11-3999-2018
[5] Beucler, T., Pritchard, M., Rasp, S., Ott, J., Baldi, P., Gentine, P. (2021). Enforcing analytic constraints in neural networks emulating physical systems. Physical Review Letters, 126(9): 098302. https://doi.org/10.1103/PhysRevLett.126.098302
[6] Hess, P., Aich, M., Pan, B., Boers, N. (2025). Fast, scale-adaptive and uncertainty-aware downscaling of Earth system model fields with generative machine learning. Nature Machine Intelligence, 7(3): 363-373. https://doi.org/10.1038/s42256-025-00980-5
[7] Le Bras, P., Sévellec, F., Tandeo, P., Ruiz, J., Ailliot, P. (2024). Selecting and weighting dynamical models using data-driven approaches. Nonlinear Processes in Geophysics, 31(3): 303-317. https://doi.org/10.5194/npg-31-303-2024
[8] Jung, M., Schwalm, C., Migliavacca, M., Walther, S., Camps-Valls, G., Koirala, S., Reichstein, M. (2020). Scaling carbon fluxes from eddy covariance sites to globe: synthesis and evaluation of the FLUXCOM approach. Biogeosciences, 17(5): 1343-1365. https://doi.org/10.5194/bg-17-1343-2020
[9] Zscheischler, J., Martius, O., Westra, S., Bevacqua, E., Raymond, C., Horton, R.M., Vignotto, E. (2020). A typology of compound weather and climate events. Nature Reviews Earth & Environment, 1(7): 333-347. https://doi.org/10.1038/s43017-020-0060-z
[10] Manikkan, S., Srinivasan, B. (2023). Transfer physics informed neural network: A new framework for distributed physics informed neural networks via parameter sharing. Engineering with Computers, 39(4): 2961-2988. https://doi.org/10.1007/s00366-022-01703-9
[11] Fast, J.D., Berg, L.K., Feng, Z., Mei, F., Newsom, R., Sakaguchi, K., Xiao, H. (2019). The impact of variable land-atmosphere coupling on convective cloud populations observed during the 2016 HI-SCALE field campaign. Journal of Advances in Modeling Earth Systems, 11(8): 2629-2654. https://doi.org/10.1029/2019MS001727
[12] Behrens, G., Beucler, T., Iglesias-Suarez, F., Yu, S., Gentine, P., Pritchard, M., Eyring, V. (2025). Simulating atmospheric processes in Earth system models and quantifying uncertainties with deep learning multi-member and stochastic parameterizations. Journal of Advances in Modeling Earth Systems, 17(4): e2024MS004272. https://doi.org/10.1029/2024MS004272
[13] Schneider, T., Lan, S., Stuart, A., Teixeira, J. (2017). Earth system modeling 2.0: A blueprint for models that learn from observations and targeted high-resolution simulations. Geophysical Research Letters, 44(24): 12-396. https://doi.org/10.1002/2017GL076101
[14] Han, Y., Zhang, G.J., Huang, X., Wang, Y. (2020). A moist physics parameterization based on deep learning. Journal of Advances in Modeling Earth Systems, 12(9): e2020MS002076. https://doi.org/10.1029/2020MS002076
[15] Hassanzadeh, P., Kuang, Z. (2016). The linear response function of an idealized atmosphere. Part I: Construction using Green’s functions and applications. Journal of the Atmospheric Sciences, 73(9): 3423-3439. https://doi.org/10.1175/JAS-D-15-0338.1
[16] Gentine, P., Pritchard, M., Rasp, S., Reinaudi, G., Yacalis, G. (2018). Could machine learning break the convection parameterization deadlock? Geophysical Research Letters, 45(11): 5742-5751. https://doi.org/10.1029/2018GL078202
[17] Cheruy, F., Campoy, A., Dupont, J.C., Ducharne, A., Hourdin, F., Haeffelin, M., Idelkadi, A. (2013). Combined influence of atmospheric physics and soil hydrology on the simulated meteorology at the SIRTA atmospheric observatory. Climate Dynamics, 40(9): 2251-2269. https://doi.org/10.1007/s00382-012-1469-y
[18] Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N., Prabhat, F. (2019). Deep learning and process understanding for data-driven Earth system science. Nature, 566(7743): 195-204. https://doi.org/10.1038/s41586-019-0912-1
[19] Schulz, J., Albert, P., Behr, H.D., Caprion, D., Deneke, H., Dewitte, S., Zelenka, A. (2009). Operational climate monitoring from space: The EUMETSAT Satellite Application Facility on Climate Monitoring (CM-SAF). Atmospheric Chemistry and Physics, 9(5): 1687-1709. https://doi.org/10.5194/acp-9-1687-2009
[20] Wu, S., Zhang, X., Bao, S., Dong, W., Wang, S., Li, X. (2023). Predicting ocean temperature in high-frequency internal wave area with physics-guided deep learning: A case study from the South China Sea. Journal of Marine Science and Engineering, 11(9): 1728. https://doi.org/10.3390/jmse11091728
[21] Bauer, P., Thorpe, A., Brunet, G. (2015). The quiet revolution of numerical weather prediction. Nature, 525(7567): 47-55. https://doi.org/10.1038/nature14956