A Garbage Classification and Environmental Impact Assessment Model Based on Image Recognition and Artificial Intelligence

Rong Lin

School of Law, Anhui Normal University, Wuhu 241000, China

Corresponding Author Email: linrongdida@ahnu.edu.cn

Page: 3001-3010 | DOI: https://doi.org/10.18280/ts.410618

Received: 9 June 2024 | Revised: 12 October 2024 | Accepted: 7 November 2024 | Available online: 31 December 2024

© 2024 The author. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

With the rapid urbanization process, waste management has become a significant environmental issue globally. Waste sorting, as an effective method of resource recycling and environmental protection, has gradually become a key solution to the waste pollution problem. Traditional waste classification methods rely on manual labor, which is inefficient and prone to errors, making them inadequate for modern urban waste management. In recent years, image recognition and artificial intelligence (AI)-based methods for waste classification have gained widespread attention, with deep learning techniques, particularly Convolutional Neural Networks (CNNs), showing great potential in waste sorting. However, existing research on waste classification models faces challenges such as imperfect network structures, insufficient training data, and poor environmental adaptability, which limit their application in complex environments. This study proposes a waste classification model based on image recognition and AI to enhance classification accuracy and efficiency. First, an improved PCANet and SDenseNet network structure is combined to propose a new feature extraction and representation method, enhancing the model’s feature learning ability. Secondly, a layered learning strategy, combined with the traditional backpropagation algorithm, is used to optimize the training process and improve learning efficiency. Finally, experimental results demonstrate that the proposed waste classification model significantly outperforms traditional models in classification accuracy and processing capability in various environments, providing a new solution for the advancement of waste classification technologies.

Keywords: 

waste classification, image recognition, Artificial Intelligence (AI), deep learning, network structure, backpropagation

1. Introduction

With the continuous advancement of urbanization, waste management has become a significant issue faced by modern society. Waste classification not only helps with the effective recycling and reuse of resources but also reduces pollution caused by landfilling and incineration, thus improving the ecological environment and the quality of human life [1-4]. However, traditional waste classification methods rely on manual identification and sorting, which are inefficient and prone to errors [5-7]. In recent years, waste classification technology based on image recognition and AI has gradually become an important means to address this issue [8-10]. Through image recognition technology, automatic classification of waste not only improves classification efficiency but also enables more precise processing, thereby reducing human labor intensity and minimizing potential errors in the waste sorting process.

In the fields of waste classification and environmental protection, AI technologies, especially deep learning and image recognition technologies, have shown great potential [11, 12]. The use of deep neural networks for image feature extraction and classification has become the mainstream direction in current waste classification research. Related studies have shown that using deep learning models for waste image recognition can significantly improve classification accuracy and processing efficiency [13-17]. However, existing waste classification systems still face certain challenges in terms of model accuracy, training efficiency, and environmental adaptability. These challenges mainly arise from imperfect network structures, limitations in feature extraction, and issues like overfitting and insufficient generalization ability during the training process [18, 19]. Therefore, further improving the network architecture and training strategy of waste classification models and optimizing classification results is of significant importance for enhancing the practicality and popularization of waste classification technologies.

Although many deep learning-based waste classification methods currently exist, they still have some shortcomings. For example, many existing models struggle to achieve ideal classification results when faced with waste images in complex environments, especially in cases of lighting variation, occlusion, and the diversity of waste types [20]. In addition, in existing methods, the model training process often relies on a large amount of labeled data and lacks effective adaptive strategies to address the balance between classification accuracy and computational efficiency. Therefore, designing a waste classification model that can both improve accuracy and perform well in complex environments has become a key issue that needs to be solved.

The main research content of this paper includes three aspects. First, a waste classification model network structure based on the improved PCANet and SDenseNet cascade is proposed, which enhances the model’s ability in feature extraction and representation by combining the advantages of both. Second, the study explores the training process that combines the layered learning strategy of the improved PCANet with the traditional backpropagation of SDenseNet cascade, thus improving the model's learning efficiency and accuracy. Finally, through experiments, the practical effect of the proposed model in waste classification tasks is validated, demonstrating its potential for application in different environments and scenarios. This research not only provides new ideas for the optimization of waste classification technology but also offers theoretical support and technical assurance for the construction of environmental protection and intelligent waste management systems.

2. Network Structure of the Waste Classification Model

Figure 1. Schematic diagram of the improved PCANet network structure

The waste classification model proposed in this paper combines the improved PCANet and SDenseNet cascade structure, aiming to improve the feature extraction ability and classification accuracy in waste classification tasks. In waste classification tasks, the diversity and complexity of images require the model to have strong feature learning ability. The improved PCANet can effectively avoid information loss in this regard, enhancing the model’s adaptability to different types of waste images, particularly when facing interference from different angles, lighting, and background noise, showing higher robustness. As a sub-network, SDenseNet further extracts high-level features based on the low-level features extracted by the improved PCANet. Through the alternating cascade design of dense modules and transition modules, SDenseNet not only enhances the transfer and reuse of features but also improves the network's expressive power by introducing deeper nonlinear mappings. This design not only improves the model's ability to handle complex backgrounds and details of waste images but also, through global average pooling (GAP) and fully connected layers, ultimately produces accurate classification results. Figure 1 shows the schematic diagram of the improved PCANet network structure.

In the first layer of the improved PCANet, each convolutional kernel group contains multiple convolutional templates, each of size $j_1 \times j_2$. These templates are convolved with the original image to extract $v_h$ groups of feature maps. The number of convolution kernels in each group and the template size can be set according to actual needs to achieve effective feature extraction for different types of waste images. Applying multiple sets of convolution kernels in parallel lets the network extract image features in multiple directions and at multiple scales simultaneously, capturing the diversity and detail of the image, so that it maintains a high feature extraction ability even with complex backgrounds and diverse waste categories. Through the "channel-wise pooling" operation, the network further enhances the nonlinear expression ability of the feature maps while compressing their number, which avoids information redundancy and reduces computational complexity. Let the first layer have $v_h$ groups of convolution kernels $\{J_{n,h} \mid n=1,2,\ldots,v_n;\ h=1,2,\ldots,v_h\}$, and let the n-th feature map of the h-th group be denoted $D_{u,n,h}$. The $v_h$ groups of feature maps extracted from the original image $U_u$ ($u=1,2,\ldots,v_u$) can be expressed as:

$D_{u, n, h}=U_u \otimes J_{n, h}$     (1)

Assuming that the absolute value and maximum value functions are represented by ABS(·) and MAX(·), the combined feature map Du,n can be expressed as:

$D_{u, n}=\underset{h \in\left\{1,2, \ldots, v_h\right\}}{\operatorname{MAX}}\left(\operatorname{ABS}\left(D_{u, n, h}\right)\right)$     (2)

Since the maximum is taken over the $v_h$ groups, the number of combined feature maps $D_{u,n}$ is $v_n$.
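Eqs. (1) and (2) can be illustrated with a minimal NumPy sketch. It assumes a single-channel image, odd kernel sizes, zero padding, and randomly generated kernels in place of learned ones; the function names `conv2d_same` and `first_layer` are hypothetical and not taken from the paper.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Zero-padded 2D filtering (correlation form) that keeps height and width."""
    j1, j2 = kernel.shape
    padded = np.pad(image, ((j1 // 2, j1 // 2), (j2 // 2, j2 // 2)))
    out = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + j1, c:c + j2] * kernel)
    return out

def first_layer(image, kernels):
    """kernels has shape (v_h, v_n, j1, j2): v_h groups of v_n kernels each.
    Returns the v_n combined feature maps D_{u,n}."""
    v_h, v_n = kernels.shape[:2]
    # Eq. (1): one feature map D_{u,n,h} per kernel index n and group h.
    maps = np.array([[conv2d_same(image, kernels[h, n]) for n in range(v_n)]
                     for h in range(v_h)])
    # Eq. (2): channel-wise pooling, the maximum absolute response over groups h.
    return np.abs(maps).max(axis=0)

rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernels = rng.standard_normal((3, 4, 3, 3))   # v_h = 3, v_n = 4, 3x3 templates
combined = first_layer(image, kernels)
print(combined.shape)  # (4, 8, 8)
```

The pooling step reduces $v_h \times v_n$ maps to $v_n$ maps, which is the compression of the feature-map count described above.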

In the second layer of the improved PCANet, the input is the set of feature maps $D_{u,n}$ from the first layer. The number of groups $v_h$ is the same as in the first layer, but the number of convolution kernels $v_{n'}$ and the kernel size $j_1 \times j_2$ can be adjusted according to task requirements. The design of the convolutional kernel groups and templates in this layer allows the network to extract higher-level and more abstract image features from the low-level features extracted in the first layer. Each convolution operation produces a feature map, and "channel-wise pooling" then combines multiple feature maps into new combined feature maps. Thus, the output of the second layer not only increases the number of feature maps but also, through the combination of different convolution kernel groups, further enhances the network's ability to express image features. This design enables the improved PCANet to better adapt to the complex features of waste images, such as diverse shapes, colors, and materials, and to gradually build accurate classification abilities for waste types from low-level to high-level features. Specifically, let the convolution kernels in this layer be $\{J'_{n',h} \mid n'=1,2,\ldots,v_{n'};\ h=1,2,\ldots,v_h\}$. Following Eq. (1), the feature maps $D'_{u,n,n',h}=D_{u,n} \otimes J'_{n',h}$ are obtained, and following Eq. (2), the combined feature maps of the second layer $\{D'_{u,n,n'} \mid n=1,2,\ldots,v_n;\ n'=1,2,\ldots,v_{n'}\}$ are obtained.

By merging the original image $U_u$ with $D_{u,n}$ and $D'_{u,n,n'}$ along the channels, the final output $P_u^L$ of the improved PCANet is obtained:

$P_u^L=\operatorname{CONCAT}\left(U_u, D_{u, n}, D_{u, n, n^{\prime}}^{\prime}\right)$     (3)

From the above, it can be seen that the spatial dimensions (height and width) of $P_u^L$ are consistent with those of $U_u$.
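The channel bookkeeping of Eq. (3) can be sketched as follows. This is only an illustration under the assumption of a single-channel input image; the kernel counts and the helper name `pcanet_output` are hypothetical.

```python
import numpy as np

def pcanet_output(image, first_maps, second_maps):
    """Eq. (3): stack the image, the first-layer maps D_{u,n} and the
    second-layer maps D'_{u,n,n'} along a leading channel axis."""
    h, w = image.shape
    second_flat = second_maps.reshape(-1, h, w)   # (v_n * v_n', H, W)
    return np.concatenate([image[None], first_maps, second_flat], axis=0)

v_n, v_n2 = 4, 3                                  # illustrative kernel counts
image = np.zeros((8, 8))
first_maps = np.ones((v_n, 8, 8))
second_maps = np.full((v_n, v_n2, 8, 8), 2.0)
out = pcanet_output(image, first_maps, second_maps)
print(out.shape)  # (17, 8, 8): 1 + v_n + v_n * v_n' channels
```

Concatenation is only possible because every feature map keeps the height and width of the input image.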

In the waste classification model proposed in this paper, SDenseNet is designed to further process the low-level feature maps obtained from the improved PCANet and ultimately complete the waste classification task. The core structure of SDenseNet alternates dense modules and transition modules, with each dense module consisting of two cascaded "BN-Conv-LReLU" convolution units. Each convolution unit contains a batch normalization layer, a convolution layer, and a Leaky ReLU activation layer, aiming to enhance the network's nonlinear expression ability and accelerate training. For the waste classification task, the dense module can utilize detailed information in the low-level feature maps and gradually build high-level feature representations capable of distinguishing different types of waste. The dense connections and channel concatenation of SDenseNet allow the network to efficiently extract and reuse useful features from images, giving it strong robustness and adaptability when dealing with waste of different materials, shapes, and colors. The transition module is responsible for feature combination and downsampling: through a "BN-Conv-LReLU" unit combined with an average pooling layer, it reduces the spatial dimensions of the features produced by the dense module. However, the transition module does not compress the number of channels, which helps retain more feature information and improves the network's classification accuracy for waste images. Because dense modules and transition modules alternate, SDenseNet can construct rich feature representations while controlling the network's computational complexity and parameter count. This makes it suitable for practical problems such as waste classification, especially when large volumes of image data must be processed efficiently, providing a good balance of performance and efficiency. Figure 2 shows the schematic diagram of the SDenseNet network structure.

Figure 2. Schematic diagram of the SDenseNet network structure

In the waste classification model proposed in this paper, SDenseNet adopts several design strategies to improve feature extraction ability while reducing the number of parameters, thereby enhancing the model’s efficiency and accuracy.

(1) SDenseNet uses only two small-sized convolution kernels, 1×1 and 3×3, in the convolutional layers. The 1×1 convolution kernel is primarily used to control the number of feature maps and effectively reduce the computational load. This design helps maintain sufficient feature representation capabilities while significantly reducing the model's complexity, making it suitable for the challenges of large datasets and complex backgrounds in waste classification tasks. The 3×3 convolution kernel is used to learn new feature maps and capture local information in the image, which is crucial for recognizing features like the shape and material of waste.

(2) SDenseNet frequently employs skip connections to reuse the feature maps that have already been extracted. In waste classification tasks, the diversity and complexity of image features require the model to effectively leverage the existing feature maps for deeper learning. Skip connections, by connecting feature maps from different layers, enhance the flow and fusion of information between layers, thus improving the model's adaptability to complex features in waste images and its classification accuracy.

(3) SDenseNet uses a GAP layer to compress features between the fully connected layer and the last convolution layer. This design is particularly important for waste classification tasks, as it effectively reduces model complexity, avoids introducing excessive redundant parameters during classification, and improves computational efficiency while preventing overfitting.

(4) SDenseNet uses only a single fully connected layer, which is employed to adjust the network’s output to match the number of categories in waste classification. In waste classification tasks, the number of categories is typically limited, so a single fully connected layer is sufficient to effectively adjust the output while maintaining the network’s simplicity and efficiency. Through this design, SDenseNet not only optimizes the parameter count but also significantly improves computational efficiency while maintaining high classification accuracy, making it suitable for large-scale waste classification tasks.

In conclusion, SDenseNet, through these four design strategies, ensures effective feature extraction while reducing the number of parameters. For the problem of waste classification, especially when handling large volumes of images of different types of waste, these designs provide an efficient, precise, and computationally resource-friendly solution.
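The four strategies can be illustrated with a deliberately simplified NumPy forward pass. It is a sketch under strong assumptions, not the SDenseNet architecture itself: batch normalization and the 3×3 convolutions are omitted, weights are random, and the channel counts and function names are hypothetical. It only shows how a 1×1 convolution controls the number of feature maps, how concatenation reuses earlier maps, and how GAP followed by a single fully connected layer produces class probabilities.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def conv1x1(x, weight):
    """1x1 convolution on a (C_in, H, W) tensor; weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def dense_step(x, weight):
    """Simplified unit: 1x1 conv + Leaky ReLU, then concatenation with the
    input so that earlier feature maps are reused (skip connection)."""
    new = leaky_relu(conv1x1(x, weight))
    return np.concatenate([x, new], axis=0)

def classify(x, fc_weight, fc_bias):
    """Global average pooling, one fully connected layer, then softmax."""
    pooled = x.mean(axis=(1, 2))                  # one value per channel
    logits = fc_weight @ pooled + fc_bias
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
x = rng.standard_normal((17, 8, 8))               # hypothetical input channels
x = dense_step(x, rng.standard_normal((6, 17)))   # 17 -> 23 channels
x = dense_step(x, rng.standard_normal((6, 23)))   # 23 -> 29 channels
probs = classify(x, rng.standard_normal((4, 29)), np.zeros(4))
print(x.shape, probs.shape)  # (29, 8, 8) (4,)
```

Because GAP reduces each channel to one number, the fully connected layer here needs only 29 × 4 weights regardless of the image size.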

3. Learning Process of the Waste Classification Model

The learning process of the waste classification model involves completing the learning steps of the improved PCANet and the training steps of SDenseNet in sequence.

In the proposed waste classification model, the improved PCANet learns the convolution kernels of its first layer through unsupervised learning. This process aims to automatically learn discriminative convolutional kernels through region segmentation and clustering of the image. Specifically, each training image is first divided into multiple non-overlapping small regions, each reflecting the local features of one part of the image. The content of these regions varies significantly and may represent different waste categories or background information. To effectively extract features from these different regions, the improved PCANet applies the K-means clustering algorithm to all regions, grouping similar regions together to form multiple distinct region sets. The image regions in each set have similar visual features, which helps the network capture common features of different waste categories. Figure 3 shows a schematic diagram of the first-layer convolution kernel learning process of the improved PCANet.

Based on these region sets, the improved PCANet next learns convolution kernels for each region set. Since the data in each region set have variations, the convolution kernel corresponding to each clustered set will have different feature expressions, which allows the improved PCANet to learn a rich variety of convolution kernels. In the waste classification task, these convolution kernels not only extract low-level features such as texture, shape, and edge information from waste images, but also, through the aggregation of multiple region sets, extract unique features for different waste categories.
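The region splitting and clustering step can be sketched as below. The paper does not specify the region size, the number of clusters, or the K-means initialization, so the values here (8×8 regions, three clusters, random initial centers) and the helper names are assumptions; the clustering is a plain Lloyd iteration on random synthetic images.

```python
import numpy as np

def split_regions(image, size):
    """Cut an (H, W) image into non-overlapping size x size regions."""
    h, w = image.shape
    return np.array([image[r:r + size, c:c + size]
                     for r in range(0, h - size + 1, size)
                     for c in range(0, w - size + 1, size)])

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns a cluster label for every point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = ((points[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):               # keep old center if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels

rng = np.random.default_rng(1)
images = rng.random((5, 16, 16))                  # synthetic stand-ins for waste images
regions = np.concatenate([split_regions(img, 8) for img in images])
labels = kmeans(regions.reshape(len(regions), -1), k=3)
region_sets = [regions[labels == h] for h in range(3)]
print(regions.shape, [len(s) for s in region_sets])
```

Each entry of `region_sets` plays the role of one region set, from which one group of convolution kernels is learned.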

Figure 3. Schematic diagram of the learning process of the first layer convolution kernels in the improved PCANet

For example, suppose the h-th region set contains $v_e$ image regions. Each image region is split pixel by pixel into image patches of size $j_1 \times j_2$. These image patches form the set O, where each image patch is a 2D matrix representing local information of the image. Each patch is vectorized into a 1D vector $O_o$. Next, the mean vector $\bar{O}$ of all image patches is calculated and subtracted from each image patch vector. All centered vectors are then combined into a data matrix $W=[O_1-\bar{O}, O_2-\bar{O}, \ldots, O_{v_o}-\bar{O}]$, where $v_o$ is the number of patches, which contains the feature information of all image patches. In this way, PCANet can extract local features from each small region of a waste image, providing the foundation for subsequent feature learning and classification. The Principal Component Analysis (PCA) algorithm is then applied to matrix W to find a mapping matrix N such that the result after dimensionality reduction and recovery is as close as possible to the original data. The core idea of PCA is to identify the principal components in the data, project the data into a lower-dimensional space, retain the most important feature information, and discard redundant or unimportant information. Assuming that the number of columns in the mapping matrix N is $v_n$, the identity matrix of size $v_n \times v_n$ is $U_{v_n}$, and the Frobenius norm is denoted $\|\cdot\|_F$, the problem can be written as:

$\min _{N \in E^{j_1 j_2 \times v_n}}\left\|W-N N^T W\right\|_F^2, \quad \text{s.t. } N^T N=U_{v_n}$     (4)

Solving this problem requires performing an eigenvalue decomposition of the matrix $WW^T$ and taking its $v_n$ largest eigenvalues. The eigenvectors corresponding to these eigenvalues are combined to form the solution $N=[N_1,N_2,\ldots,N_n,\ldots,N_{v_n}]$. Let the n-th convolution kernel obtained from the h-th region set be $J_{n,h}$, and let $J_{n,h}(u,k)$ be its value at position $(u,k)$. Each eigenvector $N_n$ can be transformed into a convolution kernel through the following mapping:

$J_{n, h}(u, k)=N_n\left((k-1) \times j_1+u\right)$     (5)

Other region sets of waste images are computed similarly to obtain vh sets of convolution kernels.
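The kernel learning for one region set, Eqs. (4) and (5), can be sketched as follows. The region data are random and the function names are hypothetical; the sketch assumes that the overall mean is taken as a mean vector over all patches and that Eq. (5) corresponds to a column-major reshape of each eigenvector.

```python
import numpy as np

def extract_patches(region, j1, j2):
    """All j1 x j2 patches taken pixel by pixel, each vectorized column-major."""
    h, w = region.shape
    return np.array([region[r:r + j1, c:c + j2].flatten(order="F")
                     for r in range(h - j1 + 1)
                     for c in range(w - j2 + 1)])

def learn_kernels(region_set, j1, j2, v_n):
    patches = np.concatenate([extract_patches(r, j1, j2) for r in region_set])
    mean_vec = patches.mean(axis=0)               # overall mean vector of all patches
    W = (patches - mean_vec).T                    # shape (j1*j2, number of patches)
    _, eigvecs = np.linalg.eigh(W @ W.T)          # eigenvalues in ascending order
    N = eigvecs[:, ::-1][:, :v_n]                 # v_n leading eigenvectors, Eq. (4)
    # Eq. (5): J(u, k) = N_n((k - 1) * j1 + u), i.e. a column-major reshape.
    kernels = np.array([N[:, n].reshape(j1, j2, order="F") for n in range(v_n)])
    return kernels, N

rng = np.random.default_rng(0)
region_set = rng.random((6, 8, 8))                # synthetic stand-in for one region set
kernels, N = learn_kernels(region_set, 3, 3, v_n=4)
print(kernels.shape)                              # (4, 3, 3)
print(np.allclose(N.T @ N, np.eye(4)))            # True
```

The second printed value confirms the orthonormality constraint of Eq. (4) on the learned mapping matrix.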

Because different types of waste differ significantly in local appearance, such as shape, color, and texture, the improved PCANet subtracts the overall mean $\bar{O}$ of all image patches from each vector $O_o$, rather than calculating and subtracting the mean of each image patch individually. In this way, the improved PCANet better preserves the differences between regions of waste images when constructing the data matrix W, and the distribution of the data matrix better reflects the global information and feature differences between regions. The PCA algorithm can then identify the main eigenvalues and eigenvectors of the matrix $WW^T$ and extract the most discriminative feature components, which helps the improved PCANet learn more diverse and discriminative convolution kernels. Let $\eta_n$ and $N_n$ be the eigenvalues and eigenvectors of the matrix $WW^T$, respectively; then the following equation holds:

$W W^T N_n=\eta_n N_n$     (6)

Assume that the element of matrix W in the u-th row and o-th column is $W_{u,o}$, and the u-th element of $N_n$ is $n_u$ ($u=1,2,\ldots,j_1 j_2$; $n=1,2,\ldots,v_n$). Then:

$\left[\begin{array}{ccc}\sum_o W_{1, o} W_{1, o} & \cdots & \sum_o W_{1, o} W_{j_1 j_2, o} \\ \vdots & \ddots & \vdots \\ \sum_o W_{j_1 j_2, o} W_{1, o} & \cdots & \sum_o W_{j_1 j_2, o} W_{j_1 j_2, o}\end{array}\right]\left[\begin{array}{c}n_1 \\ \vdots \\ n_{j_1 j_2}\end{array}\right]=\eta_n\left[\begin{array}{c}n_1 \\ \vdots \\ n_{j_1 j_2}\end{array}\right]$     (7)

Both sides of the above equation are column vectors of length $j_1 j_2$. Since the two vectors are equal, the sums of their elements, written here with the 1-norm notation, are also equal, i.e.,

$\sum_k \sum_u\left(n_u\left(\sum_o W_{k, o} W_{u, o}\right)\right)=\eta_n\left\|N_n\right\|_1$    (8)

Rearranging the order of summation on the left-hand side gives:

$\sum_u\left(n_u \sum_o\left(W_{u, o} \sum_k W_{k, o}\right)\right)=\eta_n\left\|N_n\right\|_1$     (9)

In the original PCANet, the data matrix W was constructed by subtracting from each image patch $O_o$ its own mean. As a result, every column of W summed to zero, i.e., $\sum_k W_{k,o}=\|O_o\|_1-j_1 j_2 \bar{O}=0$ (with $\bar{O}$ here being the mean of that patch), which by Eq. (9) forces the elements of every eigenvector with a nonzero eigenvalue to sum to zero. Under this constraint, the convolution kernels $J_{n,h}$ derived from the original PCANet only compute gradient values of the image in different directions and fail to extract other key features, such as grayscale and shape. In waste classification tasks, this limitation is detrimental to capturing the complex details and rich local features of images, which could reduce classification performance. In contrast, the improved PCANet constructs matrix W by subtracting the overall mean $\bar{O}$ from each image patch $O_o$. This avoids the zero-sum constraint of the original PCANet, since the sum of most column vectors in matrix W is non-zero, i.e., $\sum_k W_{k,o} \neq 0$. The eigenvectors computed by the PCA algorithm are therefore no longer restricted by the zero-sum condition, and the convolution kernels $J_{n,h}$ can be more flexible and diverse. More diverse convolution kernels can better capture the complex features in waste images, such as color, shape, and texture, thereby improving the classification model’s performance.
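The zero-sum constraint can be checked numerically. The sketch below uses random synthetic 3×3 patches rather than waste images and compares the two centering schemes; the helper name `top_eigvecs` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
patches = rng.random((200, 9))                    # 200 vectorized 3x3 patches (synthetic)

def top_eigvecs(W, v_n):
    """Leading v_n eigenvectors of W W^T."""
    _, vecs = np.linalg.eigh(W @ W.T)
    return vecs[:, ::-1][:, :v_n]

# Original PCANet: subtract each patch's own mean.
W_patch = (patches - patches.mean(axis=1, keepdims=True)).T
# Improved PCANet as described above: subtract the overall mean vector.
W_global = (patches - patches.mean(axis=0)).T

sums_patch = top_eigvecs(W_patch, 4).sum(axis=0)
sums_global = top_eigvecs(W_global, 4).sum(axis=0)

print(np.allclose(W_patch.sum(axis=0), 0))        # True: every column of W sums to zero
print(np.allclose(sums_patch, 0))                 # True: each kernel sums to zero
print(np.allclose(W_global.sum(axis=0), 0))       # False: columns keep their offsets
print(sums_global)                                # kernel sums are no longer forced to zero
```

A kernel whose entries sum to zero gives no response on a constant patch, which is why such kernels behave like gradient filters and cannot encode grayscale level.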

In the waste classification model, the second-layer learning process of the improved PCANet is further abstraction and refinement of the first-layer features. The input to the second layer is the set of feature maps {Du,n} output by the first layer, where these feature maps have already extracted both local and global features from the image using the convolution kernels. In the second layer, to further mine higher-level features, each feature map Du,n is first divided into several non-overlapping small regions {Du,x,n}. These small regions represent local areas of the feature map and contain different visual information and patterns. By grouping these local regions further, the second layer can focus on extracting representative and discriminative features from different regions. It is important to note that since the learning objective of the second layer is to further integrate and refine the features extracted in the first layer, there is no need to perform clustering again in the second layer. Instead, the clustering results from the first layer are retained. The process of learning convolution kernels in the second layer is similar to the first layer, still using feature learning on region sets to construct the convolution kernels. The key improvement in this layer is that, by dividing the feature maps output by the first layer into regions and using the first layer’s clustering results, the model can better capture high-level common features across different waste types. For example, different types of waste may visually have similar texture patterns or shape features, which the first layer has already effectively extracted. The second layer, through further grouping and convolution kernel learning, can combine these low-level features to form more discriminative high-level features. 
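In waste classification tasks, this layered learning approach allows the model to progressively focus on the most discriminative parts of the image, improving classification performance. With $\hat{U}_h$ denoting the h-th region set from the first layer and $U_{u,x}$ the x-th region of image $U_u$, the corresponding second-layer region set can be represented as: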

$\hat{U}_h^{\prime}=\left\{D_{u, x, n} \mid n=1,2, \ldots, v_n, U_{u, x} \in \hat{U}_h\right\}$     (10)

Similar to the first layer, the convolution kernels $\{J'_{n',h} \mid n'=1,2,\ldots,v_{n'};\ h=1,2,\ldots,v_h\}$ are obtained for the second layer’s region sets in waste images through the same calculations.
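The reuse of the first-layer clustering in Eq. (10) can be sketched as follows. The cluster labels, array shapes, and the function name `second_layer_sets` are hypothetical; the only point illustrated is that each feature-map region inherits the cluster index of the image region at the same position, so no second clustering is run.

```python
import numpy as np

def second_layer_sets(feature_maps, labels, size, v_h):
    """feature_maps: (num_images, v_n, H, W) first-layer maps D_{u,n}.
    labels: (num_images, regions_per_image) first-layer cluster index of each
    image region U_{u,x}. Returns v_h lists of regions D_{u,x,n}."""
    sets = [[] for _ in range(v_h)]
    num_images, v_n, h, w = feature_maps.shape
    for u in range(num_images):
        x = 0
        for r in range(0, h - size + 1, size):
            for c in range(0, w - size + 1, size):
                for n in range(v_n):
                    region = feature_maps[u, n, r:r + size, c:c + size]
                    sets[labels[u, x]].append(region)
                x += 1
    return sets

rng = np.random.default_rng(0)
feature_maps = rng.random((2, 4, 16, 16))         # 2 images, v_n = 4 maps each
labels = np.array([[0, 1, 1, 2], [2, 0, 1, 0]])   # hypothetical first-layer clustering
sets = second_layer_sets(feature_maps, labels, size=8, v_h=3)
print([len(s) for s in sets])  # [12, 12, 8]
```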

In the proposed waste classification model, SDenseNet serves as the core network structure, handling efficient feature extraction and classification tasks. Its learning principle is tightly linked to the features extracted by the improved PCANet. The training images are fed in batches into the improved PCANet, where convolution kernel learning is performed through the first and second layers to extract discriminative feature maps. These feature maps are then passed as input to SDenseNet for further processing. The structure of SDenseNet inherits the characteristics of DenseNet, connecting each layer's output to the input of all subsequent layers through dense connections. This allows the network to retain feature information as it deepens without causing the vanishing gradient problem. During training, SDenseNet computes the predicted values through forward propagation and compares them to the true labels to calculate the cross-entropy loss. This loss is minimized using the Adam optimization algorithm; the key step is to propagate the error signal backward through the network using the backpropagation algorithm, adjusting the network parameters layer by layer. Assuming the predicted probability that a waste image sample $U_u$ belongs to the k-th class is denoted $P_{u,k}^T$, the indicator of whether $U_u$ belongs to the k-th class is $1_{b_u=k}$ (with $b_u$ the true label of $U_u$), the batch size is $v_y$, and the number of categories is Z, the loss function is:

$M_{Z R}=-\frac{1}{v_y} \sum_{u=1}^{v_y} \sum_{k=1}^Z 1_{b_u=k} \log \left(P_{u, k}^T\right)$     (11)

Because of the dense connection structure of SDenseNet, each layer in the network can access all feature information from previous layers, thus improving the efficiency and stability of parameter updates. This connection method allows the network to better learn the complex features in the waste classification task during training and avoids the gradient vanishing or feature loss problems that occur in traditional deep neural networks. As training progresses, the network parameters are gradually adjusted, the loss stabilizes, and the model ultimately achieves accurate classification of waste images.
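As a small worked example of the cross-entropy loss in Eq. (11), the following NumPy sketch evaluates it for a batch of two samples and three classes. The probabilities and labels are made-up illustrative values, and the function name is hypothetical.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Eq. (11): probs is (v_y, Z) with rows summing to 1,
    labels is (v_y,) holding the true class index of each sample."""
    v_y = probs.shape[0]
    picked = probs[np.arange(v_y), labels]        # the indicator keeps only the true class
    return -np.mean(np.log(picked))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
loss = cross_entropy(probs, labels)
print(round(loss, 4))  # 0.2899
```

Here the loss equals $-(\ln 0.7+\ln 0.8)/2$, since only the probability assigned to the true class of each sample contributes.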

4. Environmental Impact Assessment Scheme Based on Waste Classification Results

In this paper, waste classification is not only aimed at improving resource recycling efficiency but also involves a comprehensive environmental impact assessment. Based on the results of waste classification, the environmental impact assessment scheme can be carried out from multiple dimensions, specifically assessing the impact on resource recycling, energy consumption, carbon emissions, and other aspects according to the classification results of different types of waste. By utilizing the proposed deep learning method, the classification results can accurately reflect the composition and treatment methods of various types of waste, thus providing effective data support for environmental impact assessments.

First, for recyclable wastes such as paper, plastic, and metals, the environmental impact assessment will focus on evaluating the potential for resource reuse. Recycling can effectively reduce the environmental burden of raw material extraction and production processes, as well as reduce energy consumption and greenhouse gas emissions. For example, recycling plastic can significantly reduce the demand for petroleum resources and lower carbon emissions during its production process. By accurately classifying recyclable materials, the model can assess the improvement in the recycling rate of this type of waste and predict its specific contribution to reducing environmental pollution and saving energy. Additionally, the assessment can calculate the environmental benefits based on the carbon footprint of different recycling methods, quantifying the environmental benefits brought by the recycling and treatment of recyclable materials.

For hazardous waste, such as batteries, fluorescent lamps, and expired medications, the environmental impact assessment focuses on reducing the threat of harmful substances to the ecological environment and human health. This type of waste contains heavy metals or toxic chemicals, which, if not properly treated, can severely pollute water sources, soil, and air. Based on the waste classification results, the evaluation model can reflect the improvement in the recycling rate of hazardous waste and further assess its effectiveness in preventing the spread of harmful substances during treatment. For example, recycling waste batteries can effectively prevent heavy metals such as lead and cadmium from entering the environment, reducing soil and water pollution. At the same time, the evaluation model can predict the secondary pollution risks during the treatment process according to the treatment methods of hazardous waste, providing policymakers with more scientific waste management and regulatory strategies.

For wet waste, such as food scraps and agricultural waste, the environmental impact assessment will focus on its role in organic waste treatment. Wet waste, through composting or anaerobic fermentation, can be converted into organic fertilizers or bioenergy, which not only reduces the pressure on landfills but also promotes the recycling of agricultural production and energy. The evaluation model can quantify the environmental benefits after converting wet waste into usable resources such as organic fertilizers or biogas based on the classification and treatment methods of wet waste. This process can effectively reduce greenhouse gas emissions while alleviating the occupation and pollution of land resources caused by landfills by reducing the area of landfill usage.

For non-recyclable waste, such as contaminated plastics, cigarette butts, and broken ceramics, the environmental impact assessment focuses on the impacts generated during landfill or incineration. If this type of waste is not effectively classified and treated, it directly increases the environmental burden. Using the classification results, the evaluation model can estimate the negative impacts of landfilling or incinerating non-recyclable waste, such as waste gas emissions and energy consumption. During incineration in particular, toxic gases released from the waste may pollute air and water. The evaluation model can therefore calculate the impact of treatment on air quality and greenhouse gas emissions from the quantity of non-recyclable waste and the chosen treatment method, providing decision support for improving waste treatment facilities and optimizing treatment processes.

In conclusion, the environmental impact assessment scheme based on waste classification results not only focuses on the resource recycling potential of different types of waste but also includes the treatment of harmful substances, the resource utilization of wet waste, and the environmental risk assessment of non-recyclable waste. By accurately classifying with deep learning models, combined with the treatment methods of various types of waste, the evaluation model can more comprehensively quantify the specific impact of the waste classification and treatment process on the environment, providing scientific evidence for environmental policy formulation, waste management optimization, and sustainable development.
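The assessment scheme described above can be illustrated with a minimal sketch that turns per-category classification counts into a net greenhouse gas estimate. This is not the evaluation model of this paper: the names `ImpactFactor` and `assess`, the average item mass, and the emission factors are all hypothetical placeholders that an analyst would have to replace with measured or published values.

```python
from dataclasses import dataclass

# The four waste categories discussed in this section.
CATEGORIES = ("recyclable", "hazardous", "wet", "non_recyclable")


@dataclass(frozen=True)
class ImpactFactor:
    """Analyst-supplied factors for one category (placeholders, not measured data)."""
    mass_per_item_kg: float  # assumed average mass of one classified item
    co2e_per_kg: float       # net kg CO2-eq per kg treated; negative = avoided emissions


def assess(counts: dict[str, int], factors: dict[str, ImpactFactor]) -> dict[str, float]:
    """Convert per-category item counts into net CO2-eq estimates in kilograms."""
    result = {}
    for category in CATEGORIES:
        factor = factors[category]
        mass_kg = counts.get(category, 0) * factor.mass_per_item_kg
        result[category] = mass_kg * factor.co2e_per_kg
    result["total"] = sum(result[c] for c in CATEGORIES)
    return result


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the experiments in this paper.
    example_counts = {"recyclable": 120, "hazardous": 5, "wet": 60, "non_recyclable": 30}
    example_factors = {
        "recyclable": ImpactFactor(0.05, -1.0),
        "hazardous": ImpactFactor(0.10, 2.0),
        "wet": ImpactFactor(0.20, -0.1),
        "non_recyclable": ImpactFactor(0.05, 1.0),
    }
    for name, value in assess(example_counts, example_factors).items():
        print(f"{name}: {value:.2f} kg CO2-eq")
```

Keeping the factors outside the function makes the dependence on the chosen treatment method explicit: switching a category from landfill to incineration only changes its `ImpactFactor`, not the classification step.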

5. Experimental Results and Analysis

Figure 4. Total entropy values of combined feature maps at different layers of the improved PCANet with varying numbers of convolution kernel groups

According to the data shown in Figure 4, the entropy values of the first layer's convolution kernel groups demonstrate a certain fluctuation trend as the number of convolution kernel groups increases, but there is a noticeable overall increase. Specifically, from 1 convolution kernel group (with a maximum entropy value of 57) to 48 convolution kernel groups (with a maximum entropy value of 63), the entropy value gradually increases from 33 to 56, peaking at 62.5 with 16 convolution kernel groups. This indicates that more convolution kernel groups can effectively extract and represent more features. The entropy values of the second layer’s convolution kernel groups also show a similar trend. As the number of groups increases from 1 to 48, the entropy values fluctuate within the maximum range, gradually increasing from 290 to 545. This shows that the number of convolution kernels also influences feature extraction in the second layer. Although the minimum entropy value of the second layer's convolution kernels is always higher than that of the first layer, the increase is smaller, suggesting that the feature map combinations in the second layer may have already reached a stable state compared to the first layer.
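As a rough illustration of the quantity plotted in Figure 4, the sketch below sums a per-map entropy over a set of feature maps. It assumes the Shannon entropy of activations quantized into a fixed number of levels; the exact entropy definition, value range, and bin count used for Figure 4 are not restated in this section, so `bins`, `lo`, and `hi` are assumptions, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(feature_map, bins=256, lo=0.0, hi=1.0):
    """Shannon entropy (bits) of a 2-D feature map after quantizing into `bins` levels."""
    values = [v for row in feature_map for v in row]
    if not values or hi <= lo or bins < 1:
        raise ValueError("need a non-empty map, hi > lo, and bins >= 1")
    scale = bins / (hi - lo)
    # Clamp so that out-of-range activations fall into the first or last bin.
    levels = [min(bins - 1, max(0, int((v - lo) * scale))) for v in values]
    n = len(levels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(levels).values())


def total_entropy(feature_maps, **kwargs):
    """Sum of the per-map entropies for one layer's combined feature maps."""
    return sum(shannon_entropy(m, **kwargs) for m in feature_maps)
```

Under this definition, a constant feature map contributes zero entropy, so adding kernel groups raises the total only when the new maps carry varied activations.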

Figure 5. Impact of different training sample sizes of waste images on classification performance of ResNet-50 and the proposed method

As Figure 5 shows, the classification accuracy of both ResNet-50 and the proposed cascade of the improved PCANet and SDenseNet rises as the number of training samples increases. The accuracy of ResNet-50 increases from 76% with 1000 samples to 80.4% with 5000 samples, while that of the proposed method increases from 79.7% to 82.4% over the same range. The proposed method therefore outperforms ResNet-50 at every sample size tested, by 3.7 percentage points at 1000 samples and 2.0 percentage points at 5000 samples. With more than 3000 samples, its accuracy also improves in a more stable and continuous manner.

According to the experimental results in Table 1, the proposed method offers a reasonable balance between training cost and testing speed. It trains for 156 epochs in 54.23 minutes, which is longer than LeNet (12.36 minutes), SENet (31.25 minutes), and Wide ResNet (38.54 minutes), but shorter than VGG-16 (58.64 minutes) and far shorter than FractalNet (132.56 minutes). Its single-image testing time is 11.25 milliseconds, nearly identical to that of SENet (11.24 milliseconds). This is slower than LeNet (2.38 milliseconds) and VGG-16 (3.69 milliseconds), but faster than Wide ResNet (17.89 milliseconds) and FractalNet (17.85 milliseconds).

Table 1. Comparison of training time and testing speed of different waste classification models

| Metric | LeNet | VGG-16 | Wide ResNet | FractalNet | SENet | The Proposed Method |
|---|---|---|---|---|---|---|
| Number of Epochs | 118 | 224 | 95 | 225 | 118 | 156 |
| Training Time (min) | 12.36 | 58.64 | 38.54 | 132.56 | 31.25 | 54.23 |
| Single Image Testing Time (ms) | 2.38 | 3.69 | 17.89 | 17.85 | 11.24 | 11.25 |
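Single-image testing times such as those in Table 1 depend on how latency is measured. The sketch below shows one common approach, assuming a generic `predict` callable and a preloaded `image`; both names are hypothetical, and the measurement protocol actually used for Table 1 is not described in this section.

```python
import statistics
import time


def time_single_image(predict, image, warmup=5, repeats=50):
    """Return the median wall-clock latency of predict(image) in milliseconds."""
    # Warm-up calls absorb one-time costs (lazy initialization, caches).
    for _ in range(warmup):
        predict(image)
    samples_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(image)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to occasional scheduling spikes than the mean.
    return statistics.median(samples_ms)
```

When the model runs on a GPU, the device would also need to be synchronized before each timestamp, otherwise the timer only captures the time needed to launch the computation.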

Figure 6. Training curve of the proposed method

According to the training data shown in Figure 6, as the number of training epochs increases, the loss values for both the training and validation sets gradually increase, and the accuracy trends for both sets exhibit a similar pattern. In the first epoch, the loss for both the training and validation sets is 0.82, with a training accuracy of 50%, and the validation accuracy is not provided. As the training progresses, the loss for both sets gradually increases, while the training accuracy fluctuates significantly. During the increase in training set loss, the accuracy is unstable, with the lowest accuracy being only 0.01. After the 76th epoch, the validation loss becomes relatively stable, remaining around 0.95. However, both the training and validation accuracies mostly stay at relatively low levels without showing significant improvements.

Figure 7. Validation accuracy curves of different waste classification models

According to the data in Figure 7, the validation accuracy trends of the proposed method and other mainstream waste classification models (such as LeNet, VGG-16, Wide ResNet, FractalNet, and SENet) show a relatively steady improvement across different training epochs. Specifically, the proposed method starts with a validation accuracy of 0.65 in the first epoch, and as the training progresses, the accuracy gradually increases, reaching 0.939 in the 151st epoch. In comparison, LeNet and VGG-16 show relatively low accuracy in the early stages, but as training progresses, their accuracy gradually improves, reaching 0.929 and 0.92, respectively, by the 151st epoch. Wide ResNet shows a rapid increase in accuracy in the early stages, maintaining a high level and finally reaching 0.92. FractalNet exhibits significant fluctuations in accuracy in the early stages, with a final accuracy of 0.927 in the 151st epoch. SENet's accuracy stabilizes in the later stages, but it remains relatively low, with a maximum accuracy of only 0.918. The proposed method demonstrates a steady and continuous improvement in accuracy, maintaining a higher performance compared to all the other models.

Table 2. Evaluation metrics of different waste classification models

| Network Model | SEN (%) | SPE (%) | ACC (%) | AUC | p-value |
|---|---|---|---|---|---|
| LeNet | 81.25 | 93.54 | 91.25 | 0.9785 | <0.01 |
| VGG-16 | 92.36 | 94.58 | 93.21 | 0.9784 | 0.0241 |
| Wide ResNet | 92.48 | 94.26 | 92.58 | 0.9786 | <0.01 |
| FractalNet | 93.56 | 95.36 | 92.35 | 0.9712 | 0.0278 |
| SENet | 88.24 | 93.21 | 88.88 | 0.9785 | <0.01 |
| The Proposed Method | 96.36 | 96.58 | 94.25 | 0.9884 | <0.01 |

According to Table 2, the proposed waste classification model achieves the best result on every reported indicator. Its sensitivity (SEN), specificity (SPE), and accuracy (ACC) are 96.36%, 96.58%, and 94.25%, respectively, each the highest among the compared models. By comparison, LeNet reaches 81.25%, 93.54%, and 91.25%; VGG-16 reaches 92.36%, 94.58%, and 93.21%; and Wide ResNet reaches 92.48%, 94.26%, and 92.58%. FractalNet has the second-highest sensitivity and specificity (93.56% and 95.36%), but its accuracy is lower at 92.35%. SENet has the lowest accuracy (88.88%), with a sensitivity of 88.24% and a specificity of 93.21%. The AUC of the proposed method is 0.9884, compared with 0.9712 to 0.9786 for the other models, and all reported p-values are below 0.05, indicating statistical significance.
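For reference, sensitivity, specificity, and accuracy for a multi-class classifier can be derived from a confusion matrix as sketched below. The sketch assumes one-vs-rest counts per class with macro averaging; this section does not state how the multi-class values in Table 2 were aggregated, so that choice and the function name are assumptions.

```python
def one_vs_rest_metrics(confusion):
    """Macro SEN, macro SPE, and overall ACC (all in %) from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    sensitivities, specificities = [], []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # class c predicted as another class
        fp = sum(confusion[r][c] for r in range(k)) - tp  # other classes predicted as c
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn) if tp + fn else 0.0)
        specificities.append(tn / (tn + fp) if tn + fp else 0.0)
    return {
        "SEN": 100.0 * sum(sensitivities) / k,
        "SPE": 100.0 * sum(specificities) / k,
        "ACC": 100.0 * correct / total,
    }
```

Because specificity counts every other class as a negative, it tends to be high in multi-class settings even when sensitivity is modest, which is consistent with the SPE column of Table 2 varying less than the SEN column.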

In the application experiment, the proposed model based on the cascade of the improved PCANet and SDenseNet was evaluated on waste classification in beach and greenbelt environments (see Figure 8). In the beach task, the model identified and correctly classified various types of waste, such as plastic bags, cigarette butts, glass bottles, and metal cans, with most classification results exceeding 90% accuracy. It showed high sensitivity and specificity for small debris such as cigarette butts and plastic particles, effectively avoiding misclassification. In the greenbelt task, the model also performed well, especially in identifying hard-to-detect waste accumulated in the grass. The feature fusion in SDenseNet enhanced the model’s classification ability in complex environments, achieving classification accuracy above 90%.

Overall, the cascade design significantly improved the model’s adaptability and classification accuracy in environmental contexts, especially when handling the diverse waste types found in natural environments, and the model outperformed traditional waste classification models. A waste classification system based on this model not only provides accurate classification results but also offers reliable data support for environmental impact assessments. Higher classification accuracy allows the impact of waste accumulation in different environments to be assessed more precisely, and it provides scientific data for ecological protection, resource recycling, and waste management decisions.

Figure 8. Example of waste classification results

6. Conclusion

This paper presents a waste classification model based on a cascade of the improved PCANet and SDenseNet. The proposed architecture combines the layer-by-layer learning strategy of PCANet with the efficient feature fusion of SDenseNet, yielding strong performance in waste classification tasks. Comparative experiments validate the model’s accuracy and stability across waste classification tasks, especially in complex environments such as beaches and greenbelts, where it outperforms traditional models such as LeNet and VGG-16. Specifically, the improved PCANet enhances feature extraction through its layer-by-layer learning strategy, while SDenseNet, with its deep feature cascading, increases classification accuracy for diverse types of waste. The research also combines the traditional backpropagation algorithm with the layer-by-layer learning strategy, which improves training efficiency and enables the model to reach high classification accuracy in a shorter time.

The proposed waste classification model based on the improved PCANet and SDenseNet cascade successfully addresses the challenges in waste classification tasks in complex environments, demonstrating strong performance and stability. The study shows that the model can significantly improve waste classification accuracy, especially in complex natural environments, such as beaches and greenbelts, with classification accuracy reaching or exceeding 90%. The value of this research lies not only in proposing a new deep learning architecture but also in its training strategy, which significantly improves learning efficiency and generalization ability, making it highly applicable in real-world scenarios. However, there are limitations in this study, such as potential classification errors for certain similar types of waste and challenges when dealing with extremely complex or noisy environments. Additionally, the training time and computational resource consumption of the model are areas for future optimization.

Future research can be expanded in several directions: first, further optimization can be conducted on the model’s performance in complex backgrounds, particularly when dealing with highly similar waste categories, exploring more refined feature extraction methods or incorporating multimodal data (such as sound, temperature, etc.) to assist in classification. Second, for training efficiency, more efficient training algorithms can be explored to reduce computational resource consumption and enhance training efficiency on large-scale datasets. Lastly, as waste classification applications become more widespread, cross-regional and cross-cultural application scenarios will present new challenges. Therefore, enhancing the model’s adaptability across different environments and cultural contexts will be an important direction for future research.

  References

[1] Rahmanda, B., Njatrijani, R., Fadillah, R. (2023). Environmental policy in managing e-waste recycling: Promoting a clean environment in public policy. International Journal of Sustainable Development and Planning, 18(1): 121-126. https://doi.org/10.18280/ijsdp.180112

[2] Mor, S., Ravindra, K. (2023). Municipal solid waste landfills in lower-and middle-income countries: Environmental impacts, challenges and sustainable management practices. Process Safety and Environmental Protection, 174: 510-530. https://doi.org/10.1016/j.psep.2023.04.014

[3] Sunardi, Yudhana, A., Fahmi, M. (2023). SVM-CNN hybrid classification for waste image using morphology and HSV color model image processing. Traitement du Signal, 40(4): 1763-1769. https://doi.org/10.18280/ts.400446

[4] Megouache, L., Sadouni, S., Zitouni, A., Djoudi, M. (2024). Design and evaluation of geographic information systems for environmental protection through data-driven decision making: A case study of solid waste management in Ali Mendjeli, Algeria. Journal of Urban Development and Management, 3(2): 109-119. https://doi.org/10.56578/judm030203

[5] Zhang, B., Lai, K.H., Wang, B., Wang, Z. (2019). From intention to action: How do personal attitudes, facilities accessibility, and government stimulus matter for household waste sorting? Journal of Environmental Management, 233: 447-458. https://doi.org/10.1016/j.jenvman.2018.12.059

[6] Gundupalli, S.P., Hait, S., Thakur, A. (2017). A review on automated sorting of source-separated municipal solid waste for recycling. Waste Management, 60: 56-74. https://doi.org/10.1016/j.wasman.2016.09.015

[7] Lange, J.P. (2021). Managing plastic waste-sorting, recycling, disposal, and product redesign. ACS Sustainable Chemistry & Engineering, 9(47): 15722-15738. https://doi.org/10.1021/acssuschemeng.1c05013

[8] Rakesh, U., Ramya, V., Murugan, V.S. (2023). Classification, collection, and notification of medical waste using IoT based smart dust bins. Ingénierie des Systèmes d’Information, 28(1): 149-154. https://doi.org/10.18280/isi.280115

[9] Liu, W., Ouyang, H., Liu, Q., Cai, S., Wang, C., Xie, J., Hu, W. (2022). Image recognition for garbage classification based on transfer learning and model fusion. Mathematical Problems in Engineering, 2022(1): 4793555. https://doi.org/10.1155/2022/4793555

[10] Khan, N., Kulkarni, K., Mahale, Y., Kolhar, S., Mahajan, S. (2024). Waste objects segregation using deep reinforcement learning with Deep Q Networks. Ingénierie des Systèmes d’Information, 29(6): 2219-2229. https://doi.org/10.18280/isi.290612

[11] Qin, J., Wang, C., Ran, X., Yang, S., Chen, B. (2022). A robust framework combined saliency detection and image recognition for garbage classification. Waste Management, 140: 193-203. https://doi.org/10.1016/j.wasman.2021.11.027

[12] Luo, Q., Lin, Z., Yang, G., Zhao, X. (2023). DEC: A deep-learning based edge-cloud orchestrated system for recyclable garbage detection. Concurrency and Computation: Practice and Experience, 35(13): e6661. https://doi.org/10.1002/cpe.6661

[13] Xu, H., Tang, W., Li, Z., Qin, K., Zou, J. (2024). Multimodal dual cross-attention fusion strategy for autonomous garbage classification system. IEEE Transactions on Industrial Informatics, 20(11): 13319-13329. https://doi.org/10.1109/TII.2024.3435508

[14] Li, M. (2024). Multilevel characteristic weighted fusion algorithm in domestic waste information classification. International Journal of Advanced Computer Science & Applications, 15(10): 214-223. https://doi.org/10.14569/ijacsa.2024.0151024

[15] Yuan, J.Y., Nan, X.Y., Li, C.R., Sun, L.L. (2020). Research on real-time multiple single garbage classification based on convolutional neural network. Mathematical Problems in Engineering, 2020(1): 5795976. https://doi.org/10.1155/2020/5795976

[16] Yang, Z., Bao, Y., Liu, Y., Zhao, Q., Zheng, H. (2022). Research on deep learning garbage classification system based on fusion of image classification and object detection classification. Mathematical Biosciences and Engineering, 20(3): 4741-4759. https://doi.org/10.3934/mbe.2023219

[17] Song, Y., He, X., Tang, X., Yin, B., et al. (2024). DEEPBIN: Deep learning based garbage classification for households using sustainable natural technologies. Journal of Grid Computing, 22(1): 2. https://doi.org/10.1007/s10723-023-09722-6

[18] Liang, G., Guan, J. (2024). FConvNet: Leveraging fused convolution for household garbage classification. Journal of Circuits, Systems & Computers, 33(8): 2450140. https://doi.org/10.1142/S0218126624501408

[19] Li, X., Li, T., Li, S., Tian, B., Ju, J., Liu, T., Liu, H. (2023). Learning fusion feature representation for garbage image classification model in human–robot interaction. Infrared Physics & Technology, 128: 104457. https://doi.org/10.1016/j.infrared.2022.104457

[20] Gupta, T., Joshi, R., Mukhopadhyay, D., Sachdeva, K., Jain, N., Virmani, D., Garcia-Hernandez, L. (2022). A deep learning approach based hardware solution to categorise garbage in environment. Complex & Intelligent Systems, 8(2): 1129-1152. https://doi.org/10.1007/s40747-021-00529-0