JOURNAL METRICS

CiteScore 2022: 1.8 ℹCiteScore:

CiteScore is the number of citations received by a journal in one year to documents published in the three previous years, divided by the number of documents indexed in Scopus published in those same three years.

SCImago Journal Rank (SJR) 2022: 0.232 ℹSCImago Journal Rank (SJR):

The SJR is a size-independent prestige indicator that ranks journals by their 'average prestige per article'. It is based on the idea that 'all citations are not created equal'. SJR is a measure of scientific influence of journals that accounts for both the number of citations received by a journal and the importance or prestige of the journals where such citations come from It measures the scientific influence of the average article in a journal, it expresses how central to the global scientific discussion an average article of the journal is.

Source Normalized Impact per Paper (SNIP) 2022: 0.578 ℹSource Normalized Impact per Paper(SNIP):

SNIP measures a source’s contextual citation impact by weighting citations based on the total number of citations in a subject field. It helps you make a direct comparison of sources in different subject fields. SNIP takes into account characteristics of the source's subject field, which is the set of documents citing that source.

123.png

Estimation and Analysis of Building Costs Using Artificial Intelligence Support Vector Machine

Zahra Salahaldain | Sepanta Naimi | Riyadh Alsultani^*

Department of Civil Engineering, Altinbas University, Istanbul 212, Turkey

Department of Building and Construction Techniques Engineering, Al-Mustaqbal University College, Hilla 51001, Iraq

Corresponding Author Email:

dr.riyadh.abdulabbas@uomus.edu.iq

Received:

20 October 2022

Revised:

12 December 2022

Accepted:

20 December 2022

Available online:

28 April 2023

| Citation

mmep_10.02_03.pdf

OPEN ACCESS

Abstract:

An essential component of the project feasibility assessment is the conceptual cost estimate. In actuality, it is carried out based on the estimator's prior expertise. However, budgeting and cost control are planned and carried out ineffectively as a result of inaccurate cost estimates. The purpose of this article is to introduce an intelligent model to improve modeling approaches accuracy throughout early phases of a project's development in the construction sector. A support vector machine model, which is computationally effective, is created to calculate the conceptual costs of building projects. To get accurate estimates, the suggested neural network model is trained using a cross-validation method. Through the research of the literature and interviews with experts, the cost estimate's influencing elements are determined. As training instances, the cost information from 40 structures is used. Two potent intelligence methods-Nonlinear Regression (NR) and Evolutionary Fuzzy Neural Interface Model (EFNIM)-are offered to illustrate how well the suggested model performs. Based on the readily accessible dataset from the relevant literature in the construction business, their results are contrasted. The computational findings show that the intelligent model that is being provided outperforms the other two potent methods. During the planning and conceptual design phase, the inaccuracy is satisfied for a project's conceptual cost estimate. Case studies demonstrate how SVMs may help planners anticipate the cost of construction in an effective and precise manner.

Keywords:

conceptual cost estimation, building cost, cross-validation, support vector machines, sustainable economic model

1. Introduction

Building cost conceptual estimates give planners a foundation for assessing the project's viability at the conceptual planning stage. The feasibility and profitability of a project are both significantly impacted by erroneous cost estimation. Low feasibility prevents clients from selling their existing projects because of overestimated expenses. On the other side, an underestimated cost can deceive planners into believing it is very feasible, which would result in extra expenditures for the customer at the building stage. Overestimated or underestimated expenses, therefore, have an impact on clients' earnings, necessitating the use of an accurate measurement technique [1].

Experience is the main focus of the conceptual cost estimate. Cost estimators are limited to basing their construction cost estimates during the conceptual planning phase on the initial design and project concepts. Cost estimators consult past situations when there is little information, and they then assess the conceptual cost in light of their prior experiences [2]. Nevertheless, a wide range of factors affects building costs. Some of these characteristics, including geological property and decorative class, are rife with ambiguity [3]. Estimators cannot correctly estimate building costs using a simple linear method since the evaluation process is so complicated and imprecise. Because of this, current building cost estimates are imprecise.

The related literature contains a variety of research methods to calculate the conceptual costs of building projects. Neural networks (NNs) have received a lot of attention in this sector, especially in the previous 10 years. An evolutionary conceptual cost estimation model for building was created using the "Evolutionary Fuzzy Neural Inference Model (EFNIM)". In the model, fuzzy input-output mapping is handled by neural networks, fuzzy logic is utilized to represent uncertainty and approximate reasoning, and genetic algorithms are predominantly employed for optimization. However, it takes a very considerable amount of time to conduct the computation to find the best answer [4]. Support vector machines (SVMs), a kind of artificial intelligence, were used to forecast the conceptual cost of building [5]. The SVM model was introduced for assessing the accuracy of conceptual cost estimates, and its use in the construction industry was looked at the research [6]. It was suggested to anticipate the cost of the construction project using an intelligent strategy based on the SVM. The outcomes demonstrated that the least square SVM's prediction accuracy was superior to the NN [7].

The appropriateness of any given estimating method depends typically on the purpose for which it is employed, the quantity of information available at the time of the estimation, and the person utilizing it. Although customers and contractors rely on existing cost prediction and forecasting techniques, real final building project prices still differ significantly from initial projections [8]. To suggest a general copula-based Monte Carlo simulation approach for forecasting the overall costs of building projects with dependent cost components. It was discovered that various dependency structures might produce various total cost probability distributions. Moreover, it is discovered that the current goodness of fit tests may be used to choose the copula that performs the best. The article concluded that the copula-based Monte Carlo simulation approach may reasonably anticipate the overall cost of construction projects [9, 10]. Firouzi et al. [11] contrast the precision of several estimation methodologies. By completing construction cost estimations using linear regression analysis (LRA) and support vector machine techniques (SVMs), the results highlight the urgent need for a solution to the problem of cost overruns.

To enhance the decision-making process in feasibility studies, a novel intelligent model for building projects based on an SVM model is implemented in this study. The suggested approach may be successfully used to anticipate conceptual costs over the long run in the building sector. In the SVM model, a cross-validation procedure is also utilized on the preparation dataset to minimize overfitting and to generate accurate results. The available information from the relevant literature in the construction sector is utilized to evaluate the suggested intelligent model's precision in estimating the expenses of these projects. The applicability and usefulness of the suggested approach are further demonstrated by comparisons with other well-known methodologies. The suggested model exhibited improved generalization performance and produced lower estimation errors. Given that these results pertain to actual situations from practice. An expansion of this research suggests improving the recommended model to boost yields and address this issue.

2. Proposed Model

It is usually hard to accurately calculate a model using cost time series data produced by linear methods. In the construction industry, the use of linear estimation techniques is not feasible [12-16]. Cost information is used in building projects in a complex and nonlinear way. Therefore, as conventional linear estimate approaches are inappropriate for these projects, it is crucial to develop cutting-edge methodologies like artificial intelligence for estimating time series data. In order to address the weaknesses of the widely-used strategies, this research reports on the cost estimate of these projects by suggesting a new, intelligent model built on two effective methodologies.

2.1 Support vector machine

Vapnik built the first SVMs on the back of SLT in the late 1960s. However, from the middle of the 1990s, as more computer power became available, the algorithms used for SVMs began to emerge, opening the door for several real-world applications with significant outcomes [16]. As shown in Figure 1 [17], the basic SVM works with two-class issues where the data are divided by a hyperplane that is specified by a set of support vectors. For the sake of thoroughness, a basic introduction to SVM is provided below. Readers can refer to the SVM tutorials [18] for detailed descriptions. A novel training method based on the SVM called least squares support vector machines (LS-SVM) only needs the answers to a few linear equations as opposed to the regular SVM's lengthy and computationally challenging quadratic programming issue. The least squares cost function is used by the LS-SVM [19].

figure_1.png

Figure 1. Nonlinear SVM [11]

Classification: The SVM is designed to learn a unique function that categorises training instances into different groups according to the class labels that have been provided to them. This viewpoint asserts that SVMs are a category of supervised learning model that are mostly employed to address classification and regression problems [20]. By converting the input vector x into a high-dimensional feature space, SVM models developed in the new space can represent a linear or nonlinear decision boundary in the old space. A linear hyperplane can be used to partition the instances if they are linearly separable; if not, the case is nonlinearly separable [20].

Consider a given set S with n labeled training instances: $\left\{\left(\mathrm{x}_1, \mathrm{y}_1\right),\left(\mathrm{x}_2, \mathrm{y}_2\right), \ldots,\left(\mathrm{x}_{\mathrm{n}}, \mathrm{y}_{\mathrm{n}}\right)\right\}$ for the linearly separable case. Each training instance $x_j \in R^k$, for i=1, …, n, belongs to one of the two classes in comparison to its label $\mathrm{y}_{\mathrm{i}} \in\{-1,+1\}$, where $\mathrm{k}$ is the input dimension. The following equation may be used to explain the greatest margin hyperplane:

$\mathrm{y}=\mathrm{b}+\sum \mathrm{w}_{\mathrm{i}} \mathrm{y}_{\mathrm{i}} \mathrm{x}(\mathrm{i}) \mathrm{x}$ (1)

where denotes a dot product, a test example is represented by the vector x , and support vectors are represented by the vectors x(i)s. The parameters, b and $\mathcal{w}_i$ in this equation, which determines the hyperplane, must be learned by the SVM. To create a perfect hyperplane, the following sequential quadratic programming (QP) problem must be solved:

$$\text { Minimize } \frac{1}{2}\|\mathrm{w}\|^2$$ Subject to $y_i\left(w x_i+b\right) \geq 1 \quad i=1, \ldots, n$ (2)

where, the kernel function is defined as $\mathrm{K}(\mathrm{x}(\mathrm{i}), \mathrm{x})$. There are several kernels available for creating the inner products needed to construct SVMs with various nonlinear decision surfaces. The most often used kernel functions are the Gaussian radial basis function $K(x, y)=\exp \left(-1 / \delta^2(x-y)^2\right)$ and the polynomial kernel $K(x, y)=(x y+1)^d$, where, d is the degree of the polynomial kernel and $\delta^2$ is the bandwidth of the Gaussian radial basis function [20].

Regression: A version of the SVM for regression has been included in the SVMs, which have been created for general estimation and prediction applications (LS-SVM). By reducing the prediction error, LS-SVM seeks to identify a function that accurately approximates the training examples. By simultaneously attempting to maximize the flatness of the function, the danger of over-fitting is reduced when the error is minimized. One must once again resolve the following quadratic programming issue to identify an ideal hyperplane:

$$\operatorname{Minimize} \frac{1}{2}\|\mathrm{w}\|^2$$ Subject to $\left\|y_i\left(w \cdot x_i+b\right)\right\| \leq \varepsilon$ (3)

where, the value $\varepsilon \geq 0$ is the prediction error bound. If $\mathrm{f}=\langle\mathrm{w}, \mathrm{x}\rangle+\mathrm{b}$ genuinely exists and approximates all pairs $\left(x_i, y_i\right)$ with $\mathcal{\varepsilon}$ precision, then the aforementioned convex optimization problem will be achievable. The following optimization problem has restrictions that are ordinarily impossible to satisfy, thus one inserts slack variables $\varphi_{\mathrm{i}}, \varphi_{\mathrm{i}}^*$ to deal with them:

$\operatorname{Minimize} \frac{1}{2}\|\mathrm{~W}\|^2+C \sum_{\mathrm{i}=1}^l\left(\varphi_{\mathrm{i}}, \varphi_{\mathrm{i}}^*\right)$

Subject to $\left\{\begin{array}{c}y_i-\left\langle w, x_i\right\rangle-b \leq \varepsilon+\varphi_i \\ \left\langle\mathrm{w}, \mathrm{x}_{\mathrm{i}}\right\rangle+\mathrm{b}-\mathrm{y}_{\mathrm{i}} \leq \varepsilon+\varphi_{\mathrm{i}}^* \\ \varphi_{\mathrm{i}}, \varphi_{\mathrm{i}}^* \geq 0\end{array}\right.$ (4)

The trade-off between f flatness and the maximum allowed deviation, $\varepsilon$, is calculated using the constant C. This optimization problem may be built as a dual problem by building the Lagrangian function:

$$\begin{aligned}\mathrm{L} & =\frac{1}{2}\|\mathrm{w}\|^2+\mathrm{C} \sum_{\mathrm{i}=1}^{\mathrm{l}}\left(\varphi_{\mathrm{i}}, \varphi_{\mathrm{i}}^*\right)-\sum_{\mathrm{i}=1}^{\mathrm{l}} \lambda_{\mathrm{i}}\left(\varepsilon+\varphi_{\mathrm{i}}-\right. \\\mathrm{y}_{\mathrm{i}} & \left.+\left\langle\mathrm{w}, \mathrm{x}_{\mathrm{i}}\right\rangle+\mathrm{b}\right)-\sum_{\mathrm{i}=1}^{\mathrm{l}} \lambda_{\mathrm{i}}^*\left(\varepsilon+\varphi_{\mathrm{i}}^*-\mathrm{y}_{\mathrm{i}}+\left\langle\mathrm{w}, \mathrm{x}_{\mathrm{i}}\right\rangle-\right.\end{aligned}$$ b) and $\lambda_{\mathrm{i}}, \lambda_{\mathrm{i}}^* \geq 0$ (5)

Solving the Lagrangian, one provides the optimal solutions $\mathrm{w}^*$ and $\mathrm{b}^*$:

$\begin{aligned} \mathrm{b}^*=\mathrm{y}_{\mathrm{i}}-\left\langle\mathrm{w}, \mathrm{x}_{\mathrm{i}}\right\rangle-\varepsilon, & 0 \leq \lambda_{\mathrm{i}} \leq \mathrm{c}, \mathrm{i}=1, \ldots, \mathrm{l}, \\ \mathrm{~b}^*=\mathrm{y}_{\mathrm{i}}-\left\langle\mathrm{w}, \mathrm{x}_{\mathrm{i}}\right\rangle+\varepsilon, & 0 \leq \lambda_{\mathrm{i}}^* \leq \mathrm{c}, \mathrm{i}=1, \ldots, \mathrm{l}\end{aligned}$ (6)

For nonlinear issues, the inner products can be changed by appropriate kernels, much like in classification. Enforcing the upper limit C on the absolute value of the coefficients $\mathrm{w}_{\mathrm{i}} \mathrm{s}$ allows us to keep an eye on the trade-off between reducing prediction error and maximizing the flatness of the regression function. The function may match the data more closely the greater C. The approach essentially does the least absolute-error regression with the coefficient size restriction in the degenerate situation where ε=0, regardless of the amount of C. In contrast, if is big enough, the error decreases to zero, and the algorithm returns the flattest curve that contains the data, regardless of the amount of C [20-22].

2.2 Cross-validation technique

Cross-validation [23] is a well-known technique for calculating generalisation mistakes. Among various cross-validation techniques, the k-Fold cross-validation approach is taken into consideration in this study. K-Fold cross-validation is one way to enhance the holdout strategy. Each of the k subgroups of the dataset uses the holdout method. Every time, one of the k subsets is used as the test set, and the remaining k-1 subsets are used to create the training set. Next, the average error over all k trials is calculated. The main benefit of this method is that it doesn't matter how the data is split up.

Each data set appears k times in the training set and exactly once in the test set. The variability of the resulting estimate diminishes as k is increased. By randomly dividing the data into test and training sets k times, this strategy may be changed. An example of how neural network architectures may be utilised to get the best generalisation is the k-fold cross-validation approach. In the pertinent literature, 3 and ten-fold cross-validation approaches were advised to estimate practical applications [24].

2.3 Intelligent proposed model

The proposed model is based on the LS-SVM and cross-validation, two effective techniques. The input-output mapping in this model is managed by the LS-SVM, which focuses on the features of cost data in building projects. The LS-SVM is trained using k-fold cross-validation to provide accurate findings and to enable a more realistic evaluation of the accuracy by splitting the whole dataset into numerous training and test sets. The suggested intelligent model may be a mix that is both suitable and efficient computationally for cost estimation of construction projects. The recommended model's steps are as follows: dividing the data into training and test sets is Stage 1: The test set is used to evaluate the model's performance after the training set has been used to create the LS-SVM model. Stage 2: Instructional data the model uses sequential data as training data. To help prevent numerical difficulties, the training data are now normalised into the same domain (0, 1) and the sequential data represents the desired attributes. The function used it to normalise data is demonstrated in the following illustration:

$x_{\text {sca }}=\frac{x_i-x_{\min }}{x_{\max }-x_{\min }}$ (7)

All the variables have no dimensions as a result of this change. Stage 3: Instruction in LS-SVM This step manages input-output mapping using the LS-SVM. The Gaussian radial basis function kernel is employed as a logical alternative [12, 20]. Using the k-fold cross-validation approach on the training data set, the LS-SVM training is done to build the prediction model.

Fitness definition: Chen and Wang [21] assert that although the training dataset's fitness may be easily calculated, it is prone to over-fitting.

This problem is solved using the k-fold cross-validation technique. This technique divides the training dataset into k subsets at random, then uses the k-1 subset as the training set to construct the regression function with the specified set of parameters $\left(\mathrm{C}_{\mathrm{i}}, \delta_{\mathrm{i}}\right)$. The final set is taken into account for validation. The preceding procedure is repeated k times. As a result, the fitness function is defined by the MAPECV of the k-fold trans procedure on the training dataset:

Fitness $=\min \mathrm{f}=\mathrm{MAPE}_{\mathrm{cv}}$ (8)

$\operatorname{MAPE}_{\mathrm{cv}}=\frac{1}{\mathrm{l}} \sum_{j=1}^1\left|\frac{y_j-\widehat{y}_j}{\mathrm{y}_{\mathrm{j}}}\right| \times 100 \%$ (9)

where, y_j is the actual value; $\widehat{y_j}$ are the validation value and l is the number of subsets. The solution with a smaller $\mathrm{MAPE}_{\mathrm{cv}}$ of the training, the dataset has a smaller fitness value. Stage 4: The LS-SVM parameters are supplied in this stage so that the estimations may be made. Choose the best parameter settings to construct the LS-SVM model in Stage 5: LS-SVM model with the test dataset substituted to achieve the estimation values. Through testing performance, it is possible to validate the LS-SVM model's estimated capacity by taking into account performance criteria to compute the error between real and estimates values. The structure of the suggested intelligent model is shown in Figure 2. To reduce the MAPECV during prediction iterations, this model is used to try to optimize the LS-parameter SVM's combinations.

figure_2.png

Figure 2. The proposed LS-SVM model

3. Evolutionary Fuzzy Neural Interface Model

The majority of issues in construction management are complicated, unclear, and environment-dependent. Neural networks, fuzzy logic, and genetic algorithms have all been effectively used in construction management to address a variety of issues. These three computing techniques make up for one paradigm's shortcomings with those of the other two. By combining the aforementioned three techniques and taking into consideration each method's advantages, Cheng [1] develops an "Evolutionary Fuzzy Neural Inference Model (EFNIM"). Since GAs is primarily concerned with optimization, FL with imprecision and approximation reasoning, and NNs with learning and curve fitting, the best adaptation mode is automatically found in the model. Figure 3 depicts the EFNIM's architecture.

The FL, NN, and GA paradigms are combined to form the proposed EFNIM. The benefits of one paradigm were balanced out by the others when FL, NNs, and GAs were used together. FL, NN, and GAs are largely focused on learning and curve fitting in the formulated model, whereas FL, NN, and GAs are mostly focused on imprecision and approximation reasoning. Technologies FL and NNs complement one another.

A possible first step toward the development of intelligent machines that can replicate the operations of the human brain appears to be the combination of two techniques into a single system. The NN in Figure 3 takes the role of the fuzzy rule base and fuzzy inference engine of the conventional fuzzy logic system. The NN is used to overcome difficulties in collecting fuzzy rules and figuring out composition operators as well as to provide the integrated system a learning capability. A neural network with both fuzzy inputs and fuzzy outputs is referred to as a "neuro with fuzzy input-output," and it combines the FL and NN. The FNN, a general word to denote the fusion or union of FL and NN, is used in this study to initialise the "neuro with fuzzy input-output" for convenience of usage.

Although the FNN is more reasonable than conventional FL to mimic the qualities and process of human reasoning, it has trouble choosing an acceptable topology and sufficient parameters for human inference. Furthermore, selecting a suitable distribution for the MFs to address different problems takes time, and the difficulty increases with problem complexity. GA is an effective tactic for addressing the drawbacks of FNN. The EFNIM makes advantage of GA to simultaneously determine the best FNN topology, best FNN variables, and fittest MF forms.

figure_3.png

Figure 3. EFNIM Architecture [1]

4. Model Validation

In order to evaluate the proposed model's effectiveness, the accessible dataset based on an actual data presented in the study [3] is applied to the LS-SVM model with a cross-validation strategy. The conceptual phase of a construction project's cost is affected by 10 input patterns in this dataset. Site studies and the owner's preliminary needs can be divided into two primary categories. Table 1 shows eight simulated validation of multi-story buildings and 32 multi-story buildings. These models are actual community housing complexes with reinforced concrete buildings that will be built in the Central region of Iraq between 2015 and 2022. These input patterns are from actual group initiatives in Taiwan between 1997 and 2001 [3].

The descriptions of these patterns are as follows: (1) Site area (in square meters); (2) Geology property; (3) Influencing householder number; (4) Earthquake impact; (5) Planning householder number; (6) Total floor area (in square meters); (7) Floor over ground (in stories); (8) Floor underground (in stories); (9) Decoration class; (10) Facility class; and (11) Normalized cost of the building project.

The cost of building projects was in Iraqi Dinar per square meter. In Table 1, factors 1, 3, 5, 6, 7, and 8 are quantitative, whereas factors 2, 4, 9, and 10 are qualitative.

In Table 2, the qualitative characteristics are outlined. The dataset contains 80 rows of data, of which 20 rows are chosen to evaluate the suggested intelligent model as well as two well-known methods, namely Nonlinear Regression (NR) and Evolutionary Fuzzy Neural Interface Model (EFNIM). The sequential data is then used by the model as training data. In the training phase, a cross-validation approach is used to address the issues caused by the short training dataset.

To avoid numerical difficulties and attributes with greater numeric ranges predominating those with smaller numeric ranges, training data are normalised into a (0, 1) range and sequential data reflects the suggested qualities. Data normalisation using a function (7). Three metrics are used to evaluate the performance of the proposed model: mean absolute average error (MAPE), mean square error (MSE), and R-squared (R2), as indicated by (10) to (12):

$\mathrm{MAPE}=\frac{1}{\mathrm{~N}} \sum_{\mathrm{i}=1}^{\mathrm{N}}\left|\frac{\mathrm{y}_{\mathrm{i}}-\widehat{\mathrm{y}}_1}{\mathrm{y}_{\mathrm{i}}}\right| \times 100 \%$ (10)

$\mathrm{MSE}=\frac{\sum_{i=1}^{\mathrm{N}}\;\left(\mathrm{y}_{\mathrm{i}}-\widehat{\mathrm{y}_1}\right)^2}{\sum_{\mathrm{i}=1}^{\mathrm{N}}\;\left(\mathrm{y}_{\mathrm{i}}-\overline{\mathrm{y}}\right)^2}$ (11)

$\mathrm{R}^2=1-\frac{\sum_{i=1}^{\mathrm{N}}\;\left(\mathrm{y}_{\mathrm{i}}-\widehat{\mathrm{y}_1}\right)^2}{\sum_{\mathrm{i}=1}^{\mathrm{N}}\;\left(\mathrm{y}_{\mathrm{i}}-\overline{\mathrm{y}}\right)^2}$ (12)

where, $y_i$ and $\widehat{\mathrm{y}_1}$ represent the actual and estimated values of the i-th data, respectively. y is the average of actual data and N is the number of data. The overall comparative results based on the MAPE, MSE and R² indices are illustrated for the proposed model and two famous techniques in Table 3.

Table 1. Patterns for conceptual estimation of building cost

Input patterns
No.	Building cost	Input										Output
No.	Building cost	1	2	3	4	5	6	7	8	9	10	11
1	643,458,474	909.3438	1	38	1	60	7137.7	14	2	1	2	0.643458
2	704,478,078	1715.448	2	2	1	29	6848.01	15	1	3	1	0.704478
3	972,456,659	2259.313	1	20	1	16	6934.26	19	3	1	2	0.972457
4	614,184,888	1851.813	1	120	1	95	9238.73	10	2	2	2	0.614185
5	917,898,343	2989.521	1	44	1	175	19275.83	12	2	1	3	0.917898
6	980,104,713	3913.427	2	18	1	104	6629.14	14	2	3	3	0.980105
7	561,769,346	3144.719	1	23	1	241	24185.31	22	2	2	2	0.561769
8	465,773,082	927.7292	1	124	1	14	10606.56	16	3	1	1	0.465773
9	400,665,726	6019.646	1	147	1	272	43767.47	19	3	1	1	0.400666
10	727,554,103	2982.083	1	12	1	94	16934.4	16	2	2	3	0.727554
11	557,318,970	1928.323	2	0	1	218	13233.82	19	3	2	2	0.557319
12	490,002,908	2238.24	2	33	1	55	17399	19	2	2	1	0.490003
13	405,050,170	3360.313	2	78	1	154	17667.02	12	2	1	1	0.40505
14	622,854,881	2902.823	1	22	1	160	36542.68	33	4	2	2	0.622855
15	647,084,707	867.75	1	175	1	151	8638.9	19	3	2	2	0.647085
16	631,327,078	1370.49	3	224	1	12	7665.56	8	2	3	2	0.631327
17	843,956,166	2397.938	1	88	1	70	20517.38	19	3	3	3	0.843956
18	790,419,788	840.6146	1	45	1	86	8727.04	19	3	3	3	0.79042
19	892,811,407	4557.667	2	12	1	59	15588.21	15	2	3	3	0.892811
20	485,025,080	822.9792	1	0	1	38	6190.73	16	2	2	1	0.485025
21	810,924,484	1619.24	2	125	1	233	13191.88	16	2	3	3	0.810924
22	706,456,023	2896.927	1	29	1	68	6629.14	14	1	2	2	0.706456
23	682,456,957	7924.792	1	78	1	283	28735.09	23	3	1	2	0.682457
24	762,860,421	1968.719	1	74	1	235	12098.87	18	2	3	3	0.76286
25	405,280,931	3042.625	1	147	1	161	17222.73	11	2	1	1	0.405281
26	401,588,767	1415.594	1	23	1	96	9051.38	12	2	1	1	0.401589
27	833,538,989	1821.75	1	36	1	103	15263.04	16	2	3	3	0.833539
28	746,872,032	2412.438	2	124	1	100	13371.74	15	2	1	3	0.746872
29	527,979,452	3278.052	2	78	1	173	20768.76	12	2	1	2	0.527979
30	747,795,073	1828.01	1	21	1	106	23942.59	19	2	1	2	0.747795
31	614,712,340	2560.063	2	63	1	89	2393.2	23	2	1	3	0.614712
32	869,867,245	3487.479	1	77	1	154	28498.2	12	1	1	2	0.869867
33	655,161,316	3840.885	1	82	1	115	8773.19	10	1	3	1	0.655161
34	664,128,000	2962.875	1	64	1	40	13616.38	18	3	3	2	0.664128
35	447,608,954	1941.635	3	29	1	90	2832.2	23	2	2	1	0.447609
36	589,922,096	3278.052	2	33	1	152	23840.39	14	3	1	1	0.589922
37	647,315,467	2963.188	1	47	1	89	27368.39	10	1	2	2	0.647315
38	589,922,096	3557.167	1	86	1	118	25145.72	15	2	3	1	0.589922
39	809,737,717	2015.969	2	98	1	82	6087.76	23	1	2	3	0.809738
40	717,301,754	2788.802	1	122	1	151	13719.02	22	2	1	2	0.717302

Table 2. Description of qualitative factors for conceptual cost estimating

Influencing factor	Qualitative option	Value
Geology property	Soft	1
	Medium	2
	Hard	3
Earthquake impact	Low	1
Earthquake impact	High	2
Decoration class	Basic type	1
	Normal type	2
	Luxurious type	3
Facility class	Basic type	1
	Normal type	2
	Luxurious type	3

Table 3. Overall comparative results

Intelligent techniques	MAPE	MSE	R²
Proposed model	0.625	0.1687	0.9147
EFNIM	2.214	0.3187	0.6687
NR	3.935	0.4987	0.5214

figure_4.png

Figure 4. Comparison of accuracy for proposed model, EFNIM, and NL cost estimation models

figure_5.png

Figure 5. Examples illustrate the suggested model and two more well-known methods

MAPE and MSE often reflect accuracy as a percentage in statistics. According to this model, MAPE = 0.625 for LS-SVM, 2.214 for EFNIM, and 3.935 for NL indicating that the LS-SVM has a lower and greater percentage error than EFNIM and NR, respectively. The wide applicability of the prediction model is measured by the coefficient of determination R2, which quantifies how well data points fit a curve or a line. When represented as a percentage, R2 computes how much of the variance in the output response can be attributed to the model's predictors. R2 values fall within the range [0,1]. As shown in Figure 4, the value R2= 0.9147 for our cleverly developed model may be understood as follows: about 91% of the variation in the response is attributable to the selected predictors, and the remaining 9% is related to unknown variability.

Here, it is important to emphasize that since various types of predictive models may be appropriate depending on the type of relationship between the predictors and the dependent variable, it is best to test them all out to find the one that best fits our data and provides the most accurate predictions. The authors confirmed that the cross-validation approach was well suited for trustworthy estimation after obtaining pleasing results with LS-SVM with an error of less than 9%.

As shown in Figure 5, the prediction made using the suggested model yields results that are adequate. The minimum three indices in the final prediction results of the suggested model show strong performance and adaptability to the conceptual building cost in the construction sector. Project managers may assess construction projects appropriately, particularly in feasibility studies, by utilizing the suggested model. Additionally, it raises the likelihood that building projects will be successful.

5. Conclusion

In order to evaluate the conceptual cost of building projects, a new intelligent model created with cross-validation and support vector machine approaches was suggested in this study.

With the help of the suggested model, the LS-SVM was given a stronger ability to generalize the input-output connection it had learned throughout its training phase to make accurate predictions for brand-new input data. The Evolutionary Fuzzy Neural Interface Model and Nonlinear Regression were put up against the suggested model for a conceptual cost estimate of construction projects.

It was determined through comparison that the cross-validation approach was suitable for accurate estimation. Using the data from the building projects, the suggested model exhibited improved generalization performance and produced lower estimation errors. Given that these results pertain to actual situations from practice, the results MAPE=0.625, MSE=0.1687, and R2=0.9147 (91.5%) demonstrate the high predictive capabilities of the suggested LS-SVM.

The values of the input parameters, which were taken for Taiwan for the period from 1997-2001, have an effect on the performance of the proposed intelligent model for conceptual cost estimation in an overall way, although it matches the Iraqi conditions, but this matter must be reconsidered. As an extension of this research, optimizing the suggested model is suggested to increase yields and overcome some of the challenges.

Acknowledgment

The corresponding author extends his thanks to Al-Mustaqbal University for the financial support for the research.

References

[1] Cheng, M.Y., Tsai, H.C., Sudjono, E. (2010). Conceptual cost estimates using evolutionary fuzzy hybrid neural network for projects in construction industry. Expert Systems with Applications, 37: 4224-4231. https://doi.org/10.1016/j.eswa.2009.11.080

[2] Cheng, M.Y., Wu, Y.W. (2005). Construction conceptual cost estimates using support vector machine. In 22nd International Symposium on Automation and Robotics in Construction, ISARC, pp. 1-5.

[3] Hsieh, W.S. (2002). Construction conceptual cost estimates using evolutionary fuzzy neural inference model. M. S. thesis, Dept. of Construction Engineering, National Taiwan University of Science and Technology, Taiwan.

[4] Cheng, M.Y., Wu, Y.W. (2005). Construction conceptual cost estimates using support vector machine. In 22nd International Symposium on Automation and Robotics in Construction ISARC, pp. 1-5. https://doi.org/10.22260/ISARC2005/0064

[5] Chang, C.C., Lin, C.J. (2001). Training nu-support vector classifiers: Theory and algorithms. Neural Computation, 13(9): 2119-47. https://doi.org/10.1162/089976601750399335

[6] An, S.H., Park, U.Y., Kang, K.I., Cho, M.Y., Cho, H.H. (2007). Application of support vector machines in assessing conceptual cost estimates. Journal of Computing in Civil Engineering, 21(4): 259-264. https://doi.org/10.1061/(ASCE)0887-3801(2007)21:4(259)

[7] Kong, F., Wu, X.J., Cai, L.Y. (2008). A novel approach based on support vector machine to forecasting the construction project cost. Proc. of the International Symp. on Computational Intelligence and Design (ISCID 2008), Wuhan, China. https://doi.org/10.1109/ISCID.2008.13

[8] Kim, G.H., Shin, J.M., Kim, S., Shin, Y. (2014). Comparison of school building construction costs estimation methods using regression analysis, neural network, and support vector machine. Journal of Building Construction and Planning Research, 1(1): 1-7. https://doi.org/10.4236/jbcpr.2013.11001

[9] Mačková, D., Bašková, R. (2014). Applicability of Bromilow´s time-cost model for residential projects in Slovakia. SSP - Journal of Civil Engineering, 9(2): 5-12. https://doi.org/10.2478/sspjce-2014-0011

[10] El-Sawalhi, I.N. (2015). Support vector machine cost estimation model for road projects. Journal of Civil Engineering and Architecture, 9: 1115-1125. https://doi.org/10.17265/1934-7359/2015.09.012

[11] Firouzi, A., Yang, W., Li, C.Q. (2016). Prediction of total cost of construction project with dependent cost items. Journal of Construction Engineering and Management, 142(12): 04016072. https://doi.org/10.1061/(ASCE)CO.1943-7862.0001194

[12] Sapankevych, N.I., Sankar, R. (2009). Time series prediction using support vector machine: A survey. IEEE Computational Intelligence Magazine, 4(2): 24-38. https://doi.org/10.1109/MCI.2009.932254

[13] Mousavi, S.M., Iranmanesh, S.H. (2011). Least squares support vector machines with genetic algorithm for estimating costs in NPD projects. The IEEE International Conference on Industrial and Intelligent Information (ICIII 2011), Indonesia, Xi'an, China, pp. 127-131. https://doi.org/10.1109/ICCSN.2011.6014864

[14] Tavakkoli-Moghaddam, R., Mousavi, S.M., Hashemi, H., Ghodratnama, A. (2011). Predicting the conceptual cost of construction projects: A locally linear neuro-fuzzy model. The First International Conference on Data Engineering and Internet Technology (DEIT), pp. 398-401. https://doi.org/10.22094/JOIE.2016.231

[15] Vahdani, B., Iranmanesh, S.H., Mousavi, S.M., Abdollahzade, M. (2011). A locally linear neuro-fuzzy model for supplier selection in cosmetics industry. Applied Mathematical Modelling, 36(10): 4714-4727. https://doi.org/10.1016/j.apm.12.006

[16] Salunkhe, A.A., Patilin, P.A. (2020). Comparative analysis of construction cost estimation model using SVM and regression analysis. https://www.researchgate.net/publication/343065757_Comparative_Analysis_of_Construction_Cost_Estimation_Model_Using_SVM_and_Regression_Analysis.

[17] Pisner, D.A., Schnyer, D.M. (2020). Support vector machine. Machine Learning, Academic Press, 101-121.

[18] Wang, Y.R., Yu, C.Y., Chan, H.H. (2012). Predicting construction cost and schedule success using artificial neural networks ensemble and support vector machines classification models. International Journal of Project Management, 30(4): 470-478. https://doi.org/10.1016/j.ijproman.2011.09.002

[19] Khanday, A.M.U.D., Khan, Q.R., Rabani, S.T. (2021). SVMBPI: Support vector machine-based propaganda identification. In: Mallick, P.K., Bhoi, A.K., Marques, G., Hugo C. de Albuquerque, V. (eds) Cognitive Informatics and Soft Computing. Advances in Intelligent Systems and Computing, vol. 1317. Springer, Singapore. https://doi.org/10.1007/978-981-16-1056-1_35

[20] Huang, C.F. (2012). A hybrid stock selection model using genetic algorithms and support vector regression. Applied Soft Computing, 12(2): 807-818. https://doi.org/10.1016/j.asoc.2011.10.009

[21] Chen, K.Y., Wang, C.H. (2007). Support vector regression with genetic algorithms in forecasting tourism demand. Tourism Management, 28(1): 215-226. http://dx.doi.org/10.1016/j.tourman.2005.12.018

[22] Salgado, D.R., Alonso, F.J. (2007). An approach based on current and sound signals for in-process tool wear monitoring. International Journal of Machine Tools and Manufacture, 47(14): 2140-2152. https://doi.org/10.1016/j.ijmachtools.2007.04.013

[23] Tanveer, M., Rajani, T., Rastogi, R., Shao, Y.H., Ganaie, M.A. (2022). Comprehensive review on twin support vector machines. Annals of Operations Research, 1-46. https://doi.org/10.48550/arXiv.2105.00336

[24] Tanveer, M., Tiwari, A., Choudhary, R., Ganaie, M.A. (2022). Large-scale pinball twin support vector machines. Machine Learning, 111(10): 3525-3548. https://doi.org/10.1007/s10994-021-06061-z

IJHT
MMEP
ACSM
EJEE
ISI
I2M
JESA
RCMA
RIA
TS
IJSDP
IJSSE
IJDNE
JNMES
IJES
EESRJ
RCES
AMA_A
AMA_B
AMA_C
AMA_D
MMC_A
MMC_B
MMC_C
MMC_D

Username
Password
Remember me

Search form