This paper attempts to accurately classify ecommerce sellers based on data mining. Firstly, the original data from an ecommerce platform were preprocessed, and the classification indices were identified from five categories (products, users, traffic, sales and basic attribute). Next, the principal component analysis (PCA) and the self-organizing feature map (SOM) were fused into a hierarchical model that divides ecommerce sellers into three categories: large sellers, medium sellers and small sellers. The effectiveness of our model was verified through experiments. Finally, several operating strategies were put forward for ecommerce sellers in each category. The research results provide a good reference for the development of the ecommerce industry.
Ecommerce sellers, hierarchical model, self-organizing feature map (SOM), principal component analysis (PCA), data mining
With the rise of online shopping, ecommerce has become a research hotspot. However, the relevant studies mainly focus on the interests of users, such as the security of online transactions. There are very few reports on sellers, not to mention their hierarchy and specific operations.
The hierarchy of ecommerce sellers is usually defined by gross merchandise value (GMV) [1]. Based on the GMV, ecommerce sellers can be divided into large, medium and small sellers [2, 3]. Considering the sheer number of sellers on ecommerce platforms, the simple and rough classification is unable to support refined operations. Therefore, it is necessary to develop a comprehensive hierarchical model for sellers across ecommerce platforms.
This paper mainly proposes a hierarchical model for ecommerce sellers based on data mining [4]. Firstly, the data mining tools selected for modelling were introduced, namely, the self-organizing feature map (SOM) and principal component analysis (PCA). Next, the data processing and index selection were described in detail. After that, the SOM and the PCA were fused into our hierarchical model. Finally, the effectiveness of our model was verified through experiments [5-8].
2.1 SOM
The SOM was proposed by the Finnish professor Teuvo Kohonen of the Helsinki University of Technology. Prof. Kohonen held that the neurons in a neural network (NN) respond inconsistently to external stimuli, and the entire NN can be automatically divided into different regions based on the responses [9-11].
The SOM has a strong ability to learn the intrinsic attributes of every input, efficiently acquire the statistical features of the input, and iteratively update the parameters and structure of the NN. The competitive learning ability naturally leads to the excellence in self-organization. The SOM can continuously process and transmit the various dimensional information of the input dataset through network neurons, eliminating complex operations like derivation and differentiation.
The SOM compares favorably with other algorithms of its kind in computing efficiency and feature learning. On the one hand, the SOM updates the neuron weights iteratively based on the latest learning results, and assigns the highest weight to the optimal neuron in each round; in this way, the SOM algorithm can converge quickly to a stable solution, although convergence to the global optimum is not guaranteed in general. On the other hand, the SOM can acquire the intrinsic statistical features of input samples and reflect the probability distribution of training samples. Therefore, the SOM is suitable for classification of ecommerce sellers based on a massive amount of data.
2.2 PCA
Empirical problems generally involve various intercorrelated influencing factors, a.k.a. indices. The correlation between some indices is rather strong. If all indices are combined into a dataset, there must be many overlaps between the information carried by different data elements. If there are too many indices, the problem analysis will become very complex and require a huge computing power. In severe cases, the algorithm may face overfitting or fail to find the optimal solution [12, 13].
The PCA provides a viable solution to the above defects. This dimensionality reduction strategy mainly transforms multiple indices into a few composite indices: the original data are reduced in dimensionality; then, the useful indices are selected and imported to the algorithm, improving the analysis accuracy. The PCA was selected to preprocess the original data in our research, because the data involve five primary indices (products, users, traffic, sales and basic attribute), each of which contains multiple closely coupled secondary indices [14, 15].
The original data are the user behaviors (e.g. “page view”, “add to cart” and “order placement”) that reflect the traffic and sales of 140,000 ecommerce sellers on a Chinese ecommerce platform. The data were collected from multiple channels, including the app and website of the platform and instant messengers like WeChat and QQ.
The original data from different databases were allocated to Hadoop clusters, processed by MapReduce, and aggregated into the raw data. The raw data were examined in detail and then preprocessed in five steps: missing value treatment, outlier processing, skewed distribution correction, normalization, and correlation analysis.
(1) Missing value treatment
The R language was adopted to remove redundant entries and supplement the missing data [4, 16].
(2) Outlier processing
The calculation errors and outliers, i.e. the entries with business field of zero, were processed according to the business logic. First, the entries of the stores with no transaction were deleted. Second, the outliers containing the information of actual transactions were retained. Third, the other outliers that deviate greatly from the normal range were processed to ensure the accuracy of subsequent normalization [17].
(3) Skewed distribution correction
Most indices in the original data have strongly skewed distributions: the bulk of the sellers are concentrated at low values, while a few sellers take very large values (a long right tail). This reflects the Pareto principle: the few large sellers attract most traffic and sales on the ecommerce platform, i.e. the winner takes all. To solve the problem, the psych package of the R language was employed to obtain the descriptive statistics of each skewed index [18]. Then, the bcPower function was called from the car package of the R language to reduce the skewness of these distributions.
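The skewness correction can be illustrated with a minimal Python sketch of the Box-Cox power transform, the family that bcPower computes for a given exponent. The paper itself used R; the function names, the fixed exponent `lam=0` and the sample values below are assumptions chosen only for illustration.

```python
import math

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def box_cox(values, lam):
    """Box-Cox power transform; inputs must be strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox needs strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]  # limiting case of the family
    return [(v ** lam - 1.0) / lam for v in values]

# Hypothetical order counts: many small sellers and a few very large ones.
orders = [1, 2, 2, 3, 3, 4, 5, 8, 40, 900]
transformed = box_cox(orders, lam=0)

skew_before = skewness(orders)
skew_after = skewness(transformed)
print(f"skewness before: {skew_before:.2f}, after: {skew_after:.2f}")
```

In practice the exponent would be estimated per index rather than fixed; the sketch only shows that a power transform pulls in the long tail.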
(4) Normalization
For convenience, each number was normalized as a decimal between zero and one, and the dimensional expressions were made dimensionless. The normalization does not change the size of the dataset.
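A minimal sketch of this step, assuming min-max scaling (the paper does not name the exact formula, so this choice and the sample values are assumptions):

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]

# Hypothetical store page views for four sellers.
page_views = [120.0, 300.0, 45.0, 980.0]
normalized = min_max_normalize(page_views)
print(normalized)
```

The output has the same number of entries as the input, which matches the remark that normalization does not change the size of the dataset.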
(5) Correlation analysis
If two indices in the same dataset are highly correlated (i.e. too close to each other), much computing power will be wasted to handle identical or highly similar information in the subsequent iterative computations. Here, the correlation coefficients between the indices in our dataset were calculated by the cor function of the R language. Then, the correlations between indices were displayed with ggcorrplot, a visualization tool [19, 20].
The indices of the preprocessed data were roughly divided into five categories: products, users, traffic, sales and basic attribute. The indices in the five categories are respectively explained in Tables 1-5, where SKU is stock keeping unit (a number or code used to identify products in most online stores), SPU is standard product unit (the smallest unit of product information used by most online stores), PV is page view (the number of pages viewed by a visitor), and UV is unique visitor (an aggregate of PVs generated by the same user during the same session).
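The correlation screening can be sketched in Python with NumPy in place of R's cor function. The threshold of 0.9, the helper name and the synthetic columns are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def highly_correlated_pairs(data, names, threshold=0.9):
    """Return (name_a, name_b, r) for column pairs with |r| >= threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are indices
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic data: SUV is built to be almost proportional to SPV.
rng = np.random.default_rng(0)
spv = rng.random(200)
suv = 0.8 * spv + 0.01 * rng.random(200)
soy = rng.random(200)
data = np.column_stack([spv, suv, soy])

pairs = highly_correlated_pairs(data, ["SPV", "SUV", "SOY"])
print(pairs)
```

Pairs flagged this way carry largely redundant information, which is the motivation for the PCA step that follows.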
Table 1. The indices in the category of products

| Category | Symbol | Meaning |
| --- | --- | --- |
| Products | SPN (SKU) | The number of sold products (SKU) |
| | SPN (SPU) | The number of sold products (SPU) |
| | SPR (SKU) | The ratio of sold products to displayed products (SKU) |
| | SPR (SPU) | The ratio of sold products to displayed products (SPU) |
| | VBCR | The conversion ratio of visitors to buyers |
| | ACPN (SKU) | The number of add-to-cart products (SKU) |
| | ACPN (SPU) | The number of add-to-cart products (SPU) |
| | ACPR | The ratio of add-to-cart products to displayed products |
| | VPN (SKU) | The number of visited products (SKU) |
| | VPN (SPU) | The number of visited products (SPU) |
| | DPN (SKU) | The number of displayed products (SKU) |
| | DPN (SPU) | The number of displayed products (SPU) |
| | OPN | The number of ordered products |
As shown in Table 1, four indices in the category of products have both SKU and SPU dimensions: the number of sold products, the number of add-to-cart products, the number of visited products and the number of displayed products.
As shown in Table 2, the indices in the category of users all focus on users, and the users follow two kinds of items: product and store.
As shown in Table 3, the SPV is the sum of the PVs on any page of the store, including but not limited to the front page, the product details page, the query page and the promotion page; similarly, the SUV is the sum of the UVs on any page of the store.
Table 2. The indices in the category of users

| Category | Symbol | Meaning |
| --- | --- | --- |
| Users | ACUN | The number of users that add products to cart |
| | FGUN | The number of users that follow one or more products |
| | FSUN | The number of users that follow one or more stores |
| | POUN | The number of users that place one or more orders |
| | OBN | The number of old buyers |
| | NBN | The number of new buyers |
| | 30DRT | The 30-day repurchase rate |
| | 90DRT | The 90-day repurchase rate |
Table 3. The indices in the category of traffic

| Category | Symbol | Meaning |
| --- | --- | --- |
| Traffic | SPV | Store PV |
| | SUV | Store UV |
| | ATP | Average time on page (s) |
| | ADV | Average depth of visit |
| | BR | Bounce rate |
| | PE (SKU) | Product exposure (SKU) |
| | PE (SPU) | Product exposure (SPU) |
| | PPV | Product page views |
| | PVN | Number of product visitors |
| | PVPCR | The conversion ratio of product visitors to order placers |
| | SVPCR | The conversion ratio of store visitors to order placers |
Table 4. The indices in the category of sales

| Category | Symbol | Meaning |
| --- | --- | --- |
| Sales | OA | Order amount |
| | ON | The number of orders |
| | PBA | Per buyer amount |
| | OPON | The number of orders paid online |
| | OPDN | The number of orders paid upon delivery |
| | OPTN | The number of orders paid by bank transfer |
As shown in Table 4, the indices in the category of sales all focus on user purchases. There are three payment methods for ecommerce users: online payment, payment upon delivery and bank transfer.
Table 5. The index in the category of basic attribute

| Category | Symbol | Meaning |
| --- | --- | --- |
| Basic attribute | SOY | The number of years since the opening of the online store |
As shown in Table 5, a store with large SOY tends to gain rich sales experience. The greater the SOY, the more likely it is for the seller to become a large seller. Therefore, the SOY is an important reference for the classification of ecommerce sellers.
4.1 The PCA phase
The PCA is one of the most popular unsupervised algorithms for dimensionality reduction of features. Through the PCA, the correlated high-dimensional indices are linearly mapped to the low-dimensional space. The resulting low-dimensional indices are called principal components. This linear mapping is comparable to transforming the original dataset to a new coordinate system containing n orthogonal coordinates. The first coordinate is the first principal component, the second coordinate is the second principal component, and the rest may be deduced by analogy. Based on the maximum variance theory, the first principal component has the greatest explanatory power of the original dataset, followed in descending order by the second to the nth principal component [21, 22].
As the first part of our hierarchical model, the PCA is performed on the normalized dataset, using the FactoMineR and factoextra packages of the R language. According to the Kaiser-Harris criterion, the principal components whose eigenvalues are greater than one should be selected to represent the feature space of the original dataset, because each such component explains more variance than a single standardized index. In this study, the threshold was relaxed to capture sufficient feature information: all the principal components whose eigenvalues are greater than 0.4 were selected to form the training set. The established training set contains 147,008 entries, each of which covers features in 14 dimensions.
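The selection rule can be sketched in Python with NumPy; the paper used FactoMineR in R, so the function `pca_select`, its `eig_threshold` parameter and the synthetic data are assumptions for illustration only. The sketch runs the PCA on the correlation matrix, so the eigenvalues sum to the number of indices and an eigenvalue of one corresponds to the variance of a single standardized index.

```python
import numpy as np

def pca_select(data, eig_threshold=1.0):
    """PCA on the correlation matrix, keeping components above a threshold."""
    centered = data - data.mean(axis=0)
    scaled = centered / data.std(axis=0, ddof=1)   # standardize each index
    corr = np.cov(scaled, rowvar=False)            # equals the correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]              # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > eig_threshold
    scores = scaled @ eigvecs[:, keep]             # projected training set
    explained = eigvals / eigvals.sum()            # share of variance explained
    return scores, eigvals, explained

# Synthetic data: four indices forming two strongly correlated pairs.
rng = np.random.default_rng(1)
base = rng.normal(size=(500, 2))
noise = 0.1 * rng.normal(size=(500, 2))
data = np.column_stack([
    base[:, 0], base[:, 0] + noise[:, 0],
    base[:, 1], base[:, 1] + noise[:, 1],
])

scores, eigvals, explained = pca_select(data, eig_threshold=1.0)
print(eigvals.round(3), scores.shape)
```

Lowering `eig_threshold` (the paper uses 0.4) keeps more components at the cost of a higher-dimensional training set.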
4.2 The SOM phase
As shown in Figure 1, the SOM, the second part of our hierarchical model, is trained through the following steps:
First, the number of output layer neurons is configured and inputted into the model, and the weight of each output layer neuron is initialized as a small random number.
Second, the input data and weights are normalized. Although every index has been normalized in preprocessing, every entry and weight must be normalized again to form a consistent training set.
Third, some samples are collected from the dataset by a sampling module and taken as training samples. The sampling is highly necessary, as it is difficult to train the model with all the 147,008 entries in the preprocessed dataset. Distance calculation is essential to model training. Thus, Euclidean distance and cosine distance are compared to find which is more suitable for the training of our model.
Fourth, weight update is implemented after each input for the neurons in the neighborhood radius of the winning neuron, and the updated weights are normalized again.
Fifth, the learning rate and neighborhood radius are defined as two functions of the iteration count, to be called during model training. Both parameters decrease as the number of iterations grows.
Sixth, the model training is terminated under one of the following conditions: the learning rate falls below the preset threshold; the number of iterations surpasses the preset maximum number of iterations.
Figure 1. Workflow of the SOM
4.3 Workflow of PCA-SOM model
In general, the PCA-SOM model can be executed in nine steps:
Step 1. Dataset reading
Filter out the store IDs and save feature data of 14 dimensions. Read the data with the pandas Python library. To facilitate subsequent calculations, convert the obtained DataFrame into a NumPy array.
Step 2. Data initialization
First, set up the output layer neurons as a 2×2 array, which contains four 14-dimensional neurons. Next, randomly initialize the 14×4 weight matrix, and set another two parameters (i.e. size of sampling batch and number of iterations). Define these parameters as passable arguments in the code, laying the basis for tuning and reference in subsequent experiments.
Step 3. Data normalization
Normalize the sampled dataset and output matrix of weight errors before each round of competitive learning:
$x_{i}^{\prime}=\frac{x_{i}}{\sqrt{x_{1}^{2}+x_{2}^{2}+\ldots+x_{n}^{2}}}$ (1)
where $\left(x_{1}, x_{2}, \cdots, x_{n}\right)$ is the vector being normalized.
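Eq. (1) divides each element of a vector by the vector's Euclidean length. A minimal Python sketch follows; the function name and the zero-vector handling are assumptions.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length, as in Eq. (1)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)  # assumption: leave an all-zero vector unchanged
    return [x / norm for x in vector]

unit = l2_normalize([3.0, 4.0])
print(unit)  # [0.6, 0.8]
```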
Step 4. Competitive learning
Compute the distances between each sample and the weights of the four output layer neurons. Since the data have been normalized, replace the distance computation with the dot product of weights, i.e. the multiplication between two matrices, and then select the winning neuron for each sample.
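The replacement works because, for unit-length vectors, the squared Euclidean distance equals 2 minus twice the dot product, so the neuron with the largest dot product is also the nearest one. The sketch below checks this numerically with NumPy; the array sizes and random data are assumptions for illustration.

```python
import numpy as np

def unit_rows(matrix):
    """Normalize every row to unit Euclidean length."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.where(norms == 0, 1.0, norms)

rng = np.random.default_rng(0)
samples = unit_rows(rng.random((6, 14)))  # 6 hypothetical sellers, 14 components
weights = unit_rows(rng.random((4, 14)))  # 4 output layer neurons

# Winner by dot product: one matrix multiplication for the whole batch.
similarity = samples @ weights.T          # shape (6, 4)
winners_dot = similarity.argmax(axis=1)

# Winner by explicit Euclidean distance, for comparison.
diff = samples[:, None, :] - weights[None, :, :]
distances = np.linalg.norm(diff, axis=2)  # shape (6, 4)
winners_dist = distances.argmin(axis=1)

print(winners_dot, winners_dist)
```

The matrix product avoids building the intermediate difference array, which matters when each sampling batch holds thousands of entries.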
Step 5. Weight update
Based on the winning neuron and neighborhood radius, identify the weights to be updated, and update them at the preset learning rate:
$W_{j}(n+1)=W_{j}(n)+\eta_{i(x) j}(n)\left(X(n)-W_{j}(n)\right)$ (2)
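A minimal sketch of this step, taking the update as the standard Kohonen rule W + η(X − W), i.e. each selected weight vector moves a fraction η of the way toward the sample. For simplicity a single learning rate `eta` is shared by all neurons in the neighborhood, which is an assumption; the function and argument names are hypothetical.

```python
import numpy as np

def update_weights(weights, sample, neighbor_ids, eta):
    """Move the selected weight rows toward the sample, then re-normalize them."""
    updated = weights.copy()
    for j in neighbor_ids:
        updated[j] += eta * (sample - updated[j])
        norm = np.linalg.norm(updated[j])
        if norm > 0:
            updated[j] /= norm  # keep updated weights at unit length
    return updated

weights = np.array([[1.0, 0.0], [0.0, 1.0]])
sample = np.array([0.0, 1.0])
new_weights = update_weights(weights, sample, neighbor_ids=[0], eta=0.5)
print(new_weights)
```

In the example, neuron 0 is pulled halfway toward the sample and rescaled, while neuron 1 lies outside the neighborhood and stays unchanged.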
Step 6. Update of neighborhood radius and learning rate
Update neighborhood radius and learning rate after weight update in each round, such that the model can adapt to the latest environment. The neighborhood radius can be updated by:
$N=a-\frac{a \cdot t}{\text { iter }}$ (3)
where, a is the short-edge distance of the output layer; t is the current number of iterations; iter is the maximum number of iterations. Obviously, the neighborhood radius is negatively correlated with the number of iterations.
The learning rate can be updated by:
$\eta=\frac{e^{n}}{t+2}$ (4)
where, n is the current neighborhood radius; t is the current number of iterations. Obviously, the learning rate is also negatively correlated with the number of iterations.
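The two schedules can be sketched as plain Python functions. The radius is read here as N = a − a·t/iter, which matches the stated negative correlation with the iteration count; the learning rate follows Eq. (4) as printed, and the sign of its exponent should be treated as an assumption. Function and argument names are hypothetical.

```python
import math

def neighborhood_radius(t, max_iter, short_edge):
    """Radius shrinking linearly from short_edge (t = 0) to 0 (t = max_iter)."""
    return short_edge - short_edge * t / max_iter

def learning_rate(t, n):
    """Learning rate as printed in Eq. (4); exponent sign taken from the text."""
    return math.exp(n) / (t + 2)

# A 2x2 output layer (short edge 2) and 5 iterations, as in Parameter set A.
radii = [neighborhood_radius(t, max_iter=5, short_edge=2) for t in range(6)]
# With the radius argument held fixed, the rate falls as t grows.
rates = [learning_rate(t, n=0) for t in range(6)]
print(radii)
print(rates)
```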
Step 7. Checking termination condition
If the current number of iterations is greater than the maximum number of iterations, terminate the training and save the latest weights.
Step 8. Sample prediction
According to the latest weights, calculate the level (category) of each seller, and save the predicted dataset.
Step 9. Visualization
Visualize the predicted dataset with PyLab, such that each category is displayed in a unique color.
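Steps 2 to 8 can be put together in a compact sketch. This is not the authors' code: the Chebyshev grid distance, the learning rate exp(−d)/(t+2) with d the grid distance to the winner, the function names and the random data are all assumptions standing in for details the paper does not spell out.

```python
import numpy as np

def unit_rows(matrix):
    """Normalize every row to unit Euclidean length."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.where(norms == 0, 1.0, norms)

def train_som(data, grid=(2, 2), max_iter=5, batch_size=200, seed=0):
    """Train a small SOM with per-iteration sampling; returns unit-length weights."""
    rng = np.random.default_rng(seed)
    coords = np.array([(r, c) for r in range(grid[0]) for c in range(grid[1])])
    weights = unit_rows(rng.random((len(coords), data.shape[1])))  # Step 2
    short_edge = min(grid)
    for t in range(max_iter):                                      # Step 7
        radius = short_edge - short_edge * t / max_iter            # Step 6
        size = min(batch_size, len(data))
        idx = rng.choice(len(data), size=size, replace=False)
        batch = unit_rows(data[idx])                               # Step 3
        for x in batch:
            winner = int(np.argmax(weights @ x))                   # Step 4
            # Assumed neighborhood: Chebyshev distance on the output grid.
            dist = np.abs(coords - coords[winner]).max(axis=1)
            for j in np.flatnonzero(dist <= radius):               # Step 5
                eta = np.exp(-dist[j]) / (t + 2)  # assumed stand-in schedule
                weights[j] += eta * (x - weights[j])
            weights = unit_rows(weights)
    return weights

def predict(data, weights):
    """Step 8: assign each seller to the neuron with the largest dot product."""
    return np.argmax(unit_rows(data) @ weights.T, axis=1)

# Hypothetical stand-in for the 14-dimensional PCA training set.
rng = np.random.default_rng(42)
data = rng.random((300, 14))
weights = train_som(data, max_iter=5, batch_size=100)
labels = predict(data, weights)
print(np.bincount(labels, minlength=4))
```

The batch size and iteration count are exposed as arguments, mirroring the two tunable parameters described in Step 2.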
5.1 Parameter settings
Table 6. Correlations between 14 principal components and four categories in Parameter Set A

| Label | Dim1 | Dim2 | Dim3 | Dim4 | Dim5 | Dim6 | Dim7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.145462 | 0.114938 | 0.368108 | 0.391941 | 0.089378 | 0.263032 | 0.175507 |
| 1 | 0.252905 | 0.18781 | 0.218794 | 0.15147 | 0.623843 | 0.086672 | 0.120722 |
| 2 | 0.898651 | 0.140505 | 0.080103 | 0.080886 | 0.075871 | 0.063685 | 0.065003 |
| 3 | 0.24596 | 0.810467 | 0.105459 | 0.085603 | 0.114202 | 0.07422 | 0.077887 |

| Label | Dim8 | Dim9 | Dim10 | Dim11 | Dim12 | Dim13 | Dim14 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.137724 | 0.080217 | 0.158189 | 0.086634 | 0.057333 | 0.063127 | 0.089902 |
| 1 | 0.132 | 0.083466 | 0.073325 | 0.067509 | 0.063904 | 0.074067 | 0.089163 |
| 2 | 0.046135 | 0.033628 | 0.040682 | 0.024839 | 0.022028 | 0.024663 | 0.030701 |
| 3 | 0.060589 | 0.056506 | 0.080123 | 0.03729 | 0.039968 | 0.04165 | 0.048224 |
Through repeated tuning, two sets of parameters were selected to train our model. Parameter set A: maximum number of iterations iter=5; size of sampling batch batchsize=20,000. Parameter set B: maximum number of iterations iter=10; size of sampling batch batchsize=20,000. After the training, the final dataset under each parameter set was saved to analyze the hierarchy of ecommerce sellers.
Since the data on each seller contain 14 principal components, the correlations between each principal component and each category were obtained through clustering. The results under each parameter set are recorded as a correlation matrix (Tables 6 and 7). The correlation matrix represents the distribution of eigenvalues of each category across the 14 principal components [23, 24].
Table 7. Correlations between 14 principal components and four categories in Parameter Set B

| Label | Dim1 | Dim2 | Dim3 | Dim4 | Dim5 | Dim6 | Dim7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.144169 | 0.13101 | 0.457498 | 0.319664 | 0.087179 | 0.246859 | 0.19463 |
| 1 | 0.215309 | 0.214191 | 0.146652 | 0.224041 | 0.594014 | 0.098243 | 0.124524 |
| 2 | 0.89625 | 0.141599 | 0.078692 | 0.081162 | 0.081033 | 0.064357 | 0.064039 |
| 3 | 0.246716 | 0.825063 | 0.093644 | 0.085554 | 0.111229 | 0.072239 | 0.072092 |

| Label | Dim8 | Dim9 | Dim10 | Dim11 | Dim12 | Dim13 | Dim14 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.118006 | 0.089593 | 0.135805 | 0.08509 | 0.057211 | 0.057739 | 0.076073 |
| 1 | 0.147773 | 0.083255 | 0.100536 | 0.069718 | 0.063902 | 0.077903 | 0.108046 |
| 2 | 0.048089 | 0.033167 | 0.042368 | 0.025315 | 0.022536 | 0.025333 | 0.031392 |
| 3 | 0.058085 | 0.054796 | 0.077205 | 0.035749 | 0.039071 | 0.041042 | 0.046528 |
5.2 Results analysis
As shown in Table 6, the eigenvalues of category 0 mainly appeared in principal components 3 and 4, those of category 1 in principal component 5, those of category 2 in principal component 1, and those of category 3 in principal component 2. The comparison between Tables 6 and 7 shows that the eigenvalues were distributed similarly between the two parameter sets. The following results were drawn from the two tables:
(1) The top 5 principal components explain over 77% of the variance in the entire dataset.
(2) Principal component 1 (Dim1) explains 41% of the variance in the dataset; the greater the variance, the more important the component. This component has a significant positive correlation with the indices of absolute values in the categories of traffic, products and sales.
(3) Principal component 2 (Dim2) explains 16% of the variance in the dataset. This component has a significant positive correlation with the indices of ratios (conversion ratios) in the categories of products and traffic, and a significant negative correlation with indices about sold products, displayed products and visited products.
(4) Principal component 3 (Dim3) has a significant positive correlation with BR and ADV. It can be abstracted as the decoration quality and attractiveness of a store.
(5) Principal component 4 (Dim4) is positively correlated with BR and ADV, as well as several indices of ratios (conversion ratios).
(6) Principal component 5 (Dim5) has a relatively weak correlation with the indices. The only exceptions are the strong correlation with SOY, and slight correlations with several indices of ratios (conversion ratios).
5.3 Discussion
The sellers in Category 0 are characterized by high levels of BR, ADV, VBCR and SVPCR. The high BR and ADV indicate that the stores are well decorated and organized: upon accessing a page of such a store, the user prefers to browse the related pages, rather than jump elsewhere in a short time. The high VBCR and SVPCR reflect the good traffic value of these stores, i.e. many visitors are converted to buyers per unit of traffic [25].
The sellers in Category 1 have a relatively long SOY. Among the top 5 principal components, principal component 5 plays the greatest role among these sellers. Hence, many of these sellers have been operating for many years. Meanwhile, the retention and conversion ratios of these old sellers are not desirable, as they fail to update the operation mode according to the constant changes in user preference and platform policy [26].
The sellers in Category 2 stand out for the indices of absolute values in the categories of traffic, products and sales. These indices are yardsticks of the scale of stores. Thus, Category 2 sellers generally have many displayed products, high traffic and good sales. However, these sellers perform poorly in retention and conversion ratios. As shown in Tables 6 and 7, these sellers lag behind their counterparts of other categories in Dim14, especially in Dim1. These results show that the excellence of these sellers in sales comes from the traffic brought by the scale of products, instead of high traffic value.
The sellers in Category 3 have high ratios (conversion ratios) in the categories of products and traffic, but very low results in indices about sold products, displayed products and visited products. These sellers must have a small scale of products. The high ratios (conversion ratios) are attributable to the small base of products. In general, Category 3 sellers are small sellers on the ecommerce platform.
5.4 Countermeasures
The large sellers, mostly in Category 2, generally possess well-known brands, numerous products and high traffic, and achieve high results on the indices in the category of sales. The common problem among these sellers lies in the poor traffic quality and low conversion ratios. To solve the problems, the ecommerce platform should encourage them to optimize the product structure (e.g. eliminating slow movers), and improve store decoration.
The medium sellers, mostly in Categories 0 and 1, have good traffic quality or a long SOY, i.e. the sellers have been working hard for better sales. The ecommerce platform should guide these sellers to expand the scale of their stores, and provide support to those with a long SOY, aiming to optimize their store structure and quality.
The small sellers, mostly in Category 3, typically operate a small store with very low SPN, DPN and VPN. Their high conversion ratios cannot cover up the serious operating problems. The ecommerce platform must acknowledge the three problems of these sellers, namely small scale, low attractiveness and irrational product structure, and help them to solve each problem. For example, the platform could encourage these sellers to refine store decoration, relax the promotion policy on them, and assign high weights to their products.
This paper puts forward a novel hierarchical model for ecommerce sellers based on the PCA and the SOM, uses the model to classify 140,000 sellers on an ecommerce platform, and designs operating strategies for the sellers in each category. The research results provide a good reference for the operation of ecommerce sellers.
Future research will focus on the following aspects: the hierarchical model will be improved with data involving more dimensions; the product range will be introduced to the classification problem; the SOM algorithm will be improved in the light of the actual demand of ecommerce sellers and the state of online shopping.
[1] Zhang, J., Zhang, C., Yu, H. (2018). Research on e-commerce intelligent service based on data mining. In MATEC Web of Conferences, 173: 03012. https://doi.org/10.1051/matecconf/201817303012
[2] Najafi, I. (2019). Assessment and modeling of decision-making process for e-commerce trust based on machine learning algorithms. Fundamental Research in Electrical Engineering, 969-986. https://doi.org/10.1007/978-981-10-8672-4_74
[3] Ju, C., Wang, J., Zhou, G. (2019). The commodity recommendation method for online shopping based on data mining. Multimedia Tools and Applications, 78(21): 30097-30110. https://doi.org/10.1007/s11042-018-6980-7
[4] Shah, T.H., Naveed, N., Rauf, Z. (2018). A methodology for brand name hierarchical clustering based on social media data. Journal of Applied and Emerging Sciences, 8(1): 10-23. http://dx.doi.org/10.36785/jaes.v8i1.238
[5] Alsenan, S., Zemirli, N. (2016). PERSO-retailer: Modeling the retailer's business data: Toward recommender system of retailers' marketing plan for personalized CMS. 2016 International Conference on Computing, Communication and Automation (ICCCA), Noida, pp. 106-111. http://dx.doi.org/10.1109/CCAA.2016.7813699
[6] Mu, M., Liu, C. (2018). Research on the construction of o2o e-commerce and express industry collaborative development model in big data environment. In Proceedings of the 2018 International Conference on Internet and e-Business, 29-33. https://doi.org/10.1145/3230348.3230424
[7] Castro-Lopez, A., Alonso, J.M. (2019). Modeling human perceptions in e-commerce applications: A case study on business-to-consumers websites in the textile and fashion sector. Applying Fuzzy Logic for the Digital Economy and Society, 115-134. https://doi.org/10.1007/978-3-030-03368-2_6
[8] Zaim, H., Ramdani, M., Haddi, A. (2018). A model of e-commerce self-assessment system based on e-customer behavior. Smart Application and Data Analysis for Smart Cities (SADASC'18). http://dx.doi.org/10.2139/ssrn.3179244
[9] Sulova, S. (2018). Integration of structured and unstructured data in the analysis of e-commerce customers. International Multidisciplinary Scientific GeoConference: SGEM: Surveying Geology & Mining Ecology Management, 18: 499-505. http://dx.doi.org/10.5593/sgem2018/2.1/S07.063
[10] Hsieh, P.H. (2019). A study of models for forecasting e-commerce sales during a price war in the medical product industry. International Conference on Human-Computer Interaction, 3-21. https://doi.org/10.1007/978-3-030-22335-9_1
[11] García, M.D.M.R., García-Nieto, J., Aldana-Montes, J.F. (2016). An ontology-based data integration approach for web analytics in e-commerce. Expert Systems with Applications, 63: 20-34. https://doi.org/10.1016/j.eswa.2016.06.034
[12] Lee, H.C., Rim, H.C., Lee, D.G. (2019). Learning to rank products based on online product reviews using a hierarchical deep neural network. Electronic Commerce Research and Applications, 36: 100874. https://doi.org/10.1016/j.elerap.2019.100874
[13] Goswami, A., Mohapatra, P., Zhai, C. (2019). Quantifying and visualizing the demand and supply gap from e-commerce search data using topic models. In Companion Proceedings of the 2019 World Wide Web Conference, 348-353. https://doi.org/10.1145/3308560.3316605
[14] Huang, H.J., Yang, J., Zheng, B. (2019). Demand effects of product similarity network in e-commerce platform. Electronic Commerce Research, 1-31. https://doi.org/10.1007/s10660-019-09352-9
[15] Yoo, B., Jang, M. (2019). A bibliographic survey of business models, service relationships, and technology in electronic commerce. Electronic Commerce Research and Applications, 33: 100818. https://doi.org/10.1016/j.elerap.2018.11.005
[16] Hamidi, H., Moradi, S. (2017). Analysis of consideration of security parameters by vendors on trust and customer satisfaction in e-commerce. Journal of Global Information Management (JGIM), 25(4): 32-45. https://doi.org/10.4018/JGIM.2017100103
[17] Wakil, K., Alyari, F., Ghasvari, M., Lesani, Z., Rajabion, L. (2019). A new model for assessing the role of customer behavior history, product classification, and prices on the success of the recommender systems in e-commerce. Kybernetes. https://doi.org/10.1108/K-03-2019-0199
[18] Zhao, J., Wang, L., Li, D.A., Li, Y., Yang, B., Zhu, B., Bai, R. (2018). Mining shopping data with passive tags via velocity analysis. EURASIP Journal on Wireless Communications and Networking, 2018(1): 1-13. https://doi.org/10.1186/s13638-018-1033-5
[19] Sohaib, O., Naderpour, M., Hussain, W., Martinez, L. (2019). Cloud computing model selection for e-commerce enterprises using a new 2-tuple fuzzy linguistic decision-making method. Computers & Industrial Engineering, 132: 47-58. https://doi.org/10.1016/j.cie.2019.04.020
[20] Liu, S. (2016). The innovation study of e-business mode based on big database environment. In 2016 2nd International Conference on Education Technology, Management and Humanities Science, 416-419. https://doi.org/10.2991/etmhs-16.2016.92
[21] Behl, A., Dutta, P., Lessmann, S., Dwivedi, Y.K., Kar, S. (2019). A conceptual framework for the adoption of big data analytics by e-commerce startups: a case-based approach. Information Systems and e-Business Management, 17(2): 285-318. https://doi.org/10.1007/s10257-019-00452-5
[22] Leng, K., Jing, L., Lin, I.C., Chang, S.H., Lam, A. (2019). Research on mining collaborative behaviour patterns of dynamic supply chain network from the perspective of big data. Neural Computing and Applications, 31(1): 113-121. https://doi.org/10.1007/s00521-018-3666-z
[23] Jiang, H., Sabharwal, A., Henderson, A., Hu, D., Hong, L. (2019). Understanding the role of style in e-commerce shopping. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 3112-3120. https://doi.org/10.1145/3292500.3330760
[24] Kaushik, V., Khare, A., Boardman, R., Cano, M.B. (2020). Why do online retailers succeed? The identification and prioritization of success factors for Indian fashion retailers. Electronic Commerce Research and Applications, 39: 100906. https://doi.org/10.1016/j.elerap.2019.100906
[25] Leung, W., Shi, S., Chow, W. (2019). Impacts of user interactions on trust development in C2C social commerce: The central role of reciprocity. Internet Research. https://doi.org/10.1108/INTR-09-2018-0413
[26] Sharma, H., Aggarwal, A. (2019). Finding determinants of e-commerce success: a PLS-SEM approach. Journal of Advances in Management Research, 16(4): 453-471. https://doi.org/10.1108/JAMR-08-2018-0074