The world has undergone a dramatic transformation since the advent of Information Technology: it is hard to find people who are not cyber-connected, and nearly every activity is guided and regulated by connected networks. As the world grows dependent on information technology, research on monitoring cyber activities around the world has grown to a similar extent, and it is now vital to classify and predict cybercrime in this connected era. The objective of this paper is to classify cybercrime judgment precedents in order to provide knowledgeable and relevant information to cybercrime legal stakeholders. Extracting information from precedents is a crucial research problem because, with the remarkable evolution of the internet and big data analytics, a very large number of judgments is available in digital form. It is therefore necessary to classify the precedents and to provide a bird's-eye view of the relevant legal topics. In this study, 2500 cybercrime-related judgments are used to evaluate the proposed Feed Forward Neural - Shuffled Frog Leaping (FNN-SFL) model: a feed forward neural model whose term weights are tuned by adapting a bio-inspired optimization method, the Shuffled Frog Leaping algorithm. Experiments implementing the newly proposed FNN-SFL algorithm are conducted and the results are discussed. Conclusions and future scope are presented at the end of the paper.
Keywords: judgement case classification, shuffled frog leaping model, optimization
Data mining, also known as knowledge discovery, is the computer-assisted process of digging through and examining massive collections of data and then extracting the useful information present in them. It is also a method for discovering intelligent, interesting, and novel patterns, as well as descriptive, understandable, and predictive models, from large-scale data [1]. Put simply, data mining refers to the process of extracting knowledge that is relevant to the user. Owing to the intense growth of digital data on the web, technology now allows systems to perform summarization and thereby provide access to an abbreviated form of digital information [2]. Such technology has been deployed in various fields to improve the work carried out in them.
Nowadays legal experts need the research community to produce innovations that reduce their workload and speed up the legal process. For this reason, summarization techniques have been applied in the legal field to enhance the judgment summarization process. The Indian legal system follows both statutes and the common law.
Statutes are the legislative enactments or regulations issued by the government, while common law is developed by judges through the decisions of courts and tribunals. In detail, a common-law ruling is also called a 'precedent': a rule of law established by a court for the first time for a particular kind of case, which is thereafter referred to for decision-making in similar cases. The decisions of judges are thus sources of law. At present, legal professionals carry out the complex clerical work of interpreting legal points and condensing the content of past judgments for their case arguments or to reach decisions from them, a process that demands both accuracy and speed. Human-produced summaries require extra time and labour and are relatively costly, and producing a judgment summary is a repetitive task as well. NLP-based summarization techniques therefore satisfy the requirements of legal experts in a simple and efficient way. In this paper we develop an effective FNN-SFL model to classify documents based on their relevancy.
Artificial Neural Networks (ANNs) play a vital role in classification and prediction in machine learning. The Feed Forward Neural Network (FNN) is a subclass of ANN in which data flows only in the forward direction; there is no backward propagation to tune the weights. In such cases the weights are chosen at random, which may degrade the performance of the FNN. In recent years, evolutionary algorithms have been used to solve such optimization problems. Shuffled Frog Leaping (SFL) is one such optimization model, which works on the basis of memeplexes of frogs. In this paper the SFL algorithm is used to tune the random weights of the FNN.
Instance-based learning algorithms are lazy learning algorithms [3], as they delay the induction or generalization process until classification is performed. Lazy learning algorithms require less computation time during the training phase than eager learning algorithms (such as decision trees, neural networks and Bayes nets) but more computation time during the classification process. One of the simplest instance-based learning algorithms is the nearest neighbour algorithm. Zaki and Meira Jr. [4] presented a review of instance-based learning classifiers. In this section, starting from a short description of the nearest neighbour algorithm, we refer to some more recent works. K-Nearest Neighbour (KNN) is based on the principle that instances within a dataset generally lie in close proximity to other instances that have similar properties [5]. If the instances are tagged with a classification label, the label of an unclassified instance can be determined by observing the classes of its nearest neighbours. KNN finds the k instances nearest to the query instance and determines its class by identifying the single most frequent class label among them.
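To make the decision rule concrete, the majority-vote step described above can be sketched as follows. This is a minimal NumPy illustration with invented toy data, not code from any of the cited studies:

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote among its k nearest training instances."""
    # Euclidean distance from the query to every stored instance
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest instances
    nearest = np.argsort(dists)[:k]
    # Most frequent class label among those neighbours
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy data: two instances of class 0 near the origin, two of class 1 near (1, 1)
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.95, 0.9])))  # → 1 (query lies in the class-1 cluster)
```

Note that every query requires a distance computation against all stored instances, which is the storage and classification-time cost discussed below.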
A survey of weighting schemes is given by Wettschereck et al. [6]. The power of KNN has been demonstrated in a number of real-world domains, but there are some reservations about its usefulness, for example: (i) it has large storage requirements; (ii) it is sensitive to the choice of the similarity function used to compare instances; and (iii) it lacks a principled way to choose k, other than through cross-validation or a similarly computationally expensive procedure [7]. In particular, the choice of k affects the performance of the KNN algorithm and is one reason why a K-Nearest Neighbour classifier may incorrectly classify a query instance.
Wettschereck et al. [6] investigated the behaviour of KNN in the presence of noisy instances. Their experiments showed that the performance of KNN was not sensitive to the exact choice of k when k was large. They found that for small values of k, the KNN algorithm was more robust than the single nearest neighbour algorithm (1NN) for the majority of the large datasets tested. However, the performance of KNN was inferior to that achieved by 1NN on small datasets.
Okamoto and Yugami [8] expressed the expected classification accuracy of k-NN as a function of domain characteristics including the number of training instances, the numbers of relevant and irrelevant attributes, the probability of each attribute, the noise rate for each type of noise, and k. They also explored the behavioural implications of their analyses by demonstrating the effects of domain characteristics on the expected accuracy of k-NN and on the optimal value of k for artificial domains.
The time needed to classify a query instance is closely related to the number of stored instances and the number of features used to describe each instance. Accordingly, in order to reduce the number of stored instances, instance-filtering algorithms have been proposed by Kubat and Cooperson [9]. Brighton and Mellish [10] found that their ICF algorithm and the RT3 algorithm [11] achieved the highest degree of instance-set reduction as well as retention of classification accuracy: they come close to achieving unintrusive storage reduction. How well these algorithms perform is quite impressive: on average 80% of instances are removed and classification accuracy does not drop significantly. Another choice in designing a training-set reduction algorithm is to transform the instances into another representation, such as prototypes. The study in [12] reported that the stability of nearest neighbour classifiers distinguishes them from decision trees and some kinds of neural networks; a learning method is termed "unstable" if small changes in the training/test split can result in large changes in the resulting classifier. As already mentioned, the major disadvantage of instance-based classifiers is their large computational time for classification. A key issue in many applications is to determine which of the available input features should be used in modelling via feature selection [13], because this can improve classification accuracy and reduce the required classification time. Moreover, choosing a more suitable distance metric for the particular dataset can improve the accuracy of instance-based classifiers.
The novel contribution of this paper is the classification of cybercrime judgment precedents in order to provide expert and pertinent information to cybercrime legal stakeholders. Extracting information from precedents is a crucial research problem because, with the remarkable evolution of the internet and big data analytics, a very large number of judgments is available in digital form, and it is necessary to classify the precedents and provide a bird's-eye view of the relevant legal topics. In this study, 2500 cybercrime-related judgments are used to evaluate the proposed Feed Forward Neural - Shuffled Frog Leaping (FNN-SFL) model, a feed forward neural model whose term weights are tuned by adapting the bio-inspired Shuffled Frog Leaping algorithm.
2.1 Classification of the judgement database
The summarization of the documents in a criminal case is the basis for judgment classification. In the base model, public prosecutors are the people responsible for pursuing the prosecution against the offender. Some matters are of lesser impact, and the judgment delivered in court is summarised in the judgment case files. This is done after the police procedures have been carried out: the police collect the evidence from the crime scenes, and the summary is submitted before the judges. These are the criteria followed in serious criminal cases [14].
In cybercrime cases, the police submit the proofs in the form of soft copies, and these are not summarised in a full-fledged manner in the case files. In such cases it is highly difficult for a classification model to categorise the material into a proforma.
In our model we use an ANN for classification, handled together with the other procedures, namely feature-based classification and weight tuning. For weight tuning we incorporate the Shuffled Frog Leaping algorithm.
2.2 Working principle of the multi-layer perceptron in FNN
An example of a multilayer perceptron (MLP) in a Feed Forward Neural Network is given in Figure 1. It consists of one input layer, which can take n input features; one hidden layer, in which the number of neurons can be higher or lower than the number of input neurons; and one output layer, which must contain at least one neuron.
The MLP output is calculated as follows. First, the weighted sum of the inputs is computed using Eq. (1):
$s_{j}=\sum_{i=1}^{n}\left(W_{i, j} \times X_{i}\right)-\theta_{j}$ (1)
Figure 1. MLP
In Eq. (1) the sum runs over the n input neurons, with $X_i$ the i-th input feature, $W_{i,j}$ the weight of the edge from input node i to hidden node j, and $\theta_j$ the bias of hidden node j. The value $s_j$ is the net input of hidden neuron j; its activation, which serves as the input to the output layer, is computed as follows:
$S_{j}=\frac{1}{\left(1+\exp \left(-s_{j}\right)\right)}$ (2)
where j ranges from 1 to h, with h the number of hidden neurons in the hidden layer.
Then the final output can be computed as
$o_{k}=\sum_{j=1}^{h}\left(W_{j, k} \times S_{j}\right)-\theta_{k}$ (3)
where k ranges from 1 to m, with m the number of output neurons.
$O_{k}=\frac{1}{\left(1+\exp \left(-o_{k}\right)\right)}$ (4)
where $W_{j,k}$ represents the weight of the edge from hidden node j to output node k, and $\theta_k$ is the bias value of output node k.
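Eqs. (1)-(4) can be sketched directly in code. The following minimal NumPy illustration computes the forward pass; the layer sizes and random weights are illustrative assumptions, not values from the paper:

```python
import numpy as np

def sigmoid(x):
    # Logistic activation used in Eqs. (2) and (4)
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(X, W1, theta1, W2, theta2):
    # X: (n,) inputs; W1: (n, h) input-to-hidden weights; theta1: (h,) hidden biases
    # W2: (h, m) hidden-to-output weights; theta2: (m,) output biases
    s = X @ W1 - theta1    # Eq. (1): net input s_j of each hidden neuron
    S = sigmoid(s)         # Eq. (2): hidden activations S_j
    o = S @ W2 - theta2    # Eq. (3): net input o_k of each output neuron
    return sigmoid(o)      # Eq. (4): final outputs O_k

rng = np.random.default_rng(42)
n, h, m = 4, 3, 1          # illustrative input, hidden and output layer sizes
out = mlp_forward(rng.random(n), rng.random((n, h)), rng.random(h),
                  rng.random((h, m)), rng.random(m))
print(out.shape)           # → (1,)
```

Because the final activation is a sigmoid, each output lies strictly between 0 and 1, which is what makes it usable as a class score.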
In an MLP the weights and bias values have a great impact on the outputs. When the weights and biases are tuned toward their ideal values, the classification of test datasets closely matches the predicted output. Training an MLP therefore consists of obtaining optimal values for every weight and bias.
2.3 Tuning of weight parameters
The weights and the bias values are the most prominent parameters of the FNN, and they determine how well the model classifies the given problem. In our case, the problem is described as a set of features for every judgment case file: each file is represented as a feature vector, and each feature of the judgment case acts as an input to the FNN. The FNN weights are tuned to optimal values so that the output classification is crisp and effective. In this paper we propose the Shuffled Frog Leaping algorithm for tuning the weights of the FNN.
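The paper does not spell out a specific term-weighting scheme for turning a judgment file into a feature vector. As one common possibility, a hypothetical TF-IDF extraction that maps each document onto a fixed-length vector (one FNN input per vocabulary term) might look like this; the document texts and vocabulary below are invented for illustration:

```python
import math
from collections import Counter

def tfidf_features(docs, vocab):
    """Turn each document into a fixed-length TF-IDF term-weight vector."""
    n_docs = len(docs)
    tokenised = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each vocabulary term appears
    df = {t: sum(1 for toks in tokenised if t in toks) for t in vocab}
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)  # raw term frequency in this document
        vectors.append([tf[t] * math.log((1 + n_docs) / (1 + df[t]))
                        for t in vocab])
    return vectors

# Invented toy corpus standing in for judgment case files
docs = ["phishing email fraud", "network intrusion attack", "email fraud case"]
vocab = ["phishing", "email", "fraud", "intrusion"]
vecs = tfidf_features(docs, vocab)
print(len(vecs), len(vecs[0]))  # → 3 4
```

Each resulting vector has one entry per vocabulary term, so the vocabulary size fixes the number of input neurons of the FNN.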
Basically, three different approaches are followed to train an MLP using a heuristic algorithm.
(1) In the first approach, the heuristic algorithm is employed to obtain a combination of weights and bias values that minimises the total error rate.
(2) In the second approach, the MLP architecture itself is designed by a heuristic algorithm with respect to the problem domain.
(3) The third approach employs the heuristic algorithm to fine-tune the parameters of gradient-based algorithms, which then carry on the training of the MLP.
In our method we follow the first approach, using a vector-based solution representation. An example of this representation is given in Figure 2.
Figure 2. Structure of MLP (2-3-1)
Here 2-3-1 indicates that the MLP in Figure 2 consists of two input nodes, three hidden nodes and one output node. The weights that need to be optimized are $\left\{w_{1,3}, w_{1,4}, w_{1,5}, w_{2,3}, w_{2,4}, w_{2,5}, w_{3,6}, w_{4,6}, w_{5,6}\right\}$.
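The vector-based representation can be illustrated by flattening the nine weights of the 2-3-1 MLP into a single candidate-solution vector, which is what each frog holds in the algorithm below. The layout, the inclusion of bias terms, and the helper names are our own illustrative choices:

```python
import numpy as np

# One "frog" is a flat vector holding all weights of the 2-3-1 MLP
# {w13, w14, w15, w23, w24, w25, w36, w46, w56} plus the bias terms.
N_IN, N_HID, N_OUT = 2, 3, 1
DIM = N_IN * N_HID + N_HID * N_OUT + N_HID + N_OUT  # 9 weights + 4 biases = 13

def decode(frog):
    """Unpack a flat solution vector back into MLP weight matrices and biases."""
    i = 0
    W1 = frog[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    W2 = frog[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b1 = frog[i:i + N_HID]; i += N_HID
    b2 = frog[i:i + N_OUT]
    return W1, W2, b1, b2

# Random initial frog in [0, 1], matching the bounds lb=0, ub=1 used below
frog = np.random.default_rng(1).uniform(0, 1, DIM)
W1, W2, b1, b2 = decode(frog)
print(W1.shape, W2.shape)  # → (2, 3) (3, 1)
```

The objective function f() then decodes a frog, runs the MLP forward pass on the training data, and returns the classification error to be minimised.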
SHUFFLED FROG LEAPING ALGORITHM for tuning the weights in FNN

Variables used:
n: number of weights
G: maximum number of generations
PopSize: total number of frogs
f(): objective function
p: total number of memeplexes
q: number of frogs in each memeplex
Pop_{PopSize×n}: possible solutions / search space
Frog_{1×n}: a single solution in the search space
D_max: maximum position drift allowed for an individual frog
Frog_v: virtual individual
ub: upper bound (1)
lb: lower bound (0)

ALGORITHM:
Step 1: Initialise t ← 1 and set p, q, PopSize, G.
Step 2: Generate random positions for the frogs:
    For each i = 1:PopSize do
        Frog_i ← lb + (ub − lb) × rand()   // rand() draws a random number in [0, 1]
    End
Step 3: Calculate the fitness of each frog using the FNN:
    For each i = 1:PopSize do
        Fit_Frog_i ← f(Frog_i)   // f() evaluates an individual; this fitness value is used to rank the solutions
    End
Step 4: Sort the frogs in ascending order of fitness.
Step 5: Partition the frogs into the p memeplexes:
    For each i = 1:PopSize do
        assign Frog_i to memeplex (i mod p)
    End
Step 6: Repeat Steps 7 to 9 while t ≤ G; afterwards go to Step 10.
Step 7: For each memeplex:
    Step 7.1: Re-label each frog with its memeplex number and its index within the memeplex.
    Step 7.2: Compute the selection probability of the j-th ranked frog in the memeplex:
        P(Frog_j) = 2(q + 1 − j) / (q(q + 1))
    Step 7.3: Draw a sub-memeplex of frogs according to these probabilities.
        Step 7.3.1: Identify the best frog Frog_B and the worst frog Frog_w in the sub-memeplex by fitness.
        Step 7.3.2: Compute a step for the worst frog towards the best: T_j = rand() × (Frog_B − Frog_w)
        Step 7.3.3: Move the worst frog: Frog_w_new = Frog_w + T_j
        Step 7.3.4: Evaluate the fitness: Fit_Frog_w_new ← f(Frog_w_new)
        Step 7.3.5: If Fit_Frog_w_new < Fit_Frog_w then Frog_w = Frog_w_new   // minimisation problem
        Step 7.3.6: Else:
        Step 7.3.7: Compute a step towards the global best frog Frog_G: T_j = rand() × (Frog_G − Frog_w)
        Step 7.3.8: Frog_w_new = Frog_w + T_j
        Step 7.3.9: Evaluate the fitness using the FNN: Fit_Frog_w_new ← f(Frog_w_new)
        Step 7.3.10: If Fit_Frog_w_new < Fit_Frog_w then Frog_w = Frog_w_new   // minimisation problem
        Step 7.3.11: Else:
        Step 7.3.12: Re-initialise the worst frog: Frog_w ← lb + (ub − lb) × rand()
Step 8: Shuffle all the memeplexes once the sub-iterations are complete.
Step 9: t = t + 1.
Step 10: Return the best solution found, stored in Frog_B.

OUTPUT: Frog_B
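A compact Python sketch of the algorithm above, simplified to one improvement step per memeplex per generation and using our own function names, is:

```python
import numpy as np

def sfl_minimise(f, dim, pop_size=30, n_memeplexes=5, generations=50,
                 lb=0.0, ub=1.0, seed=0):
    """Minimal Shuffled Frog Leaping sketch for minimising f over [lb, ub]^dim."""
    rng = np.random.default_rng(seed)
    frogs = rng.uniform(lb, ub, (pop_size, dim))        # Step 2: random frogs
    fit = np.apply_along_axis(f, 1, frogs)              # Step 3: fitness
    for _ in range(generations):                        # Step 6: generations
        order = np.argsort(fit)                         # Step 4: rank, best first
        frogs, fit = frogs[order], fit[order]
        global_best = frogs[0].copy()                   # Frog_G
        for m in range(n_memeplexes):                   # Step 5: rank partition
            idx = np.arange(m, pop_size, n_memeplexes)
            w = idx[np.argmax(fit[idx])]                # worst frog Frog_w
            b = idx[np.argmin(fit[idx])]                # best frog Frog_B
            # Steps 7.3.2-7.3.10: leap towards Frog_B, then towards Frog_G
            for target in (frogs[b], global_best):
                cand = frogs[w] + rng.random(dim) * (target - frogs[w])
                cand_fit = f(cand)
                if cand_fit < fit[w]:
                    frogs[w], fit[w] = cand, cand_fit
                    break
            else:                                       # Step 7.3.12: re-initialise
                frogs[w] = rng.uniform(lb, ub, dim)
                fit[w] = f(frogs[w])
        # Step 8: shuffling happens implicitly via the re-sort each generation
    return frogs[np.argmin(fit)]                        # Step 10: best frog

# Sanity check on a toy quadratic whose minimum lies at 0.5 in every dimension;
# in the paper's setting f() would instead be the FNN classification error.
best = sfl_minimise(lambda x: float(np.sum((x - 0.5) ** 2)), dim=5)
```

The sub-memeplex sampling of Step 7.3 is omitted here for brevity; each memeplex simply improves its currently worst member, which preserves the essential leap-or-reinitialise behaviour.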
In this paper a detailed analysis of the proposed classification model, a fusion of FNN and SFL, has been carried out. The results show the significance of the proposed model in terms of accuracy, precision, recall, sensitivity, specificity and miss ratio. FNN-SFL outperforms existing models such as KNN, Decision Tree, Gaussian Naïve Bayes, Logistic Regression, Random Forest and SVM. The research hence gives a clear view of the classification of judgment cases, and in future work the model can be extended to achieve phenomenal accuracy in classifying judgment case files.
[1] Mani, I., Maybury, M.T. (1999). Advances in Automatic Text Summarization. Cambridge MIT Press.
[2] Domingos, P., Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29: 103-130. https://doi.org/10.1023/A:1007413511361
[3] Breslow, L.A., Aha, D.W. (1997). Simplifying decision trees: A survey. Knowledge Engineering Review, 12(1): 1-40. https://doi.org/10.1017/S0269888997000015
[4] Zaki, M.J., Meira Jr., W. (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press.
[5] Cover, T., Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1): 21-27. https://doi.org/10.1109/TIT.1967.1053964
[6] Wettschereck, D., Aha, D.W., Mohri, T. (1997). A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms. Artificial Intelligence Review, 11: 273-314. https://doi.org/10.1023/A:1006593614256
[7] Yang, Y., Webb, G.I. (2003). On why discretization works for Naive-Bayes classifiers. Lecture Notes in Computer Science, 2903: 440-452. https://doi.org/10.1007/978-3-540-24581-0_37
[8] Okamoto, S., Yugami, N. (2003). Effects of domain characteristics on instance-based learning algorithms. Theoretical Computer Science, 298(1): 207-233. https://doi.org/10.1016/S0304-3975(02)00424-3
[9] Kubat, M., Cooperson, M. (2001). A reduction technique for nearest-neighbor classification: Small groups of examples. Intelligent Data Analysis, 5(6): 463-476. https://doi.org/10.3233/ida-2001-5603
[10] Brighton, H., Mellish, C. (2002). Advances in instance selection for instance-based learning algorithms. Data Mining and Knowledge Discovery, 6: 153-172. https://doi.org/10.1023/A:1014043630878
[11] Wilson, D.R., Martinez, T. (2000). Reduction techniques for instance-based learning algorithms. Machine Learning, 38: 257-286. https://doi.org/10.1023/A:1007626913721
[12] Jensen, F. (1998). An introduction to Bayesian networks. Journal of the Royal Statistical Society. Series D (The Statistician), 47(2): 397-398.
[13] Bouckaert, R.R. (2004). Naive Bayes classifiers that perform well with continuous variables. Lecture Notes in Computer Science, 3339: 1089-1094. https://doi.org/10.1007/978-3-540-30549-1_106
[14] Vishwanathan, S.V.M., Murty, M.N. (2002). SSVM: A simple SVM algorithm. Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290), Honolulu, HI, USA, pp. 2393-2398. https://doi.org/10.1109/IJCNN.2002.1007516
[15] Soucy, P., Mineau, G.W. (2001). A simple KNN algorithm for text categorization. Proceedings 2001 IEEE International Conference on Data Mining, San Jose, CA, USA, pp. 647-648. https://doi.org/10.1109/ICDM.2001.989592
[16] Sugumaran, V., Muralidharan, V., Ramachandran, K.I. (2007). Feature selection using decision tree and classification through proximal support vector machine for fault diagnostics of roller bearing. Mechanical Systems and Signal Processing, 21(2): 930-942. https://doi.org/10.1016/j.ymssp.2006.05.004
[17] McGeachie, M.J., Chang, H.H., Weiss, S.T. (2014). CGBayesNets: Conditional Gaussian Bayesian network learning and inference with mixed discrete and continuous data. PLoS Computational Biology, 10(6): e1003676. https://doi.org/10.1371/journal.pcbi.1003676
[18] Hosmer, D.W., Lemeshow, S. (2013). Applied Logistic Regression. John Wiley & Sons. https://doi.org/10.1002/0471722146
[19] Ho, T.K. (1995). Random decision forests. Proceedings of 3rd International Conference on Document Analysis and Recognition, Montreal, Quebec, Canada, pp. 278-282. https://doi.org/10.1109/ICDAR.1995.598994