Recognition of Hand Motion Trajectory Gestures for Novel Input Interfaces

Prashant Richhariya, Piyush Chauhan, Lalit Kane, Ashutosh Pasricha, Bhupesh Kumar Dewangan

School of Computer Science, University of Petroleum and Energy Studies, Dehradun-248007, India

Department of IT, Schlumberger Asia Limited, Gurgaon-110038, India

Department of Computer Science and Engineering, O P Jindal University, Raigarh-496109, India

Corresponding Author Email: bhupesh.dewangan@gmail.com

Page: 919-924 | DOI: https://doi.org/10.18280/ria.360613

Received: 28 September 2022 | Revised: 24 October 2022 | Accepted: 29 October 2022 | Available online: 31 December 2022

© 2022 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

Abstract: 

This work presents a scheme for dynamic hand gesture recognition using a Kinect V2 sensor. The proposed scheme takes a self-initiated gesture (an audio-visual stream) as input, segments the hand region, computes hand gesture features, and uses these features to recognize the gesture. We target free handwriting, which our method recognizes in real time using the proposed feature representation. The method relies on an efficient fingertip detection approach and free-hand writing with the fingertip. We validate the method on the Kinect V2. On a dataset collected from multiple users, we achieve a recognition accuracy of 98% for character recognition. We also show that the framework can be extended to word recognition with strong performance, and we prepared a dataset containing frames of the moving video together with characters fetched from the database as a common benchmark for handwritten character recognition, evaluating word understanding and tuning machine learning parameters for better accuracy.

Keywords: 

recognition, computer vision, gesture, trajectory

1. Introduction

Effective Human-Computer Interaction (HCI) is principally supported by the use of gestures. In natural human communication, both speaker and audience use gestures to enhance the expressiveness and linguistic reasoning of their communication. How the audience interprets the conversation also depends on the speaker's gestures. Thus, an investigation of gestures should add value to the study of interaction [1]. The approaches in use for hand gesture recognition are non-vision-based and vision-based [2]. Vision-based approaches are more natural and are further divided into active and passive sensing. Gesture recognition becomes more successful with active sensing, in particular through the use of Kinect V2 [3, 4] and Leap Motion cameras.

Tools have been developed to help linguists analyze gestures for interaction [4]. A hand gesture involves a range of phases: rest, preparation, stroke, hold, and retraction. In applications involving hand gestures, the first step is segmentation, and segmenting these phases for gesture analysis is one of the major concerns [4, 5]. For continuous hand gesture segmentation and recognition (CHGSR), either of the following approaches is followed:

  • segmentation of the image before recognition,
  • synchronized (simultaneous) segmentation and recognition.

The latter approach is considered more natural, as it does not require extra delimiting gestures [5, 6]. This paper therefore focuses on developing a framework based on simultaneous segmentation and recognition.

In passive sensing under vision-based HCI using sensors such as the Kinect, the system must perform segmentation based on spatial and temporal information. Spatial segmentation locates the hand in each frame during gesturing; temporal segmentation determines when the gesture motion starts and ends. In a continuous video stream, both spatial and temporal segmentation are of primary importance. However, in a continuous stream of motion, the gestures of interest are performed in front of a cluttered or non-static background, so a key component of the communication is the position coordinates and direction of the trajectory. In addition, variations in gesture velocity may cause further difficulties.

2. Literature Review

For the purpose of gesture recognition, a robust hand detector is used that locates the gesturing hand using motion detection based on frame differencing and depth segmentation, such that when capturing images with a depth camera, a depth value is available for every pixel. A dynamic programming technique, namely Dynamic Time Warping (DTW), is then introduced to compare test and model trajectories and recognize the gesture. A principal drawback of DTW is that its use presupposes accurate hand detection. In this research, a primary task is to reduce the O(mn) time complexity [1].

The dataset used was captured with a motion camera at 480x640 resolution. In this method, several candidate hand regions are considered, which is handled using Dynamic Space-Time Warping (DSTW) [2]. A comparable approach is multiple-hypothesis tracking, where multiple hypotheses are associated with different observations. Alternatively, an HMM framework was proposed by Sato and Kobayashi, which extends the Viterbi algorithm to multiple candidate observations. However, this approach is not translation invariant.

The DTW algorithm aligns the query sequence with a model sequence and then computes a score that classifies the query sequence [3].
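As an illustration, a minimal DTW sketch in Python (ours, not taken from the cited papers) aligns a query trajectory with a model trajectory and returns the matching score; the nested loops make the O(mn) cost noted above explicit:

import numpy as np

def dtw_score(query, model):
    """Align two 2D trajectories (shape (m, 2) and (n, 2)) and return
    the cumulative alignment cost; the best-scoring model wins."""
    m, n = len(query), len(model)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = np.linalg.norm(query[i - 1] - model[j - 1])
            # extend the cheapest of the three admissible predecessors
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[m, n]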

The hand detector used in this research is applied to RGB color images and is based on motion and skin detection. A skin color histogram is used to create a skin-likelihood image: each pixel is assigned a value indicating the probability that it is skin [4, 5]. Frame differencing is then used, performing the computation for every pixel. The parameters used for this work were an easy test, a hard test, and an aggregate test, with reported results of roughly 85% accuracy on the easy test and 20% on the hard test.

To improve performance, a pre-processing strategy is proposed that combines template matching with DTW alignment and distance-vector similarities [6]. The researchers extracted the feature vector from video captured with a motion camera; hand detection can be performed using position, velocity, acceleration, and chain code, with preprocessing consisting of coordinate transformation and normalization [7, 8]. This process can be further improved with DTW alignment and the K-NN classification technique, in which L1 and L2 distances can be computed through various statistical methods. The method applied in this paper subtracts the hand position in the first frame from the entire trajectory, finds the minimum enclosing shape of the data and its bounding cube, and then resizes the cube to a unit cube. Translation and scaling can be normalized away, but other parameters such as velocity, acceleration, and chain code cannot, since they are not translation invariant [9].

This work addresses the inherent variability of human gestures. A Gaussian mixture model is used, with its components, and the trajectories were similar to previous models [10].

This research focused on measurement accuracy. To reduce and simplify the processing time of gesture rearrangement, gestures are encoded into modules or segments. The main aim was to improve the processing efficiency of gesture recognition on ambiguous gestures [11]. The algorithms used were lower approximation and the longest common subsequence (LCS). The dataset used in this work consists of depth camera images of 320x240 at 30 fps, and the parameters were ambiguity, certainty, recognition rate, and pre-processing time. The main limitation is that LCS-based gesture recognition is challenged by ambiguous gestures, while CRF-based methods are complex and costly.

Further research addressed clustering of gesture trajectories. In this work, the gesture samples had different dimensional spaces and lengths, so instead of Euclidean distance, the LCS measure was used. The researchers defined the most discriminating segment (MDS) as a sub-segment of a gesture that is maximally dissimilar from all other gesture sub-segments [12]. Gestures are classified using an MDSLCS algorithm, which selects the most discriminating segments using the LCS measure. For the extraction of the MDS, the parameters used are a similarity measure and the MDSLCS algorithm, with complexity O(log K) + O((cK + B)mn), and the dataset used consists of Kinect depth camera images.

The algorithms used in this work are HMM and IOHMM [13]. The paper reports a recognition rate of 98% with the HMM, whereas the IOHMM performs much worse [14]. The database used in this work consists of 16 gestures.

3. Steps in Hand Gesture Recognition (HGR)

Trajectory estimation is more complicated than estimating pen-up/pen-down motion; users employ different writing styles and speeds, which in turn yield different trajectories and open new horizons of possibilities. However, some state-of-the-art motion sensors and devices relieve this problem.

However, zigzag effects persist because of varying writing styles. Users write digits in 3D space inside an imaginary box that is not fixed, and no delimiter identifies the boundary of the region of interest. Consequently, the trajectory becomes unstable when unwanted sequences are drawn, which makes the task genuinely challenging. It is essential to remove erroneous temporal and spatial trajectory data by applying filters and normalization techniques; normalization is critical to prune weak features. Machine-learning-based approaches require subjective feature selection, which is a difficult task, and better features yield better results. Deep-learning-based approaches, however, remove the need for manual feature generation on the dataset.

Figure 1. Movement of angles

The movement of angles is shown in Figure 1. The quantitative comparison indicates that the proposed model's strokes are consistent with previous work. The function arctan_2(Y1, X1) is a four-quadrant arctangent, defined as:

arctan_2(Y1, X1) = arctan(Y1/X1),             X1 ≥ 0, Y1 ≥ 0
arctan_2(Y1, X1) = π − arctan(Y1/(−X1)),      X1 < 0, Y1 > 0
arctan_2(Y1, X1) = π + arctan((−Y1)/(−X1)),   X1 < 0, Y1 ≤ 0
arctan_2(Y1, X1) = 2π − arctan((−Y1)/X1),     X1 > 0, Y1 < 0

The last three feature angles θ8, θ9, and θ10 are the interior angles of the triangle formed by the initial point (X0, Y0), the end point (X7, Y7), and the centroid (Xc, Yc), and are found using the cosine rule,

θ8 = arccos((b² + c² − a²) / (2bc))

θ9 = arccos((a² + c² − b²) / (2ac))

θ10 = arccos((a² + b² − c²) / (2ab))

where a, b, and c are the side lengths of the triangle.
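A minimal sketch of these angle features, assuming NumPy arrays for the three points (the function name and the clipping guard are ours):

import numpy as np

def triangle_angles(p_start, p_end, p_centroid):
    """Interior angles of the start/end/centroid triangle (cosine rule)."""
    a = np.linalg.norm(p_end - p_centroid)     # side opposite p_start
    b = np.linalg.norm(p_start - p_centroid)   # side opposite p_end
    c = np.linalg.norm(p_start - p_end)        # side opposite p_centroid
    theta8 = np.arccos(np.clip((b**2 + c**2 - a**2) / (2 * b * c), -1, 1))
    theta9 = np.arccos(np.clip((a**2 + c**2 - b**2) / (2 * a * c), -1, 1))
    theta10 = np.arccos(np.clip((a**2 + b**2 - c**2) / (2 * a * b), -1, 1))
    return theta8, theta9, theta10             # radians; they sum to pi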

3.1 Recognition

For air-writing systems, we need a representation that can be adapted to state-of-the-art character recognition techniques. In most online handwriting character recognition systems, features are generated as time-series data. Following this rationale, we use the spatial information of the time-series data as the generalized feature representation for the character recognition techniques [15].

3.2 Character formation

The fingertip point found in the above step is not a single point but a set of 3 points, as discussed. This results in a sparse representation of the writing if the points are simply connected to form a trajectory, because many points lie at the same depth as Xmax due to the low-resolution images produced by the Kinect. We address this issue by selecting only the mean point of the cluster of points. The resulting set of points yields a discontinuous trajectory. The separation between two points, and hence the degree of discontinuity, depends strongly on the writing speed and the frame capture rate of the Kinect. The process is shown in Figure 2.

Figure 2. Flow process

We overcome this issue by applying vector algebra to the points. For each sample, we take only the spatial location to which the fingertip points, represented as (xi, yi). We draw the trajectory by joining all of these points, as shown in Figure 3(c). Points are joined by drawing a vector between adjacent points, so that the trajectory contains N vectors from frame f=0 to f=N with vectors v1, v2, ..., vN. The trajectory is represented as T = {vi : vi = (xi − xi−1, yi − yi−1)}. The length of these vectors depends on the writing speed and the frame rate.

The angle between the vectors depends on the user's writing style. Using these vectors, we compute the writing speed as:

s_t = Σ_{m=1}^{M+1} ||v_m||

where M+1 is the number of frames per second and s_t is the length of trajectory traversed between times t−1 and t. We call a part of the trajectory meaningful input if the speed rises from one zero to the next. This eliminates the need for pen-up/pen-down gestures [15].
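The following sketch computes the trajectory vectors and the per-second speed s_t from the mean fingertip locations; `points` and `fps` are illustrative names, and the zero-speed segmentation rule follows the description above:

import numpy as np

def trajectory_vectors(points):
    """points: list of mean fingertip locations (x_i, y_i), one per frame."""
    pts = np.asarray(points, dtype=float)
    return pts[1:] - pts[:-1]            # v_i = (x_i - x_{i-1}, y_i - y_{i-1})

def speed_per_second(vectors, fps):
    """s_t: trajectory length traversed in each one-second window."""
    lengths = np.linalg.norm(vectors, axis=1)
    n_windows = len(lengths) // fps
    return [lengths[t * fps:(t + 1) * fps].sum() for t in range(n_windows)]

# A segment between two zero-speed instants is treated as meaningful
# input, which removes the need for explicit pen-up/pen-down gestures.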

3.3 Gesture separation

A gesture comprises at least two strokes. Each gesture is uniquely identified based on:

  • the number of strokes (>1),
  • the angle sequence, and
  • stroke proportionality.

The sequence is shown in Figure 3. The separation process is performed sequentially, based on these 3 parameters. Since every gesture that reaches this processing step has already been validated with respect to the number of strokes and the angle sequence, these parameters are strict, while the stroke proportionality is allowed to vary within an error window.

The first parameter to consider when differentiating between gestures is the number of strokes. Next, the angle sequence is decoded. In most cases, this step is enough to uniquely identify the gesture and end the analysis. Some 4-stroke gesture codes (e.g., square/rectangle codes) need further classification based on stroke proportionality. The stroke proportions are allowed to fluctuate within an error window. The error window must be chosen large enough to accommodate the imperfections of the hand-drawn trajectory, yet small enough to allow correct differentiation between gestures.
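A hedged sketch of this sequential discrimination, treating stroke count and angle codes as strict and checking proportions within an error window; the template dictionary and tolerance are illustrative assumptions:

def match_gesture(strokes, codes, templates, prop_tol=0.25):
    """strokes: list of stroke lengths; codes: decoded angle-code sequence."""
    for name, tmpl in templates.items():
        if len(strokes) != len(tmpl["props"]) or codes != tmpl["codes"]:
            continue                       # stroke count and codes are strict
        ref = strokes[0]
        props = [s / ref for s in strokes]
        if all(abs(p - q) <= prop_tol for p, q in zip(props, tmpl["props"])):
            return name                    # proportions fit the error window
    return None                            # no template matched

# Illustrative template: a square drawn right, down, left, up
# (codes 0, 5, 7, 2 per Table 1 below) with equal stroke proportions.
templates = {"square": {"codes": [0, 5, 7, 2], "props": [1, 1, 1, 1]}}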

Figure 3. Sequential gesture representation

4. Proposed Method

4.1 Hand tracking

Basically, there are two strategies for trajectory tracking: linear and non-linear [16, 17]. The region containing the hand to be tracked must first be selected. Then, a mask is applied to the HSV image to eliminate pixels with too little saturation, as well as pixels with too low or too high value, since the hue is too noisy for these pixels. A model of the desired hue of the object to track is built for the first frame using a color histogram. The CamShift algorithm is then used to track the object in the following frames.
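A sketch of this stage using OpenCV's documented histogram back-projection and CamShift APIs; the saturation/value thresholds and the initial hand window are illustrative assumptions:

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
x, y, w, h = 300, 200, 100, 100                     # user-selected hand region
roi = frame[y:y + h, x:x + w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
# mask out low-saturation and extreme-value pixels (noisy hue)
mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 235.)))
hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])  # hue model
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
track_window = (x, y, w, h)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    rect, track_window = cv2.CamShift(back_proj, track_window, term)
    # rect is the rotated box around the tracked hand in this frame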

4.2 Trajectory recording

The consecutive points of the tracked region define a somewhat rough (noisy) trajectory, increasing the difficulty of stroke detection. Hence, the trajectory is smoothed so that each new recorded trajectory point, t[i], is obtained as a weighted average of the new measured point, m[i], and the previous trajectory point, t[i-1]:

t[i] = α·m[i] + (1 − α)·t[i−1]

Our practical tests revealed that a value of 0.4 for the weighting parameter, α, produces a reasonably smooth trajectory. Smaller values tend to over-smooth the trajectory, while larger values leave the trajectory somewhat rough. The recording of a new gesture trajectory is triggered by a movement of the user's hand occurring after a short interval (1-2 seconds) of a static position. Minimum thresholds are imposed on the amplitude and speed of the movement to avoid false triggering due to tracking noise or hand tremor. The gesture trajectory recording ends when the movement speed falls below the imposed threshold for at least 2 seconds. A set of angles with the horizontal axis is then computed over the recorded trajectory. Computing the angle for every pair of consecutive trajectory points may result in a very noisy angle set, with many false angle discontinuities. This noise is caused by angles between trajectory points that are very close to each other, because the image is sampled on a rectangular grid. Selecting a reduced number of trajectory points using a fixed step (e.g., selecting every second or third point of the trajectory) results in a reasonably smoothed angle set. Better results can be obtained by adaptively selecting trajectory points based on a threshold distance. Even with the fixed-step selection, a distance threshold must be imposed to avoid computing the angle when the two points have the same position.
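A sketch of the smoothing filter with α = 0.4 and the fixed-step angle-set computation; all names are ours:

import numpy as np

def smooth_trajectory(measured, alpha=0.4):
    """t[i] = alpha*m[i] + (1 - alpha)*t[i-1] over the measured points."""
    t = [np.asarray(measured[0], dtype=float)]
    for m in measured[1:]:
        t.append(alpha * np.asarray(m, dtype=float) + (1 - alpha) * t[-1])
    return np.array(t)

def angle_set(traj, step=3, min_dist=1.0):
    """Angles (degrees) with the horizontal axis over subsampled points."""
    pts = traj[::step]                           # every third point, per the text
    d = pts[1:] - pts[:-1]
    keep = np.linalg.norm(d, axis=1) > min_dist  # skip near-coincident points
    return np.degrees(np.arctan2(d[keep, 1], d[keep, 0]))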

4.3 Trajectory segmentation

In order to split the trajectory into strokes, the ends of the strokes must be identified. The starting point of the trajectory is also the starting point of the first segment, and the endpoint of the trajectory is the endpoint of the last segment. All other stroke endpoints are detected as angle discontinuity points, the starting point of each segment being the end of the previous one. A stroke discontinuity is identified as an angle between two small segments that have significantly different angles with the horizontal axis. For this purpose, a derivative over the angle set is computed:

dθ/di[i]=(θ[i]-θ[i-1])/(i-(i-1))=θ[i]-θ[i-1]

Each maximum of the absolute value of this derivative that exceeds an imposed threshold indicates a significant angle discontinuity and corresponds to a stroke end. Typically, a threshold of 30° is enough to reject the small segments' angle noise. When computing the derivatives, special care must be taken because of the circular definition of angles, so the angle difference must also be computed on a circular space (e.g., the difference between angles of −179° and 178° is 3°, not −357°). Hence, the angle difference in (2) is computed using the relation:

dθ/di[i] = θ[i] − θ[i−1] + 360°,  if θ[i] − θ[i−1] < −180°

dθ/di[i] = θ[i] − θ[i−1],         if −180° ≤ θ[i] − θ[i−1] ≤ 180°

dθ/di[i] = θ[i] − θ[i−1] − 360°,  if θ[i] − θ[i−1] > 180°

One more threshold must be imposed on the minimum length of a stroke, to avoid detecting false strokes. A reasonable value for this threshold is 1/10 of the image height.
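A sketch of the circular angle difference and the resulting stroke-end detection, using the 30° threshold and the minimum-length rule above; function names are ours:

import numpy as np

def circular_diff(theta):
    """Angle differences wrapped into (-180, 180], e.g. 178 -> -179 is +3."""
    d = np.diff(theta)
    d[d > 180] -= 360
    d[d < -180] += 360
    return d

def stroke_ends(theta, ang_thresh=30.0):
    """Indices where |d(theta)| exceeds the discontinuity threshold."""
    d = np.abs(circular_diff(theta))
    return [i + 1 for i in range(len(d)) if d[i] > ang_thresh]

def drop_short_strokes(ends, points, image_height):
    """Reject stroke ends whose segment is shorter than height / 10."""
    min_len, kept, start = image_height / 10.0, [], 0
    for e in ends + [len(points) - 1]:
        seg = np.diff(points[start:e + 1], axis=0)
        if np.linalg.norm(seg, axis=1).sum() >= min_len:
            kept.append(e)               # stroke is long enough to keep
        start = e
    return kept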

4.4 Feature extraction

After the stroke ends are identified, the parameters of the trajectory are extracted. The useful parameters to be extracted are the number of strokes, the angle sequence, and the stroke lengths. The number of strokes can easily be obtained from the number of stroke ends. Likewise, the stroke lengths can easily be obtained using the coordinates of the stroke ends. The analysis of the angle sequence is done sequentially, and the algorithm can stop at any step if an angle value cannot be classified. Each gesture known by the system is represented by a codified angle sequence. The angles of the 8 allowed directions and the associated codes are presented in Table 1.

Table 1. Direction angles and associated codes

θ [°]  |  0 | 45 | 90 | 135 | −45 | −90 | −135 | 180
Code   |  0 |  1 |  2 |   3 |   4 |   5 |    6 |   7

The angle of each stroke must be classified and assigned to one of the 8 classes. To assign an angle to a class, its value must fall within a ±20° window around the standard value. A 5° guard space is left between consecutive windows.

If a stroke's angle falls inside a guard space, it cannot be classified and the analysis is stopped, invalidating the current gesture. The first angle of the sequence is the angle between the first stroke and the horizontal axis, while the following angles can be either the angles made by each subsequent stroke with the horizontal axis or the angles made with the previous stroke. The latter representation may increase robustness to small global trajectory rotations but reduces the potential gesture alphabet. A global rotation correction may be applied to the stroke angle sequence if a large median deviation from the nearest class is detected. The directed stroke is shown in Figure 4.

Direct classification of the stroke angle sequence (left side) yields the incorrect sequence (0°, −90°, 0°), while the global rotation correction allows the sequence to be correctly classified as (0°, −135°, 0°).

Figure 4. Directed stroke
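A sketch of this classification, using the Table 1 codes, ±20° windows, and the 5° guard spaces; returning None models the invalidation rule:

ANGLE_CODES = {0: 0, 45: 1, 90: 2, 135: 3, -45: 4, -90: 5, -135: 6, 180: 7}

def classify_angle(theta, window=20.0):
    """Map an angle (degrees) to its direction code, or None if it
    lands in a 5-degree guard space between class windows."""
    for center, code in ANGLE_CODES.items():
        d = (theta - center + 180.0) % 360.0 - 180.0   # circular distance
        if abs(d) <= window:
            return code
    return None          # guard space: stop analysis, gesture invalid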

To improve the performance of the system, six features are used together. They are:

- Area feature: the average distance from the centroid of the trajectory to every other point of the trajectory.

- Direction features: the orientation of the start and end of the hand trajectory and the number of significant curves found in a gesture. This includes three features.

- MCC (Motion Chain Code): describes the path of the trajectory.

- Speed feature: the average speed along the path of the trajectory followed by a gesture.

- Self-co-articulated features: the number of self-co-articulated strokes and the positions of the start and end points of the co-articulated strokes. These help distinguish self-co-articulated gestures more reliably from non-self-co-articulated gestures. This comprises a set of features based on self-co-articulated strokes.

- Ratio and distance features: the ratio between the longest and smallest distances from the centroid of the trajectory to points on the trajectory; the distance feature computes the distance between the start and end points of the given gesture.

The area, direction, and speed features have also been reported in the literature.
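A hedged sketch computing four of these six features (area, direction, speed, ratio/distance) from an (N, 2) trajectory; the formulas follow our reading of the text rather than a published implementation:

import numpy as np

def trajectory_features(traj, fps=30):
    traj = np.asarray(traj, dtype=float)
    centroid = traj.mean(axis=0)
    radii = np.linalg.norm(traj - centroid, axis=1)
    area_feat = radii.mean()                        # mean centroid distance
    seg = np.diff(traj, axis=0)
    direction = np.degrees(np.arctan2(seg[:, 1], seg[:, 0]))
    start_dir, end_dir = direction[0], direction[-1]
    path_len = np.linalg.norm(seg, axis=1).sum()
    speed_feat = path_len * fps / max(len(seg), 1)  # average speed per frame
    ratio_feat = radii.max() / max(radii.min(), 1e-9)
    dist_feat = np.linalg.norm(traj[-1] - traj[0])  # start-to-end distance
    return dict(area=area_feat, start_dir=start_dir, end_dir=end_dir,
                speed=speed_feat, ratio=ratio_feat, dist=dist_feat)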

4.5 Description of the dataset

The dataset comprises 603 sequences and 807,241 frames (approx. 6 h 40 min) collected from 10 individuals performing 8 gestures. In total, there are 6,324 gesture instances. The motion files contain tracks of 4 joints estimated using the Kinect pose-estimation pipeline. The body pose is captured at a sample rate of 30 Hz with ~2 cm accuracy in joint positions. The background is either colored or black so as to properly show the trajectory. The movement activities of the trajectory are shown in Figure 5.

Figure 5. Movement exercises of the direction

5. Results and Discussion

After the stroke ends are detected, the parameters of the trajectory are extracted. The useful parameters to be extracted are the number of strokes, the stroke angle sequence, and the stroke lengths. A simple analysis of the mean-shift-filtered angle sequence obtained in the previous step allows extraction of the required parameters: the start and end points of each stroke and the total number of strokes. The stroke lengths can easily be obtained using the coordinates of the stroke ends. Each gesture known by the system is represented by a codified angle sequence. The angles of the 8 allowed directions and the associated codes are presented in Figure 6. Opposite directions have complementary codes.

Figure 6. Trajectory representation

Here we construct an error estimate based on how well the candidate trajectory satisfies the system dynamics between the collocation points. The rationale is that if the system dynamics are accurately satisfied between the collocation points, the polynomial spline is an accurate representation of the system, which in turn implies that the nonlinear program is an accurate representation of the original trajectory optimization problem.

Using our proposed structure, we were able to achieve an improved error rate of 27.6%. It was impractical to compare our work against several open systems, since they differ in various respects, such as the type of data acquisition: direct measurement or vision-based. Even among vision-based systems, some require the signer to wear colored gloves. Another issue arises from the absence of a common dataset for evaluating sign-language recognition systems.


6. Conclusion

We propose a new low-error gesture recognition framework that handles dynamic gestures independently of the user's skin color and physique, combining pen-style and HGR techniques dynamically through hand trajectory properties, and working in different settings under ordinary lighting. The main objective of the proposed system is to estimate the movement trajectory of the recognized right hand; the system distinguishes between 20 unique gestures gathered from eight different users, with an average recognition rate of 85.67%. The residual error rate was principally due to the high similarity between gestures. The proposed framework was able to improve the error rate on the difficult HGR dataset from 41% to 26.3%. Future work will address errors caused by hand occlusion, or by the face being mistaken for the hand, by using both modified and similar gestures as cues to increase recognition accuracy.

References

[1] Krueger, M.W. (1991). Artificial Reality II. Addison-Wesley.

[2] Fan, W., Chen, X., Wang, W.H., Zhang, X., Yang, J.H., Wang, K.Q. (2010). A method of hand gesture recognition based on multiple sensors. In 2010 4th International Conference on Bioinformatics and Biomedical Engineering, Chengdu, China, pp. 1-4. https://doi.org/10.1109/ICBBE.2010.5516722

[3] Zhang, X., Chen, X., Li, Y., Lantz, V., Wang, K., Yang, J. (2011). A framework for hand gesture recognition based on accelerometer and EMG sensors. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 41(6): 1064-1076. https://doi.org/10.1109/TSMCA.2011.2116004

[4] Yang, Q. (2010). Chinese sign language recognition based on video sequence appearance modeling. In 2010 5th IEEE Conference on Industrial Electronics and Applications, Taiwan, China, pp. 1537-1542. https://doi.org/10.1109/ICIEA.2010.5514688

[5] Cao, X.Y., Liu, H.F., Zou, Y. (2010). Gesture segmentation based on monocular vision using skin color and motion cues. In 2010 International Conference on Image Analysis and Signal Processing, Zhejiang, China, pp. 358-362. https://doi.org/10.1109/IASP.2010.5476096

[6] Sturman, D.J., Zeltzer, D. (1994). A survey of glove-based input. IEEE Computer graphics and Applications, 14(1): 30-39. https://doi.org/10.1109/38.250916

[7] Mitra, S., Acharya, T. (2007). Gesture recognition: A survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 37(3): 311-324. https://doi.org/10.1109/TSMCC.2007.893280

[8] Wu, Y., Huang, T.S. (1999). Vision-based gesture recognition: A review. In International Gesture Workshop, GW'99, Gif-sur-Yvette, France, pp. 103-115. https://doi.org/10.1007/3-540-46616-9_10

[9] Hamada, Y., Shimada, N., Shirai, Y. (2002). Hand shape estimation using sequence of multi-ocular images based on transition network. In Proceedings of the International Conference on Vision Interface, pp. 161-166, https://www.academia.edu/960247/Hand_shape_estimation_using_sequence_of_multi_ocular_images_based_on_transition_network?from_sitemaps=true, accessed on Sept. 22, 2022.

[10] Tanibata, N., Shimada, N., Shirai, Y. (2002). Extraction of hand features for recognition of sign language words. In International Conference on Vision Interface, pp. 391-398. https://www.researchgate.net/profile/Yoshiaki-Shirai-3/publication/2904731_Extraction_of_Hand_Features_for_Recognition_of_Sign_Language/links/53feb7990cf21edafd151e69/Extraction-of-Hand-Features-for-Recognition-of-Sign-Language.pdf, accessed on Sept. 27, 2022.

[11] Wu, Y., Huang, T.S. (2002). Nonstationary color tracking for vision-based human-computer interaction. IEEE Transactions on Neural Networks, 13(4): 948-960. https://doi.org/10.1109/TNN.2002.1021895

[12] Tomasi, C., Petrov, S., Sastry, A. (2003). 3D tracking = classification + interpolation. In Proceedings Ninth IEEE International Conference on Computer Vision, Nice, France, pp. 1441-1448. https://doi.org/10.1109/ICCV.2003.1238659

[13] Ye, G., Corso, J.J., Hager, G.D. (2004). Gesture recognition using 3D appearance and motion features. In 2004 Conference on Computer Vision and Pattern Recognition Workshop, Washington, DC, USA, pp. 160-160. https://doi.org/10.1109/CVPR.2004.356

[14] Lin, J.Y., Wu, Y., Huang, T.S. (2004). 3D model-based hand tracking using stochastic direct search method. In Sixth IEEE International Conference on Automatic Face and Gesture Recognition, Seoul, Korea (South), pp. 693-698. https://doi.org/10.1109/AFGR.2004.1301615

[15] Aggarwal, R., Swetha, S., Namboodiri, A.M., Sivaswamy, J., Jawahar, C.V. (2015). Online handwriting recognition using depth sensors. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR), Tunis, Tunisia, pp. 1061-1065. https://doi.org/10.1109/ICDAR.2015.7333924

[16] Challa, R.K., Rao, K.S. (2022). An effective optimization of time and cost estimation for prefabrication construction management using artificial neural networks. Revue d'Intelligence Artificielle, 36(1): 115-123. https://doi.org/10.18280/ria.360113

[17] Li, W.X. (2020). Financial crisis warning of financial robot based on artificial intelligence. Revue d'Intelligence Artificielle, 34(5): 553-561. https://doi.org/10.18280/ria.340504