Journal: AMA_B


A Psychoacoustic Model and a Filter Bank Design Using Optimization for Speech Compression

Talbi Mourad*, Med S. Bouhlel

Laboratoire des Semi-Conducteurs, Nanostructures et Technologie Avancée, Center of Researches and Technologies of Energy of Borj Cedria, Tunis 952050, Tunisia

Sciences Electroniques, Technologie de l'Information et Télécommunications (SETIT), Sfax BP 11693029, Tunisia

Corresponding Author Email: talbi1969@yahoo.fr

Received: 7 April 2018

Accepted: 8 June 2018

OPEN ACCESS

Abstract:

In this paper, we propose a new speech compression technique that employs a psychoacoustic model and a general approach for filter bank design using optimization. The technique is inspired by an audio compression technique that uses a psychoacoustic model and a Modified Discrete Cosine Transform (MDCT) filter bank of 32 filters. In the proposed approach, a uniform/non-uniform filter bank designed using optimization replaces the 32-filter MDCT filter bank. The two techniques are applied to different speech signals and compared by computing the number of bits before and after compression. The simulation results, obtained from the compressed file sizes and the Compression Ratios (CR), show that the proposed technique outperforms the MDCT-based one. In terms of perceptual speech quality, the output speech signals of the proposed compression system are of good quality, as confirmed by the computed SNR (Signal to Noise Ratio), PSNR (Peak Signal to Noise Ratio), NRMSE (Normalized Root Mean Square Error) and PESQ (Perceptual Evaluation of Speech Quality). We have also compared the proposed technique with a previous speech compression technique based on the Discrete Wavelet Transform (DWT) and integrating a Voice Activity Detection (VAD) module. This comparison is likewise based on SNR, PSNR, NRMSE, PESQ and CR, and the results show that the proposed technique outperforms this third, DWT- and VAD-based technique.
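The abstract names the objective measures used for evaluation but does not state their formulas. As a minimal sketch, assuming the definitions commonly used in speech compression work (they may differ in detail from those in the paper), SNR, PSNR, NRMSE and CR could be computed as follows. All function names are hypothetical, and PESQ is omitted because it requires a full implementation of the ITU-T P.862 algorithm.

```python
import math
from typing import Sequence


def _error_energy(original: Sequence[float], decoded: Sequence[float]) -> float:
    """Sum of squared differences between original and decoded samples."""
    return sum((x - y) ** 2 for x, y in zip(original, decoded))


def snr_db(original: Sequence[float], decoded: Sequence[float]) -> float:
    """SNR in dB: signal energy divided by reconstruction-error energy."""
    err = _error_energy(original, decoded)
    if err == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(sum(x * x for x in original) / err)


def psnr_db(original: Sequence[float], decoded: Sequence[float]) -> float:
    """PSNR in dB, assuming the form 10*log10(N * peak^2 / error energy)."""
    err = _error_energy(original, decoded)
    if err == 0.0:
        return math.inf
    peak = max(abs(x) for x in original)
    return 10.0 * math.log10(len(original) * peak * peak / err)


def nrmse(original: Sequence[float], decoded: Sequence[float]) -> float:
    """Error energy normalized by the spread of the original around its mean."""
    mean = sum(original) / len(original)
    spread = sum((x - mean) ** 2 for x in original)
    return math.sqrt(_error_energy(original, decoded) / spread)


def compression_ratio(bits_before: int, bits_after: int) -> float:
    """CR as bits before compression divided by bits after compression."""
    return bits_before / bits_after


# Toy example (made-up samples, not data from the paper).
x = [1.0, -1.0, 1.0, -1.0]
y = [0.9, -0.9, 0.9, -0.9]
print(snr_db(x, y), psnr_db(x, y), nrmse(x, y), compression_ratio(128000, 32000))
```

Under these assumed definitions, a higher SNR, PSNR and CR and a lower NRMSE indicate better performance, which matches how the abstract uses the measures to compare the three techniques.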

Keywords:

speech compression, psychoacoustic model, filter bank, optimization, bits before/bits after compression

1. Introduction

2. The Proposed Technique

3. Results and Discussion

4. Conclusion

Acknowledgment

