Xiong, S., Chen, X., & Zhang, H. (2023). Deep Learning-Based Multifunctional End-to-End Model for Optical Character Classification and Denoising. Journal of Computational Methods in Engineering Applications, 3(1), 1–13. https://doi.org/10.62836/jcmea.v3i1.030103

Deep Learning-Based Multifunctional End-to-End Model for Optical Character Classification and Denoising

Optical Character Recognition (OCR) has transformed document processing by converting scanned documents, PDFs, and camera images into editable, searchable text. The technology is central to digitizing historical documents, streamlining data entry, and improving accessibility for visually impaired users through text-to-speech. Despite its widespread use, OCR still struggles to recognize text accurately in noisy or degraded images. Traditional OCR systems treat noise reduction and character classification as separate stages, which can compromise overall recognition quality because errors introduced during denoising carry over into classification. We introduce a multifunctional end-to-end model for optical character classification and denoising that integrates both functions in a unified framework. Using a dual-output autoencoder, the model denoises images and recognizes characters concurrently, with the aim of improving both the efficiency and the accuracy of OCR. This paper describes the model's development and implementation, explores the interplay between denoising and classification, and presents experimental results showing improvements over conventional OCR methods.
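To make the dual-output idea concrete, the following is a minimal NumPy sketch of one way a shared encoder could feed both a denoising decoder and a classification head, trained against a single combined loss. It is an illustration under stated assumptions, not the architecture reported in the paper: the layer sizes, the single dense layer per stage, the loss weighting cls_weight, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 28x28 glyph images, 32-d latent code, 10 character classes.
N_PIXELS, N_LATENT, N_CLASSES = 28 * 28, 32, 10


def init_params():
    """Randomly initialise a shared encoder and two output heads."""
    scale = 0.01
    return {
        "W_enc": rng.normal(0, scale, (N_PIXELS, N_LATENT)),
        "b_enc": np.zeros(N_LATENT),
        "W_dec": rng.normal(0, scale, (N_LATENT, N_PIXELS)),
        "b_dec": np.zeros(N_PIXELS),
        "W_cls": rng.normal(0, scale, (N_LATENT, N_CLASSES)),
        "b_cls": np.zeros(N_CLASSES),
    }


def forward(params, noisy):
    """Return (denoised image, class logits) computed from one shared latent code."""
    z = np.maximum(0.0, noisy @ params["W_enc"] + params["b_enc"])  # ReLU encoder
    dec_in = z @ params["W_dec"] + params["b_dec"]
    recon = 1.0 / (1.0 + np.exp(-dec_in))  # sigmoid keeps pixels in (0, 1)
    logits = z @ params["W_cls"] + params["b_cls"]
    return recon, logits


def joint_loss(recon, logits, clean, labels, cls_weight=1.0):
    """Pixel-wise MSE for denoising plus weighted softmax cross-entropy."""
    mse = np.mean((recon - clean) ** 2)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(labels)), labels])
    return mse + cls_weight * ce


# Toy batch: random "clean" images, Gaussian-corrupted copies, arbitrary labels.
clean = rng.random((4, N_PIXELS))
noisy = np.clip(clean + rng.normal(0, 0.2, clean.shape), 0.0, 1.0)
labels = np.array([0, 3, 7, 9])

params = init_params()
recon, logits = forward(params, noisy)
loss = joint_loss(recon, logits, clean, labels)
```

Because both heads read the same latent code, gradients from the reconstruction term and the classification term would both update the encoder during training, which is the coupling between denoising and classification that the abstract refers to. The sketch covers only the forward pass and the loss; a practical model would likely use convolutional layers and an automatic-differentiation framework.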

Optical Character Classification; denoising; autoencoder; deep learning
