Data Security Identification Based on Full-Dimensional Dynamic Convolution and Multi-Modal CLIP

Qinyi Zhu; Dan Shao

doi:10.62836/jitp.2023.430

This paper addresses data security and privacy protection in multimodal recognition. We propose a data security recognition method that integrates Omni-Dimensional Dynamic Convolution (ODConv) with a multimodal CLIP model. The method targets three biometric modalities (face, voiceprint, and behavior) and builds a unified multimodal recognition framework around them. To mask and protect users’ sensitive information, a Variational Autoencoder (VAE) perturbs and compresses the raw modality data. In the feature extraction and fusion stage, ODConv replaces conventional convolutional structures, improving the model’s ability to adapt to semantic heterogeneity across modalities. CLIP’s cross-modal alignment mechanism is then used to fuse face, voice, and behavior features at the semantic level, improving the model’s understanding and recognition of identity information in complex scenarios. Experiments on multiple public multimodal datasets evaluate reconstruction error, recognition accuracy, and robustness against adversarial attacks. The results show that the proposed method maintains recognition performance while reducing the risk of sensitive information leakage under model inversion and reconstruction attacks, supporting its practicality and robustness in data security scenarios. The study offers a feasible pathway and technical reference for the trustworthy deployment of multimodal biometric recognition systems under privacy protection constraints.

Keywords: multimodal recognition; data security; privacy protection; ODConv; CLIP; VAE
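
The abstract describes cross-modal alignment only at a high level. As an illustration of what CLIP-style alignment usually means, the following is a minimal sketch of a symmetric contrastive loss over paired embeddings from two modalities. It is not the authors' implementation: the function names, the toy face and voice embeddings, and the temperature of 0.07 are all assumptions chosen for the example, and the paper's actual encoders, loss weighting, and three-modality fusion are not specified here.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cross_entropy_rows(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(emb_a, emb_b, temperature=0.07):
    """CLIP-style loss for a batch of paired embeddings from two modalities.

    The temperature value is an assumption for this sketch.
    """
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    # Scaled cosine similarity between every item of A and every item of B.
    logits = [[sum(x * y for x, y in zip(u, w)) / temperature for w in b]
              for u in a]
    # Transpose so the same loss can be computed in the B-to-A direction.
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_rows(logits) + cross_entropy_rows(logits_t))

# Hypothetical toy embeddings: row i of each list belongs to the same identity.
face = [[1.0, 0.0], [0.0, 1.0]]
voice_matched = [[0.9, 0.1], [0.1, 0.9]]
voice_swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_matched = symmetric_contrastive_loss(face, voice_matched)
loss_swapped = symmetric_contrastive_loss(face, voice_swapped)
```

In this toy setup the loss is lower when each face embedding is paired with the voice embedding of the same identity than when the pairs are swapped, which is the property that pulls matching modalities together in a shared embedding space.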

Supporting Agencies

Funding: This research received no external funding.
