Gender Classification from Eye Images Using a Vision Transformer (ViT) with a Discriminative Fine-Tuning Strategy
Abstract
Face-based biometric identification systems are severely limited when a subject's face is covered, whether because of mask use following the COVID-19 pandemic or face veils worn for cultural and religious reasons. This creates real security gaps, as evidenced by the gender-disguise infiltration incident at Masjid Jannatul Firdaus in Makassar. In such situations, the eyes remain the only consistently exposed biometric feature. This study applies a Vision Transformer (ViT-B/16) pretrained on ImageNet-21K, with a progressive fine-tuning strategy based on the discriminative learning rate principle, to classify gender from eye images. The Female and Male Eyes dataset from Kaggle comprises 11,525 eye images, split into training (64%), validation (16%), and testing (20%) sets. Experiments were conducted in two series: Series B varied the number of unfrozen transformer blocks (0–6), and Series C varied the discriminative learning rate ratio between the classifier and the encoder (5:1, 10:1, 3:1). The best configuration, with 6 unfrozen blocks and a 3:1 ratio, achieved 95.70% accuracy, 97.67% precision, 92.69% recall, and a weighted F1-score of 0.9569, surpassing MobileNet (93.90%) and K-Nearest Neighbor (68.81%). These results indicate that ViT with discriminative fine-tuning is effective for gender classification from eye images and has potential for biometric security applications.
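The setup described in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: it only shows how a 64/16/20 split of 11,525 images works out and how a layer-wise plan with 6 unfrozen blocks and a 3:1 classifier-to-encoder learning rate ratio might be laid out. The function names, the `ParamGroup` type, the choice to unfreeze the last blocks of the encoder, and the encoder learning rate of 1e-5 are all assumptions made for illustration; only the block count of ViT-B/16 (12), the split percentages, and the ratio come from the text or the architecture.

```python
from dataclasses import dataclass

NUM_BLOCKS = 12  # ViT-B/16 has 12 transformer encoder blocks


@dataclass
class ParamGroup:
    name: str
    trainable: bool
    lr: float  # 0.0 for frozen groups


def split_sizes(total, train_pct=64, val_pct=16):
    """Integer split sizes; the test set takes whatever remains (20% here)."""
    train = total * train_pct // 100
    val = total * val_pct // 100
    return train, val, total - train - val


def build_param_groups(unfrozen_blocks, encoder_lr, ratio):
    """Hypothetical plan: freeze the earliest blocks, train the last
    `unfrozen_blocks` at `encoder_lr`, and train the classifier head at
    `ratio * encoder_lr` (classifier:encoder = ratio:1)."""
    if not 0 <= unfrozen_blocks <= NUM_BLOCKS:
        raise ValueError("unfrozen_blocks must be between 0 and 12")
    first_trainable = NUM_BLOCKS - unfrozen_blocks
    groups = []
    for i in range(NUM_BLOCKS):
        trainable = i >= first_trainable
        groups.append(ParamGroup(f"block_{i}", trainable,
                                 encoder_lr if trainable else 0.0))
    groups.append(ParamGroup("classifier", True, encoder_lr * ratio))
    return groups


if __name__ == "__main__":
    print(split_sizes(11525))
    # encoder_lr=1e-5 is an illustrative value, not one reported in the abstract
    for group in build_param_groups(unfrozen_blocks=6, encoder_lr=1e-5, ratio=3.0):
        print(group)
```

In a real training script, each trainable group would typically map to one optimizer parameter group with its own learning rate, while frozen blocks would have gradient computation disabled.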
Pages: 108-117
Copyright (c) 2026 Gde Made Hanura, Putu Hendra Suputra

This work is licensed under a Creative Commons Attribution 4.0 International License.