IJIET 2026 Vol.16(6): 1518-1527
doi: 10.18178/ijiet.2026.16.6.2617
Artificial Intelligence-Driven Personalized Learning Assistants: Combining Large Language Models and Computer Vision for Tailored Education
Osama Hosam1,2
1. Computer Information Science (CIS) Department, Higher Colleges of Technology, United Arab Emirates
2. City of Scientific Research and Technological Applications (SRTA-City), IRI, Alexandria, Egypt
Email: mohandesosama@yahoo.com (O.H.)
Manuscript received September 30, 2025; revised October 27, 2025; accepted December 22, 2025; published June 16, 2026
Abstract—The integration of Artificial Intelligence (AI) in educational technology has advanced significantly, yet a critical gap persists in combining Large Language Models (LLMs) and Computer Vision (CV) into adaptive, multimodal learning systems. This paper addresses that gap by presenting an architectural framework for AI-driven personalized learning assistants that combines multimodal perception, contextual reasoning, adaptive planning, and interactive presentation. The methodology employs a four-layer architecture in which CV components handle visual perception and behavioral analysis through convolutional neural networks, while transformer-based LLMs manage contextual understanding and pedagogical reasoning. The evaluation used a mixed-methods design with 250 participants across diverse educational contexts, drawing on pre-post assessments, multimodal data analytics, engagement metrics, and structured interviews. It shows statistically significant improvements in learning outcomes, with a 37.2% increase in knowledge retention and a 32.8% improvement in engagement metrics compared with traditional e-learning systems. The paper contributes both a detailed architectural blueprint and an empirical validation of a multimodal AI educational system that bridges the gap between theoretical potential and practical implementation in AI-enhanced education.
Keywords—Artificial Intelligence (AI), personalized learning, Large Language Models (LLMs), Computer Vision (CV), multimodal learning, educational technology, adaptive systems
Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
Cite: Osama Hosam, "Artificial Intelligence-Driven Personalized Learning Assistants: Combining Large Language Models and Computer Vision for Tailored Education," International Journal of Information and Education Technology, vol. 16, no. 6, pp. 1518-1527, 2026.