Feedback on learning with generative artificial intelligence in university students
DOI:
https://doi.org/10.5944/ried.45547
Keywords:
feedback, formative evaluation, generative artificial intelligence, ChatGPT, Ladder of Feedback, university students
Abstract
Assessment for learning has become increasingly important in university teaching, particularly with regard to the feedback process. However, a perception persists that students are dissatisfied with the quality of feedback provided by faculty, which highlights the need to innovate in feedback strategies. This study aimed to explore the pedagogical and technological relevance of integrating Wilson's Ladder of Feedback with generative artificial intelligence, specifically GPT-4o, to strengthen formative feedback for university students. The study followed a qualitative, exploratory approach in two phases. In the first phase, a prompt was designed and validated using the Delphi method with the participation of eight experts in assessment and artificial intelligence, with the prompt applied to seven state-of-the-art language models. In the second phase, the validated prompt was implemented in two university courses of a different nature, Assessment for Learning and Data Structures, integrating automatic feedback into the Moodle platform. The results showed that the experts agreed on the suitability of the AI-mediated Ladder of Feedback and highlighted the superior performance of GPT-4o. At the classroom level, students valued the clarity, usefulness, and immediacy of the feedback, although they identified as limitations the tool's lack of contextualization and its impersonal tone. It is concluded that integrating Wilson's Ladder of Feedback with generative artificial intelligence is a promising innovation, but one that requires disciplinary adjustments, teacher supervision, and careful attention to the human dimension of feedback in e-learning contexts.
License
Copyright (c) 2026 María Verónica Leiva-Guerrero, Ignacio Araya Zamorano, Rafael Escobar Collins, Francisca Silva Castro

This work is licensed under a Creative Commons Attribution 4.0 International License.
The articles that are published in this journal are subject to the following terms:
1. The authors grant RIED the exploitation rights to the work accepted for publication, guarantee the journal the right of first publication of the research undertaken, and permit the journal to distribute the published work under the license indicated in point 2.
2. The articles are published in the electronic edition of the journal under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. You can copy and redistribute the material in any medium or format, adapt, remix, transform, and build upon the material for any purpose, even commercially. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
3. Conditions for self-archiving. Authors are encouraged to disseminate electronically the OnlineFirst version (the version assessed and accepted for publication) of their articles before publication, always with reference to its publication by RIED. This favors earlier circulation and dissemination and, with it, a possible increase in citations and reach among the academic community.

