Comparison of Compact Generative Models for Automatic Question Answering in Spanish via Retrieval-Augmented Generation

Jean Phol A. Curi Garrafa; Victor R. Ortega Marocho; Wilson Mamani Rodrigo

doi:10.57166/riqchary.v7.n2.2025.2

Published: Aug 19, 2025

Keywords:

Spanish, evaluation, compact models, RAG

Jean Phol A. Curi Garrafa

Micaela Bastidas National University of Apurímac

https://orcid.org/0009-0006-5536-7055

Victor R. Ortega Marocho

Micaela Bastidas National University of Apurímac

https://orcid.org/0009-0006-7868-5507

Wilson Mamani Rodrigo

Micaela Bastidas National University of Apurímac

https://orcid.org/0000-0003-3901-0268

Abstract

This study compares five compact generative models (≤ 8 billion parameters) for Spanish question answering under a retrieval-augmented generation (RAG) pipeline executed locally. We assess response quality using F1, BLEU-4, and an external semantic judge (LLM-Judge), alongside efficiency indicators (P95 latency, memory, GPU/CPU). Results show Mistral 7B achieves the highest average F1 and semantic scores, whereas OpenHermes 7B attains nearly identical accuracy with the lowest memory footprint. Zephyr 7B-β performs well on very long documents, and Phi-3 Mini minimizes tail latency under adverse conditions. A Pareto analysis of F1–RAM identifies Mistral 7B and OpenHermes 7B as non-dominated solutions, yielding practical guidelines depending on operational goals (maximum accuracy vs. resource efficiency). The paper contributes a reproducible Spanish-language comparison under RAG and actionable criteria for local deployments.
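The F1–RAM Pareto analysis and the P95 latency indicator mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` type, the helper names, the nearest-rank percentile definition, and all numeric values are assumptions chosen for illustration, and the placeholder entries (`model_a` to `model_d`) do not correspond to results reported in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one evaluated model."""
    name: str
    f1: float      # answer quality, higher is better
    ram_gb: float  # memory footprint, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.f1 >= b.f1 and a.ram_gb <= b.ram_gb
    strictly_better = a.f1 > b.f1 or a.ram_gb < b.ram_gb
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


def p95(latencies: list[float]) -> float:
    """95th percentile using the nearest-rank definition (an assumption)."""
    if not latencies:
        raise ValueError("latencies must not be empty")
    ordered = sorted(latencies)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Placeholder values for illustration only; not taken from the paper.
candidates = [
    Candidate("model_a", f1=0.60, ram_gb=6.0),
    Candidate("model_b", f1=0.59, ram_gb=4.5),
    Candidate("model_c", f1=0.55, ram_gb=5.0),
    Candidate("model_d", f1=0.50, ram_gb=7.0),
]

front = pareto_front(candidates)
front_names = sorted(c.name for c in front)
print(front_names)
```

With these placeholder values, `model_a` and `model_b` remain on the front: one has the highest F1 and the other the lowest memory use, which mirrors the accuracy versus resource-efficiency trade-off described in the abstract.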

How to Cite

Comparison of Compact Generative Models for Automatic Question Answering in Spanish via Retrieval-Augmented Generation. (2025). C&T Riqchary Science and Technology Research Magazine, 7(2), 9-18. https://doi.org/10.57166/riqchary.v7.n2.2025.2

Issue

Vol. 7 No. 2 (2025): COINCITEC 2025

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

When an author creates an article and publishes it in a journal, the copyright passes to the journal as part of the publishing agreement. Therefore, the journal becomes the owner of the rights to reproduce, distribute and sell the article. The author retains some rights, such as the right to be recognized as the creator of the article and the right to use the article for his or her own scholarly or research purposes, unless otherwise agreed in the publication agreement.

Author Biographies

Jean Phol A. Curi Garrafa, Micaela Bastidas National University of Apurímac

He is a Computer Science and Systems Engineering student at the Micaela Bastidas National University of Apurímac. His training focuses on the development of information systems and the optimization of academic processes through the use of technological tools. He has participated in academic activities related to the design and implementation of IT solutions for university management.

Victor R. Ortega Marocho, Micaela Bastidas National University of Apurímac

He is a Computer Science and Systems Engineering student at the Micaela Bastidas National University of Apurímac. His academic interests focus on software engineering, process automation, and data analysis applied to education. He has contributed to research projects focused on improving academic management systems.

Wilson Mamani Rodrigo, Micaela Bastidas National University of Apurímac

He is a Systems Engineer and Civil Engineer, with a Master's degree in Systems Engineering and a PhD in Environmental Civil Engineering Sciences. He is a teaching assistant at the Micaela Bastidas National University of Apurímac and has taught at the National University of the Altiplano. He has extensive experience in the preparation of technical files, feasibility studies, and civil infrastructure projects, in addition to serving as a consultant and researcher in civil engineering projects.
