Hybrid Zero-Shot NLP Pipeline for Text Summarization and Question Generation

Inioluwa Daniel Osibajo, Oluwaseyi Ezekiel Olorunshola, Fatimah Adamu-Fika and Tsentob Joy Samson
Additional contact information
Inioluwa Daniel Osibajo: Department of Computer Science, Faculty of Computing, Air Force Institute of Technology, Kaduna, Nigeria.
Oluwaseyi Ezekiel Olorunshola: Department of Computer Science, Faculty of Computing, Air Force Institute of Technology, Kaduna, Nigeria.
Fatimah Adamu-Fika: Department of Cyber Security, Faculty of Computing, Air Force Institute of Technology, Kaduna, Nigeria.
Tsentob Joy Samson: Department of Computer Science, Faculty of Computing, Air Force Institute of Technology, Kaduna, Nigeria.

International Journal of Research and Innovation in Applied Science, 2025, vol. 10, issue 7, 342-354

Abstract: This study presents a hybrid zero-shot Natural Language Processing (NLP) pipeline for text summarization and multiple-choice question (MCQ) generation, designed specifically for low-resource educational environments. The system integrates Bidirectional Encoder Representations from Transformers (BERT) for extractive summarization, Bidirectional and Auto-Regressive Transformers (BART) for abstractive summarization, and the Text-to-Text Transfer Transformer (T5) for MCQ generation. Built with the Hugging Face Transformers library, the Natural Language Toolkit (NLTK), spaCy, and Sentence Transformers, the pipeline runs efficiently on a 12 GB Graphics Processing Unit (GPU) without any model fine-tuning. The workflow preprocesses academic texts, identifies key sentences with BERT and TextRank (a graph-based ranking algorithm), generates coherent and concise summaries with BART, and produces diverse, contextually relevant MCQs with T5. Evaluations were conducted on user-generated academic texts and on the CNN/Daily Mail dataset for benchmarking. The system achieved a BERTScore F1 of 0.87, Recall-Oriented Understudy for Gisting Evaluation (ROUGE)-1 and ROUGE-L scores of 0.54, a Bilingual Evaluation Understudy (BLEU) score of 0.20, a Metric for Evaluation of Translation with Explicit ORdering (METEOR) score of 0.35, a compression ratio of 0.37, a coherence score of 0.50, and 80% human-rated MCQ relevance, outperforming Generative Pre-trained Transformer 3 (GPT-3) baselines. To assess educational impact, a study was conducted with 20 students of average academic standing using a 25-mark test generated by the pipeline. Thirteen students scored above 20, four scored between 15 and 20, and three scored between 10 and 15, so 85% of participants met or exceeded the 60% proficiency threshold of 15 marks. Qualitative analysis revealed minor factual inaccuracies in 10% of summaries and relevance drift in 15% of MCQs, highlighting areas for further improvement. Overall, the study demonstrates the practical potential of transformer-based hybrid NLP pipelines for scalable, accessible educational content creation in resource-constrained settings.
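
As a rough illustration of the workflow described in the abstract, the sketch below wires together the three stages (transformer-based sentence ranking with TextRank, BART abstractive summarization, and T5 question generation) using off-the-shelf Hugging Face checkpoints. The model names (all-MiniLM-L6-v2, facebook/bart-large-cnn, google/flan-t5-base), the prompt wording, and the input file lecture_notes.txt are illustrative assumptions, not the authors' exact configuration.

    import nltk
    import networkx as nx
    import numpy as np
    from sentence_transformers import SentenceTransformer, util
    from transformers import pipeline

    nltk.download("punkt", quiet=True)
    nltk.download("punkt_tab", quiet=True)  # needed by recent NLTK releases

    def extractive_step(text, top_k=5):
        # Rank sentences with TextRank: PageRank over a cosine-similarity
        # graph built from transformer sentence embeddings.
        sentences = nltk.sent_tokenize(text)
        if len(sentences) <= top_k:
            return sentences
        encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed sentence encoder
        embeddings = encoder.encode(sentences, convert_to_numpy=True)
        similarity = util.cos_sim(embeddings, embeddings).numpy()
        np.fill_diagonal(similarity, 0.0)
        scores = nx.pagerank(nx.from_numpy_array(similarity))
        top = sorted(sorted(scores, key=scores.get, reverse=True)[:top_k])
        return [sentences[i] for i in top]  # keep original document order

    def abstractive_step(key_sentences):
        # Rewrite the extracted sentences into a fluent summary with BART (zero-shot).
        summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
        joined = " ".join(key_sentences)
        return summarizer(joined, max_length=150, min_length=40, do_sample=False)[0]["summary_text"]

    def question_step(summary):
        # Ask a T5-family checkpoint for a question about the summary; the
        # checkpoint and prompt are placeholders, not the paper's exact setup.
        generator = pipeline("text2text-generation", model="google/flan-t5-base")
        prompt = "Write a multiple-choice question with four options about: " + summary
        return generator(prompt, max_new_tokens=128)[0]["generated_text"]

    source_text = open("lecture_notes.txt").read()  # hypothetical input file
    summary = abstractive_step(extractive_step(source_text))
    print(summary)
    print(question_step(summary))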

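The reported automatic metrics can be approximated with the bert-score and rouge-score Python packages, as in the minimal sketch below; it covers BERTScore F1, ROUGE-1/ROUGE-L, and the compression ratio only (BLEU, METEOR, and the coherence score are omitted), and the settings shown are assumptions rather than the paper's exact evaluation protocol.

    from bert_score import score as bertscore
    from rouge_score import rouge_scorer

    def evaluate_summary(source_text, reference_summary, generated_summary):
        # ROUGE-1 and ROUGE-L F-measures against a reference summary.
        scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
        rouge = scorer.score(reference_summary, generated_summary)
        # BERTScore F1: semantic overlap between generated and reference summaries.
        _, _, f1 = bertscore([generated_summary], [reference_summary], lang="en")
        return {
            "rouge1_f": rouge["rouge1"].fmeasure,
            "rougeL_f": rouge["rougeL"].fmeasure,
            "bertscore_f1": float(f1.mean()),
            # Compression ratio = generated-summary length / source length (in words).
            "compression_ratio": len(generated_summary.split()) / max(len(source_text.split()), 1),
        }
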
Date: 2025

Downloads: (external link)
https://www.rsisinternational.org/journals/ijrias/ ... -issue-7/342-354.pdf (application/pdf)
https://rsisinternational.org/journals/ijrias/arti ... question-generation/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:bjf:journl:v:10:y:2025:i:7:p:342-354

International Journal of Research and Innovation in Applied Science is currently edited by Dr. Renu Malsaria

Handle: RePEc:bjf:journl:v:10:y:2025:i:7:p:342-354