Trustworthy generative AI for computing systems: A review of safety, evaluation, and governance mechanisms

Authors

DOI:

https://doi.org/10.17977/um031v13i12026p061

Abstract

Generative AI systems, including large language models, diffusion models, and multimodal foundation models, are increasingly integrated into critical computing infrastructure such as cloud orchestration, code synthesis pipelines, healthcare decision support, and financial risk assessment. Consequently, there is growing demand for frameworks that can evaluate, assure, and regulate the trustworthiness of these systems. This article reviews the development of trustworthy AI research from 2015 to 2025 and the evidence generated across four primary areas: safety and alignment, robustness and reliability, evaluation, and governance. We present comparative assessments of safety benchmarks, alignment methodologies (RLHF, RLAIF, DPO, Constitutional AI), and formal governance frameworks worldwide, pinpointing critical gaps between regulatory objectives and actual technical capability. A key finding is the Evaluation Paradox: the benchmarks most commonly relied on to certify systems as “AI safe” are themselves the least robust to distributional shift and adversarial manipulation. We also find an institutional misalignment between the speed of generative AI deployment and the maturity of the governance mechanisms proposed to regulate it. We identify seven priority research challenges for the field. Researchers, system engineers, policymakers, and practitioners seeking an evidence-based understanding of the current state of trustworthiness in generative AI will benefit from this review.

Author Biographies

Saif Safaa Shakir, University of Al-Qadisiyah

College of Computer Science and Information Technology

Hasan Fadhil Qasim, University of Misan

College of Agriculture

Huda Najim Abdulwahed, University of Al-Qadisiyah

College of Arts, University of Al-Qadisiyah, Iraq

Published

2026-05-20

How to Cite

Shakir, S. S., Qasim, H. F., & Abdulwahed, H. N. (2026). Trustworthy generative AI for computing systems: A review of safety, evaluation, and governance mechanisms. Jurnal Inovasi Dan Teknologi Pembelajaran, 13(1), 61–73. https://doi.org/10.17977/um031v13i12026p061

Issue

Section

Articles