Attached Paper In-person November Annual Meeting 2026

AI Moral and Religious Benchmarks

Papers Session: Experiments in Artificial Intelligence

Abstract for Online Program Book (maximum 150 words)

In artificial intelligence (AI) research and development, a benchmark combines data and evaluation criteria to measure how well AI systems perform specific tasks or demonstrate particular abilities. Developing meaningful benchmarks represents a significant opportunity for scholars of religion and ethics to shape AI’s future in a positive direction. With generative AI, evaluation methods alone can orient AI training processes toward moral improvement and religious awareness. Recent efforts to evaluate the ethical dimensions of large language models (LLMs) set the stage for extending this frontier, as do well-recognized challenges in constructing models that capture diverse human cultures. Developing benchmarks that more broadly measure human and other suffering, compassionate responses, and richer conceptions of flourishing and well-being provides a timely and impactful means to affect AI development. Efforts toward these state-of-the-art ethical and religious benchmarks are reviewed, discussed, and situated within a broader framework oriented toward flourishing.

Authors

Mark Graves

markgraves@fuller.edu