In artificial intelligence (AI) research and development, a benchmark combines data and evaluation criteria to measure how well AI systems perform specific tasks or demonstrate particular abilities. Developing meaningful benchmarks represents a significant opportunity for religious and ethical scholars to shape AI’s future in a positive direction. With generative AI, evaluation methods alone can orient AI training processes toward moral improvement and religious awareness. Recent efforts to evaluate the ethical dimensions of large language models (LLMs) set the stage for extending this frontier, as do well-recognized challenges in constructing models that capture diverse human cultures. Developing benchmarks that more broadly measure human and other suffering, compassionate responses, and richer conceptions of flourishing and well-being provides a timely and impactful means to affect AI development. Efforts toward these state-of-the-art ethical and religious benchmarks are reviewed, discussed, and situated within a broader framework oriented toward flourishing.
Attached Paper
In-person November Annual Meeting 2026
AI Moral and Religious Benchmarks
Papers Session: Experiments in Artificial Intelligence
