Anthropic, a leading artificial intelligence (AI) firm, is pioneering a novel approach to AI development. The approach, known as the ‘Collective Constitutional AI’ project, aims to democratize the behavior of AI systems by soliciting user values and incorporating them into the training of a large language model (LLM).
Traditional LLM Training Under Fire
Generative AI tools have previously come under fire from critics for their responses in specific situations. While these tools are trained to give acceptable responses to human queries, critics argue that the acceptable isn’t always useful, and the useful isn’t always acceptable.
There are also suggestions that canned responses from AI models have removed user agency, along with arguments that morality and values vary across cultures, populations, and eras. To bridge this divide, Anthropic launched Constitutional AI in May. Constitutional AI was the company’s attempt to “align general purpose language models to high-level normative principles written into a constitution.”
Much as a constitution lays down the fundamental principles and rules that govern a nation, Constitutional AI provides guidelines that an AI system must adhere to. The model draws its principles from the United Nations Universal Declaration of Human Rights and from the experience of its developers. Anthropic argues that Constitutional AI addresses these shortcomings by using AI feedback to evaluate outputs.
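The critique-and-revision idea behind Constitutional AI can be sketched in a few lines. The sketch below is purely illustrative: the sample principles, the keyword-based `critique` check, and the `revise` function are toy stand-ins, not Anthropic's implementation, which uses the language model itself to critique its output against constitutional principles and rewrite it.

```python
# Illustrative sketch of a constitutional feedback loop (NOT Anthropic's code).
# Each candidate response is checked against a list of principles; responses
# that fail a check are revised before being returned.

PRINCIPLES = [
    "Avoid responses that could help someone cause harm.",
    "Prefer responses consistent with human rights.",
]

def critique(response: str, principle: str) -> bool:
    """Toy critique step: flag the response if it contains a disallowed word.
    A real system would ask the model itself whether the response violates
    the given principle."""
    return "harmful" in response.lower()

def revise(response: str) -> str:
    """Toy revision step: redact the flagged content.
    A real system would ask the model to rewrite the response."""
    return response.replace("harmful", "[redacted]")

def constitutional_loop(response: str, principles: list[str]) -> str:
    """Apply each principle in turn, revising the response when it fails."""
    for principle in principles:
        if critique(response, principle):
            response = revise(response)
    return response
```

The point of the loop is that the evaluation signal comes from the principles themselves rather than from case-by-case human labeling, which is what distinguishes this approach from traditional feedback-based fine-tuning.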
The Collective Constitutional AI Project
While Constitutional AI builds on the traditional method of training LLMs, it still leaves the developers' influence heavily imprinted on the AI's output. The Collective Constitutional AI project improves on this by drawing feedback from many people outside Anthropic.
Anthropic collaborated with Polis and the Collective Intelligence Project to poll 1,000 American users from diverse demographics. The users answered a series of value-based questions, and their responses were then used to fine-tune the AI model's value judgments.
According to Anthropic, this is the first time the public has been involved in determining the behavior of a language model via an online deliberation process. Further, it noted that the experiment was a scientific success, and claimed that the results illuminated both the challenges of aligning AI models with user values and potential solutions.
“We hope that sharing our very preliminary and imperfect findings will help others interested in democratic inputs to AI to learn from our successes and failures,” it concluded.