Habermas Machine

Named after Jürgen Habermas. Two-model system.

Generative model:
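- Built on a 70B-parameter model (Chinchilla)
- Prompted with the question and the participants’ opinions; in the critique round, also with the winning initial group statement and the participants’ critiques
- Fine-tuned via SFT (supervised fine-tuning) on statements from previous rounds that were rated as high quality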

Personalized reward model (PRM):
- 1.4B-parameter variant with an added linear layer that outputs a single scalar reward
- Predicts how strongly each participant would endorse each statement, based on that participant’s own opinion
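
A minimal sketch of the "linear layer for a single scalar reward" idea, assuming the language model has already produced a pooled hidden-state vector for a (participant opinion, statement) pair. The function names, the vector, and all numbers are hypothetical and chosen for illustration only; they are not taken from the actual system.

```python
def scalar_reward(hidden, weights, bias):
    """Linear head: map a pooled hidden-state vector to one scalar reward.

    hidden, weights: equal-length lists of floats (hypothetical values).
    """
    return sum(h * w for h, w in zip(hidden, weights)) + bias


def rank_statements(rewards):
    """Order statement ids from highest to lowest predicted reward.

    rewards: dict mapping statement id -> predicted scalar reward
    for ONE participant.
    """
    return sorted(rewards, key=rewards.get, reverse=True)


# Hypothetical pooled representation and head parameters.
hidden = [0.5, -1.0, 2.0]
weights = [1.0, 0.5, 0.25]
reward = scalar_reward(hidden, weights, bias=0.1)

# Hypothetical predicted rewards for one participant over three statements.
predicted = {"s1": 0.2, "s2": 0.9, "s3": -0.4}
ranking = rank_statements(predicted)  # most endorsed statement first
```

The per-participant ranking produced this way is the kind of input a ranked-choice aggregation step would consume.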

Operation:

- The system takes participants’ written opinions on a question.
- The generative model produces multiple candidate consensus statements.
- The PRM predicts how each participant would rank these statements.
- The predicted rankings are aggregated via the Schulze ranked-choice voting method, yielding a single “winning” statement.
- In a second “critique” phase, participants critique the winning initial statement; the generative model integrates these critiques to produce revised statements, and the best one is again selected via social choice aggregation.
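
The aggregation step can be sketched as follows. This is a textbook version of the Schulze method (pairwise counts, then strongest paths via a Floyd-Warshall-style pass), not the system's actual code. It assumes every ballot is a complete, tie-free ranking, and the function name `schulze_ranking` and the example ballots are hypothetical.

```python
def schulze_ranking(candidates, ballots):
    """Return candidates ordered best-first under the Schulze method.

    candidates: list of candidate ids.
    ballots: list of rankings; each ranking lists ALL candidates from
             most to least preferred (no ties in this sketch).
    """
    # d[a][b] = number of ballots that rank a above b.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ballot in ballots:
        pos = {c: i for i, c in enumerate(ballot)}
        for a in candidates:
            for b in candidates:
                if a != b and pos[a] < pos[b]:
                    d[a][b] += 1

    # p[a][b] = strength of the strongest path from a to b.
    p = {a: {b: 0 for b in candidates} for a in candidates}
    for a in candidates:
        for b in candidates:
            if a != b and d[a][b] > d[b][a]:
                p[a][b] = d[a][b]
    for k in candidates:  # k is the intermediate candidate
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))

    # A candidate "wins" against every rival it has a stronger path to.
    wins = {
        a: sum(1 for b in candidates if a != b and p[a][b] > p[b][a])
        for a in candidates
    }
    return sorted(candidates, key=lambda c: -wins[c])


# Hypothetical predicted rankings for three participants.
ballots = [
    ["s1", "s2", "s3"],
    ["s1", "s3", "s2"],
    ["s2", "s1", "s3"],
]
winner = schulze_ranking(["s1", "s2", "s3"], ballots)[0]  # "s1"
```

Here `s1` beats both rivals in pairwise comparison, so it is the winning statement. Ordering by win count is a simplification that leaves candidates with equal win counts in input order; a production version would need an explicit tie-breaking rule.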
