A new AI study from Anthropic and Thinking Machines Lab stress-tests model specifications and reveals character differences among language models
AI companies use model specifications to define target behaviors during training and evaluation. Do current specifications state the intended behavior precisely enough, and do frontier models exhibit distinct behavioral profiles under the same...