News

Can you jailbreak Anthropic's latest AI safety measure? Researchers want you to try -- and are offering up to $20,000 if you succeed. Trained on synthetic data, these "Constitutional Classifiers" were able to filter ...