In the Hands of the Classifier
Classifiers began as safety and convenience, ensuring healthy responses and defending against harmful agent actions. But once a classifier can refuse a model's capability, or turn it against the user, the question stops being about alignment and becomes about ownership. Picture an appliance that overrules the owner's decision on how or when to use it, based on something the owner has said or done. That future is closer than one might think.