CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization
Source code for CiPO (ACL 2026 Main Conference Oral Presentation).
R-TOFU Target model: Terry77/R-TOFU_Target
Code coming soon.