Contributor

@runame runame commented Mar 21, 2024

Fixes #705.

I noticed that we already have a save_intermediate_checkpoints flag. Setting it to False while keeping save_checkpoints=True recreates the previous behaviour of storing only one final checkpoint.
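
For illustration, the interaction of the two flags can be sketched as below. This is a minimal sketch, not the actual AlgoPerf code: the helper name `checkpoints_kept` and its `checkpoint_steps` argument are hypothetical. It only models which checkpoints remain at the end of a run.

```python
from typing import List


def checkpoints_kept(checkpoint_steps: List[int],
                     save_checkpoints: bool,
                     save_intermediate_checkpoints: bool) -> List[int]:
  """Hypothetical helper: steps whose checkpoints remain after a run.

  `checkpoint_steps` is assumed to list the steps at which a checkpoint
  would be written, with the final step last.
  """
  if not save_checkpoints or not checkpoint_steps:
    # Opting out of checkpointing means nothing is stored, not even the
    # final checkpoint.
    return []
  if save_intermediate_checkpoints:
    # Every checkpoint is kept.
    return list(checkpoint_steps)
  # Only the final checkpoint is kept.
  return [checkpoint_steps[-1]]
```

With `save_checkpoints=True` and `save_intermediate_checkpoints=False`, only the last step is returned, which corresponds to the previous single-checkpoint behaviour.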

Not sure where exactly in the docs to mention the lack of checkpointing during scoring? It shouldn't matter to the submitter since checkpointing is not timed, except in edge cases such as an optimizer that does something like this, which would lead to an error in the submission if save_checkpoints=True.

@runame runame added the 🐛 Bug Something isn't working label Mar 21, 2024
@runame runame requested a review from priyakasimbeg March 21, 2024 18:51
@runame runame self-assigned this Mar 21, 2024
@runame runame requested a review from a team as a code owner March 21, 2024 18:51
@runame runame linked an issue Mar 21, 2024 that may be closed by this pull request

github-actions bot commented Mar 21, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@runame runame changed the title Don't save final checkpoint when save_checkpoints=False Don't save final checkpoint when save_checkpoints=False Mar 21, 2024
Contributor

priyakasimbeg commented Mar 22, 2024

> I noticed that we already have a save_intermediate_checkpoints flag.

I forgot about that.

> Not sure where exactly in the docs to mention the lack of checkpointing during scoring?

I'm not sure what the best place is either. I think we can add a question to the FAQs in DOCUMENTATION.md like: "My optimizer is incompatible with the AlgoPerf checkpointing code, will this affect my submission?"

@fsschneider We could also say something in this section about how we will run the code (e.g. "Note that we will disable checkpointing while we score submissions")?

@priyakasimbeg priyakasimbeg left a comment

I'm going to merge this into dev and we can add the documentation clarification in a separate PR.

@priyakasimbeg priyakasimbeg merged commit d0ed25a into mlcommons:dev Mar 22, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Mar 22, 2024
Development

Successfully merging this pull request may close these issues.

Add flag to completely opt out of checkpointing

2 participants