-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Avoid version corruption by passing loose option to semver.clean in the tool-cache package #698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@joshmgross @dhadka Can you take a look at this? Who should I ask to review toolkit related PRs? Your response is too slow every time. |
|
If necessary, I can review PR as an outside collaborator. |
Signed-off-by: Sora Morimoto <sora@morimoto.io>
1de8062 to
93dd1c5
Compare
|
@smorimoto Thanks for the ping! It's a good idea to @ mention us or assign us as reviewers...otherwise this can get lost in the noise. I'm not familiar with the tool cache code (which is different from the cache action), but looking at commit history I'd add @thboop as a reviewer. |
|
cc @bryanmacfarlane for the feedback on semver ☝️ |
|
Thank you for telling me who I should mention. This wrong implementation is causing some problems now. I hope we can discuss and find the ideal solution as soon as possible. |
|
Any updates? |
|
@smorimoto I'l take a look at this later this week and see if we can get a release out, sorry for the delay and thanks for being patient! |
|
Okay, thanks. If possible, please assign appropriate reviewers to other PR I have open to help me move things forward. |
| let toolPath = '' | ||
| if (versionSpec) { | ||
| versionSpec = semver.clean(versionSpec) || '' | ||
| versionSpec = semver.clean(versionSpec, {loose: true}) || '' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really only works for explicit versions, for example:
await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0-beta')
const toolPath: string = tc.find('foo', '1.1.0-beta')
Ideally, we would support beta/alpha versions, and let users opt in or out when using find so you could do something like
await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0-beta')
const toolPath: string = tc.find('foo', '1.1.x' {allowPreview: true})
I'll go ahead and file an issue to see if we can get started on supporting non-semvar versions and preview versions, but I don't want to partially support beta versions at this time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I've said before, the problem here is that not every programming language, ecosystem, software, etc... in the world is fully compliant with Semver (Here is a good thread to explain that: https://twitter.com/_craigfe/status/1357277471364829184), and most software in the world does not strictly adhere to Semver. (Most are just acting as if they are.) In addition, passing non-explicit versions is a wrong thing. Do we really need to care? The reason this is wrong is that unintentional cache calls are often confusing and often take a long time to debug because no one thinks they are the cause. In other words, we should not manage our cache loosely unless we are sure that it's really safe. This is the same thing as why we recommend caching yarn or npm cache directory, rather than node_modules itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still waiting for your response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the world is fully compliant with Semver ... and most software in the world does not strictly adhere to Semver.
I agree 100% , its why I filed #709, we need a better way to support non-semver items into the tool cache.
The reason this is wrong is that unintentional cache calls are often confusing and often take a long time to debug because no one thinks they are the cause. In other words
I don't think user's are calling the cache unintentionally, it's a pretty explicit interface to call the cache.
This is the same thing as why we recommend caching yarn or npm cache directory, rather than node_modules itself.
The tool cache serves a different purpose from the cache action. It is used to speed up workflow execution by having dependencies available locally, and is available runner wide rather than job wide. Since having every version of something like node installed locally may be unfeasible, semver allows users to state a range of versions that work for them.
For example, take a look at our cached tools, notice only three versions of node.
If users had to pin to exact versions, they would likely never hit the cache on the tool-cache. Furthermore, caching every version may quickly cause runners to run out of disk space, as these cache's are persisted and runners can be assigned to orgs, rather than individual repos. Typically our hosted runners just cache the latest major/minor versions of major dependencies actions users may use.
While this pr works for self hosted runners who need to do exactly this flow, but by merging it we commit to supporting this interface in future versions. The current interface supports semver because it makes sense for the use case of the tool-cache, while we should support other types of versioning, any additions we add should also support semver to be consistent with how this feature currently works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree 100% , its why I filed #709, we need a better way to support non-semver items into the tool cache.
I'm glad to hear that.
I don't think user's are calling the cache unintentionally, it's a pretty explicit interface to call the cache.
For example, imagine that you have configured a tool to compile to that location that is not semver-compliant and is not relocatable (such that paths will be hardcoded). Let's suppose it's 1.0.0+musl. However, it's stored as 1.0.0 and the directory is handed over unintentionally. When this happens, it's a big mistake. When it's called, the tool tries to look at the 1.0.0+musl directory, but it actually stores 1.0.0, and it will fail.
While this pr works for self hosted runners who need to do exactly this flow, but by merging it we commit to supporting this interface in future versions. The current interface supports semver because it makes sense for the use case of the tool-cache, while we should support other types of versioning, any additions we add should also support semver to be consistent with how this feature currently works.
That's a good point, but don't the toolkit adhere semver? Is it not enough to bump a major version? I think it's better to unify the handling of all functions with no extra options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to write in the comment above; I think it would be better to stop even using semver.clean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smorimoto , Could you describe the scenario more, I don't understand how this PR addresses it.
In semver, there are two formats you can expect outside of the general 2.0.0
- build version
2.0.0+{build metadata} - prerelease
2.0.0-{prerelease identifier}
According to semver, the build metadata must be ignored, so your example 1.0.0+musl needs to be treated the same as 1.0.0.
If you had a done a prerelease identifier 1.0.0-musl. The tool cache already handles that. It creates a folder named 1.0.0-musl
See the below code:
await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0-musl')
const toolPath: string = tc.find('foo', '1.1.0-musl') // this works
// the below would fail because it tries to find 1.1.0, not the prerelease version
const toolPath: string = tc.find('foo', '1.1.0')
However, it's stored as 1.0.0
I want to be very clear, there is no version corruption for prerelease identifiers. The corruption you are seeing for build identifiers it not actually corruption, that is how semver is intended to work.
You can also already store and access prerelease versions, and they don't result in version corruption.
All the clean option does it help scenarios where the input is invalid semver, like
await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0musl')
await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0musl')
I'm definitely willing to take enhancements for this feature, particularly around how we can expose includePrerelease from the semver library to the tool cache and allow users to do semver matching on prerelease tags, which may help your use case, but I don't think this PR solves it.
That being said, I appreciate your diligence in bringing these items to our attention, I'm planning to make it through the rest of your prs this week, thanks for the contributions!
I'm going to close out this pr until its ready for review again.
|
@dhadka Could you ask him to check this again, or reassign someone? |
|
Why did you close this? This is still fundamentally unsolved, right? |
Please see my comment here: #698 (comment) I think there is an opportunity to do better regarding non-semver versioning here, but this pr is to solve version corruption that doesn't appear to be happening.
Could you describe what is still unsolved? Happen to reopen if this was premature! Lets take further discussions of non-semver implementations here |
|
Oh, I missed that comment! |
|
Okay, I'm no longer facing this problem (to be precise, I stopped using this) and it's not that urgent, so the response could be a bit slow. |
This is probably an unintended implementation. I think it's faster to read the documentation for details, but the problem here is that if you use
semver.cleanwith beta release, the cache will be stored as a stable release.https://github.com/npm/node-semver#clean
In general, alpha and beta releases are often used before stable releases, and even if you intend to use a stable release in the future, it's clear that this issue can lead to inadvertently misuse previous alpha and beta releases.
In the first place, using the semver library itself is a wrong choice, given that not all software versions in the world are strictly semver-compliant. (That said, it would be a relatively reasonable choice by adding a loose option.) What do you think?