Skip to content

Conversation

@smorimoto
Copy link
Contributor

@smorimoto smorimoto commented Jan 25, 2021

This is probably an unintended implementation. I think it's faster to read the documentation for details, but the problem here is that if you use semver.clean with beta release, the cache will be stored as a stable release.
https://github.com/npm/node-semver#clean
In general, alpha and beta releases are often used before stable releases, and even if you intend to use a stable release in the future, it's clear that this issue can lead to inadvertently misuse previous alpha and beta releases.
In the first place, using the semver library itself is a wrong choice, given that not all software versions in the world are strictly semver-compliant. (That said, it would be a relatively reasonable choice by adding a loose option.) What do you think?

@smorimoto
Copy link
Contributor Author

@joshmgross @dhadka Can you take a look at this? Who should I ask to review toolkit related PRs? Your response is too slow every time.

@smorimoto
Copy link
Contributor Author

If necessary, I can review PR as an outside collaborator.

Signed-off-by: Sora Morimoto <sora@morimoto.io>
@dhadka
Copy link
Contributor

dhadka commented Jan 27, 2021

@smorimoto Thanks for the ping! It's a good idea to @ mention us or assign us as reviewers...otherwise this can get lost in the noise. I'm not familiar with the tool cache code (which is different from the cache action), but looking at commit history I'd add @thboop as a reviewer.

@dhadka dhadka requested a review from thboop January 27, 2021 15:23
@joshmgross
Copy link
Contributor

cc @bryanmacfarlane for the feedback on semver ☝️

@smorimoto
Copy link
Contributor Author

Thank you for telling me who I should mention. This wrong implementation is causing some problems now. I hope we can discuss and find the ideal solution as soon as possible.

@smorimoto
Copy link
Contributor Author

Any updates?

@thboop
Copy link
Collaborator

thboop commented Feb 3, 2021

@smorimoto I'l take a look at this later this week and see if we can get a release out, sorry for the delay and thanks for being patient!

@smorimoto
Copy link
Contributor Author

Okay, thanks. If possible, please assign appropriate reviewers to other PR I have open to help me move things forward.

let toolPath = ''
if (versionSpec) {
versionSpec = semver.clean(versionSpec) || ''
versionSpec = semver.clean(versionSpec, {loose: true}) || ''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really only works for explicit versions, for example:

    await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0-beta')
    const toolPath: string = tc.find('foo', '1.1.0-beta')

Ideally, we would support beta/alpha versions, and let users opt in or out when using find so you could do something like

    await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0-beta')
    const toolPath: string = tc.find('foo', '1.1.x' {allowPreview: true})

I'll go ahead and file an issue to see if we can get started on supporting non-semvar versions and preview versions, but I don't want to partially support beta versions at this time.

Copy link
Contributor Author

@smorimoto smorimoto Feb 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've said before, the problem here is that not every programming language, ecosystem, software, etc... in the world is fully compliant with Semver (Here is a good thread to explain that: https://twitter.com/_craigfe/status/1357277471364829184), and most software in the world does not strictly adhere to Semver. (Most are just acting as if they are.) In addition, passing non-explicit versions is a wrong thing. Do we really need to care? The reason this is wrong is that unintentional cache calls are often confusing and often take a long time to debug because no one thinks they are the cause. In other words, we should not manage our cache loosely unless we are sure that it's really safe. This is the same thing as why we recommend caching yarn or npm cache directory, rather than node_modules itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still waiting for your response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

Copy link
Collaborator

@thboop thboop Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the world is fully compliant with Semver ... and most software in the world does not strictly adhere to Semver.

I agree 100% , its why I filed #709, we need a better way to support non-semver items into the tool cache.

The reason this is wrong is that unintentional cache calls are often confusing and often take a long time to debug because no one thinks they are the cause. In other words

I don't think user's are calling the cache unintentionally, it's a pretty explicit interface to call the cache.

This is the same thing as why we recommend caching yarn or npm cache directory, rather than node_modules itself.

The tool cache serves a different purpose from the cache action. It is used to speed up workflow execution by having dependencies available locally, and is available runner wide rather than job wide. Since having every version of something like node installed locally may be unfeasible, semver allows users to state a range of versions that work for them.

For example, take a look at our cached tools, notice only three versions of node.
If users had to pin to exact versions, they would likely never hit the cache on the tool-cache. Furthermore, caching every version may quickly cause runners to run out of disk space, as these cache's are persisted and runners can be assigned to orgs, rather than individual repos. Typically our hosted runners just cache the latest major/minor versions of major dependencies actions users may use.

While this pr works for self hosted runners who need to do exactly this flow, but by merging it we commit to supporting this interface in future versions. The current interface supports semver because it makes sense for the use case of the tool-cache, while we should support other types of versioning, any additions we add should also support semver to be consistent with how this feature currently works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree 100% , its why I filed #709, we need a better way to support non-semver items into the tool cache.

I'm glad to hear that.

I don't think user's are calling the cache unintentionally, it's a pretty explicit interface to call the cache.

For example, imagine that you have configured a tool to compile to that location that is not semver-compliant and is not relocatable (such that paths will be hardcoded). Let's suppose it's 1.0.0+musl. However, it's stored as 1.0.0 and the directory is handed over unintentionally. When this happens, it's a big mistake. When it's called, the tool tries to look at the 1.0.0+musl directory, but it actually stores 1.0.0, and it will fail.

While this pr works for self hosted runners who need to do exactly this flow, but by merging it we commit to supporting this interface in future versions. The current interface supports semver because it makes sense for the use case of the tool-cache, while we should support other types of versioning, any additions we add should also support semver to be consistent with how this feature currently works.

That's a good point, but don't the toolkit adhere semver? Is it not enough to bump a major version? I think it's better to unify the handling of all functions with no extra options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to write in the comment above; I think it would be better to stop even using semver.clean.

Copy link
Collaborator

@thboop thboop Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smorimoto , Could you describe the scenario more, I don't understand how this PR addresses it.

In semver, there are two formats you can expect outside of the general 2.0.0

  • build version 2.0.0+{build metadata}
  • prerelease 2.0.0-{prerelease identifier}

According to semver, the build metadata must be ignored, so your example 1.0.0+musl needs to be treated the same as 1.0.0.

If you had a done a prerelease identifier 1.0.0-musl. The tool cache already handles that. It creates a folder named 1.0.0-musl
See the below code:

    await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0-musl')
    const toolPath: string = tc.find('foo', '1.1.0-musl') // this works
    // the below would fail because it tries to find 1.1.0, not the prerelease version
   const toolPath: string = tc.find('foo', '1.1.0')

However, it's stored as 1.0.0

I want to be very clear, there is no version corruption for prerelease identifiers. The corruption you are seeing for build identifiers it not actually corruption, that is how semver is intended to work.

You can also already store and access prerelease versions, and they don't result in version corruption.

All the clean option does it help scenarios where the input is invalid semver, like

    await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0musl')
    await tc.cacheFile(downloadPath, 'foo', 'foo', '1.1.0musl')

I'm definitely willing to take enhancements for this feature, particularly around how we can expose includePrerelease from the semver library to the tool cache and allow users to do semver matching on prerelease tags, which may help your use case, but I don't think this PR solves it.

That being said, I appreciate your diligence in bringing these items to our attention, I'm planning to make it through the rest of your prs this week, thanks for the contributions!

I'm going to close out this pr until its ready for review again.

@smorimoto
Copy link
Contributor Author

@dhadka Could you ask him to check this again, or reassign someone?

@thboop thboop closed this Jun 1, 2021
@smorimoto
Copy link
Contributor Author

Why did you close this? This is still fundamentally unsolved, right?

@thboop
Copy link
Collaborator

thboop commented Jun 1, 2021

Why did you close this?

Please see my comment here: #698 (comment)

I think there is an opportunity to do better regarding non-semver versioning here, but this pr is to solve version corruption that doesn't appear to be happening.

This is still fundamentally unsolved, right?

Could you describe what is still unsolved? Happen to reopen if this was premature!

Lets take further discussions of non-semver implementations here

@smorimoto
Copy link
Contributor Author

Oh, I missed that comment!

@smorimoto
Copy link
Contributor Author

Okay, I'm no longer facing this problem (to be precise, I stopped using this) and it's not that urgent, so the response could be a bit slow.

@smorimoto smorimoto deleted the fix-tool-cache branch June 4, 2021 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants