replace modelOverridesRecommendedSettings with selfHostedModels #64164

emidoots · 2024-07-30T22:26:32Z

Previously, to provide configuration for self-hosted models (the models we've tested and believe work well), a site admin would use configuration like this:

"modelConfiguration": {
    ...
    "modelOverridesRecommendedSettings": [
      "mistral::v1::mixtral-8x7b-instruct",
      "bigcode::v1::starcoder2-7b"
    ],
}

A few problems with this:

If you are NOT self-hosting models, you probably should not use this option, since it sets serverSideConfig options specific to self-hosting. But its name, "recommended settings", suggests otherwise!
When self-hosting models, there is almost a 1:1 correspondence between provider and actual API endpoint (because you have a single endpoint per model), so being unable to configure the mistral or bigcode part of the model ref above is a problem: it restricts you to hosting only one model per provider. Currently the only escape is to abandon the defaults we provide with modelOverridesRecommendedSettings and rewrite the configuration yourself using modelOverrides.
When self-hosting models, configuring serverSideConfig.openaicompatible.apiModel is a very common need, probably the most common one, but again there is no way to configure it here; the only option is to abandon the defaults and rewrite the configuration yourself.
If we improve the default values (for example, if we learn that a larger context window for mixtral-8x7b-instruct works better), we currently have no good way to 'release a new version of the defaults', because the string is a model ref (mistral::v1::mixtral-8x7b-instruct); we would have to append -v2 to the model name or something similar. Versioning matters here because there are both:

Breaking changes: if we increase the context window at all, site admins hosting these models may need to raise the limits in their model deployment, or else the API may return a hard error ('you sent me too many tokens').
Non-breaking changes: if we decrease the context window, Cody responses get faster and nothing breaks. Similarly, adding new stop sequences may be fine.
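
To make the breaking versus non-breaking distinction concrete, here is a minimal sketch in Python. It is not Sourcegraph's implementation: the `ModelDefaults` type, its field names, the token counts, and the `classify_change` function are all hypothetical. It encodes the two rules stated above, plus one extra assumption of the sketch (removing a stop sequence is treated as breaking, since that case is not discussed here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelDefaults:
    """Hypothetical shape of one version of a default model configuration."""
    context_window: int                      # max tokens sent to the model
    stop_sequences: frozenset = frozenset()  # strings that end generation


def classify_change(old: ModelDefaults, new: ModelDefaults) -> str:
    """Classify a change to the defaults as 'breaking' or 'non-breaking'."""
    if new.context_window > old.context_window:
        # A self-hosted deployment may reject requests above its own limit.
        return "breaking"
    if not old.stop_sequences <= new.stop_sequences:
        # Assumption of this sketch: dropping a stop sequence is breaking.
        return "breaking"
    # Smaller context window and/or added stop sequences.
    return "non-breaking"


# Illustrative values only, not real defaults.
v1 = ModelDefaults(context_window=8192, stop_sequences=frozenset({"</s>"}))
bigger = ModelDefaults(context_window=16384, stop_sequences=frozenset({"</s>"}))
smaller = ModelDefaults(context_window=4096, stop_sequences=frozenset({"</s>", "\n\n"}))

print(classify_change(v1, bigger))
print(classify_change(v1, smaller))
```

A check like this could decide whether a new set of defaults needs a new version identifier or can be shipped under the existing one.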

This PR fixes all of these issues by deprecating modelOverridesRecommendedSettings and introducing a new selfHostedModels field, which looks like:

"modelConfiguration": {
    ...
    "selfHostedModels": [
      {
        "provider": "mistral",
        "model": "mixtral-8x7b-instruct@v1",
        "override": {
          "serverSideConfig": {
            "type": "openaicompatible",
            "apiModel": "mixtral-8x7b-instruct-custom!"
          }
        }
      },
      {
        "provider": "bigcode",
        "model": "starcoder2-7b@v1",
        "override": {
          "serverSideConfig": {
            "type": "openaicompatible",
            "apiModel": "starcoder2-7b-custom!"
          }
        }
      }
    ],
}

Notably:

The provider part of the model ref is now configurable, enabling self-hosting of more than one model per provider while still benefiting from our default model configurations.
"model": "starcoder2-7b@v1" is no longer a model ref, but rather a 'default model configuration name' with a version associated with it.
override allows overriding properties of the default configuration named by "model": "starcoder2-7b@v1", such as serverSideConfig.apiModel.
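
One way to picture how such an entry might be resolved is sketched below in Python. This is an illustration, not the code in this PR: the `DEFAULTS` table and its values, the `deep_merge` and `resolve` functions, and the key-by-key merge semantics for `override` are all assumptions.

```python
import copy

# Hypothetical table of default configurations, keyed by (name, version).
# The values are placeholders, not Sourcegraph's real defaults.
DEFAULTS = {
    ("starcoder2-7b", "v1"): {
        "contextWindow": {"maxInputTokens": 8192},
        "serverSideConfig": {"type": "openaicompatible", "apiModel": "starcoder2-7b"},
    },
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve(entry: dict) -> dict:
    """Resolve one selfHostedModels entry against the hypothetical DEFAULTS table."""
    # Split 'name@version' on the last '@'.
    name, sep, version = entry["model"].rpartition("@")
    if not sep or not name or not version:
        raise ValueError(f"expected 'name@version', got {entry['model']!r}")
    try:
        defaults = DEFAULTS[(name, version)]
    except KeyError:
        raise ValueError(f"no default configuration for {entry['model']!r}") from None
    config = deep_merge(defaults, entry.get("override", {}))
    config["provider"] = entry["provider"]  # provider is chosen by the site admin
    return config


resolved = resolve({
    "provider": "bigcode",
    "model": "starcoder2-7b@v1",
    "override": {
        "serverSideConfig": {"type": "openaicompatible", "apiModel": "starcoder2-7b-custom!"},
    },
})
print(resolved["serverSideConfig"]["apiModel"])
```

In this sketch the admin's `override` replaces only the keys it names, so untouched defaults (here the placeholder `contextWindow`) carry over unchanged.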

Importance

I'm hoping to ship this to a few customers ASAP.

Unblocks customer https://linear.app/sourcegraph/issue/PRIME-447
Fixes https://linear.app/sourcegraph/issue/PRIME-454 (you can see some alternatives I considered here before settling on this approach.)

Test plan

Manually tested for now. Regression tests will come in the near future and are being tracked on Linear.

Changelog

Improved configuration functionality for Cody Enterprise with self-hosted models.

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>

chrsmith

Adding an approval to unblock you, but I will need to take a closer look later to internalize what you are saying.

However, since the "preferred model" and "self-hosted models" settings aren't on the critical path for LLM model selection or any other "standard" config code path, I'm not too worried about breaking changes or regressions here.


replace modelOverridesRecommendedSettings with selfHostedModels

b3daf82

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>

emidoots requested review from a team and chrsmith July 30, 2024 22:26

cla-bot bot added the cla-signed label Jul 30, 2024

Merge branch 'main' into sg/selfhosted

ee2c613

chrsmith approved these changes Jul 30, 2024


sg lint --fix=format

061b1c5

Signed-off-by: Stephen Gutekanst <stephen@sourcegraph.com>

emidoots merged commit 544d261 into main Jul 31, 2024

emidoots deleted the sg/selfhosted branch July 31, 2024 03:41
