Skip to content

Conversation

@jh-block
Copy link
Collaborator

@jh-block jh-block commented Jan 21, 2026

Summary

The list of supported models was being fetched from Databricks every time the provider was initialised, taking around 1.6s on my machine. This shows up to the user as a 1.6s delay before the first model response can start to be received.

We only needed this list to determine if our preferred "fast model", used for compacting and for generating the titles of sessions (currently Gemini 2.5 Flash) was supported. If it isn't supported, we fall back to the main model. But we already have a fallback to the main model implemented if calling the fast model fails (e.g. because it's not found), so there's no need to check upfront.

This PR removes the fetch of supported models and assumes the fast model is available, relying on the already implemented fallback to the main model if it turns out not to be. On my machine, it reduces the time to first token after sending the first prompt from 1.6s to around 200ms.

I also updated the Playwright tests (which were failing due to drift between the actual UI structure and what the tests expected) and added some performance tests which were used to find the issue being fixed in the PR.

It may be easier to review by commit.

Type of Change

  • Feature
  • Bug fix
  • Refactor / Code quality
  • Performance improvement
  • Documentation
  • Tests
  • Security fix
  • Build / Release
  • Other (specify below)

AI Assistance

  • This PR was created or reviewed with AI assistance (Claude Code)

Testing

I manually tested to verify the performance improvement.

@jh-block jh-block force-pushed the jhugo/faster-databricks-init branch from c848b8c to c67af95 Compare January 22, 2026 08:42
@jh-block jh-block marked this pull request as ready for review January 22, 2026 09:35
@jh-block jh-block requested a review from DOsinga January 22, 2026 09:35
@zanesq
Copy link
Collaborator

zanesq commented Jan 22, 2026

Thanks! I tested the desktop app locally and its working well and does seem faster to load a chat.

I also tested the e2e changes and there are still failing tests for me but overall its an improvement so I don't think it should block this PR and we can follow up with fixing the remaining tests.

A major change is it launches the electron app from scratch with each test whereas before it would use the same running electron app so that has tradeoffs like taking longer to run the entire suite. Ultimately I think we should pair these tests down to only what we need for a release smoke test from our checklist so we can automate them on releases next. The rest of the functionality can be covered by our vitest unit/functional tests.

@jh-block
Copy link
Collaborator Author

A major change is it launches the electron app from scratch with each test whereas before it would use the same running electron app so that has tradeoffs like taking longer to run the entire suite. Ultimately I think we should pair these tests down to only what we need for a release smoke test from our checklist so we can automate them on releases next. The rest of the functionality can be covered by our vitest unit/functional tests.

Yeah, one advantage of the isolation is that we could potentially allow playwright to run the tests in parallel without the tests interfering with each other. It is necessary for the performance test to have a fresh instance to give more repeatable results (so that the test is not sometimes testing a cold launch and sometimes a warm instance) but we could allow the other tests to reuse instances if desired.

Copy link
Collaborator

@DOsinga DOsinga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd give the LLM generated comments a look over and drop some but otherwise looks good

try {
// Assign a unique debug port for this test to enable parallel execution
// Base port 9222, offset by worker index * 100 + parallel slot
const debugPort = 9222 + (testInfo.parallelIndex * 10);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we not just find a free port here? we're in node land, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're not the ones binding the port, that's happening in electron after we spawn it (but we need to choose it here so that we know the port to connect to later).

so, even if we wrote the code here to find a free port, there'd be a race condition as someone could bind it before electron starts up and devtools binds it. I'm not sure it would be overall better.

I wanted to use Playwright to take some performance measurements as part
of making the time to first token faster on first use. The existing
tests were stale so this updates them to work with the UI as it is
currently. I also extracted the setup and teardown into a fixture.
This takes >1s and we are only using it to determine if the "fast model"
used for compacting and for generating the titles for sessions
(currently Gemini 2.5 Flash) is available. We can instead assume that
it's available, because there is already a fallback behaviour
implemented if we try to use it and it's not available.
@jh-block jh-block force-pushed the jhugo/faster-databricks-init branch from c67af95 to a90e0b0 Compare January 26, 2026 14:18
@jh-block jh-block merged commit 82ba07f into main Jan 26, 2026
19 checks passed
@jh-block jh-block deleted the jhugo/faster-databricks-init branch January 26, 2026 15:51
zanesq added a commit that referenced this pull request Jan 26, 2026
* origin/main:
  fix: dispatch ADD_ACTIVE_SESSION event before navigating from "View All" (#6679)
  Speed up Databricks provider init by removing fetch of supported models (#6616)
  fix: correct typos in documentation and Justfile (#6686)
  docs: frameDomains and baseUriDomains for mcp apps (#6684)
  docs: add Remotion video creation tutorial (#6675)
  docs: export recipe and copy yaml (#6680)
  Test against fastmcp (#6666)
  docs: mid-session changes (#6672)
  Fix MCP elicitation deadlock and improve UX (#6650)
  chore: upgrade to rmcp 0.14.0 (#6674)
  [docs] add MCP-UI to MCP Apps blog (#6664)
  ACP get working dir from args.cwd (#6653)
  Optimise load config in UI (#6662)

# Conflicts:
#	ui/desktop/src/components/Layout/AppLayout.tsx
zanesq added a commit that referenced this pull request Jan 27, 2026
…upport

* origin/main: (79 commits)
  fix[format/openai]: return error on empty msg. (#6511)
  Fix: ElevenLabs API Key Not Persisting (#6557)
  Logging uplift for model training purposes (command injection model) [Small change] (#6330)
  fix(goose): only send agent-session-id when a session exists (#6657)
  BERT-based command injection detection in tool calls (#6599)
  chore: [CONTRIBUTING.md] add Hermit to instructions (#6518)
  fix: update Gemini context limits (#6536)
  Document r slash command (#6724)
  Upgrade GitHub Actions to latest versions (#6700)
  fix: Manual compaction does not update context window. (#6682)
  Removed the Acceptable Usage Policy (#6204)
  Document spellcheck toggle (#6721)
  fix: docs workflow cleanup and prevent cancellations (#6713)
  Docs: file bug directly (#6718)
  fix: dispatch ADD_ACTIVE_SESSION event before navigating from "View All" (#6679)
  Speed up Databricks provider init by removing fetch of supported models (#6616)
  fix: correct typos in documentation and Justfile (#6686)
  docs: frameDomains and baseUriDomains for mcp apps (#6684)
  docs: add Remotion video creation tutorial (#6675)
  docs: export recipe and copy yaml (#6680)
  ...

# Conflicts:
#	ui/desktop/src/hooks/useChatStream.ts
katzdave added a commit that referenced this pull request Jan 27, 2026
…ovider

* 'main' of github.com:block/goose:
  fix: Manual compaction does not update context window. (#6682)
  Removed the Acceptable Usage Policy (#6204)
  Document spellcheck toggle (#6721)
  fix: docs workflow cleanup and prevent cancellations (#6713)
  Docs: file bug directly (#6718)
  fix: dispatch ADD_ACTIVE_SESSION event before navigating from "View All" (#6679)
  Speed up Databricks provider init by removing fetch of supported models (#6616)
  fix: correct typos in documentation and Justfile (#6686)
  docs: frameDomains and baseUriDomains for mcp apps (#6684)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants