Restrict short queries at API level #11485

Merged
cdrini merged 2 commits into internetarchive:master from cdrini:perf/solr-short-queries
Nov 26, 2025

Conversation

@cdrini
Collaborator

@cdrini cdrini commented Nov 18, 2025

Closes #11271. I was noticing a good number of queries like "The", "THE", "a", or "*" in our Solr logs, coming from the search.json API, taking 10s+ and hitting the timeout.

Technical

Testing

Put on prod before the second commit, and have been monitoring performance. No longer seeing these queries in the logs!

NOTE: This will prevent some real use cases, like searching for Stephen King's It via the API, even though that query doesn't time out.

Screenshot

Stakeholders

Copilot AI review requested due to automatic review settings November 18, 2025 19:52
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds API-level validation that rejects short search queries (fewer than 3 characters) and blocks specific queries such as "the" before they reach the search backend. This reduces load on the search infrastructure by stopping ineffective or resource-intensive queries early.

  • Query validation is added to both the legacy web.py endpoint (/search) and the FastAPI endpoint (/search.json)
  • Short queries (< 3 characters) and blocked queries return HTTP 422 with descriptive error messages
  • Frontend autocomplete logic is updated to use case-insensitive comparison for blocked queries
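
The checks listed above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `MIN_QUERY_LENGTH`, `BLOCKED_QUERIES`, and `validate_query` are hypothetical, the error messages are made up, the blocklist contains only "the" because that is the only blocked query named here, and stripping surrounding whitespace before checking is an assumption.

```python
# Hypothetical sketch of the validation described above; not the PR's actual code.

# Queries shorter than this are rejected (the overview says "< 3 characters").
MIN_QUERY_LENGTH = 3

# Assumption: only "the" is listed, since it is the only blocked query named
# in this PR. The real blocklist may contain more entries.
BLOCKED_QUERIES = frozenset({"the"})


def validate_query(q: str) -> tuple[int, str]:
    """Return (status_code, message): 422 with a reason, or 200 with ''."""
    # Assumption: surrounding whitespace is ignored when checking the query.
    normalized = q.strip()

    if len(normalized) < MIN_QUERY_LENGTH:
        return 422, f"Query must be at least {MIN_QUERY_LENGTH} characters long"

    # Case-insensitive comparison, so "The" and "THE" are blocked like "the".
    if normalized.lower() in BLOCKED_QUERIES:
        return 422, f"Query {normalized!r} is not allowed"

    return 200, ""


if __name__ == "__main__":
    for query in ["The", "THE", "a", "*", "It", "The Hobbit"]:
        print(repr(query), validate_query(query))
```

Under these assumptions "The", "THE", "a", and "*" are all rejected, and so is "It", which is the trade-off called out in the NOTE in the PR description.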

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| openlibrary/plugins/worksearch/code.py | Adds query validation to the legacy search API endpoint, checking for query length and blocked queries, and moves Content-Type header setting before validation |
| openlibrary/fastapi/search.py | Adds query validation to the FastAPI search endpoint with the same checks for query length and blocked queries |
| openlibrary/plugins/openlibrary/js/SearchBar.js | Fixes the blocked query check to use case-insensitive comparison, matching the backend behavior |

@cdrini cdrini added the Patch Deployed This PR has been deployed to production independently, outside of the regular deploy cycle. label Nov 18, 2025
@cdrini cdrini marked this pull request as draft November 18, 2025 21:01
@cdrini cdrini marked this pull request as ready for review November 19, 2025 00:04
@cdrini cdrini force-pushed the perf/solr-short-queries branch from 37c5810 to 2ac1305 Compare November 19, 2025 00:07
@cdrini cdrini added the Theme: Performance Issues related to UI or Server performance. [managed] label Nov 19, 2025
@cdrini cdrini force-pushed the perf/solr-short-queries branch from 4945eac to fbb34f8 Compare November 19, 2025 00:21
@cdrini cdrini assigned RayBB and unassigned mekarpeles Nov 26, 2025
@github-project-automation github-project-automation bot moved this to Waiting Review/Merge from Staff in Ray's Project Nov 26, 2025
Collaborator

@RayBB RayBB left a comment

This is exactly what we want. I like that we give a nice error too!

@cdrini cdrini merged commit 9d34bc8 into internetarchive:master Nov 26, 2025
5 checks passed
@github-project-automation github-project-automation bot moved this from Waiting Review/Merge from Staff to Done in Ray's Project Nov 26, 2025
@cdrini cdrini deleted the perf/solr-short-queries branch November 26, 2025 18:57
Labels

Patch Deployed This PR has been deployed to production independently, outside of the regular deploy cycle.
Theme: Performance Issues related to UI or Server performance. [managed]

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Fail-fast for single-letter and stopword queries in search endpoint

4 participants