Conversation
There was a problem hiding this comment.
Pull request overview
This PR reduces the aggressiveness of rate limits in response to issue #11611 by restructuring the nginx rate limiting logic and increasing the global crawler rate limit threshold.
- Increases the global non-identifying crawler rate limit from 17r/s to 20r/s
- Refactors referer checking logic by renaming variables for clarity and splitting path patterns into strict requirements vs. probable requirements
- Moves certain paths (subjects, authors, search) from strict referer requirements to a new "probably requires referer" category
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| docker/web_nginx.conf | Renames nginx map variables (is_empty_referer → no_referer, is_logged_out → logged_out), removes the combined is_sus_referer map, and creates new probably_requires_referer map with a subset of previously restricted paths |
| docker/nginx.conf | Increases the rate limit for non-identifying crawlers from 17r/s to 20r/s |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| map $request_uri $probably_requires_referer { | ||
| "~*^/subjects/" 1; | ||
| "~*^/authors/OL" 1; | ||
| "~*^/search\?" 1; | ||
| } |
There was a problem hiding this comment.
The new map probably_requires_referer is missing a default value. Unlike the other maps which explicitly set default 0;, this map will return an empty string for non-matching URIs instead of 0. While this may work depending on how it's used in the external tagger.js, it's inconsistent with the patterns established in the other map definitions (requires_referer, no_referer, logged_out, is_sus_user_agent, etc.) which all explicitly define their default values.
In response to #11611
Technical
Testing
Screenshot
Stakeholders