Tweak tagger/nginx rate limit logic #11605

Merged
jimchamp merged 4 commits into internetarchive:master from cdrini:fix/counter-link-improvements
Dec 18, 2025

@cdrini (Collaborator) commented Dec 18, 2025

See also https://github.com/internetarchive/olsystem/pull/292

Technical

Testing

Screenshot

Stakeholders

@cdrini added the "Patch Deployed" label (This PR has been deployed to production independently, outside of the regular deploy cycle.) Dec 18, 2025
@cdrini marked this pull request as ready for review December 18, 2025 17:03
Copilot AI review requested due to automatic review settings December 18, 2025 17:03
Copilot AI (Contributor) left a comment

Pull request overview

This PR refines the nginx rate limiting and request filtering logic to better detect and handle suspicious traffic patterns, with coordinated changes to the tagger system in the related olsystem repository.

Key changes:

  • Enhanced referer detection to catch clients that send a literal "-" as the referer value
  • Expanded rate limiting to include suspicious IPs in addition to non-identifying crawlers
  • Added dedicated cover image rate limits for non-identifying crawlers to prevent abuse while allowing legitimate batch cover loading
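
The first two changes could be sketched with nginx `map` blocks along the following lines. This is a minimal illustration, not the configuration from the PR: the variable names `$is_empty_referer`, `$is_sus_ip`, `$is_non_identifying_crawler`, and `$crawler_limit_key`, and the zone name `global_crawler_limit`, are assumptions. The review mentions an `is_sus_ip.conf` include, but its contents are not shown here, so a placeholder `geo` block stands in for it.

```nginx
# Sketch for the http {} context. All variable and zone names are hypothetical.

# Treat a missing Referer header and a literal "-" the same way.
map $http_referer $is_empty_referer {
    default 0;
    ""      1;
    "-"     1;
}

# Placeholder for the suspicious-IP list (stands in for is_sus_ip.conf).
geo $is_sus_ip {
    default 0;
    # 192.0.2.0/24 1;  # documentation range, example only
}

# Placeholder for user-agent based crawler detection.
map $http_user_agent $is_non_identifying_crawler {
    default 0;
}

# Combine both flags into one rate limit key.
map "$is_non_identifying_crawler$is_sus_ip" $crawler_limit_key {
    default "";        # empty key: the request is not counted by the zone
    ~1      "global";  # either flag set: all such traffic shares one bucket
}

limit_req_zone $crawler_limit_key zone=global_crawler_limit:1m rate=17r/s;
```

nginx does not account requests whose key is empty, so in this sketch ordinary traffic bypasses the zone while flagged traffic shares a single 17r/s budget. Whether the PR uses one shared bucket or a per-client key is not stated in the review.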

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| docker/web_nginx.conf | Enhanced referer detection to recognize "-" as empty referer; expanded paths requiring referer to include subjects, authors, and search pages; removed direct $is_sus_referer enforcement (delegated to tagger.js) |
| docker/nginx.conf | Updated file extension from .map to .conf for consistency; added is_sus_ip.conf include; extended crawler rate limit logic to incorporate suspicious IPs; increased global crawler rate limit from 15r/s to 17r/s; added dedicated 150r/s cover rate limit for non-identifying crawlers |
| docker/covers_nginx.conf | Applied new global_crawler_cover_limit zone to cover endpoints |
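
As a rough sketch of how a dedicated cover zone might be declared and then applied to a cover endpoint: only the zone name `global_crawler_cover_limit` and the 150r/s rate come from the review. The key variable `$crawler_cover_key`, the detection pattern, the port, the `/b/` location, and the 429 status are assumptions made for illustration.

```nginx
# Sketch for the http {} context. Only the zone name and the 150r/s rate come
# from the review; the key variable, pattern, and location are hypothetical.

# Non-empty only for clients treated as non-identifying crawlers.
map $http_user_agent $crawler_cover_key {
    default "";
    # "~*example-crawler" "global";  # hypothetical detection rule
}

limit_req_zone $crawler_cover_key zone=global_crawler_cover_limit:1m rate=150r/s;

server {
    listen 8081;  # hypothetical port

    location /b/ {  # hypothetical cover endpoint
        limit_req zone=global_crawler_cover_limit;
        limit_req_status 429;  # assumption; nginx defaults to 503
    }
}
```

A separate, higher-rate zone for covers fits the stated goal: a page that loads many cover images at once would quickly exhaust a 17r/s budget, while 150r/s still caps sustained scraping.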

@jimchamp merged commit d76dd03 into internetarchive:master Dec 18, 2025
4 checks passed