Add Long-Running Scan Infrastructure for Async External Scanners#1565

Merged
netomi merged 3 commits into eclipse:security-improvements from yeeth-security:yeeth/security-improvements
Jan 27, 2026

@janbro janbro commented Jan 26, 2026

Add Long-Running Scan Infrastructure for Async External Scanners

#1396

This PR extends the scan administration capability with infrastructure for asynchronous, long-running external scanners, enabling parallel scanning without blocking the publish API.

Architecture Overview

PUBLISH ──▶ QUICK CHECKS ──▶ ASYNC SCANS ──▶ ACTIVATE or QUARANTINE
                │                 │
                ▼                 ▼
          - Name squatting   - Antivirus
          - Blocklist check  - Pattern rules
          - Secret detection - Other (optional)

Backend / Scanning Infrastructure

Scanner Framework

  • Scanner / RemoteScanner: Abstraction for HTTP-based external scanners with configurable request/response templates.
  • ScannerRegistry: Holds all registered scanner instances at runtime.
  • RemoteScannerRegistrar: Registers scanners from YAML configuration at startup.
  • RemoteScannerProperties: Configuration binding for scanner definitions.

HTTP Infrastructure

  • HttpTemplateEngine: Builds HTTP requests from templates with variable substitution.
  • HttpResponseExtractor: Extracts job IDs, status, and threats from JSON responses via JSONPath.
  • HttpAuthHandler: Supports API key, Bearer, Basic, and OAuth2 client credentials authentication.
  • HttpClientExecutor: Configurable Apache HttpClient with per-scanner connection pools.
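The variable substitution step can be illustrated with a minimal sketch. The PR does not show the template syntax, so the `{{name}}` placeholder form, the `TemplateSketch` class, and the `render` method are all hypothetical and are not the `HttpTemplateEngine` API.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSketch {
    // Assumed placeholder syntax: {{name}}. The real engine may use a different form.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Replaces every {{name}} in the template with the matching variable value.
    static String render(String template, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = variables.get(name);
            if (value == null) {
                // Fail loudly instead of sending a half-filled request to a scanner.
                throw new IllegalArgumentException("Unbound template variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in values from being treated as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // scanner.example is a placeholder host, not a real scanner endpoint.
        System.out.println(render("https://scanner.example/jobs/{{jobId}}/status",
                Map.of("jobId", "abc123")));
    }
}
```

Rejecting unbound variables is a design choice of this sketch; a real engine might instead leave them untouched or substitute an empty string.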

Job Execution (JobRunr)

  • ScannerInvocationHandler: Executes scanner invocations, handles sync/async responses.
  • ScannerPollHandler: Polls async scanners with configurable backoff until completion.
  • ScannerInvocationRequest / ScannerPollRequest: JobRunr job request DTOs.
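As an illustration of polling with backoff, the sketch below computes the delay before each poll attempt. The PR only says the backoff is configurable; exponential doubling with a cap is an assumption, and `PollBackoffSketch` and `delayFor` are hypothetical names, not part of `ScannerPollHandler`.

```java
import java.time.Duration;

final class PollBackoffSketch {
    // Returns the wait before poll number `attempt` (0-based):
    // the initial delay doubled once per attempt, capped at `max`.
    // Exponential growth is an assumption; the PR does not name the strategy.
    static Duration delayFor(int attempt, Duration initial, Duration max) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        long delayMs = initial.toMillis();
        long capMs = max.toMillis();
        // Stop doubling once the cap is reached so the value cannot grow unbounded.
        for (int i = 0; i < attempt && delayMs < capMs; i++) {
            delayMs *= 2;
        }
        return Duration.ofMillis(Math.min(delayMs, capMs));
    }

    public static void main(String[] args) {
        Duration initial = Duration.ofSeconds(1);
        Duration max = Duration.ofSeconds(30);
        for (int attempt = 0; attempt < 7; attempt++) {
            System.out.println("attempt " + attempt + " -> " + delayFor(attempt, initial, max));
        }
    }
}
```

In a job-scheduler setting, the returned duration would typically be used to schedule the next poll job rather than to sleep a worker thread.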

Completion & Recovery

  • ExtensionScanCompletionService: Monitors job completion, activates or quarantines extensions based on threat analysis.
  • ExtensionScanJobRecoveryService: Recovers stuck scans from server crashes, network failures, or race conditions.
  • ScannerFileService: Manages temporary file downloads with automatic cleanup.
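The activate-or-quarantine decision can be sketched as follows, based on the rule stated in this PR that non-enforced threats only produce warnings. The `CompletionSketch`, `Threat`, and `Outcome` types are hypothetical stand-ins, not the actual `ExtensionScanCompletionService` or `ExtensionThreat` classes.

```java
import java.util.List;

final class CompletionSketch {
    enum Outcome { ACTIVATE, QUARANTINE }

    // Hypothetical stand-in for a recorded threat; only the enforced flag matters here.
    static final class Threat {
        final String name;
        final boolean enforced;

        Threat(String name, boolean enforced) {
            this.name = name;
            this.enforced = enforced;
        }
    }

    // Quarantine if any threat is enforced; otherwise activate.
    // Non-enforced threats are treated as warnings and do not block activation.
    static Outcome decide(List<Threat> threats) {
        for (Threat threat : threats) {
            if (threat.enforced) {
                return Outcome.QUARANTINE;
            }
        }
        return Outcome.ACTIVATE;
    }

    public static void main(String[] args) {
        List<Threat> threats = List.of(new Threat("suspicious-pattern", false));
        System.out.println(decide(threats));
    }
}
```

This sketch assumes all scanner jobs have already finished; how failed or timed-out jobs affect the outcome is not described in the PR and is left out.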

Publish Checks (Synchronous)

  • PublishCheck / PublishCheckRunner: Interface and orchestrator for pre-scan validation.
  • SecretCheckService: Detects hardcoded secrets using Aho-Corasick + regex. (Renamed from "secret scanning" to "secret detection" to distinguish publish checks from long-running scans.)
  • BlocklistCheckService: Checks file hashes against admin-managed blocklist.
  • SimilarityCheckService: Detects name squatting via Levenshtein distance.
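For readers unfamiliar with the name-squatting check, here is a minimal sketch of Levenshtein distance applied to extension names. The `SimilaritySketch` class, the `looksLikeSquat` helper, and the distance threshold are assumptions for illustration; the PR does not state how `SimilarityCheckService` normalizes names or which threshold it uses.

```java
import java.util.Locale;

final class SimilaritySketch {
    // Classic dynamic-programming Levenshtein distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Flags names that are close to, but not identical to, an existing name.
    // maxDistance is a hypothetical tuning parameter.
    static boolean looksLikeSquat(String candidate, String existing, int maxDistance) {
        int distance = levenshtein(candidate.toLowerCase(Locale.ROOT), existing.toLowerCase(Locale.ROOT));
        return distance > 0 && distance <= maxDistance;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
        System.out.println(looksLikeSquat("pyth0n", "python", 2));
    }
}
```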

Domain Model

New Entities

  • ScannerJob: Tracks individual scanner job state (PENDING → PROCESSING → COMPLETE/FAILED).
  • ScanCheckResult: Records results from synchronous publish checks.
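The job lifecycle above can be modeled as a small state machine. This is a sketch only: `JobStatusSketch` is a hypothetical enum, and it encodes just the transitions named in this description (PENDING → PROCESSING → COMPLETE/FAILED). Whether the real `ScannerJob` allows other transitions, such as retries from FAILED, is not stated here.

```java
import java.util.EnumSet;
import java.util.Set;

enum JobStatusSketch {
    PENDING, PROCESSING, COMPLETE, FAILED;

    // Allowed successor states, taken only from the lifecycle named in the PR text.
    Set<JobStatusSketch> next() {
        switch (this) {
            case PENDING:
                return EnumSet.of(PROCESSING);
            case PROCESSING:
                return EnumSet.of(COMPLETE, FAILED);
            default:
                return EnumSet.noneOf(JobStatusSketch.class);
        }
    }

    boolean canTransitionTo(JobStatusSketch target) {
        return next().contains(target);
    }

    // COMPLETE and FAILED have no successors in this sketch.
    boolean isTerminal() {
        return next().isEmpty();
    }

    public static void main(String[] args) {
        for (JobStatusSketch status : values()) {
            System.out.println(status + " -> " + status.next());
        }
    }
}
```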

Entity Updates

  • ExtensionScan: Added startedAt, completedAt timestamps for lifecycle tracking.
  • ExtensionThreat: Added jobId foreign key, enforced flag for threat classification.

Database Migration

New schema for scanner job tracking:

  • scanner_job: Job state, scanner type, external job ID, file hashes, retry count.
  • scan_check_result: Results from synchronous validation checks.
  • Indexes for efficient filtering by scan ID, status, and scanner type.

Migration: V1_60__Scanning_Infrastructure.sql


Configuration

Scanners are defined declaratively in application.yml. The configuration supports:

  • Sync and async (polling) scanners
  • Per-scanner HTTP client configuration
  • Multiple authentication methods
  • Configurable threat extraction via JSONPath

Note

Specific scanner definitions will be provided through agreed channels and are not included in this PR.


Web UI Updates

Review Status Display

  • Extensions show review status in user's extension list: "Under review", "Published", "Rejected".
  • Status reflects admin decisions for quarantined extensions.

Scan Card Updates

  • Displays scanner job results with duration and threat counts.
  • Shows "not enforced" badge for warning-only threats.

Key Features

| Feature | Description |
| --- | --- |
| Parallel scanning | Multiple scanners execute simultaneously via JobRunr workers |
| Enforced vs warnings | Non-enforced threats log warnings but allow activation |
| Automatic recovery | Stuck scans recovered every x minutes |
| Configurable scanners | YAML-driven scanner definitions with no code changes (server restart required) |
| Admin allowlist/blocklist | File-level decisions propagate to future scans |

Scope Discipline

  • All scanning infrastructure lives under server/src/main/java/org/eclipse/openvsx/scanning/**.
  • Changes to the existing publish flow (PublishExtensionVersionHandler).
  • Web UI changes are additive and live under webui/src/**.

@chrisguindon chrisguindon requested a review from netomi January 26, 2026 19:46
@netomi netomi merged commit a59e52b into eclipse:security-improvements Jan 27, 2026
1 check passed
netomi pushed a commit that referenced this pull request Jan 29, 2026
* fix broken extension icons on scan cards

* Fix line endings

* Add long running scans. Refactored publish checks

---------

Co-authored-by: Alejandro Rivera <alejandro.rivera1996@gmail.com>
netomi pushed a commit that referenced this pull request Feb 5, 2026
janbro added a commit to yeeth-security/openvsx that referenced this pull request Feb 11, 2026
…ipse#1565)