fix: CSS files not being indexed (#7072) #7375
@RomneyDa Initially, this appeared to be a retrieval issue: CSS files seemed to be in the database but were not being returned by @codebase queries. In fact, CSS files were never being chunked or indexed at all. The fix routes CSS, HTML, JSON and similar non-code files to basicChunker instead of codeChunker. These files have structure but no code constructs, so they need simple line-based chunking rather than AST-based chunking. Could you review it and let me know if any changes are needed?
RomneyDa left a comment
This seems right. These file types can't be trivially removed from supportedLanguages, since it is also used for autocomplete, etc.
🎉 This PR is included in version 1.10.0 🎉 The release is available on: Your semantic-release bot 📦🚀
🎉 This PR is included in version 1.11.0 🎉 The release is available on: Your semantic-release bot 📦🚀
Description
CSS files were not being indexed because codeChunker expects code structures (classes/functions) that don't exist in CSS. The chunker would return zero chunks, preventing CSS files from being indexed.
This fix routes CSS, HTML, JSON and similar non-code files to basicChunker instead of codeChunker, ensuring they are properly indexed while maintaining intelligent chunking for actual code files.
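The routing described above can be sketched roughly as follows. This is a minimal illustration, not the code changed in this PR: `NON_CODE_EXTENSIONS`, `getExtension`, `basicChunkerSketch`, `codeChunkerSketch` and `chunkFile` are hypothetical names, the extension list only contains the three types named in the description, and the "code chunker" here is a regex stand-in for the real AST-based one.

```typescript
// Hypothetical sketch: all names below are illustrative, not the project's API.
interface Chunk {
  content: string;
  startLine: number; // 0-based, inclusive
  endLine: number; // 0-based, inclusive
}

// Assumed set of extensions that have structure but no classes/functions.
const NON_CODE_EXTENSIONS = new Set(["css", "html", "json"]);

function getExtension(filepath: string): string {
  const base = filepath.split(/[\\/]/).pop() ?? "";
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : "";
}

// Line-based chunking: group consecutive lines until a character budget is hit.
// A single line longer than the budget simply becomes its own oversized chunk.
function basicChunkerSketch(contents: string, maxChunkChars: number): Chunk[] {
  const chunks: Chunk[] = [];
  const lines = contents.split("\n");
  let buffer: string[] = [];
  let bufferChars = 0;
  let startLine = 0;

  lines.forEach((line, i) => {
    // Flush the buffer before adding a line that would exceed the budget.
    if (buffer.length > 0 && bufferChars + line.length + 1 > maxChunkChars) {
      chunks.push({ content: buffer.join("\n"), startLine, endLine: i - 1 });
      buffer = [];
      bufferChars = 0;
      startLine = i;
    }
    buffer.push(line);
    bufferChars += line.length + 1;
  });

  if (buffer.join("\n").trim() !== "") {
    chunks.push({ content: buffer.join("\n"), startLine, endLine: lines.length - 1 });
  }
  return chunks;
}

// Stand-in for an AST-based chunker: it only emits chunks that start at a
// function or class declaration, so a CSS file yields zero chunks.
function codeChunkerSketch(contents: string): Chunk[] {
  const lines = contents.split("\n");
  const starts = lines
    .map((line, i) => (/^\s*(export\s+)?(async\s+)?(function|class)\b/.test(line) ? i : -1))
    .filter((i) => i >= 0);
  return starts.map((start, k) => {
    const end = (starts[k + 1] ?? lines.length) - 1;
    return { content: lines.slice(start, end + 1).join("\n"), startLine: start, endLine: end };
  });
}

// Router: non-code files go to the line-based chunker, everything else
// keeps using the code-aware chunker.
function chunkFile(filepath: string, contents: string, maxChunkChars = 500): Chunk[] {
  if (contents.trim() === "") return [];
  if (NON_CODE_EXTENSIONS.has(getExtension(filepath))) {
    return basicChunkerSketch(contents, maxChunkChars);
  }
  return codeChunkerSketch(contents);
}
```

In this sketch, calling `codeChunkerSketch` directly on a stylesheet returns an empty array, which mirrors the zero-chunk behaviour described above, while `chunkFile("styles/site.css", css)` returns at least one chunk for any non-empty file.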
Root Causes Found
Screen recording or screenshot
Tests
• Manually verified CSS files are now indexed and retrievable via @codebase
• Confirmed all existing chunk tests pass (10/10)
• TypeScript compilation succeeds with no errors
• Prettier formatting applied
Summary by cubic
Fixes CSS files being skipped during indexing by sending non-code files to the basic chunker, so they produce chunks and can be retrieved. Code files still use the code chunker.