Add embedding generation support #131
Pull request overview
This PR adds comprehensive embedding generation support to the AI Client library, enabling users to generate embeddings for documents/text using any compatible AI provider.
Key Changes:
- Introduces new DTOs (`Embedding`, `EmbeddingResult`, `EmbeddingOperation`) for representing embedding data structures
- Adds embedding generation interfaces and an OpenAI implementation, with model detection via the metadata directory
- Extends `PromptBuilder` and `AiClient` with embedding-specific methods like `withEmbeddingInputs()`, `generateEmbeddingsResult()`, and `generateEmbeddings()`
- Updates the CLI with a `--capability=embeddings` flag and embedding-specific output formats
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/unit/Results/DTO/EmbeddingResultTest.php | Comprehensive test coverage for EmbeddingResult DTO including validation, helpers, and serialization |
| tests/unit/Providers/ProviderRegistryTest.php | Adds test for discovering embedding-capable models in the registry |
| tests/unit/Providers/Models/DTO/ModelRequirementsTest.php | Tests that embeddings don't incorrectly require chat history capability |
| tests/unit/ProviderImplementations/OpenAi/OpenAiEmbeddingModelTest.php | Tests OpenAI embedding model implementation with mocked HTTP responses |
| tests/unit/Operations/DTO/EmbeddingOperationTest.php | Tests for embedding operation lifecycle and state management |
| tests/unit/Embeddings/DTO/EmbeddingTest.php | Tests embedding vector validation, normalization, and serialization |
| tests/unit/Builders/PromptBuilderTest.php | Tests embedding generation methods including input handling and fallback behavior |
| tests/unit/AiClientTest.php | Tests static helper methods for embedding generation |
| tests/traits/MockModelCreationTrait.php | Adds helper methods for creating mock embedding models and results in tests |
| tests/mocks/MockProvider.php | Registers mock embedding model for testing |
| src/Results/DTO/EmbeddingResult.php | New DTO representing embedding generation results with vector helper methods |
| src/Providers/Models/EmbeddingGeneration/Contracts/EmbeddingGenerationOperationModelInterface.php | Interface for models supporting asynchronous embedding operations |
| src/Providers/Models/EmbeddingGeneration/Contracts/EmbeddingGenerationModelInterface.php | Interface for models supporting synchronous embedding generation |
| src/Providers/Models/DTO/ModelRequirements.php | Prevents embeddings from requiring chat history when multiple messages provided |
| src/Providers/Models/Contracts/WithEmbeddingOperationsInterface.php | Interface for retrieving embedding operations by ID |
| src/ProviderImplementations/OpenAi/OpenAiProvider.php | Adds factory logic to instantiate OpenAI embedding models |
| src/ProviderImplementations/OpenAi/OpenAiModelMetadataDirectory.php | Detects and configures embedding models from OpenAI's model list |
| src/ProviderImplementations/OpenAi/OpenAiEmbeddingModel.php | Complete OpenAI embeddings API implementation with message-to-text conversion |
| src/Operations/DTO/EmbeddingOperation.php | DTO for tracking long-running embedding operations with state management |
| src/Embeddings/DTO/Embedding.php | Core DTO representing a single embedding vector with validation |
| src/Builders/PromptBuilder.php | Adds embedding input handling, capability resolution, and generation methods |
| src/AiClient.php | Adds static helper methods for embedding generation workflows |
| cli.php | Extends CLI with embedding capability support and output format options |
| README.md | Documents embedding generation API and CLI usage patterns |
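
The table says `src/Embeddings/DTO/Embedding.php` validates a single embedding vector and that `EmbeddingResult` carries vector helper methods, but the page does not show either class. As a rough illustration of what validation, normalization, and a similarity helper could look like, here is a minimal sketch. It assumes PHP 7.4 or later; the class name `SketchEmbedding` and every method on it are hypothetical and are not taken from this PR.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for an embedding DTO. Not the class from this PR.
 */
final class SketchEmbedding
{
    /** @var float[] Vector components, stored as floats. */
    private array $vector;

    /**
     * @param array $vector Non-empty list of int or float components.
     */
    public function __construct(array $vector)
    {
        if ($vector === []) {
            throw new InvalidArgumentException('Embedding vector must not be empty.');
        }
        $floats = [];
        foreach ($vector as $value) {
            // Reject strings, nulls, nested arrays, and so on.
            if (!is_int($value) && !is_float($value)) {
                throw new InvalidArgumentException('Embedding values must be int or float.');
            }
            $floats[] = (float) $value;
        }
        $this->vector = $floats;
    }

    /** @return float[] */
    public function getVector(): array
    {
        return $this->vector;
    }

    public function getDimension(): int
    {
        return count($this->vector);
    }

    /** Euclidean (L2) length of the vector. */
    public function magnitude(): float
    {
        $sum = 0.0;
        foreach ($this->vector as $v) {
            $sum += $v * $v;
        }
        return sqrt($sum);
    }

    /** Returns a new instance scaled to unit length. */
    public function normalized(): self
    {
        $magnitude = $this->magnitude();
        if ($magnitude === 0.0) {
            throw new LogicException('Cannot normalize a zero vector.');
        }
        $scaled = [];
        foreach ($this->vector as $v) {
            $scaled[] = $v / $magnitude;
        }
        return new self($scaled);
    }

    /** Cosine of the angle between this vector and another of equal dimension. */
    public function cosineSimilarity(self $other): float
    {
        if ($this->getDimension() !== $other->getDimension()) {
            throw new InvalidArgumentException('Vectors must have the same dimension.');
        }
        $denominator = $this->magnitude() * $other->magnitude();
        if ($denominator === 0.0) {
            throw new LogicException('Cosine similarity is undefined for a zero vector.');
        }
        $dot = 0.0;
        foreach ($this->vector as $i => $v) {
            $dot += $v * $other->vector[$i];
        }
        return $dot / $denominator;
    }
}
```

With this sketch, `(new SketchEmbedding([3, 4]))->magnitude()` returns `5.0`, and an empty array is rejected in the constructor. Returning a new instance from `normalized()` keeps the object immutable, which is a common choice for DTOs; whether the PR's class does the same is not stated here.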

    use WordPress\AiClient\AiClient;

    $vectors = AiClient::prompt()
        ->withEmbeddingInputs('Summarize this document', 'Summarize that document')

Copilot AI, Nov 26, 2025
[nitpick] The example inputs 'Summarize this document' and 'Summarize that document' could be confusing because they read as instructions rather than as documents to embed. Consider more representative text such as 'This is the first document' or 'Product description text', which makes it clear that the inputs are the documents being embedded.

    - ->withEmbeddingInputs('Summarize this document', 'Summarize that document')
    + ->withEmbeddingInputs('This is the first document', 'This is the second document')

Closing in favor of a new PR based on the add/embeddings-support branch.
Fixes #130
Summary
Testing