[feat] optimize aiTokenLimiterPlugin for streaming tokens #6055
Aias00 merged 3 commits into apache:master from
Conversation
Pull Request Overview
This PR optimizes the AI token limiter plugin to better handle streaming tokens by implementing real-time token counting during streaming responses instead of waiting for the complete response. The changes add support for Server-Sent Events (SSE) parsing and include usage information in streaming requests.
- Enhanced token counting for streaming AI responses with real-time processing
- Added support for the `writeAndFlushWith` method to handle streaming data properly
- Modified OpenAI configuration to include usage information in streaming requests
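The core idea of the streaming change can be sketched as: instead of buffering the whole response, each decoded SSE chunk is scanned for the usage payload and the latest `completion_tokens` value is remembered. The sketch below is illustrative only (the class and method names are hypothetical, not the plugin's actual code):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of per-chunk token counting for an SSE stream.
// Each chunk may contain one or more "data: {...}" lines; the final
// usage payload carries "completion_tokens" when the request opts in
// via stream_options.
public class SseTokenCounter {

    private static final Pattern COMPLETION_TOKENS =
            Pattern.compile("\"completion_tokens\"\\s*:\\s*(\\d+)");

    private long lastCompletionTokens;

    // Scan one decoded chunk; remember the latest token count seen.
    public void onChunk(final String chunk) {
        Matcher m = COMPLETION_TOKENS.matcher(chunk);
        while (m.find()) {
            lastCompletionTokens = Long.parseLong(m.group(1));
        }
    }

    public long tokens() {
        return lastCompletionTokens;
    }
}
```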
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| AiTokenLimiterPlugin.java | Implements streaming token counting with SSE parsing and real-time usage tracking |
| OpenAI.java | Adds stream_options configuration to include usage data in streaming requests |
| Constants.java | Defines new constants for streaming options and usage inclusion |
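For context on the OpenAI.java change: OpenAI's streaming chat-completions API only emits a usage object when the request opts in through `stream_options`. A minimal sketch of such a request body (the class name is hypothetical; the field names come from the OpenAI chat completions API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a streaming chat-completions request body that
// opts into usage reporting, so the final SSE event carries token counts.
public class StreamRequest {

    public static Map<String, Object> body(final String model) {
        Map<String, Object> req = new LinkedHashMap<>();
        req.put("model", model);
        req.put("stream", true);
        // Without this, streaming responses omit the usage object.
        req.put("stream_options", Map.of("include_usage", true));
        return req;
    }
}
```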
Comments suppressed due to low confidence (1)
shenyu-plugin/shenyu-plugin-ai/shenyu-plugin-ai-token-limiter/src/main/java/org/apache/shenyu/plugin/ai/token/limiter/AiTokenLimiterPlugin.java:310
- [nitpick] The variable name 'last' is ambiguous. Consider renaming it to 'lastCompletionTokens' or 'finalTokenCount' to clarify its purpose.
```java
long last = 0L;
```
```diff
 int offset = 0;
 int len = inBytes.length;
-if (!headerSkipped.get()) {
+if (!headerSkipped.getAndSet(true)) {
```
Using getAndSet(true) in a conditional check obscures the intended one-time state transition. Consider compareAndSet(false, true) to make the atomic false-to-true transition explicit.
```diff
-if (!headerSkipped.getAndSet(true)) {
+if (headerSkipped.compareAndSet(false, true)) {
```
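Both operations are atomic, so only one thread can win the "first chunk" check either way; the suggestion is mainly about expressing intent. A minimal sketch of the one-time gate (hypothetical class and method names):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a one-time gate: compareAndSet(false, true)
// succeeds for exactly one caller, making the single false -> true
// transition explicit.
public class FirstChunkGate {

    private final AtomicBoolean headerSkipped = new AtomicBoolean(false);

    // Returns true exactly once, for the first caller.
    public boolean isFirstChunk() {
        return headerSkipped.compareAndSet(false, true);
    }
}
```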
```java
        } catch (Exception e) {
            LOG.error("parse ai resp error", e);
        }
    }
    writer.write(ByteBuffer.wrap(processedBytes));
});
} catch (Exception e) {
    LOG.error("read dataBuffer error", e);
```
Catching generic Exception is too broad. Consider catching specific exceptions like JsonProcessingException or IOException to handle different error scenarios appropriately.
```diff
-        } catch (Exception e) {
-            LOG.error("parse ai resp error", e);
-        }
-    }
-    writer.write(ByteBuffer.wrap(processedBytes));
-});
-} catch (Exception e) {
-    LOG.error("read dataBuffer error", e);
+        } catch (JsonProcessingException e) {
+            LOG.error("JSON processing error while parsing AI response", e);
+        } catch (IOException e) {
+            LOG.error("IO error while parsing AI response", e);
+        }
+    }
+    writer.write(ByteBuffer.wrap(processedBytes));
+});
+} catch (IOException e) {
+    LOG.error("IO error while reading dataBuffer", e);
```
```java
private long extractUsageTokensFromSse(final String sse) {
    Pattern p = Pattern.compile("\"completion_tokens\"\\s*:\\s*(\\d+)");
    Matcher m = p.matcher(sse);
```
The regex pattern is compiled on every method call. Consider making the Pattern a static final field to improve performance.
```diff
-private long extractUsageTokensFromSse(final String sse) {
-    Pattern p = Pattern.compile("\"completion_tokens\"\\s*:\\s*(\\d+)");
-    Matcher m = p.matcher(sse);
+private static final Pattern COMPLETION_TOKENS_PATTERN = Pattern.compile("\"completion_tokens\"\\s*:\\s*(\\d+)");
+
+private long extractUsageTokensFromSse(final String sse) {
+    Matcher m = COMPLETION_TOKENS_PATTERN.matcher(sse);
```
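With the pattern hoisted to a static final field, `Pattern.compile` runs once per class load instead of once per chunk; each call only pays for matching. A self-contained sketch of the extraction logic (the class name and the zero-when-absent fallback are assumptions for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative: the precompiled pattern is reused across all calls.
public class UsageExtractor {

    private static final Pattern COMPLETION_TOKENS_PATTERN =
            Pattern.compile("\"completion_tokens\"\\s*:\\s*(\\d+)");

    // Returns the completion token count, or 0 when no usage is present.
    public static long extract(final String sse) {
        Matcher m = COMPLETION_TOKENS_PATTERN.matcher(sse);
        return m.find() ? Long.parseLong(m.group(1)) : 0L;
    }
}
```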
```java
        streamingUsageRecorded.set(true);
    }
} catch (Exception e) {
    LOG.error("parse ai resp error", e);
```
The error message 'parse ai resp error' is unclear and uses abbreviations. Consider a more descriptive message like 'Failed to parse AI response JSON payload'.
```diff
-LOG.error("parse ai resp error", e);
+LOG.error("Failed to parse AI response JSON payload", e);
```
Commits:
- fix: optimize aiTokenLimiterPlugin for streaming tokens
- chore: java format
- chore: code review by copilot
optimize aiTokenLimiterPlugin for streaming tokens

Make sure that:
- `./mvnw clean install -Dmaven.javadoc.skip=true` passes.