Allow passing grammar to completion endpoint #2532
Conversation
@SlyEcho thanks for reviewing my PR. I just added a second commit that should cover your review comments. I'm not a C++ programmer but still hope these changes make sense. Any guidance on how to further improve this PR, if necessary, is highly appreciated!
Looks good now.
Thanks again for reviewing @SlyEcho. I just fixed the reported trailing-whitespace error in another commit. Sorry for the extra round; checks should now pass.
Very nice
I modified the example code to just one type of schema and one prompt and I am getting an error.

The schema (`functions.json`):

```json
{
  "type": "object",
  "properties": {
    "function": {"const": "create_event"},
    "arguments": {
      "type": "object",
      "properties": {
        "title": {"type": "string"},
        "date": {"type": "string"},
        "time": {"type": "string"}
      }
    }
  }
}
```

The Python code is this:

```python
import requests
import json
import importlib

schema = importlib.import_module("json-schema-to-grammar")

with open("functions.json", "r") as f:
    functions_schema = json.load(f)

# Convert functions schema to grammar
prop_order = {name: idx for idx, name in enumerate(["function", "arguments"])}
converter = schema.SchemaConverter(prop_order)
converter.visit(functions_schema, '')
grammar = converter.format_grammar()

prompt = "Schedule a birthday party on Aug 14th 2023 at 8pm."
data_json = {"prompt": prompt, "temperature": 0.1, "n_predict": 512, "stream": False, "grammar": grammar}

resp = requests.post(
    url="http://localhost:7860/completion",
    headers={"Content-Type": "application/json"},
    json=data_json,
)

result = json.loads(resp.json()['content'].strip().replace("\n", "\\n"))
print(f"Prompt: {prompt}")
print(f"Result: {result}\n")
```

The error I get on the llama.cpp server is:
@razorback16 Thanks for reporting. I can reproduce the issue and will investigate.
@razorback16 I just submitted #2566 which fixes the issue you reported.
Thank you for the fix. It works now :)
Description
This PR extends `server.cpp` to support submitting a grammar with `completion` requests (equivalent to the `--grammar` command line option for `main.cpp`).

Usage example
Start a server with a Llama-2-13B-GGML model:
Define the schema of two functions in `functions.json`:

```json
{
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "function": {"const": "create_event"},
        "arguments": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "date": {"type": "string"},
            "time": {"type": "string"}
          }
        }
      }
    },
    {
      "type": "object",
      "properties": {
        "function": {"const": "image_search"},
        "arguments": {
          "type": "object",
          "properties": {
            "query": {"type": "string"}
          }
        }
      }
    }
  ]
}
```

A Python program that uses the served model for grammar-based sampling:
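The original program was not captured in this page. As a stand-in, here is a minimal stdlib-only sketch of such a client, patterned after the snippet quoted earlier in the thread; the server URL, the sampling parameters, and the placeholder grammar string are assumptions, not taken from the PR:

```python
import json
import urllib.request

SERVER_URL = "http://localhost:7860/completion"  # port used in the thread; adjust as needed

def build_request(prompt: str, grammar: str) -> dict:
    """Build the JSON body for POST /completion; "grammar" is the field this PR adds."""
    return {
        "prompt": prompt,
        "temperature": 0.1,
        "n_predict": 512,
        "stream": False,
        "grammar": grammar,
    }

def complete(payload: dict) -> dict:
    """Send the request to a running llama.cpp server and decode the JSON reply."""
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# The grammar would normally come from json-schema-to-grammar (as in the
# snippet quoted earlier); a trivial placeholder grammar is used here.
payload = build_request(
    "Schedule a birthday party on Aug 14th 2023 at 8pm.",
    'root ::= "yes" | "no"',  # placeholder GBNF, not a real schema grammar
)
print(json.dumps(payload, indent=2))
```

Calling `complete(payload)` requires a server started with this PR's `server.cpp`; without the patch the `grammar` field is silently ignored or rejected depending on the server's JSON handling.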
When running, it produces the following output:
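The example output itself was lost in extraction. The shape of the result can still be illustrated: the server returns the generated text in a `content` string, which the client parses back into a Python object. The canned response below is an assumed example of what grammar-constrained sampling against the `create_event` schema could emit, not the PR's actual output:

```python
import json

# Canned /completion response body; the `content` value is an illustrative
# assumption shaped like the create_event schema above.
raw = {
    "content": '{"function": "create_event", "arguments":'
               ' {"title": "birthday party", "date": "Aug 14th 2023", "time": "8pm"}}'
}

# The grammar guarantees `content` is valid JSON, so json.loads succeeds.
result = json.loads(raw["content"].strip())
print(result["function"])            # create_event
print(result["arguments"]["title"])  # birthday party
```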