
Conversation

@krasserm (Contributor) commented Aug 6, 2023

Description

This PR extends server.cpp to support submitting a grammar with completion requests (equivalent to the --grammar command line option for main.cpp).

Usage example

Start a server with a Llama-2-13B-GGML model:

./server -m llama-2-13b.ggmlv3.q4_0.bin -eps 1e-5 --n-gpu-layers 43 --host 0.0.0.0 --port 8080
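
Before wiring up the full function-calling example below, the new "grammar" field can be sanity-checked with a minimal request that constrains the output to an inline GBNF grammar (a sketch; the yes/no grammar and prompt here are only illustrative):

import requests

# Trivial GBNF grammar: the model may only answer "yes" or "no".
grammar = 'root ::= "yes" | "no"'

resp = requests.post(
    url="http://127.0.0.1:8080/completion",
    headers={"Content-Type": "application/json"},
    json={"prompt": "Is the sky blue? Answer:", "n_predict": 4, "grammar": grammar},
)
print(resp.json()["content"])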

Define the schema of two functions in functions.json:

{
    "oneOf": [
        {
            "type": "object",
            "properties": {
                "function": {"const": "create_event"},
                "arguments": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "date": {"type": "string"},
                        "time": {"type": "string"}
                    }
                }
            }
        },
        {
            "type": "object",
            "properties": {
                "function": {"const": "image_search"},
                "arguments": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string"}
                    }
                }
            }
        }
    ]
}

A Python program that uses the served model for grammar-based sampling:

import importlib
import json

import requests

# import examples/json-schema-to-grammar.py
schema = importlib.import_module("examples.json-schema-to-grammar")

with open("functions.json", "r") as f:
    functions_schema = json.load(f)

# Convert functions schema to grammar
prop_order = {name: idx for idx, name in enumerate(["function", "arguments"])}
converter = schema.SchemaConverter(prop_order)
converter.visit(functions_schema, '')
grammar = converter.format_grammar()

prompts = [
    "Schedule a birthday party on Aug 14th 2023 at 8pm.",
    "Find an image of a dog."
]

for prompt in prompts:
    data_json = { "prompt": prompt, "temperature": 0.1, "n_predict": 512, "stream": False, "grammar": grammar }

    resp = requests.post(
        url="http://127.0.0.1:8080/completion",
        headers={"Content-Type": "application/json"},
        json=data_json,
    )

    # Escape raw newlines in the generated text so json.loads can parse it.
    result = json.loads(resp.json()["content"].strip().replace("\n", "\\n"))

    print(f"Prompt: {prompt}")
    print(f"Result: {result}\n")

When run, it produces the following output:

Prompt: Schedule a birthday party on Aug 14th 2023 at 8pm.
Result: {'function': 'create_event', 'arguments': {'date': '2023-08-14T20:00:00Z', 'time': '20:00', 'title': 'Birthday Party'}}

Prompt: Find an image of a dog.
Result: {'function': 'image_search', 'arguments': {'query': 'dog'}}
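
The parsed result can then be used to dispatch to application code. A minimal sketch (not part of this PR; create_event and image_search here stand in for whatever handlers the application actually provides, and result is the dict parsed from the response above):

def create_event(title=None, date=None, time=None):
    # Hypothetical application handler.
    print(f"Creating event '{title}' on {date} at {time}")

def image_search(query=None):
    # Hypothetical application handler.
    print(f"Searching images for '{query}'")

dispatch = {"create_event": create_event, "image_search": image_search}
dispatch[result["function"]](**result["arguments"])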

@krasserm (Contributor, Author) commented Aug 7, 2023

@SlyEcho thanks for reviewing my PR. I just added a second commit that should cover your review comments. I'm not a C++ programmer but still hope these changes make sense. Any guidance on how to further improve this PR, if necessary, is highly appreciated!

@SlyEcho (Contributor) commented Aug 8, 2023

Looks good now.

@krasserm (Contributor, Author) commented Aug 8, 2023

Thanks again for reviewing @SlyEcho. I just fixed the reported trailing whitespace error in another commit. Sorry for the extra round, checks should now pass.

@tobi (Collaborator) commented Aug 8, 2023

Very nice

@SlyEcho SlyEcho merged commit f5bfea0 into ggml-org:master Aug 8, 2023
@razorback16 commented Aug 8, 2023

I modified the example code to use just one schema and one prompt, and I am getting an error.

The schema (functions.json):

{
    "type": "object",
    "properties": {
        "function": {"const": "create_event"},
        "arguments": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "date": {"type": "string"},
                "time": {"type": "string"}
            }
        }
    }
}

The Python code is:

import requests
import json
import importlib

schema = importlib.import_module("json-schema-to-grammar")

with open("functions.json", "r") as f:
    functions_schema = json.load(f)

# Convert functions schema to grammar
prop_order = {name: idx for idx, name in enumerate(["function", "arguments"])}
converter = schema.SchemaConverter(prop_order)
converter.visit(functions_schema, '')
grammar = converter.format_grammar()

prompt = "Schedule a birthday party on Aug 14th 2023 at 8pm."

data_json = { "prompt": prompt, "temperature": 0.1, "n_predict": 512, "stream": False, "grammar": grammar }

resp = requests.post(
    url="http://localhost:7860/completion",
    headers={"Content-Type": "application/json"},
    json=data_json,
)

result = json.loads(resp.json()['content'].strip().replace("\n", "\\n"))

print(f"Prompt: {prompt}")
print(f"Result: {result}\n")

The error I get on the llama.cpp server is:

{"timestamp":1691531624,"level":"INFO","function":"log_server_request","line":1144,"message":"request","remote_addr":"17.190.192.114","remote_port":60190,"status":200,"method":"POST","path":"/completion","params":{}}
{"timestamp":1691531689,"level":"INFO","function":"log_server_request","line":1144,"message":"request","remote_addr":"17.190.192.114","remote_port":60211,"status":404,"method":"POST","path":"/completion","params":{}}
space ::= space_1
space_1 ::= [ ] |
0-function ::= ["] [c] [r] [e] [a] [t] [e] [_] [e] [v] [e] [n] [t] ["]
string ::= ["] string_6 ["] space
string_4 ::= [^"\] | [\] string_5
string_5 ::= ["\/bfnrt] | [u] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]
string_6 ::= string_4 string_6 |
0-arguments ::= [{] space ["] [d] [a] [t] [e] ["] space [:] space string [,] space ["] [t] [i] [m] [e] ["] space [:] space string [,] space ["] [t] [i] [t] [l] [e] ["] space [:] space string [}] space
0 ::= [{] space ["] [f] [u] [n] [c] [t] [i] [o] [n] ["] space [:] space 0-function [,] space ["] [a] [r] [g] [u] [m] [e] [n] [t] [s] ["] space [:] space 0-arguments [}] space
1-function ::= ["] [i] [m] [a] [g] [e] [_] [s] [e] [a] [r] [c] [h] ["]
1-arguments ::= [{] space ["] [q] [u] [e] [r] [y] ["] space [:] space string [}] space
1 ::= [{] space ["] [f] [u] [n] [c] [t] [i] [o] [n] ["] space [:] space 1-function [,] space ["] [a] [r] [g] [u] [m] [e] [n] [t] [s] ["] space [:] space 1-arguments [}] space
root ::= 0 | 1

llama_print_timings:        load time =   221.62 ms
llama_print_timings:      sample time =   171.83 ms /    50 runs   (    3.44 ms per token,   290.98 tokens per second)
llama_print_timings: prompt eval time =   189.00 ms /    22 tokens (    8.59 ms per token,   116.40 tokens per second)
llama_print_timings:        eval time =   915.24 ms /    49 runs   (   18.68 ms per token,    53.54 tokens per second)
llama_print_timings:       total time =  1308.12 ms
{"timestamp":1691531717,"level":"INFO","function":"log_server_request","line":1144,"message":"request","remote_addr":"17.190.192.114","remote_port":60220,"status":200,"method":"POST","path":"/completion","params":{}}
space ::= space_1
space_1 ::= [ ] |
function ::= ["] [c] [r] [e] [a] [t] [e] [_] [e] [v] [e] [n] [t] ["]
string ::= ["] string_6 ["] space
string_4 ::= [^"\] | [\] string_5
string_5 ::= ["\/bfnrt] | [u] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]
string_6 ::= string_4 string_6 |
arguments ::= [{] space ["] [d] [a] [t] [e] ["] space [:] space string [,] space ["] [t] [i] [m] [e] ["] space [:] space string [,] space ["] [t] [i] [t] [l] [e] ["] space [:] space string [}] space
root ::= [{] space ["] [f] [u] [n] [c] [t] [i] [o] [n] ["] space [:] space function [,] space ["] [a] [r] [g] [u] [m] [e] [n] [t] [s] ["] space [:] space arguments [}] space
LLAMA_ASSERT: llama.cpp:2135: is_positive_char || pos->type == LLAMA_GRETYPE_CHAR_NOT
Aborted (core dumped)

@krasserm (Contributor, Author) commented Aug 9, 2023

@razorback16 Thanks for reporting. I can reproduce the issue and will investigate.

@krasserm (Contributor, Author) commented Aug 9, 2023

@razorback16 I just submitted #2566 which fixes the issue you reported.

@krasserm krasserm deleted the wip-server-grammar branch August 9, 2023 13:38
@razorback16 commented Aug 9, 2023 via email
