Skip to content

Conversation

@HanClinto
Copy link
Contributor

@HanClinto HanClinto commented Jun 6, 2024

This adds a number of FAILING tests to exercise JSON schema conversion. These failures are good, as they show bugs in our current schema-to-grammar system (as noted by #7703 and #7789 ).

I originally started writing these tests to help me dig into the root causes behind #7703 , but I ran into some other issues along the way (hence #7789 ).

Of the failing tests, in particular, there are a number of "pass" test cases that are marked with TODO that SHOULD pass and currently do NOT.

The "fail" test cases that are improperly passing are not as big of a deal (these are things that we can't really expect to support, such as numeric minimums or ensuring uniqueness of array entries), so feel free to ignore those TODO entries.

This is meant as an aide to anyone else who wants to chip in on any of the json-schema features / bugs and wants an easier way to debug the json schema generation and test it against sample generations. Very helpfully the json-schema.org documentation often includes examples of matching and non-matching strings for example schemas in their documentation, and I have begun copying those into these integration tests directly.

@github-actions github-actions bot added the testing Everything test related label Jun 6, 2024
@HanClinto
Copy link
Contributor Author

@ochafik Check out these latest tests -- I think you might like them. :)

@ochafik
Copy link
Collaborator

ochafik commented Jun 6, 2024

Awesome! Was plotting something similar in #7797 :-)

I'll take a deeper look in the coming days!

@HanClinto
Copy link
Contributor Author

Awesome! Was plotting something similar in #7797 :-)

I'll take a deeper look in the coming days!

Haha -- we went down nearly identical roads. Your framework is much nicer than mine -- I'll pull your framework into this branch and update my tests accordingly. Then you should be able to pull it into your branch more easily if you want. Nicely done!!

@mofosyne mofosyne added the Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level label Jun 12, 2024
@HanClinto HanClinto force-pushed the tests-json-schema-integration branch from bd3ebb6 to cfccb6c Compare June 12, 2024 17:42
@HanClinto
Copy link
Contributor Author

HanClinto commented Jun 12, 2024

Okay, ready for review / merge.

I incorporated the improvements from @ochafik 's test harness in #7797 , and also added a #define INCLUDE_FAILING_TESTS 1 at the top that can be commented / uncommented to exclude / include the failing test cases. This is so that those test cases don't break CI, but the PR can still be merged into master so that other PRs can build on it (#7840, #7797, #7863, etc).

@HanClinto HanClinto force-pushed the tests-json-schema-integration branch from 307619d to 488cbfa Compare June 12, 2024 23:45
Copy link
Collaborator

@ochafik ochafik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing, thanks @HanClinto !

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now ./llama-gbnf-validator :-D

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just indent across string boundary?

        R"""({
          "type": "string",
          "minLength": 3
        })""",

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"""",
"true"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would split "data", "uuid", "time" and "date-time" test cases

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

""",
""a",
"a"",
""\"",

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for preparing these!

I would just move them down to the failed branch (w/ caveat header), hopefully I'll move them back up very soon :-) (not a fan of #ifdef dead code)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: TBH I'd be tempted to use R"""({...})""" as soon as any backslash is needed, to make it easier to copy-paste things.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: exclusiveMinimum for number is on my roadmap :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uncomment & move it up / to passing?

@HanClinto HanClinto force-pushed the tests-json-schema-integration branch from 488cbfa to 8713e57 Compare June 22, 2024 02:47
…n validation against auto-generated JSON-schema grammars.
…ng the ability to automatically output improperly failing grammars to debug output files so they can more easily be examined in the gbnf-validator program.
…n pass CI, but still be useful for other PRs that want to leverage the framework.
@HanClinto HanClinto force-pushed the tests-json-schema-integration branch from 8713e57 to 3bdf7c3 Compare June 22, 2024 02:54
@HanClinto HanClinto merged commit c5a8d4b into ggml-org:master Jun 22, 2024
@HanClinto
Copy link
Contributor Author

@ochafik Thank you very much for the review! I implemented most of your suggestions, but there were a couple that I didn't bother with (that I figure can get resolved in #7797 and others).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level testing Everything test related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants