Support Adept Persimmon 8b #3410

Merged

ggerganov merged 30 commits into ggml-org:master from phillip-kravtsov:phillip-kravtsov/support-adept-persimmon-8b

Oct 7, 2023

Contributor

phillip-kravtsov commented Sep 29, 2023

Adds Persimmon 8B which is, architecturally, a standard dense transformer with:
- Q/K layernorm
- Squared ReLU activations
- partial RoPE
- very large vocab size (most unused for text)

To support partial RoPE and squared ReLU, this PR adds concat and square kernels for Metal.
I've confirmed that the GGML and HF implementations agree up to the tensor values in the last layer.
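
The two less common pieces of the architecture listed above, squared ReLU and partial RoPE, can be sketched on plain vectors. This is a minimal illustration, not the PR's ggml or Metal code: the function names `relu_squared` and `partial_rope` are hypothetical, and the pairing of dimension `i` with `i + n_rot/2` and the base of 10000 are assumptions made for the example. Q/K layernorm is not covered here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Squared ReLU: max(x, 0)^2, applied elementwise in the feed-forward block.
inline float relu_squared(float x) {
    const float r = x > 0.0f ? x : 0.0f;
    return r * r;
}

// Partial RoPE for a single attention head at token position `pos`.
// Only the first `n_rot` dimensions are rotated; the remaining dimensions
// are copied through unchanged, which is the "concat" of a rotated part
// and a pass-through part.
// Assumptions (not taken from the PR): dimension i is paired with
// i + n_rot/2, and the frequency base is 10000.
std::vector<float> partial_rope(const std::vector<float>& head,
                                std::size_t n_rot,
                                int pos,
                                float theta_base = 10000.0f) {
    assert(n_rot % 2 == 0 && n_rot <= head.size());
    std::vector<float> out(head);  // the pass-through tail is already in place
    const std::size_t half = n_rot / 2;
    for (std::size_t i = 0; i < half; ++i) {
        const float exponent = -2.0f * static_cast<float>(i) / static_cast<float>(n_rot);
        const float angle = static_cast<float>(pos) * std::pow(theta_base, exponent);
        const float c = std::cos(angle);
        const float s = std::sin(angle);
        const float x0 = head[i];
        const float x1 = head[i + half];
        out[i]        = x0 * c - x1 * s;  // 2D rotation of the pair (x0, x1)
        out[i + half] = x0 * s + x1 * c;
    }
    return out;
}
```

Calling `partial_rope(head, n_rot, pos)` with `n_rot` smaller than the head size leaves the trailing `head.size() - n_rot` values untouched, which is the property that distinguishes partial RoPE from rotating the full head.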

phillip-kravtsov added 18 commits

September 20, 2023 17:24


          Produces garbage output

7cdc3ea


          wip: correct tensors up to RoPE

4bcf412


          correct tensors thru RoPE

c9e1446


          Correct outputs through masked & softmax'd KQ

d1b40ef


          fp32 works

db2181a


          Rename adept->persimmon

3f31799


          Merge branch 'master' of github.com:phillip-kravtsov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b

720503b


          Produces correct outputs

d61eed0


          Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b

d0a7143


          clean up convert scripts

fa92f6e


          remove printing logic from ggml.c

c28a6c5


          remove prints from llama.cpp & fix merge

47dcb9f


          Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b


          trivial cleanups

d904aff


          Add offload funcs

ec0ce97


          update conversion script to directly take adept artifacts rather than .saftensors file

3db04db


          Fix norm eps bug

f28f52c


          Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b

d93cf1e

goerch reviewed

convert-persimmon-to-gguf.py (outdated, resolved)

ggerganov added the "high priority" and "model" labels

phillip-kravtsov added 2 commits

September 30, 2023 13:24


          Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b

574a9e1


          Support sqr and concat on metal, persimmon-8b-q4 runs correctly

2b56591

ggerganov approved these changes

ggml-metal.m (outdated, resolved)

ggml-metal.m (outdated, resolved)

ggml-metal.m (resolved)

ggml-metal.m (outdated, resolved)

gguf-py/gguf/gguf.py (outdated, resolved)

gguf-py/gguf/gguf.py (outdated, resolved)

llama.cpp (outdated, resolved)

llama.cpp (outdated, resolved)

llama.cpp (resolved)

phillip-kravtsov added 3 commits

October 2, 2023 10:21


          Small changes from review

e6bf87f


          Formatting changes

cd4d3df


          Minor changes to conversion script

422b110

phillip-kravtsov commented

ggml-metal.m (resolved)

phillip-kravtsov added 2 commits

October 2, 2023 14:00


          Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b

5a0990c


          Remove old script

7a279fe

Member

ggerganov commented Oct 3, 2023

Let's resolve the CI fails and merge


          Fix editorconfig formatting

c90ed9f

cebtenzzre reviewed

gguf-py/gguf/gguf.py (outdated, resolved)


          Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b. ggml-ci

5d259d3

phillip-kravtsov force-pushed the phillip-kravtsov/support-adept-persimmon-8b branch from 92acb44 to 5d259d3

October 5, 2023 18:04


          Fix build

1d518d6

ggerganov reviewed

llama.cpp (outdated, resolved)

phillip-kravtsov added 2 commits

October 6, 2023 12:39


          Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b

0c1a8f6


          add overlooked offload code ggml-ci

485a471

ggerganov merged commit 0e797c2 into ggml-org:master

Member

slaren commented Oct 7, 2023

The switches in llm_load_hparams and llama_build_graph are missing breaks, so it should be using the refact graph. Does this work currently?

Member

ggerganov commented Oct 7, 2023

@phillip-kravtsov PTAL at @slaren's comment and fix as necessary

Contributor

KerfuffleV2 commented Oct 7, 2023

I got tired of seeing the compiler warning and created #3535 (not sure if there are any other issues, haven't had a chance to test it yet).

Contributor Author

phillip-kravtsov commented Oct 8, 2023

Thanks for the fix @KerfuffleV2 -- that PR should be sufficient.
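
For readers unfamiliar with the failure mode discussed in this thread, the effect of a `switch` case without a `break` can be sketched as follows. This is a simplified illustration only: the `Arch` enum, the function names, and the string results are hypothetical stand-ins, not the actual code in `llm_load_hparams` or `llama_build_graph`.

```cpp
#include <string>

// Hypothetical stand-in for the architecture enum; not llama.cpp's real type.
enum class Arch { Persimmon, Refact };

// Buggy variant: the Persimmon case has no `break`, so control continues
// into the Refact case and the earlier assignment is overwritten.
std::string pick_graph_missing_break(Arch arch) {
    std::string graph = "none";
    switch (arch) {
        case Arch::Persimmon:
            graph = "persimmon";
            // missing break: execution falls through into the next case
        case Arch::Refact:
            graph = "refact";
            break;
    }
    return graph;
}

// Fixed variant: every case ends with `break`.
std::string pick_graph_fixed(Arch arch) {
    std::string graph = "none";
    switch (arch) {
        case Arch::Persimmon:
            graph = "persimmon";
            break;
        case Arch::Refact:
            graph = "refact";
            break;
    }
    return graph;
}
```

In this sketch `pick_graph_missing_break(Arch::Persimmon)` yields `"refact"`, while the fixed variant yields `"persimmon"`. Compilers such as GCC and Clang can flag this pattern with `-Wimplicit-fallthrough`, which is the kind of warning mentioned above.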

joelkuiper added a commit to vortext/llama.cpp that referenced this pull request


          Merge branch 'master' of github.com:ggerganov/llama.cpp into grammar-example

f7b9bf1

* 'master' of github.com:ggerganov/llama.cpp:
  py : change version of numpy requirement to 1.24.4 (ggml-org#3515)
  quantize : fail fast on write errors (ggml-org#3521)
  metal : support default.metallib load & reuse code for swift package (ggml-org#3522)
  llm : support Adept Persimmon 8B (ggml-org#3410)
  Fix for ggml-org#3454 (ggml-org#3455)
  readme : update models, cuda + ppl instructions (ggml-org#3510)
  server : docs fix default values and add n_probs (ggml-org#3506)

leo-gan mentioned this pull request

Update llama.cpp integration langchain-ai/langchain#11864

Merged

Galunid mentioned this pull request

Unbreak persimmon after #3837 #4010

Merged

Labels

high priority, model