When offloading to an iGPU (UHD 770) in a Docker container from https://github.com/mudler/LocalAI (b2128), llama.cpp crashes with the following error:
```
The number of work-items in each dimension of a work-group cannot exceed {512, 512, 512} for this device -54 (PI_ERROR_INVALID_WORK_GROUP_SIZE)
Exception caught at file:/build/backend/cpp/llama/llama.cpp/ggml-sycl.cpp, line:12708
```
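For anyone triaging this, here is a minimal standalone SYCL sketch (assuming a working oneAPI/SYCL toolchain; this is not part of LocalAI or llama.cpp) that queries the device's work-group limits, to confirm whether the iGPU really caps out at {512, 512, 512} as the error reports:

```cpp
// Diagnostic sketch: print the work-group limits of the default SYCL GPU.
// Assumes a oneAPI/SYCL 2020 toolchain (e.g. icpx -fsycl).
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    sycl::queue q{sycl::gpu_selector_v};
    auto dev = q.get_device();

    std::cout << "Device: "
              << dev.get_info<sycl::info::device::name>() << "\n";
    std::cout << "Max work-group size: "
              << dev.get_info<sycl::info::device::max_work_group_size>() << "\n";

    // Per-dimension limits, corresponding to the {x, y, z} triple in the error text.
    auto sizes = dev.get_info<sycl::info::device::max_work_item_sizes<3>>();
    std::cout << "Max work-item sizes: {"
              << sizes[0] << ", " << sizes[1] << ", " << sizes[2] << "}\n";
}
```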
From trial and error, it happens when the number of predicted tokens exceeds 256: if I cap the prediction at 256 tokens, the crash does not occur. A workaround sketch based on that cap follows below.
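Until this is fixed, capping generation at 256 tokens per request avoids the crash for me. Here is a minimal sketch of such a request using libcurl against LocalAI's OpenAI-compatible `/v1/completions` endpoint; the URL, port, and model name (`mistral-7b`) are placeholders for whatever your deployment exposes:

```cpp
// Workaround sketch: send a completion request with max_tokens capped at 256.
// Assumes libcurl is installed (link with -lcurl); endpoint and model name
// are placeholders, not values from the report.
#include <curl/curl.h>
#include <string>

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    // Cap the completion at 256 tokens to stay under the crash threshold.
    const std::string body = R"({
        "model": "mistral-7b",
        "prompt": "Hello",
        "max_tokens": 256
    })";

    struct curl_slist* headers = nullptr;
    headers = curl_slist_append(headers, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8080/v1/completions");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());

    CURLcode res = curl_easy_perform(curl);

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return res == CURLE_OK ? 0 : 1;
}
```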
Tested with multiple 7B Mistral models, with both Q6 and Q8 quantization.