Closed
Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [YES] I am running the latest code. bc9d3e3
- [YES] I carefully followed the README.md.
- [YES] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [YES] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Correct uploading of contiguous 3D tensor data to GPU.
Current Behavior
ggml_cl_h2d_tensor_2d treats its offset argument as a byte offset when it calls clEnqueueWriteBuffer, but ggml_cl_transform_tensor passes an element count as that offset. The two values agree only if the element size is exactly 1 byte, so for any larger element type the data lands at the wrong position in the device buffer.
Also, I don't understand why ggml_cl_mul_f32 passes a non-zero offset to ggml_cl_h2d_tensor_2d.
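To make the mismatch concrete, here is a minimal sketch of the conversion that seems to be missing. The helper element_to_byte_offset is hypothetical and not part of ggml; the only outside fact it relies on is that the offset parameter of clEnqueueWriteBuffer is specified in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of ggml: convert an offset counted in
 * elements into the byte offset that a byte-addressed buffer write expects. */
static size_t element_to_byte_offset(size_t element_offset, size_t element_size) {
    assert(element_size > 0);
    return element_offset * element_size;
}
```

For example, with 4 x 3 planes the second plane starts at element 12. That is byte 48 for GGML_TYPE_F32 (4-byte elements) and byte 24 for GGML_TYPE_F16 (2-byte elements), while passing 12 directly is only right for 1-byte elements.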
Environment and Context
AMD GPU
Linux
Steps to Reproduce
- Pass 3D tensor with contiguous GGML_TYPE_F16 or GGML_TYPE_F32 data to ggml_cl_transform_tensor.
- Read data back from GPU memory or perform ggml_cl_mul_mat on that tensor.
- Observe incorrect data or result.
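The effect of these steps can be imitated on the host without a GPU. The sketch below is a simulation, not ggml or OpenCL code: fake_write_buffer stands in for a byte-addressed buffer write, the 4 x 3 x 2 tensor shape is arbitrary, and it assumes the tensor is uploaded one 2D plane at a time. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Arbitrary 3D shape for illustration: NE2 planes of NE0 x NE1 floats. */
enum { NE0 = 4, NE1 = 3, NE2 = 2, PLANE = NE0 * NE1, TOTAL = PLANE * NE2 };

/* Stand-in for a byte-addressed buffer write (simulation only). */
static void fake_write_buffer(unsigned char *dst, size_t dst_size,
                              size_t byte_offset, const void *src, size_t size) {
    assert(byte_offset + size <= dst_size);
    memcpy(dst + byte_offset, src, size);
}

/* Upload a contiguous 3D float tensor plane by plane, then compare the
 * "device" bytes with the host data.
 * offset_scale == sizeof(float): element offset converted to bytes.
 * offset_scale == 1:             element offset passed through unchanged.
 * Returns 1 if the device copy matches the host data, 0 otherwise. */
static int upload_round_trips(size_t offset_scale) {
    float host[TOTAL];
    unsigned char device[TOTAL * sizeof(float)];

    for (size_t i = 0; i < TOTAL; i++) {
        host[i] = (float)(i + 1);
    }
    memset(device, 0, sizeof device);

    for (size_t i2 = 0; i2 < NE2; i2++) {
        size_t element_offset = i2 * PLANE;
        fake_write_buffer(device, sizeof device,
                          element_offset * offset_scale,
                          host + element_offset,
                          PLANE * sizeof(float));
    }
    return memcmp(device, host, sizeof device) == 0;
}
```

Calling upload_round_trips(sizeof(float)) reproduces the host data exactly, whereas upload_round_trips(1) writes the second plane on top of most of the first and leaves the tail of the buffer untouched, which matches the kind of corrupted read-back described in the steps above.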
Ping