Conversation

@ggerganov
Member

fix #5070

MoE models now support batch sizes of up to 4096 with Metal.

ggerganov merged commit bb6d00b into master on Mar 10, 2024
ggerganov deleted the gg/metal-mm-id-shared branch on March 10, 2024, 21:12
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
Development

Successfully merging this pull request may close these issues.

ggml : support bs > 512 for Metal ggml_mul_mat_id

2 participants