junhuihe-hjh

Pinned

  1. CHESS (Public)

    [EMNLP 2024] CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and Selective Sparsification

    Python · 4 stars · 1 fork

  2. A2ATS (Public)

    [ACL 2025 Findings] A2ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization

    Python · 5 stars

  3. microsoft/BitNet (Public)

    Official inference framework for 1-bit LLMs

    Python · 28.4k stars · 2.3k forks

  4. Eddie-Wang1120/llama.cpp (Public)

    Forked from ggml-org/llama.cpp

    LLM inference in C/C++

    C++ · 25 stars · 25 forks