ggml : add Q5_0 and Q5_1 quantization (#1187)
* ggml : add Q5_0 quantization (cuBLAS only) * ggml : fix Q5_0 qh -> uint32_t * ggml : fix q5_0 histogram stats * ggml : q5_0 scalar dot product * ggml : q5_0 ARM NEON dot * ggml : q5_0 more efficient ARM NEON using uint64_t masks * ggml : rename Q5_0 -> Q5_1 * ggml : adding Q5_0 mode * quantize : add Q5_0 and Q5_1 to map * ggml : AVX2 optimizations for Q5_0, Q5_1 (#1195) --------- Co-authored-by: Stephan Walter <stephan@walter.name>
This commit is contained in:
parent
87a6f846d3
commit
574406dc7e
8 changed files with 711 additions and 30 deletions
1
.gitignore
vendored
1
.gitignore
vendored
|
@ -15,6 +15,7 @@ build-em/
|
|||
build-debug/
|
||||
build-release/
|
||||
build-static/
|
||||
build-cublas/
|
||||
build-no-accel/
|
||||
build-sanitize-addr/
|
||||
build-sanitize-thread/
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue