vulkan: scale caching for k quants + misc fixes (#11081)
* q6_k scale caching * 16 bit unpack * q4_k test (slow) * revert it * q3_k * q2_k * little stuff * try precalculating products of a and q2_k scales * Revert "try precalculating products of a and q2_k scales" This reverts commit 65110b81f23f66331a50c6e889a7c1ab9470a86b. * unpack should be u16, add vim swap to gitignore (about time) * better q4_k scales * q5_k * better q6_k with separate paths for all threads and partial threads in use, plus some more optimizations * q2_k better dequant * q3_k optimizations * q3_k use hmask simd from cpu avx version * make the caches happy * q3_k separate out calculation * q2_k separate out * little stuff * use calc_superblock everywhere * q2_k optimize scale calculation * more barriers
This commit is contained in:
parent
f11cfdfd7f
commit
adc5dd92e8
6 changed files with 454 additions and 384 deletions
1
.gitignore
vendored
1
.gitignore
vendored
|
@ -18,6 +18,7 @@
|
|||
*.metallib
|
||||
*.o
|
||||
*.so
|
||||
*.swp
|
||||
*.tmp
|
||||
|
||||
# IDE / OS
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue