Add attention and final logit softcapping. * fix * Add custom add_ functions * Disable flash attention for Gemma2 * Update src/llama.cpp Co-authored-by: slaren <slarengh@gmail.com> * Add default value for attention and final logit softcap value * Add custom kq scaling from Gemma2Attention * Remove custom pre attention scaling and use computed value instead. --------- Co-authored-by: slaren <slarengh@gmail.com> |
||
---|---|---|
CMakeLists.txt | ||
llama.cpp | ||
unicode-data.cpp | ||
unicode-data.h | ||
unicode.cpp | ||
unicode.h |