Commit graph

  • f1782c68de
    quantize : fail fast on write errors (#3521) cebtenzzre 2023-10-07 04:41:52 -04:00
  • c26765a0a1
    metal : support default.metallib load & reuse code for swift package (#3522) Jhen-Jie Hong 2023-10-07 03:40:27 -05:00
  • 0e797c2fc5
    llm : support Adept Persimmon 8B (#3410) Phillip Kravtsov 2023-10-07 00:12:43 -07:00
  • 3a716b4dae
    Fix for #3454 (#3455) goerch 2023-10-07 06:57:01 +02:00
  • 1faaae8c2b
    readme : update models, cuda + ppl instructions (#3510) BarfingLemurs 2023-10-06 15:13:36 -04:00
  • cb13d73a72
    server : docs fix default values and add n_probs (#3506) Mihai 2023-10-06 21:39:33 +03:00
  • 9ca79d5cbb
    kv cache slot search improvements (#3493) Kerfuffle 2023-10-06 10:10:13 -06:00
  • 0c731ca403
    prompts : fix editorconfig checks after #3416 Georgi Gerganov 2023-10-06 16:35:55 +03:00
  • a8777ad84e
    parallel : add option to load external prompt file (#3416) pudepiedj 2023-10-06 14:16:38 +01:00
  • 97af49fa39
    server : reuse llama_sample_token common util (#3494) Jhen-Jie Hong 2023-10-06 07:44:24 -05:00
  • 16820a5a0d
    llama : correct hparams comparison (#3446) l3utterfly 2023-10-06 18:47:59 +08:00
  • 04b2f4386e
    ci : fix xcodebuild destinations (#3491) Jhen-Jie Hong 2023-10-06 05:36:43 -05:00
  • 48edda30ee
    convert : update Falcon script for new HF config (#3448) cebtenzzre 2023-10-05 15:00:34 -04:00
  • 45eba9369f
    build : use std::make_tuple() for compatibility with older GCC versions (#3488) Kenvix ⭐ 2023-10-06 01:16:39 +08:00
  • acec9eaaa9
    common : process escape sequences in reverse prompts (#3461) staviq 2023-10-05 18:17:29 +02:00
  • e2583cbc29
    CLBlast: Fix handling of on-device tensor data shibe2 2023-10-05 15:57:03 +04:00
  • e8b8d32e86
    server : fix incorrect num_tokens_predicted (#3480) Jhen-Jie Hong 2023-10-05 09:02:55 -05:00
  • 8f3a642ec1
    swift : disable ACCELERATE_NEW_LAPACK (#3481) Jhen-Jie Hong 2023-10-05 09:00:07 -05:00
  • 0745384449
    ci : add swift build via xcodebuild (#3482) Jhen-Jie Hong 2023-10-05 08:56:21 -05:00
  • 019ba1dcd0
    convert : fix Baichuan2 models by using vocab size in config.json (#3299) Kerfuffle 2023-10-04 08:20:28 -06:00
  • beabc8cfb0
    readme : add project status link Georgi Gerganov 2023-10-04 16:50:44 +03:00
  • 0d152b37fe
    ggml : fix build after #3329 Georgi Gerganov 2023-10-04 16:25:41 +03:00
  • f8c90cdbaa
    llm : add Refact model (#3329) ds5t5 2023-10-04 06:23:39 -07:00
  • f93af02488
    sync : ggml (conv 1d + 2d updates, UB fixes) (#3468) Georgi Gerganov 2023-10-04 15:29:58 +03:00
  • f72f8f22c9
    finetune : readme fix typo (#3465) Merrick Christensen 2023-10-04 00:33:13 -06:00
  • 79f34abddb
    ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (#3453) Tameem 2023-10-03 23:38:19 +05:00
  • 8186242b6d
    main : consistent prefix/suffix coloring (#3425) h-h-h-h 2023-10-03 20:16:15 +02:00
  • ac2219fef3
    llama : fix session saving/loading (#3400) Georgi Gerganov 2023-10-03 21:04:01 +03:00
  • 48be797ffb
    llama : expose model's rope_freq_scale in the API (#3418) Alex Klinkhamer 2023-10-03 10:09:28 -07:00
  • f56e1baec3
    metal : alibi for arbitrary number of heads (#3426) Jiahao Li 2023-10-04 00:55:21 +08:00
  • 017efe899d
    cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273) Eve 2023-10-03 16:53:15 +00:00
  • ff5a3f0c09
    Work on the BPE tokenizer (#3252) goerch 2023-10-03 09:16:26 +02:00
  • 1c84003c08
    convert : fix vocab size when not defined in hparams (#3421) cebtenzzre 2023-10-02 18:07:24 -04:00
  • e78f0b0d05
    cmake : increase minimum version for add_link_options (#3444) cebtenzzre 2023-10-02 15:38:43 -04:00
  • 665018c749
    CLBlast: Add broadcast support for matrix multiplication (#3402) shibe2 2023-10-02 23:26:15 +04:00
  • 29a404a951
    gguf : add BERT, MPT, and GPT-J arch info (#3408) cebtenzzre 2023-10-02 15:20:28 -04:00
  • 0fe321031a
    gguf : general usability improvements (#3409) cebtenzzre 2023-10-02 14:58:46 -04:00
  • 9476b01226
    cmake : make CUDA flags more similar to the Makefile (#3420) cebtenzzre 2023-10-02 09:16:50 -04:00
  • a03ce38455
    finetune : fix #3404 (#3437) xaedes 2023-10-02 15:15:45 +02:00
  • a847676984
    metal : set log callback before initializing (#3427) Adrian 2023-10-02 03:49:59 -07:00
  • 095231dfd3
    cmake : fix transient definitions in find pkg (#3411) bandoti 2023-10-02 06:51:49 -03:00
  • ea55295a74
    docker : ignore Git files (#3314) Kevin Ji 2023-10-02 04:53:53 -04:00
  • c97f01c362
    infill : add new example + extend server API (#3296) vvhg1 2023-10-02 09:42:02 +02:00
  • f5ef5cfb18
    ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412) slaren 2023-09-30 18:12:57 +02:00
  • 40e07a60f9
    llama.cpp : add documentation about rope_freq_base and scale values (#3401) slaren 2023-09-29 18:42:32 +02:00
  • bc34dd4f5b
    train : fix KQ_pos allocation (#3392) Georgi Gerganov 2023-09-29 19:05:18 +03:00
  • 2777a84be4
    llama : quantize up to 31% faster on Linux and Windows with mmap (#3206) Cebtenzzre 2023-09-29 09:48:45 -04:00
  • 0a4a4a0982
    readme : update hot topics + model links (#3399) BarfingLemurs 2023-09-29 08:50:35 -04:00
  • 569550df20
    readme : add link to grammars app (#3388) Andrew Duffy 2023-09-29 07:15:57 -04:00
  • c71bf2c45c
    swift : fix build on xcode 15 (#3387) Jhen-Jie Hong 2023-09-29 13:25:13 +08:00
  • bc39553c90
    build : enable more non-default compiler warnings (#3200) Cebtenzzre 2023-09-28 17:41:44 -04:00
  • 0ccfc62a96
    ggml_tensor: update the structure comments. (#3283) Hua Jiang 2023-09-28 13:06:18 -07:00
  • 7f1a0fe709
    ggml : release the requested thread pool resource (#3292) Qu Zongfu 2023-09-29 03:51:52 +08:00
  • 16bc66d947
    llama.cpp : split llama_context_params into model and context params (#3301) slaren 2023-09-28 21:42:38 +02:00
  • 0512d66670
    ci : multithreaded builds (#3311) Eve 2023-09-28 19:31:04 +00:00
  • 0e76a8992c
    train : finetune LORA (#2632) xaedes 2023-09-28 20:40:11 +02:00
  • 2db94d98ed
    gguf : basic type checking in gguf_get_* (#3346) Cebtenzzre 2023-09-28 14:30:31 -04:00
  • ecf90b1a51
    gguf : make token scores and types optional (#3347) Cebtenzzre 2023-09-28 14:30:15 -04:00
  • 2619109ad5
    ci : disable freeBSD builds due to lack of VMs (#3381) Georgi Gerganov 2023-09-28 19:36:36 +03:00
  • ec893798b7
    llama : custom attention mask + parallel decoding + no context swaps (#3228) Georgi Gerganov 2023-09-28 19:04:36 +03:00
  • 45855b3f1c
    docs : mark code as Bash (#3375) Kevin Ji 2023-09-28 09:11:32 -04:00
  • 4aea3b846e
    readme : add Mistral AI release 0.1 (#3362) Pierre Alexandre SCHEMBRI 2023-09-28 14:13:37 +02:00
  • da0400344b
    ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (#3370) slaren 2023-09-28 12:08:28 +02:00
  • e519621010
    convert : remove bug in convert.py permute function (#3364) Zhang Peiyuan 2023-09-28 02:45:20 +08:00
  • ac43576124
    make-ggml.py : compatibility with more models and GGUF (#3290) Richard Roberson 2023-09-27 10:25:12 -06:00
  • 20c7e1e804
    gguf : fix a few general keys (#3341) Cebtenzzre 2023-09-27 12:18:07 -04:00
  • dc6897404e
    metal : reusing llama.cpp logging (#3152) Rickard Hallerbäck 2023-09-27 17:48:33 +02:00
  • 527e57cfd8
    build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342) Jag Chadha 2023-09-27 11:34:32 -04:00
  • ffe88a36a9
    readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340) BarfingLemurs 2023-09-27 11:30:36 -04:00
  • 99115f3fa6
    cmake : fix build-info.h on MSVC (#3309) DAN™ 2023-09-25 18:45:33 -04:00
  • 1726f9626f
    docs: Fix typo CLBlast_DIR var. (#3330) 2f38b454 2023-09-26 02:24:52 +08:00
  • a98b1633d5
    nix : add cuda, use a symlinked toolkit for cmake (#3202) Erik Scholz 2023-09-25 13:48:30 +02:00
  • c091cdfb24
    llama-bench : add README (#3317) slaren 2023-09-23 21:48:24 +02:00
  • 51a7cf5c6e
    examples : fix RoPE defaults to match PR #3240 (#3315) Cebtenzzre 2023-09-23 05:28:50 -04:00
  • bedb92b603
    scripts : use /usr/bin/env in shebang (#3313) Kevin Ji 2023-09-22 23:52:23 -04:00
  • bc9d3e3971
    Update README.md (#3289) Lee Drake 2023-09-21 13:00:24 -06:00
  • 36b904e200
    ggml-opencl.cpp: Make private functions static (#3300) shibe2 2023-09-21 22:10:26 +04:00
  • 324f3403d5
    zig : fix for updated c lib (#3259) Edward Taylor 2023-09-21 21:08:20 +12:00
  • f56c418ab0
    embedding : update README.md (#3224) yuiseki 2023-09-21 17:57:40 +09:00
  • 8185710a80
    CUDA: use only 1 thread if fully offloaded (#2915) Johannes Gäßler 2023-09-21 10:43:53 +02:00
  • 7eb41179ed
    readme : update hot topics Georgi Gerganov 2023-09-20 20:48:22 +03:00
  • a5661d7e71
    llama : allow gguf RoPE keys to be overridden with defaults (#3240) Cebtenzzre 2023-09-20 12:12:47 -04:00
  • 65c2c1c5ab
    benchmark-matmult : do not use integer abs() on a float (#3277) Cebtenzzre 2023-09-20 12:06:08 -04:00
  • 80834daecf
    flake : Restore default package's buildInputs (#3262) kang 2023-09-20 22:48:22 +09:00
  • a40f2b656f
    CI: FreeBSD fix (#3258) Alon 2023-09-20 15:06:36 +03:00
  • d119c04c15
    examples : fix benchmark-matmult (#1554) Georgi Gerganov 2023-09-20 10:02:39 +03:00
  • 8781013ef6
    make : restore build-info.h dependency for several targets (#3205) Cebtenzzre 2023-09-18 10:03:53 -04:00
  • 7ddf185537
    ci : switch cudatoolkit install on windows to networked (#3236) Erik Scholz 2023-09-18 02:21:47 +02:00
  • ee66942d7e
    CUDA: fix peer access logic (#3231) Johannes Gäßler 2023-09-17 23:35:20 +02:00
  • 111163e246
    CUDA: enable peer access between devices (#2470) Johannes Gäßler 2023-09-17 16:37:53 +02:00
  • 8b428c9bc8
    llama.cpp : show model size and BPW on load (#3223) slaren 2023-09-17 14:33:28 +02:00
  • 578d8c8f5c
    CUDA: fix scratch malloced on non-main device (#3220) Johannes Gäßler 2023-09-17 14:16:22 +02:00
  • b541b4f0b1
    Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) IsaacDynamo 2023-09-16 19:35:25 +02:00
  • 5dbc2b3213
    Enable build with CUDA 11.0 (make) (#3132) Vlad 2023-09-16 17:55:43 +03:00
  • b08e75baea
    Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170) goerch 2023-09-16 13:41:33 +02:00
  • e6616cf0db
    examples : add compiler version and target to build info (#2998) Cebtenzzre 2023-09-15 16:59:49 -04:00
  • 3aefaab9e5
    check C++ code with -Wmissing-declarations (#3184) Cebtenzzre 2023-09-15 15:38:27 -04:00
  • 69eb67e282
    fix build numbers by setting fetch-depth=0 (#3197) Cebtenzzre 2023-09-15 15:18:15 -04:00
  • 4fe09dfe66
    llama : add support for StarCoder model architectures (#3187) Meng Zhang 2023-09-16 03:02:13 +08:00
  • 80291a1d02
    common : do not use GNU zero-length __VA_ARGS__ extension (#3195) Cebtenzzre 2023-09-15 14:02:01 -04:00