Commit graph

  • ef15649972
    build : fix most gcc and clang warnings (#2861) Cebtenzzre 2023-09-01 09:34:50 -04:00
  • d8d6977f48
    examples : add C grammar (#2357) Ben Siraphob 2023-09-01 09:32:14 -04:00
  • 5aec2cfaac
    ggml : add RISC-V vector intrinsics support (#2929) Tameem 2023-09-01 18:27:40 +05:00
  • 13268c5331
    metal : slight speed-up for add and mul kernels (#2917) Georgi Gerganov 2023-09-01 13:42:41 +03:00
  • 4dcd47d71d
    logs : fix mingw-like builds (fixes #2898) (#2911) staviq 2023-09-01 11:07:06 +02:00
  • 18705a30ef
    llama2c : fix segfault and alloc-dealloc-mismatch (#2913) Cebtenzzre 2023-09-01 05:03:49 -04:00
  • e8d9158925
    metal: somewhat faster f16 x f32 matrix multiply kernel (#2951) Kawrakow 2023-09-01 11:15:57 +03:00
  • bce1fef328
    convert : fix another python 3.8 issue (#2949) Cebtenzzre 2023-08-31 22:13:51 -04:00
  • 528134dd02
    remove convert-llama-7b-pth-to-gguf.py and convert-llama-hf-to-gguf.py (#2906) slaren 2023-09-01 01:32:09 +02:00
  • aeefac4ff7
    scripts: Use local gguf package when running from repo (#2927) Kerfuffle 2023-08-31 16:49:24 -06:00
  • e8422de39e
    @vxiiduu's fix for PrefetchVirtualMemory (#2930) DannyDaemonic 2023-08-31 04:21:45 -07:00
  • 92d0b751a7
    convert : fix python 3.8 support, modernize type annotations (#2916) Cebtenzzre 2023-08-31 01:02:23 -04:00
  • 8afe228000
    CUDA: mul_mat_q=true llama_context_params default (#2912) Johannes Gäßler 2023-08-30 21:46:19 +02:00
  • 71d6975559
    [Docker] fix tools.sh argument passing. (#2884) Henri Vasserman 2023-08-30 19:14:53 +03:00
  • b532a69b2f
    convert.py : use dir name to name the llama Georgi Gerganov 2023-08-30 13:29:40 +03:00
  • c90d135eb4
    examples : fix underscore in beam-search + .gitignore (close #2900) Georgi Gerganov 2023-08-30 12:52:46 +03:00
  • 0d1c706181
    gguf : add workflow for Pypi publishing (#2896) M. Yusuf Sarıgöz 2023-08-30 12:47:40 +03:00
  • 9509294420
    make : add test and update CI (#2897) alonfaraj 2023-08-30 12:42:51 +03:00
  • 35092fb547
    docs : add node-llama-cpp to README.md (#2885) Gilad S 2023-08-30 11:40:12 +03:00
  • dc07dc492e
    convert : various script cleanups/fixes + merges and special token handling (#2842) Kerfuffle 2023-08-30 02:25:50 -06:00
  • ad9ddcff6e
    llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) chaihahaha 2023-08-30 14:50:55 +08:00
  • 8341a25957
    main : log file (#2748) staviq 2023-08-30 08:29:32 +02:00
  • 849408957c
    tests : add a C compliance test (#2848) Cebtenzzre 2023-08-30 02:20:26 -04:00
  • 06abf8eeba
    ggml : add view_src and view_offs to ggml_tensor for views (#2874) slaren 2023-08-29 23:24:42 +02:00
  • c03a243abf
    remove outdated references to -eps and -gqa from README (#2881) slaren 2023-08-29 23:17:34 +02:00
  • fa3582f509
    Tell users attempting to run perplexity with too few tokens to use more (#2882) Kawrakow 2023-08-29 23:55:45 +03:00
  • e37e69dcc3
    10X faster BPE tokenizer (#2876) Kawrakow 2023-08-29 23:55:03 +03:00
  • 53885d7256
    py : fix "usage" messages (#2873) maddes8cht 2023-08-29 15:51:02 +02:00
  • bcce96ba4d
    convert.py : fix baichuan7B support (#2870) jameswu2014 2023-08-29 17:48:41 +08:00
  • 74e0caeb82
    readme : add react-native binding (#2869) Jhen-Jie Hong 2023-08-29 17:30:10 +08:00
  • d4b5e16c32
    make : fix clang tests build, add missing examples (#2859) Cebtenzzre 2023-08-29 04:42:41 -04:00
  • 3a007648f2
    metal : add option to disable debug logs (close #2764) Georgi Gerganov 2023-08-29 11:33:46 +03:00
  • 611363ac79
    scripts : add pipefail Georgi Gerganov 2023-08-29 10:50:30 +03:00
  • 95b6e5212f
    added struct to llama_dump_timing_info_yaml's llama_context (#2857) Marcus Dunn 2023-08-28 23:33:27 -07:00
  • 44c117f41e
    train : mem usage and other improvements (#2439) xaedes 2023-08-28 21:51:47 +02:00
  • 43033b7bb4
    llama-bench : set locale to utf8 (#2832) slaren 2023-08-28 19:19:18 +02:00
  • 6b73ef1201
    YAML result logging + preset script (#2657) Johannes Gäßler 2023-08-28 17:59:39 +02:00
  • 75fafcbccc
    make : fix tests build (#2855) alonfaraj 2023-08-28 18:38:35 +03:00
  • be475f60af
    llama.cpp : fix wrong vsnprintf call in MS compiler (#2856) grahameth 2023-08-28 17:38:12 +02:00
  • 3af6b86301
    ggml : tiny ggml_vec_dot_q4_K_q8_K AVX2 improvement (#2819) Ronny Brendel 2023-08-28 14:51:08 +02:00
  • 35feac6560
    ggml : sync (mem align to header + conv_transpose_2d fixes + ggml_alloc) (#2852) Georgi Gerganov 2023-08-28 14:24:53 +03:00
  • 92b1bbd2ec
    CUDA: fix RoPE asserts, block sizes (#2833) Johannes Gäßler 2023-08-28 13:23:55 +02:00
  • dd0dc366da
    llama.h : add missing struct keyword for C compat in callback type (#2847) igarnier 2023-08-28 10:19:59 +02:00
  • f55538c3cc
    metal : fix memory leak (#2762) Georgi Gerganov 2023-08-28 10:59:08 +03:00
  • ebcee207b6
    quantize : make output filename optional again (#2823) Cebtenzzre 2023-08-28 02:32:25 -04:00
  • 3e8ff47af6
    devops : added systemd units and set versioning to use date. (#2835) JohnnyB 2023-08-28 07:31:24 +01:00
  • 103cfafc77
    gguf : fix strings to not be null-terminated (#2839) Georgi Gerganov 2023-08-27 21:50:22 +03:00
  • c10704d01e
    llama : fix MPI threads (close #2827) Georgi Gerganov 2023-08-27 18:55:41 +03:00
  • 230d46c723
    examples : update llama2.c converter to read vocab and write models in GGUF format (#2751) Olivier Chafik 2023-08-27 15:13:31 +01:00
  • 463173a6c0
    llama : speedup tokenization (#2831) Kawrakow 2023-08-27 16:50:33 +03:00
  • eaa13a48ff
    falcon : fix CUDA inference by making K and Q contiguous (#2830) Georgi Gerganov 2023-08-27 16:40:48 +03:00
  • da7455d046
    readme : fix headings Georgi Gerganov 2023-08-27 15:52:34 +03:00
  • 25423e9185
    scripts : helper convert script Georgi Gerganov 2023-08-27 15:24:40 +03:00
  • a6d1189fdd
    k_quants tuning for Falcon-7b (#2816) Kawrakow 2023-08-27 15:19:59 +03:00
  • c48c5bb0b0
    readme : update hot topics Georgi Gerganov 2023-08-27 14:44:35 +03:00
  • d0cee0d36d
    gguf : add 64-bit support (GGUF v2) (#2821) Georgi Gerganov 2023-08-27 14:19:54 +03:00
  • edd4c14817
    llama : more tokenizer fixes (#2810) Georgi Gerganov 2023-08-27 14:19:19 +03:00
  • 1591e2e590
    ggml : detect SSSE3 (#2825) Przemysław Pawełczyk 2023-08-27 10:10:25 +02:00
  • 789c8c945a
    ci : add LoRA test to CI (#2650) slaren 2023-08-27 09:03:27 +02:00
  • c1ac54b77a
    server : add /detokenize endpoint (#2802) Bruce MacDonald 2023-08-26 16:11:45 -07:00
  • 730d9c681e
    convert.py : advanced option (#2753) Kerfuffle 2023-08-26 14:13:36 -06:00
  • c7d92e6dfe
    llama : use Unicode Escape Sequence to replace encoded characters (#2814) Tim Miller 2023-08-27 03:27:07 +09:00
  • 61d1a2895e
    flake.nix : add rocm support and cleanup (#2808) Tungsten842 2023-08-26 20:19:44 +02:00
  • 741ca7dd1c
    llama : move #includes out of _GNU_SOURCE conditional (#2817) Cebtenzzre 2023-08-26 14:17:51 -04:00
  • 72f895c923
    main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (#1528) Dr. Tom Murphy VII Ph.D 2023-08-26 14:12:56 -04:00
  • 50526f37eb
    llama : use std::abs in llama_sample_tail_free (#2800) Cebtenzzre 2023-08-26 12:53:52 -04:00
  • 04f4b1eb10
    k-quants : remove unnecessary tensor shape restrictions (#2811) Georgi Gerganov 2023-08-26 17:37:35 +03:00
  • 7592375403
    Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (#2807) Kawrakow 2023-08-26 17:27:49 +03:00
  • 771551a793
    Fix HellaSwag (#2805) Kawrakow 2023-08-26 16:48:53 +03:00
  • f305bad11e
    flake : build llama.cpp on Intel with nix (#2795) Volodymyr Vitvitskyi 2023-08-26 14:25:39 +01:00
  • a2ca4e9de9
    Handle null rope scaling value (#2793) Nigel Bosch 2023-08-26 07:11:17 -05:00
  • 2ba83c8685
    Fix spm whitespaces (#2806) klosax 2023-08-26 13:45:53 +02:00
  • bae5c5f679
    examples : skip unnecessary external lib in server README.md how-to (#2804) lon 2023-08-26 10:07:43 +02:00
  • 232caf3c15
    llama : fix struct decl (#2790) Marcus Dunn 2023-08-25 09:17:15 -07:00
  • d046dcee08
    Faster perplexity computation (#2786) Kawrakow 2023-08-25 19:05:02 +03:00
  • c82742ac9c
    llama : add llama_beam_search() (#2267) Matt Pulver 2023-08-25 11:18:48 -04:00
  • 28b2c996ca
    convert.py : Get rope scale from HuggingFace models (#2772) Nigel Bosch 2023-08-25 09:41:52 -05:00
  • 154725c543
    llama-bench : add model sizes (#2771) slaren 2023-08-25 15:16:19 +02:00
  • 12e2e33a97
    convert.py : export rope freq_base when converting CodeLlama from an HF model (#2773) slaren 2023-08-25 14:08:53 +02:00
  • 29674ab4e8
    server : display token probabilities in the UI (#2489) Jhen-Jie Hong 2023-08-25 18:32:45 +08:00
  • 5439a0ab57
    ci : pip install gguf in editable mode (#2782) Georgi Gerganov 2023-08-25 13:03:25 +03:00
  • 8194cd8772
    gguf : export objects to user code (#2780) M. Yusuf Sarıgöz 2023-08-25 12:43:41 +03:00
  • 6bbc598a63
    ROCm Port (#1087) Henri Vasserman 2023-08-25 12:09:42 +03:00
  • 3f460a2b72
    cuda : add RoPE kernel for mode == 2 (NeoX) (#2760) Georgi Gerganov 2023-08-25 11:55:59 +03:00
  • 87e3733f24
    gguf : make gguf pip-installable M. Yusuf Sarıgöz 2023-08-25 09:26:05 +03:00
  • b91ad7f461
    ggml-alloc : enlarge size of parse_seq (#2776) Shouzheng Liu 2023-08-25 01:58:00 -04:00
  • 2e5f70a25f
    Added enum to llama_token_get_type return type (#2774) Marcus Dunn 2023-08-24 14:49:30 -07:00
  • d0f77b1353
    convert.py : try to determine n_ctx automatically for CodeLlama (#2770) slaren 2023-08-24 21:10:39 +02:00
  • 0d3094f0c7
    gguf : add rope_freq_base parameter for CodeLlama (#2769) slaren 2023-08-24 20:04:05 +02:00
  • 01f2224682
    falcon : write file type Georgi Gerganov 2023-08-24 19:58:30 +03:00
  • 38b16dfca6
    metal : bug-fix when enable ggml-alloc (#2757) Shouzheng Liu 2023-08-24 12:27:25 -04:00
  • 8f8c28e89c
    convert : auto-determine model name based on dir + scripts update Georgi Gerganov 2023-08-24 19:26:19 +03:00
  • 7694adda8d
    Fix for main example getting stuck when -n -2 and --interactive (#2767) Kerfuffle 2023-08-24 10:11:13 -06:00
  • fea95c682d
    fix convert.py for codellama, add llama 34B to the list of recognized models (#2768) slaren 2023-08-24 17:44:11 +02:00
  • ef955fbd23
    Tag release with build number (#2732) DannyDaemonic 2023-08-24 06:58:02 -07:00
  • d67777c202
    metal : add Q8_0 support (#2763) Georgi Gerganov 2023-08-24 16:19:57 +03:00
  • c3e53b421a
    llama : escape all U+2581 in a string (#2750) Georgi Gerganov 2023-08-24 12:26:01 +03:00
  • 6e91a1b070
    llama : fix grammar sometimes generating null char (#2756) Evan Jones 2023-08-24 00:07:13 -04:00
  • 44d5462b5c
    readme : fix link Georgi Gerganov 2023-08-23 23:44:19 +03:00
  • c7868b0753
    minor : fix trailing whitespace Georgi Gerganov 2023-08-23 23:43:00 +03:00