Commit graph

  • f1782c68de
    quantize : fail fast on write errors (#3521) cebtenzzre 2023-10-07 04:41:52 -04:00
  • c26765a0a1
    metal : support default.metallib load & reuse code for swift package (#3522) Jhen-Jie Hong 2023-10-07 03:40:27 -05:00
  • 0e797c2fc5
    llm : support Adept Persimmon 8B (#3410) Phillip Kravtsov 2023-10-07 00:12:43 -07:00
  • 3a716b4dae
    Fix for #3454 (#3455) goerch 2023-10-07 06:57:01 +02:00
  • 1faaae8c2b
    readme : update models, cuda + ppl instructions (#3510) BarfingLemurs 2023-10-06 15:13:36 -04:00
  • cb13d73a72
    server : docs fix default values and add n_probs (#3506) Mihai 2023-10-06 21:39:33 +03:00
  • 9ca79d5cbb
    kv cache slot search improvements (#3493) Kerfuffle 2023-10-06 10:10:13 -06:00
  • 0c731ca403
    prompts : fix editorconfig checks after #3416 Georgi Gerganov 2023-10-06 16:35:55 +03:00
  • a8777ad84e
    parallel : add option to load external prompt file (#3416) pudepiedj 2023-10-06 14:16:38 +01:00
  • 97af49fa39
    server : reuse llama_sample_token common util (#3494) Jhen-Jie Hong 2023-10-06 07:44:24 -05:00
  • 16820a5a0d
    llama : correct hparams comparison (#3446) l3utterfly 2023-10-06 18:47:59 +08:00
  • 04b2f4386e
    ci : fix xcodebuild destinations (#3491) Jhen-Jie Hong 2023-10-06 05:36:43 -05:00
  • 48edda30ee
    convert : update Falcon script for new HF config (#3448) cebtenzzre 2023-10-05 15:00:34 -04:00
  • 45eba9369f
    build : use std::make_tuple() for compatibility with older GCC versions (#3488) Kenvix ⭐ 2023-10-06 01:16:39 +08:00
  • acec9eaaa9
    common : process escape sequences in reverse prompts (#3461) staviq 2023-10-05 18:17:29 +02:00
  • e2583cbc29
    CLBlast: Fix handling of on-device tensor data shibe2 2023-10-05 15:57:03 +04:00
  • e8b8d32e86
    server : fix incorrect num_tokens_predicted (#3480) Jhen-Jie Hong 2023-10-05 09:02:55 -05:00
  • 8f3a642ec1
    swift : disable ACCELERATE_NEW_LAPACK (#3481) Jhen-Jie Hong 2023-10-05 09:00:07 -05:00
  • 0745384449
    ci : add swift build via xcodebuild (#3482) Jhen-Jie Hong 2023-10-05 08:56:21 -05:00
  • 019ba1dcd0
    convert : fix Baichuan2 models by using vocab size in config.json (#3299) Kerfuffle 2023-10-04 08:20:28 -06:00
  • beabc8cfb0
    readme : add project status link Georgi Gerganov 2023-10-04 16:50:44 +03:00
  • 0d152b37fe
    ggml : fix build after #3329 Georgi Gerganov 2023-10-04 16:25:41 +03:00
  • f8c90cdbaa
    llm : add Refact model (#3329) ds5t5 2023-10-04 06:23:39 -07:00
  • f93af02488
    sync : ggml (conv 1d + 2d updates, UB fixes) (#3468) Georgi Gerganov 2023-10-04 15:29:58 +03:00
  • f72f8f22c9
    finetune : readme fix typo (#3465) Merrick Christensen 2023-10-04 00:33:13 -06:00
  • 79f34abddb
    ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (#3453) Tameem 2023-10-03 23:38:19 +05:00
  • 8186242b6d
    main : consistent prefix/suffix coloring (#3425) h-h-h-h 2023-10-03 20:16:15 +02:00
  • ac2219fef3
    llama : fix session saving/loading (#3400) Georgi Gerganov 2023-10-03 21:04:01 +03:00
  • 48be797ffb
    llama : expose model's rope_freq_scale in the API (#3418) Alex Klinkhamer 2023-10-03 10:09:28 -07:00
  • f56e1baec3
    metal : alibi for arbitrary number of heads (#3426) Jiahao Li 2023-10-04 00:55:21 +08:00
  • 017efe899d
    cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273) Eve 2023-10-03 16:53:15 +00:00
  • ff5a3f0c09
    Work on the BPE tokenizer (#3252) goerch 2023-10-03 09:16:26 +02:00
  • 1c84003c08
    convert : fix vocab size when not defined in hparams (#3421) cebtenzzre 2023-10-02 18:07:24 -04:00
  • e78f0b0d05
    cmake : increase minimum version for add_link_options (#3444) cebtenzzre 2023-10-02 15:38:43 -04:00
  • 665018c749
    CLBlast: Add broadcast support for matrix multiplication (#3402) shibe2 2023-10-02 23:26:15 +04:00
  • 29a404a951
    gguf : add BERT, MPT, and GPT-J arch info (#3408) cebtenzzre 2023-10-02 15:20:28 -04:00
  • 0fe321031a
    gguf : general usability improvements (#3409) cebtenzzre 2023-10-02 14:58:46 -04:00
  • 9476b01226
    cmake : make CUDA flags more similar to the Makefile (#3420) cebtenzzre 2023-10-02 09:16:50 -04:00
  • a03ce38455
    finetune : fix #3404 (#3437) xaedes 2023-10-02 15:15:45 +02:00
  • a847676984
    metal : set log callback before initializing (#3427) Adrian 2023-10-02 03:49:59 -07:00
  • 095231dfd3
    cmake : fix transient definitions in find pkg (#3411) bandoti 2023-10-02 06:51:49 -03:00
  • ea55295a74
    docker : ignore Git files (#3314) Kevin Ji 2023-10-02 04:53:53 -04:00
  • c97f01c362
    infill : add new example + extend server API (#3296) vvhg1 2023-10-02 09:42:02 +02:00
  • f5ef5cfb18
    ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412) slaren 2023-09-30 18:12:57 +02:00
  • 40e07a60f9
    llama.cpp : add documentation about rope_freq_base and scale values (#3401) slaren 2023-09-29 18:42:32 +02:00
  • bc34dd4f5b
    train : fix KQ_pos allocation (#3392) Georgi Gerganov 2023-09-29 19:05:18 +03:00
  • 2777a84be4
    llama : quantize up to 31% faster on Linux and Windows with mmap (#3206) Cebtenzzre 2023-09-29 09:48:45 -04:00
  • 0a4a4a0982
    readme : update hot topics + model links (#3399) BarfingLemurs 2023-09-29 08:50:35 -04:00
  • 569550df20
    readme : add link to grammars app (#3388) Andrew Duffy 2023-09-29 07:15:57 -04:00
  • c71bf2c45c
    swift : fix build on xcode 15 (#3387) Jhen-Jie Hong 2023-09-29 13:25:13 +08:00
  • bc39553c90
    build : enable more non-default compiler warnings (#3200) Cebtenzzre 2023-09-28 17:41:44 -04:00
  • 0ccfc62a96
    ggml_tensor: update the structure comments. (#3283) Hua Jiang 2023-09-28 13:06:18 -07:00
  • 7f1a0fe709
    ggml : release the requested thread pool resource (#3292) Qu Zongfu 2023-09-29 03:51:52 +08:00
  • 16bc66d947
    llama.cpp : split llama_context_params into model and context params (#3301) slaren 2023-09-28 21:42:38 +02:00
  • 0512d66670
    ci : multithreaded builds (#3311) Eve 2023-09-28 19:31:04 +00:00
  • 0e76a8992c
    train : finetune LORA (#2632) xaedes 2023-09-28 20:40:11 +02:00
  • 2db94d98ed
    gguf : basic type checking in gguf_get_* (#3346) Cebtenzzre 2023-09-28 14:30:31 -04:00
  • ecf90b1a51
    gguf : make token scores and types optional (#3347) Cebtenzzre 2023-09-28 14:30:15 -04:00
  • 2619109ad5
    ci : disable freeBSD builds due to lack of VMs (#3381) Georgi Gerganov 2023-09-28 19:36:36 +03:00
  • ec893798b7
    llama : custom attention mask + parallel decoding + no context swaps (#3228) Georgi Gerganov 2023-09-28 19:04:36 +03:00
  • 45855b3f1c
    docs : mark code as Bash (#3375) Kevin Ji 2023-09-28 09:11:32 -04:00
  • 4aea3b846e
    readme : add Mistral AI release 0.1 (#3362) Pierre Alexandre SCHEMBRI 2023-09-28 14:13:37 +02:00
  • da0400344b
    ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (#3370) slaren 2023-09-28 12:08:28 +02:00
  • e519621010
    convert : remove bug in convert.py permute function (#3364) Zhang Peiyuan 2023-09-28 02:45:20 +08:00
  • ac43576124
    make-ggml.py : compatibility with more models and GGUF (#3290) Richard Roberson 2023-09-27 10:25:12 -06:00
  • 20c7e1e804
    gguf : fix a few general keys (#3341) Cebtenzzre 2023-09-27 12:18:07 -04:00
  • dc6897404e
    metal : reusing llama.cpp logging (#3152) Rickard Hallerbäck 2023-09-27 17:48:33 +02:00
  • 527e57cfd8
    build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342) Jag Chadha 2023-09-27 11:34:32 -04:00
  • ffe88a36a9
    readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340) BarfingLemurs 2023-09-27 11:30:36 -04:00
  • 99115f3fa6
    cmake : fix build-info.h on MSVC (#3309) DAN™ 2023-09-25 18:45:33 -04:00
  • 1726f9626f
    docs: Fix typo CLBlast_DIR var. (#3330) 2f38b454 2023-09-26 02:24:52 +08:00
  • a98b1633d5
    nix : add cuda, use a symlinked toolkit for cmake (#3202) Erik Scholz 2023-09-25 13:48:30 +02:00
  • c091cdfb24
    llama-bench : add README (#3317) slaren 2023-09-23 21:48:24 +02:00
  • 51a7cf5c6e
    examples : fix RoPE defaults to match PR #3240 (#3315) Cebtenzzre 2023-09-23 05:28:50 -04:00
  • bedb92b603
    scripts : use /usr/bin/env in shebang (#3313) Kevin Ji 2023-09-22 23:52:23 -04:00
  • bc9d3e3971
    Update README.md (#3289) Lee Drake 2023-09-21 13:00:24 -06:00
  • 36b904e200
    ggml-opencl.cpp: Make private functions static (#3300) shibe2 2023-09-21 22:10:26 +04:00
  • 324f3403d5
    zig : fix for updated c lib (#3259) Edward Taylor 2023-09-21 21:08:20 +12:00
  • f56c418ab0
    embedding : update README.md (#3224) yuiseki 2023-09-21 17:57:40 +09:00
  • 8185710a80
    CUDA: use only 1 thread if fully offloaded (#2915) Johannes Gäßler 2023-09-21 10:43:53 +02:00
  • 7eb41179ed
    readme : update hot topics Georgi Gerganov 2023-09-20 20:48:22 +03:00
  • a5661d7e71
    llama : allow gguf RoPE keys to be overridden with defaults (#3240) Cebtenzzre 2023-09-20 12:12:47 -04:00
  • 65c2c1c5ab
    benchmark-matmult : do not use integer abs() on a float (#3277) Cebtenzzre 2023-09-20 12:06:08 -04:00
  • 80834daecf
    flake : Restore default package's buildInputs (#3262) kang 2023-09-20 22:48:22 +09:00
  • a40f2b656f
    CI: FreeBSD fix (#3258) Alon 2023-09-20 15:06:36 +03:00
  • d119c04c15
    examples : fix benchmark-matmult (#1554) Georgi Gerganov 2023-09-20 10:02:39 +03:00
  • 8781013ef6
    make : restore build-info.h dependency for several targets (#3205) Cebtenzzre 2023-09-18 10:03:53 -04:00
  • 7ddf185537
    ci : switch cudatoolkit install on windows to networked (#3236) Erik Scholz 2023-09-18 02:21:47 +02:00
  • ee66942d7e
    CUDA: fix peer access logic (#3231) Johannes Gäßler 2023-09-17 23:35:20 +02:00
  • 111163e246
    CUDA: enable peer access between devices (#2470) Johannes Gäßler 2023-09-17 16:37:53 +02:00
  • 8b428c9bc8
    llama.cpp : show model size and BPW on load (#3223) slaren 2023-09-17 14:33:28 +02:00
  • 578d8c8f5c
    CUDA: fix scratch malloced on non-main device (#3220) Johannes Gäßler 2023-09-17 14:16:22 +02:00
  • b541b4f0b1
    Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) IsaacDynamo 2023-09-16 19:35:25 +02:00
  • 5dbc2b3213
    Enable build with CUDA 11.0 (make) (#3132) Vlad 2023-09-16 17:55:43 +03:00
  • b08e75baea
    Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170) goerch 2023-09-16 13:41:33 +02:00
  • e6616cf0db
    examples : add compiler version and target to build info (#2998) Cebtenzzre 2023-09-15 16:59:49 -04:00
  • 3aefaab9e5
    check C++ code with -Wmissing-declarations (#3184) Cebtenzzre 2023-09-15 15:38:27 -04:00
  • 69eb67e282
    fix build numbers by setting fetch-depth=0 (#3197) Cebtenzzre 2023-09-15 15:18:15 -04:00
  • 4fe09dfe66
    llama : add support for StarCoder model architectures (#3187) Meng Zhang 2023-09-16 03:02:13 +08:00
  • 80291a1d02
    common : do not use GNU zero-length __VA_ARGS__ extension (#3195) Cebtenzzre 2023-09-15 14:02:01 -04:00