Gian-Carlo Pascutto
58d07a8043
metal : copy kernels for quant to F32/F16 conversions ( #12017 )
...
metal: use dequantize_q templates
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-02-25 11:27:58 +02:00
Adrian Kretz
22885105a6
metal : optimize dequant q6_K kernel ( #11892 )
2025-02-15 20:39:20 +02:00
Georgi Gerganov
68ff663a04
repo : update links to new url ( #11886 )
...
* repo : update links to new url
ggml-ci
* cont : more urls
ggml-ci
2025-02-15 16:40:57 +02:00
Georgi Gerganov
2139667ec4
metal : fix out-of-bounds write ( #11314 )
...
ggml-ci
2025-01-21 08:48:13 +02:00
PAB
a8cbab201d
ggml: add GGML_SET Metal kernel + i32 CPU kernel (ggml/1037)
...
* implemented cpu kernel
* add i32 test cases in test-backend-ops
* typedef `ggml_metal_kargs_set`
* implemented `kernel_set`
* memcpy
2024-12-05 13:27:33 +02:00
PAB
c2082d93a8
ggml : add GGML_PAD_REFLECT_1D operation (ggml/1034)
...
* ggml_pad_reflect_1d defined in header
* implemented on CPU
* called the forward pass
* impl Metal kernel
* added Metal kernel
* added OP_PAD_REFLECT_1D in test-backend-ops.cpp
* add test-pad-reflect-1d test case
* test case support multiple backend
2024-12-05 13:27:31 +02:00
PAB
efb6ae9630
feat: add GGML_UNARY_OP_ARGMAX Metal kernel (ggml/1019)
...
* implemented argmax kernel
* tpig -> tgpig
* change to strides
* contiguous assertions
* kernel working and tested
* argmax simd parallel implementation
* added 2 new tests for argmax in test-backend-ops
* cosmetic
* added 3 tests cases for perf eval
* add test_argmax in make_test_cases_perf
* Update test-backend-ops.cpp
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com>
2024-12-03 20:04:49 +02:00
PAB
667d70d170
metal : add GGML_OP_CONV_TRANSPOSE_1D kernels (ggml/1026)
...
* wip
* wip implementation f32
* kernel conv transpose 1d f32 working
* initial commit
2024-12-03 20:04:49 +02:00
Georgi Gerganov
0115df2f65
metal : small-batch mat-mul kernels ( #10581 )
...
* metal : small-batch mat-mul kernels
ggml-ci
* metal : add rest of types
ggml-ci
* metal : final adjustments
ggml-ci
* metal : add comments
ggml-ci
2024-12-03 11:52:33 +02:00
Georgi Gerganov
b756441104
metal : minor code formatting
2024-11-25 15:08:04 +02:00
Plamen Minev
611fabd792
metal : fix offset integer overflows in im2col (ggml/1015)
...
-- While running StableDiffusion.cpp locally with Metal, some offsets overflow and result in incorrect calculations
2024-11-19 20:03:21 +02:00
PAB
12b0ad953a
metal : add GGML_UNARY_OP_ELU kernel (ggml/1018)
2024-11-19 20:03:21 +02:00
Georgi Gerganov
cf32a9b93a
metal : refactor kernel args into structs ( #10238 )
...
* metal : add kernel arg structs (wip)
* metal : fattn args
ggml-ci
* metal : cont + avoid potential int overflow [no ci]
* metal : mul mat struct (wip)
* cont : mul mat vec
* cont : pass by reference
* cont : args is first argument
* cont : use char ptr
* cont : shmem style
* cont : thread counters style
* cont : mul mm id
ggml-ci
* cont : int safety + register optimizations
ggml-ci
* metal : GGML_OP_CONCAT
ggml-ci
* metal : GGML_OP_ADD, GGML_OP_SUB, GGML_OP_MUL, GGML_OP_DIV
* metal : GGML_OP_REPEAT
* metal : GGML_OP_CPY
* metal : GGML_OP_RMS_NORM
* metal : GGML_OP_NORM
* metal : add TODOs for rest of ops
* ggml : add ggml-metal-impl.h
ggml-ci
2024-11-17 11:23:01 +02:00
Diego Devesa
ae8de6d50a
ggml : build backends as libraries ( #10256 )
...
* ggml : build backends as libraries
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: R0CKSTAR <xiaodong.ye@mthreads.com>
2024-11-14 18:04:35 +01:00