ROCm Port (#1087)
* use hipblas based on cublas * Update Makefile for the Cuda kernels * Expand arch list and make it overrideable * Fix multi GPU on multiple amd architectures with rocblas_initialize() (#5) * add hipBLAS to README * new build arg LLAMA_CUDA_MMQ_Y * fix half2 decomposition * Add intrinsics polyfills for AMD * AMD assembly optimized __dp4a * Allow overriding CC_TURING * use "ROCm" instead of "CUDA" * ignore all build dirs * Add Dockerfiles * fix llama-bench * fix -nommq help for non CUDA/HIP --------- Co-authored-by: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com> Co-authored-by: ardfork <134447697+ardfork@users.noreply.github.com> Co-authored-by: funnbot <22226942+funnbot@users.noreply.github.com> Co-authored-by: Engininja2 <139037756+Engininja2@users.noreply.github.com> Co-authored-by: Kerfuffle <44031344+KerfuffleV2@users.noreply.github.com> Co-authored-by: jammm <2500920+jammm@users.noreply.github.com> Co-authored-by: jdecourval <7315817+jdecourval@users.noreply.github.com>
This commit is contained in:
parent
3f460a2b72
commit
6bbc598a63
12 changed files with 335 additions and 59 deletions
15
.gitignore
vendored
15
.gitignore
vendored
|
@ -16,20 +16,7 @@
|
|||
.vs/
|
||||
.vscode/
|
||||
|
||||
build/
|
||||
build-em/
|
||||
build-debug/
|
||||
build-release/
|
||||
build-ci-debug/
|
||||
build-ci-release/
|
||||
build-static/
|
||||
build-cublas/
|
||||
build-opencl/
|
||||
build-metal/
|
||||
build-mpi/
|
||||
build-no-accel/
|
||||
build-sanitize-addr/
|
||||
build-sanitize-thread/
|
||||
build*/
|
||||
out/
|
||||
tmp/
|
||||
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue