diff --git a/ci/run.sh b/ci/run.sh
index 4d9c2dc48..ebd662b38 100755
--- a/ci/run.sh
+++ b/ci/run.sh
@@ -59,6 +59,8 @@ if [ ! -z ${GG_BUILD_SYCL} ]; then
     export ONEAPI_DEVICE_SELECTOR="level_zero:0"
     # Enable sysman for correct memory reporting
     export ZES_ENABLE_SYSMAN=1
+    # to circumvent precision issues on CPY operations
+    export SYCL_PROGRAM_COMPILE_OPTIONS="-cl-fp32-correctly-rounded-divide-sqrt"
     CMAKE_EXTRA="${CMAKE_EXTRA} -DGGML_SYCL=1 -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DGGML_SYCL_F16=ON"
 fi
 
diff --git a/docs/backend/SYCL.md b/docs/backend/SYCL.md
index 19fe8a9d2..f1204dded 100644
--- a/docs/backend/SYCL.md
+++ b/docs/backend/SYCL.md
@@ -302,6 +302,10 @@ cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -
 cmake --build build --config Release -j -v
 ```
 
+It is possible to come across some precision issues when running tests that stem from using faster
+instructions, which can be circumvented by setting the environment variable `SYCL_PROGRAM_COMPILE_OPTIONS`
+as `-cl-fp32-correctly-rounded-divide-sqrt`
+
 #### Nvidia GPU
 
 The SYCL backend depends on [oneMath](https://github.com/uxlfoundation/oneMath) for Nvidia and AMD devices.
@@ -322,6 +326,9 @@ cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=NVIDIA -DGGML_SYCL_DEVICE_ARCH=
 cmake --build build --config Release -j -v
 ```
 
+It is possible to come across some precision issues when running tests that stem from using faster
+instructions, which can be circumvented by passing the `-fno-fast-math` flag to the compiler.
+
 #### AMD GPU
 
 The SYCL backend depends on [oneMath](https://github.com/uxlfoundation/oneMath) for Nvidia and AMD devices.