How to disable or change the timeout limit for the GPU under Linux?
You can disable the watchdog by modifying your Xorg config (Option "Interactive" "0"). An example is available in the answer to this question: CUDA Visual Profiler 'Interactive' X config option?
OpenGL: render time limit on Linux
I'm afraid this is not possible. After a lot of scouring through the documentation of both X and Wayland, I could not find anything mentioning GPU watchdog timer settings, so I believe this is driver-specific and likely inaccessible to the user (that or I am terrible at searching).
It is, however, possible to disable this watchdog under X on NVIDIA hardware by adding a line to your xorg.conf, which is then passed on to the graphics driver.
Option "Interactive" "boolean"
This option controls the behavior of the driver's watchdog, which attempts to detect and terminate GPU programs that get stuck, in order to ensure that the GPU remains available for other processes. GPU compute applications, however, often have long-running GPU programs, and killing them would be undesirable. If you are using GPU compute applications and they are getting prematurely terminated, try turning this option off.
Note that even the NVIDIA docs don't mention a numeric quantity for the timeout.
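To make the xorg.conf change above concrete, here is a minimal sketch that writes a draft config snippet to a temporary directory for review. The file name and the Identifier value are hypothetical, and placing the option in a Device section is an assumption based on the usual layout of NVIDIA X config options; adapt it to your existing configuration.

```shell
#!/bin/sh
# Write a draft X config snippet to a temporary directory so nothing under
# /etc is touched until you have reviewed it.
conf_dir=$(mktemp -d)
conf="$conf_dir/20-nvidia-interactive.conf"   # hypothetical file name

# "nvidia-gpu" is a placeholder Identifier; "0" turns the watchdog option off.
cat > "$conf" <<'EOF'
Section "Device"
    Identifier "nvidia-gpu"
    Driver     "nvidia"
    Option     "Interactive" "0"
EndSection
EOF

echo "Draft written to $conf:"
cat "$conf"
```

If you already have a Device section for the NVIDIA driver, adding just the Option line there is probably simpler than installing a separate file. Either way, the X server has to be restarted before the driver sees the change.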
Disabling TDR for CUDA in Windows 8
The Windows WDDM driver's Timeout Detection and Recovery (TDR) mechanism can be disabled, or the timeout can be extended beyond the default 2 seconds. Timeout Detection and Recovery is documented on MSDN.
(Edited: The above link is dead. The information that it provided might now be available at https://docs.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys)
Nsight Visual Studio Edition Nsight.Monitor has settings to disable or increase the timeout. Otherwise, you can use the registry keys in the MSDN article. Make sure to restart the computer after making changes.
I recommend that you increase TdrDelay before completely disabling TDR.
Tesla GPUs can use the Tesla Compute Cluster (TCC) driver, which does not have a timeout watchdog.
How do I select which GPU to run a job on?
The problem was caused by not setting the CUDA_VISIBLE_DEVICES variable within the shell correctly.
To specify CUDA device 1, for example, you would set CUDA_VISIBLE_DEVICES using
export CUDA_VISIBLE_DEVICES=1
or
CUDA_VISIBLE_DEVICES=1 ./cuda_executable
The former sets the variable for the life of the current shell; the latter sets it only for that particular invocation of the executable.
If you want to specify more than one device, use
export CUDA_VISIBLE_DEVICES=0,1
or
CUDA_VISIBLE_DEVICES=0,1 ./cuda_executable
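The difference in scope between the two forms can be checked without a GPU. The sketch below uses a plain `sh -c` child process as a stand-in for `./cuda_executable` and simply reports what that child sees in its environment; the variable names `prefixed`, `after_prefix`, and `exported` are only for illustration.

```shell
#!/bin/sh
# Start from a clean state so the demonstration is reproducible.
unset CUDA_VISIBLE_DEVICES

# Prefix form: the variable exists only in the environment of this one command.
prefixed=$(CUDA_VISIBLE_DEVICES=1 sh -c 'echo "${CUDA_VISIBLE_DEVICES:-unset}"')
after_prefix=${CUDA_VISIBLE_DEVICES:-unset}

# export form: every command started from this shell inherits the value.
export CUDA_VISIBLE_DEVICES=0,1
exported=$(sh -c 'echo "${CUDA_VISIBLE_DEVICES:-unset}"')

echo "prefixed command saw: $prefixed"
echo "shell afterwards:     $after_prefix"
echo "after export:         $exported"
```

The prefixed child sees `1` while the shell itself still has the variable unset, and after the `export` any new child sees `0,1`. If a job ends up on the wrong GPU, echoing the variable from inside the launching script like this is a quick way to confirm what the CUDA program will actually inherit.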