NAME
pbs_gpureset - reset GPU error counts

SYNOPSIS
#include <pbs_error.h>
#include <pbs_ifl.h>

int pbs_gpureset(int connect, char *mom_node, int gpu_id, int ecc_perm, int ecc_vol)

DESCRIPTION
Issue a batch request for the pbs_mom to reset the ECC error counts on one of its Nvidia GPUs. The GPU's error count is reset by sending a GPU Control batch request to the batch server.

The argument, mom_node, specifies the host on which the GPU is located. It must be the name of a host that is a member of the cluster of hosts managed by the server.

The argument, gpu_id, specifies the ID of the GPU on the MOM node.

The argument, ecc_perm, specifies whether to reset the GPU's permanent ECC error count. A value of 1 resets the count; a value of 0 leaves it unchanged.

The argument, ecc_vol, specifies whether to reset the GPU's volatile ECC error count. A value of 1 resets the count; a value of 0 leaves it unchanged.

This call requires PBS Operator or Manager privilege. It also requires that Torque be configured with --enable-nvidia-gpu.

SEE ALSO
qgpureset(1B)

DIAGNOSTICS
When the batch request generated by the pbs_gpureset() function has been completed successfully by a batch server, the routine returns 0 (zero). Otherwise, a nonzero error number is returned. The error number is also set in pbs_errno.
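
EXAMPLES
The following is a minimal sketch, not part of the Torque distribution, of how a caller might check the two ECC flags before issuing the request and then pass the return value through as described under DIAGNOSTICS. The helper reset_gpu_ecc(), the gpureset_fn typedef, the stand-in fake_gpureset(), and the -1 return for rejected arguments are all hypothetical names and conventions introduced for illustration. The stand-in exists only so the sketch compiles without the Torque headers.

```c
#include <assert.h>
#include <stdio.h>

/* Function-pointer type with the same shape as pbs_gpureset() in the
 * SYNOPSIS. Hypothetical: lets the sketch build without <pbs_ifl.h>. */
typedef int (*gpureset_fn)(int connect, char *mom_node, int gpu_id,
                           int ecc_perm, int ecc_vol);

/* Hypothetical helper. Rejects arguments that the DESCRIPTION does not
 * allow (flags other than 0 or 1, missing host name) with -1, an
 * assumed convention. Otherwise returns whatever the reset call returns:
 * 0 on success, nonzero on failure. */
int reset_gpu_ecc(gpureset_fn reset, int connect, char *mom_node,
                  int gpu_id, int ecc_perm, int ecc_vol)
{
    assert(reset != NULL);

    if (mom_node == NULL || mom_node[0] == '\0')
        return -1;
    if ((ecc_perm != 0 && ecc_perm != 1) || (ecc_vol != 0 && ecc_vol != 1))
        return -1;

    int rc = reset(connect, mom_node, gpu_id, ecc_perm, ecc_vol);
    if (rc != 0)
        fprintf(stderr, "GPU %d on %s: reset failed, error %d\n",
                gpu_id, mom_node, rc);
    return rc;
}

/* Stand-in for pbs_gpureset(), for illustration only: pretends that
 * GPU 0 resets cleanly and any other GPU ID fails with a made-up
 * error number of 1. */
int fake_gpureset(int connect, char *mom_node, int gpu_id,
                  int ecc_perm, int ecc_vol)
{
    (void)connect; (void)mom_node; (void)ecc_perm; (void)ecc_vol;
    return gpu_id == 0 ? 0 : 1;
}
```

In a program built against Torque, one would presumably include <pbs_error.h> and <pbs_ifl.h>, obtain the connection handle from pbs_connect(), pass pbs_gpureset itself in place of the stand-in, and consult pbs_errno when the result is nonzero.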