1)
Questions and Answers :
Issue Discussion :
Invalid tasks
(Message 1470)
Posted 8 Feb 2022 by Magiceye04
Post: Now there is another issue. The tasks do not even start any more.
<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)</message>
<stderr_txt>
DEBUG: Args: ../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200 -c --maxepoch 2048
nthreads: 1 gpudev: 0
Re-exec()-ing to set environment correctly
terminate called after throwing an instance of 'c10::Error'
what(): CUDA error: forward compatibility was attempted on non supported HW
Exception raised from current_device at /home/mlcbuild/git/pytorch-build/build-cuda/pytorch-prefix/src/pytorch/c10/cuda/CUDAFunctions.h:40 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f6f708ce99b in ./libc10.so)
frame #1: at::cuda::getCurrentDeviceProperties() + 0x167 (0x7f6ef2f9f3b7 in ./libtorch_cuda.so)
frame #2: <unknown function> + 0x88018 (0x559489c91018 in ../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200)
frame #3: __libc_start_main + 0xf3 (0x7f6ef21600b3 in /lib/x86_64-linux-gnu/libc.so.6)
frame #4: <unknown function> + 0x8675a (0x559489c8f75a in ../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200)
SIGABRT: abort called
Stack trace (12 frames):
../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200(+0x37df9c)[0x559489f86f9c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f6f708593c0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f6ef217f18b]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f6ef215e859]
../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x135)[0x55948a0387f5]
../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200(+0x398846)[0x559489fa1846]
../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200(+0x398891)[0x559489fa1891]
../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200(+0x3968c4)[0x559489f9f8c4]
./libtorch_cuda.so(_ZN2at4cuda26getCurrentDevicePropertiesEv+0x1bd)[0x7f6ef2f9f40d]
../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200(+0x88018)[0x559489c91018]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f6ef21600b3]
../../projects/www.mlcathome.org_mlcathome/mlds-gpu_9.80_x86_64-pc-linux-gnu__cuda10200(+0x8675a)[0x559489c8f75a]
Exiting...
</stderr_txt>
]]>
2)
Questions and Answers :
Issue Discussion :
All test Wu's (9.96) result in "error while computing"
(Message 1469)
Posted 8 Feb 2022 by Magiceye04
Post: I had the same problem one week ago. All tasks had validation errors after hours of work. I stopped the work until yesterday. Now the error has changed: the WUs do not even start.
©2022 MLC@Home Team
A project of the Cognition, Robotics, and Learning (CORAL) Lab at the University of Maryland, Baltimore County (UMBC)