CUDA app testing

Yavanius
Joined: 17 Jan 17
Posts: 11
Credit: 15,198
RAC: 51
Message 5055 - Posted: 17 Oct 2017, 21:23:57 UTC
Last modified: 17 Oct 2017, 21:34:57 UTC

Intel i5-2520M (2nd Gen) w/NVS4200M

Tasks are erroring out with "Output file absent":

10/17/2017 14:25:01 | DrugDiscovery@Home | Output file gmx_GMX-T024_XXXX_TTTTT_tasks.gmx_636_1000.in_1_r513545929_2 for task gmx_GMX-T024_XXXX_TTTTT_tasks.gmx_636_1000.in_1 absent
10/17/2017 14:25:01 | DrugDiscovery@Home | Output file gmx_GMX-T024_XXXX_TTTTT_tasks.gmx_636_1000.in_1_r513545929_5 for task gmx_GMX-T024_XXXX_TTTTT_tasks.gmx_636_1000.in_1 absent

195 (0x000000C3) EXIT_CHILD_FAILED

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
14:25:03 (3636): wrapper (7.9.26016): starting
[DEBUG] replacing '$PROJECT_DIR' with 'C:\ProgramData\BOINC/projects/boinc.drugdiscoveryathome.com'
[DEBUG] replacing '$NTHREADS' with '1'
[DEBUG] replacing '$GPU_DEVICE_NUM' with '0'
[DEBUG] replacing '$PWD' with 'C:\ProgramData\BOINC\slots\5'
14:25:03 (3636): wrapper: running gmx.exe (mdrun -nt 1 -deffnm md -gpu_id 0)
14:25:04 (3636): cpu time 0.031250, checkpoint CPU time 0.000000 frac done 0.000000
14:25:05 (3636): gmx.exe exited; CPU time 0.031250
14:25:05 (3636): app exit status: 0xc0000135
14:25:05 (3636): called boinc_finish(195)

</stderr_txt>
]]>

Bill F
Joined: 15 Aug 17
Posts: 7
Credit: 38,814
RAC: 340
Message 5056 - Posted: 18 Oct 2017, 4:36:22 UTC

Windows failure after 4 seconds

Work Unit gmx_GMX-T031_XXXX_TTTTT_tasks.gmx_394_1000.in_3

Ver 0.23

PC 7712 Win 7 64 Bit Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)

AuthenticAMD
AMD Phenom(tm) II X6 1045T Processor [Family 16 Model 10 Stepping 0]

NVIDIA GeForce 210 (1024MB) driver: 342.01 OpenCL: 1.0

Run Time 4 Sec

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
23:20:10 (7760): wrapper (7.9.26016): starting
[DEBUG] replacing '$PROJECT_DIR' with 'C:\ProgramData\BOINC/projects/boinc.drugdiscoveryathome.com'
[DEBUG] replacing '$NTHREADS' with '1'
[DEBUG] replacing '$GPU_DEVICE_NUM' with '0'
[DEBUG] replacing '$PWD' with 'C:\ProgramData\BOINC\slots\10'
23:20:10 (7760): wrapper: running gmx.exe (mdrun -nt 1 -deffnm md -gpu_id 0)
23:20:12 (7760): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
23:20:13 (7760): gmx.exe exited; CPU time 0.000000
23:20:13 (7760): app exit status: 0xc0000135
23:20:13 (7760): called boinc_finish(195)

</stderr_txt>
]]>

PhilTheNet
Joined: 22 Dec 16
Posts: 6
Credit: 498,758
RAC: 5,647
Message 5057 - Posted: 18 Oct 2017, 4:59:42 UTC - in response to Message 5056.  

All the same :

<core_client_version>7.8.2</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)
</message>
<stderr_txt>
20:19:27 (844): wrapper (7.9.26016): starting (811 +dev#, opencl_ati_101)
[DEBUG] replacing '$PROJECT_DIR' with 'D:\BOINC/projects/boinc.drugdiscoveryathome.com'
[DEBUG] replacing '$NTHREADS' with '1'
[DEBUG] replacing '$GPU_DEVICE_NUM' with '7337568'
[DEBUG] replacing '$PWD' with 'D:\BOINC\slots\0'
20:19:27 (844): wrapper: running gmx.exe (-gpu_id 0 mdrun -nt 1 -deffnm md -gpu_id 7337568)
20:19:30 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:31 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:32 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:33 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:34 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:35 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:36 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:37 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:38 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:39 (844): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
20:19:40 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:41 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:42 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:43 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:44 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:45 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:46 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:47 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:48 (844): cpu time 1.203125, checkpoint CPU time 0.000000 frac done 0.000000
20:19:49 (844): gmx.exe exited; CPU time 1.203125
20:19:49 (844): app exit status: 0xc000001d
20:19:49 (844): called boinc_finish(195)

</stderr_txt>
]]>

:(

Jacob Klein
Joined: 3 May 10
Posts: 2
Credit: 115,688
RAC: 1,122
Message 5058 - Posted: 18 Oct 2017, 5:43:16 UTC
Last modified: 18 Oct 2017, 5:43:34 UTC

Can you please configure the project to NOT send these tasks to PCs without AVX2 ?!?

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 38,082
Message 5062 - Posted: 18 Oct 2017, 12:36:29 UTC - in response to Message 5058.  

Can you please configure the project to NOT send these tasks to PCs without AVX2 ?!?

This application doesn't use AVX2, just AVX.

But yes, I will create a plan_class so that WUs are not sent to hosts without AVX.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.

JugNut
Joined: 26 Jan 17
Posts: 11
Credit: 94,298,548
RAC: 597,096
Message 5063 - Posted: 18 Oct 2017, 13:28:29 UTC - in response to Message 5058.  
Last modified: 18 Oct 2017, 13:49:16 UTC

Hey Krzysztof,
GPU usage is extremely poor: running one task at a time, GPU usage is only 7% on 980 Tis and 1080 Tis.

On this rig (i7-5960X with 2 x GTX 970, driver version 382.05) I'm running 3 at a time (0.33).
But in log #1 on the above box I found this:


CUDA driver: 6.50
CUDA runtime: 0.0


NOTE: Error occurred during GPU detection:
CUDA driver version is insufficient for CUDA runtime version
Can not use GPU acceleration, will fall back to CPU kernels.



Running on 1 node with total 4 cores, 8 logical cores, 0 compatible GPUs
Hardware detected:
CPU info:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz
Family: 6 model: 58 stepping: 9
CPU features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: SSE4.1


Binary not matching hardware - you might be losing performance.
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: SSE4.1


The current CPU can measure timings more accurately than the code in
gmx was configured to use. This might affect your simulation
speed as accurate timings are needed for load-balancing.
Please consider rebuilding gmx with the GMX_USE_RDTSCP=ON CMake option
------------------------------------------------------------------------------------------------

So are the GPUs falling back to CPU only?
If this driver version is insufficient, then what version is sufficient?
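
For anyone who wants to check their own logs for the same two symptoms, here is a minimal Python sketch. It only looks for the exact phrases quoted in the log above; the function name `summarize_mdrun_log` and the dictionary keys are made up for illustration and are not part of GROMACS or BOINC.

```python
import re

def summarize_mdrun_log(log_text):
    """Report GPU fallback and SIMD mismatch found in an mdrun log excerpt."""
    # Phrase taken verbatim from the GPU detection note quoted above.
    gpu_fallback = "will fall back to CPU kernels" in log_text

    best = re.search(
        r"SIMD instructions most likely to fit this hardware:\s*(\S+)", log_text)
    built = re.search(
        r"SIMD instructions selected at GROMACS compile time:\s*(\S+)", log_text)
    best_simd = best.group(1) if best else None
    built_simd = built.group(1) if built else None

    return {
        "gpu_fallback": gpu_fallback,
        "best_simd": best_simd,
        "built_simd": built_simd,
        # Only flag a mismatch when both lines were actually found.
        "simd_mismatch": (best_simd is not None and built_simd is not None
                          and best_simd != built_simd),
    }

# Excerpt copied from the log in this post.
EXCERPT = """\
NOTE: Error occurred during GPU detection:
CUDA driver version is insufficient for CUDA runtime version
Can not use GPU acceleration, will fall back to CPU kernels.

SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: SSE4.1
"""

if __name__ == "__main__":
    print(summarize_mdrun_log(EXCERPT))
```

On the excerpt above it flags both the CPU fallback and the AVX_256 vs SSE4.1 mismatch.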

Jim1348
Joined: 27 Jul 17
Posts: 27
Credit: 85,112
RAC: 39
Message 5064 - Posted: 18 Oct 2017, 13:43:07 UTC

What I don't understand is why not just update to the latest GROMACS? It is free, isn't it?
Then you could move to a later version of CUDA, as I understand the problem.

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 38,082
Message 5065 - Posted: 18 Oct 2017, 13:45:48 UTC - in response to Message 5063.  
Last modified: 18 Oct 2017, 13:46:21 UTC

@JugNut Your CUDA version is 6.5, but the application requires at least 8.0, so in your case the application switches to CPU-only mode.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.

JugNut
Joined: 26 Jan 17
Posts: 11
Credit: 94,298,548
RAC: 597,096
Message 5066 - Posted: 18 Oct 2017, 13:54:08 UTC - in response to Message 5065.  
Last modified: 18 Oct 2017, 14:50:59 UTC

Thanks Krzysztof, perhaps you could post minimum requirements for the GPU app on the front page.

Cheers & once again good luck.

Bill F
Joined: 15 Aug 17
Posts: 7
Credit: 38,814
RAC: 340
Message 5069 - Posted: 18 Oct 2017, 15:20:43 UTC

System 7712, running Windows with an AMD CPU that has only AVX (not AVX2 or AVX256), is failing all CUDA jobs.

Nvidia GPU 0: GeForce 210 (driver version 342.01, CUDA version 6.5, compute capability 1.2, 1024MB, 844MB available, 67 GFLOPS peak)

Do I meet the minimum specs for this work?

Thanks
Bill F

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 38,082
Message 5070 - Posted: 18 Oct 2017, 15:30:24 UTC - in response to Message 5069.  
Last modified: 18 Oct 2017, 15:31:03 UTC

No, you have CUDA 6.5, and 8.0 is the minimum; you need to update your CUDA driver.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
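
The check described here can be sketched in a few lines of Python, assuming the GPU description has the same layout as the BOINC line Bill F posted. The 8.0 minimum is the figure stated in this thread; `parse_gpu_line` and `meets_cuda_minimum` are hypothetical helper names, not BOINC functions.

```python
import re

# Minimum CUDA version stated by the project admin in this thread.
MIN_CUDA = (8, 0)

def parse_gpu_line(line):
    """Extract driver, CUDA version and compute capability from a GPU line."""
    m = re.search(
        r"driver version ([\d.]+), CUDA version (\d+)\.(\d+), "
        r"compute capability (\d+)\.(\d+)",
        line,
    )
    if m is None:
        return None
    return {
        "driver": m.group(1),
        "cuda": (int(m.group(2)), int(m.group(3))),
        "compute_capability": (int(m.group(4)), int(m.group(5))),
    }

def meets_cuda_minimum(info, minimum=MIN_CUDA):
    """Compare (major, minor) tuples, so 6.5 < 8.0 and 9.1 >= 8.0."""
    return info["cuda"] >= minimum

# GPU line as reported earlier in this thread.
GPU_LINE = (
    "Nvidia GPU 0: GeForce 210 (driver version 342.01, CUDA version 6.5, "
    "compute capability 1.2, 1024MB, 844MB available, 67 GFLOPS peak)"
)

if __name__ == "__main__":
    info = parse_gpu_line(GPU_LINE)
    print(info, "meets minimum:", meets_cuda_minimum(info))
```

Versions are compared as integer tuples rather than floats or strings, which avoids mis-ordering something like 10.0 against 8.0.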

noxcivi
Joined: 25 Sep 17
Posts: 9
Credit: 66,045
RAC: 336
Message 5071 - Posted: 18 Oct 2017, 15:33:32 UTC - in response to Message 5063.  
Last modified: 18 Oct 2017, 15:41:33 UTC


On this rig (i7-5960X with 2 x GTX 970, driver version 382.05) I'm running 3 at a time (0.33).
But in log #1 on the above box I found this:


CUDA driver: 6.50
CUDA runtime: 0.0

That can't be right. The CUDA driver module stepped up to 8.0 well before the 375-series driver versions. The compute capability of a GTX 970 is 5.2. On Windows 7-10 or Linux with drivers from 2016 or newer it should work.


@Krzysztof: The requirement of compute capability 2.3 seems odd, too. No NVIDIA GPU has ever had that compute capability. There is Fermi with 2.0 and 2.1, Kepler with 3.x and Maxwell with 5.x (and all of them support CUDA SDK 8.0).


Nvidia GPU 0: GeForce 210 (driver version 342.01, CUDA version 6.5, compute capability 1.2, 1024MB, 844MB available, 67 GFLOPS peak)

Do I meet the minimum specs for this work?

No, you don't. The GeForce 210 has a GT218 chip with compute capability 1.2.

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 38,082
Message 5072 - Posted: 18 Oct 2017, 15:42:21 UTC - in response to Message 5071.  
Last modified: 18 Oct 2017, 15:45:13 UTC

The compute capability is set in the server plan_class: with 2.3, work ships fine to a GTX 650, while cuda_fermi doesn't send to ANY machine connected to the server (including my 1060 6GB Windows machine). There is probably a bug in the server or client software. I spoke with TJM (Enigma@Home admin), who has exactly the opposite situation :)

Also, some people have installed the CUDA subsystem separately, and in that case, on some machines a driver update doesn't update the CUDA runtime libraries.

For people who wish to build the quickest possible gmx for their Linux computers, I published some info here.

Compilation under Windows is much more difficult...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.

noxcivi
Joined: 25 Sep 17
Posts: 9
Credit: 66,045
RAC: 336
Message 5073 - Posted: 18 Oct 2017, 16:06:29 UTC - in response to Message 5072.  
Last modified: 18 Oct 2017, 16:08:44 UTC

I understand your point, but BOINC plan classes are not an understandable system requirement for the standard Windows BOINC user. They need to know whether, e.g., a GT(X) 4xx will work or not. So just tell them what they need at least. Having fresh graphics drivers installed should go without saying...

PS: AVX256 is not the official name of any instruction set. There is AVX, AVX2 and AVX-512. So please use a name people can find in CPU-Z, so they can check whether their CPU is capable.
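
If you have a feature list like the "CPU features" line from the GROMACS log earlier in this thread, the capability check can be sketched as below. This is only an illustration: `cpu_supports` is a hypothetical helper, and the feature string is copied from that log (a Xeon E3-1245 V2), not from any other host in the thread.

```python
def cpu_supports(features, name):
    """True if `name` appears as a whole token in a space-separated feature list.

    Matching whole tokens matters: a plain substring test for "avx"
    would also succeed on a list that only contains "avx2".
    """
    return name.lower() in features.lower().split()

# "CPU features" line copied from the GROMACS log posted earlier in this thread.
FEATURES = (
    "aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm mmx msr nonstop_tsc "
    "pcid pclmuldq pdcm popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 "
    "ssse3 tdt x2apic"
)

if __name__ == "__main__":
    for feature in ("avx", "avx2", "sse4.1"):
        print(feature, cpu_supports(FEATURES, feature))
```

For that CPU the list contains avx but not avx2, which is the distinction being discussed here.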

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 38,082
Message 5074 - Posted: 18 Oct 2017, 16:15:04 UTC - in response to Message 5073.  

PS: AVX256 is not the official name of any instruction set. There is AVX, AVX2 and AVX-512. So please use a name people can find in CPU-Z, so they can check whether their CPU is capable.

True, I just wrote the AVX versions reported by the Gromacs compiler (which recognises a few different sets of AVX instructions; I used the lowest one).

In plan_class I set AVX as a requirement for the CPU; we will see if that works correctly. Unfortunately, the only way to see if it works properly is to wait for results (the other way would be to build a machine just below the requirements and test on it, but I don't have any CPU without AVX on hand...) :(
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.

Jim1348
Joined: 27 Jul 17
Posts: 27
Credit: 85,112
RAC: 39
Message 5075 - Posted: 18 Oct 2017, 17:51:23 UTC
Last modified: 18 Oct 2017, 17:52:02 UTC

I am still getting errors on my GTX 1060 running under Windows 7 64-bit (387.92 driver, which is CUDA 9.1). It is supported by a Haswell CPU (i7-4771) with four cores free.
Good luck with your wrappers.

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
13:53:02 (4620): wrapper (7.9.26016): starting
[DEBUG] replacing '$PROJECT_DIR' with 'C:\ProgramData\BOINC/projects/boinc.drugdiscoveryathome.com'
[DEBUG] replacing '$NTHREADS' with '1'
[DEBUG] replacing '$GPU_DEVICE_NUM' with '0'
[DEBUG] replacing '$PWD' with 'C:\ProgramData\BOINC\slots\4'
13:53:02 (4620): wrapper: running gmx.exe (mdrun -nt 1 -deffnm md -gpu_id 0)
13:53:03 (4620): cpu time 0.000000, checkpoint CPU time 0.000000 frac done 0.000000
13:53:04 (4620): gmx.exe exited; CPU time 0.000000
13:53:04 (4620): app exit status: 0xc0000135
13:53:04 (4620): called boinc_finish(195)

</stderr_txt>
]]>

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 38,082
Message 5076 - Posted: 18 Oct 2017, 18:02:31 UTC - in response to Message 5075.  

"exit code 195" usually means that the application can't load one of the required libraries or doesn't have permission to use it.
Unfortunately, I have no idea which library it is, as Windows doesn't report it...

Can you try to start gmx.exe from the command line (outside of the BOINC slot folder) and let me know if it starts without error (other than that command line parameters are not given)?
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
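
The "app exit status" line in the wrapper output carries more detail than exit code 195 itself: it is a Windows NTSTATUS value. The two values seen in this thread are standard ones, 0xC0000135 (STATUS_DLL_NOT_FOUND) and 0xC000001D (STATUS_ILLEGAL_INSTRUCTION). Below is a small Python sketch for pulling the value out of a pasted stderr log; the helper names and the short descriptions in the table are illustrative, and the table only covers these two codes.

```python
import re

# Only the two NTSTATUS values that appear in this thread; not a complete table.
NTSTATUS_NAMES = {
    0xC0000135: "STATUS_DLL_NOT_FOUND (a required DLL could not be located)",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION (the CPU does not support an instruction)",
}

def app_exit_status(stderr_text):
    """Return the wrapper's 'app exit status' as an int, or None if absent."""
    m = re.search(r"app exit status: (0x[0-9a-fA-F]+)", stderr_text)
    return int(m.group(1), 16) if m else None

def explain(stderr_text):
    """Map the exit status in a wrapper log to a short description."""
    code = app_exit_status(stderr_text)
    if code is None:
        return "no app exit status found"
    return NTSTATUS_NAMES.get(code, "unrecognized status 0x%08x" % code)

# Lines copied from the logs posted in this thread.
SAMPLE_DLL = """\
13:53:04 (4620): gmx.exe exited; CPU time 0.000000
13:53:04 (4620): app exit status: 0xc0000135
13:53:04 (4620): called boinc_finish(195)
"""
SAMPLE_ILLEGAL = "20:19:49 (844): app exit status: 0xc000001d\n"

if __name__ == "__main__":
    print(explain(SAMPLE_DLL))
    print(explain(SAMPLE_ILLEGAL))
```

A DLL-not-found status points at a missing runtime library, while an illegal-instruction status points at a CPU feature mismatch, so the two failures in this thread likely have different causes.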

Jim1348
Joined: 27 Jul 17
Posts: 27
Credit: 85,112
RAC: 39
Message 5077 - Posted: 18 Oct 2017, 18:58:52 UTC - in response to Message 5076.  

Can you try to start gmx.exe from the command line (outside of the BOINC slot folder) and let me know if it starts without error (other than that command line parameters are not given)?

OK, I went to the "C:\ProgramData\BOINC\projects\boinc.drugdiscoveryathome.com" folder and tried to run "gmx_23_x86_64_windows.exe", but got this error message:
"The program can't start because MSVCP110.dll is missing from your computer".

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 38,082
Message 5080 - Posted: 18 Oct 2017, 21:39:57 UTC - in response to Message 5077.  

Can you try to start gmx.exe from the command line (outside of the BOINC slot folder) and let me know if it starts without error (other than that command line parameters are not given)?

OK, I went to the "C:\ProgramData\BOINC\projects\boinc.drugdiscoveryathome.com" folder and tried to run "gmx_23_x86_64_windows.exe", but got this error message:
"The program can't start because MSVCP110.dll is missing from your computer".

It means that some Microsoft libraries are not installed on your system...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.

mmonnin
Joined: 25 Jan 17
Posts: 34
Credit: 6,961,791
RAC: 22,181
Message 5081 - Posted: 18 Oct 2017, 22:18:20 UTC - in response to Message 5070.  

No, you have CUDA 6.5, and 8.0 is the minimum; you need to update your CUDA driver.


Why is the plan class cuda23 if cuda80 is needed? Just call it cuda65 or cuda80 to match the requirement.

So just AVX? You said AVX256, and both AVX and AVX2 added some 256-bit instructions, so it wasn't really specific enough.