CUDA app testing

Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 31,111
Message 5001 - Posted: 16 Oct 2017, 19:02:58 UTC
Last modified: 17 Oct 2017, 14:53:43 UTC

As some of you already know, we have just started testing the CUDA Gromacs application for Linux and Windows computers.

To use it under Linux you should have Gromacs installed (apt-get install gromacs on Debian-based systems).

Under Windows you should have an Nvidia GPU with CUDA 2.3 capability and an AVX_256-capable CPU.

Also, please ignore the estimated running time reported by the Manager; I have currently set it very high for testing purposes.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
ID: 5001 · Rating: 0
jbest
Joined: 4 Sep 17
Posts: 3
Credit: 150,228
RAC: 2,395
Message 5016 - Posted: 17 Oct 2017, 2:32:49 UTC - in response to Message 5001.  

Upon downloading and running these I instantly get a Windows error "gmx.exe" and the program has to be closed. I then get a computation error.

What's up, do we think?
ID: 5016 · Rating: 0
Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 31,111
Message 5017 - Posted: 17 Oct 2017, 2:34:47 UTC - in response to Message 5016.  

Does your CPU support AVX_256 instructions, and does your GPU support CUDA 2.3?
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
ID: 5017 · Rating: 0
langfod
Joined: 3 Aug 17
Posts: 19
Credit: 4,367,692
RAC: 16,776
Message 5018 - Posted: 17 Oct 2017, 3:47:56 UTC - in response to Message 5016.  

The X5560 is Nehalem based.

AVX_256 vector routines, which are required by this version of Gromacs, did not appear until Sandy Bridge, two generations later.
ID: 5018 · Rating: 0
Jim1348
Joined: 27 Jul 17
Posts: 27
Credit: 85,112
RAC: 29
Message 5019 - Posted: 17 Oct 2017, 6:46:33 UTC - in response to Message 5018.  

AVX_256 vector routines, which are required by this version of Gromacs, did not appear until Sandy Bridge, two generations later.

AVX 256 (otherwise known as AVX2) did not appear until Haswell.
https://en.wikipedia.org/wiki/Advanced_Vector_Extensions
ID: 5019 · Rating: 0
Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 31,111
Message 5020 - Posted: 17 Oct 2017, 6:48:56 UTC - in response to Message 5019.  

No, there are two "AVX256" extensions...
AVX_256 and
AVX2_256
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
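
For anyone unsure which of the two their CPU has, here is a minimal Python sketch. It assumes the flag names Linux reports in /proc/cpuinfo (avx, avx2, sse4_1, sse2) and maps them to the matching GROMACS SIMD level names. The helper names are hypothetical, and AMD-specific levels such as AVX_128_FMA are left out for brevity.

```python
# Hypothetical helpers: map Linux /proc/cpuinfo flag names to a GROMACS SIMD level name.
def simd_level(flags):
    """Return the widest of a few GROMACS SIMD levels implied by the CPU flags."""
    flags = set(flags)
    if "avx2" in flags:
        return "AVX2_256"   # e.g. Intel Haswell and later
    if "avx" in flags:
        return "AVX_256"    # e.g. Intel Sandy Bridge and Ivy Bridge
    if "sse4_1" in flags:
        return "SSE4.1"     # e.g. Nehalem-era CPUs
    if "sse2" in flags:
        return "SSE2"
    return "None"


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flag list of the first CPU entry (Linux only)."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("flags"):
                return line.split(":", 1)[1].split()
    return []


# Illustrative subsets of flags, not complete flag lists.
ivy_bridge = ["sse2", "sse4_1", "avx"]
haswell = ["sse2", "sse4_1", "avx", "avx2"]
print(simd_level(ivy_bridge))  # AVX_256
print(simd_level(haswell))     # AVX2_256
```

On a Linux host, simd_level(read_cpu_flags()) would give the level for the local CPU. By this reading an Ivy Bridge CPU reports AVX_256, which is the level the first post asks for.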
ID: 5020 · Rating: 0
Jim1348
Joined: 27 Jul 17
Posts: 27
Credit: 85,112
RAC: 29
Message 5022 - Posted: 17 Oct 2017, 6:54:48 UTC - in response to Message 5020.  

No, there are two "AVX256" extensions...
AVX_256 and
AVX2_256

OK, does that mean that Ivy Bridge (i7-3770 for example) is good enough to support the Windows version?
ID: 5022 · Rating: 0
Jayray
Joined: 11 Oct 17
Posts: 4
Credit: 599,159
RAC: 2,973
Message 5023 - Posted: 17 Oct 2017, 6:55:39 UTC - in response to Message 5020.  

My second GPU, a GTX 1060, is only at GPU clock 139 MHz and memory clock 405 MHz while running these test jobs.

The first GPU, a GTX 1080 Ti, runs at 1936 / 5005 MHz. Also, two concurrent tasks proceed at the same rate even though the 1080 Ti is 2x as fast as the 1060 card.

Both GPUs will run PrimeGrid GPU jobs normally. Just a heads up.
ID: 5023 · Rating: 0
Jim1348
Joined: 27 Jul 17
Posts: 27
Credit: 85,112
RAC: 29
Message 5027 - Posted: 17 Oct 2017, 7:07:56 UTC - in response to Message 5016.  

Upon downloading and running these I instantly get a Windows error "gmx.exe" and the program has to be closed. I then get a computation error.

I am now running a GTX 1060 supported by an i7-4771 (with four free cores) under Windows 7 64-bit and get the same error.

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
03:09:13 (85916): wrapper (7.7.26016): starting
03:09:13 (85916): wrapper: running gmx.exe (mdrun -nt 1 -deffnm md)
03:09:14 (85916): gmx.exe exited; CPU time 0.000000
03:09:14 (85916): app exit status: 0xc0000135
03:09:14 (85916): called boinc_finish(195)

</stderr_txt>
]]>
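
A note on the log above: the app exit status 0xc0000135 is the Windows code STATUS_DLL_NOT_FOUND. That would suggest gmx.exe could not load a DLL it depends on, rather than hitting an unsupported CPU instruction, which would normally show up as 0xc000001d. The log does not say which DLL is missing. A minimal Python sketch for decoding such codes, with a hypothetical helper name and only a handful of well-known values:

```python
# A few well-known Windows NTSTATUS values (deliberately not a complete table).
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}


def describe_exit_status(text):
    """Translate a hex exit status string from the wrapper log into a name."""
    code = int(text, 16)  # int() accepts the leading "0x" when base is 16
    return KNOWN_STATUS.get(code, "unknown status 0x%08X" % code)


print(describe_exit_status("0xc0000135"))  # STATUS_DLL_NOT_FOUND
```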
ID: 5027 · Rating: 0
Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 31,111
Message 5028 - Posted: 17 Oct 2017, 7:11:14 UTC - in response to Message 5022.  

No, there are two "AVX256" extensions...
AVX_256 and
AVX2_256

OK, does that mean that Ivy Bridge (i7-3770 for example) is good enough to support the Windows version?

Not really...
From my experience, at least 4 threads of a Xeon 1230v3 are needed to keep just one GTX 1060 6GB card fully loaded with Gromacs... But this depends heavily on the input values...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
ID: 5028 · Rating: 0
Bill F
Joined: 15 Aug 17
Posts: 7
Credit: 39,070
RAC: 295
Message 5033 - Posted: 17 Oct 2017, 8:42:29 UTC

Running Windows 7 with an old NVIDIA GeForce 210 (1024MB), driver 342.01, OpenCL 1.0, and so far all WUs are failing at about 5 seconds.

<core_client_version>7.6.6</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
07:38:37 (11923): wrapper (7.5.26014): starting
07:38:37 (11923): wrapper: running gmx (mdrun -nt 1 -deffnm md)
:-) GROMACS - gmx mdrun, 2016.3 (-:

GROMACS is written by:
Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar
Aldert van Buuren Rudi van Drunen Anton Feenstra Gerrit Groenhof
Christoph Junghans Anca Hamuraru Vincent Hindriksen Dimitrios Karkoulis
Peter Kasson Jiri Kraus Carsten Kutzner Per Larsson
Justin A. Lemkul Magnus Lundborg Pieter Meulenhoff Erik Marklund
Teemu Murtola Szilard Pall Sander Pronk Roland Schulz
Alexey Shvetsov Michael Shirts Alfons Sijbers Peter Tieleman
Teemu Virolainen Christian Wennberg Maarten Wolf
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2017, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

GROMACS: gmx mdrun, version 2016.3
Executable: gmx

-------------------------------------------------------
Program: gmx mdrun, version 2016.3
Source file: src/gromacs/utility/path.cpp (line 241)
Function: static bool gmx::Path::isEquivalent(const string&, const string&)

System I/O error:
Path::isEquivalent called with two invalid files
Reason: No such file or directory
(call to stat() returned error code 2)

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
07:38:38 (11923): gmx exited; CPU time 0.004000
07:38:38 (11923): app exit status: 0x100
07:38:38 (11923): called boinc_finish(195)

</stderr_txt>
]]>
ID: 5033 · Rating: 0
mmonnin
Joined: 25 Jan 17
Posts: 34
Credit: 6,961,791
RAC: 16,334
Message 5035 - Posted: 17 Oct 2017, 9:59:38 UTC - in response to Message 5022.  

No, there are two "AVX256" extensions...
AVX_256 and
AVX2_256

OK, does that mean that Ivy Bridge (i7-3770 for example) is good enough to support the Windows version?


Ivy only has AVX. No AVX2.

CPUs with AVX2 per Wiki

Intel
Haswell processor, Q2 2013
Haswell E processor, Q3 2014
Broadwell processor, Q4 2014
Broadwell E processor, Q3 2016
Skylake processor, Q3 2015
Kaby Lake processor, Q3 2016(ULV mobile)/Q1 2017(desktop/mobile)
Coffee Lake processor, Q3 2017
Cannonlake processor, expected in 2018
Cascade Lake processor, expected in 2018
AMD
Excavator processor, Q2 2015
Zen processor, Q1 2017
ID: 5035 · Rating: 0
__Dutch__
Joined: 25 Sep 17
Posts: 3
Credit: 64,267,866
RAC: 344,124
Message 5037 - Posted: 17 Oct 2017, 11:51:41 UTC

Hey,

I am running the Windows CUDA apps on a selection of GPUs, but even the GeForce 1070s get the WUs with more compute time required than the due date allows. Currently I am receiving WUs that need 1 day and 9 hours of compute time, but are due in 1 day and 3 hours... Especially given that not all my machines are free for BOINC work 24/7, should I just not run your CUDA work?

Cheers,
Dutch
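
To put numbers on the mismatch described above, here is a small Python sketch. The can_finish helper and its availability parameter are assumptions for illustration, not anything BOINC exposes, and the estimate itself is the one the first post says to ignore.

```python
from datetime import timedelta


def can_finish(estimated, deadline, availability=1.0):
    """True if the task fits before the deadline.

    availability is the fraction of wall-clock time the host actually crunches.
    """
    return estimated.total_seconds() / availability <= deadline.total_seconds()


estimated = timedelta(days=1, hours=9)  # 33 hours of reported compute time
deadline = timedelta(days=1, hours=3)   # 27 hours until the task is due

print(can_finish(estimated, deadline))  # False
print(estimated - deadline)             # 6:00:00
```

Even a machine crunching around the clock would be 6 hours short on these figures, and a part-time host only makes it worse.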
ID: 5037 · Rating: 0
jbest
Joined: 4 Sep 17
Posts: 3
Credit: 150,228
RAC: 2,395
Message 5038 - Posted: 17 Oct 2017, 13:44:09 UTC - in response to Message 5017.  

Looks like no then sadly....

Can you generate some WUs that can utilise CUDA GPUs but with 'older gen' CPUs? That is if you want max participation....

Cheers.
ID: 5038 · Rating: 0
Jim1348
Joined: 27 Jul 17
Posts: 27
Credit: 85,112
RAC: 29
Message 5039 - Posted: 17 Oct 2017, 14:09:53 UTC - in response to Message 5038.  

Can you generate some WUs that can utilise CUDA GPUs but with 'older gen' CPUs? That is if you want max participation....

I am in the same boat, even with a GTX 980 under Linux: it takes too long and requires too much CPU support for me. I think a later version of CUDA would work better. CUDA 6.5 works with practically all cards, and runs much better from what I have seen on other projects. Even the Fermi cards have drivers for CUDA 8.0 now. I think if they try to preserve backwards compatibility beyond that, they will sink the whole boat.
ID: 5039 · Rating: 0
Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 31,111
Message 5040 - Posted: 17 Oct 2017, 14:25:51 UTC - in response to Message 5039.  

At the moment, even my GTX 650 (quite an old card) is receiving tasks (on Linux), but please note that this is the original Gromacs application, and I have no way to change it to use the GPU more heavily. It really depends on the WUs, and we will see with real tasks how it uses the GPU in practice (currently, to check that it works correctly, every WU contains the same task).
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
ID: 5040 · Rating: 0
__Dutch__
Joined: 25 Sep 17
Posts: 3
Credit: 64,267,866
RAC: 344,124
Message 5041 - Posted: 17 Oct 2017, 14:37:52 UTC - in response to Message 5040.  

I see in your OP that work unit times are incorrect according to you, but they are running a long time on high end cards, and report compute sizes upwards of 5 PetaFLOPs. Could you please explain why the deadline on them is incredibly short? The tasks also do not have any clear checkpoints. Are they internally checkpointed?
ID: 5041 · Rating: 0
Krzysztof Piszczek - wspieram Polski Projekt Boinc
Project administrator
Project developer
Project tester
Joined: 8 Nov 10
Posts: 125
Credit: 5,508,546
RAC: 31,111
Message 5042 - Posted: 17 Oct 2017, 14:53:15 UTC - in response to Message 5041.  

The next tasks will have a longer deadline. For the first batches it was set short because they were very small test batches and I needed quick feedback to know that it works.
The last batch had a 4-day deadline and the new one will have at least a week.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home project team and Universe@Home admin.
ID: 5042 · Rating: 0
bcavnaugh
Joined: 22 Feb 17
Posts: 2
Credit: 1,173,444
RAC: 4,499
Message 5045 - Posted: 17 Oct 2017, 15:47:40 UTC
Last modified: 17 Oct 2017, 16:21:26 UTC

4 GTX 980 + Core i7-5960X Haswell-E 8-Core 3.0GHz
Testing:

<app_config>
  <app>
    <name>gmx</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

One odd thing I have found: on my 4-GPU rig, 4 tasks are running but only 1 GPU looks active.
Same here on my 1080 Ti rig with 2 GPUs:
<app_config>
  <app>
    <name>gmx</name>
    <gpu_versions>
      <gpu_usage>2.0</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
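
For reference, as far as I understand the BOINC client configuration docs, gpu_usage is the number of GPU instances (possibly fractional) each task claims and cpu_usage is the CPU share budgeted for it. So 1.0 means one task per GPU, 0.5 would mean two tasks per GPU, and 2.0 asks for two GPUs per task. A minimal Python sketch for reading those values back out of an app_config.xml, where gpu_settings is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET

# Same structure as the first app_config posted above.
APP_CONFIG = """
<app_config>
  <app>
    <name>gmx</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def gpu_settings(xml_text):
    """Return {app name: (gpu_usage, cpu_usage)} from an app_config document."""
    root = ET.fromstring(xml_text)
    settings = {}
    for app in root.findall("app"):
        name = app.findtext("name")
        gpu = float(app.findtext("gpu_versions/gpu_usage"))
        cpu = float(app.findtext("gpu_versions/cpu_usage"))
        settings[name] = (gpu, cpu)
    return settings


for name, (gpu, cpu) in gpu_settings(APP_CONFIG).items():
    print(name, gpu, cpu)  # gmx 1.0 0.2
```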


Crunching@EVGA The Number One Team in the BOINC Community.
Folding@EVGA The Number One Team in the Folding@Home Community.
ID: 5045 · Rating: 0
bcavnaugh
Joined: 22 Feb 17
Posts: 2
Credit: 1,173,444
RAC: 4,499
Message 5053 - Posted: 17 Oct 2017, 21:22:06 UTC

1 GTX 1080 Ti + RYZEN 7 1800X
Seems that this runs much better under the RYZEN
Testing:

<app_config>
  <app>
    <name>gmx</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>


Crunching@EVGA The Number One Team in the BOINC Community.
Folding@EVGA The Number One Team in the Folding@Home Community.
ID: 5053 · Rating: 0