Estimated WU duration = 00:00:00 - WU hasn't even started

Message boards : Number crunching : Estimated WU duration = 00:00:00 - WU hasn't even started

David Ball
Joined: 1 May 09
Posts: 2
Credit: 20,346
RAC: 0
Message 1242 - Posted: 9 Sep 2009, 22:38:54 UTC

My system (BOINC 6.6.36 on 32-bit Vista, Core 2 Duo E6420 at 2.13 GHz) asked for a bit of work (it should have been only an hour or two) and got 42 workunits, all of which have an estimated duration of 00:00:00. My duration correction factor is 53. The machine runs several projects, and DD only has about a 5% share of the CPU.

It's now started the first WU and has run for 21 minutes but seems to be stuck at 91.66%.

This is on a system that doesn't have a graphics card, just the integrated graphics on the Intel 945G chipset.

I hate to think what the duration correction factor is going to be after it finishes a WU and updates it.

Tim Turner
Joined: 1 May 09
Posts: 570
Credit: 184,322
RAC: 0
Message 1243 - Posted: 9 Sep 2009, 23:31:22 UTC - in response to Message 1242.  
Last modified: 9 Sep 2009, 23:47:03 UTC

The progress bar is something that will get worked on eventually, as stated in the other progress bar thread.

These runs of the current p9_xxxx should take no more than 4 hrs, even on older CPUs.
I'm willing to bet that your C2D will take about 1-2 hrs, if I'm not mistaken. That's a guess based on the slow speeds of my P-M and P4s.

Edit: for some reason, my P-Mobile is showing a 00:00:01 estimated duration as well. I'm guessing that the FLOPs estimate is off.
Tim Turner
Public Relations Admin
Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
If you need help via voice or Convo; PM me and i will give you details on where i will be; Teamspeak, Yahoo Messenger, or Skype.

Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 1244 - Posted: 10 Sep 2009, 0:02:08 UTC - in response to Message 1243.  

I am going to try some weights for the next batch of AutoDock. It's too soon to use the weights on mdrun.

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 1296 - Posted: 14 Sep 2009, 22:10:39 UTC - in response to Message 1295.  

Just FYI about that, I seem to recall reading recently that for late model 6x (maybe older 6x and 5x too) if the TDCF goes over 100 (the max), the client will resort to a kind of 'failsafe' mode of only DL'ing one task per core and running them in EDF.

It'll start asking for 1 second's worth of work at a DCF of 90 and above. The maximum is 100; it won't go any higher.
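
As a rough illustration of the behaviour described in this post (not taken from the BOINC client source), the rule could be sketched in Python like this. The function names and the `normal_request_seconds` parameter are made up for the example; the thresholds 90 and 100 are the ones quoted above.

```python
DCF_MAX = 100.0             # cap quoted in the post
DCF_MIN_REQUEST_AT = 90.0   # threshold quoted in the post


def clamp_dcf(dcf: float) -> float:
    """The DCF never rises above the stated maximum."""
    return min(dcf, DCF_MAX)


def work_request_seconds(dcf: float, normal_request_seconds: float) -> float:
    """Hypothetical model: at a DCF of 90 or more, ask for only 1 second of work."""
    if clamp_dcf(dcf) >= DCF_MIN_REQUEST_AT:
        return 1.0
    return normal_request_seconds


# A host with DCF 53 still asks for its normal amount (here 2 hours);
# a host with DCF 95 asks for a single second of work.
normal = work_request_seconds(53.0, 7200.0)
throttled = work_request_seconds(95.0, 7200.0)
```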
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.

John McLeod VII
Joined: 26 Aug 09
Posts: 10
Credit: 17,643
RAC: 0
Message 1301 - Posted: 15 Sep 2009, 13:56:34 UTC

If the average TDCF is above 10, please MULTIPLY the average TDCF by the current FPOPS_EST to get the new FPOPS_EST. It looks like you have been dividing the FPOPS_EST by the average TDCF. At this point, you should probably multiply the FPOPS_EST for the next batch by something well over 100. It is going to take a while for the Duration Correction Factor to fall back in range again.
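
A minimal Python sketch of that correction, assuming the client derives its runtime estimate roughly as FPOPS_EST divided by the host's benchmarked speed, times the DCF. The function names and the sample numbers (an FPOPS_EST of 1e9, a 2 GFLOPS host, an average TDCF of 100) are hypothetical and only meant to show which direction the fix has to go.

```python
def corrected_fpops_est(current_fpops_est: float, average_tdcf: float) -> float:
    """Scale the workunit FLOP estimate UP by the observed correction factor."""
    if average_tdcf <= 0:
        raise ValueError("average_tdcf must be positive")
    return current_fpops_est * average_tdcf


def estimated_seconds(fpops_est: float, host_flops: float, dcf: float = 1.0) -> float:
    """Assumed client-side estimate: work to do / host speed, scaled by the DCF."""
    return fpops_est / host_flops * dcf


current = 1e9        # hypothetical FPOPS_EST that is far too low
host_speed = 2e9     # hypothetical host: 2 GFLOPS
avg_tdcf = 100.0     # hosts report tasks taking ~100x longer than estimated

right = corrected_fpops_est(current, avg_tdcf)   # multiply: 1e11
wrong = current / avg_tdcf                       # divide: 1e7, makes it worse

print(estimated_seconds(current, host_speed))    # 0.5 s before any change
print(estimated_seconds(right, host_speed))      # 50.0 s after multiplying
print(estimated_seconds(wrong, host_speed))      # 0.005 s after dividing
```

Dividing shrinks an already tiny estimate by another factor of 100, which would match the 00:00:00 and 00:00:01 durations reported in this thread.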


BOINC WIKI

Saenger
Joined: 21 Apr 09
Posts: 53
Credit: 48,639
RAC: 78
Message 1343 - Posted: 18 Sep 2009, 12:16:56 UTC
Last modified: 18 Sep 2009, 13:08:04 UTC

Last time I looked it was at 100 on my machine, a project reset brought it back down to 1.

The main problem with this estimate is that the computer gets loaded with tons of WUs, because they lie to the manager and the scheduler that they will take no time. Soon after starting, they switch to panic mode and block the computer for all other projects.
I don't know how this could be done, but the estimate needs to be set to something far higher or the maximum number of WUs per core has to be set to something far lower to keep this project remotely friendly in a multi-project environment.

Best: fix estimate to proper values.
As long as that's not possible: No more than 5 WU per core at the same time.
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 1347 - Posted: 18 Sep 2009, 23:00:54 UTC - in response to Message 1343.  

I have a special template now for assignments. These should have high priority to specific hosts. I need to do this debugging in real time rather than have an assignment sit in the queue for hours before I get a response. I'm also now limiting it to one result per workunit. So hopefully this testing will be invisible going forward.

Saenger
Joined: 21 Apr 09
Posts: 53
Credit: 48,639
RAC: 78
Message 1351 - Posted: 19 Sep 2009, 9:31:32 UTC
Last modified: 19 Sep 2009, 9:33:09 UTC

Before I set it to accept work again: have you also significantly limited the number of WUs per core? Otherwise its bad behaviour towards the other dozen projects on my computer will make it impossible to reactivate it. As long as I get 200 WUs at once, this project is plainly not working properly.

The problem is not the number of results per WU but the number of results per core; that is what makes the computer unavailable for other projects.
Restrict it to no more than 5 per core at the same time, and it should be fine.

Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 1360 - Posted: 20 Sep 2009, 18:44:54 UTC - in response to Message 1351.  

I changed this from 50 to 20

<max_wus_in_progress>
20
</max_wus_in_progress>
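
To see why this cap matters while the estimates are near zero, here is a simplified Python model of how many tasks a single work request could bring in. It is only a sketch: the function name is made up, the cap is treated as a plain per-host limit (an assumption), and the real scheduler takes more factors into account.

```python
import math


def tasks_sent(request_seconds: float,
               est_seconds_per_task: float,
               max_in_progress: int) -> int:
    """Toy model: fill the requested time with tasks, up to the in-progress cap."""
    est = max(est_seconds_per_task, 1.0)   # guard against a 00:00:00 estimate
    wanted = math.ceil(request_seconds / est)
    return min(wanted, max_in_progress)


two_hours = 2 * 3600

old_cap = tasks_sent(two_hours, 0.0, 50)     # zero estimate: the cap is the only brake
new_cap = tasks_sent(two_hours, 0.0, 20)     # same request with the cap lowered to 20
realistic = tasks_sent(two_hours, 900.0, 20) # a 15-minute estimate fills 2 hours with 8 tasks
```

With a realistic estimate the cap is never reached, which is why fixing the estimate is the better long-term cure.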

Tim Turner
Joined: 1 May 09
Posts: 570
Credit: 184,322
RAC: 0
Message 1361 - Posted: 20 Sep 2009, 18:49:21 UTC - in response to Message 1360.  

Ouch! Um, that'll underfeed my quad, but I'll just put a CPDN model on 1 core and let the other 3 crunch DD@H and H@H.
Tim Turner
Public Relations Admin
Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
If you need help via voice or Convo; PM me and i will give you details on where i will be; Teamspeak, Yahoo Messenger, or Skype.

Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 1362 - Posted: 20 Sep 2009, 19:54:23 UTC - in response to Message 1361.  

If there is a preferred project setting, such as a per-CPU limit, I don't see it, but look for yourself here: http://boinc.berkeley.edu/trac/wiki/ProjectOptions

zombie67 [MM]
Volunteer tester
Joined: 25 Apr 09
Posts: 58
Credit: 1,785,257
RAC: 1,359
Message 1413 - Posted: 29 Sep 2009, 2:17:42 UTC

Completion estimates are still only 00:00:01. And this is with a DCF of 100.

Can something please be done? DD@H is constantly in panic mode.
@zombie_67

Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 1415 - Posted: 29 Sep 2009, 14:57:22 UTC - in response to Message 1413.  

I'm updating the estimates again

zombie67 [MM]
Volunteer tester
Joined: 25 Apr 09
Posts: 58
Credit: 1,785,257
RAC: 1,359
Message 1435 - Posted: 1 Oct 2009, 2:28:05 UTC

FWIW, I got more tasks again today. No change. Still 00:00:01 with a DCF of 100.
@zombie_67

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 1437 - Posted: 1 Oct 2009, 14:16:41 UTC

I'm doing a run of ~30 tasks on one of my computers to get a good average for the number of fpops we plan to give out on Autodock/MGL tasks. This will take another couple of hours.
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.

Tim Turner
Joined: 1 May 09
Posts: 570
Credit: 184,322
RAC: 0
Message 1438 - Posted: 1 Oct 2009, 21:18:55 UTC - in response to Message 1435.  

FWIW, I got more tasks again today. No change. Still 00:00:01 with a DCF of 100.



you got your wish, perhaps a little too much IMO.
Tim Turner
Public Relations Admin
Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
If you need help via voice or Convo; PM me and i will give you details on where i will be; Teamspeak, Yahoo Messenger, or Skype.

zombie67 [MM]
Volunteer tester
Joined: 25 Apr 09
Posts: 58
Credit: 1,785,257
RAC: 1,359
Message 1439 - Posted: 1 Oct 2009, 22:09:10 UTC - in response to Message 1438.  

FWIW, I got more tasks again today. No change. Still 00:00:01 with a DCF of 100.



you got your wish, perhaps a little too much IMO.


? I don't understand.
@zombie_67

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 1440 - Posted: 1 Oct 2009, 22:18:22 UTC

We have some new estimate numbers to test. Please let us know if we're in the ballpark for your system.
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.

Tim Turner
Joined: 1 May 09
Posts: 570
Credit: 184,322
RAC: 0
Message 1441 - Posted: 1 Oct 2009, 23:07:12 UTC - in response to Message 1440.  
Last modified: 1 Oct 2009, 23:15:40 UTC

We have some new estimate numbers to test. Please let us know if we're in the ballpark for your system.


A 22 hr estimate for my laptop, which runs them in 15 minutes, and a 12 hr estimate for my quad, which runs them in 7-10 minutes.

If it's a ballpark we're talking about, those estimates are way off. Like Pluto's orbit when we are on Earth, that far off.

And these new estimates make the work requests less frequent. I had to bump up the work cache on my laptop to 3 days to get more than 2 WUs.

This is with AutoDock, of course.

laptop:
1406 floating point mips
2270 integer mips

quad
Number of CPUs: 4
2290 floating point MIPS (Whetstone) per CPU
4841 integer MIPS (Dhrystone) per CPU
Tim Turner
Public Relations Admin
Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
If you need help via voice or Convo; PM me and i will give you details on where i will be; Teamspeak, Yahoo Messenger, or Skype.

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 1448 - Posted: 2 Oct 2009, 9:35:58 UTC - in response to Message 1441.  
Last modified: 2 Oct 2009, 9:47:03 UTC

22 hrs estimate for my laptop which runs them in 15 minutes, 12 hrs estimate for my quad... which runs in 7-10 minutes...

and these new estimates; it makes the work request less frequent... i had to bump up my work cache on laptop to 3 days to get more than 2 wu's.

There is of course that little factor of the Task Duration Correction Factor (TDCF) to consider. What's yours at? 100, or something similarly high, I would think, which would explain those large estimates.
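
The numbers quoted above can be checked with a little arithmetic. This Python sketch (the function name is made up for the example) computes how many times larger each estimate is than the actual runtime.

```python
def overestimate_factor(estimate_hours: float, actual_minutes: float) -> float:
    """How many times longer the estimate is than the real runtime."""
    return estimate_hours * 60.0 / actual_minutes


laptop = overestimate_factor(22, 15)      # 88.0
quad_fast = overestimate_factor(12, 7)    # roughly 102.9
quad_slow = overestimate_factor(12, 10)   # 72.0
```

All three factors land in the neighbourhood of 100, which is consistent with a correction factor stuck at or near its maximum being applied on top of the new estimates. It does not prove it, so the actual value in the client is still worth checking.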

So before you get new work, reset the project. This will reset the DCF.

Or if you're handy with Notepad, exit BOINC completely, open client_state.xml
CTRL + F
type drugdisco
press Enter.
Find the line <duration_correction_factor>xxx.xxxxxx</duration_correction_factor> and change its number to 1.000000
Save client_state.xml file.
Restart BOINC.
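
For anyone who would rather script the edit, the steps above could be sketched in Python as follows. This is an illustration only: it assumes each project sits in its own `<project>...</project>` block containing its URL, the sample text and URLs are invented, and the function works on a string rather than on the real file. Exit BOINC and keep a backup copy of client_state.xml before trying anything like this.

```python
import re

DCF_TAG = re.compile(
    r"(<duration_correction_factor>)[^<]*(</duration_correction_factor>)"
)
PROJECT_BLOCK = re.compile(r"<project>.*?</project>", re.DOTALL)


def reset_dcf(state_text: str, url_fragment: str = "drugdisco") -> str:
    """Set the DCF to 1.000000 in every project block that mentions url_fragment."""

    def fix_block(match: re.Match) -> str:
        block = match.group(0)
        if url_fragment not in block:
            return block  # leave other projects untouched
        return DCF_TAG.sub(r"\g<1>1.000000\g<2>", block)

    return PROJECT_BLOCK.sub(fix_block, state_text)


# Invented sample; a real client_state.xml holds far more per project.
sample = """<client_state>
<project>
    <master_url>http://other.example.org/</master_url>
    <duration_correction_factor>2.500000</duration_correction_factor>
</project>
<project>
    <master_url>http://drugdiscovery.example.org/</master_url>
    <duration_correction_factor>100.000000</duration_correction_factor>
</project>
</client_state>"""

fixed = reset_dcf(sample)
```

The regular expressions only touch the one tag inside matching blocks, so the rest of the text is returned unchanged.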

Now, these estimates will only be for Autodock with MGL, not for Autodock, mdrun or anything else.

Resetting the project will also remove all the applications, so you have to download those again. At least until you've reset that number, you won't get tons of work with a Zero estimate, which will run in EDF. :-)

I've just reset my TDCF, got one task and its estimate is 14 minutes and a bit. That is off for my system, but it'll learn.
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.