Upload, Download and WU Errors

STE\/E
Joined: 1 May 09
Posts: 37
Credit: 720,416
RAC: 29
Message 482 - Posted: 28 May 2009, 17:25:39 UTC - in response to Message 472.  
Last modified: 28 May 2009, 17:31:25 UTC

I'm trying to give credit where you guys get canceled. Let me know if I'm missing something.


I don't think it'll do me any good because I followed Jord's advice, I didn't Abort them but just reset most of my Box's so the WU's never got Reported as Finished ...

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 490 - Posted: 28 May 2009, 19:44:58 UTC - in response to Message 483.  

At this point in time it doesn't matter if you aborted or reset. I think that all those tasks trying to report will make your computer slow as well, so in your case you did the right thing, Steve.

Now, could you please set a lower cache on this project, to prevent this from happening again?
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.

STE\/E
Joined: 1 May 09
Posts: 37
Credit: 720,416
RAC: 29
Message 492 - Posted: 28 May 2009, 19:53:40 UTC - in response to Message 483.  
Last modified: 28 May 2009, 20:03:43 UTC

I'm trying to give credit where you guys get canceled. Let me know if I'm missing something.


I don't think it'll do me any good because I followed Jord's advice, I didn't Abort them but just reset most of my Box's so the WU's never got Reported as Finished ...


Wait a second...

Don't go blaming Jord for that. He said to abort what you had onboard, not reset the project. There's a biiiiig difference.

The part about not being able to report was kind of misleading though, but that just meant the scheduler was disabled at the time.

Alinator


First of all, I didn't Blame Jord for anything; I only said that I followed his advice. I didn't say he gave me bad advice or anything, just that I followed it. I was ready to Abort/Reset them anyway because for the Second time in as many weeks this Project had totally screwed up my Farm. It was something I had to do one way or the other because my Farm was @ a Standstill from all the WU's trying to Upload/Report.

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 494 - Posted: 28 May 2009, 20:02:16 UTC - in response to Message 492.  

Kiss and make up, Al and Steve. Go on, you know you want to. :-)
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.

STE\/E
Joined: 1 May 09
Posts: 37
Credit: 720,416
RAC: 29
Message 495 - Posted: 28 May 2009, 20:02:48 UTC - in response to Message 490.  
Last modified: 28 May 2009, 20:15:09 UTC

At this point in time it doesn't matter if you aborted or reset. I think that all those tasks trying to report will make your computer slow as well, so in your case you did the right thing, Steve.

Now, could you please set a lower cache on this project, to prevent this from happening again?


Jord, I find it Weird that my Cache setting would have anything to do with being able to Upload WU's. That must be some newfangled thing with one of the newer releases of BOINC, but I'm still running v6.5.0 so it shouldn't be affected ... ;)

Anywho my Cache is only @ .5 Day's now, I already short myself by running that low because the GPUGrid Project won't give me my 1 Per Core with a setting that low but I could try a Lower Resource Share Setting & see what happens, having a lower Resource Share to the Project should make Uploading WU's much easier ... ;)

PS: I just discovered I still have some WU's left on a few Box's, for some reason they didn't get Reset like the rest. Should I now Abort/Reset them or what. It looks like they were all sent to me @ 28 May 2009 16:31:23 UTC time.

Also is it safe to Download more WU's if there is any to Download ???

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 496 - Posted: 28 May 2009, 20:15:04 UTC - in response to Message 495.  

You could just attach one computer (per platform) for testing purposes, until the major kinks have been ironed out. We're not using a GPU platform (yet), so if you have a system without a CUDA card, attach that one only.

Let your high rollers do the other projects you're attached to without giving them the burden to take on a starting up alpha project with all its problems as well.
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.

Ageless
Joined: 11 Apr 09
Posts: 172
Credit: 7,631
RAC: 0
Message 499 - Posted: 28 May 2009, 20:51:43 UTC - in response to Message 495.  

PS: I just discovered I still have some WU's left on a few Box's, for some reason they didn't get Reset like the rest. Should I now Abort/Reset them or what. It looks like they were all sent to me @ 28 May 2009 16:31:23 UTC time.

Those are probably newly downloaded tasks or resends. When you reset, it'll delete all tasks in progress. The project did at one time have the "resend lost results" option turned on at the server, though, and that may still be the case.

I just set DD to No New Tasks (NNT) for the time being (for today).

Also is it safe to Download more WU's if there is any to Download ???

I'm not sure how far along Jack is with the repairs. If you don't trust it, try it on one machine only and leave the rest on NNT.
Jord

'Cause you seem like an orchard of mines, Just take one step at a time.

STE\/E
Joined: 1 May 09
Posts: 37
Credit: 720,416
RAC: 29
Message 500 - Posted: 28 May 2009, 21:01:57 UTC
Last modified: 28 May 2009, 21:02:56 UTC

I just got rid of them via Reset, none were running anyway because I suspended them when I saw them earlier. I left my lone non GPU Capable Box calling for work to see if it gets any & if it does it won't interfere with the GPU Capable ones.

STE\/E
Joined: 1 May 09
Posts: 37
Credit: 720,416
RAC: 29
Message 507 - Posted: 30 May 2009, 9:59:38 UTC

Still having major problems Uploading Finished WU's, it takes over an hour to Upload just 1 WU. Even with Cache Settings of .1 you're going to get a Dozen or more WU's, then 4 of them take off running in High Priority Mode, finish & clog up your System trying to Upload the 60mb+ File.

I had to Abort 6 or so Finished WU's last night again on 1 system because just the 6 of them had been trying to Upload for 1.5 hours already and were only 25% done. Then they started to freeze the system & it rebooted itself a few times so I just aborted them and the box ran fine again.

On another box I took a chance and let the 4 that were trying to Upload continue to do so while I went to bed for the night. I suspended the other Drug WU's though, and when I got up the 4 that were trying to upload were gone & Reported back to the Server. I'm trying to run just 1 @ a time right now but it's hard to monitor all box's to make sure they don't run too many & clog up the systems with uploads.
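
As a rough back-of-the-envelope check on the figures above (a sketch only, not a measurement): a 60 MB result that takes about an hour to upload works out to an average rate somewhere in the high teens of KB/s. The Python snippet below does that arithmetic. The function names are made up for illustration, it assumes 1 MB = 1024 KB, and it assumes the connection is the only bottleneck.

```python
def upload_rate_kb_per_s(size_mb: float, hours: float) -> float:
    """Average rate in KB/s needed to move size_mb megabytes in the given hours."""
    return size_mb * 1024 / (hours * 3600)


def hours_to_upload(total_mb: float, rate_kb_per_s: float) -> float:
    """Hours needed to upload total_mb megabytes at a steady rate in KB/s."""
    return total_mb * 1024 / rate_kb_per_s / 3600


if __name__ == "__main__":
    # Figures taken from the post: one 60 MB file, roughly one hour per upload.
    rate = upload_rate_kb_per_s(60, 1.0)
    print(f"average rate: {rate:.1f} KB/s")

    # Four finished results of the same size sharing that same rate.
    print(f"four results: {hours_to_upload(4 * 60, rate):.1f} hours")
```

At that kind of rate, four results finishing together would keep the connection busy for around four hours, which fits with the uploads only clearing overnight.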

Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 510 - Posted: 2 Jun 2009, 1:09:38 UTC - in response to Message 509.  

I'll first say that I think we won't need as much data as we have been collecting so far with GROMACS, so future uploads should be quite a bit easier. With regard to estimating FPOPS, yes, I'll change that next time. I was trying to gauge it.

zombie67 [MM]
Volunteer tester
Joined: 25 Apr 09
Posts: 58
Credit: 1,785,257
RAC: 4
Message 533 - Posted: 5 Jun 2009, 10:31:01 UTC

I can't upload completed tasks. "Server is out of disk space."
@zombie_67


Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 536 - Posted: 5 Jun 2009, 10:39:52 UTC - in response to Message 533.  

apparently the trajectory files just take too much disk space :-(

X1900AIW
Joined: 1 May 09
Posts: 1
Credit: 4,129
RAC: 0
Message 537 - Posted: 5 Jun 2009, 11:27:48 UTC

Serious problems in the upload process: finished uploads keep restarting again and again ... crunching is faster than uploading the results?


zombie67 [MM]
Volunteer tester
Joined: 25 Apr 09
Posts: 58
Credit: 1,785,257
RAC: 4
Message 538 - Posted: 5 Jun 2009, 11:47:15 UTC - in response to Message 536.  

apparently the trajectory files just take too much disk space :-(


What's the solution? You need to buy more drives? Please don't say you'll have to delete the p0rn! ;)
@zombie_67


Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 539 - Posted: 5 Jun 2009, 12:23:50 UTC - in response to Message 538.  

Funny thing, for the exact same parameters, some people upload about 9 MB and other people upload about 40 MB ???

Weird... I have not been able to determine if there is a correlation between platforms. I will be monitoring this.

STE\/E
Joined: 1 May 09
Posts: 37
Credit: 720,416
RAC: 29
Message 540 - Posted: 5 Jun 2009, 13:44:34 UTC
Last modified: 5 Jun 2009, 13:50:53 UTC

The Project should Package a Bottle of Drano with each WU, I have about 60 WU's that have each been trying to Upload about 38.5kb for the last 3-4 hour's now & in the process dragging my Systems and the other Projects down. I hate to abort that many Finished WU's but I need my Box's back to at least run the Projects you can return WU's to ... 0_o

Even Suspending all Box's but 1 with only 1 or 2 WU's to Upload does no good, they simply refuse to Upload or take 1/2 a day to do it if at all ...

Tim Turner
Joined: 1 May 09
Posts: 570
Credit: 184,322
RAC: 0
Message 567 - Posted: 6 Jun 2009, 19:07:07 UTC
Last modified: 6 Jun 2009, 19:08:51 UTC

just making sure that you want us to run gromacs .07 Run_1244?

i have 1 error so far.
1 of 4 have run.....

error message.... output file..
6/6/2009 2:35:15 PM DrugDiscovery [error] Can't rename output file run_1244238373618045179_2_0 to projects/boinc.drugdiscoveryathome.com/run_1244238373618045179_2_0: Error 2
Tim Turner
Public Relations Admin
Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
If you need help via voice or Convo; PM me and i will give you details on where i will be; Teamspeak, Yahoo Messenger, or Skype.

Heavy Metal Dungeon Keeper
Joined: 2 May 09
Posts: 22
Credit: 81,216
RAC: 0
Message 568 - Posted: 6 Jun 2009, 20:22:41 UTC

Linux WUs on Ubuntu 9.04 x64 have a compute error:

06/06/2009 20:25:27 DrugDiscovery [error] Can't rename output file run_1244238375067792285_1_0 to projects/boinc.drugdiscoveryathome.com/run_1244238375067792285_1_0: Error -1


Jack Shultz
Joined: 10 Apr 09
Posts: 503
Credit: 120,150
RAC: 0
Message 570 - Posted: 7 Jun 2009, 4:16:02 UTC - in response to Message 567.  

Because these are stochastic modeling techniques, it does not surprise me that certain runs fail. Keep crunching, you should get the credit eventually.

Saenger
Joined: 21 Apr 09
Posts: 53
Credit: 52,036
RAC: 44
Message 593 - Posted: 12 Jun 2009, 19:56:54 UTC

Same here since some time:
Fr 12 Jun 2009 02:03:56 CEST|DrugDiscovery|Starting run_1244238480638357728_1
Fr 12 Jun 2009 02:03:56 CEST|DrugDiscovery|Starting task run_1244238480638357728_1 using gromacs version 205
Fr 12 Jun 2009 02:04:04 CEST|DrugDiscovery|[error] Can't rename output file run_1244238480638357728_1_0 to projects/boinc.drugdiscoveryathome.com/run_1244238480638357728_1_0: Error -1
Fr 12 Jun 2009 02:04:04 CEST|DrugDiscovery|Computation for task run_1244238480638357728_1 finished

It's result #619643, the stderr.out is:
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
wrapper: starting
TASK::parse(): unexpected text 6666666666
TASK::parse(): unexpected text 6666666666
TASK::parse(): unexpected text 6666666666
TASK::parse(): unexpected text 6666666666
TASK::parse(): unexpected text 6666666666
TASK::parse(): unexpected text 66666666666
TASK::parse(): unexpected text 6666666666
TASK::parse(): unexpected text 66666666666
TASK::parse(): unexpected text 6666666666
TASK::parse(): unexpected text 6666666666666
TASK::parse(): unexpected text 66666666666
wrapper: running ../../projects/boinc.drugdiscoveryathome.com/unzip_0.200_x86_64-pc-linux-gnu (-qq -n gromacs.zip)
wrapper: running gromacs/bin/pdb2gmx_d.exe (-ff amber03 -water tip3p)
gromacs/bin/pdb2gmx_d.exe: relocation error: ./libc.so.6: symbol _dl_tls_get_addr_soft, version GLIBC_PRIVATE not defined in file ld-linux-x86-64.so.2 with link time reference
app exit status: 0x7f00
called boinc_finish

</stderr_txt>
]]>
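
For anyone collecting these from their own logs: the "Can't rename output file" lines quoted in this thread use the same wording in both message formats posted here (timestamp with spaces, and the pipe-delimited one), so a small script can pull out the file name and error code. This is a minimal sketch in Python and not part of BOINC. The function name and the regular expression are assumptions based only on the three lines posted in this thread.

```python
import re

# Matches the wording of the error as it was posted in this thread.
RENAME_ERROR = re.compile(
    r"Can't rename output file (\S+) to (\S+): Error (-?\d+)"
)


def find_rename_errors(log_text):
    """Return (output_file, destination, error_code) for each matching line."""
    errors = []
    for line in log_text.splitlines():
        match = RENAME_ERROR.search(line)
        if match:
            errors.append((match.group(1), match.group(2), int(match.group(3))))
    return errors


# Lines copied from the posts above, plus one line that is not an error.
SAMPLE_LOG = """\
6/6/2009 2:35:15 PM DrugDiscovery [error] Can't rename output file run_1244238373618045179_2_0 to projects/boinc.drugdiscoveryathome.com/run_1244238373618045179_2_0: Error 2
06/06/2009 20:25:27 DrugDiscovery [error] Can't rename output file run_1244238375067792285_1_0 to projects/boinc.drugdiscoveryathome.com/run_1244238375067792285_1_0: Error -1
Fr 12 Jun 2009 02:03:56 CEST|DrugDiscovery|Starting run_1244238480638357728_1
Fr 12 Jun 2009 02:04:04 CEST|DrugDiscovery|[error] Can't rename output file run_1244238480638357728_1_0 to projects/boinc.drugdiscoveryathome.com/run_1244238480638357728_1_0: Error -1
"""

if __name__ == "__main__":
    for output_file, destination, code in find_rename_errors(SAMPLE_LOG):
        print(f"{output_file}: Error {code}")
```

Grouping the results by error code would show at a glance whether the Windows report (Error 2) and the Linux reports (Error -1) keep splitting that way.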