RitualClarity Posted January 17, 2014 Posted January 17, 2014 The added bubble wrap might have helped protect it but could also let moisture build up. Chances are, however, it wasn't the packing. The varying serial numbers are a good thing. Still only one hard drive known to be bad. If only one is giving errors and trouble, then that can't really be a MOBO error; cable and connections yes, but you checked those. You seem to be doing what I normally do in situations like this when something horrible happens: I expect more horrible things to happen and start imagining the worst is yet to come.
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 Oh.. no no no, I'm not being paranoid. I'm having a real nightmare scenario here. Let me reexplain. It seems that no matter what combination of drives I put in the array, a rebuild *always* fails. The only solution to this problem that has worked thus far is for me to go into the RAID BIOS, delete the array, recreate it, and then restore from backup. The fact that the backups had been failing since Dec 29th was just icing on the cake. The whole point of a RAID-10 is that any disk (or up to half of them, if it's the *right* half) can fail and I will not have to restore from backup, because the array can rebuild. This is why I use RAID-10 and not -5. So yes, only one of the WDs has turned out to be actually defective, but I'm still having RAID problems -- without the WDs being involved at all. Right now the backup is restoring. I'd been trying all day to rebuild the array without restoring from backup, so I could pull out the single WD that was still in there, and it failed every time -- leading to checking the cabling and all that. When it's done, I'll pop a drive out again, and try to rebuild again, just as a test. I have 5 of these Seagate 500G drives, but only use 4 in an array -- never a bad idea to have a spare on hand. My bet is that the array will fail to rebuild again, for no reason I have been able to discover. There haven't been any BIOS updates for my motherboard in ages. When this is done I'll check for new chipset drivers, but I think they're up to date.
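To make the "right half" remark concrete, here is a minimal Python sketch. It assumes a hypothetical four-disk RAID-10 built from two mirrored pairs; the disk names and helper functions are made up for illustration and have nothing to do with any real controller or driver.

```python
from itertools import combinations

# Hypothetical layout (an assumption): four disks, striped across two mirrored pairs.
MIRROR_PAIRS = [("d0", "d1"), ("d2", "d3")]


def array_survives(failed):
    """The set stays readable as long as every mirror pair keeps at least one member."""
    failed = set(failed)
    return all(not (a in failed and b in failed) for a, b in MIRROR_PAIRS)


def surviving_combinations(n_failed):
    """Return (combinations of n_failed disks the array survives, total combinations)."""
    disks = [d for pair in MIRROR_PAIRS for d in pair]
    combos = list(combinations(disks, n_failed))
    return [c for c in combos if array_survives(c)], len(combos)


if __name__ == "__main__":
    for n in (1, 2, 3):
        ok, total = surviving_combinations(n)
        print(f"{n} failed: {len(ok)} of {total} combinations survive")
```

Under these assumptions any single failure is survivable, and a double failure is survivable only when the two disks sit in different mirror pairs.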
panthercom Posted January 17, 2014 Posted January 17, 2014 I knew about the synchronous motors being used in the old electromechanical clocks (and Hammond organs); didn't know the trick had been carried over to digital clocks. The company I work for makes several products with internal clocks. The cheaper model drifts as much as an hour a year because the engineer specified a cheaper crystal. It seems like I read that the government was thinking of relaxing the standards for cycles per second accuracy, which did not amuse me at all since I have some vintage reel to reel tape recorders with synchronous motors. I'd feel bad for the Hammond organ players, but most of them are snobbish towards the guys that play the old transistorized organs like me. (My nick is a shortened version of Panther Combo Organ, which is a funky old Italian transistor organ.) Don't tell me about MP3s; they're the crappy cassette tape of the 21st century. I downloaded some MP3s of stuff I had on vinyl, and was appalled at the sound quality. Of course the kids are blowing their hearing out with their earbuds and MP3 players so they probably won't be able to tell the difference anyway. Clocks that run too fast are probably a result of using cheap crystals for the timekeeper circuit; they come in different tolerances. A/C powered clocks typically do not use crystals or other timing circuits, they use the AC power cycle rate. In the US, the power companies regulate this so it's within some tolerance every day that's better than all but the best crystals and the circuitry is much cheaper. See here and here. It's a well-kept secret that they rely on the HUMAN brain and auditory system to filter out all the high frequency garbage they emit. This is true not just of amps, but of even the MP3 format (and others). "Psychoacoustic analysis" is the term, and it means "the average human brain will filter out the screwed up bits and fill in the gaps."
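For a sense of scale on the "hour a year" figure, crystal tolerance is usually quoted in parts per million (ppm). The Python sketch below just does that arithmetic. The 20 ppm value in the usage note is an illustrative assumption, not a number from the post.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # ignoring leap years


def drift_ppm(drift_seconds, interval_seconds):
    """Convert an observed drift over an interval into parts per million."""
    return drift_seconds / interval_seconds * 1_000_000


def drift_seconds_per_year(ppm):
    """Convert a ppm tolerance into worst-case seconds gained or lost per year."""
    return ppm / 1_000_000 * SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"1 hour/year is about {drift_ppm(3600, SECONDS_PER_YEAR):.0f} ppm")
    print(f"20 ppm is about {drift_seconds_per_year(20) / 60:.1f} minutes/year")
```

An hour a year works out to roughly 114 ppm, while a crystal assumed to be within 20 ppm would stay within about ten and a half minutes a year.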
RitualClarity Posted January 17, 2014 Posted January 17, 2014 The fact that the backups had been failing since Dec 29th was just icing on the cake. The whole point of a RAID-10 is that if any disk (or up to half of them, if it's the *right* half) can fail and I will not have to restore from backup, because the array can rebuild. This is why I use RAID-10 and not -5. Yes. That is the best solution for both speed and protection. Far better than 01 or 5. Costs a little more per gig to operate but definitely worth it .. normally. It seems that no matter what combination of drives I put in the array, a rebuild *always* fails. The only solution to this problem that has worked thus-far is for me to go into the RAID BIOS, delete the array, recreate it, and then restore from backup. That shouldn't happen at all. You should never have to recreate the raid array unless you want to change something. The fact that the array always fails regardless of which combination of drives is in it suggests the drives are not the problem. With 10 you should be fine with a single drive failure and even two if they are the "right" two. When it's done, I'll pop a drive out again, and try to rebuild again, just as a test. I have 5 of these seagate 500G drives, but only use 4 in an array -- never a bad idea to have a spare on hand. My bet is that the array will fail to rebuild again, for no reason I have been able to discover. I agree with you. You can go out and buy from Bestbuy or wherever 4 *proper* drives (TLER etc) and you will still have a problem. Sounds like the RAID chip on the MOBO is dying. Do you have another desktop or system that can do RAID 10? If so try that system. I bet it will recover just fine on that unit. There haven't been any BIOS updates for my motherboard in ages. When this is done I'll check for new chipset drivers, but I think they're up to date. Even if they weren't up to date they worked before.
I am not a software engineer, however I doubt that it would suddenly start to fail because the BIOS or drivers are old. The machine should still function. Corrupted drivers could be a valid cause, though. Slim but possible.
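The "costs a little more per gig" trade-off can be put in numbers. This is a minimal Python sketch of the standard usable-capacity rules for RAID-0, -5, and -10, assuming identical disks; the function name is made up for illustration, and the four 500G drives are the ones mentioned earlier in the thread.

```python
def usable_gb(level, n_disks, disk_gb):
    """Usable capacity for a few common RAID levels, assuming identical disks."""
    if level == "0":
        return n_disks * disk_gb              # pure striping, no redundancy
    if level == "5":
        if n_disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        return (n_disks - 1) * disk_gb        # one disk's worth of parity
    if level == "10":
        if n_disks < 4 or n_disks % 2:
            raise ValueError("RAID-10 needs an even number of disks, at least 4")
        return n_disks // 2 * disk_gb         # every block is stored twice
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    for level in ("0", "5", "10"):
        print(f"RAID-{level}: {usable_gb(level, 4, 500)} GB usable from 4 x 500 GB")
```

With four 500G drives, RAID-10 yields 1000 GB against 1500 GB for RAID-5, so each usable gigabyte costs about one and a half times as much.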
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 I knew about the synchronous motors being used in the old electromechanical clocks (and Hammond organs); didn't know the trick had been carried over to digital clocks. The company I work for makes several products with internal clocks. The cheaper model drifts as much as an hour a year because the engineer specified a cheaper crystal Well they're not motors anymore but yeah, the concept is still used in digital clocks. They were/are talking about relaxing the regulations, but I don't think it'll be terrible if they do it. I believe they are only talking about how much it can drift during the course of a day; they will still make phase adjustments to bring it back in line, they just won't do them as often. Here is one of the ICs. I agree with you. You can go out and buy from Bestbuy or wherever 4 *proper* drives (TLER etc) and you will still have a problem. Sounds like the RAID chip on the MOBO is dying. Do you have another desktop or system that can do RAID 10? If so try that system. I bet it will recover just fine on that unit. I do, and I might try that, but right now I just need to get it back up and running so I can work tomorrow -- not on this laptop. It was very nice today to be back on my desktop with the real keyboard and all the screens. Even if they weren't up to date they worked before. I am not a software engineer, however I doubt that it would suddenly start to fail because the BIOS or drivers are old. The machine should still function. Corrupted drivers could be a valid cause, though. Slim but possible. It still works fine, I've never even had to try a rebuild before. Since this is onboard raid that's not a proper raid controller (LSI, Mylex, Adaptec, etc), the raid "stuff" is actually done by the driver, using the CPU. A bug in the driver could cause exactly this kind of problem.
It hasn't suddenly started to fail -- it was working fine until I started popping drives and trying to rebuild the array. One of the Seagates, I think, was starting to fail. That's what prompted me to order new drives in the first place.
RitualClarity Posted January 17, 2014 Posted January 17, 2014 Since this is onboard raid that's not a proper raid controller (LSI, Mylex, Adaptec, etc), the raid "stuff" is actually done by the driver, I hadn't thought you had a proper raid card. A bug in the driver could cause exactly this kind of problem. It hasn't suddenly started to fail -- it was working fine until I started popping drives and trying to rebuild the array. Ah, I see. Yes, the driver could work perfectly for years without the added stress of rebuilding the array. You never tested the raid system to verify that it was fully functional and could rebuild the array, so you wouldn't know if there was a bug or issue until now... Hence the current issue. I hope it is just a simple driver issue that is an easy fix. You are fast running out of options for a solution. Edit: It was very nice today to be back on my desktop with the real keyboard and all the screens. Triple screens FTW...
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 Oh, there's always The Final Solution. Get a real RAID card.
RitualClarity Posted January 17, 2014 Posted January 17, 2014 A raid card would be much better. Probably handle the error corrections and rebuild much faster as well.
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 Oh software raid is always faster.. way faster. Those little 800 MHz ARM processors and i960s and such cannot keep up with a real processor. But if I can't do something as simple as a rebuild without one.. well.. what's the point. I'll be restoring from backup anyway, so may just go RAID-0!
RitualClarity Posted January 17, 2014 Posted January 17, 2014 I was referring to those professional and semi-professional raid cards, not the common retail cards that most people buy. However I might have to revisit software raid. My info was outdated. Nowadays with current processors the overhead of running even a very large array isn't that bad on a multicore system. The added benefit of not having to search high and low for the exact chipset for recovery if the card goes bad is also a blessing. RAID 0.. you are joking, correct..? Wouldn't you have some issues with low level corruption entering into your data? Data that would later be copied into your backups?
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 Eh those types of errors can make it into just about any raid level, including -10 or -5. If just one disk writes the wrong bit in a -10, there's no authority to say which one is the right one when it's read back up. In a -5, you just won't know about it until you lose a drive. Such errors are pretty rare though, and I can always extend my backup retention time. Cardwise, it's Adaptec, LSI, 3ware, or Intel for me. Mylex was absorbed by IBM then sold off to LSI, and they were always my favorite brand. Those Cheetahs are hanging off an AcceleRAID 170. ICP-Vortex, my other favorite, got gobbled up by Adaptec.
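The two failure modes described above can be sketched in a few lines of Python. This is a toy model with made-up helper names, not how any particular controller or driver behaves: a plain mirror can notice that two copies differ but cannot say which is right, and single-parity XOR will happily "rebuild" garbage if a surviving block was silently corrupted.

```python
from functools import reduce


def mirror_read(copy_a, copy_b):
    """Return (data, consistent). On a mismatch a plain mirror has no tie-breaker."""
    if copy_a == copy_b:
        return copy_a, True
    return None, False


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (the basis of single-parity RAID)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def raid5_rebuild(surviving_blocks):
    """Reconstruct the one missing block from all surviving blocks, parity included."""
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    d0, d1 = b"\x01\x02", b"\x10\x20"
    parity = xor_blocks([d0, d1])

    # Mirror: one copy silently flipped a bit, and nothing says which copy is good.
    print(mirror_read(d0, b"\x01\x03"))

    # Parity: d1 was silently corrupted, then the disk holding d0 dies.
    d1_bad = b"\x10\x21"
    print(raid5_rebuild([d1, parity]) == d0)      # clean rebuild matches
    print(raid5_rebuild([d1_bad, parity]) == d0)  # corrupted input, wrong rebuild
```

In this model the bad block goes unnoticed until a rebuild depends on it, which matches the "you just won't know about it until you lose a drive" remark.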
RitualClarity Posted January 17, 2014 Posted January 17, 2014 ZFS FTW, so I have heard. Can fix errors very well and keep the data intact. Can have up to 3 separate "pools" of data. Really only wants software raid or JBOD SATA cards.
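The reason a checksumming filesystem can "fix errors" where a plain mirror cannot is that a checksum stored apart from the data acts as the missing tie-breaker. The Python sketch below shows only that idea. It is not ZFS code: SHA-256 is used here as a stand-in, and the function names are hypothetical.

```python
import hashlib


def checksum(data):
    """Stand-in block checksum (an assumption; the real on-disk format differs)."""
    return hashlib.sha256(data).digest()


def self_healing_read(copies, expected):
    """Return (good_data, repaired_copies) using a checksum kept apart from the data.

    Raises IOError when no copy matches, i.e. the damage is unrecoverable.
    """
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise IOError("all copies fail the checksum")
    # Overwrite every copy with the verified one.
    return good, [good for _ in copies]


if __name__ == "__main__":
    block = b"payroll.db page 7"
    stored_sum = checksum(block)          # recorded when the block was written
    copies = [b"payroll.db page 8", block]  # first mirror copy silently corrupted
    data, repaired = self_healing_read(copies, stored_sum)
    print(data == block, repaired[0] == block)
```

The design point is that verification needs a reference stored somewhere other than the copies being compared; with only the two copies, a mismatch is detectable but not resolvable.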
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 I use ZFS on my FreeNAS boxes @ work. It takes *gobs* of memory to run efficiently, but it does have some nice features. The only reason I actually use it though is because it's the only fast and reliable way to expand a filesystem, online, with no downtime. It lets me give the database, fileservers, etc just the space they need -- and expand it as they grow. Of course it's all backed by a fibre SAN running RAID10. I'm no fool!
Halstrom Posted January 17, 2014 Posted January 17, 2014 Don't tell me about MP3's; they're the crappy cassette tape of the 21st century I downloaded some MP3's of stuff I had on vinyl, and was appalled at the sound quality. Of course the kids are blowing their hearing out with their earbuds and mp3 players so they probably won't be able to tell the difference anyway. MP3 might not be audiophile quality but it's good enough for me. I like putting 180 tracks on one CD in the car, and I'm not sure it makes that much difference listening to Metallica and Nirvana. Cassette was far worse; the fun with alignment on VIC20 & C64 cassette drives was always a ball. I can also remember we spent a week on a mate's car stereo replacing speakers and speaker wiring because we kept losing the left side speakers, only to find that the Jimmy Barnes tape he kept playing had lost the left channel.
windpl Posted January 17, 2014 Posted January 17, 2014 What I learned from using Norton software: anything that Norton released after 2000, burn it with fire!
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 Instead of burning it with fire.. I've upgraded to the trial of system recovery 2013. I have done one backup and full restore with it and so far, so good. The SRD is 64bit now so no driver problems, yay!
RitualClarity Posted January 17, 2014 Posted January 17, 2014 But still the drive problem? Or have you fixed/found the issue yet?
prideslayer Posted January 17, 2014 Author Posted January 17, 2014 I am still investigating. The NEW new drives also arrive later today. Need to work now so I have it up and running on the old drives, not going to do any more drive popping or testing until after hours. I do have a media patrol running on one of the old ones that I think might be the original bad one, but I doubt it'll finish before I've replaced it -- that's a slow process. I did check drivers last night, all up to date. I will probably just order a real RAID card anyway and just ignore the southbridge. The only other thing on it is my BD and I've never had any problems with that, not that I use it very often. Well, unless I'm booting off a CD to restore from backup that is.
RitualClarity Posted January 17, 2014 Posted January 17, 2014 Here's to hoping that the new drives do the trick.
prideslayer Posted January 18, 2014 Author Posted January 18, 2014 Well after more and more fighting, I'm still... dissatisfied with things. Rebuilds still fail even with the new drives, or at least, rebuilding to a dissimilar drive does. Out of curiosity, I will try rebuilding the new array later on by popping and reinserting a drive, and see what happens. It's worked in the past on the old drives without issue -- every failure I've had in rebuilding (that was not to the one defective WD) was when I put a new drive of a different (but larger) size in, and told the array to rebuild with it. With all new drives installed and restored from backup, bad sectors started popping up on one of the new drives, according to the raid controller. I:
- Removed it from the drive cage and hooked it up directly, still got errors.
- Swapped cables, still got errors.
- Moved it to another channel, errors followed the drive.
At this point you'd think that perhaps I got another bad drive. That wouldn't be impossible, but given the issues I've had, I doubt it's the case. So I gave in and ordered a real RAID controller. I looked at the top brands (except HighPoint.. they are supposedly good now, but I refuse) for what I wanted, and settled on an Areca ARC-1224-8i, w/ the battery backup. It'll just have my 4 drives on it for now, but in the future I can grow into it by replacing them with SAS drives/backplanes, adding more drives, etc. Should be pretty snappy with 1G of cache too. Stay tuned for inevitable problems moving my Win7 install over to this some time next week.
prideslayer Posted January 19, 2014 Author Posted January 19, 2014 Well raid card still on order (obviously) but more weirdness has made me decide it's a motherboard problem, so GO GO GADGET WALLET. Byebye ancient processor and motherboard, hello new. Should get here about the same time as the raid.
RitualClarity Posted January 19, 2014 Posted January 19, 2014 What did you get? Did you get another AMD or did you go for Intel?
prideslayer Posted January 19, 2014 Author Posted January 19, 2014 I decided months ago that next time I switched, I'd go with Intel. AMD has really dropped the ball in terms of performance over the past few years. This PC was built in 2009, so it is a little on the 'old' side. PCIe 2.0, USB 2.0, etc. The one thing that almost made me stick with AMD is the fact that many of their boards support up to 64G of memory, while the Intel consumer chipset tops out at 32. Anyway, ordered:
- Intel Z87
- Xeon E3-1270V3
- 4x8G kit
I'm not a crazy (PC) modder/overclocker/etc anymore, and my requirements for a motherboard are a bit different than most gamers/enthusiasts; e.g. I want an Intel NIC onboard, or at least not a Realtek. Some have a Qualcomm which I could give a chance, their wireless chips are great. If I could get one with no onboard raid, sound, or video I'd be all over it.
pantherlux Posted January 19, 2014 Posted January 19, 2014 wow, a lot of fucked up tech-stuff in such a short time... i would have done go-go-gadgeto-pc-to-the-wall at that point already oO respect for staying that calm. the WD problem sounds so familiar to me, had to send back 2 drives (and the third... let's just say i don't know where it landed). went back to seagate: all fine. even my stoneage scsi drives (about 1 to 2 gib) still work. i hope your new system won't fail you for a long time and your ass is safe then (reading that your pc is your working equipment). just a quick idea: maybe the backup software crashes the drives? i'm no tech-pro with that kind of stuff, but is it possible that it writes an endmarker or something like that to the drive (like in the times where we had autoreverse markers for tape machines and so on), and when the size changes the backup software compares the drives and thinks "hey, that's odd, there's still room to place stuff, ok, let's do this" and then it just fills them up with unusable data and reads it incorrectly after that? to be honest: i don't use backup software, i only copy my data, music, photos and mods/games etc to an external drive and replace them after some time with the new ones, that's it. so it was just a quick idea, pls don't hit me for that
prideslayer Posted January 19, 2014 Author Posted January 19, 2014 Hah.. my not-as-old seagates are still going strong too, ST336607LC's (U320 SCA cheetahs, 36G). I have five and they all still work. Before that I was an ultrastar bigot, and I have some 36Z15's that are still going strong as well. Small capacity and loud as hell force me to retire them. Thanks for the well-wishing. The backup software is not that 'advanced' these days, and doesn't read from or write to raw disks. It just creates backup images that are normal files, and performs the backups using file copying and windows restore points. OS support for that stuff has made it a lot easier (and cheaper) to write backup software. Other than the lack of notification (which was my fault after all), I have nothing but good things to say about Norton Ghost / Symantec System Recovery. I've used ghost for many years, it's never let me down without it being my fault.