
Software raid within Windows


smoth

Member
Joined: Jul 26, 2009
I am currently studying in Germany doing computer simulations and estimate that I will have roughly 1.5TB of data by the time I leave in 2 weeks (I currently have ~900GB). I am realizing that I came over here rather unprepared, but fortunately I brought a 2TB HDD, mainly for entertainment... needless to say, I have deleted all of my entertainment and have started loading it up with my simulation results. However, since this is for my PhD thesis, I need to ensure this data is relatively safe (which I shall define as being able to withstand a drive failure). My current strategy for my research, photography, and music collection is to install whatever drives I have on hand (currently two 1TB and five 2TB) and use some software to sync folders between drives, wherever I can find space to hold a full copy. This is inexpensive, easy to set up, and expandable (up to a point), but inefficient.
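To give an idea of what I mean by syncing, each of these jobs is basically just a mirror run of one folder onto whichever drive has room, something along these lines (drive letters, folder names, and the choice of robocopy are only illustrative; any sync tool would do):

    rem mirror the simulation results from the 2TB drive onto a drive with free space
    rem /MIR keeps the destination identical to the source (including deletions)
    rem /R:2 /W:5 limits retries so one flaky file doesn't stall the whole run
    robocopy E:\Simulations G:\Sync\Simulations /MIR /R:2 /W:5 /LOG:G:\Sync\sim_sync.log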

I have no need for the benefits of a hardware RAID solution, especially given the cost of a 10+ drive setup, and I find it very limiting in upgradability, especially for personal use. Before I dig myself into too deep of a hole, I need a more elegant solution, as I have to store this data reliably for the next 10+ years.

I am thinking a software RAID system, such as FlexRaid, is my best bet for several reasons:

-expandable with drives of any size, which makes for easy upgrades as drive sizes grow without having to replace the whole array or hunt down outdated models
-compatible with normal drives (does not require TLER, etc.)
-does not require expensive controller cards
-significantly cheaper than hardware solutions because of the two items above (I am a student, after all)
-customizable redundancy


I have been scouring the forums and don't see much mention of FlexRaid.

Desired solution:
-expandable with any size drive
-able to use any type of drive
-able to read drives individually if the computer crashes
-inexpensive
-improved read performance would be nice (simulation files are on the order of 10-200GB)
-power savings are always good
-capable of surviving 1 drive failure per 5 drives in the array


So I have a few concerns/questions for those who have, or have knowledge of, software RAID solutions that can function within a Windows environment:


1) It looks like FlexRaid will allow me to build a single-volume array with parity using my drives that already contain data... Is this risky?

2) Are there any other solutions besides FlexRaid that will run in Windows? FlexRaid seems to be under a lot of scrutiny for a variety of reasons... not exactly what you like to see in something meant for data protection. Unraid seems to fare better... could I run it in a VM?

Any other suggestions/advice would be greatly appreciated!

I will be turning my current rig into a dedicated fileserver/htpc this summer when I build a new workstation.

EDIT: the more I read, the more unhappy I become with these options.
FlexRaid: least expensive solution, but the poor documentation and support are concerning.
Unraid: only 1 parity drive... 1 drive of protection on a 10-20 drive array is not much better than nothing, and it is expensive.
ZFS: cannot run in Windows, requires too many resources, and requires all drives to be the same size.
Hardware: all drives must be the same size, all drives must have the RAID feature set, requires a battery backup, and is simply unaffordable at the array size I require (>$2k).

I may just have to dump all my movies onto an external drive and be very stingy with my space usage until I can build another rig, and try to get a 2TB-drive ZFS array underway. Can you run multiple ZFS arrays on the same system? Could I add a 4TB drive array in a few years to expand storage, or will I always be stuck with 2TB drives?

Thanks for your help!
 
I think you're right that I just need to do it right. By purchasing more drives for a temporary solution now, I would only be digging myself a deeper hole, since I would have to transfer it all to a new parity-based system later for the added storage efficiency. With some compression and some reorganizing and condensing, I think I should be able to manage by using my external 2TB drive until the summer, when I build a new system. It will be close, and I will only have about 200GB to play with across my drives, but it should work.

This summer, I will be transitioning my current system into a ZFS fileserver. I will start it off with three 4TB drives, transfer my data off my current drive mess, add my 2TB drives as a second pool, and use my two 1TB drives in a RAID 0 config in my new workstation as active storage, backed up on the file server.
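From what I have read, ZFS has no problem running multiple pools side by side, and a pool can later be grown by adding another vdev of bigger drives. Roughly what I have in mind on the command line (pool names and device paths are placeholders, and raidz1 is just my assumption for the layout):

    # main pool: three 4TB drives in a raidz1 vdev (survives one drive failure)
    zpool create bigpool raidz1 /dev/sdb /dev/sdc /dev/sdd
    # second, independent pool built from the existing 2TB drives
    zpool create oldpool raidz1 /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi
    # years later, capacity can be expanded by adding another vdev (e.g. larger drives) to a pool
    zpool add bigpool raidz1 /dev/sdj /dev/sdk /dev/sdl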

As you can see, I will be starting with 22TB of raw disk space, which, from what I have read, calls for roughly 1GB of RAM per TB. I have been considering upgrading to 24GB for a while (the max of the board), and with suggested ZFS configurations like that, it will not go to waste long term. However, I would then be maxed out at 24GB while just getting the file server started (it will still have 7 or 8 drive slots left). I realize the recommendations are aimed at enterprise use, so I am wondering whether 24GB will be sufficient for, say, a 40TB array in a home environment over a gigabit connection?
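Putting rough numbers on that rule of thumb (treating it as a guideline rather than a hard requirement):

    22TB raw x ~1GB RAM per TB ≈ 22GB -> 24GB would just cover the starting pool
    40TB raw x ~1GB RAM per TB ≈ 40GB -> well past my 24GB ceiling, hence the question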

In case you haven't noticed, I am not a fan of temporary solutions, and usually like to have a 10+ year outlook on any computer purchase.
 
I'm looking for the B word and not seeing it. Is a backup strategy outside of this discussion? I'm sure you want some sort of reliable backup for this data that you want to use for the next ten years.
 
About the B word... I have a few categories of data, each with its own backup solution:

entertainment media: backed up on original disks
documents: Google Drive
research data (sans simulation results): Dropbox/school's server
photos: 50% backed up at my parents' house

The remaining data (pictures, simulation results, animations; ~4TB) I have no easy way to back up and maintain with regular syncs, so I am relying solely on mechanical fault tolerance.

While it would suck to lose the original picture and animation files, at least all of the important ones are on Facebook, albeit in low quality, and my art prints are uploaded to my printer in full quality. The main reason I would need my simulation results after I analyze them is if I need to defend a patent in court. So if I were to have some sort of catastrophic failure and had my patents questioned, I would need to have my simulation results recalculated (I have the parameter files backed up on Dropbox). While that would be a major pain, as many of them were run on custom FEM packages at a variety of universities around the world, it should be doable under those circumstances.
 
I have lost way too many drives to leave critical data in only one location... I just wish the university I am visiting now had done the same. Their file server crashed yesterday and corrupted ALL of the data... I lost about 400GB of simulation results. It will take about 3600 CPU hours to recalculate the data, and I leave to go home in 7 days. Keep in mind that everyone else is trying to use the servers and they only have 32 cores... needless to say, I am not happy, as I now won't be able to run a set of simulations that I really needed. Lesson learned... ALWAYS have a duplicate of mission-critical data on another machine... especially if your business depends on it.
 
I have 3 NAS devices, and every month I back them up to 3 separate 2TB externals. I made a simple batch file to do it... then I unplug them. Matter of fact, I think I'm going to do it tonight.
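A batch file like that doesn't need to be much more than one robocopy mirror line per NAS share, roughly like this (share names and drive letters are only placeholders, not the actual script):

    @echo off
    rem monthly backup: mirror each NAS share onto its own external drive
    robocopy \\nas1\data X:\nas1_backup /MIR /R:2 /W:5 /LOG:X:\nas1_backup.log
    robocopy \\nas2\data Y:\nas2_backup /MIR /R:2 /W:5 /LOG:Y:\nas2_backup.log
    robocopy \\nas3\data Z:\nas3_backup /MIR /R:2 /W:5 /LOG:Z:\nas3_backup.log
    rem when these finish, the externals get unplugged and put away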
 
I used to do that... but lately I have been having trouble keeping up with just having space to store my files, let alone double the space. Over the past 2 weeks I have generated 1.4TB of simulation results. That's the main reason I want a parity-based redundancy system: to increase the efficiency of my array. On top of the 3 new drives I will need, it will also increase the efficiency of the rest of my storage, so it will give me an even greater net gain of space along with an increase in data safety.
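Just to put rough numbers on the efficiency argument (assuming a single parity drive across the whole set):

    full duplication (my current sync approach): 2 copies of everything -> ~50% of raw space usable
    single parity across 6 drives (5 data + 1 parity) -> ~83% of raw space usable, and it still survives 1 drive failure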

Probably in 4-5 years when I build my next system I will be able to have 2 file servers...and hopefully drives big enough to make a mirror backup of the other server financially viable.
 