
[Home Server Build] - My first server, ZFS, RAID 5 and other networking stuff


Eichhorn18
Member · Joined Feb 2, 2005 · Alberta, Canada
I wanted to create this as the start of a build log that will eventually (hopefully) produce a neat little home server. This is my first attempt at building a server, and while I don't have all the details worked out yet, I hope some of you will help me shape what I need and enjoy seeing the build come to life. This started about a month ago with an earlier discussion about my motivations, the prebuilt NAS options I was considering at the time, and some recommendations to run Linux on the server (CentOS, Samba, etc.). That discussion is linked below, and I will continue it here.

http://www.overclockers.com/forums/showthread.php?t=725275&highlight=Eichhorn18

Project Objectives:
  • Provide storage capacity for media to be streamed across my home network
  • Provide a secure location for archived data (accessed infrequently)
  • Provide data redundancy without outsourcing or transferring data offsite
  • Upgrade my network infrastructure to "future proof" my home
  • Learn to use a new OS
  • Access my data from PC and Mac workstations (currently 4 in total)
  • Have fun doing it all

To get some help with all this, I want to open it up for further discussion. Below I've laid out the questions I haven't been able to resolve through my own research. I've already mentally divided this project into three parts.

  1. The server hardware
  2. The network and connected devices
  3. The server software

Budget

I figure this is fairly critical when asking for hardware recommendations. I am lucky in that I have an open-ended budget (within reason), but I have a general idea of what I'd like to spend. That said, if I am compromising something, please point it out.

  1. Server hardware - $1300 - $1800 (this I am fuzzy on)
  2. Network upgrade - $500 - $600 (I am fairly certain this is enough)

This is part of a larger plan to reuse parts from my current gaming workstation (specs in my sig) to free up funds for upgrading my gaming rig. I'd appreciate comments on which hardware from my current rig might be useful, or overkill, in a server environment.

(I am in Canada, so all prices are quoted in CAD and sourced from newegg.ca or CablesToGo.)

Details & Questions

The network upgrade is where I started thinking about all this. I have cable internet that runs to an old Linksys WRT54G wireless router. First and foremost, I was thinking of replacing it with a new router and getting on board with the new AC standard. I do not own a switch.

  • What do people think of Asus networking devices?

Networking buy list
  1. Asus RT-AC66U Dual-Band Wireless-AC1750, $189
  2. NETGEAR 16 Port Gigabit Business-Class Desktop Switch, $169
  3. 1000ft Cat6 UTP 550 MHz Solid PVC CMR-Rated Cable, $280
  4. RJ45 Cat5E Modular Plug - 25pk, $23

I wanted to buy a larger spool of cable to run through my house to eventually wire every room with a plug. So the 1000ft now is sort of a future investment.

Server buy list

This is where I am looking for recommendations.

  • A quad gigabit network adapter
  • A RAID controller card
  • Storage (I want to have approximately 9 TB of usable storage)
  • Motherboard, PSU, RAM, CPU etc.

This is where I'm interested in reusing parts from my current build, if that makes sense given that I plan to use ZFS. That said, I also want recommendations for a completely fresh build.
Go wild.

RAID 5, ZFS and SAS vs. SATA

I've spent several days researching these topics and have realized just how much has changed since I built my last computer. My first and foremost question before getting into ZFS is:

What are the benefits and financial drawbacks of enterprise vs. consumer-level hardware, drives, etc.?

What do I gain and lose by using SAS or SATA for this server?


I can't find good information on SAS drives and know only that they are more expensive than SATA drives.

Previous discussion with others has led me to believe that running a server accessible from both PCs and Macs means I need a Linux OS and something called Samba. I've also read a lot about ZFS, but it seems like some people have not bought into it or are hesitant to use it yet.

Is ZFS the best option for my goals?


If I use ZFS, does this change my hardware requirements?

Finally, I think RAID 5 is my best option for the amount of storage I want.

Is RAID 5 the right way to go?

Hardware Specifics

My thought for the server was to buy a RAID controller capable of handling up to 8 drives. The LSI SAS versions (there are several) are expensive! This is making a case for sticking to SATA. Thoughts?

Can someone recommend how I should handle the RAID part of this server?

The whole networking part of this is fun for me and I had the following scheme in mind.

Internet ==> Modem ==> Router ==> Switch <=(Quad gigabit)= Server

Switch =(Quad gigabit)=> Main workstation
Switch =(Dual gigabit)=> other computers
Switch =(single gigabit)=> xbox 360

The router in this setup broadcasts AC & N signal to two laptops and cell phones.

My thinking with the quad gigabit was to maximize throughput from the server to the switch. I am not familiar with this kind of setup, but I know I need the appropriate network card.

Looking for recommendations on setup of the network and purchase of the network card for the server.


Linux, Samba, etc.

For now I'm going to leave this out and focus on the build. Thideras got me going on learning Linux, which has been a steep learning curve. More to come on this, but if anyone has links to really well-laid-out resources, I would appreciate them. I'm still in the learning stages.



Phew... I think that's all for now. Looking forward to getting feedback on this and getting things ready to purchase / build.

Cheers and thanks in advance for all your support and comments. When I start building I will keep a detailed build log here, along with the software setup.
 
I would drop the AC66U router and run a pfSense virtual machine instead, then put those 200 bucks toward access points (if you need WiFi). Also, 25 connectors for 23 CAD looks really expensive. Try to source bulk RJ45 plugs; you can find them as cheap as 10 bucks for 100. I don't think there is any real difference between them for home use. Maybe at the full 90+5+5 m channel length there would be a difference, but that's not worth worrying about at home, IMHO.

Remember you can buy used equipment. A Dell 5224 shouldn't be more than 100 CAD, and is a killer gigabit switch, with 24 ports.

For NICs, I can't recommend anything but Intel. They're the NICs you want; believe me. Look for used equipment as well, but make sure the cards are PCI Express (PCIe, not PCI-X).

For storage, I would go with WD Reds. They're on sale on the Canadian Newegg, just 10 bucks over standard Seagate drives, they are "prosumer" grade, and they are designed for RAID systems.

Try to grab server-grade equipment (especially ECC RAM; you will love it). Look at the AMD Opteron range, as Xeon is expensive. Even though it might be overkill, look at this ad in the classifieds: it includes a PSU, fans, and some memory. It is an awesome deal, as the board includes two gigabit ports (plus a management port) and an onboard RAID controller comparable to an IBM M1015. With this, you pretty much just have to buy a Norco case and the HDDs.

I'd wait for our storage guru Thideras to answer all the RAID questions.
 
Networking buy list
  1. Asus RT-AC66U Dual-Band Wireless-AC1750, $189
  2. NETGEAR 16 Port Gigabit Business-Class Desktop Switch, $169
  3. 1000ft Cat6 UTP 550 MHz Solid PVC CMR-Rated Cable, $280
  4. RJ45 Cat5E Modular Plug - 25pk, $23
These prices are ridiculous. I would suggest an E3000 for wireless, which is nearly a quarter of the price of that Asus. For a switch, pick up a used PowerConnect 5224; I paid $30 shipped for mine, so just keep an eye out for good deals. That Netgear is probably not even managed and likely complete garbage. I hope for the price of that CAT6 cable it is plenum rated; if not, you are paying way too much. For the plugs, I paid less than $10 for 100.


What are the benefits and financial drawbacks of enterprise vs. consumer-level hardware, drives, etc.?
Enterprise hardware is expensive because extra care is taken to increase its longevity. Parts are more robust and many have error checking built in. It can sometimes be extremely expensive compared to the "home" version, especially with hard drives. However, with a proper setup, you can get by using regular drives in a home server. There are some other differences in what loads they can handle, but you won't hit those in a home environment.

For example, I have family that works in a data center, and I sometimes get to hear stories about the hardware they run. They had multiple disks in a RAID array that would drop to degraded after constant use, but would magically start working if they were left alone. Disks could be swapped out and the problem persisted. It turned out they were hammering the disks hard enough, and for long enough, that the actuator arm was flexing under load and producing read errors. They had to switch to different disks.

What do I gain and lose by using SAS or SATA for this server?
For home use? You lose nothing and save a lot of money.

Is ZFS the best option for my goals?
I think that software RAID will be best for this situation, but whether ZFS is "best" is up to you. The only two problems I run into are that it isn't natively supported on Linux (BSD has it natively and so does OpenSolaris) and that you can't expand the pool like traditional RAID. For example, say you had your 9 TB of usable space and wanted to add another 3 TB drive. You can't*. You have to add vdevs (virtual devices; think of them like small RAID arrays) to the pool. Meaning, you'd want to add drives in sets of three or more. If you are going to be adding drives one at a time, ZFS is not going to work well for you.

*This isn't exactly true, you can add it, but there would be no parity. It would just be a single drive. But, this effectively renders it useless, so there is no point to do it.
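
If it helps, here is a toy capacity sketch of that rule (a minimal Python illustration; the Vdev class and the drive counts are made up for the example, and real pools are of course managed with the zpool/zfs tools, not code like this):

[code]
# Toy model of a ZFS pool made of vdevs, to illustrate why a RAIDZ pool is
# expanded by adding whole vdevs rather than single drives.
from dataclasses import dataclass

@dataclass
class Vdev:
    drive_tb: float   # size of each member drive (assume equal drives)
    drives: int       # number of member drives
    parity: int       # 0 = bare drive/stripe, 1 = RAIDZ, 2 = RAIDZ2

    def usable_tb(self):
        # Parity costs one drive's worth of capacity per parity level.
        return (self.drives - self.parity) * self.drive_tb

pool = [Vdev(drive_tb=3, drives=4, parity=1)]      # 4x3 TB RAIDZ -> 9 TB usable
print(sum(v.usable_tb() for v in pool))            # 9

# Adding one bare 3 TB drive just tacks on an unprotected single-drive vdev;
# you get the space, but losing that one drive now takes the whole pool with it:
pool.append(Vdev(drive_tb=3, drives=1, parity=0))
print(sum(v.usable_tb() for v in pool))            # 12, with a single point of failure

# The sane expansion is another whole RAIDZ vdev (three or more drives):
pool[-1] = Vdev(drive_tb=3, drives=3, parity=1)
print(sum(v.usable_tb() for v in pool))            # 9 + 6 = 15, parity kept throughout
[/code]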

If I use ZFS, does this change my hardware requirements?
It is suggested that you have 1 GB of RAM for every 1 TB of hard drive (not usable space, raw space). If you want 9 TB usable and you are running the equivalent to RAID 5, you need 12 TB of raw disk (assuming 3 TB drives), which means you will want 12 GB of RAM for ZFS alone. Then you need to factor expansion in the future and what else you will use the server for (virtual machines, websites, databases?).
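
A quick sketch of that arithmetic (Python, purely illustrative; it assumes 3 TB drives and the 1 GB per raw TB guideline mentioned above):

[code]
# Quick arithmetic behind the "1 GB of RAM per raw TB" guideline for ZFS,
# assuming 3 TB drives and a single-parity (RAIDZ / RAID 5 style) layout.
import math

drive_tb = 3
usable_target_tb = 9

data_drives = math.ceil(usable_target_tb / drive_tb)   # 3 drives' worth of data
total_drives = data_drives + 1                          # +1 drive of parity
raw_tb = total_drives * drive_tb                        # 12 TB raw
ram_gb = raw_tb * 1                                     # ~1 GB of RAM per raw TB

print(f"{total_drives} x {drive_tb} TB = {raw_tb} TB raw, "
      f"{raw_tb - drive_tb} TB usable, ~{ram_gb} GB of RAM for ZFS alone")
[/code]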

Is RAID 5 the right way to go?
I really don't like giving yes/no answers, because I want the person to understand why I'm suggesting what I'm suggesting and to come to the same conclusion with a bit of research. Following advice blindly is the worst thing you can do on this build. However, I would suggest it, yes. If you are running more than five disks, you should switch to RAID 6. With that much storage, your chances of failure during a rebuild (when the array is most vulnerable) are very high. Having a second parity disk removes that risk, unless a second disk fails as well.

Can someone recommend how I should handle the RAID part of this server?
Grab the M1015, flash it to a HBA (host bus adapter, no RAID) and handle RAID in software. This way, you bypass any limitations of the RAID controller and your array is substantially more portable.

Looking for recommendations on setup of the network and purchase of the network card for the server.
I'd start with a dual Intel network card and expand if you need it. This is only going to help if you are completely saturating a gigabit port, which in a home environment is pretty rare. To combine the ports, you need a switch capable of 802.3ad (the PowerConnect 5224 is) and a bond set up in Linux. I did this and am currently using four gigabit connections to my switch. I never had a problem running a single line; this was done more for fun, because I hadn't done it before.


Linux, Samba, etc.
Post here if you have any specific questions and I will do my best to help out.
 
These prices are ridiculous. I would suggest an E3000 for wireless, which is nearly a quarter of the price of that Asus. For a switch, pick up a used PowerConnect 5224; I paid $30 shipped for mine, so just keep an eye out for good deals. That Netgear is probably not even managed and likely complete garbage. I hope for the price of that CAT6 cable it is plenum rated; if not, you are paying way too much. For the plugs, I paid less than $10 for 100.

So you think picking up an AC wireless router is pointless at this point?

The CAT6 cable is not plenum. I'm surprised, because I've seen other people around the forums quote prices similar to what I found for this cable, and a lot of people use CablesToGo.ca. Thanks for the reality check.

I think that software RAID will be best for this situation, but whether ZFS is "best" is up to you. The only two problems I run into are that it isn't natively supported on Linux (BSD has it natively and so does OpenSolaris) and that you can't expand the pool like traditional RAID. For example, say you had your 9 TB of usable space and wanted to add another 3 TB drive. You can't*. You have to add vdevs (virtual devices; think of them like small RAID arrays) to the pool. Meaning, you'd want to add drives in sets of three or more. If you are going to be adding drives one at a time, ZFS is not going to work well for you.

*This isn't exactly true, you can add it, but there would be no parity. It would just be a single drive. But, this effectively renders it useless, so there is no point to do it.

Further down you recommend getting a hardware RAID card (M1015) and flashing it for portability. By this, do you mean that using a controller would allow me to take the array and move it to a different server in the unfortunate situation where something like the motherboard craps out? How is this possible if the RAID array is set up in software?

Also, is RAID-Z worth considering? Or is that what you are talking about? My understanding of RAID-Z is that it's not an actual RAID array in the traditional sense, but is handled by the ZFS software like a RAID array. I was watching a review of ZFS, and some people were suggesting that RAID is redundant if you have ZFS, since your data sits in a large pool across multiple physical disks that is set up with redundancy.

In the same reviews of ZFS, people were suggesting that using ZFS or RAID-Z in software on top of a hardware controller would be a disadvantage for performance, since ZFS likes to handle the RAID-like setup itself rather than passing it off to a pre-configured controller.

Regarding your comment about ZFS not allowing the addition of drives to expand the storage pool: I've read some reviews suggesting this is not the case, and that the storage pool will "reorganize itself" when new disks are added. You say that adding a new drive to a ZFS pool is doable, but that the drive would not be part of the parity. What do you mean by this?

It is suggested that you have 1 GB of RAM for every 1 TB of hard drive (not usable space, raw space). If you want 9 TB usable and you are running the equivalent to RAID 5, you need 12 TB of raw disk (assuming 3 TB drives), which means you will want 12 GB of RAM for ZFS alone. Then you need to factor expansion in the future and what else you will use the server for (virtual machines, websites, databases?).

Does the RAM module size have any effect on performance? Additionally, I know that Microsoft limits the amount of usable RAM in some cases; is this completely dependent on the OS? Would the same apply to a setup running some Linux OS?

I really don't like giving yes/no answers, because I want the person to understand why I'm suggesting what I'm suggesting and to come to the same conclusion with a bit of research. Following advice blindly is the worst thing you can do on this build. However, I would suggest it, yes. If you are running more than five disks, you should switch to RAID 6. With that much storage, your chances of failure during a rebuild (when the array is most vulnerable) are very high. Having a second parity disk removes that risk, unless a second disk fails as well.

I've heard lots of people say the following: "Your chances of failure during a rebuild are very high." Why is this so?

Grab the M1015, flash it to a HBA (host bus adapter, no RAID) and handle RAID in software. This way, you bypass any limitations of the RAID controller and your array is substantially more portable.

Is this essentially removing the onboard software from the RAID controller and simply using the card as a way to connect up to 16 disks, since the motherboard likely does not have connections for that many?

I'd start with a dual Intel network card and expand if you need it. This is only going to help if you are completely saturating a gigabit port, which in a home environment is pretty rare. To combine the ports, you need a switch capable of 802.3ad (the PowerConnect 5224 is) and a bond set up in Linux. I did this and am currently using four gigabit connections to my switch. I never had a problem running a single line; this was done more for fun, because I hadn't done it before.

I agree. I likely do not need the bandwidth across the network at the moment to make use of four gigabit connections. But it sounds fun. :p




I appreciate everyone's comments. Please forgive me if some of my questions are obvious or common knowledge. This territory in computers is new for me and I am learning.
 
So you think picking up an AC wireless router is pointless at this point?

The CAT6 cable is not plenum. I'm surprised, because I've seen other people around the forums quote prices similar to what I found for this cable, and a lot of people use CablesToGo.ca. Thanks for the reality check.

AC wireless isn't even supported by your everyday hardware yet, so there's no need to buy a new wireless router that will only get cheaper and better with time. If the forums you frequent are "audiophile" forums, you will see people buy $1k 0.5 m AC power cables to "improve and filter the signal"; those are complete gimmicks and status symbols for audiophiles. Back in reality, as long as the cable is certified for Cat 6, you won't have a problem.

I forgot you live in Canada. You are lucky enough to have Monoprice, so use it and get the cables from that site. Believe me, there's no difference between cables as long as they're rated for what you need them to do.

Further down you recommend getting a hardware RAID card (M1015) and flashing it for portability. By this, do you mean that using a controller would allow me to take the array and move it to a different server in the unfortunate situation where something like the motherboard craps out? How is this possible if the RAID array is set up in software?

What is known as HBA mode exposes the raw hard disks to the system, just like a standard AHCI controller; there are no RAID arrays set up at all. Thideras said "for portability" because, if you are running software RAID/ZFS, you can just remove the drives, plug them into another machine with mdadm/ZFS/whatever you use, and they will work. The portability refers to not needing a similar controller to get at the data. If you use LSI's RAID mode, you are bound to LSI controllers: you can't access your RAID array from, say, an Intel desktop controller (ignoring controllers made by LSI). You need an LSI card.

Also, is RAID-Z worth considering? Or is that what you are talking about? My understanding of RAID-Z is that it's not an actual RAID array in the traditional sense, but is handled by the ZFS software like a RAID array. I was watching a review of ZFS, and some people were suggesting that RAID is redundant if you have ZFS, since your data sits in a large pool across multiple physical disks that is set up with redundancy.

In the same reviews of ZFS, people were suggesting that using ZFS or RAID-Z in software on top of a hardware controller would be a disadvantage for performance, since ZFS likes to handle the RAID-like setup itself rather than passing it off to a pre-configured controller.

RAID-Z is RAID 5, but done in ZFS: it is a software RAID array with parity. Hardware RAID (the kind we're used to with desktop boards and RAID controllers) is redundant and not recommended with ZFS because, as you said, ZFS wants to access the raw drives, with no RAID layer in between.


Regarding your comment about ZFS not allowing the addition of drives to expand the storage pool: I've read some reviews suggesting this is not the case, and that the storage pool will "reorganize itself" when new disks are added. You say that adding a new drive to a ZFS pool is doable, but that the drive would not be part of the parity. What do you mean by this?

If you are running a standard ZFS "stripe" pool you can add a new drive to the pool and it will wonderfully accept it and reorganize itself, as you said.

But if you want to add drives to a RAID-Z pool, you need to add vdevs. A vdev is something you define up front, where you set the mode for the array (for example, the parity level). You can't add another drive to an existing vdev; you can only add new vdevs to the pool.

Does the RAM module size have any effect on performance? Additionally, I know that Microsoft limits the amount of usable RAM in some cases; is this completely dependent on the OS? Would the same apply to a setup running some Linux OS?

Microsoft limits the RAM on home systems and servers to harvest some extra money from licenses, yes. For example, Windows 7 Home Premium is limited to 16 GB. With Linux, those absurd limitations do not exist, as long as you are using a 64-bit kernel or a 32-bit kernel with PAE.

I've heard lots of people say the following: "Your chances of failure during a rebuild are very high." Why is this so?

A RAID array stripes the data and sends it to different drives at the same time to increase speed. That means no single drive has a full file (excluding small files that fit within the stripe size). If a drive fails, you lose all the data, because you have lost pieces of every file. To address this, we have RAID arrays with parity (RAID 5, RAID 6, RAID-Z, RAID-Z2).
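
To make the parity idea concrete, here is a tiny sketch (Python, purely illustrative; the xor_blocks helper and the sample data are made up for the example) of how single parity lets one lost block be rebuilt from the survivors:

[code]
# Minimal illustration of single-parity reconstruction, the XOR idea behind
# RAID 5 / RAID-Z: parity is the XOR of the data blocks in a stripe, so any
# one missing block can be recomputed from the surviving blocks plus parity.
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One stripe spread across three data drives:
d1, d2, d3 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([d1, d2, d3])        # written to the parity position

# Drive 2 dies; its block is rebuilt from the survivors plus the parity:
rebuilt = xor_blocks([d1, d3, parity])
assert rebuilt == d2

# Lose a second block before the rebuild finishes and there is no longer
# enough information left -- which is exactly what the second parity in
# RAID 6 / RAID-Z2 protects against.
[/code]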

If a drive dies on a RAID array with one parity (RAID 5, RAID-Z), you can repair the array, recreating the lost disk onto a new one using the parity stored on the others. During this process your array is pretty vulnerable, because if another drive dies you lose everything. If all the drives are from the same batch, they will probably have a similar MTBF, so the chance of another drive dying right then is high. And if another drive does die, you can't repair the array.

We can address this using two parity stripes (RAID 6, RAID-Z2). This way, we have two parity records that let us rebuild up to two disks: if a disk dies we can rebuild it, and if another disk dies during the process we can rebuild that one as well, at the same time.
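
A rough back-of-the-envelope sketch of why that matters (Python; it assumes 3 TB drives and the commonly quoted 1-in-10^14 unrecoverable read error spec for consumer drives, so all numbers are illustrative rather than guaranteed):

[code]
# Back-of-the-envelope look at rebuild risk: rebuilding a single-parity array
# reads every surviving disk end to end, and consumer drives commonly quote an
# unrecoverable read error (URE) rate of roughly 1 per 1e14 bits. With only
# one parity, a single URE during the rebuild means data is lost.
ure_per_bit = 1e-14          # commonly quoted consumer-drive spec (illustrative)
drive_tb = 3
surviving_drives = 3         # e.g. a 4-drive RAID 5 rebuilding after one failure

bits_read = surviving_drives * drive_tb * 1e12 * 8
p_hit_ure = 1 - (1 - ure_per_bit) ** bits_read
print(f"Chance of hitting a URE during the rebuild: {p_hit_ure:.0%}")   # ~51%
# A second parity (RAID 6 / RAID-Z2) lets the rebuild recover from that error.
[/code]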

Is this essentially removing the onboard software from the RAID controller and simply using the card as a way to connect up to 16 disks, since the motherboard likely does not have connections for that many?

I guess you could say so.


I agree. I likely do not need the bandwidth across the network at the moment to make use of four gigabit connections. But it sounds fun. :p
The same goes for Thideras. I hardly think he needs the ridiculous amount of RAM or bandwidth he has. It is overkill, but he has it, and it's fun! I even have a special Gmail filter to notify me of any new posts in his thread! :D

I appreciate everyone's comments. Please forgive me if some of my questions are obvious or common knowledge. This territory in computers is new for me and I am learning.

It's better to ask a lot of questions than to risk valuable data; that's what we are here for! None of us was born with innate computer skills; we have all been newbies at some point, and there's no shame in that. And hey, we all still have a lot to learn. It's easier to learn with a community as friendly as OCF. :3

EDIT: Check this video out for ZFS.


 
So you think picking up an AC wireless router is pointless at this point?
I'm not sure I understand what you are asking. I suggested a different (much cheaper) wireless router.

Further down you recommend getting a hardware RAID card (M1015) and flashing it for portability. By this, do you mean that using a controller would allow me to take the array and move it to a different server in the unfortunate situation where something like the motherboard craps out? How is this possible if the RAID array is set up in software?
The M1015 I suggested would be used as an HBA after flashing, not a RAID card. This means it passes all its disks through to the operating system without touching them. Think of it like a big, fancy SATA disk controller.

Also, is RAID-Z worth considering? Or is that what you are talking about? My understanding of RAID-Z is that it's not an actual RAID array in the traditional sense, but is handled by the ZFS software like a RAID array. I was watching a review of ZFS, and some people were suggesting that RAID is redundant if you have ZFS, since your data sits in a large pool across multiple physical disks that is set up with redundancy.

In the same reviews of ZFS, people were suggesting that using ZFS or RAID-Z in software on top of a hardware controller would be a disadvantage for performance, since ZFS likes to handle the RAID-like setup itself rather than passing it off to a pre-configured controller.

Regarding your comment about ZFS not allowing the addition of drives to expand the storage pool: I've read some reviews suggesting this is not the case, and that the storage pool will "reorganize itself" when new disks are added. You say that adding a new drive to a ZFS pool is doable, but that the drive would not be part of the parity. What do you mean by this?
My comments were for ZFS; RAIDZ is ZFS. Anyone who says you can expand a virtual device by adding hard drives to it is flat-out wrong. You can expand a virtual device by swapping each disk in it for a larger one and resilvering each time. You can also add additional virtual devices, which expands the pool. These three things are not the same.

To expand the pool properly (meaning you keep parity), you need to add whole virtual devices in RAIDZ, which means adding drives in groups of two or more rather than one at a time.

And I don't mean to be insulting, but if you don't understand the basics of ZFS, you really should do some research into how it works. Otherwise none of this is going to make any sense and if you decide to go with it, you are just going to end up with a huge headache. This concept applies to anything in this build, really.

Does the RAM module size have any effect on performance? Additionally, I know that Microsoft limits the amount of usable RAM in some cases; is this completely dependent on the OS? Would the same apply to a setup running some Linux OS?
As long as you are populating the maximum number of memory channels with your configuration, module size doesn't matter speed-wise. It does matter if you intend to upgrade. For example, say you want 16 GB of RAM and have four slots: two 8 GB sticks would run at the same speed as four 4 GB sticks, but the downside of running four sticks is that you'd have to replace all of them to add more RAM.

I've heard lots of people say the following: "Your chances of failure during a rebuild are very high." Why is this so?
Disk usage during a rebuild is high, especially if other applications are using the drives at the same time. The more you hammer on the drives, the longer the rebuild will take and the higher the chance of a failure, which increases the chance of catastrophic data loss considerably. This is why, when you are running a lot of disks (five or more), you should really consider RAID 6 or other levels.

Is this essentially removing the onboard software from the RAID controller and simply using the card as a way to connect up to 16 disks, since the motherboard likely does not have connections for that many?
It doesn't disable the hardware RAID on the controller, but it adds the option to pass through the disks to the operating system. Think of it like a big SATA controller. Without an expander, you could run 8 drives off the card (one port is 4 SATA ports).
 
Make sure that both your network card and your switch support link aggregation and teaming.
 