System build with optimized SSD


AceNZ
08-31-09, 01:47 AM
I'm just starting on a new high-end PC build, and have decided to use SSD this time around. I've resisted RAID in the past for many reasons, including noise. The idea of having silent, low-power SSD that's also fast has been enough to prompt me to build a new system.

I would be interested in hearing suggestions for high-end components: motherboard, RAID controllers, power supply, etc. My budget is roughly US$10K. My priorities are (in order): reliability, noise, performance, cost.

I have a few questions:

-- How many SSD drives can fit in a single 3.5" or 5.25" drive bay, and what's the best way to pack them in?

-- Does anyone make an external enclosure that holds multiple 2.5" drives, to help simplify drive swaps in the event of failures?

-- What's the best RAID level for SSD? RAID-5 seems like the obvious choice, since having the parity blocks spread out among all drives would distribute the wear from disk writes. RAID-1 or -10 doesn't seem good, since their performance benefits are largely due to offsetting rotational delay, which doesn't exist in SSD. No rotational delay would also seem to offset one of the big perf-related disadvantages of RAID-5.

-- I've read that formatting the drives with a size that's less than the size of the device can help with performance. Is that correct? If so, is it enough of a gain to be worth doing?

-- What's the optimal NTFS cluster size for SSD? With 4KB pages and 512KB erase blocks, it seems like something larger than the Windows default of 4KB would help reduce the number of files sharing each block, and thereby help reduce the number of blocks that need to be erased during updates.

-- What's the optimal stripe size for SSD?

-- Has anyone tried using SSD for a database? If so, how does the write load impact device lifetime?

-- Is there a performance impact of mixing reads and writes on the same drive?

-- Given the 4:1 or so cost-per-byte difference between the Intel X25-M and the X25-E (vs 2.5:1 write performance), when would you consider using the -E? From a performance perspective, it seems like you would end up with a much faster system by striping four Ms together than with a single E.

-- How much does the RAM cache on a typical RAID controller really help? Is 2GB really better than 512MB? Maybe it helps more for Windows than for a database? In the past, I've often seen better DB performance when I disable the cache.
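
To put rough numbers on the -M vs. -E question above, here is a minimal sketch. It uses only the two ratios quoted in the question (4:1 cost per byte, 2.5:1 write performance) and assumes equal-capacity drives and ideal linear scaling for the stripe, which a real RAID controller may not deliver. The function name and normalised units are made up for illustration.

```python
def compare_stripe_to_single_e(cost_per_byte_ratio=4.0, write_perf_ratio=2.5, m_drives=4):
    """Compare a stripe of M drives against one E drive.

    Units are normalised so that one M drive has cost 1.0 and write
    throughput 1.0. Assumes equal capacity per drive and ideal
    (linear) RAID-0 scaling, which is an optimistic simplification.
    """
    e_cost = cost_per_byte_ratio          # one E drive, relative to one M
    e_write = write_perf_ratio            # one E drive, relative to one M
    stripe_cost = m_drives * 1.0
    stripe_write = m_drives * 1.0         # ideal scaling assumption
    return {
        "stripe_cost_vs_e": stripe_cost / e_cost,
        "stripe_write_vs_e": stripe_write / e_write,
        "stripe_capacity_vs_e": float(m_drives),
    }


result = compare_stripe_to_single_e()
print(result)
```

Under these assumptions, four striped Ms cost about the same as one E while offering more write throughput and four times the capacity. The sketch says nothing about write endurance, which is a separate trade-off.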

Super Nade
08-31-09, 09:08 AM
Welcome to the forums! :welcome:

Answers are in orange text. :)

I'm just starting on a new high-end PC build, and have decided to use SSD this time around. I've resisted RAID in the past for many reasons, including noise. The idea of having silent, low-power SSD that's also fast has been enough to prompt me to build a new system.

I would be interested in hearing suggestions for high-end components: motherboard, RAID controllers, power supply, etc. My budget is roughly US$10K. My priorities are (in order): reliability, noise, performance, cost.

I have a few questions:

-- How many SSD drives can fit in a single 3.5" or 5.25" drive bay, and what's the best way to pack them in?
Not sure, my rig is on a benching station. :)

-- Does anyone make an external enclosure that holds multiple 2.5" drives, to help simplify drive swaps in the event of failures?
Something like this?-->http://www.scsi4me.com/q14ss-bundle.html

-- How much does the RAM cache on a typical RAID controller really help? Is 2GB really better than 512MB? Maybe it helps more for Windows than for a database? In the past, I've often seen better DB performance when I disable the cache.
-- What's the optimal stripe size for SSD?
-- What's the best RAID level for SSD? RAID-5 seems like the obvious choice, since having the parity blocks spread out among all drives would distribute the wear from disk writes. RAID-1 or -10 doesn't seem good, since their performance benefits are largely due to offsetting rotational delay, which doesn't exist in SSD. No rotational delay would also seem to offset one of the big perf-related disadvantages of RAID-5.
Somebody else would be able to help you here. :(

-- I've read that formatting the drives with a size that's less than the size of the device can help with performance. Is that correct? If so, is it enough of a gain to be worth doing?
There is no need to do this with Intel, Indilinx or Samsung controllers. Most drives come with sufficient buffer space pre-allocated for maintenance operations.

-- What's the optimal NTFS cluster size for SSD? With 4KB pages and 512KB erase blocks, it seems like something larger than the Windows default of 4KB would help reduce the number of files sharing each block, and thereby help reduce the number of blocks that need to be erased during updates.
AFAIK, it does not matter and I have not seen any reports going one way or the other. I'd suggest throwing your page file on a magnetic drive.



-- Has anyone tried using SSD for a database? If so, how does the write load impact device lifetime?
These devices have not been out long enough to gather such data. One may extrapolate (as Intel did) and come up with a number such as 10 years, but it is not reliable. Intel drives are best suited for server/enterprise performance.

-- Is there a performance impact of mixing reads and writes on the same drive?
What do you mean? Can you elaborate a bit?

-- Given the 4:1 or so cost-per-byte difference between the Intel X25-M and the X25-E (vs 2.5:1 write performance), when would you consider using the -E? From a performance perspective, it seems like you would end up with a much faster system by striping four Ms together than with a single E.
It depends on your application and budget. I would go with the cheaper MLC alternative. See HERE (http://www.oempcworld.com/support/SLC_vs_MLC.htm) AND HERE (http://www.ramsan.com/podandvid/slc_vs_mlc.htm) for the differences between SLC and MLC flash technology.

AceNZ
08-31-09, 09:58 AM
Welcome to the forums! :welcome:

Answers are in orange text. :)

Thanks.

-- Does anyone make an external enclosure that holds multiple 2.5" drives, to help simplify drive swaps in the event of failures?
Something like this?-->http://www.scsi4me.com/q14ss-bundle.html

That would be nice for converting a 5-inch bay (which also answers one of my other questions). I was thinking more along these lines:

http://h18006.www1.hp.com/storage/disk_storage/msa_diskarrays/drive_enclosures/msa70/index.html

but priced more like a computer case and less like a sports car.

-- Is there a performance impact of mixing reads and writes on the same drive?
What do you mean? Can you elaborate a bit?

Let's say I have one program that's reading intensively, and another that's writing intensively. Other than bandwidth, is there any other impact on performance when reads and writes are interleaved? For example, do the devices share internal buffers between reads and writes, or do they take time to switch from read mode to write mode?

Specifically, for a database is there still a performance-related reason for separating data and log files onto separate drives?

-- Given the 4:1 or so cost-per-byte difference between the Intel X25-M and the X25-E (vs 2.5:1 write performance), when would you consider using the -E? From a performance perspective, it seems like you would end up with a much faster system by striping four Ms together than with a single E.
It depends on your application and budget. I would go with the cheaper MLC alternative. See HERE AND HERE for the differences between SLC and MLC flash technology.

Thanks for the links. The big (huge) missing point for me was the 10X increase in write lifetime of SLC over MLC (100,000 vs. 10,000 cycles). That suggests to me that all write-heavy data should go on SLC, with read-only or read-mostly data on MLC.

Super Nade
09-03-09, 08:05 AM
Hi AceNZ,

Sorry for disappearing from the thread. I feel you know more about SSDs than I do, so all I can do is provide a few general answers. This should answer most of your questions-->http://anandtech.com/storage/showdoc.aspx?i=3631&p=3

-Read and write operations (to my knowledge) are synchronized with a clock pulse. The relevant parameters would be the trailing edge or leading edge of such a pulse, so there is no true simultaneity in r/w operations unless they are based off two sync pulse trains (you don't want that anyway due to phase matching/timing complexities).

-As for sharing buffers, I would tentatively say yes (prefaced with what I said above).

-As for the question of bandwidth, the limitation would be the system bus and the SATA delimiter being used.

-The issue with log files is that they are small and do not fit the default 128kb "scrub-cage". This ties in directly with the "stuttering/ fill-up slow downs" issues. The limitation seems to be the choice of the "scrub-cage" -->http://anandtech.com/storage/showdoc.aspx?i=3631&p=5
In the context of a DB (I'm not familiar with DB-specific systems), assuming that most changes are small and random, this could be a serious problem. AFAIK, only Intel has optimized its drives for enterprise systems (random write performance).

-As for a unified metric, it depends on several factors such as lifetime, reliability, performance and cost. You can't have your cake and eat it too. :)

AceNZ
09-03-09, 09:36 AM
This should answer most of your questions-->http://anandtech.com/storage/showdoc.aspx?i=3631&p=3

Thanks for the link -- interesting (long!) article.

-Read and write operations (to my knowledge) are synchronized with a clock pulse. The relevant parameters would be the trailing edge or leading edge of such a pulse, so there is no true simultaneity in r/w operations unless they are based off two sync pulse trains (you don't want that anyway due to phase matching/timing complexities).

My thought was that since some SSDs (including Intel) buffer several requests before issuing them to the NAND array, that a change from read to write or vice versa might cause that request buffer to be flushed early, which would cost performance. But the details depend on the firmware -- the only way to know for sure is to test it.

-The issue with log files is that they are small and do not fit the default 128kb "scrub-cage". This ties in directly with the "stuttering/ fill-up slow downs" issues. The limitation seems to be the choice of the "scrub-cage" -->http://anandtech.com/storage/showdoc.aspx?i=3631&p=5
In the context of a DB (I'm not familiar with DB-specific systems), assuming that most changes are small and random, this could be a serious problem. AFAIK, only Intel has optimized its drives for enterprise systems (random write performance).

Database log files aren't usually small, since a new record is added to the log for every change that's made to the DB. Individual write sizes vary, and are determined by the size of the transaction. Logs are written 100% sequentially; no random writes are required. But since they are write-heavy, they would also reach the 10,000 write limit more quickly -- and TRIM doesn't help, because the log files are never deleted, just overwritten.
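
As a back-of-the-envelope check on how quickly a write-heavy log could hit the cycle limit, here is a minimal sketch. The 10,000 and 100,000 cycle figures are the ones quoted earlier in the thread; the 64 GB capacity and the 20 MB/sec sustained log write rate are made-up inputs for illustration. It assumes perfect wear leveling across the whole drive and a write amplification of 1 (plausible for purely sequential writes, but an assumption all the same).

```python
SECONDS_PER_DAY = 86_400


def estimated_lifetime_days(capacity_gb, pe_cycles, write_mb_per_s, write_amplification=1.0):
    """Days until every cell has used up its program/erase cycles.

    Assumes perfect wear leveling (writes are spread evenly over the
    whole drive) and a constant sustained write rate.
    """
    total_writable_bytes = capacity_gb * 1e9 * pe_cycles
    nand_bytes_per_s = write_mb_per_s * 1e6 * write_amplification
    return total_writable_bytes / nand_bytes_per_s / SECONDS_PER_DAY


# Hypothetical workload: 64 GB drive, 20 MB/sec of log writes around the clock.
mlc_days = estimated_lifetime_days(64, 10_000, 20)
slc_days = estimated_lifetime_days(64, 100_000, 20)
print(f"MLC: {mlc_days:.0f} days, SLC: {slc_days:.0f} days")
```

With those hypothetical inputs the MLC estimate comes out at roughly a year and the SLC estimate at roughly ten. A lighter write load or a larger drive stretches both figures proportionally.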

From a cost perspective, it seems like the best strategy would be to separate write-heavy files onto the -E, and use the -M for read-only or read-mostly data. Unfortunately, that would mean that the read throughput would be less than if all drives were used for both reading and writing with -M drives, but it also means more drives for the same price compared to if they were all -E...

I'm in the middle of designing what I think will be a kick-ass I/O-rich machine. I'm hoping to get over 4 GB/sec out of it. I was thinking of posting a build log and pics somewhere -- SSDs aren't so new anymore, though, so I have no idea if there would be any interest.

Super Nade
09-03-09, 10:53 AM
I need to think a bit more about how buffer flushing is done and how it ties in with controller specifics. It is a very difficult question to answer.

I am not sure about the specifics involved in the over-writing process. AFAIK, the used sectors are marked free, but does the new data get written to these very same sectors? Does this vary between an SSD and a HDD? The type of storage media ought to be transparent to the OS, so these decisions would be left to the controller.

If data volumes are very large magnetic devices still rule the roost. :)

Please post a build log! I am interested to see how this turns out. :)

Mr Alpha
09-03-09, 12:45 PM
Over at IT@Anandtech they did some testing with a bunch of Intel SLC drives in various server and database benchmarks in different RAID configurations and compared them to more traditional SAS setups. Link (http://it.anandtech.com/IT/showdoc.aspx?i=3532&p=1)

Super Nade
09-03-09, 02:03 PM
Nice find Alpha! :beer:

Evilsizer
09-03-09, 04:43 PM
With SSDs, once a sector has been written and its data erased, any new data is written to new sectors of the NAND memory. That is handled at the controller level; it's called wear leveling or something like that. It keeps the life of the drive up by writing to sections of the NAND that haven't been used yet.

If you're talking about modifying data already stored, then I believe the controller takes that data and moves it to a new section to be written to. I could be wrong, but that is what I understood the Intel and newer controllers to do.

AceNZ
09-04-09, 05:42 AM
I need to think a bit more about how buffer flushing is done and how it ties in with controller specifics. It is a very difficult question to answer.

SSD controllers use a form of striping internally. With the Intel controller, requests are accumulated into a 10-wide queue before being issued to the NAND array. If that queue has to be processed early when the array switches from read mode to write mode, then performance will suffer.

I am not sure about the specifics involved in the over-writing process. AFAIK, the used sectors are marked free, but does the new data get written to these very same sectors? Does this vary between an SSD and a HDD? The type of storage media ought to be transparent to the OS, so these decisions would be left to the controller.

It's a multi-step process. One difference between SSD and HDD is that SSD has a separate LBA mapping layer, so physical pages can have varying logical block addresses. When you write to an SSD, if a free page is available, then the data is written to the free page and the LBA map is updated accordingly. If most or all pages have been written, then a block that is partly full is read, combined with the new data (in a DRAM buffer), and written to an empty block. Meanwhile, the old block is scheduled to be erased.
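
The write path described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how any real controller's firmware works: the class and method names are made up, pages are tracked as "free", "stale" or holding an LBA, and garbage collection rewrites the surviving pages into the same block after erasing it, where a real drive would more likely copy them to a spare block first.

```python
class ToyFTL:
    """Toy flash translation layer: LBA map plus page-level bookkeeping."""

    def __init__(self, num_blocks=2, pages_per_block=2):
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        # Each page is None (erased/free), "stale", or the LBA it holds.
        self.pages = [[None] * pages_per_block for _ in range(num_blocks)]
        self.lba_map = {}                      # LBA -> (block, page)
        self.erase_counts = [0] * num_blocks

    def _find_free_page(self):
        for b, block in enumerate(self.pages):
            for p, state in enumerate(block):
                if state is None:
                    return b, p
        return None

    def _garbage_collect(self):
        # Pick the block with the most stale pages as the victim.
        victim = max(range(self.num_blocks),
                     key=lambda b: self.pages[b].count("stale"))
        if self.pages[victim].count("stale") == 0:
            raise RuntimeError("drive full: nothing to reclaim")
        # Buffer the still-valid data (the "DRAM buffer" step).
        survivors = [s for s in self.pages[victim] if s not in (None, "stale")]
        # Erase the whole block, then write the survivors back.
        self.pages[victim] = [None] * self.pages_per_block
        self.erase_counts[victim] += 1
        for i, lba in enumerate(survivors):
            self.pages[victim][i] = lba
            self.lba_map[lba] = (victim, i)

    def write(self, lba):
        # An overwrite never updates in place: the old copy goes stale.
        old = self.lba_map.pop(lba, None)
        if old is not None:
            self.pages[old[0]][old[1]] = "stale"
        slot = self._find_free_page()
        if slot is None:
            self._garbage_collect()
            slot = self._find_free_page()
        self.pages[slot[0]][slot[1]] = lba
        self.lba_map[lba] = slot


ftl = ToyFTL()
for lba in (0, 1, 2, 0, 1):   # the last two writes are overwrites
    ftl.write(lba)
print(ftl.lba_map, ftl.erase_counts)
```

In this run the first overwrite simply lands on the last free page, while the second one finds no free page and forces an erase of the block holding the stale copies. That erase is the expensive step that TRIM and careful write patterns try to avoid.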

Yes, this is all transparent to the OS. However, in addition to TRIM (the OS notifying the drive that certain blocks are now free), there are other things the OS and applications could do to minimize how frequently pages have to be erased.

Over at IT@Anandtech they did some testing with a bunch of Intel SLC drives in various server and database benchmarks in different RAID configurations and compared them to more traditional SAS setups. Link (http://it.anandtech.com/IT/showdoc.aspx?i=3532&p=1)

That write-up looks like it's more a test of the controllers than of the drives. Their results are really messed up, in many ways. Just the fact that they can't get above 800 MB/sec sequential reads with 8 X25-E drives, using an x8 RAID controller, should be a huge red flag. That alone brings all the rest of their results into question.
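
To show why 800 MB/sec looks low, here is a rough ceiling calculation. The inputs are assumptions rather than figures from the article: roughly 250 MB/sec sequential read per X25-E (the rated figure as I recall it) and 250 MB/sec per PCIe lane, which corresponds to first-generation PCIe. The function name is made up for illustration.

```python
def sequential_read_ceiling(n_drives, per_drive_mb_s, pcie_lanes, per_lane_mb_s):
    """Best-case sequential read rate: the lower of drive and bus limits."""
    drive_limit = n_drives * per_drive_mb_s
    bus_limit = pcie_lanes * per_lane_mb_s
    return min(drive_limit, bus_limit)


# Assumed inputs: 8 drives at 250 MB/sec each, x8 slot at 250 MB/sec per lane.
ceiling = sequential_read_ceiling(8, 250, 8, 250)
measured = 800  # MB/sec, the figure from the write-up
fraction = measured / ceiling
print(f"ceiling {ceiling} MB/sec, measured is {fraction:.0%} of that")
```

With these assumed inputs, both the drives and the slot top out around 2000 MB/sec, so the measured number is well under half of what the hardware should allow on paper. That points at the controller or the test setup rather than the drives.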