
RESULTS: Short-Stroke Single 640GB WD Black

I can also echo what tuskenraider says. At my last job I was in charge of creating drive images for new systems and then sticking those images onto a dozen or so new machines at a whack.

After performing a fresh install of Windows XP, there is significant file fragmentation. This must be a side effect of how Windows installs itself, and isn't really relevant to this discussion. What is relevant, though, is that after a fresh install files have been placed in multiple "blocks". You'll get lots of files together, then a lot of free space, then lots of files together, then lots of free space, and so on. For a fresh Windows install, these blocks are scattered in the first quarter of the drive. After a good defragging both issues are solved, but after installing more software the drive will show more "block" allocations.

I don't know why the NTFS driver would do this. I know it is supposed to leave some room after files to allow for expansion without fragmentation, but what I'm talking about is hundreds of megabytes to gigabytes of free space separating densely-packed blocks of files. Perhaps it tries to allocate groups of files created at nearly the same time close to each other, on the assumption that they're part of one program? It could then leave a buffer zone between allocation groups so that if a program grows (from that latest WoW patch, for instance), new files can be allocated near the others from the same program.

That said, I agree that short-stroking is (nearly) useless if you have to simultaneously access data in two partitions. The drive will need to seek back and forth regardless of whether the data sits in two partitions or was simply allocated far apart within a single partition. Short stroking still has benefits when accesses to the other partition are infrequent or not simultaneous, though.

JigPu
 
If the allocation is that bad by default, how would it work on say a 3 platter drive? Would it do things like allocate the data unevenly to the platters (favoring one over the other)? Would short stroking then use the fastest part of each platter or could you end up with a situation where the partition you've created is made entirely on 1 platter (out of the 3) thus hurting performance?
That's not how drives allocate their sectors. Drives don't first fill one "head" and then the next -- they fill one "cylinder" and then the next. A "cylinder" is all the sectors a specific distance from the edge of the drive. As cylinders fill up, the heads need to move across the platter to slower and slower cylinders.

Though CHS notation is obsolete nowadays (read about it here if unfamiliar), I find it gives a good visual representation of how the sectors are allocated. Assume your drive has 6 heads (3 two-sided platters) and can fit 1000 sectors on a track.

Block 1 = Cylinder 1, Head 1, Sector 1
Block 2 = Cylinder 1, Head 1, Sector 2
Block 1000 = Cylinder 1, Head 1, Sector 1000
Block 1001 = Cylinder 1, Head 2, Sector 1
Block 1002 = Cylinder 1, Head 2, Sector 2
Block 6000 = Cylinder 1, Head 6, Sector 1000
Block 6001 = Cylinder 2, Head 1, Sector 1
Block 6002 = Cylinder 2, Head 1, Sector 2

The drive first writes "around" a track until it loops back on itself. It then switches heads and again writes "around" a track. Only after writing with every head does it move its seek arm to a different cylinder (at which point the process repeats).
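
If you want to play with that mapping, here's a rough Python sketch of the same idea. It assumes the simplified fixed geometry above (6 heads, 1000 sectors per track); real drives use zoned recording with more sectors on the outer tracks, so treat it as an illustration only.

Code:
# Toy LBA -> CHS mapping for the simplified geometry above
# (6 heads, 1000 sectors per track, numbering starting at 1).
HEADS = 6
SECTORS_PER_TRACK = 1000

def lba_to_chs(block):
    n = block - 1
    sector = n % SECTORS_PER_TRACK + 1                # fill one track first...
    head = (n // SECTORS_PER_TRACK) % HEADS + 1       # ...then switch heads...
    cylinder = n // (SECTORS_PER_TRACK * HEADS) + 1   # ...then move the seek arm
    return cylinder, head, sector

for b in (1, 2, 1000, 1001, 6000, 6001):
    print(b, lba_to_chs(b))
# Block 6001 -> (2, 1, 1): the arm only moves once every head has written a full track.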

JigPu
 
If it's such a useful and simple concept, why isn't M$ doing it?
I just spent 20 min. writing up some new findings with Win7 and PerfectDisk 10, then hit the wrong button... deleted. Just installed PD 10 today on a fresh Win7 install. It showed all the data at the beginning of the drive with just a file analysis. Problem is, that was on an SSD, which would be bad if that really reflected the physical layout. Looks like file placement has changed with Win7... or PD 10 is working differently. I'm gonna try an install on my wife's PC tomorrow with an HDD and see what happens.

If the allocation is that bad by default, how would it work on say a 3 platter drive? Would it do things like allocate the data unevenly to the platters (favoring one over the other)? Would short stroking then use the fastest part of each platter or could you end up with a situation where the partition you've created is made entirely on 1 platter (out of the 3) thus hurting performance?
Platter quantity doesn't affect data placement, but the more platters, the less data each platter holds of a given file. The location on each platter will be the same since the heads all move together.
 
If the allocation is that bad by default, how would it work on say a 3 platter drive? Would it do things like allocate the data unevenly to the platters (favoring one over the other)? Would short stroking then use the fastest part of each platter or could you end up with a situation where the partition you've created is made entirely on 1 platter (out of the 3) thus hurting performance?
That is a very interesting point. I wonder how they map the space.

For a fresh Windows install, these blocks are scattered in the first quarter of the drive
That would make sense. The performance in the first quarter should be nearly identical, and it leaves a lot of room for expansion.
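
To put a rough number on "nearly identical", here's a back-of-the-envelope Python estimate. It assumes zoned recording where sequential speed scales with track radius and an inner radius around 40% of the outer one; both figures are assumptions for illustration, not WD specs.

Code:
# Rough estimate of sequential speed vs. how much of the drive is filled,
# assuming throughput is proportional to track radius and data fills from
# the outer edge inward. The 0.4 inner/outer radius ratio is a guess.
R_OUTER = 1.0
R_INNER = 0.4

def radius_at_fraction(filled):
    # capacity filled from the outside in is proportional to annulus area
    return (R_OUTER**2 - filled * (R_OUTER**2 - R_INNER**2)) ** 0.5

for filled in (0.25, 0.50, 1.00):
    speed = radius_at_fraction(filled) / R_OUTER
    print(f"{filled:.0%} of capacity used: ~{speed:.0%} of outer-track speed")
# Roughly 89% at the quarter mark, 76% at half, 40% at the inner edge.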

They want to leave room for "future grouping" (directory expansion), too. That allows them to put files in the same directory together (on the assumption that they will be accessed together).

For example, if you create 2 folders of 5x1MB files each, it may leave 5MB of space after each directory. Then, if you decide to add another 5 files to the first directory, they can be put right after the first 5 files in that directory.
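
Just to make the idea concrete, here's a toy Python model of that kind of policy. To be clear, this is an illustration of the speculation above, not the documented NTFS allocator.

Code:
# Toy "leave slack after each directory" allocator (pure speculation, for illustration).
def allocate(directories, slack_mb=5):
    """directories: list of (name, [file sizes in MB]); returns (name, start_mb, size_mb)."""
    layout = []
    cursor = 0  # next free offset on an idealized disk, in MB
    for name, files in directories:
        for size in files:
            layout.append((name, cursor, size))
            cursor += size
        cursor += slack_mb  # reserve room so later additions to this directory stay close by
    return layout

# Two folders of five 1MB files each, as in the example:
for name, start, size in allocate([("dirA", [1] * 5), ("dirB", [1] * 5)]):
    print(f"{name}: {start}-{start + size} MB")
# dirA occupies 0-5 MB, slack covers 5-10 MB, dirB starts at 10 MB.
# Five more files added to dirA later land in the 5-10 MB gap, right next to it.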

The gain from keeping files and directories contiguous far outweighs the difference in where on the drive they are stored.

As for how much space to leave, it's probably a heuristic (statistical) estimation. There's no way to know for sure beforehand. They probably did some study to determine the optimal amount of scattering.

It is clear that packing things more tightly is not necessarily better, and that it's a far more complicated problem than that.

By short-stroking, we are essentially saying "I know better than you (Windows) do, so I will force you to pack all the files tightly together". Performance will likely be lower, unless, of course, you really do know better than those file system driver programmers.

I wouldn't trust defragging programs that shift everything to the front and leave as little free space as possible. That is very easy to do and sounds good intuitively, but probably isn't in reality, just like those "memory cleaners".

IMHO, the driver probably knows more than we do. So I would just give it as much space as possible, and let it do its job.
 
That's not how drives allocate their sectors. Drives don't first fill one "head" and then the next -- they fill one "cylinder" and then the next. A "cylinder" is all the sectors a specific distance from the edge of the drive. As cylinders fill up, the heads need to move across the platter to slower and slower cylinders.

Though CHS notation is obsolete nowadays (read about it here if unfamiliar), I find it gives a good visual representation of how the sectors are allocated. Assume your drive has 6 heads (3 two-sided platters) and can fit 1000 sectors on a track.

Block 1 = Cylinder 1, Head 1, Sector 1
Block 2 = Cylinder 1, Head 1, Sector 2
Block 1000 = Cylinder 1, Head 1, Sector 1000
Block 1001 = Cylinder 1, Head 2, Sector 1
Block 1002 = Cylinder 1, Head 2, Sector 2
Block 6000 = Cylinder 1, Head 6, Sector 1000
Block 6001 = Cylinder 2, Head 1, Sector 1
Block 6002 = Cylinder 2, Head 1, Sector 2

The drive first writes "around" a track until it loops back on itself. It then switches heads and again writes "around" a track. Only after writing with every head does it move its seek arm to a different cylinder (at which point the process repeats).

JigPu

Ahh, OK. That clears up a lot for me. I had always thought the heads worked independently and simultaneously (like how whole drives work in a RAID 0 array), but it seems they work in tandem and sequentially. That actually explains a lot of questions I had about other hard drive issues too. Thanks :thup:
 
The idea behind short stroking is to gain speed (lower latency + higher MB/s).

Just wondering why the manufacturers can't place a second actuator arm with another set of heads that moves independently from the first one.
That would certainly improve access times and transfer speed, especially when dealing with multiple requests or heavily fragmented files.
It seems doable to me ... :-/

And on a similar note, maybe they can do some sort of internal RAID 0, reading from all the platters at the same time.
aaaah dreams ...
 
The idea behind short stroking is to gain speed (lower latency + higher MB/s).

Just wondering why the manufacturers can't place a second actuator arm with another set of heads that moves independently from the first one.
That would certainly improve access times and transfer speed, especially when dealing with multiple requests or heavily fragmented files.
It seems doable to me ... :-/

And on a similar note, maybe they can do some sort of internal RAID 0, reading from all the platters at the same time.
aaaah dreams ...

You are not alone, my friend :beer:

http://www.ocforums.com/showthread.php?t=603062
 
I see, complicated but already done :)
Still, they could do it again. There are people who would pay the extra premium for the added performance, me included!
 
They can probably make a drive 2x the size with 2 drives inside in RAID.

But what's wrong with good old external RAID? (since you said you are willing to pay)
 
Sorry to interrupt the arguing here, but I do find cylinder-limiting very interesting (and one day useful when I don't use my 640GB as storage). I agree it will result in better performance if you are willing to use only 200GB of a 640GB drive. Thanks OP!

To the defrag argument, I think you are confusing yourselves by mixing logical address and physical address. And as far as SSD goes, there is no static physical address - the SSD controller (the chip on the SSD) does dynamic re-mapping for wear-leveling.

Also just a note: most filesystems (including NTFS) will try to put a single file in contiguous clusters (logical addresses), but they don't handle streams of files. In NTFS's case, it starts writing at whatever the MFT says is the next available cluster.
 
I agree it will result in better performance if you are willing to use only 200GB of a 640GB drive. Thanks OP!
And of course, you can do that without short-stroking.

To the defrag argument, I think you are confusing yourselves by mixing logical address and physical address. And as far as SSD goes, there is no static physical address - the SSD controller (the chip on the SSD) does dynamic re-mapping for wear-leveling.
Mechanical drives map the fastest (outer) region to the low logical addresses.

Also just a note, most FS (including NTFS) will try to put a single file in contiguous logical clusters, but not streams of files.
Sources?
 
And of course, you can do that without short-stroking.


Mechanical drives map the fastest (outer) region to the low logical addresses.


Sources?

1. SS (using HPA or DCO) beats partitioning because there is less info on the MBR.

2. Yes and no. Maybe YOU can provide a source for your claims too. For some logical addresses it is fixed (sectors 0 - 64), but for others it is up to the manufacturer and the FS. Also note that the physical address may map to non-contiguous CHS addresses.

3. Go read up on NTFS and show me a feature that tries to keep a bunch of files at contiguous logical addresses. Do you even know what the FS does?

BTW, if you disagree with the OP, just leave the thread. The OP's proof of concept is the key here. If you don't believe this works or have other ways to achieve the same effect, start your own thread. Geez.

Again thanks OP for sharing your results. :thup:
 
1. SS (using HPA or DCO) beats partitioning because there is less info on the MBR.
The second half of the thread is arguing whether short stroking helps at all (either by partitioning or any other methods), or would it be better to just let the filesystem decide.

2. Yes and no. Maybe YOU can provide a source for your claims too. For some logical addresses it is fixed (sectors 0 - 64), but for others it is up to the manufacturer and the FS. Also note that the physical address may map to non-contiguous CHS addresses.
Any benchmark will tell you this.
http://images.google.ca/images?gbv=2&hl=en&sa=1&q=hdtune&btnG=Search+images&aq=f&oq=&start=0
See how all hard drives are faster towards the beginning?
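
If anyone wants to check this on their own hardware rather than trusting screenshots, here's a rough HD Tune-style sketch in Python. It assumes a raw device path (\\.\PhysicalDrive1 here, /dev/sdb on Linux), admin rights, and an idle drive; OS caching can still skew the numbers, so treat it as a ballpark test only.

Code:
import time

DEVICE = r"\\.\PhysicalDrive1"   # assumption: change to your non-system drive (/dev/sdb on Linux)
DISK_BYTES = 640 * 10**9         # assumption: 640GB drive, use your real capacity
CHUNK = 1024 * 1024              # 1 MiB per read
SAMPLE = 64 * 1024 * 1024        # read 64 MiB at each test point

def read_speed(offset_bytes):
    # unbuffered, sector-aligned reads straight off the device
    with open(DEVICE, "rb", buffering=0) as dev:
        dev.seek(offset_bytes)
        start = time.perf_counter()
        done = 0
        while done < SAMPLE:
            data = dev.read(CHUNK)
            if not data:
                break
            done += len(data)
        elapsed = time.perf_counter() - start
    return done / elapsed / (1024 * 1024)   # MiB/s

for fraction in (0.0, 0.25, 0.5, 0.75, 0.95):
    offset = (int(DISK_BYTES * fraction) // 512) * 512   # keep offsets sector-aligned
    print(f"{fraction:>5.0%} into the drive: {read_speed(offset):6.1f} MiB/s")
# On a mechanical drive the low-offset (outer-track) numbers should be clearly higher.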

3. Go read up on NTFS and show me a feature that tries to keep a bunch of files at contiguous logical addresses. Do you even know what the FS does?
No, I did no reverse engineering on NTFS and I wasn't on the NTFS driver team.

My "claim" was purely speculative (trying to fit a theory around JigPu's observations). I'm sorry if I didn't make that clear.

Was your statement speculative, too?

BTW, if you disagree with the OP, just leave the thread. The OP's proof of concept is the key here. If you don't believe this works or have other ways to achieve the same effect, start your own thread. Geez.
I thought results are meant to be criticized and questioned. That's how the academic community has been evolving for all these years.

I claim the earth is flat. If you don't agree, just leave the thread :).
 
True, but there are some folks who think that just because you make a small partition and RAID 0 it, then put the rest of your stuff on another partition on the rest of the drive, it will perform better. It'll bench better, but perform worse in real use.

FYI, what you are describing here is not short stroking. As the OP correctly describes, short stroking requires you to either leave the rest of the drive blank or use it for offline backups, i.e. put something on it, then remove the partition from your OS completely. I have a single 150GB partition on my 640 Black, and the rest of the drive is unformatted. The performance increase over having the rest of the drive in use is indeed notable.

You can also use the rest of the drive for Linux; since Windows will not access or even see the Linux partitions, it has the same effect.

Or you can store stuff on the rest of the drive, then remove that partition from Windows via the Disk Management snap-in in Administrative Tools. If the partition has no drive letter or mount point, Windows will not access it.

Just making 2 partitions as you describe, i.e. one 200GB partition for the OS and using the rest for storage, is not short stroking, it's just plain basic partitioning. Your OS needs to not be moving the disk heads over the rest of the drive, at all, for it to be properly short stroked :)

I still haven't added a second 640 for a short-stroked RAID 0 setup yet, but I made one on my boss's computer: one 200GB partition on the array, with the rest blank. The performance and access times are fantastic, and while ~1TB unformatted and unused may seem like a waste, they are only 80 bucks a pop. The price of 2 640 Blacks, even if only yielding 200GB of usable space, is still far cheaper than SSDs or even VelociRaptors, and frankly short stroking WD 640s in RAID 0 this way is the closest you can get to SSD performance on regular HDDs, and it's far, far cheaper per usable GB even with less than 1/4 of their total capacity used.

Other stuff you can do with unused space:
install various linux distros to boot into
store backup images of your os partition
back up important files from your other storage drives
put pretty much anything on there, that just needs to be stored, not regularly accessed.

So you can actually still get the performance of short stroking and make use of that unused space. As long as your OS cannot access the data, and the heads never need to move over the rest of the drive, you are doing it right. I just left the rest of my drive blank, but if I had 2 in RAID 0 I would definitely put that extra terabyte to use as offline storage. You can fit a lot of old TV series into 1TB, and when the need to rewatch something tickles your fancy, just mount the partition, copy it over to another drive, and dismount it again. But with 1-2TB storage drives being so cheap, the optimal HDD config not using SSDs is 2 640 Blacks short stroked in RAID 0, and 2 1-2TB storage drives for everything else.
I like the WD Green 32MB cache 1-1.5TB drives for storage. They're cool, quiet, have good warranties, and last a long time. You don't need 7200 RPM all the time for storage drives anyway.
 
Fun thread.

All I can say is that since the earlier thread (Here), I have been playing around with this short-stroke idea and frankly, I like it. For roughly $150 I have equivalent performance to a single SSD and much more space. I use a single partition when building the array, and I feel I'm getting the most benefit from this setup since I have a series of external drives I back up to.

I haven't really reported much due to my failure to test a single 320 with the whole drive as a single partition. I can say, though, that a single 150GB Raptor vs. 3 short-stroked 320s is a night and day difference (82MB/s vs. 236MB/s average read). The cool thing is, if I ever need a little extra space, all I have to do is rebuild the array and give myself a little more. I have nLited/vLited builds (quick) of all the OSes I work with, so reinstalling isn't a hassle. In my gaming machine I try to only have three or four games installed at one time, so I don't need that much space.
 
You may have similar read/write speeds with a RAIDed HDD setup. However, access time is another story. I've had a RAIDed Raptor setup in the past, and my single Vertex is vastly superior to them in terms of my computing experience.
 
All I can say is that since the earlier thread (Here), I have been playing around with this short-stroke idea and frankly, I like it. For roughly $150 I have equivalent performance to a single SSD and much more space. I use a single partition when building the array, and I feel I'm getting the most benefit from this setup since I have a series of external drives I back up to.
Not quite. Look at the first result in this thread, where he short-stroked a 640GB drive down to 15GB: the access time is still almost 70 times slower than an SSD's. That is not exactly equivalent performance.
 