
Why do you lose space when formatting a HDD?


sweefu

Member
Joined
Oct 15, 2009
Location
Canberra, Australia
Hey,
I formatted a new 1TB hard drive today, and I'm wondering why you lose so much space. It's a Seagate and I lost around 70GB just from formatting it.

Why is this?
Thanks.
 

Flurp

Member
Joined
Oct 5, 2009
Location
Washington
Short answer: math.

Hard drives are sold as x GB and the computer sees it as x Gb (could have that backwards :) ). 8 bits to a byte, so your 1000GB HDD shows up as a little less.

Wow, that really doesn't make sense on the proofread... Screw it, Google is your friend.

http://www.google.ca/search?q=why+d...s=org.mozilla:en-US:official&client=firefox-a

haha nice one :)

I would have said that...

HDD manufacturers (i.e. the ones who put 100GB/250GB/500GB/1TB/etc. on the drive) measure 1GB as 1,000,000,000 bytes, whereas your computer measures 1GB as 1,073,741,824 bytes (1024^3). So with a 1TB drive, your computer should see roughly 931GB prior to formatting. That being said, once formatted it should be slightly less.
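Here is a rough Python sketch of that math, just for illustration (it assumes a drive labelled exactly 1TB, i.e. 10^12 bytes; real drives vary slightly by model):

[CODE]
# Decimal "manufacturer" terabyte vs. the binary gigabyte Windows reports in.
labelled_bytes = 10**12          # 1 TB as printed on the box: 1,000,000,000,000 bytes
bytes_per_gib  = 1024**3         # what Windows calls a "GB": 1,073,741,824 bytes

print(f"Windows reports roughly {labelled_bytes / bytes_per_gib:.0f} GB")   # ~931 GB, before any formatting overhead
[/CODE]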

Edit: check out this site :)
 

deadlysyn

Folding Team Content Editor, Who Dolk'd my stars S
Joined
Mar 31, 2005
Location
Stealing your megahurtz at night
It's the secretive HDD space gnomes. :p

It's actually because HDD manufacturers mark the drives in base 10 (10,000,000 or 1,000,000,000 and the like), while MS has the OS reading the hard drive space in base 2 (1024MB is 1GB, while HDD manufacturers use 1000MB is 1GB). You aren't actually losing any space; it's just the difference in the way that manufacturers and programmers measure hard drive space.
 

visbits

Member
Joined
Jan 20, 2009
Hard drives are measured by 1000 bytes vs 1024 bytes. Once you get to sizes that big, it becomes noticeable. Whoever's stupid idea it was to do it that way should be killed.
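A quick Python sketch (rough illustration only) of why the gap gets noticeable at bigger sizes: the 1000-vs-1024 difference compounds once for every prefix step.

[CODE]
# How far apart the decimal and binary definitions drift at each prefix.
for power, prefix in enumerate(["KB", "MB", "GB", "TB"], start=1):
    decimal = 1000 ** power          # manufacturer's definition
    binary  = 1024 ** power          # OS / binary definition
    gap_pct = (1 - decimal / binary) * 100
    print(f"1 {prefix}: the OS reports {decimal / binary:.3f} of the labelled size ({gap_pct:.1f}% smaller)")
[/CODE]

At the KB level the gap is about 2.3%, but by the TB level it has grown to roughly 9%, which is where the "missing" ~70GB on a 1TB drive comes from.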
 

madhatter256

Special Member
Joined
Jul 5, 2008
Location
CFL
Blame Western Digital. I remember reading somewhere that they started to label their drives that way. Part of the reason was marketing, the other half technical.

It's easier to say "1 terabyte drive" than "1,099,511,627,776 byte drive".

They started doing that back when a 100MB HDD cost over 4 grand.
 

jaymz9350

Member
Joined
May 13, 2006
Blame Western Digital. I remember reading somewhere that they started to label their drives that way. Part of the reason was marketing, the other half technical.

It's easier to say "1 terabyte drive" than "1,099,511,627,776 byte drive".

They started doing that back when a 100MB HDD cost over 4 grand.

Just to be nitpicky, the hard drives are correct and the OS is wrong.

http://en.wikipedia.org/wiki/Gigabyte

http://en.wikipedia.org/wiki/Gibibyte

It's funny, though, how out of hard drive makers, memory makers, and OS makers (well, at least Microsoft; not sure if OSX or Linux has it right or not), the only one who actually uses the terms correctly is the one who gets all the flak.
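For anyone curious, here's a small Python helper (purely illustrative; the describe function is just made up for this post) that puts the two vocabularies from those Wikipedia pages side by side:

[CODE]
# Same byte count, expressed in SI units (GB, what the drive label uses)
# and IEC binary units (GiB, what Windows displays but calls "GB").
def describe(nbytes: int) -> str:
    return f"{nbytes / 10**9:,.1f} GB (SI) = {nbytes / 2**30:,.1f} GiB (IEC)"

print(describe(1_000_000_000_000))   # a "1TB" drive: 1,000.0 GB (SI) = 931.3 GiB (IEC)
[/CODE]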
 

tinymouse2

Member
Joined
Jun 11, 2009
Location
Surrey, England
It's on Wikipedia... you'd better trust it!

+1 on Flurp's extension of the definition. Indeed, computers see it in base 2 (hence 0s and 1s) and manufacturers see it in base 10 (1,2,3,4,5,6,7,8,9,0).
 

jaymz9350

Member
Joined
May 13, 2006
It's on Wikipedia... you'd better trust it!

+1 on Flurp's extension of the definition. Indeed, computers see it in base 2 (hence 0s and 1s) and manufacturers see it in base 10 (1,2,3,4,5,6,7,8,9,0).

Wikipedia or not, it is the correct information.

And also, computers only see what they are programmed to. As I've read a little more, it seems that starting with OSX 10.6, Windows is all that's left that reports hard drive size in base 2.

I'm not trying to argue, as I use giga also; I'm just pointing out that everyone seems mad at hard drive makers for using the terms the correct way the rest of the world does, while Windows reports it in the "accepted" form used nowhere else (as best I can find) outside of its OS and memory manufacturers.

Side note: some Linux releases may also use giga, but I'm not 100% sure on that.
 

TheGreySpectre

Member
Joined
Sep 6, 2003
Gibibyte may be the correct term, but no one is going to use it.

Regardless of the fact that the IEC unit is the gibibyte, there is too much documentation for electronics (manuals, design documentation, and code comments) that uses gigabyte for it to change as anything but an official standard. The binary-specific words did not come into existence until 1998, a good 15-20 years after words like kilo and mega had come into common usage as 2^10 and 2^20. I would actually say the powers-of-two versions of giga/mega/etc. are more common than their base-10 versions. The prefixes were already defined as 10^__, though, so even though the base-2 variants make lots of sense and are predictable, they did not conform to the standard already established.

I don't think I have ever heard someone actually say "gibibyte" or seen it written, and I work with mebibytes, kibibytes, and gibibytes every day at the signal level.

It is very similar to the unit of the calorie. When your food says it has 170 calories, it really has 170 kilocalories, but because it is on food and is context-specific, the meaning is understood.

I also happen to think gibi/mebi/kibi are stupid words because they have no basis in etymology, unlike giga, mega, and kilo.

I am not trying to flame you; you are correct with your terminology. However, the technically correct terminology and the accepted use are fairly different.
 

benbaked

Folding/SETI/Rosetta Team Member
Joined
Oct 20, 2005
Location
WA
Just to be nitpicky, the hard drives are correct and the OS is wrong.

http://en.wikipedia.org/wiki/Gigabyte

http://en.wikipedia.org/wiki/Gibibyte

It's funny, though, how out of hard drive makers, memory makers, and OS makers (well, at least Microsoft; not sure if OSX or Linux has it right or not), the only one who actually uses the terms correctly is the one who gets all the flak.

Whatever. The IEC and Wikipedia nerds can cry all they want about people using the correct terms, but the fact is the terms kilobyte/megabyte/gigabyte/terabyte had been around for decades before the IEC came by a few years ago to try to clarify things. A kilobyte is 1024 bytes, and to me it will always be 1024 bytes; I don't care what the IEC or Wikipedia tries to say about it. Do they still explain to students how we get these numbers in the first place, that a byte is eight bits? Obviously the majority of the industry is old school like me and also rejects those new-age terms, because I never see them used outside of Wikipedia or some tech writeup on a random website.

And Pluto IS still a planet. :p

edit: TheGreySpectre beat me to it. :)
 

I.M.O.G.

Glorious Leader
Joined
Nov 12, 2002
Location
Rootstown, OH
If you use a term, typically the industry determines the authoritative meaning of that term.

If we're talking about correctness, I'd refer to the way manufacturers use the term as an authoritative answer. Anyone else is simply making up their own "standard".
 

TheGreySpectre

Member
Joined
Sep 6, 2003
It seems to me they would have had better luck changing the 10^__ names, as those are used less often; add a D or something to mark decimal. I would have been OK with kilodbytes, megadbytes, and gigadbytes, and adapting to them would have been easier, as I rarely need to reference a decimal number of bytes.

IMOG, the only problem with that is that the industry uses it both ways. Hard drive manufacturers use it to mean 10^__, but RAM comes in 2^__. Also, computers are not the only industry that deals with it; ASICs and FPGAs also use the terms for memory, and use them in the 2^__ format.
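A tiny Python sketch of that mixed usage (nominal capacities, just for illustration):

[CODE]
# RAM is sized in powers of two, while drive labels use powers of ten,
# even though both say "8 GB" on the sticker.
ram_8gb   = 8 * 2**30     # an "8 GB" DIMM: 8,589,934,592 bytes
drive_8gb = 8 * 10**9     # an "8 GB" drive label: 8,000,000,000 bytes

diff = ram_8gb - drive_8gb
print(f"Difference: {diff:,} bytes (about {diff / 2**20:.0f} MiB)")
[/CODE]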
 

dorkbert

Registered
Joined
Oct 28, 2009
Location
California, USA
The gist of it is that some marketing genius (yes, I am mocking them; Quantum started it IIRC) in the '90s decided their drive would look "bigger" than their competitors' if they started counting in base 10 (1k = 1000 bytes, etc.) instead of base 2 (1k = 1024 bytes). It didn't take long before their competitors wised up and followed suit. Soon thereafter they all got hit by a class action suit for misrepresenting their products (which was then followed by a notice on the packaging that they're counting in base 10, not base 2).
 

Mugsy323

New Member
Joined
Dec 11, 2012
The gist of it is that some marketing genius (yes, I am mocking them; Quantum started it IIRC) in the '90s decided their drive would look "bigger" than their competitors' if they started counting in base 10 (1k = 1000 bytes, etc.) instead of base 2 (1k = 1024 bytes).
(actually, you mean base-16, not 2).

I know this is a two-year-old thread, but I came across it researching this particular issue and there isn't much else out there on the subject, so it bears pointing out that the explanation given here is almost completely wrong.

While hard drive manufacturers DO use base 10 (instead of base 16) to claim their drives' capacity is larger than it actually is, that alone does not account for the amount of "wasted space". For example, an "80GB" drive after formatting has only 75.1GB of free space on it, and converting that 75.1 back does not get you to 80.

The problem is a combination of "overhead" and "block size". Some space is lost in formatting to create the "FAT" (File Allocation Table) that maps out where every file is stored on your drive. Don't get hung up on "FAT" vs other formats like "NTFS"; I'm using "FAT" in the general sense.

But "Block size" is the REAL culprit here. Imagine a cigar box, the bottom of which is covered with quarters lying flat. The quarters are the blocks where data is stored, the space between them where the box shows through is wasted space (a better metaphor might be postage stamps in a frying pan, though in reality, neither accurately depict how data is stored). Now, you CAN decrease the wasted space by using smaller coins like pennies or dimes (ie: smaller block size), but file-compatibility becomes a problem as most programs aren't built to tolerate nonstandard block sizes. You may reclaim a LOT of wasted space by formatting your drive using a smaller block size, but expect your computer to crash a lot as instability becomes an issue.

Why no one creates a more stable OS that supports smaller block sizes is a mystery, but it sure would help a lot of people recover a lot of wasted drive space.
 

I.M.O.G.

Glorious Leader
Joined
Nov 12, 2002
Location
Rootstown, OH
Thanks for the comment, Mugsy.

If you read more, you'll find base 2 and base 16 are both used, depending on what perspective you look at it from.

Ref: http://boards.straightdope.com/sdmb/showpost.php?p=3390900&postcount=12

All so far is true except that derivation of these numbers is from Base2 rather than Base16. Windows, like every other computer, uses Base2 for its math while early programmers, unlike any other humans, wrote code in Base 2 and used Base16 to represent those codes.

The reason for this is (my WAG) two-fold. Firstly, if you were a computer programmer way back you had to be familiar with binary, but binary is difficult to read at any length. (Is this 7-bit or 8-bit code? Damn!!) Hexadecimal condenses 4 digits of binary into 1 digit hex using 16 different characters, instead of just a string of 1's and 0's. Secondly, binary and hexadecimal notations align on every 4th power of two, where decimal aligns with binary on very few--if any--powers of two since it is based on powers of 10 (which is not a power of two).
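A quick Python illustration of the hex-vs-binary point in that quote (one hex digit stands in for exactly four binary digits):

[CODE]
value = 1024 ** 3          # one binary gigabyte, in bytes
print(bin(value))          # 0b1 followed by 30 zeros
print(hex(value))          # 0x40000000, far easier to read
print(len(bin(value)) - 2, "binary digits vs", len(hex(value)) - 2, "hex digits")   # 31 vs 8
[/CODE]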

I'm not sure about your block size statements and OS stability. "Block" is more of a '90s term, and I don't think it's used much outside of Linux currently. Do you have a reference where I can read more about that? Block size is sometimes used interchangeably with cluster size or sector size, and I've changed cluster size in the past with no impact on stability or performance. I'm mainly looking to understand what you are referencing here.
 