Since its debut, Windows Vista has taken nothing but flak from almost every demographic one could think of. Everyone from the casual user looking to browse the web and type up a few reports to the benchmark fanatic obsessed with squeezing out all the speed he or she can is likely to complain about Vista being bloated and slow. Windows 7, on the other hand, has been hailed as noticeably better performing, and supposedly as light as XP. And what about XP? How do they really stack up to one another? The examination of these questions follows.
Personally, I’m an avid benchmark junkie, so I could only look from this perspective. I’m unconcerned with how things “feel”, but rather how they score. Hard numbers are what matter to me. They might not matter to many, but they measure speed in its true essence, devoid of any subjectivity. Bearing this in mind, I selected six of the most popular benchmarks used by overclocking enthusiasts, each tending to have unique biases with regards to what part of the system they emphasize.
Editor’s Note: While the author is being modest, Gautam is a world renowned benchmarker, and is an authority on the subject of Windows benchmarks.
The Benchmarks
3DMark03 – Predominantly measures GPU performance
3DMark05 – Predominantly measures CPU and memory performance
3DMark06 – Measures both GPU and CPU/memory, and additionally tests multi-threaded performance
Aquamark3 – Almost exclusively measures CPU and memory performance, with an emphasis on the latter
SuperPi 1M – Measures single-threaded CPU performance and is slightly influenced by memory
wPrime 32M – Measures multi-threaded CPU performance with no influence from memory
Some might be wondering why 3DMark Vantage was omitted. The main reason is that it would be a bit boring. Each operating system appears to score nearly identically in 3DMark Vantage, and any variations are within the margin of error.
System Configuration
I used a setup that I would consider fairly typical for an overclocking enthusiast:
Intel Core i7 965 Extreme at 4 GHz
ASUS P6T Deluxe OC Palm Edition
6GB Corsair Dominator GT 2000C7 at 960MHz CAS 7
2x ATi Radeon HD 4890’s at stock frequency of 850/975
To be perfectly honest, the system configuration will likely have an impact on how the various operating systems compare with each other. Therefore, using one that is modern and high-performing is, in my eyes, the fairest way to compare them.
The Operating Systems
Windows Server 2008 x64
Windows Server 2008 x32
Windows 7 x64
Windows 7 x32
Windows Vista x64
Windows Vista x32
Windows XP x64
Windows XP x32
The operating systems are the usual suspects, all with the latest service packs installed. I added Windows Server 2008, as some people have supposed that it is faster than Vista, which it is based on, and I wished to put that theory to the test. Additionally, I tested both 32-bit and 64-bit variants of each operating system. How they handle the memory subsystem is important when it comes to benchmark performance, as we will see. Lastly, in order to make the tests fairest for the operating systems, I trimmed all eight of them using nLite and vLite. Consequently, I made the running services constant between all of them to rule any out as a factor. My vLite profile is as follows:

vlite profile
Only the important stuff remains, with all the fluff removed.
And nLite (for XP 32 and 64):

nlite profile
The Results

3DMark03 Results
Only one thing predominantly sticks out when viewing the results for 3DMark03—good ol’ XP doesn’t fare too well, while all the others are very close, with Windows 7 being slightly in the lead. Since 3DMark03 is heavily GPU-centric, this dead heat is not too much of a surprise. The benchmark depends mostly on GPU performance and is not heavily influenced by much on the system side, OS included. Still, it certainly shows XP’s obsolescence.

3DMark05 Results
Now is when things start getting interesting. 3DMark05 emphasizes CPU and memory performance, and consequently we can see the operating system having a very noticeable impact on performance. In fact, the only two operating systems that even perform similarly are Server 2008 and Vista. This does not come as much of a surprise, considering that the two are mostly the same under the hood, and are even more similar after I ensured that the running services and installed components were as close between them as possible. XP once again lags far behind the rest of the pack, but interestingly enough both 7 32 and 7 64 also score considerably lower than Vista and Server 08. 7 and XP being the worst performers certainly flies in the face of conventional beliefs. Another very interesting thing to note is that the 64-bit variants for 7, Vista and Server 08 all perform worse than their 32-bit counterparts. We must bear in mind that this benchmark uses under 1GB of memory, but for this quantity, the 64-bit OSes handle the memory subsystem a bit more slowly.

3DMark06 Results
The results in 3DMark06 are somewhat similar to those in 05; however, this time around Windows 7 pulls up far ahead, scoring almost evenly with Vista. Also, the hit going from 32-bit to 64-bit in Windows 7 is much smaller than it is going from 32-bit to 64-bit in Vista and Server 2008. XP is still decisively in last place, but the margin is a bit smaller this time around, thanks to XP scoring better in the CPU test portion of 3DMark06 than the newer OSes.

Aquamark3 Results
The results from Aquamark3 are quite similar to 06. Windows 7 once again makes a strong showing, and once again, 64-bit does not seem to hurt 7 very much, but takes a slight toll on Vista. Both versions of XP are far behind, but curiously enough XP 64 is considerably better than XP 32. Server 2008 is similar to Vista; however, it’s only fair to point out that run #3 for Server 08 32 was a bit of an outlier, what one would call an unlucky run. The first two runs had it performing on par with Vista.
One important thing to note about Aquamark3 in particular is that there is a heavy dependence on graphics drivers. These results only look this way on ATi GPU’s, like those used in this test. On nVidia GPU’s, XP is actually slightly ahead of the others. You’ll have to take my word on that since nVidia results aren’t included in this roundup, but curiously enough, running ATi in Windows 7 scores about equal to a comparable nVidia setup in XP.

SuperPi 1M Results
These results are very different indeed from those obtained in the 3D benchmarks, and are almost completely the opposite. XP 64 has a noticeable lead over all the others, and is also the most consistent. Interestingly enough, this is the only benchmark where Server 2008 appears to be considerably faster than Vista. However, just like in the 3D benchmarks, the 64-bit variants of Vista and Server 2008 are slower. In 7 it’s the complete opposite, with 7 64 noticeably outperforming 7 32, further supporting that the 64-bit version of 7 does indeed seem to be optimized in some way that 64-bit Vista is not.

wPrime Results
I’ll start out by saying that I tried to work out exactly why XP 64 scored so poorly, but I’m afraid I can’t offer any explanation, so it has to be taken at face value. Otherwise, XP 32 is still ahead of the newer OSes, but by a smaller amount than it is in SuperPi. All OSes in fact are very close to each other, barring XP64. Windows 7 though once again shows some weakness on the 2D side of things, but 32-bit and 64-bit are in a dead heat.
Conclusion
So, who’s the winner? Well, if you’ve scrolled to skip past the graphs, every single benchmark has a unique operating system that does best. Overall, the two most solid performers are Server 2008 32 and Vista 32. Both of these are at the top for the 3D benchmarks, and fare okay in the 2D benchmarks as well. Deserving of flak in everyday usage or not, in benchmarks, Vista 32 performs very well. Contrary to popular belief, XP and 7’s supposed “lightness” does not really translate in benchmarks. In fact, the more CPU-centric a benchmark is, the worse 7 tends to do. The only thing XP remains good for are 2D benchmarks, falling far behind the pack in all things 3D. Once again, this article only sets out to show which the fastest operating systems are by the numbers. The fastest choice might not necessarily be the best one for you.
Questions and discussion of this article are on Overclockers Forums, join in!
- Gautam
Tags: benchmarks, comparison, Windows 7, Windows Server, Windows Vista, Windows XP





12-14-09 01:49 PM
12-14-09 01:50 PM
If you like it, and have a Digg account, hit that Digg button at this link!
http://digg.com/microsoft/Windows_Sh...n_6_Benchmarks
12-14-09 01:52 PM
12-14-09 03:46 PM
12-14-09 04:00 PM
I would love to have seen a stock vs. stock OS comparison as that is what most people (non serious benchers) run.
12-14-09 04:05 PM
12-14-09 04:52 PM
12-14-09 04:54 PM
Firstly: how do they perform if you strip them all down to the bare minimum. I understand why you chose to keep them all on an even keel for this article but I think there would be some value in doing the tests again with each OS stripped down as much as possible.
Secondly: How do Vista and Vista SP1 compare?
Thirdly: Would it be worth seeing how XP/XPSP1/XPSP2/XPSP3 compare?
A very interesting article, which also raises some questions :-)
12-14-09 05:08 PM
Glad someone as trusting as you took the time to do all of these.
12-14-09 05:35 PM
12-14-09 05:36 PM
12-14-09 05:43 PM
However I believe he wanted a COMPLETELY stripped down OS as opposed to the version he has. I will hope he clarifies soon.
Personally, I would rather have seen this run at all stock, since that's how 99% of people here run their OS's.
12-14-09 05:44 PM
12-14-09 05:55 PM
Taking out all the extra garbage that runs on the OS, and actually evaluating each release on an even scale, performed by someone who knows what they are doing - that hasn't been done anywhere else. (Except maybe on xtremesystems, assuming he might have released his stuff over there too)
12-14-09 05:58 PM
12-14-09 06:00 PM
12-14-09 06:01 PM
And that was my other thought (stock tests here we can TRUST!).
These guys have A LOT to offer and it's wonderful to see them even more involved!!! W00t!
12-14-09 06:40 PM
If I went any lower it would be at the point where they start acting goofy. Let me put it this way, I can already fit all my OSes on CD's (not DVD's) and installation takes about 8 minutes tops.
Not only that, IMHO "stripping" is overrated anyways. I don't think disabling services or removing components changes how an OS scores much anyways. I made them all even, though, just for the sake of being thorough. Vista without SP1 is fraught with issues, especially with a Crossfire setup like I used, so it wouldn't quite be right to try without it. And that along with your 3rd question can also be answered like this. 3DMark06 takes almost 9 minutes to run all by itself. Then multiply that by 3 and you're easily surpassing 30 minutes per operating system per benchmark. I think you know where I'm going with this.
2008 R2 is based on 7. I'm thinking of giving it a whirl as well, but we can probably expect the difference between 2008 R2 and 7 to be similar to that between 2008 and Vista (in other words, negligible)
12-14-09 06:59 PM
12-14-09 07:37 PM
How does it do with 1156 Intel CPUs? Some benchmarks have proven dual channel to be better...
I would love to see these same tests on a core i7 860 @ 4GHz.... Then those compared to the 965 as well...
12-14-09 08:00 PM
Matt
12-14-09 08:05 PM
12-14-09 08:09 PM
12-14-09 08:25 PM
Admittedly, it's a lot of work for relatively little info.
12-14-09 08:37 PM
Just looking at the XP64 wPrime score, did you try with and without graphics drivers, to see if it changed?
It looks like the kind of drop in score caused by not running drivers; maybe an issue with that OS and its drivers?
12-14-09 08:53 PM
I just had one comment. Instead of putting the results of all three runs and an average, I think it might have been more useful to only show the average with some error bars.
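As a rough illustration of that suggestion, the sketch below reduces three runs to a mean and a sample standard deviation, which is what an error bar would typically be drawn from. The scores are made-up placeholders, not results from the article, and `summarize` is a hypothetical helper name.

```python
from statistics import mean, stdev

def summarize(runs):
    """Return (mean, sample standard deviation) for one OS/benchmark pair."""
    return mean(runs), stdev(runs)

# Hypothetical scores for three runs; NOT data from the article.
runs = [30100, 30250, 30175]
avg, spread = summarize(runs)

# The mean would be the bar height, the standard deviation the error bar.
print(f"{avg:.0f} +/- {spread:.0f}")
```

With only three runs per configuration the standard deviation is itself a coarse estimate, so the error bars should be read as a rough guide rather than a precise interval.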
12-14-09 09:03 PM
We both took the same considerations in mind however, which is interesting.
12-14-09 09:08 PM
12-14-09 09:16 PM
I'll take another look. All the OSes are still in their same state.
12-14-09 10:30 PM
Anyway, the theoretical implications behind this are that the optimizations in each OS are weighted differently, leading to these results, correct? And thus Win7 + Vista are statistically indistinguishable performance-wise given these benchmarks.
Finally, I don't really have a solution for this given your dataset, but I dislike graphs not starting from zero, since that skews the scale of the real differences. Dunno what really to do here, though.
12-15-09 02:04 AM
12-15-09 02:25 AM
Probably plenty don't like the graphs not starting from zero, but it's conducive to what I'm trying to present. This is not supposed to be an academic paper, it's supposed to tell people clearly what OS scores the best. The real difference is tiny, yes. For example even between the averages of 7 64 and Vista 32, it's just 2.35%, which in real terms is tiny. However, on the 3DMark05 hall of fame, it's actually larger than the difference between 1st place and 5th place, which is a very big deal for anyone competitive.
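To make the trade-off concrete, here is a minimal sketch of how much a clipped axis magnifies a gap of the size quoted above. The two scores and the axis minimum of 98 are assumptions chosen only to produce a 2.35% gap; they are not read from the article's graphs, and `apparent_ratio` is a hypothetical helper.

```python
def apparent_ratio(low, high, axis_min):
    """Ratio of the drawn bar lengths when the axis starts at axis_min."""
    return (high - axis_min) / (low - axis_min)

# Hypothetical scores that are 2.35% apart; NOT data from the article.
low = 100.0
high = 102.35

# Axis starting at zero: the bars differ by the true ratio (about 1.02).
print(apparent_ratio(low, high, axis_min=0.0))

# Axis clipped to start at 98: the taller bar is drawn more than twice as long.
print(apparent_ratio(low, high, axis_min=98.0))
```

The clipped axis makes the ordering easy to see, which is the point for competitive benchers, at the cost of overstating the size of the gap to anyone reading bar lengths literally.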
12-15-09 02:26 AM
12-15-09 02:31 AM
12-15-09 02:53 AM
But it could be a one off issue with that bench/OS, or not
12-15-09 02:55 AM
12-15-09 03:05 AM
A little O.T...
I like the new look of the main page, looks good
12-15-09 03:13 AM
Thank you. iNet and our mods played a large part in making it what it is. Dogsoldier also deserves an immense amount of credit - he took part and won the logo design contest we held on the forums, and the entire design was centered on his logo. I owe him an article still, highlighting his artistic portfolio.
12-15-09 03:23 AM
12-15-09 02:44 PM
http://www.madshrimps.be/vbulletin/f...35/#post250569
Mainly, his comments are a clarification of what the hardware Gautam chose means to the benchmark results. It's good insight, along with most things you'll find John posting at madshrimps.
You can find their articles here:
http://madshrimps.com/?action=articles
As well as their news here:
http://www.madshrimps.be/?action=news
12-15-09 05:58 PM
SP3 for XP32 is a performance killer deluxe. Likewise, I found Vista without SP1 to perform noticeably worse than with SP1 (never tried SP2). Microsoft did put a lot of work into SP1 (and probably the same with SP2) to increase Vista's performance, because Vista lagged behind XP in performance at that time.
Can Gautam clarify which service packs were used?
*****I got flashbacks to Nvidias frame jump cheating after M$ released SP1 for Vista
12-15-09 06:37 PM
12-15-09 07:11 PM
Looking at the 3DMark03 scores, we see a spread from about 106k to 111k, or a relative difference of 4.7%. The standard deviations for the measurements look pretty small (but I can't calculate without the raw numbers), so the differences appear to be statistically significant, even for the small sample sizes. But the differences aren't as huge as they appear on the graph, so it's worth noting.
Similar story on 3dmark05. About a 3.6% relative spread.
On 3Dmark06, the relative difference from high to low is a paltry 1.5%, with larger measurement variation than the other benchmarks. While there is an observable trend, it's statistically not much greater than the measurement variation. This is a case where the graph may not be worthwhile. Vista32 vs 7-32 are within 0.1% of one another, well within measurement error from what I can tell.
Calling XP "decisively in last place" in this case, with a clipped graph range (focused upon 3% of the data range), is scientifically misleading in my opinion. If I were reviewing this paper for a journal, I would absolutely require the authors to revise these strong claims for this graph. The data simply don't bear them out.
Aquamark3: about 4%.
SuperPi 1M is similar: all tests within 1%.
wPrime 32M without the outlier: 3.5%
I find the overall conclusion should be different: for most of these operating systems, the performance is within 5% of one another in benchmarks. The differences may or may not be statistically significant. Given the small difference in the benchmarks and the unknown statistical significance of those differences, other metrics should be used for the basis of choosing the operating system. It is interesting, however, to note that Vista and Win7 can be trimmed to perform on par with XP.
I'm not trying to be harsh or unduly critical; this is just my view as a scientist looking in. The benchmarking techniques are great, but the data analysis and presentation could use improvement. The relative differences give a clear picture and should be included, rather than looking at clipped graphs. I know it's common in benchmarking articles and reviews, but well, there's not a high standard of rigor out there. We can be the best.
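The relative spreads quoted in this post can be reproduced with one line of arithmetic. The sketch below assumes the spread is taken relative to the lowest score, which matches the 4.7% figure given for 3DMark03; the endpoints are the approximate values read off the graph in the post, and `relative_spread` is a hypothetical helper name.

```python
def relative_spread(low, high):
    """Spread between best and worst score, as a percentage of the lowest."""
    return (high - low) / low * 100.0

# Approximate 3DMark03 endpoints as quoted in the post (about 106k to 111k).
print(f"{relative_spread(106_000, 111_000):.1f}%")  # prints 4.7%
```

Dividing by the highest score instead would give a slightly smaller figure, so whichever convention is used should be stated alongside the number.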
12-15-09 07:17 PM
12-15-09 07:19 PM
I initially took similar concerns with the presentation of data. However, most specifically buried in this growing thread above is this rationalization from Gautam:
The graphs are meant to illustrate there's a consistent and reliable performance difference. These differences may not be statistically significant to the average user, but then the article, as well as Gautam's comments also make it clear that's not the intended audience. As a benchmarking piece, talking about the OS differences which mean the difference between 1st in the world and 5th in the world... If I were at the top of the benchmarking foodchain, these small differences begin to look significant.
From the sites I've seen that have picked up the article, the intended audience mostly interpreted it as it was intended. In contrast, I think I saw some posts on the macrumors forum where they may have been taken incorrectly.
12-15-09 07:25 PM
To validly claim 2.3% difference as significant, you need more samples. And there's no getting around the fact that the graphs, without some statement on the small actual relative differences, are misleading, particularly for those younger audience members who haven't had much experience in science, mathematics, statistics, etc.
The conclusion should be "little-to-no significant difference in benchmarks", not "XP definitively worst" etc. I'd hate to see people shell out $200 or $300 to upgrade their OS based upon magnified views of performance differences that may not even be significant.
Regarding 3Dmarkland, perhaps the real lesson is that the top 5 places are statistically equivalent.
12-15-09 07:26 PM
In other words - I am interested for my own sake in the information about which SPs were used here...
So for me this test will be "just one of those tests" - I know that I will have to install XP, Vista, 2008 and 7 and test all of them on my system to find which OS performs best, since a lot of the performance is driver-dependent. One of the most noticeable performance impacts in my system comes from the Areca 1680ix - 4GB raid controller (6x SSD's in raid0) - and this controller seems to favour XP.
I can also dig up articles that say that XP SP3 is (was?) a lot faster (claimed up to 10% faster) than Vista SP1 was in benchmarks...
12-15-09 07:28 PM
The thing I like about Gautam's article is the precision he attempted. Just not thrilled about the data presentation and the conclusions drawn from them.
12-15-09 07:50 PM
I could also see value in it being more complete looking at it from your perspective Paul - the same set of graphs based on a 0 scale would ensure it's more clear to any lay person. Perhaps that would make sense to present it after the conclusion - but I'm not motivated to work up the graphs myself. Perhaps Gautam is. For an astute reader however, all the information is sufficiently presented for one to draw the proper conclusion.
Maybe we set the bar high with expectations for readers to look at the results critically, but really I thought it was sufficiently clear. I can also see danger in setting the bar too low - putting the kid gloves on it is going to make a lot of prominent overclockers yawn when all they are really interested in is the hard data. It's the people at the forefront of the hobby which drive interest, and those are likely the people we should rightly cater to. As a site, we generally do very well at getting the people starting at a base level up to speed - I'd say that's our strong suit.
That wasn't very concise. Essentially, I think there's a balance and I think we're in the right place.
12-15-09 08:02 PM
However, I still think the conclusions are incorrect.
"The only thing XP remains good for are 2D benchmarks, falling far behind the pack in all things 3D. Once again, this article only sets out to show which the fastest operating systems are by the numbers. "
Incorrect conclusion.
All the operating systems perform within 5% of one another on all the benchmarks, and most often within 2%. This is a near statistical dead-heat. The real message is that (1) if properly tuned, Vista isn't much of a hit, and (2) upgrading to Win-7 is not much of a gain. The numbers tell me to stick with what you have, which is a completely different conclusion. A couple of percent is not "far behind." The numbers don't lie per se, but do need to be understood in proper context. In quantitative work, that context is called "statistical significance" or at least "relative differences."
Saying that article told which OS is fastest "by the numbers" implies that the numbers significantly supported that conclusion. They did not. At least not such a strong conclusion.
Ultimately, this is why I think the graphs are a problem: they tend to not just mislead the readers, but the writers. I've found that the workflow usually goes like this: (1) do the work. (2) plot the results (3) analyze the graphs to make sense of the data (4) write the article accordingly. A misleading plot in (2) throws the whole thing off. In fact, I find that clear, well-selected graphs are more important to the writers than the readers, as it makes or breaks the science.
If it takes a skewed plot to show a result, then there is probably no result. (Which in itself is a result, just not the one presented here.)
12-15-09 08:18 PM
I think he did a superb job and accounted for margin of error very well with the number of times he ran the benches. This isn't a dissertation or peer-reviewed journal article and doesn't pretend to be. It's an article written by an authority on benchmarking, directed at like-minded individuals and attempting to show the differences between operating systems. It accomplished that well.
The largest point that might be important to note regarding his results is the one I.M.O.G. linked to at Mad Shrimps. That may be worth exploring at some point.
For every day use, the layperson can take their pick of operating system. They may see lots of differences, but the largest of those will not be speed. If the article was aimed at that audience, graphing from zero would have made more sense. Indeed, the conclusion you have proposed (the difference is < 5% for any tested OS) would be valid. However, it's not the audience this was written for and I think even the layperson would be able to tell that. Actually, the lay person would probably be sitting there thinking "WTF is Wprime?"
12-15-09 08:21 PM
And if the difference between scores is statistically insignificant, then maybe that's all there is to it: the top 5 places are all statistically equivalent, so the exact ordering really doesn't matter (except as a matter of pride, which I understand).
No, they don't. Magnifying insignificant differences does not make them significant. Significance has a very real and non-fuzzy meaning, and it has not been achieved here.
I'm not a layperson, and even I was "confused" that the article was aimed at me as an overclockers.com reader, even though I'm not a benchmarker.
As I said, we have a diverse audience. Hardcore benchmarkers, hardcore coolers, people who are looking for the best bang for the buck, people who just want to find efficient cooling methods, people who want reliable hardware based upon people who have truly stressed it, etc. So, the different conclusions are valid, precisely because we aren't a monolithic group. -- Paul
*edit*
As I think about it more, there are really two different benchmarking communities out there.
Group 1: Uses benchmarking to compare product A to product B. Do they differ significantly?
Group 2: Uses benchmarking to compete. Who gets the highest number?
What we have here is a member of group 2 writing an article for the audiences of both groups.
*/edit*
12-15-09 08:52 PM
glad to be a part of these forums .. you guys are awesome !!
12-15-09 11:21 PM
I don't know exactly how to reconcile the other issues. Yes, anyone that's been through a mathematics or engineering background (as I have myself) has been repeatedly taught that it's wrong to present data as I have. Nevertheless, I'm not budging on that point. First of all, the graphs look bad if they're zeroed. Second, ask yourself, how does statistical significance translate into the real world? It's possible to force a statistically significant outcome that's insignificant in the real world. Conversely, some things that are statistically insignificant can be very significant in the real world. Sure, in terms of the GDP, your salary is statistically insignificant, but it would certainly be significant to you if that statistically insignificant part of the GDP were to disappear.
And besides that, even though the sample space is very tiny, the standard deviations are quite tiny in every case, far smaller than the difference between the best and worst OS. A percentage difference doesn't really mean much if you don't consider the standard deviation.
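One common way to weigh a difference in means against the run-to-run scatter is Welch's t-statistic, sketched below. This is not the analysis used in the article; the two three-run samples are invented for illustration and `welch_t` is a hypothetical helper.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic and approximate degrees of freedom for two samples."""
    # Squared standard error of each sample mean.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical three-run samples for two OSes; NOT data from the article.
os_a = [30100, 30250, 30175]
os_b = [29400, 29500, 29450]
t, df = welch_t(os_a, os_b)
print(f"t = {t:.2f}, df = {df:.2f}")
```

A large absolute t value means the gap between the averages is many times the combined scatter of the runs. With only three runs per OS the degrees of freedom are low, so the statistic still has to be compared against a t-distribution table before calling the difference significant.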
Oh and I should have noted this in the article but everything was as new as possible. SP3 for XP, SP2 for XP64, SP2 for Vista/08.
12-16-09 01:21 AM
12-16-09 03:22 AM
12-16-09 03:44 AM
You make some good points, and I want to reserve this space to write something back.
More later. Thanks -- Paul
12-16-09 06:32 AM
I think I'll stick with XP SP2 for a while longer - I use my "high-end" system mostly for things other than gaming and benching
But I'm certainly going to give Server 2008 x64 a go - x86 OSes which only support 4 GB of memory (w. PAE) aren't interesting anymore imo, so you and others are pushing me in that direction, so to say
12-16-09 07:24 AM
I considered 64-bit only as part of a multi boot so that I would have the option of booting into 64-bit when I really need it... but so far I could not find justification for over 3.5 GB of RAM for my personal use which includes use of older programs incompatible with 64-Bit OS.
On my triple Windows 7 / XP / Vista boot [all 32-Bit], I have certainly found Vista to be slower in a way that I can feel in comparison to Windows 7 and Windows XP.
I suppose benchmarks measure things once they get going and real life also includes getting them to go, which is what I mean by feeling faster vs. slower.
12-16-09 04:28 PM
12-16-09 05:07 PM
12-16-09 05:56 PM
I do use Photoshop a lot and especially with some heavy filters I always end up with a system that starts swapping madly to the disks (Raid0 on Areca 1680ix w. 4GB ).
The performance difference is then suddenly very noticeable in an x64 system with 8GB (or more) vs. x86 with its memory limitations.
If I were an avid gamer or bencher I would have gone much further than Gautam did in his stripping - I would start off with MicroXP and strip it to the bone. And of course, no AV running in the system - actually not a single startup program at all.
If it all is about performance - I think MicroXP is a good place to start
From my experience with both Vista and 7 - I see (saw with Vista - didn't try it after SP1) that XP always loaded programs faster, search was faster (without indexing on), rendering was faster +++
I guess (hope?) that these are things that Microsoft will sort out in 7 - remember that XP had a lot of problems at the start too, XP got good after SP2..
EDIT: I guess a lot of the oldtimers who aren't impressed by the eyecandy that 7 offers will keep on running XP till the bitter end
12-16-09 06:04 PM
The ID3 tag bug in 64-bit XP may be what pushes me over to linux, though. :-)
12-16-09 06:15 PM
I understand your motivation in magnifying differences both graphically and in word choice, and I have to say that it hits home for an audience of competitive benchmarkers. Such an audience IMHO relies on statistical deviations as much as hardware in its quest to beat the next guy (since, after all, glory goes to the man with the best single datapoint, not the best statistical average).
However, as somebody who doesn't competitively benchmark I agree with macklin -- the magnification of tiny differences just misleads me. I'm not a statistician so I can't comment on just how statistically significant things are (or aren't), but the OSes aren't as differentiated as the language and graphs suggest.
If I may, I have a few suggestions:
JigPu
12-16-09 06:18 PM
Indeed, you could use the data as a follow-up article for non-benchmarkers, because your results are also significant to us, but with different conclusions (as I mentioned above). It's interesting that the same data tell different stories depending upon your target. I'd be happy to help write a very short follow-up note / article-ette.
It's actually funny, because we end up using the same (software) tools for very different purposes.
Since we have both target audiences here, it might be a nice way to get further mileage from your great, hard work. Also, it might be nice to have our "cultures" intermingle.
12-16-09 07:18 PM
I guess it helps to clarify.
12-16-09 10:33 PM
That being said, it is certainly interesting to see the effect that the operating system has on a given system for various applications. I wish more of us had the resources and time to do similar benchmarks in order to be able to compare with different configurations.
12-17-09 12:25 AM
http://www.overclockers.com/windows-...-6-benchmarks/
12-17-09 01:06 AM
About the tone and all of that...I guess what I probably should have stated up front is that this began for the benching team. In fact, it was in the private team lounge in a much less refined state for months, but I was asked to make it public. So the nature of the testing and the conclusions was from the get-go intended for them. (It's also why it remained private...using Vista over XP was somewhat of a "trade secret" that's been used successfully to grab some records)
One other example that might hit home to a lot of people here is that if you were to take 3% off of 4000MHz, it'd put you at 3880MHz. However, I can assure you that many members of this forum have gone to great lengths to get that extra 3%.
12-17-09 01:31 AM
Out of all the sites that picked up your article (about half a dozen highly relevant community sites), www.hwbot.org and www.madshrimps.be had the most on point evaluation and commentary. Props to them.
12-17-09 02:26 AM
12-17-09 02:42 AM
12-17-09 02:57 AM
12-17-09 03:10 AM
12-17-09 03:31 AM
....sorry to see some people giving you headaches.
..... lol Bob, you might catch grief for saying that, but +1 brother I'm with you.
12-17-09 03:32 AM
There's a great risk when a group keeps itself isolated because it doesn't want to hear contrary opinions or analyses. The group loses out because it develops a monoculture that's susceptible to unchallenged dogma. The broader community loses out because they don't get the group's in-depth expertise. When both work together, both are enriched. They just have to learn one another's vocabularies and motivations.
12-17-09 03:36 AM
I believe that Gautam's work is in this category: well-done work that will emerge all the stronger.
Firewalling ourselves from differing points of view isn't healthy or conducive to understanding. If our analyses can only convince people who agree with us, then they probably aren't very good analyses. Fortunately, that's not the case here.
I think there's a good opportunity here to intermingle and strengthen the bonds within our diverse community. Again, I'd like to extend my offer to G to do something together as a follow-up. I'm learning a lot as I read through here.
12-17-09 03:58 AM
12-17-09 04:15 AM
I wouldn't say that. In fact, I greatly appreciate and admire how difficult it is. I myself would never have the time, patience, or budget to do that. But I admire seeing what's possible, and I appreciate that pushing the envelope of the hardware helps advance the state of hardware for the rest of us. At the absolute very least, what you do (1) helps us figure out what hardware has enough quality to survive 24-7 heavy-duty use in less extreme settings (e.g., a 5% overclock applied to a cancer simulation), and (2) pushes the hardware manufacturers to improve their top-end products, which in turn improves the mid- and lower-end products as well. It's a win for everyone. I don't think anybody denies that. And nobody denies that there are benefits to the broader community far beyond this.
What we have here is an interesting discussion. You're presenting work that started in a niche but is interesting to everyone. You're finding different points of view on the same data. That's enlightening for all of us. It's not that somebody or other "doesn't get it." It's that they have a different frame of reference.
The data may or may not be statistically significant. Some plots are, some may not be. I believe most individually are. Nonetheless, a near-NULL result is extremely interesting for the general readership, and the individual results are interesting to the benchers. We all win here. And I think taking care to remember that we are a broader audience is valuable. We gain data that we didn't have before, even if for different conclusions. It's a beautiful case of getting twice as much out of the same data as previously thought. That's a benefit of opening up to a broader group--you find things you would not have otherwise expected.
That's been the case for me. I've been exposed to the thoughts and methods of a completely new group. Aside from reading a few "world record LN2 overclock" articles here and there, this is new to me. And I gained for it. So thanks for opening up. Don't let constructive critiques scare anyone away--it means that we're genuinely interested and want to learn more. You might just get some new recruits for it.
Opening yourself up and presenting your work to a broader, often skeptical audience is challenging and scary. I know exactly how this feels, because I do it every day as a mathematician working on cancer and molecular/cellular biology. The discussions can be heated and draining, but you learn so much and advance your knowledge and your presentation skills so much, that you always come out the stronger for it.
I've also found that the more I learn, the more education I acquire, the more I find myself able and willing to say "I was wrong. I hadn't thought of it that way. That's interesting. That has so much more meaning than I had appreciated. That's deep, and I think I can use it."
12-17-09 04:22 AM
12-17-09 04:33 AM
I hope that Gautam realizes that I wouldn't even be commenting if I didn't think it was a great article worth discussing.
12-17-09 04:34 AM
But perhaps I should do one benchmark with something like 20-50 trials, which would also show that the run-to-run error is very small, and that when you have even a couple of percent worth of difference, it is significant.
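A rough sketch of how repeated trials like that could be checked, using only the Python standard library. The scores below are hypothetical numbers made up for illustration, not results from the article, and Welch's t statistic is just one reasonable choice of test, not necessarily what anyone in this thread used.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def percent_difference(a, b):
    """Difference between the mean of a and the mean of b, as a percent of b."""
    return 100 * (statistics.mean(a) - statistics.mean(b)) / statistics.mean(b)


# Hypothetical scores for the same benchmark on two operating systems.
os_a_scores = [30510, 30480, 30550, 30495, 30520, 30470, 30535, 30505]
os_b_scores = [29910, 29950, 29890, 29930, 29905, 29940, 29920, 29900]

print(f"difference: {percent_difference(os_a_scores, os_b_scores):.2f}%")
print(f"Welch t:    {welch_t(os_a_scores, os_b_scores):.1f}")
```

As a rule of thumb, when the spread between runs is tight, a |t| far above 2 suggests the gap is unlikely to be run-to-run noise, even if the gap itself is only a couple of percent.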
12-17-09 04:34 AM
For example, Vista vs Server 08 vs Win7 in 3DMark03, 3DMark05, minus Vista64 in 3DMark06, etc. Thus my original conclusion(s). I definitely agree with you that there are non-random differences in the mix.
macklin01 got to this first, so I'll let his word stand.
But for myself, I learned a lot here from this back and forth, especially about what benchmarkers look for. I understand now that this is an especially great guide for choosing which OS to run when targeting different benchmarks. This is something I wouldn't have gotten out of this without this discussion.
12-17-09 04:36 AM
12-17-09 04:41 AM
How about I focus on just 05, just Vista 32 and 7 32 for example, and give them each a much larger amount of trials?
12-17-09 04:55 AM
Anyway, do you think a fair conclusion given these numbers, for an average user interested in upgrading to Win7 from Vista only for performance reasons is "Don't bother - many insignificant results, couple significant ones but only resulting in small differences in both directions depending on benchmark"?
12-17-09 05:13 AM
The response has been overwhelmingly positive, and even Gautam would agree it was time to release his work. Six major community outlets picked up his article, as well as many smaller ones.
The negativity in response to open discussion is the only thing out of place here.
12-17-09 05:44 AM
It might be good to use the term "small" rather than "insignificant." The differences may well turn out to be statistically significant but not large enough to justify the time spent in an OS reinstallation. Again, depending upon the purpose of the system.
Another funny thought: for some of the benches, there may not be a statistically significant "winner." In those cases, a bencher would be better served by running the benchmark multiple times and waiting for a random event to push them higher than reinstalling their OS.
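A quick simulation sketch of that last point. It assumes run-to-run noise is roughly normally distributed, and the mean score and standard deviation are hypothetical values chosen for illustration; `best_of_runs` is a made-up helper name.

```python
import random


def best_of_runs(mean_score, run_to_run_stdev, runs, seed=0):
    """Best score seen across `runs` simulated runs with normally distributed noise."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    return max(rng.gauss(mean_score, run_to_run_stdev) for _ in range(runs))


# Hypothetical benchmark: mean score 30000, run-to-run standard deviation 50.
for runs in (1, 5, 20, 100):
    print(runs, round(best_of_runs(30000, 50, runs)))
```

With the same seed, every longer series starts with the same runs as the shorter ones, so the best score can only stay the same or go up as more runs are added.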
12-17-09 03:14 PM
Futuremark (formerly MadOnion) has created hype. I understood their goal early on: earning money on others' work. So I just jumped off.
12-30-09 08:42 PM
Hopefully I can afford to implement Win 7 on all of my PCs and laptops.
GREAT ARTICLE!!
01-07-10 09:57 AM
05-12-10 01:33 PM
First of all, "Vista is tangibly slower" is a subjective conclusion, which is at odds with your complaint about the article not being "scientific." It has also been known since the OS was in beta that this was a UI effect meant to make the OS seem more appealing.
To deal with the Vista comment: it is faster than XP; the difference is in the UI. The "aero theme" has a 1000ms delay that you can adjust. This will make Vista "tangibly appear faster" than XP, but it gets rid of the nice effects. Way back in the day XP had the same issue. They added a delay to the start menu, and tweakers hacked the hell out of that OS to make it a benchable system over 2000.
(subjective) For me Vista boots faster, loads programs faster and runs a lot more solidly than XP does. I DREAD having to work on people's PCs that still use XP. Sad but true. I am even starting to appreciate 7 a bit now that I have forced myself to use it for more than a month. It's still no Vista64, but it might be. (7-64 would not let me run a ton of software I like, so that choice was not an option.)
As for the basis of the article: it is quite clear, and would be too hard to read if it started at 0%. I like seeing them start at 0, and oft times when I see a review that does not start there, I anticipate a biased report. I can see why they chose to work it how they did. Yes, 5% is small in terms of "desktop readiness," but it is HUGE when talking about benchmarking. A 5% boost in performance could lead to a 50-2000% improvement in boints (not a typo).
Just saying the article is great. Thanks, Gautam, for your diligence. I find myself linking to or referring to this article quite a bit.