Compiling

First, the news. Intel announced new compilers optimized for Willamette (and Itanium).

Zzzzzzzzzzzzzzzzzzzzzzz.

I know, BORING.

Silicon is sexy. It’s something you can hold in your hand; it’s real. It’s the athlete in any computer sport.

However, an athlete isn’t going to do very well if he doesn’t train right. Watching an athlete compete is exciting, watching him train is not.

Part of the unseen “training” of a CPU is the programming environment in which it lives. If a program uses all the advantages a particular CPU has, the CPU will perform its best. If it doesn’t, the CPU won’t.

A CPU’s “diet” is machine code. It tells the CPU what to do, and the CPU does whatever the code tells it to do. It never second-guesses the code or figures out a better way to do something.

Usually**, a programmer doesn’t directly tell the CPU what to do by writing in assembly language, which is the human-readable equivalent of machine code. While assembly language is best for squeezing the last ounce of performance out of a CPU, writing a program in assembly language is much harder and slower than writing it in a high-level programming language.

So what most programmers do in most of their programs is write the program in a computer language like C++, then have a compiler turn the code in the programming language into machine code.

A compiler is essentially a big translator, and like a multi-language translator, it provides different code for different recipients. If it’s told to do that.

In the last few years, there have been “additions” to the standard x86 “language.” They’re called MMX, 3DNow, SSE, and very soon SSE2. Consider them to be new vocabulary to the x86 language.

These new vocabularies tell the CPU to use new features built into it, but again, it never volunteers to do these things on its own. The machine code has to tell it.
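To make the “new vocabulary” idea concrete, here is a minimal sketch in C++. It assumes a compiler that ships the `<xmmintrin.h>` SSE intrinsics header and defines the `__SSE__` macro when SSE is enabled (GCC and Clang do; other compilers may differ). The function names `add_scalar` and `add_sse` are made up for illustration. The first function spells out one addition at a time; the second uses SSE words that add four floats at once, and quietly falls back to the plain loop if SSE wasn’t enabled.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics; each call corresponds to an SSE instruction
#endif

// Plain version: the source spells out one addition per loop pass.
void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// SSE version: four additions per instruction, but only if the compiler
// was told SSE is available. Otherwise it falls back to the plain loop.
void add_sse(const float* a, const float* b, float* out, std::size_t n) {
    assert(n % 4 == 0);  // simplification for this sketch: length is a multiple of 4
#if defined(__SSE__)
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  // load four floats (no alignment required)
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));  // add all four, store the result
    }
#else
    add_scalar(a, b, out, n);
#endif
}
```

Both functions give the same answers. The only difference is which instructions end up in the machine code, and that is the whole point: the CPU uses SSE only because the code says so.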

When you hear about programs being “optimized” for something like 3DNow or SSE, what that means is that the machine code knows these features exist, and uses them when appropriate.

Usually, optimization is the compiler’s job. It’s supposed to know the best way to tell the CPU to do something. However, if it doesn’t have the latest vocabulary down, it can’t.

Expecting an older compiler to tell a new CPU with all kinds of new features what to do is like expecting grandpa to tell grandson, “Bitch-slap that ho!” or telling him how to roller-blade. It’s just not in grandpa’s vocabulary.

While this is not the only way a compiler can take advantage of new features, you get the idea.

Why This Matters

AMD and Intel have been taking two different paths to better FPU performance. AMD has generally looked more towards improving the hardware (3DNow notwithstanding). Intel has looked more towards software optimization, and SSE2 is the latest step down that road.

The problem with depending on software optimization for CPU performance is that the programs have to use it for it to do any good. Usually, that doesn’t happen until the compilers let the programmers optimize just by checking a box, and you can’t check a box that isn’t there.
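As a rough illustration of the “box,” with a present-day GCC or Clang it is a command-line switch such as `-msse` or `-msse2`. The sketch below assumes those compilers’ predefined macros `__SSE__` and `__SSE2__`; the function name `compiled_vocabulary` is hypothetical. It simply reports which vocabulary the compiler was told it could use when it built the file.

```cpp
#include <string>

// Reports which x86 "vocabulary" this file was compiled to use.
// __SSE__ and __SSE2__ are predefined by GCC and Clang when the matching
// instruction set is enabled; other compilers may use different names.
std::string compiled_vocabulary() {
#if defined(__SSE2__)
    return "SSE2";
#elif defined(__SSE__)
    return "SSE";
#else
    return "baseline";
#endif
}
```

The same source file gives a different answer depending on how it was compiled. If the compiler has no switch for a feature at all, the first two branches can never be reached, no matter what the CPU underneath is capable of.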

Most of you have seen benchmarks which show Willamette not doing too well against high-end TBirds. Software that isn’t yet optimized for it is part of the reason in some of those benchmarks.

So that’s why the announcement is important, even though you’ll probably never notice its effects directly. Intel is announcing the immediate availability of these compilers earlier than we normally see them. That means programs will begin being optimized for Willamette a bit sooner than expected.

That helps Intel (though AMD chips can also gain some advantage from this).

Does optimizing really make a difference?

Most of you are aware of a program called Photoshop. Photoshop was initially created for the Mac, and later ported to Windows.

On the whole, Photoshop (and especially many of its filters) is far more optimized for the Mac than for a PC. You can run a Photoshop benchmark using a lot of these filters on a 1GHz PIII, and you won’t beat a 500MHz G4 by much.

An extreme case, to be sure, but if you’ve ever wondered why a PIII generally did better than an equivalent TBird in some games, optimization for SSE compared to less or no 3DNow optimization is a pretty big reason.

It’s not snazzy or sexy, but good optimization can be worth more than an extra 100-200MHz of raw processing power.

**You’ll see assembly language programming used more in critical parts of games, but even there, most games are mostly written in a high-level language which is then compiled.

Email Ed

