Memory fetches, writes, etc. are a special case. You're right that a fetch will stall an instruction for some amount of time depending on where the data is coming from. My point was that, generally speaking, for the operative stages of the pipeline, the cycle time is set by the slowest one. If you want to get really technical, some stages are actually longer operations split across multiple cycles, but the circuitry has to be designed to handle that. You can't have a stage that's wired to run in one clock cycle take two just because its work needs that long; it will just produce unexpected outputs.
Some engineers would actively split the 10-unit stage I gave into two separate 5-unit stages, so that the new longest stage time is 8. However, introducing a new stage also introduces another interlock penalty, so in practice it could actually worsen performance. Such is the nature of electrical engineering: it's all about hunting for that sweet spot. I don't know, I guess I was trying to not be too technical about it...
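The trade-off can be sketched with some back-of-the-envelope arithmetic. The only number taken from the discussion is the 10-unit stage; the other stage times and the stall count are invented for illustration.

```python
# Back-of-the-envelope sketch of the stage-splitting trade-off.
# Only the 10-unit stage comes from the discussion; everything else is made up.
def pipeline_time(stage_times, n_instructions, extra_stalls=0):
    clock = max(stage_times)  # synchronous: the slowest stage sets the cycle
    # classic fill + drain: (stages + n - 1) cycles, plus any stall cycles
    cycles = len(stage_times) + n_instructions - 1 + extra_stalls
    return cycles * clock

before = [6, 8, 10, 7]    # longest stage: 10 units -> 10-unit clock
after  = [6, 8, 5, 5, 7]  # 10-unit stage split in two -> 8-unit clock

n = 1000
print(pipeline_time(before, n))       # 10030
print(pipeline_time(after, n))        # 8032: the faster clock wins here
# but if the deeper pipeline adds, say, 300 interlock stall cycles:
print(pipeline_time(after, n, 300))   # 10432: the split made things worse
```

With no extra stalls the split is a clear win; pile on enough interlock penalties and it's a net loss, which is exactly the sweet spot being hunted for.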
All stages in the pipeline (other than a fetch/write) are given the same amount of time to complete because we're working with synchronous machines. The timing is generally set for the worst case, with some headroom; the headroom is the extra time left in the cycle after the stage has actually finished its work. Overclocking effectively reduces that headroom, and you eventually find a particular chip's limit when you not only run out of headroom but start eating into the time the stage's work itself needs to complete. At that point the stage can't finish, and 'interesting' errors will almost certainly (guaranteed, actually) occur.
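The headroom idea can be put in numbers. Everything here is invented (the 0.9 ns stage delay, the frequencies); the point is only that raising the clock shrinks the cycle until the slowest stage no longer fits inside it.

```python
# Toy illustration of headroom vs. overclocking. All numbers are invented.
STAGE_WORK_NS = 0.9  # time the slowest stage's logic actually needs to settle

def headroom(freq_ghz):
    cycle_ns = 1.0 / freq_ghz           # overclocking shrinks the cycle
    return cycle_ns - STAGE_WORK_NS     # negative => stage can't finish in time

for f in (1.0, 1.05, 1.1, 1.15):
    h = headroom(f)
    status = "ok" if h >= 0 else "FAILS: stage doesn't finish"
    print(f"{f:.2f} GHz: headroom {h:+.3f} ns  {status}")
```

Once the headroom goes negative, the stage's outputs get latched before the logic has settled, and that's where the 'interesting' errors come from.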
I'll edit later tonight, I have to leave for work shortly.
peace.