It isn't easy to use multiple cores, and not all software tasks benefit from multiple cores even when support is implemented. I think a lot of software that can make use of multiple cores already does. The only difference I think we might see going forwards is how good that multi-core scaling is. For example, with Prime95 in multi-thread mode, I suspect there is some limitation that means once you go above about 8 cores per system, the scaling falls off: it can't efficiently spread the work out. Light workloads like Cinebench R15 are pretty much the best case, in that each thread can work independently of the others. With other tasks, where inter-thread synchronisation is required, things get difficult fast.
I'm waiting for the day when people wake up and realise core count isn't everything, even if the software scales. It is only one contribution to total performance. This will become more important to consider as higher (double-digit) core counts become mainstream. As a generalisation, within a given architecture and process technology, more cores enable more throughput by offering more performance within a given power budget. But the software needs to scale well to achieve that, and that isn't easy.
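
To put rough numbers on the scaling point, here is a minimal sketch using Amdahl's law, which is my choice of textbook model and not something measured from any of the programs mentioned above. The function names and the parallel fractions (100%, 95%, 80%) are made up purely for illustration; they are not figures for Prime95 or Cinebench, and real software also loses time to synchronisation, memory bandwidth and clock speed differences that this model ignores.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Ideal speedup over one core when only `parallel_fraction` of the
    work can be spread across `cores` cores (Amdahl's law)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def scaling_efficiency(cores: int, parallel_fraction: float) -> float:
    """Speedup divided by core count: 1.0 means perfect scaling."""
    return amdahl_speedup(cores, parallel_fraction) / cores


if __name__ == "__main__":
    # Illustrative parallel fractions only (assumptions, not measurements).
    for fraction in (1.0, 0.95, 0.80):
        print(f"parallel fraction {fraction:.2f}")
        for cores in (1, 2, 4, 8, 16, 32):
            speedup = amdahl_speedup(cores, fraction)
            efficiency = scaling_efficiency(cores, fraction)
            print(f"  {cores:2d} cores: {speedup:5.2f}x speedup, "
                  f"{efficiency:4.0%} efficiency")
```

Under this model, a task that is fully parallel keeps scaling with every added core, while one with even a 5% serial part gets well under 16x from 16 cores, which is the sort of fall-off I mean.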