Intel calls their die stacking technology Foveros, and their first product to use it is Lakefield. You can read a lot more about it here: https://www.anandtech.com/show/15877/intel-hybrid-cpu-lakefield-all-you-need-to-know
There's no intrinsic reason why you can't stack dies on top of each other. Memory chips have been doing this for quite some time. But for logic dies, it's a lot harder.
One thing that makes memory chips easier to stack is that they put out very little power. If one chip that uses a tiny fraction of a watt is stacked on top of another that also uses a tiny fraction of a watt, keeping them sufficiently cooled is easy. So long as they're not particularly insulated, they'll be fine.
If you want one chip putting out 100 W stacked on top of another chip that is also putting out 100 W, you have a problem. Traditional CPUs and GPUs have a planar design where whatever is putting out a lot of heat sits as close as possible to a heat spreader and then a heatsink that sucks the heat away. If another hot die sits between a die and the heatsink, the heatsink may be able to properly cool the top die, but the bottom one is going to get rather toasty.
Cooling is hardly the only problem. There's also thermal expansion. If one die gets hotter than the other and expands more, you can crack whatever you're using to hold them together. That can be a problem with the bumps to attach the die to a package even without stacking dies, of course. But having two dies with independent hotspots makes it into a bigger problem. And if you want the two dies to have lots of little connections to allow them to communicate directly, that makes it a much bigger problem. When memory dies are stacked on top of each other, the dies are independent and don't need to communicate with each other at all.
With Lakefield, Intel starts off small, with a 7 W TDP. Naturally, that makes the problem much simpler than a 100 W TDP would. One logic die holds the CPU cores, while the other handles I/O. AMD does something similar with its Zen 2 desktop and server CPUs, except that AMD doesn't stack the dies on top of each other. Lakefield also sticks memory on top of the entire package, which makes the dies underneath harder to cool. But you can get away with that at a 7 W TDP, even if you couldn't at a higher one.
What Intel does with the CPU cores is to pair one Sunny Cove (Ice Lake) core with four Tremont Atom cores. All previous x86 CPUs have used the same core design for every core in the CPU. Smartphones have had mixed cores for quite some time, with the idea that you get better energy efficiency by having big cores for the tasks that really need them, while also having small cores where you can shove background tasks that don't need much performance.
But this creates all sorts of problems. For starters, in order to put the tasks that need high performance on a high-performance core and those that don't on a low-power core, the operating system needs to know how to do this. And Windows doesn't. Nor does Linux. Nor macOS, though this probably isn't coming to a Mac at all.
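To make the problem concrete, here's a toy sketch of the kind of decision a hybrid-aware scheduler would have to make. This is purely illustrative — the core names, the load estimate, and the 0.5 cutoff are all invented, and real schedulers use far more signals than this. The core counts do match Lakefield's one-big, four-little layout.

```python
# Illustrative sketch (not a real OS scheduler): route demanding tasks to
# the big core and background work to little cores. Core names, the load
# metric, and the 0.5 threshold are all made up for illustration.
from itertools import cycle

BIG_CORES = ["sunny_cove_0"]
LITTLE_CORES = ["tremont_0", "tremont_1", "tremont_2", "tremont_3"]
_little = cycle(LITTLE_CORES)  # round-robin over the little cores

def assign_core(estimated_load):
    """Pick a core based on a (hypothetical) 0..1 load estimate."""
    if estimated_load > 0.5:       # arbitrary cutoff for "needs performance"
        return BIG_CORES[0]
    return next(_little)

print(assign_core(0.9))  # a foreground task lands on the big core
print(assign_core(0.1))  # a background task lands on a little core
```

The hard part in practice isn't this routing logic — it's that the OS has no reliable way to know a task's future load up front, which is exactly the knowledge Windows, Linux, and macOS lacked.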
Another problem is that while the big and little cores can do things somewhat differently internally, they need to be able to execute exactly the same instructions, or else programs won't work. That means that Lakefield can only afford to expose the instructions common to both the Sunny Cove and Tremont cores--and yes, the latter has some that the former doesn't. Since Tremont has no AVX support at all, the Sunny Cove core loses AVX, AVX2, and AVX-512 entirely. So you get a crippled big core combined with some crippled small cores.
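The constraint is just a set intersection: whatever one core type supports that the other doesn't has to be switched off. The feature lists below are simplified stand-ins, not exact CPUID dumps for either core.

```python
# Toy illustration: the exposed instruction set can only be the
# intersection of what both core types support. These feature lists are
# simplified stand-ins, not complete CPUID feature dumps.
sunny_cove = {"sse4_2", "aes", "avx", "avx2", "avx512f"}
tremont = {"sse4_2", "aes", "gfni", "cldemote"}

exposed = sunny_cove & tremont   # only features both cores have survive
print(sorted(exposed))
```

Everything outside the intersection — the big core's AVX family, the little cores' extras — is wasted silicon as far as software is concerned.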
So why? Why take on all of these problems? For hybrid CPU cores, the answer is that, if it all works right, it's the best way to get the maximum amount of performance out of a fixed TDP. Of course, it's pretty much guaranteed not to work how you'd hope, so that alone will probably make it a dumb product.
Of course, that maximum performance will surely get squashed by the stacked dies of Lakefield. Normally, if you build an x86 CPU with a "7 W" TDP, you can go way above that 7 W for short periods of time. It's okay to use 20 W briefly, then throttle back as the chip heats up. The "7 W" is the maximum sustained power usage. But if stacking dies will cause mismatched hot spots to overheat faster or even crack apart, then you can't have turbo use 20 W. Intel claims that the max turbo power is 9.5 W. So much for maximizing performance.
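A rough way to see what a tight turbo limit costs is to model the sustained and burst limits in the style of Intel's PL1/PL2 power limits, with an exponential moving average standing in for the chip heating up. The 7 W and 9.5 W figures come from the text above; the time constant and the averaging scheme are invented round numbers, not Lakefield's actual behavior.

```python
# Sketch of sustained (PL1-style, 7 W) vs. burst (PL2-style, 9.5 W) power
# limits. The moving average is a crude stand-in for chip temperature; the
# 28 s time constant is an invented round number.
PL1, PL2 = 7.0, 9.5      # sustained and burst limits, watts (from the text)
TAU = 28.0               # smoothing time constant, seconds (assumed)
DT = 1.0                 # time step, seconds

avg = 0.0                # long-term average power; chip starts cold
alpha = DT / TAU
turbo_seconds = 0
for t in range(120):
    power = PL2 if avg < PL1 else PL1   # boost only while the average allows
    if power == PL2:
        turbo_seconds += 1
    avg += alpha * (power - avg)

print(turbo_seconds)     # seconds of 9.5 W burst before throttling to 7 W
```

With a normal laptop chip you'd plug in a PL2 of 20 W or more and get a much bigger burst of performance out of the same 7 W sustained budget; capping the burst at 9.5 W is what makes the "so much for maximizing performance" complaint bite.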
As for stacking dies, the immediate point seems to be a smaller package. Instead of your CPU package taking up somewhat under a square inch (about 645 mm²), the Lakefield package with CPU and memory together can be a square only 12 mm on a side--144 mm², less than a quarter the area. That's a big advantage in all of those situations where an x86 CPU would be useful but a package of a square inch is just too big. That's a decent enough description of a phone (which won't use x86), or of Intel's compute stick, but not of a laptop whose physical size is driven by the monitor, let alone desktops or servers.
So the upshot is that you can pay a lot of money to get a low performance device that will cause all sorts of problems for software. That sure sounds like a terrible product to me.
So what's the point? As best as I can tell, it's more a test vehicle than a real consumer product, even if it will be offered to consumers. Rather than having die stacking kill one product while you work out its kinks, and hybrid CPU cores kill another product while you fix its problems, just sacrifice a single product to both sets of teething troubles. See what goes wrong and fix it in the next die stacking product, and also in the next hybrid core product. Assuming that there is a next product for both technologies. Or either of them.
Longer term, hybrid CPU core approaches could have a future if you want to push x86 into lower power envelopes. They're already ubiquitous in phones, and the same reasons should apply to x86 as to ARM. Of course, that assumes that you actually want to use x86 cores and not ARM when you need low power.
Stacking logic dies in the manner of Lakefield doesn't have any obvious applications in the near future, however. Rather, this looks like a longer-run development effort. Everyone knows that you can't keep doing die shrinks forever. How do you keep improving performance once you can't shrink silicon any further? There have been some proposals to do so by moving to other materials, but whatever you use, it can only shrink so far.
Another proposal is to stack dies on top of each other in a 3D manner. Rather than having a single 20x20 mm die, why not have four 10x10 mm dies stacked on top of each other? If you can have the dies communicate in a fine-grained manner, you only need to move data a fraction of a millimeter to the next die down, rather than 10 mm for the next die over. That could save power, and ultimately allow more performance in the same TDP as before.
Of course, actually building all those tiny wires to connect stacked dies is hard. Lakefield has some communication between dies, but it's still on a much larger scale than the metal layers within a single chip. But you have to start somewhere, and this might be Intel's attempt at starting to build toward full 3D logic chips. Start seeing what goes wrong now and fix it before it actually needs to work right. Then shrink as you go. Or at least, shrink them if you can. Trying to shrink the wires means it takes that much less thermal expansion to snap them off entirely.