Originally posted here by Tim_axe
I came across this not too long ago, and didn't have time to read it. Now that you brought it up, I'm taking the time to read it and I find it quite interesting.

I'm actually in the process of learning about things like the AX/BX/CX/DX registers in the x86 architecture and haven't quite made it to how an OS manages the different caches. I have gone far enough, though, to know a little about the data bus and some of the even/odd address issues of the memory subsystem, including wait states and buffer delays.

So seeing some of this stuff come together in the paper is really cool. Fortunately (perhaps), some web/SQL servers are using dual Xeon systems with HT, giving execution 4 logical processors (2 physical, 2 virtual) to happen on and hopefully disrupting the ability to get information specific to a process. On my own dual-CPU machine, Win2K has a habit of splitting a process across separate cores for some odd reason, which sucks for performance, but maybe this shortcoming can be sold as a feature? (j/k)


Cheers and good read, though I'm not sure how many people would really understand what he is working so hard to explain in those 12 pages.
The cool thing is that the same concept works in different contexts (AES, HT, ...). The particular problem discussed in the paper is therefore unique not in his approach but in its use, and that is what makes it so interesting. Another thing to watch for... cache timing...