Am I stupid, or do the files attached to that post demonstrate that the
"new" code isn't much better, and is sometimes worse (aligned 4096-byte
memcpy went from 558 to 648 on Core 2)?
Beware that the text files in test-memcpy.tar.bz2 seem to have the
simple_memcpy / builtin_memcpy / memcpy columns swapped
(the -old and -new files have them in a different order).