I started investigating and eventually managed to reproduce the issue with just the following code:
void __fastcall TAnsiStringTesterThread::Execute()Delphi doesn't seem to suffer the same problem at first sight when I ran the above in its Delphi equivalent. That is until I tried the following (instead of assigning straight to a literal string, I did an IntToStr): procedure TAnsiStringTesterThread.Execute; var i: Integer; str: string; begin for i := 0 to 10000000 - 1 do str := IntToStr(10000000); end; The similarity in both these codes is that they both call LStrFromPCharLen, which eventually leads to a call to GetMem and when the string's ref count goes back to zero, a call to FreeMem. Could GetMem and FreeMem be the culprit? As it turns out, yes. To put the hypothesis to the test, I did a tight GetMemory / FreeMemory loop in 2 threads and observed the CPU usage. Unsurprisingly, only 50% of my dual cores are utilized, even though the utilization spreads across both quite evenly. In my search for a better memory manager, I came across the Intel Threading Building Block library. Among other useful things like parallel loops, concurrent hash maps and lock-free queue, it has a scalable memory manager. With that, I wrote a BorlndMM.dll wrapper and called it TBBMM. Here's the result:
for (int i=0; i<10000000; i++)
str = " something ";
In the single-threaded test, the TBBMM is only faster than FastMM by 20%. However, from 2 to 8 threads on a Dual Core machine, the improvements are from 2.25x to a staggering 2.5x.
For more information / to download TBBMM, visit my TBBMM webpage.