| ARCHIVES - General Questions If your question doesn't fit in any other category below ask in here. |
|
|||
|
I'm new to the list... howdy to all. I'm wondering whether someone might have an explanation for an anomaly I've observed while testing our server. Before I get into the details, let me say that I've seen this problem(?) on various hardware and OSs; however, since SuSE is our preferred OS, I've come to the SuSE forum.

We have a test that forks many processes, each of which briefly flocks a file. If the test is run on a single CPU core, it essentially single-threads and completes in the fastest time. As we increase the number of cores, performance degrades linearly.

Now you're probably thinking, "You're locking a file, so what do you expect?" Well, we wrote a program that simulates the problem: it simply flocks a file briefly and then releases it. It performs the same locking work on every run, yet as we run it on an ever-increasing number of cores, performance degrades. Just as an example, a test run that completes in 38 seconds on a single core takes 1100 seconds on 32 cores (2-chassis IBM x3950). BTW, I've been using taskset to specify the number and location of the cores used.

We've had theories about shared L2 caches and context switching, but I'm not sure we're thinking in the right direction. I would be happy to send our standalone C++ flock test program to anyone interested in pursuing this. I've kinda resigned myself to this being some sort of architectural problem, but I'd really like to understand what's going on.

If this is not the place for this type of question, please forgive me and tell me where I should go.

TIA - Tim

kernel - 2.6.16.46-0.12-smp #1 SMP Thu May 17 14:00:09 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux
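For anyone who wants to try reproducing this, the kind of test described above might look roughly like the sketch below. To be clear, this is not Tim's C++ program (that isn't posted in the thread); it is a minimal Python stand-in, and the names `lock_loop` and `run_test`, the process count, and the iteration count are all made up for illustration. Each child opens the file itself, since flock locks belong to the open file description and children sharing one inherited descriptor would not contend with each other.

```python
import fcntl
import os
import tempfile
import time


def lock_loop(path, iterations):
    """Take and immediately release an exclusive flock, `iterations` times."""
    # Opened in the child so every process has its own open file description.
    with open(path, "w") as fh:
        for _ in range(iterations):
            fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until the lock is free
            fcntl.flock(fh, fcntl.LOCK_UN)


def run_test(num_procs, iterations):
    """Fork `num_procs` children contending on one lock file.

    Returns (elapsed_seconds, number_of_failed_children).
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    start = time.monotonic()
    pids = []
    for _ in range(num_procs):
        pid = os.fork()
        if pid == 0:
            # Child: do the locking work, then exit without running
            # any of the parent's cleanup code.
            try:
                lock_loop(path, iterations)
            except BaseException:
                os._exit(1)
            os._exit(0)
        pids.append(pid)
    failures = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if status != 0:
            failures += 1
    elapsed = time.monotonic() - start
    os.unlink(path)
    return elapsed, failures


if __name__ == "__main__":
    # Illustrative sizes only; the original test's sizes are not given.
    elapsed, failures = run_test(num_procs=4, iterations=5000)
    print(f"elapsed: {elapsed:.3f}s, failed children: {failures}")
```

Saved as, say, `flock_test.py` (a hypothetical name), it could be pinned to one core with `taskset -c 0 python3 flock_test.py` and to several with `taskset -c 0-3 python3 flock_test.py`, then the elapsed times compared. Whether a Python version shows the same degradation as the C++ test is an open question.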
|
|||
|
Try the mailing lists, though tbh, with my very limited knowledge, wouldn't this be app/kernel specific rather than distro specific?
|
|
|||
|
Well, 38 x 32 is only 1216 seconds, so 1100 is hardly much of a deviation, is it? This sort of thing is way out of my depth, but it looks to me like it's running the whole task once per core rather than splitting it across all the cores (otherwise, instead of 38 x 32, you'd expect 38 / 32 = 1.1875 s).
Maybe I'm well off, in which case I'll go now.
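The arithmetic in this reply can be checked in a few lines. The three input numbers come from the original post; the variable names are just illustrative, and the two reference points (fully serialized versus perfectly split) are this reply's reading of the situation, not something the thread confirms.

```python
single_core_s = 38   # runtime on one core, from the original post
cores = 32           # cores in the slow run, from the original post
observed_s = 1100    # runtime on 32 cores, from the original post

# If every core repeated the whole single-core job one after another.
serialized_s = single_core_s * cores   # 1216

# If the single-core job were split perfectly across all cores.
ideal_s = single_core_s / cores        # 1.1875

# How much slower the 32-core run actually was than the 1-core run.
slowdown = observed_s / single_core_s

print(f"serialized: {serialized_s} s")
print(f"ideal split: {ideal_s} s")
print(f"observed slowdown: {slowdown:.1f}x")
```

The observed slowdown lands a little under the core count, which is what makes the "38 x 32" comparison look close.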
|
|