XGL will certainly add overhead, since you're effectively running a nested X server within the XGL server.
But the real issue with that approach is that the XGL server "owns" access to the graphics hardware. The nested server (the one you're really using for your desktop) cannot see or access the hardware adapter directly, so it assumes you do not have hardware acceleration available and falls back to software libraries for OpenGL rendering.
If you use an app that is "XGL-aware", then it can take advantage of hardware acceleration, but right now Compiz is pretty much the only app that is. Any other standard X-based GL app will be unable to use hardware acceleration under XGL, which means you take a substantial performance hit when running 3D apps on your 3D desktop. The one advantage that XGL offers is that by implementing compositing transparently, it will work with almost any 3D-capable hardware without requiring special support built into the driver (as AIGLX does).
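If you want to see the fallback for yourself, here's a rough Python sketch (my own, not anything that ships with XGL) that runs glxinfo and reports whether direct rendering is on and which renderer is in use. It assumes glxinfo is installed and that its output has the usual "direct rendering:" and "OpenGL renderer string:" lines; the function name parse_glxinfo is just something I made up for illustration.

```python
import re
import shutil
import subprocess


def parse_glxinfo(text):
    """Pull the direct-rendering flag and renderer string out of glxinfo output."""
    direct = re.search(r"^direct rendering:\s*(\w+)", text, re.MULTILINE)
    renderer = re.search(r"^OpenGL renderer string:\s*(.+)$", text, re.MULTILINE)
    return {
        # True only if the line exists and says "Yes"
        "direct": bool(direct) and direct.group(1).lower() == "yes",
        # None if glxinfo did not print a renderer line
        "renderer": renderer.group(1).strip() if renderer else None,
    }


if __name__ == "__main__":
    if shutil.which("glxinfo") is None:
        print("glxinfo was not found on PATH")
    else:
        # Run glxinfo against whatever X display this session is using
        out = subprocess.run(["glxinfo"], capture_output=True, text=True).stdout
        info = parse_glxinfo(out)
        print("direct rendering:", "yes" if info["direct"] else "no")
        print("renderer:", info["renderer"])
```

Run it from a terminal inside the desktop session you want to check. If it reports no direct rendering, or a renderer string that names a software or indirect renderer, then your GL apps in that session are probably not getting hardware acceleration.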
AIGLX permits the compositing to be supported directly by the driver, rather than requiring an extra layer like XGL that interferes with other apps being able to utilize hardware acceleration. It's still a relatively new technology, and driver support is still a bit hit and miss depending on the adapter, particularly if you're using ATI, but for the most part this is the direction the technology is heading. KDE4, for instance, didn't even bother building in XGL support for their standard compositing.
So you're totally correct, but the performance hit is a limitation of XGL's design. It's a feature, not a bug...
Hope this all makes sense...
Cheers,
KV