Malcolm is correct, of course. It’s not actually about ‘overload’, though I usually explain it that way to give a broad overview to someone who hasn’t had much experience with processors. If you want more details, they’re below.
The biggest physical difference between GPUs and CPUs is that GPUs are specialized for parallel processing. What does that mean? It means they’re built for workloads that involve a lot of arithmetic calculations and relatively little memory access.
For a super simple example:
Take a river. If you’re trying to make a river look realistic in a computer game, one way to do it is by modelling the flow of water with a vector field. Imagine calculating the position of a single droplet of water in the river: it starts at location 0, and there’s some value for the distance it travels every second (call it d). So if it starts at 0, after one second it’s at 0+d, then 0+d+d, and so on. (I’m skipping the calculus, linear algebra, and the conversion to binary here.) That’s a lot of addition, but honestly not much memory; we don’t need to store much to follow one droplet of water. To make a whole river, we want to calculate the positions of lots of droplets at the same time. A discrete/separate GPU can do that by sending each calculation to one of thousands of processors optimized for arithmetic that barely touch memory. (Discrete GPUs often have on the order of 2,000–10,000 computation units, aka ‘cores’ or ‘shading units’, so they can do that many calculations simultaneously.)
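Not real GPU code, but here’s a toy Python sketch of the idea. The displacement d and droplet count are made-up values, and NumPy’s vectorized add stands in for the GPU’s many cores each handling one droplet:

```python
import numpy as np

d = 0.5       # distance each droplet moves per second (made-up value)
steps = 4     # number of one-second time steps

# One droplet, step by step: 0, 0+d, 0+d+d, ...
pos = 0.0
for _ in range(steps):
    pos += d  # lots of tiny additions, almost no memory needed

# Many droplets "at once": a GPU would give each droplet its own core.
# NumPy's single vectorized add plays that role here.
positions = np.zeros(10_000)  # 10,000 droplets, all starting at location 0
for _ in range(steps):
    positions += d            # one add applied across every droplet

print(pos)           # 2.0
print(positions[0])  # 2.0 -- same answer, computed for all droplets together
```

The point is that the per-droplet work is pure arithmetic on almost no state, which is exactly the shape of problem a GPU is built for.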
An integrated/onboard GPU is a little GPU built in alongside the CPU, but, due to physical space constraints, it can’t have as many of those GPU-type specialized units (usually around 200–700 ‘cores’). So it can’t model as many droplets of water at the same time, which means it ‘renders’ your graphics less well. If you demand the same number of calculations from the integrated chip, it actually can do them, but it has to do them sequentially (the first 500, store those, then the second 500, and so on). That shows up as ‘lag’ or ‘freezing’ in-game while it works through all 10,000 calculations. Consequently, people may turn down the graphics settings and feel it’s a lower-quality graphical experience.
Forcing games to use the discrete GPU does ‘improve performance’ because it improves the ability to process in parallel, thereby allowing more calculations to be done at the same time. At least, if doing as many calculations as possible at once matches your definition of ‘performance’. Unless for some reason the discrete GPU has fewer processors than the integrated one (i.e. you have a really old GPU and a new CPU, which you don’t), the discrete GPU will be able to do more calculations in parallel, so by that definition it’s an automatic improvement. If your definition of performance is just being able to play games, not how they look or how fast they run, then it really doesn’t matter.
Malcolm also hit on the crux of the Windows/Linux difference for most of us: Linux is more about you doing what you want with your hardware (i.e. I want to run math on my GPU, because that’s what it’s good at). Windows is a bit more about someone else deciding what’s best for you to do with your hardware (i.e. my GPU should make pictures pretty, because that’s what the box says it does).
So, long-winded, but now you have more details.
Cheers,
SisPenguin