I’m experiencing overheating problems when building large simulation software on my laptop using make -j <N>, where N is the number of cores on my machine. At full load, the fans on my system don’t spin up to full speed, or at least are very ‘jumpy’, in the sense that they run for maybe 2-3 seconds before going back to idle speed. I have had to resort to canceling the build just to keep temperatures down, because the fan isn’t kicking in automatically or for long enough.
When the fan runs, the temps go down to the ‘normal’ levels I expect from a laptop operating at full capacity (8 cores @ 100% around 70-80 degC, idling is usually around 55 degC). The problem is that the fan doesn’t always run, and I haven’t found a setting that makes the fan turn on at a lower threshold than it does now.
I have a Toshiba laptop running Tumbleweed:
$ uname -a
Linux tumbleweed 4.14.2-1-default #1 SMP PREEMPT Fri Nov 24 08:20:07 UTC 2017 (b0610fc) x86_64 x86_64 x86_64 GNU/Linux
Running sensors-detect shows no signs of discovering a fan anywhere:
$ sensors-detect --auto
# sensors-detect revision 6284 (2015-05-31 14:00:33 +0200)
# System: TOSHIBA Satellite L855 [PSKFUU-02Y003] (laptop)
# Board: TOSHIBA Portable PC
# Kernel: 4.14.2-1-default x86_64
# Processor: Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz (6/58/9)
Running in automatic mode, default answers to all questions
are assumed.
Some south bridges, CPUs or memory controllers contain embedded sensors.
Do you want to scan for them? This is totally safe. (YES/no):
Module cpuid loaded successfully.
Silicon Integrated Systems SIS5595... No
VIA VT82C686 Integrated Sensors... No
VIA VT8231 Integrated Sensors... No
AMD K8 thermal sensors... No
AMD Family 10h thermal sensors... No
AMD Family 11h thermal sensors... No
AMD Family 12h and 14h thermal sensors... No
AMD Family 15h thermal sensors... No
AMD Family 16h thermal sensors... No
AMD Family 15h power sensors... No
AMD Family 16h power sensors... No
Intel digital thermal sensor... Success!
(driver `coretemp')
Intel AMB FB-DIMM thermal sensor... No
Intel 5500/5520/X58 thermal sensor... No
VIA C7 thermal sensor... No
VIA Nano thermal sensor... No
Some Super I/O chips contain embedded sensors. We have to write to
standard I/O ports to probe them. This is usually safe.
Do you want to scan for Super I/O sensors? (YES/no):
Probing for Super-I/O at 0x2e/0x2f
Trying family `National Semiconductor/ITE'... Yes
Found unknown chip with ID 0xfc11
Probing for Super-I/O at 0x4e/0x4f
Trying family `National Semiconductor/ITE'... No
Trying family `SMSC'... No
Trying family `VIA/Winbond/Nuvoton/Fintek'... No
Trying family `ITE'... No
Some hardware monitoring chips are accessible through the ISA I/O ports.
We have to write to arbitrary I/O ports to probe them. This is usually
safe though. Yes, you do have ISA I/O ports even if you do not have any
ISA slots! Do you want to scan the ISA I/O ports? (YES/no):
Probing for `National Semiconductor LM78' at 0x290... No
Probing for `National Semiconductor LM79' at 0x290... No
Probing for `Winbond W83781D' at 0x290... No
Probing for `Winbond W83782D' at 0x290... No
Lastly, we can probe the I2C/SMBus adapters for connected hardware
monitoring devices. This is the most risky part, and while it works
reasonably well on most systems, it has been reported to cause trouble
on some systems.
Do you want to probe the I2C/SMBus adapters now? (YES/no):
Using driver `i2c-i801' for device 0000:00:1f.3: Intel Panther Point (PCH)
Module i2c-dev loaded successfully.
Next adapter: i915 gmbus ssc (i2c-0)
Do you want to scan it? (yes/NO/selectively):
Next adapter: i915 gmbus vga (i2c-1)
Do you want to scan it? (yes/NO/selectively):
Next adapter: i915 gmbus panel (i2c-2)
Do you want to scan it? (yes/NO/selectively):
Next adapter: i915 gmbus dpc (i2c-3)
Do you want to scan it? (yes/NO/selectively):
Next adapter: i915 gmbus dpb (i2c-4)
Do you want to scan it? (yes/NO/selectively):
Next adapter: i915 gmbus dpd (i2c-5)
Do you want to scan it? (yes/NO/selectively):
Next adapter: DPDDC-C (i2c-6)
Do you want to scan it? (yes/NO/selectively):
Next adapter: SMBus I801 adapter at 4040 (i2c-7)
Do you want to scan it? (YES/no/selectively):
Client found at address 0x50
Probing for `Analog Devices ADM1033'... No
Probing for `Analog Devices ADM1034'... No
Probing for `SPD EEPROM'... Yes
(confidence 8, not a hardware monitoring chip)
Probing for `EDID EEPROM'... No
Client found at address 0x52
Probing for `Analog Devices ADM1033'... No
Probing for `Analog Devices ADM1034'... No
Probing for `SPD EEPROM'... Yes
(confidence 8, not a hardware monitoring chip)
Now follows a summary of the probes I have just done.
Driver `coretemp':
* Chip `Intel digital thermal sensor' (confidence: 9)
Do you want to overwrite /etc/sysconfig/lm_sensors? (YES/no):
Unloading i2c-dev... OK
Unloading cpuid... OK
sensors also shows no sign of a fan, just core temperature sensors:
$ sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +103.0°C (crit = +110.0°C)
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +103.0°C (high = +87.0°C, crit = +105.0°C)
Core 0: +100.0°C (high = +87.0°C, crit = +105.0°C)
Core 1: +102.0°C (high = +87.0°C, crit = +105.0°C)
Core 2: +103.0°C (high = +87.0°C, crit = +105.0°C)
Core 3: +99.0°C (high = +87.0°C, crit = +105.0°C)
That being said, when looking through lsmod, I can see that some fan-related modules are loaded, but the fan module shows a use count of 0:
$ lsmod | grep -iE 'temp|fan|toshiba'
coretemp 16384 0
x86_pkg_temp_thermal 16384 0
toshiba_acpi 49152 0
sparse_keymap 16384 1 toshiba_acpi
industrialio 81920 1 toshiba_acpi
wmi 28672 1 toshiba_acpi
toshiba_bluetooth 16384 0
rfkill 28672 8 toshiba_bluetooth,bluetooth,toshiba_acpi,cfg80211
fan 16384 0
video 45056 2 toshiba_acpi,i915
acpi can see that there is at least 1 fan on my system:
$ acpi -V
Battery 0: Unknown, 100%
Battery 0: design capacity 4400 mAh, last full capacity 1238 mAh = 28%
Adapter 0: on-line
Thermal 0: active, 76.0 degrees C
Thermal 0: trip point 0 switches to mode critical at temperature 110.0 degrees C
Thermal 0: trip point 1 switches to mode passive at temperature 110.0 degrees C
Thermal 0: trip point 2 switches to mode active at temperature 70.0 degrees C
Cooling 0: Processor 0 of 10
Cooling 1: Processor 0 of 10
Cooling 2: Processor 0 of 10
Cooling 3: Processor 0 of 10
Cooling 4: Fan 1 of 1
Cooling 5: Processor 0 of 10
Cooling 6: Processor 0 of 10
Cooling 7: Processor 0 of 10
Cooling 8: x86_pkg_temp no state information available
Cooling 9: Processor 0 of 10
Cooling 10: intel_powerclamp no state information available
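To cross-check the trip points that acpi reports, the same values should be readable from the trip_point_N_temp and trip_point_N_type files of the thermal zone in sysfs, where temperatures are given in millidegrees Celsius. Below is a minimal sketch; the function name list_trip_points is my own (hypothetical), and I am assuming that acpi's "Thermal 0" corresponds to thermal_zone0.

```shell
#!/bin/sh
# Print each trip point of a thermal zone as "index type temp_in_degC".
# list_trip_points is a hypothetical helper name, not an existing tool.
# Assumption: acpi's "Thermal 0" maps to thermal_zone0 (the default below).
list_trip_points() {
    zone="${1:-/sys/devices/virtual/thermal/thermal_zone0}"
    i=0
    while [ -r "$zone/trip_point_${i}_temp" ]; do
        milli=$(cat "$zone/trip_point_${i}_temp" 2>/dev/null)
        type=$(cat "$zone/trip_point_${i}_type" 2>/dev/null)
        if [ -n "$milli" ]; then
            # sysfs reports millidegrees Celsius; integer division drops the fraction
            temp=$((milli / 1000))
        else
            temp="?"
        fi
        printf '%s %s %s\n' "$i" "${type:-?}" "$temp"
        i=$((i + 1))
    done
}
```

Calling list_trip_points with no argument reads thermal_zone0; a different zone directory can be passed as the first argument.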
There are cooling devices listed in /sys/, all of which have a cur_state of 0 and a max_state of 10:
$ ls /sys/devices/virtual/thermal/*
/sys/devices/virtual/thermal/cooling_device0:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device1:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device2:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device3:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device4:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device5:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device6:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device7:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device8:
cur_state device max_state power subsystem type uevent
/sys/devices/virtual/thermal/cooling_device9:
cur_state max_state power subsystem type uevent
/sys/devices/virtual/thermal/thermal_zone0:
available_policies cdev4 cdev8_trip_point subsystem
cdev0 cdev4_trip_point cdev8_weight sustainable_power
cdev0_trip_point cdev4_weight device temp
cdev0_weight cdev5 integral_cutoff trip_point_0_temp
cdev1 cdev5_trip_point k_d trip_point_0_type
cdev1_trip_point cdev5_weight k_i trip_point_1_temp
cdev1_weight cdev6 k_po trip_point_1_type
cdev2 cdev6_trip_point k_pu trip_point_2_temp
cdev2_trip_point cdev6_weight mode trip_point_2_type
cdev2_weight cdev7 offset type
cdev3 cdev7_trip_point policy uevent
cdev3_trip_point cdev7_weight power
cdev3_weight cdev8 slope
/sys/devices/virtual/thermal/thermal_zone1:
available_policies k_pu subsystem trip_point_1_temp
integral_cutoff offset sustainable_power trip_point_1_type
k_d policy temp type
k_i power trip_point_0_temp uevent
k_po slope trip_point_0_type
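To see which of these cooling devices is the fan and what state each one is in, the type, cur_state and max_state files can be read in a loop. This is only a sketch; list_cooling_devices is a hypothetical helper name of my own, and the optional argument exists only so that it can be pointed at a different directory.

```shell
#!/bin/sh
# Print "name type cur_state/max_state" for every cooling device.
# list_cooling_devices is a hypothetical helper name, not an existing tool.
list_cooling_devices() {
    base="${1:-/sys/devices/virtual/thermal}"
    for dev in "$base"/cooling_device*; do
        [ -d "$dev" ] || continue   # skip when the glob matched nothing
        type=$(cat "$dev/type" 2>/dev/null)
        cur=$(cat "$dev/cur_state" 2>/dev/null)
        max=$(cat "$dev/max_state" 2>/dev/null)
        # "?" marks a file that was missing or unreadable
        printf '%s %s %s/%s\n' "${dev##*/}" "${type:-?}" "${cur:-?}" "${max:-?}"
    done
}
```

Running list_cooling_devices with no argument walks /sys/devices/virtual/thermal and prints one line per device, which makes it easier to spot the entry whose type is the fan.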
pwmconfig can’t find any pwm-capable sensor modules installed:
$ pwmconfig
[sudo] password for root:
# pwmconfig revision 6243 (2014-03-20)
This program will search your sensors for pulse width modulation (pwm)
controls, and test each one to see if it controls a fan on
your motherboard. Note that many motherboards do not have pwm
circuitry installed, even if your sensor chip supports pwm.
We will attempt to briefly stop each fan using the pwm controls.
The program will attempt to restore each fan to full speed
after testing. However, it is ** very important ** that you
physically verify that the fans have been to full speed
after the program has completed.
/usr/sbin/pwmconfig: There are no pwm-capable sensor modules installed
sensors-detect finds that ‘coretemp’ should be used, and it is loaded as far as I can tell (see lsmod output above).
The cpupower command doesn’t list any fan-related output:
$ cpupower frequency-info
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: Not Available
CPUs which need to have their frequency coordinated by software: Not Available
maximum transition latency: Cannot determine or is not supported.
hardware limits: 1.20 GHz - 3.40 GHz
available cpufreq governors: performance
current policy: frequency should be within 1.20 GHz and 3.40 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 1.35 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
3200 MHz max turbo 4 active cores
3200 MHz max turbo 3 active cores
3300 MHz max turbo 2 active cores
3400 MHz max turbo 1 active cores
I got some clues on what to check from this thread (https://forums.opensuse.org/showthread.php/492873-Toshiba-Qosmio-X770-107-openSUSE-13-1-fans-not-running-no-fan-in-proc-but-yes-in-sys?highlight=fan+toshiba), but no one answered it, so it didn’t help beyond giving hints.
Any help is greatly appreciated.