android_kernel_msm-6.1_noth.../include
Peter Zijlstra d153b15344 sched/core: Fix wake_affine() performance regression
Eric reported a sysbench regression against commit:

  3fed382b46 ("sched/numa: Implement NUMA node level wake_affine()")

Similarly, Rik was looking at the NAS-lu.C benchmark, which regressed
against his v3.10 enterprise kernel.

PRE (current tip/master):

 ivb-ep sysbench:

   2: [30 secs]     transactions:                        64110  (2136.94 per sec.)
   5: [30 secs]     transactions:                        143644 (4787.99 per sec.)
  10: [30 secs]     transactions:                        274298 (9142.93 per sec.)
  20: [30 secs]     transactions:                        418683 (13955.45 per sec.)
  40: [30 secs]     transactions:                        320731 (10690.15 per sec.)
  80: [30 secs]     transactions:                        355096 (11834.28 per sec.)

 hsw-ex NAS:

 OMP_PROC_BIND/lu.C.x_threads_144_run_1.log: Time in seconds =                    18.01
 OMP_PROC_BIND/lu.C.x_threads_144_run_2.log: Time in seconds =                    17.89
 OMP_PROC_BIND/lu.C.x_threads_144_run_3.log: Time in seconds =                    17.93
 lu.C.x_threads_144_run_1.log: Time in seconds =                   434.68
 lu.C.x_threads_144_run_2.log: Time in seconds =                   405.36
 lu.C.x_threads_144_run_3.log: Time in seconds =                   433.83

POST (+patch):

 ivb-ep sysbench:

   2: [30 secs]     transactions:                        64494  (2149.75 per sec.)
   5: [30 secs]     transactions:                        145114 (4836.99 per sec.)
  10: [30 secs]     transactions:                        278311 (9276.69 per sec.)
  20: [30 secs]     transactions:                        437169 (14571.60 per sec.)
  40: [30 secs]     transactions:                        669837 (22326.73 per sec.)
  80: [30 secs]     transactions:                        631739 (21055.88 per sec.)

 hsw-ex NAS:

 lu.C.x_threads_144_run_1.log: Time in seconds =                    23.36
 lu.C.x_threads_144_run_2.log: Time in seconds =                    22.96
 lu.C.x_threads_144_run_3.log: Time in seconds =                    22.52

This patch takes out all the shiny wake_affine() stuff and goes back to
utter basics. Between the two CPUs involved with the wakeup (the CPU
doing the wakeup and the CPU we ran on previously) pick the CPU we can
run on _now_.

This restores much of the regressions against the older kernels,
but leaves some ground in the overloaded case. The default-enabled
WA_WEIGHT (which will be introduced in the next patch) is an attempt
to address the overloaded situation.

Reported-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jinpuwang@gmail.com
Cc: vcaputo@pengaru.com
Fixes: 3fed382b46 ("sched/numa: Implement NUMA node level wake_affine()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-10 10:14:02 +02:00
..
acpi ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again 2017-09-19 22:42:31 +02:00
asm-generic percpu: make this_cpu_generic_read() atomic w.r.t. interrupts 2017-09-26 07:37:33 -07:00
clocksource
crypto
drm lib/interval_tree: fast overlap detection 2017-09-08 18:26:49 -07:00
dt-bindings ARC: reset: remove the misleading v1 suffix all over 2017-09-18 13:02:03 +02:00
keys
kvm
linux sched/core: Fix wake_affine() performance regression 2017-10-10 10:14:02 +02:00
math-emu
media media updates for v4.14-rc1 2017-09-07 12:53:14 -07:00
memory
misc
net udp: perform source validation for mcast early demux 2017-10-01 03:55:47 +01:00
pcmcia
ras
rdma IB: Correct MR length field to be 64-bit 2017-09-25 11:47:23 -04:00
scsi scsi: libiscsi: Remove iscsi_destroy_session 2017-10-02 22:23:21 -04:00
soc ARM: SoC driver updates for v4.14 2017-09-10 20:40:00 -07:00
sound sound fixes for 4.14-rc4 2017-10-05 10:39:29 -07:00
target
trace sched/debug: Add explicit TASK_PARKED printing 2017-09-29 11:02:57 +02:00
uapi Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2017-10-09 10:39:52 -07:00
video
xen xen, arm64: drop dummy lookup_address() 2017-09-19 09:25:05 -04:00