android_kernel_msm-6.1_noth.../include
Yu Zhao 3f963725af UPSTREAM: mm: multi-gen LRU: rename lru_gen_struct to lru_gen_folio
Patch series "mm: multi-gen LRU: memcg LRU", v3.

Overview
========

An memcg LRU is a per-node LRU of memcgs.  It is also an LRU of LRUs,
since each node and memcg combination has an LRU of folios (see
mem_cgroup_lruvec()).

Its goal is to improve the scalability of global reclaim, which is
critical to system-wide memory overcommit in data centers.  Note that
memcg reclaim is currently out of scope.

Its memory bloat is a pointer to each lruvec and negligible to each
pglist_data.  In terms of traversing memcgs during global reclaim, it
improves the best-case complexity from O(n) to O(1) and does not affect
the worst-case complexity O(n).  Therefore, on average, it has a sublinear
complexity in contrast to the current linear complexity.

The basic structure of an memcg LRU can be understood by an analogy to
the active/inactive LRU (of folios):
1. It has the young and the old (generations), i.e., the counterparts
   to the active and the inactive;
2. The increment of max_seq triggers promotion, i.e., the counterpart
   to activation;
3. Other events trigger similar operations, e.g., offlining an memcg
   triggers demotion, i.e., the counterpart to deactivation.

In terms of global reclaim, it has two distinct features:
1. Sharding, which allows each thread to start at a random memcg (in
   the old generation) and improves parallelism;
2. Eventual fairness, which allows direct reclaim to bail out at will
   and reduces latency without affecting fairness over some time.

The commit message in patch 6 details the workflow:
https://lore.kernel.org/r/20221222041905.2431096-7-yuzhao@google.com/

The following is a simple test to quickly verify its effectiveness.

  Test design:
  1. Create multiple memcgs.
  2. Each memcg contains a job (fio).
  3. All jobs access the same amount of memory randomly.
  4. The system does not experience global memory pressure.
  5. Periodically write to the root memory.reclaim.

  Desired outcome:
  1. All memcgs have similar pgsteal counts, i.e., stddev(pgsteal)
     over mean(pgsteal) is close to 0%.
  2. The total pgsteal is close to the total requested through
     memory.reclaim, i.e., sum(pgsteal) over sum(requested) is close
     to 100%.

  Actual outcome [1]:
                                     MGLRU off    MGLRU on
  stddev(pgsteal) / mean(pgsteal)    75%          20%
  sum(pgsteal) / sum(requested)      425%         95%

  ####################################################################
  MEMCGS=128

  for ((memcg = 0; memcg < $MEMCGS; memcg++)); do
      mkdir /sys/fs/cgroup/memcg$memcg
  done

  start() {
      echo $BASHPID > /sys/fs/cgroup/memcg$memcg/cgroup.procs

      fio -name=memcg$memcg --numjobs=1 --ioengine=mmap \
          --filename=/dev/zero --size=1920M --rw=randrw \
          --rate=64m,64m --random_distribution=random \
          --fadvise_hint=0 --time_based --runtime=10h \
          --group_reporting --minimal
  }

  for ((memcg = 0; memcg < $MEMCGS; memcg++)); do
      start &
  done

  sleep 600

  for ((i = 0; i < 600; i++)); do
      echo 256m >/sys/fs/cgroup/memory.reclaim
      sleep 6
  done

  for ((memcg = 0; memcg < $MEMCGS; memcg++)); do
      grep "pgsteal " /sys/fs/cgroup/memcg$memcg/memory.stat
  done
  ####################################################################

[1]: This was obtained from running the above script (touches less
     than 256GB memory) on an EPYC 7B13 with 512GB DRAM for over an
     hour.

This patch (of 8):

The new name lru_gen_folio will be more distinct from the coming
lru_gen_memcg.

Link: https://lkml.kernel.org/r/20221222041905.2431096-1-yuzhao@google.com
Link: https://lkml.kernel.org/r/20221222041905.2431096-2-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Larabel <Michael@MichaelLarabel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Bug: 274865848
(cherry picked from commit 391655fe08d1f942359a11148aa9aaf3f99d6d6f)
Change-Id: I7df67e0e2435ba28f10eaa57d28d98b61a9210a6
Signed-off-by: T.J. Mercier <tjmercier@google.com>
2023-03-30 10:23:50 +00:00
..
acpi PCI/ACPI: Account for _S0W of the target bridge in acpi_pci_bridge_d3() 2023-03-11 13:55:33 +01:00
asm-generic This is the 6.1.14 stable release 2023-02-25 15:37:47 +00:00
clocksource
crypto crypto: scatterwalk - Remove unused inline function scatterwalk_aligned() 2022-09-30 13:59:13 +08:00
drm Revert "drm/msm/gem: Prevent blocking within shrinker loop" 2023-03-24 16:51:25 +00:00
dt-bindings dt-bindings: clocks: imx8mp: Add ID for usb suspend clock 2022-12-31 13:33:09 +01:00
keys
kunit kunit: fix kunit_test_init_section_suites(...) 2023-02-09 11:28:08 +01:00
kvm ANDROID: KVM: arm64: Move some kvm_psci functions to a shared header 2022-12-15 16:12:56 +00:00
linux UPSTREAM: mm: multi-gen LRU: rename lru_gen_struct to lru_gen_folio 2023-03-30 10:23:50 +00:00
math-emu
media Merge 6.1.18 into android14-6.1 2023-03-21 08:22:15 +00:00
memory memory: renesas-rpc-if: Split-off private data from struct rpcif 2023-03-11 13:55:17 +01:00
misc
net UPSTREAM: wifi: cfg80211: include puncturing bitmap in channel switch events 2023-03-30 10:23:50 +00:00
pcmcia
ras
rdma RDMA/core: Add UVERBS_ATTR_RAW_FD 2022-09-27 10:15:24 -03:00
rv
scsi Revert "scsi: core: Add BLIST_NO_VPD_SIZE for some VDASD" 2023-03-24 16:51:25 +00:00
soc ARM: at91: pm: avoid soft resetting AC DLL 2022-11-01 12:25:19 +02:00
sound Merge 6.1.16 into android14-6.1 2023-03-13 15:45:34 +00:00
target
trace Revert "ANDROID: module: Add vendor hooks" 2023-03-29 08:05:07 +00:00
uapi UPSTREAM: wifi: nl80211: Allow authentication frames and set keys on NAN interface 2023-03-30 10:23:50 +00:00
ufs UPSTREAM: scsi: ufs: core: Initialize devfreq synchronously 2023-03-15 16:17:14 +00:00
vdso
video
xen xen/virtio: enable grant based virtio on x86 2022-10-10 14:31:26 +02:00
OWNERS