android_kernel_msm-6.1_noth.../include
Vladimir Davydov 503c358cf1 list_lru: introduce list_lru_shrink_{count,walk}
Kmem accounting of memcg is unusable now, because it lacks slab shrinker
support.  That means when we hit the limit we will get ENOMEM w/o any
chance to recover.  What we should do then is to call shrink_slab, which
would reclaim old inode/dentry caches from this cgroup.  This is what
this patch set is intended to do.

Basically, it does two things.  First, it introduces the notion of
per-memcg slab shrinker.  A shrinker that wants to reclaim objects per
cgroup should mark itself as SHRINKER_MEMCG_AWARE.  Then it will be
passed the memory cgroup to scan from in shrink_control->memcg.  For
such shrinkers shrink_slab iterates over the whole cgroup subtree under
the target cgroup and calls the shrinker for each kmem-active memory
cgroup.

Secondly, this patch set makes the list_lru structure per-memcg.  It's
done transparently to list_lru users - everything they have to do is to
tell list_lru_init that they want memcg-aware list_lru.  Then the
list_lru will automatically distribute objects among per-memcg lists
basing on which cgroup the object is accounted to.  This way to make FS
shrinkers (icache, dcache) memcg-aware we only need to make them use
memcg-aware list_lru, and this is what this patch set does.

As before, this patch set only enables per-memcg kmem reclaim when the
pressure goes from memory.limit, not from memory.kmem.limit.  Handling
memory.kmem.limit is going to be tricky due to GFP_NOFS allocations, and
it is still unclear whether we will have this knob in the unified
hierarchy.

This patch (of 9):

NUMA aware slab shrinkers use the list_lru structure to distribute
objects coming from different NUMA nodes to different lists.  Whenever
such a shrinker needs to count or scan objects from a particular node,
it issues commands like this:

        count = list_lru_count_node(lru, sc->nid);
        freed = list_lru_walk_node(lru, sc->nid, isolate_func,
                                   isolate_arg, &sc->nr_to_scan);

where sc is an instance of the shrink_control structure passed to it
from vmscan.

To simplify this, let's add special list_lru functions to be used by
shrinkers, list_lru_shrink_count() and list_lru_shrink_walk(), which
consolidate the nid and nr_to_scan arguments in the shrink_control
structure.

This will also allow us to avoid patching shrinkers that use list_lru
when we make shrink_slab() per-memcg - all we will have to do is extend
the shrink_control structure to include the target memcg and make
list_lru_shrink_{count,walk} handle this appropriately.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:08 -08:00
..
acpi ACPICA: Events: Introduce ACPI_GPE_DISPATCH_RAW_HANDLER to fix 2 issues for the current GPE APIs 2015-02-05 15:34:51 +01:00
asm-generic mm: remove remaining references to NUMA hinting bits and helpers 2015-02-12 18:54:08 -08:00
clocksource time: move the timecounter/cyclecounter code into its own file. 2014-12-30 18:29:25 -05:00
crypto crypto: switch af_alg_make_sg() to iov_iter 2015-02-04 01:34:15 -05:00
drm drm/i915: remove unused power_well/get_cdclk_freq api 2015-01-12 02:48:24 +01:00
dt-bindings This is the bulk of pin control changes for the v3.20 cycle: 2015-02-11 11:23:13 -08:00
keys
kvm arm/arm64: KVM: Require in-kernel vgic for the arch timers 2014-12-15 11:50:42 +01:00
linux list_lru: introduce list_lru_shrink_{count,walk} 2015-02-12 18:54:08 -08:00
math-emu
media [media] tea575x: split and export functions 2015-01-27 10:13:50 -02:00
memory
misc
net Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2015-02-11 20:25:11 -08:00
pcmcia
ras
rdma Revert "IB/core: Add support for extended query device caps" 2015-02-06 00:54:33 -08:00
rxrpc
scsi scsi_logging: return void for dev_printk() functions 2015-02-04 08:00:24 -08:00
soc Merge branch 'at91/cleanup5' into next/drivers 2014-12-08 18:29:20 +01:00
sound ALSA: pcm: allow for trigger_tstamp snapshot in .trigger 2015-02-09 16:01:53 +01:00
target target: Drop left-over fabric_max_sectors attribute 2015-01-09 15:22:05 -08:00
trace IOMMU Updates for Linux v3.20 2015-02-12 09:16:56 -08:00
uapi mm: convert p[te|md]_numa users to p[te|md]_protnone_numa 2015-02-12 18:54:08 -08:00
video OMAPDSS: add define for DRA7xx HW version 2015-02-04 12:32:04 +02:00
xen Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-02-10 20:01:30 -08:00
Kbuild