Merge e15e06a839 ("lib/test_maple_tree: add testing for maple tree") into android-mainline
Steps on the way to 6.1-rc1

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id639701170c1b427abbd641d3f133dbdc1d93e06
This commit is contained in:
commit cd41ea0975
20 changed files with 46737 additions and 7 deletions
Documentation/core-api/index.rst

@@ -37,6 +37,7 @@ Library functionality that is used throughout the kernel.
    kref
    assoc_array
    xarray
+   maple_tree
    idr
    circular-buffers
    rbtree
217 Documentation/core-api/maple_tree.rst (new file)
@@ -0,0 +1,217 @@
.. SPDX-License-Identifier: GPL-2.0+


==========
Maple Tree
==========

:Author: Liam R. Howlett

Overview
========

The Maple Tree is a B-Tree data type which is optimized for storing
non-overlapping ranges, including ranges of size 1.  The tree was designed to
be simple to use and does not require a user written search method.  It
supports iterating over a range of entries and going to the previous or next
entry in a cache-efficient manner.  The tree can also be put into an RCU-safe
mode of operation which allows reading and writing concurrently.  Writers must
synchronize on a lock, which can be the default spinlock, or the user can set
the lock to an external lock of a different type.

The Maple Tree maintains a small memory footprint and was designed to use
modern processor cache efficiently.  Most users will be able to use the normal
API.  An :ref:`maple-tree-advanced-api` exists for more complex scenarios.
The most important user of the Maple Tree is the tracking of virtual memory
areas.

The Maple Tree can store values between ``0`` and ``ULONG_MAX``.  The Maple
Tree reserves values with the bottom two bits set to '10' which are below 4096
(ie 2, 6, 10 .. 4094) for internal use.  If the entries may use reserved
values, then the users can convert the entries using xa_mk_value() and convert
them back by calling xa_to_value().  If the user needs to use a reserved
value, then the user can convert the value when using the
:ref:`maple-tree-advanced-api`; reserved values are rejected by the normal API.

The Maple Tree can also be configured to support searching for a gap of a
given size (or larger).

Pre-allocating of nodes is also supported using the
:ref:`maple-tree-advanced-api`.  This is useful for users who must guarantee a
successful store operation within a given code segment when allocating cannot
be done.  Allocations of nodes are relatively small at around 256 bytes.

.. _maple-tree-normal-api:

Normal API
==========

Start by initialising a maple tree, either with DEFINE_MTREE() for statically
allocated maple trees or mt_init() for dynamically allocated ones.  A
freshly-initialised maple tree contains a ``NULL`` pointer for the range ``0``
- ``ULONG_MAX``.  There are currently two types of maple trees supported: the
allocation tree and the regular tree.  The regular tree has a higher branching
factor for internal nodes.  The allocation tree has a lower branching factor
but allows the user to search for a gap of a given size or larger from either
``0`` upwards or ``ULONG_MAX`` down.  An allocation tree can be used by
passing in the ``MT_FLAGS_ALLOC_RANGE`` flag when initialising the tree.

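For example, a minimal sketch of defining one of each kind of tree (the
variable names are illustrative only)::

        #include <linux/maple_tree.h>

        static DEFINE_MTREE(regular_tree);      /* regular tree, statically defined */
        static struct maple_tree alloc_tree;    /* allocation tree, set up at runtime */

        static void trees_init(void)
        {
                mt_init_flags(&alloc_tree, MT_FLAGS_ALLOC_RANGE);
        }
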
You can then set entries using mtree_store() or mtree_store_range().
mtree_store() will overwrite any entry with the new entry and return 0 on
success or an error code otherwise.  mtree_store_range() works in the same way
but takes a range.  mtree_load() is used to retrieve the entry stored at a
given index.  You can use mtree_erase() to erase an entire range by only
knowing one value within that range, or call mtree_store() with an entry of
NULL to partially erase a range or many ranges at once.

If you want to only store a new entry to a range (or index) if that range is
currently ``NULL``, you can use mtree_insert_range() or mtree_insert(), which
return -EEXIST if the range is not empty.

You can search for an entry from an index upwards by using mt_find().

You can walk each entry within a range by calling mt_for_each().  You must
provide a temporary variable to store a cursor.  If you want to walk each
element of the tree then ``0`` and ``ULONG_MAX`` may be used as the range.  If
the caller is going to hold the lock for the duration of the walk then it is
worth looking at the mas_for_each() API in the :ref:`maple-tree-advanced-api`
section.  A sketch of the basic store/load/iterate flow follows.

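A minimal sketch of the normal API (the tree, range and value here are
illustrative)::

        static DEFINE_MTREE(mt);

        static int example_normal_api(void)
        {
                void *entry;
                unsigned long index = 0;
                int ret;

                /* Store one entry over the range 10-20. */
                ret = mtree_store_range(&mt, 10, 20, xa_mk_value(42), GFP_KERNEL);
                if (ret)
                        return ret;

                /* Any index inside the range returns the same entry. */
                entry = mtree_load(&mt, 15);

                /* Walk every entry; 'index' is the cursor. */
                mt_for_each(&mt, entry, index, ULONG_MAX)
                        pr_info("index %lu: value %lu\n", index, xa_to_value(entry));

                mtree_destroy(&mt);
                return 0;
        }
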
Sometimes it is necessary to ensure the next call to store to a maple tree
does not allocate memory; please see :ref:`maple-tree-advanced-api` for this
use case.

Finally, you can remove all entries from a maple tree by calling
mtree_destroy().  If the maple tree entries are pointers, you may wish to free
the entries first.

Allocating Nodes
----------------

The allocations are handled by the internal tree code.  See
:ref:`maple-tree-advanced-alloc` for other options.

Locking
-------

You do not have to worry about locking.  See :ref:`maple-tree-advanced-locks`
for other options.

The Maple Tree uses RCU and an internal spinlock to synchronise access:

Takes RCU read lock:
 * mtree_load()
 * mt_find()
 * mt_for_each()
 * mt_next()
 * mt_prev()

Takes ma_lock internally:
 * mtree_store()
 * mtree_store_range()
 * mtree_insert()
 * mtree_insert_range()
 * mtree_erase()
 * mtree_destroy()
 * mt_set_in_rcu()
 * mt_clear_in_rcu()

If you want to take advantage of the internal lock to protect the data
structures that you are storing in the Maple Tree, you can call mtree_lock()
before calling mtree_load(), then take a reference count on the object you
have found before calling mtree_unlock().  This will prevent stores from
removing the object from the tree between looking up the object and
incrementing the refcount.  You can also use RCU to avoid dereferencing
freed memory, but an explanation of that is beyond the scope of this
document.  A sketch of the lock-and-refcount pattern follows.

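A minimal sketch, assuming the stored entries are objects with a reference
count (``struct my_obj`` and this helper are hypothetical)::

        struct my_obj {
                refcount_t ref;
                /* ... payload ... */
        };

        /* Look up the object at @index and pin it against removal. */
        static struct my_obj *my_obj_get(struct maple_tree *mt, unsigned long index)
        {
                struct my_obj *obj;

                mtree_lock(mt);
                obj = mtree_load(mt, index);
                if (obj && !refcount_inc_not_zero(&obj->ref))
                        obj = NULL;
                mtree_unlock(mt);

                return obj;
        }
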
.. _maple-tree-advanced-api:

Advanced API
============

The advanced API offers more flexibility and better performance at the
cost of an interface which can be harder to use and has fewer safeguards.
You must take care of your own locking while using the advanced API.
You can use the ma_lock, RCU or an external lock for protection.
You can mix advanced and normal operations on the same array, as long
as the locking is compatible.  The :ref:`maple-tree-normal-api` is implemented
in terms of the advanced API.

The advanced API is based around the ma_state; this is where the 'mas'
prefix originates.  The ma_state struct keeps track of tree operations to make
life easier for both internal and external tree users.

Initialising the maple tree is the same as in the :ref:`maple-tree-normal-api`.
Please see above.

The maple state keeps track of the range start and end in mas->index and
mas->last, respectively.

mas_walk() will walk the tree to the location of mas->index and set the
mas->index and mas->last according to the range for the entry.

You can set entries using mas_store().  mas_store() will overwrite any entry
with the new entry and return the first existing entry that is overwritten.
The range is passed in as members of the maple state: index and last.

You can use mas_erase() to erase an entire range by setting index and
last of the maple state to the desired range to erase.  This will erase
the first range that is found in that range, set the maple state index
and last as the range that was erased, and return the entry that existed
at that location.

You can walk each entry within a range by using mas_for_each().  If you want
to walk each element of the tree then ``0`` and ``ULONG_MAX`` may be used as
the range.  If the lock needs to be periodically dropped, see mas_pause() in
the locking notes below.  A sketch of the iteration pattern follows.

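A minimal sketch of iterating under the internal lock (the tree and the
printout are illustrative)::

        static void example_mas_iterate(struct maple_tree *mt)
        {
                void *entry;
                MA_STATE(mas, mt, 0, 0);        /* cursor starting at index 0 */

                mas_lock(&mas);
                mas_for_each(&mas, entry, ULONG_MAX) {
                        /* mas.index and mas.last hold the range of this entry. */
                        pr_info("range %lu-%lu: %p\n", mas.index, mas.last, entry);
                }
                mas_unlock(&mas);
        }
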
Using a maple state allows mas_next() and mas_prev() to function as if the
tree was a linked list.  With such a high branching factor the amortized
performance penalty is outweighed by cache optimization.  mas_next() will
return the next entry which occurs after the entry at index.  mas_prev()
will return the previous entry which occurs before the entry at index.

mas_find() will find the first entry which exists at or above index on
the first call, and the next entry from every subsequent call.

mas_find_rev() will find the first entry which exists at or below the last on
the first call, and the previous entry from every subsequent call.

If the user needs to yield the lock during an operation, then the maple state
must be paused using mas_pause().

There are a few extra interfaces provided when using an allocation tree.
If you wish to search for a gap within a range, then mas_empty_area()
or mas_empty_area_rev() can be used.  mas_empty_area() searches for a gap
starting at the lowest index given up to the maximum of the range.
mas_empty_area_rev() searches for a gap starting at the highest index given
and continues downward to the lower bound of the range.

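A sketch of the gap search, assuming a tree initialised with
``MT_FLAGS_ALLOC_RANGE`` (the bounds and size are illustrative)::

        /* Find a gap of 16 free slots somewhere in [0, 1023]. */
        static int example_find_gap(struct maple_tree *mt, unsigned long *start)
        {
                MA_STATE(mas, mt, 0, 0);
                int ret;

                mas_lock(&mas);
                ret = mas_empty_area(&mas, 0, 1023, 16);
                mas_unlock(&mas);

                if (!ret)
                        *start = mas.index;     /* first index of the gap */
                return ret;
        }
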
.. _maple-tree-advanced-alloc:

Advanced Allocating Nodes
-------------------------

Allocations are usually handled internally to the tree; however, if
allocations need to occur before a write occurs, then calling
mas_expected_entries() will allocate the worst-case number of nodes needed to
insert the provided number of ranges.  This also causes the tree to enter mass
insertion mode.  Once insertions are complete, calling mas_destroy() on the
maple state will free the unused allocations.

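A sketch of the preallocation pattern (the arrays of ranges and entries are
illustrative)::

        static int example_bulk_store(struct ma_state *mas, void **entries,
                                      unsigned long *first, unsigned long *last,
                                      unsigned long nr)
        {
                unsigned long i;
                int ret;

                /* Preallocate the worst-case number of nodes for nr ranges. */
                ret = mas_expected_entries(mas, nr);
                if (ret)
                        return ret;

                for (i = 0; i < nr; i++) {
                        mas_set_range(mas, first[i], last[i]);
                        mas_store(mas, entries[i]);
                }

                mas_destroy(mas);       /* release unused preallocations */
                return 0;
        }
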
.. _maple-tree-advanced-locks:

Advanced Locking
----------------

The maple tree uses a spinlock by default, but external locks can be used for
tree updates as well.  To use an external lock, the tree must be initialized
with the ``MT_FLAGS_LOCK_EXTERN`` flag; this is usually done with the
MTREE_INIT_EXT() #define, which takes an external lock as an argument.

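A sketch of an externally locked tree (the mutex is illustrative; any lock
with lockdep support can be supplied)::

        static DEFINE_MUTEX(my_lock);
        static struct maple_tree my_tree =
                MTREE_INIT_EXT(my_tree, MT_FLAGS_LOCK_EXTERN, my_lock);
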
Functions and structures
========================

.. kernel-doc:: include/linux/maple_tree.h
.. kernel-doc:: lib/maple_tree.c
12 MAINTAINERS
@@ -12175,6 +12175,18 @@ L:	linux-man@vger.kernel.org
 S:	Maintained
 W:	http://www.kernel.org/doc/man-pages

+MAPLE TREE
+M:	Liam R. Howlett <Liam.Howlett@oracle.com>
+L:	linux-mm@kvack.org
+S:	Supported
+F:	Documentation/core-api/maple_tree.rst
+F:	include/linux/maple_tree.h
+F:	include/trace/events/maple_tree.h
+F:	lib/maple_tree.c
+F:	lib/test_maple_tree.c
+F:	tools/testing/radix-tree/linux/maple_tree.h
+F:	tools/testing/radix-tree/maple.c
+
 MARDUK (CREATOR CI40) DEVICE TREE SUPPORT
 M:	Rahul Bedarkar <rahulbedarkar89@gmail.com>
 L:	linux-mips@vger.kernel.org
685 include/linux/maple_tree.h (new file)
@@ -0,0 +1,685 @@
/* SPDX-License-Identifier: GPL-2.0+ */
#ifndef _LINUX_MAPLE_TREE_H
#define _LINUX_MAPLE_TREE_H
/*
 * Maple Tree - An RCU-safe adaptive tree for storing ranges
 * Copyright (c) 2018-2022 Oracle
 * Authors:     Liam R. Howlett <Liam.Howlett@Oracle.com>
 *              Matthew Wilcox <willy@infradead.org>
 */

#include <linux/kernel.h>
#include <linux/rcupdate.h>
#include <linux/spinlock.h>
/* #define CONFIG_MAPLE_RCU_DISABLED */
/* #define CONFIG_DEBUG_MAPLE_TREE_VERBOSE */

/*
 * Allocated nodes are mutable until they have been inserted into the tree,
 * at which time they cannot change their type until they have been removed
 * from the tree and an RCU grace period has passed.
 *
 * Removed nodes have their ->parent set to point to themselves.  RCU readers
 * check ->parent before relying on the value that they loaded from the
 * slots array.  This lets us reuse the slots array for the RCU head.
 *
 * Nodes in the tree point to their parent unless bit 0 is set.
 */
#if defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64)
/* 64bit sizes */
#define MAPLE_NODE_SLOTS        31      /* 256 bytes including ->parent */
#define MAPLE_RANGE64_SLOTS     16      /* 256 bytes */
#define MAPLE_ARANGE64_SLOTS    10      /* 240 bytes */
#define MAPLE_ARANGE64_META_MAX 15      /* Out of range for metadata */
#define MAPLE_ALLOC_SLOTS       (MAPLE_NODE_SLOTS - 1)
#else
/* 32bit sizes */
#define MAPLE_NODE_SLOTS        63      /* 256 bytes including ->parent */
#define MAPLE_RANGE64_SLOTS     32      /* 256 bytes */
#define MAPLE_ARANGE64_SLOTS    21      /* 240 bytes */
#define MAPLE_ARANGE64_META_MAX 31      /* Out of range for metadata */
#define MAPLE_ALLOC_SLOTS       (MAPLE_NODE_SLOTS - 2)
#endif /* defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64) */

#define MAPLE_NODE_MASK         255UL

/*
 * The node->parent of the root node has bit 0 set and the rest of the pointer
 * is a pointer to the tree itself.  No more bits are available in this
 * pointer (on m68k, the data structure may only be 2-byte aligned).
 *
 * Internal non-root nodes can only have maple_range_* nodes as parents.  The
 * parent pointer is 256B aligned like all other tree nodes.  When storing 32
 * or 64 bit values, the offset can fit into 4 bits.  The 16 bit values need
 * an extra bit to store the offset.  This extra bit comes from a reuse of the
 * last bit in the node type.  This is possible by using bit 1 to indicate if
 * bit 2 is part of the type or the slot.
 *
 * Once the type is decided, the decision of an allocation range type or a
 * range type is done by examining the immutable tree flag for the
 * MAPLE_ALLOC_RANGE flag.
 *
 * Node types:
 *  0x??1 = Root
 *  0x?00 = 16 bit nodes
 *  0x010 = 32 bit nodes
 *  0x110 = 64 bit nodes
 *
 * Slot size and location in the parent pointer:
 *  type  : slot location
 *  0x??1 : Root
 *  0x?00 : 16 bit values, type in 0-1, slot in 2-6
 *  0x010 : 32 bit values, type in 0-2, slot in 3-6
 *  0x110 : 64 bit values, type in 0-2, slot in 3-6
 */

/*
 * This metadata is used to optimize the gap updating code and in reverse
 * searching for gaps or any other code that needs to find the end of the
 * data.
 */
struct maple_metadata {
        unsigned char end;
        unsigned char gap;
};

/*
 * Leaf nodes do not store pointers to nodes, they store user data.  Users may
 * store almost any bit pattern.  As noted above, the optimisation of storing
 * an entry at 0 in the root pointer cannot be done for data which have the
 * bottom two bits set to '10'.  We also reserve values with the bottom two
 * bits set to '10' which are below 4096 (ie 2, 6, 10 .. 4094) for internal
 * use.  Some APIs return errnos as a negative errno shifted right by two bits
 * and the bottom two bits set to '10', and while choosing to store these
 * values in the array is not an error, it may lead to confusion if you're
 * testing for an error with mas_is_err().
 *
 * Non-leaf nodes store the type of the node pointed to (enum maple_type in
 * bits 3-6), bit 2 is reserved.  That leaves bits 0-1 unused for now.
 *
 * In regular B-Tree terms, pivots are called keys.  The term pivot is used to
 * indicate that the tree is specifying ranges.  Pivots may appear in the
 * subtree with an entry attached to the value, whereas keys are unique to a
 * specific position of a B-tree.  Pivot values are inclusive of the slot with
 * the same index.
 */

struct maple_range_64 {
        struct maple_pnode *parent;
        unsigned long pivot[MAPLE_RANGE64_SLOTS - 1];
        union {
                void __rcu *slot[MAPLE_RANGE64_SLOTS];
                struct {
                        void __rcu *pad[MAPLE_RANGE64_SLOTS - 1];
                        struct maple_metadata meta;
                };
        };
};

/*
 * At tree creation time, the user can specify that they're willing to trade
 * off storing fewer entries in a tree in return for storing more information
 * in each node.
 *
 * The maple tree supports recording the largest range of NULL entries
 * available in this node, also called gaps.  This optimises the tree for
 * allocating a range.
 */
struct maple_arange_64 {
        struct maple_pnode *parent;
        unsigned long pivot[MAPLE_ARANGE64_SLOTS - 1];
        void __rcu *slot[MAPLE_ARANGE64_SLOTS];
        unsigned long gap[MAPLE_ARANGE64_SLOTS];
        struct maple_metadata meta;
};

struct maple_alloc {
        unsigned long total;
        unsigned char node_count;
        unsigned int request_count;
        struct maple_alloc *slot[MAPLE_ALLOC_SLOTS];
};

struct maple_topiary {
        struct maple_pnode *parent;
        struct maple_enode *next;       /* Overlaps the pivot */
};

enum maple_type {
        maple_dense,
        maple_leaf_64,
        maple_range_64,
        maple_arange_64,
};


/**
 * DOC: Maple tree flags
 *
 * * MT_FLAGS_ALLOC_RANGE   - Track gaps in this tree
 * * MT_FLAGS_USE_RCU       - Operate in RCU mode
 * * MT_FLAGS_HEIGHT_OFFSET - The position of the tree height in the flags
 * * MT_FLAGS_HEIGHT_MASK   - The mask for the maple tree height value
 * * MT_FLAGS_LOCK_MASK     - How the mt_lock is used
 * * MT_FLAGS_LOCK_IRQ      - Acquired irq-safe
 * * MT_FLAGS_LOCK_BH       - Acquired bh-safe
 * * MT_FLAGS_LOCK_EXTERN   - mt_lock is not used
 *
 * MAPLE_HEIGHT_MAX	The largest height that can be stored
 */
#define MT_FLAGS_ALLOC_RANGE    0x01
#define MT_FLAGS_USE_RCU        0x02
#define MT_FLAGS_HEIGHT_OFFSET  0x02
#define MT_FLAGS_HEIGHT_MASK    0x7C
#define MT_FLAGS_LOCK_MASK      0x300
#define MT_FLAGS_LOCK_IRQ       0x100
#define MT_FLAGS_LOCK_BH        0x200
#define MT_FLAGS_LOCK_EXTERN    0x300

#define MAPLE_HEIGHT_MAX        31


#define MAPLE_NODE_TYPE_MASK    0x0F
#define MAPLE_NODE_TYPE_SHIFT   0x03

#define MAPLE_RESERVED_RANGE    4096

#ifdef CONFIG_LOCKDEP
typedef struct lockdep_map *lockdep_map_p;
#define mt_lock_is_held(mt)     lock_is_held(mt->ma_external_lock)
#define mt_set_external_lock(mt, lock)                                  \
        (mt)->ma_external_lock = &(lock)->dep_map
#else
typedef struct { /* nothing */ } lockdep_map_p;
#define mt_lock_is_held(mt)     1
#define mt_set_external_lock(mt, lock)  do { } while (0)
#endif

/*
 * If the tree contains a single entry at index 0, it is usually stored in
 * tree->ma_root.  To optimise for the page cache, an entry which ends in
 * '00', '01' or '11' is stored in the root, but an entry which ends in '10'
 * will be stored in a node.  Bits 3-6 are used to store enum maple_type.
 *
 * The flags are used both to store some immutable information about this tree
 * (set at tree creation time) and dynamic information set under the spinlock.
 *
 * Another use of flags is to indicate global states of the tree.  This is the
 * case with the MAPLE_USE_RCU flag, which indicates the tree is currently in
 * RCU mode.  This mode was added to allow the tree to reuse nodes instead of
 * re-allocating and RCU freeing nodes when there is a single user.
 */
struct maple_tree {
        union {
                spinlock_t      ma_lock;
                lockdep_map_p   ma_external_lock;
        };
        void __rcu      *ma_root;
        unsigned int    ma_flags;
};

/**
 * MTREE_INIT() - Initialize a maple tree
 * @name: The maple tree name
 * @__flags: The maple tree flags
 *
 */
#define MTREE_INIT(name, __flags) {                                     \
        .ma_lock = __SPIN_LOCK_UNLOCKED((name).ma_lock),                \
        .ma_flags = __flags,                                            \
        .ma_root = NULL,                                                \
}

/**
 * MTREE_INIT_EXT() - Initialize a maple tree with an external lock.
 * @name: The tree name
 * @__flags: The maple tree flags
 * @__lock: The external lock
 */
#ifdef CONFIG_LOCKDEP
#define MTREE_INIT_EXT(name, __flags, __lock) {                         \
        .ma_external_lock = &(__lock).dep_map,                          \
        .ma_flags = (__flags),                                          \
        .ma_root = NULL,                                                \
}
#else
#define MTREE_INIT_EXT(name, __flags, __lock)   MTREE_INIT(name, __flags)
#endif

#define DEFINE_MTREE(name)                                              \
        struct maple_tree name = MTREE_INIT(name, 0)

#define mtree_lock(mt)          spin_lock((&(mt)->ma_lock))
#define mtree_unlock(mt)        spin_unlock((&(mt)->ma_lock))

/*
 * The Maple Tree squeezes various bits in at various points which aren't
 * necessarily obvious.  Usually, this is done by observing that pointers are
 * N-byte aligned and thus the bottom log_2(N) bits are available for use.  We
 * don't use the high bits of pointers to store additional information because
 * we don't know what bits are unused on any given architecture.
 *
 * Nodes are 256 bytes in size and are also aligned to 256 bytes, giving us 8
 * low bits for our own purposes.  Nodes are currently of 4 types:
 * 1. Single pointer (Range is 0-0)
 * 2. Non-leaf Allocation Range nodes
 * 3. Non-leaf Range nodes
 * 4. Leaf Range nodes
 * All nodes consist of a number of node slots, pivots, and a parent pointer.
 */

struct maple_node {
        union {
                struct {
                        struct maple_pnode *parent;
                        void __rcu *slot[MAPLE_NODE_SLOTS];
                };
                struct {
                        void *pad;
                        struct rcu_head rcu;
                        struct maple_enode *piv_parent;
                        unsigned char parent_slot;
                        enum maple_type type;
                        unsigned char slot_len;
                        unsigned int ma_flags;
                };
                struct maple_range_64 mr64;
                struct maple_arange_64 ma64;
                struct maple_alloc alloc;
        };
};

/*
 * More complicated stores can cause two nodes to become one or three and
 * potentially alter the height of the tree.  Either half of the tree may need
 * to be rebalanced against the other.  The ma_topiary struct is used to track
 * which nodes have been 'cut' from the tree so that the change can be done
 * safely at a later date.  This is done to support RCU.
 */
struct ma_topiary {
        struct maple_enode *head;
        struct maple_enode *tail;
        struct maple_tree *mtree;
};

void *mtree_load(struct maple_tree *mt, unsigned long index);

int mtree_insert(struct maple_tree *mt, unsigned long index,
                void *entry, gfp_t gfp);
int mtree_insert_range(struct maple_tree *mt, unsigned long first,
                unsigned long last, void *entry, gfp_t gfp);
int mtree_alloc_range(struct maple_tree *mt, unsigned long *startp,
                void *entry, unsigned long size, unsigned long min,
                unsigned long max, gfp_t gfp);
int mtree_alloc_rrange(struct maple_tree *mt, unsigned long *startp,
                void *entry, unsigned long size, unsigned long min,
                unsigned long max, gfp_t gfp);

int mtree_store_range(struct maple_tree *mt, unsigned long first,
                unsigned long last, void *entry, gfp_t gfp);
int mtree_store(struct maple_tree *mt, unsigned long index,
                void *entry, gfp_t gfp);
void *mtree_erase(struct maple_tree *mt, unsigned long index);

void mtree_destroy(struct maple_tree *mt);
void __mt_destroy(struct maple_tree *mt);

/**
 * mtree_empty() - Determine if a tree has any present entries.
 * @mt: Maple Tree.
 *
 * Context: Any context.
 * Return: %true if the tree contains only NULL pointers.
 */
static inline bool mtree_empty(const struct maple_tree *mt)
{
        return mt->ma_root == NULL;
}

/* Advanced API */

/*
 * The maple state is defined in the struct ma_state and is used to keep track
 * of information during operations, and even between operations when using
 * the advanced API.
 *
 * If state->node has bit 0 set then it references a tree location which is
 * not a node (eg the root).  If bit 1 is set, the rest of the bits are a
 * negative errno.  Bit 2 (the 'unallocated slots' bit) is clear.  Bits 3-6
 * indicate the node type.
 *
 * state->alloc either has a request number of nodes or an allocated node.  If
 * state->alloc has a requested number of nodes, the first bit will be set
 * (0x1) and the remaining bits are the value.  If state->alloc is a node,
 * then the node will be of type maple_alloc.  maple_alloc has
 * MAPLE_NODE_SLOTS - 1 for storing more allocated nodes, a total number of
 * nodes allocated, and the node_count in this node.  node_count is the number
 * of allocated nodes in this node.  The scaling beyond MAPLE_NODE_SLOTS - 1
 * is handled by storing further nodes into state->alloc->slot[0]'s node.
 * Nodes are taken from state->alloc by removing a node from the state->alloc
 * node until state->alloc->node_count is 1, when state->alloc is returned and
 * the state->alloc->slot[0] is promoted to state->alloc.  Nodes are pushed
 * onto state->alloc by putting the current state->alloc into the pushed
 * node's slot[0].
 *
 * The state also contains the implied min/max of the state->node, the depth
 * of this search, and the offset.  The implied min/max are either from the
 * parent node or are 0-oo for the root node.  The depth is incremented or
 * decremented every time a node is walked down or up.  The offset is the
 * slot/pivot of interest in the node - either for reading or writing.
 *
 * When returning a value the maple state index and last respectively contain
 * the start and end of the range for the entry.  Ranges are inclusive in the
 * Maple Tree.
 */
struct ma_state {
        struct maple_tree *tree;        /* The tree we're operating in */
        unsigned long index;            /* The index we're operating on - range start */
        unsigned long last;             /* The last index we're operating on - range end */
        struct maple_enode *node;       /* The node containing this entry */
        unsigned long min;              /* The minimum index of this node - implied pivot min */
        unsigned long max;              /* The maximum index of this node - implied pivot max */
        struct maple_alloc *alloc;      /* Allocated nodes for this operation */
        unsigned char depth;            /* depth of tree descent during write */
        unsigned char offset;
        unsigned char mas_flags;
};

struct ma_wr_state {
        struct ma_state *mas;
        struct maple_node *node;        /* Decoded mas->node */
        unsigned long r_min;            /* range min */
        unsigned long r_max;            /* range max */
        enum maple_type type;           /* mas->node type */
        unsigned char offset_end;       /* The offset where the write ends */
        unsigned char node_end;         /* mas->node end */
        unsigned long *pivots;          /* mas->node->pivots pointer */
        unsigned long end_piv;          /* The pivot at the offset end */
        void __rcu **slots;             /* mas->node->slots pointer */
        void *entry;                    /* The entry to write */
        void *content;                  /* The existing entry that is being overwritten */
};

#define mas_lock(mas)           spin_lock(&((mas)->tree->ma_lock))
#define mas_unlock(mas)         spin_unlock(&((mas)->tree->ma_lock))


/*
 * Special values for ma_state.node.
 * MAS_START means we have not searched the tree.
 * MAS_ROOT means we have searched the tree and the entry we found lives in
 * the root of the tree (ie it has index 0, length 1 and is the only entry in
 * the tree).
 * MAS_NONE means we have searched the tree and there is no node in the
 * tree for this entry.  For example, we searched for index 1 in an empty
 * tree.  Or we have a tree which points to a full leaf node and we
 * searched for an entry which is larger than can be contained in that
 * leaf node.
 * MA_ERROR represents an errno.  After dropping the lock and attempting
 * to resolve the error, the walk would have to be restarted from the
 * top of the tree as the tree may have been modified.
 */
#define MAS_START       ((struct maple_enode *)1UL)
#define MAS_ROOT        ((struct maple_enode *)5UL)
#define MAS_NONE        ((struct maple_enode *)9UL)
#define MAS_PAUSE       ((struct maple_enode *)17UL)
#define MA_ERROR(err) \
                ((struct maple_enode *)(((unsigned long)err << 2) | 2UL))

#define MA_STATE(name, mt, first, end)                                  \
        struct ma_state name = {                                        \
                .tree = mt,                                             \
                .index = first,                                         \
                .last = end,                                            \
                .node = MAS_START,                                      \
                .min = 0,                                               \
                .max = ULONG_MAX,                                       \
                .alloc = NULL,                                          \
        }

#define MA_WR_STATE(name, ma_state, wr_entry)                           \
        struct ma_wr_state name = {                                     \
                .mas = ma_state,                                        \
                .content = NULL,                                        \
                .entry = wr_entry,                                      \
        }

#define MA_TOPIARY(name, tree)                                          \
        struct ma_topiary name = {                                      \
                .head = NULL,                                           \
                .tail = NULL,                                           \
                .mtree = tree,                                          \
        }

void *mas_walk(struct ma_state *mas);
void *mas_store(struct ma_state *mas, void *entry);
void *mas_erase(struct ma_state *mas);
int mas_store_gfp(struct ma_state *mas, void *entry, gfp_t gfp);
void mas_store_prealloc(struct ma_state *mas, void *entry);
void *mas_find(struct ma_state *mas, unsigned long max);
void *mas_find_rev(struct ma_state *mas, unsigned long min);
int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp);
bool mas_is_err(struct ma_state *mas);

bool mas_nomem(struct ma_state *mas, gfp_t gfp);
void mas_pause(struct ma_state *mas);
void maple_tree_init(void);
void mas_destroy(struct ma_state *mas);
int mas_expected_entries(struct ma_state *mas, unsigned long nr_entries);

void *mas_prev(struct ma_state *mas, unsigned long min);
void *mas_next(struct ma_state *mas, unsigned long max);

int mas_empty_area(struct ma_state *mas, unsigned long min, unsigned long max,
                   unsigned long size);

/* Checks if a mas has not found anything */
static inline bool mas_is_none(struct ma_state *mas)
{
        return mas->node == MAS_NONE;
}

/* Checks if a mas has been paused */
static inline bool mas_is_paused(struct ma_state *mas)
{
        return mas->node == MAS_PAUSE;
}

void mas_dup_tree(struct ma_state *oldmas, struct ma_state *mas);
void mas_dup_store(struct ma_state *mas, void *entry);

/*
 * This finds an empty area from the highest address to the lowest.
 * AKA "Topdown" version.
 */
int mas_empty_area_rev(struct ma_state *mas, unsigned long min,
                       unsigned long max, unsigned long size);
/**
 * mas_reset() - Reset a Maple Tree operation state.
 * @mas: Maple Tree operation state.
 *
 * Resets the error or walk state of the @mas so future walks of the
 * array will start from the root.  Use this if you have dropped the
 * lock and want to reuse the ma_state.
 *
 * Context: Any context.
 */
static inline void mas_reset(struct ma_state *mas)
{
        mas->node = MAS_START;
}

/**
 * mas_for_each() - Iterate over a range of the maple tree.
 * @__mas: Maple Tree operation state (maple_state)
 * @__entry: Entry retrieved from the tree
 * @__max: maximum index to retrieve from the tree
 *
 * When returned, mas->index and mas->last will hold the entire range for the
 * entry.
 *
 * Note: may return the zero entry.
 */
#define mas_for_each(__mas, __entry, __max) \
        while (((__entry) = mas_find((__mas), (__max))) != NULL)


/**
 * mas_set_range() - Set up Maple Tree operation state for a different index.
 * @mas: Maple Tree operation state.
 * @start: New start of range in the Maple Tree.
 * @last: New end of range in the Maple Tree.
 *
 * Move the operation state to refer to a different range.  This will
 * have the effect of starting a walk from the top; see mas_next()
 * to move to an adjacent index.
 */
static inline
void mas_set_range(struct ma_state *mas, unsigned long start, unsigned long last)
{
        mas->index = start;
        mas->last = last;
        mas->node = MAS_START;
}

/**
 * mas_set() - Set up Maple Tree operation state for a different index.
 * @mas: Maple Tree operation state.
 * @index: New index into the Maple Tree.
 *
 * Move the operation state to refer to a different index.  This will
 * have the effect of starting a walk from the top; see mas_next()
 * to move to an adjacent index.
 */
static inline void mas_set(struct ma_state *mas, unsigned long index)
{
        mas_set_range(mas, index, index);
}

static inline bool mt_external_lock(const struct maple_tree *mt)
{
        return (mt->ma_flags & MT_FLAGS_LOCK_MASK) == MT_FLAGS_LOCK_EXTERN;
}

/**
 * mt_init_flags() - Initialise an empty maple tree with flags.
 * @mt: Maple Tree
 * @flags: maple tree flags.
 *
 * If you need to initialise a Maple Tree with special flags (eg, an
 * allocation tree), use this function.
 *
 * Context: Any context.
 */
static inline void mt_init_flags(struct maple_tree *mt, unsigned int flags)
{
        mt->ma_flags = flags;
        if (!mt_external_lock(mt))
                spin_lock_init(&mt->ma_lock);
        rcu_assign_pointer(mt->ma_root, NULL);
}

/**
 * mt_init() - Initialise an empty maple tree.
 * @mt: Maple Tree
 *
 * An empty Maple Tree.
 *
 * Context: Any context.
 */
static inline void mt_init(struct maple_tree *mt)
{
        mt_init_flags(mt, 0);
}

static inline bool mt_in_rcu(struct maple_tree *mt)
{
#ifdef CONFIG_MAPLE_RCU_DISABLED
        return false;
#endif
        return mt->ma_flags & MT_FLAGS_USE_RCU;
}

/**
 * mt_clear_in_rcu() - Switch the tree to non-RCU mode.
 * @mt: The Maple Tree
 */
static inline void mt_clear_in_rcu(struct maple_tree *mt)
{
        if (!mt_in_rcu(mt))
                return;

        if (mt_external_lock(mt)) {
                BUG_ON(!mt_lock_is_held(mt));
                mt->ma_flags &= ~MT_FLAGS_USE_RCU;
        } else {
                mtree_lock(mt);
                mt->ma_flags &= ~MT_FLAGS_USE_RCU;
                mtree_unlock(mt);
        }
}

/**
 * mt_set_in_rcu() - Switch the tree to RCU safe mode.
 * @mt: The Maple Tree
 */
static inline void mt_set_in_rcu(struct maple_tree *mt)
{
        if (mt_in_rcu(mt))
                return;

        if (mt_external_lock(mt)) {
                BUG_ON(!mt_lock_is_held(mt));
                mt->ma_flags |= MT_FLAGS_USE_RCU;
        } else {
                mtree_lock(mt);
                mt->ma_flags |= MT_FLAGS_USE_RCU;
                mtree_unlock(mt);
        }
}

void *mt_find(struct maple_tree *mt, unsigned long *index, unsigned long max);
void *mt_find_after(struct maple_tree *mt, unsigned long *index,
                    unsigned long max);
void *mt_prev(struct maple_tree *mt, unsigned long index, unsigned long min);
void *mt_next(struct maple_tree *mt, unsigned long index, unsigned long max);

/**
 * mt_for_each - Iterate over each entry starting at index until max.
 * @__tree: The Maple Tree
 * @__entry: The current entry
 * @__index: The index to update to track the location in the tree
 * @__max: The maximum limit for @index
 *
 * Note: Will not return the zero entry.
 */
#define mt_for_each(__tree, __entry, __index, __max) \
        for (__entry = mt_find(__tree, &(__index), __max); \
                __entry; __entry = mt_find_after(__tree, &(__index), __max))


#ifdef CONFIG_DEBUG_MAPLE_TREE
extern atomic_t maple_tree_tests_run;
extern atomic_t maple_tree_tests_passed;

void mt_dump(const struct maple_tree *mt);
void mt_validate(struct maple_tree *mt);
#define MT_BUG_ON(__tree, __x) do {                                     \
        atomic_inc(&maple_tree_tests_run);                              \
        if (__x) {                                                      \
                pr_info("BUG at %s:%d (%u)\n",                          \
                        __func__, __LINE__, __x);                       \
                mt_dump(__tree);                                        \
                pr_info("Pass: %u Run:%u\n",                            \
                        atomic_read(&maple_tree_tests_passed),          \
                        atomic_read(&maple_tree_tests_run));            \
                dump_stack();                                           \
        } else {                                                        \
                atomic_inc(&maple_tree_tests_passed);                   \
        }                                                               \
} while (0)
#else
#define MT_BUG_ON(__tree, __x) BUG_ON(__x)
#endif /* CONFIG_DEBUG_MAPLE_TREE */

#endif /*_LINUX_MAPLE_TREE_H */
123 include/trace/events/maple_tree.h (new file)
@@ -0,0 +1,123 @@
/* SPDX-License-Identifier: GPL-2.0 */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM maple_tree

#if !defined(_TRACE_MM_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_MM_H


#include <linux/tracepoint.h>

struct ma_state;

TRACE_EVENT(ma_op,

        TP_PROTO(const char *fn, struct ma_state *mas),

        TP_ARGS(fn, mas),

        TP_STRUCT__entry(
                        __field(const char *, fn)
                        __field(unsigned long, min)
                        __field(unsigned long, max)
                        __field(unsigned long, index)
                        __field(unsigned long, last)
                        __field(void *, node)
        ),

        TP_fast_assign(
                        __entry->fn             = fn;
                        __entry->min            = mas->min;
                        __entry->max            = mas->max;
                        __entry->index          = mas->index;
                        __entry->last           = mas->last;
                        __entry->node           = mas->node;
        ),

        TP_printk("%s\tNode: %p (%lu %lu) range: %lu-%lu",
                  __entry->fn,
                  (void *) __entry->node,
                  (unsigned long) __entry->min,
                  (unsigned long) __entry->max,
                  (unsigned long) __entry->index,
                  (unsigned long) __entry->last
        )
)
TRACE_EVENT(ma_read,

        TP_PROTO(const char *fn, struct ma_state *mas),

        TP_ARGS(fn, mas),

        TP_STRUCT__entry(
                        __field(const char *, fn)
                        __field(unsigned long, min)
                        __field(unsigned long, max)
                        __field(unsigned long, index)
                        __field(unsigned long, last)
                        __field(void *, node)
        ),

        TP_fast_assign(
                        __entry->fn             = fn;
                        __entry->min            = mas->min;
                        __entry->max            = mas->max;
                        __entry->index          = mas->index;
                        __entry->last           = mas->last;
                        __entry->node           = mas->node;
        ),

        TP_printk("%s\tNode: %p (%lu %lu) range: %lu-%lu",
                  __entry->fn,
                  (void *) __entry->node,
                  (unsigned long) __entry->min,
                  (unsigned long) __entry->max,
                  (unsigned long) __entry->index,
                  (unsigned long) __entry->last
        )
)

TRACE_EVENT(ma_write,

        TP_PROTO(const char *fn, struct ma_state *mas, unsigned long piv,
                 void *val),

        TP_ARGS(fn, mas, piv, val),

        TP_STRUCT__entry(
                        __field(const char *, fn)
                        __field(unsigned long, min)
                        __field(unsigned long, max)
                        __field(unsigned long, index)
                        __field(unsigned long, last)
                        __field(unsigned long, piv)
                        __field(void *, val)
                        __field(void *, node)
        ),

        TP_fast_assign(
                        __entry->fn             = fn;
                        __entry->min            = mas->min;
                        __entry->max            = mas->max;
                        __entry->index          = mas->index;
                        __entry->last           = mas->last;
                        __entry->piv            = piv;
                        __entry->val            = val;
                        __entry->node           = mas->node;
        ),

        TP_printk("%s\tNode %p (%lu %lu) range:%lu-%lu piv (%lu) val %p",
                  __entry->fn,
                  (void *) __entry->node,
                  (unsigned long) __entry->min,
                  (unsigned long) __entry->max,
                  (unsigned long) __entry->index,
                  (unsigned long) __entry->last,
                  (unsigned long) __entry->piv,
                  (void *) __entry->val
        )
)
#endif /* _TRACE_MM_H */

/* This part must be outside protection */
#include <trace/define_trace.h>
init/main.c

@@ -115,6 +115,7 @@ static int kernel_init(void *);

 extern void init_IRQ(void);
 extern void radix_tree_init(void);
+extern void maple_tree_init(void);

 /*
  * Debug helper: via this flag we know that we are in 'early bootup code'
@@ -1006,6 +1007,7 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
                         "Interrupts were enabled *very* early, fixing it\n"))
                local_irq_disable();
        radix_tree_init();
+       maple_tree_init();

        /*
         * Set up housekeeping before setting up workqueues to allow the unbound
lib/Kconfig.debug

@@ -825,6 +825,13 @@ config DEBUG_VM_VMACACHE
          can cause significant overhead, so only enable it in non-production
          environments.

+config DEBUG_VM_MAPLE_TREE
+       bool "Debug VM maple trees"
+       depends on DEBUG_VM
+       select DEBUG_MAPLE_TREE
+       help
+         Enable VM maple tree debugging information and extra validations.
+
          If unsure, say N.

 config DEBUG_VM_RB
@@ -1640,6 +1647,14 @@ config BUG_ON_DATA_CORRUPTION

          If unsure, say N.

+config DEBUG_MAPLE_TREE
+       bool "Debug maple trees"
+       depends on DEBUG_KERNEL
+       help
+         Enable maple tree debugging information and extra validations.
+
+         If unsure, say N.
+
 endmenu

 config DEBUG_CREDENTIALS
lib/Makefile

@@ -29,7 +29,7 @@ endif

 lib-y := ctype.o string.o vsprintf.o cmdline.o \
         rbtree.o radix-tree.o timerqueue.o xarray.o \
-        idr.o extable.o irq_regs.o argv_split.o \
+        maple_tree.o idr.o extable.o irq_regs.o argv_split.o \
         flex_proportions.o ratelimit.o show_mem.o \
         is_single_threaded.o plist.o decompress.o kobject_uevent.o \
         earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
7130 lib/maple_tree.c (new file)
File diff suppressed because it is too large.

38307 lib/test_maple_tree.c (new file)
File diff suppressed because it is too large.
tools/include/linux/slab.h

@@ -41,4 +41,8 @@ struct kmem_cache *kmem_cache_create(const char *name, unsigned int size,
                        unsigned int align, unsigned int flags,
                        void (*ctor)(void *));

+void kmem_cache_free_bulk(struct kmem_cache *cachep, size_t size, void **list);
+int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
+                         void **list);
+
 #endif /* _TOOLS_SLAB_H */
2 tools/testing/radix-tree/.gitignore (vendored)
@@ -6,3 +6,5 @@ main
 multiorder
 radix-tree.c
 xarray
+maple
+ma_xa_benchmark
tools/testing/radix-tree/Makefile

@@ -4,9 +4,9 @@ CFLAGS += -I. -I../../include -g -Og -Wall -D_LGPL_SOURCE -fsanitize=address \
          -fsanitize=undefined
 LDFLAGS += -fsanitize=address -fsanitize=undefined
 LDLIBS+= -lpthread -lurcu
-TARGETS = main idr-test multiorder xarray
+TARGETS = main idr-test multiorder xarray maple
 CORE_OFILES := xarray.o radix-tree.o idr.o linux.o test.o find_bit.o bitmap.o \
-        slab.o
+        slab.o maple.o
 OFILES = main.o $(CORE_OFILES) regression1.o regression2.o regression3.o \
         regression4.o tag_check.o multiorder.o idr-test.o iteration_check.o \
         iteration_check_2.o benchmark.o
@@ -29,6 +29,8 @@ idr-test: idr-test.o $(CORE_OFILES)

 xarray: $(CORE_OFILES)

+maple: $(CORE_OFILES)
+
 multiorder: multiorder.o $(CORE_OFILES)

 clean:
@@ -40,6 +42,7 @@ $(OFILES): Makefile *.h */*.h generated/map-shift.h \
        ../../include/linux/*.h \
        ../../include/asm/*.h \
        ../../../include/linux/xarray.h \
+       ../../../include/linux/maple_tree.h \
        ../../../include/linux/radix-tree.h \
        ../../../include/linux/idr.h

@@ -51,6 +54,8 @@ idr.c: ../../../lib/idr.c

 xarray.o: ../../../lib/xarray.c ../../../lib/test_xarray.c

+maple.o: ../../../lib/maple_tree.c ../../../lib/test_maple_tree.c
+
 generated/map-shift.h:
        @if ! grep -qws $(SHIFT) generated/map-shift.h; then \
                echo "#define XA_CHUNK_SHIFT $(SHIFT)" > \

tools/testing/radix-tree/generated/autoconf.h

@@ -1 +1,2 @@
 #define CONFIG_XARRAY_MULTI 1
+#define CONFIG_64BIT 1

tools/testing/radix-tree/linux.c
@@ -23,15 +23,47 @@ struct kmem_cache {
        int nr_objs;
        void *objs;
        void (*ctor)(void *);
+       unsigned int non_kernel;
+       unsigned long nr_allocated;
+       unsigned long nr_tallocated;
 };

+void kmem_cache_set_non_kernel(struct kmem_cache *cachep, unsigned int val)
+{
+       cachep->non_kernel = val;
+}
+
+unsigned long kmem_cache_get_alloc(struct kmem_cache *cachep)
+{
+       return cachep->size * cachep->nr_allocated;
+}
+
+unsigned long kmem_cache_nr_allocated(struct kmem_cache *cachep)
+{
+       return cachep->nr_allocated;
+}
+
+unsigned long kmem_cache_nr_tallocated(struct kmem_cache *cachep)
+{
+       return cachep->nr_tallocated;
+}
+
+void kmem_cache_zero_nr_tallocated(struct kmem_cache *cachep)
+{
+       cachep->nr_tallocated = 0;
+}
+
 void *kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
                int gfp)
 {
        void *p;

-       if (!(gfp & __GFP_DIRECT_RECLAIM))
-               return NULL;
+       if (!(gfp & __GFP_DIRECT_RECLAIM)) {
+               if (!cachep->non_kernel)
+                       return NULL;
+
+               cachep->non_kernel--;
+       }

        pthread_mutex_lock(&cachep->lock);
        if (cachep->nr_objs) {
@@ -53,19 +85,21 @@ void *kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru,
                memset(p, 0, cachep->size);
        }

+       uatomic_inc(&cachep->nr_allocated);
        uatomic_inc(&nr_allocated);
+       uatomic_inc(&cachep->nr_tallocated);
        if (kmalloc_verbose)
                printf("Allocating %p from slab\n", p);
        return p;
 }

-void kmem_cache_free(struct kmem_cache *cachep, void *objp)
+void kmem_cache_free_locked(struct kmem_cache *cachep, void *objp)
 {
        assert(objp);
        uatomic_dec(&nr_allocated);
+       uatomic_dec(&cachep->nr_allocated);
        if (kmalloc_verbose)
                printf("Freeing %p to slab\n", objp);
-       pthread_mutex_lock(&cachep->lock);
        if (cachep->nr_objs > 10 || cachep->align) {
                memset(objp, POISON_FREE, cachep->size);
                free(objp);
@@ -75,9 +109,80 @@ void kmem_cache_free(struct kmem_cache *cachep, void *objp)
                node->parent = cachep->objs;
                cachep->objs = node;
        }
 }

+void kmem_cache_free(struct kmem_cache *cachep, void *objp)
+{
+       pthread_mutex_lock(&cachep->lock);
+       kmem_cache_free_locked(cachep, objp);
+       pthread_mutex_unlock(&cachep->lock);
+}
+
+void kmem_cache_free_bulk(struct kmem_cache *cachep, size_t size, void **list)
+{
+       if (kmalloc_verbose)
+               pr_debug("Bulk free %p[0-%lu]\n", list, size - 1);
+
+       pthread_mutex_lock(&cachep->lock);
+       for (int i = 0; i < size; i++)
+               kmem_cache_free_locked(cachep, list[i]);
+       pthread_mutex_unlock(&cachep->lock);
+}
+
+int kmem_cache_alloc_bulk(struct kmem_cache *cachep, gfp_t gfp, size_t size,
+               void **p)
+{
+       size_t i;
+
+       if (kmalloc_verbose)
+               pr_debug("Bulk alloc %lu\n", size);
+
+       if (!(gfp & __GFP_DIRECT_RECLAIM)) {
+               if (cachep->non_kernel < size)
+                       return 0;
+
+               cachep->non_kernel -= size;
+       }
+
+       pthread_mutex_lock(&cachep->lock);
+       if (cachep->nr_objs >= size) {
+               struct radix_tree_node *node;
+
+               for (i = 0; i < size; i++) {
+                       node = cachep->objs;
+                       cachep->nr_objs--;
+                       cachep->objs = node->parent;
+                       p[i] = node;
+                       node->parent = NULL;
+               }
+               pthread_mutex_unlock(&cachep->lock);
+       } else {
+               pthread_mutex_unlock(&cachep->lock);
+               for (i = 0; i < size; i++) {
+                       if (cachep->align) {
+                               posix_memalign(&p[i], cachep->align,
+                                              cachep->size * size);
+                       } else {
+                               p[i] = malloc(cachep->size * size);
+                       }
+                       if (cachep->ctor)
+                               cachep->ctor(p[i]);
+                       else if (gfp & __GFP_ZERO)
+                               memset(p[i], 0, cachep->size);
+               }
+       }
+
+       for (i = 0; i < size; i++) {
+               uatomic_inc(&nr_allocated);
+               uatomic_inc(&cachep->nr_allocated);
+               uatomic_inc(&cachep->nr_tallocated);
+               if (kmalloc_verbose)
+                       printf("Allocating %p from slab\n", p[i]);
+       }
+
+       return size;
+}
+
 struct kmem_cache *
 kmem_cache_create(const char *name, unsigned int size, unsigned int align,
        unsigned int flags, void (*ctor)(void *))
@@ -88,7 +193,54 @@ kmem_cache_create(const char *name, unsigned int size, unsigned int align,
        ret->size = size;
        ret->align = align;
        ret->nr_objs = 0;
+       ret->nr_allocated = 0;
+       ret->nr_tallocated = 0;
        ret->objs = NULL;
        ret->ctor = ctor;
+       ret->non_kernel = 0;
        return ret;
 }
+
+/*
+ * Test the test infrastructure for kmem_cache_alloc/free and bulk counterparts.
+ */
+void test_kmem_cache_bulk(void)
+{
+       int i;
+       void *list[12];
+       static struct kmem_cache *test_cache, *test_cache2;
+
+       /*
+        * Testing the bulk allocators without aligned kmem_cache to force the
+        * bulk alloc/free to reuse
+        */
+       test_cache = kmem_cache_create("test_cache", 256, 0, SLAB_PANIC, NULL);
+
+       for (i = 0; i < 5; i++)
+               list[i] = kmem_cache_alloc(test_cache, __GFP_DIRECT_RECLAIM);
+
+       for (i = 0; i < 5; i++)
+               kmem_cache_free(test_cache, list[i]);
+       assert(test_cache->nr_objs == 5);
+
+       kmem_cache_alloc_bulk(test_cache, __GFP_DIRECT_RECLAIM, 5, list);
+       kmem_cache_free_bulk(test_cache, 5, list);
+
+       for (i = 0; i < 12; i++)
+               list[i] = kmem_cache_alloc(test_cache, __GFP_DIRECT_RECLAIM);
+
+       for (i = 0; i < 12; i++)
+               kmem_cache_free(test_cache, list[i]);
+
+       /* The last free will not be kept around */
+       assert(test_cache->nr_objs == 11);
+
+       /* Aligned caches will immediately free */
+       test_cache2 = kmem_cache_create("test_cache2", 128, 128, SLAB_PANIC, NULL);
+
+       kmem_cache_alloc_bulk(test_cache2, __GFP_DIRECT_RECLAIM, 10, list);
+       kmem_cache_free_bulk(test_cache2, 10, list);
+       assert(!test_cache2->nr_objs);
+
+}
tools/testing/radix-tree/linux/kernel.h

@@ -14,6 +14,7 @@
 #include "../../../include/linux/kconfig.h"

 #define printk printf
 #define pr_err printk
 #define pr_info printk
+#define pr_debug printk
 #define pr_cont printk
tools/testing/radix-tree/linux/lockdep.h

@@ -11,4 +11,6 @@ static inline void lockdep_set_class(spinlock_t *lock,
                                        struct lock_class_key *key)
 {
 }
+
+extern int lockdep_is_held(const void *);
 #endif /* _LINUX_LOCKDEP_H */
7 tools/testing/radix-tree/linux/maple_tree.h (new file)
@@ -0,0 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0+ */
#define atomic_t int32_t
#include "../../../../include/linux/maple_tree.h"
#define atomic_inc(x) uatomic_inc(x)
#define atomic_read(x) uatomic_read(x)
#define atomic_set(x, y) do {} while (0)
#define U8_MAX UCHAR_MAX
59 tools/testing/radix-tree/maple.c (new file)
@@ -0,0 +1,59 @@
// SPDX-License-Identifier: GPL-2.0+
/*
 * maple_tree.c: Userspace shim for maple tree test-suite
 * Copyright (c) 2018 Liam R. Howlett <Liam.Howlett@Oracle.com>
 */

#define CONFIG_DEBUG_MAPLE_TREE
#define CONFIG_MAPLE_SEARCH
#include "test.h"

#define module_init(x)
#define module_exit(x)
#define MODULE_AUTHOR(x)
#define MODULE_LICENSE(x)
#define dump_stack()    assert(0)

#include "../../../lib/maple_tree.c"
#undef CONFIG_DEBUG_MAPLE_TREE
#include "../../../lib/test_maple_tree.c"

void farmer_tests(void)
{
        struct maple_node *node;
        DEFINE_MTREE(tree);

        mt_dump(&tree);

        tree.ma_root = xa_mk_value(0);
        mt_dump(&tree);

        node = mt_alloc_one(GFP_KERNEL);
        node->parent = (void *)((unsigned long)(&tree) | 1);
        node->slot[0] = xa_mk_value(0);
        node->slot[1] = xa_mk_value(1);
        node->mr64.pivot[0] = 0;
        node->mr64.pivot[1] = 1;
        node->mr64.pivot[2] = 0;
        tree.ma_root = mt_mk_node(node, maple_leaf_64);
        mt_dump(&tree);

        ma_free_rcu(node);
}

void maple_tree_tests(void)
{
        farmer_tests();
        maple_tree_seed();
        maple_tree_harvest();
}

int __weak main(void)
{
        maple_tree_init();
        maple_tree_tests();
        rcu_barrier();
        if (nr_allocated)
                printf("nr_allocated = %d\n", nr_allocated);
        return 0;
}
5 tools/testing/radix-tree/trace/events/maple_tree.h (new file)
@@ -0,0 +1,5 @@
/* SPDX-License-Identifier: GPL-2.0+ */

#define trace_ma_op(a, b) do {} while (0)
#define trace_ma_read(a, b) do {} while (0)
#define trace_ma_write(a, b, c, d) do {} while (0)