Merge tag 'rcu.2022.12.02a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Documentation updates. This is the second in a series from an
   ongoing review of the RCU documentation.

 - Miscellaneous fixes.

 - Introduce a default-off Kconfig option that depends on RCU_NOCB_CPU
   that, on CPUs mentioned in the nohz_full or rcu_nocbs boot-argument
   CPU lists, causes call_rcu() to introduce delays. These delays
   result in significant power savings on nearly idle Android and
   ChromeOS systems. These savings range from a few percent to more
   than ten percent.

   This series also includes several commits that change call_rcu() to
   a new call_rcu_hurry() function that avoids these delays in a few
   cases, for example, where timely wakeups are required. Several of
   these are outside of RCU and thus have acks and reviews from the
   relevant maintainers.

 - Create an srcu_read_lock_nmisafe() and an srcu_read_unlock_nmisafe()
   for architectures that support NMIs, but which do not provide
   NMI-safe this_cpu_inc(). These NMI-safe SRCU functions are required
   by the upcoming lockless printk() work by John Ogness et al.

 - Changes providing minor but important increases in torture-test
   coverage for the new RCU polled-grace-period APIs.

 - Changes to torture-test scripting that avoid redundant kernel
   builds, thus providing about a 30% speedup for the torture.sh
   acceptance test.

* tag 'rcu.2022.12.02a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (49 commits)
  net: devinet: Reduce refcount before grace period
  net: Use call_rcu_hurry() for dst_release()
  workqueue: Make queue_rcu_work() use call_rcu_hurry()
  percpu-refcount: Use call_rcu_hurry() for atomic switch
  scsi/scsi_error: Use call_rcu_hurry() instead of call_rcu()
  rcu/rcutorture: Use call_rcu_hurry() where needed
  rcu/rcuscale: Use call_rcu_hurry() for async reader test
  rcu/sync: Use call_rcu_hurry() instead of call_rcu
  rcuscale: Add laziness and kfree tests
  rcu: Shrinker for lazy rcu
  rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()
  rcu: Make call_rcu() lazy to save power
  rcu: Implement lockdep_rcu_enabled for !CONFIG_DEBUG_LOCK_ALLOC
  srcu: Debug NMI safety even on archs that don't require it
  srcu: Explain the reason behind the read side critical section on GP start
  srcu: Warn when NMI-unsafe API is used in NMI
  arch/s390: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option
  arch/loongarch: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option
  rcu: Fix __this_cpu_read() lockdep warning in rcu_force_quiescent_state()
  rcu-tasks: Make grace-period-age message human-readable
  ...
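Several of the commits above convert latency-sensitive call_rcu() users to the new call_rcu_hurry(), which bypasses the lazy batching that the new Kconfig option enables. A minimal, hedged sketch of that pattern follows; the demo_node structure and function names are invented for illustration and are not taken from any of the listed commits::

        #include <linux/rcupdate.h>
        #include <linux/slab.h>

        struct demo_node {
                int value;
                struct rcu_head rcu;
        };

        static void demo_node_free(struct rcu_head *rhp)
        {
                kfree(container_of(rhp, struct demo_node, rcu));
        }

        /* Ordinary cleanup: laziness is fine, so batching may delay the callback. */
        static void demo_node_release(struct demo_node *p)
        {
                call_rcu(&p->rcu, demo_node_free);
        }

        /* Cleanup on a latency-sensitive path: bypass the lazy batching. */
        static void demo_node_release_now(struct demo_node *p)
        {
                call_rcu_hurry(&p->rcu, demo_node_free);
        }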
Documentation/RCU/arrayRCU.rst (removed by this merge):

@@ -1,165 +0,0 @@
.. _array_rcu_doc:

Using RCU to Protect Read-Mostly Arrays
=======================================

Although RCU is more commonly used to protect linked lists, it can
also be used to protect arrays. Three situations are as follows:

1. :ref:`Hash Tables <hash_tables>`

2. :ref:`Static Arrays <static_arrays>`

3. :ref:`Resizable Arrays <resizable_arrays>`

Each of these three situations involves an RCU-protected pointer to an
array that is separately indexed. It might be tempting to consider use
of RCU to instead protect the index into an array, however, this use
case is **not** supported. The problem with RCU-protected indexes into
arrays is that compilers can play way too many optimization games with
integers, which means that the rules governing handling of these indexes
are far more trouble than they are worth. If RCU-protected indexes into
arrays prove to be particularly valuable (which they have not thus far),
explicit cooperation from the compiler will be required to permit them
to be safely used.

That aside, each of the three RCU-protected pointer situations are
described in the following sections.

.. _hash_tables:

Situation 1: Hash Tables
------------------------

Hash tables are often implemented as an array, where each array entry
has a linked-list hash chain. Each hash chain can be protected by RCU
as described in listRCU.rst. This approach also applies to other
array-of-list situations, such as radix trees.

.. _static_arrays:

Situation 2: Static Arrays
--------------------------

Static arrays, where the data (rather than a pointer to the data) is
located in each array element, and where the array is never resized,
have not been used with RCU. Rik van Riel recommends using seqlock in
this situation, which would also have minimal read-side overhead as long
as updates are rare.

Quick Quiz:
        Why is it so important that updates be rare when using seqlock?

:ref:`Answer to Quick Quiz <answer_quick_quiz_seqlock>`

.. _resizable_arrays:

Situation 3: Resizable Arrays
------------------------------

Use of RCU for resizable arrays is demonstrated by the grow_ary()
function formerly used by the System V IPC code. The array is used
to map from semaphore, message-queue, and shared-memory IDs to the data
structure that represents the corresponding IPC construct. The grow_ary()
function does not acquire any locks; instead its caller must hold the
ids->sem semaphore.

The grow_ary() function, shown below, does some limit checks, allocates a
new ipc_id_ary, copies the old to the new portion of the new, initializes
the remainder of the new, updates the ids->entries pointer to point to
the new array, and invokes ipc_rcu_putref() to free up the old array.
Note that rcu_assign_pointer() is used to update the ids->entries pointer,
which includes any memory barriers required on whatever architecture
you are running on::

        static int grow_ary(struct ipc_ids* ids, int newsize)
        {
                struct ipc_id_ary* new;
                struct ipc_id_ary* old;
                int i;
                int size = ids->entries->size;

                if(newsize > IPCMNI)
                        newsize = IPCMNI;
                if(newsize <= size)
                        return newsize;

                new = ipc_rcu_alloc(sizeof(struct kern_ipc_perm *)*newsize +
                                    sizeof(struct ipc_id_ary));
                if(new == NULL)
                        return size;
                new->size = newsize;
                memcpy(new->p, ids->entries->p,
                       sizeof(struct kern_ipc_perm *)*size +
                       sizeof(struct ipc_id_ary));
                for(i=size;i<newsize;i++) {
                        new->p[i] = NULL;
                }
                old = ids->entries;

                /*
                 * Use rcu_assign_pointer() to make sure the memcpyed
                 * contents of the new array are visible before the new
                 * array becomes visible.
                 */
                rcu_assign_pointer(ids->entries, new);

                ipc_rcu_putref(old);
                return newsize;
        }

The ipc_rcu_putref() function decrements the array's reference count
and then, if the reference count has dropped to zero, uses call_rcu()
to free the array after a grace period has elapsed.

The array is traversed by the ipc_lock() function. This function
indexes into the array under the protection of rcu_read_lock(),
using rcu_dereference() to pick up the pointer to the array so
that it may later safely be dereferenced -- memory barriers are
required on the Alpha CPU. Since the size of the array is stored
with the array itself, there can be no array-size mismatches, so
a simple check suffices. The pointer to the structure corresponding
to the desired IPC object is placed in "out", with NULL indicating
a non-existent entry. After acquiring "out->lock", the "out->deleted"
flag indicates whether the IPC object is in the process of being
deleted, and, if not, the pointer is returned::

        struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id)
        {
                struct kern_ipc_perm* out;
                int lid = id % SEQ_MULTIPLIER;
                struct ipc_id_ary* entries;

                rcu_read_lock();
                entries = rcu_dereference(ids->entries);
                if(lid >= entries->size) {
                        rcu_read_unlock();
                        return NULL;
                }
                out = entries->p[lid];
                if(out == NULL) {
                        rcu_read_unlock();
                        return NULL;
                }
                spin_lock(&out->lock);

                /* ipc_rmid() may have already freed the ID while ipc_lock
                 * was spinning: here verify that the structure is still valid
                 */
                if (out->deleted) {
                        spin_unlock(&out->lock);
                        rcu_read_unlock();
                        return NULL;
                }
                return out;
        }

.. _answer_quick_quiz_seqlock:

Answer to Quick Quiz:
        Why is it so important that updates be rare when using seqlock?

        The reason that it is important that updates be rare when
        using seqlock is that frequent updates can livelock readers.
        One way to avoid this problem is to assign a seqlock for
        each array entry rather than to the entire array.
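To make the quiz answer concrete, here is a hedged sketch of the per-entry seqlock approach it suggests; the demo_entry element type, its fields, and the array size are invented for illustration::

        #include <linux/seqlock.h>

        struct demo_entry {
                seqlock_t lock;         /* one seqlock per array element */
                int value;
        };

        /* Each entry's lock must be initialized with seqlock_init() at setup. */
        static struct demo_entry demo_array[16];

        /* Reader: retries only if *this* entry was concurrently updated. */
        static int demo_read(int i)
        {
                unsigned int seq;
                int v;

                do {
                        seq = read_seqbegin(&demo_array[i].lock);
                        v = demo_array[i].value;
                } while (read_seqretry(&demo_array[i].lock, seq));
                return v;
        }

        /* Updater: writes to other entries do not disturb this entry's readers. */
        static void demo_write(int i, int v)
        {
                write_seqlock(&demo_array[i].lock);
                demo_array[i].value = v;
                write_sequnlock(&demo_array[i].lock);
        }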
Documentation/RCU/checklist.rst:

@@ -32,8 +32,8 @@ over a rather long period of time, but improvements are always welcome!
 for lockless updates. This does result in the mildly
 counter-intuitive situation where rcu_read_lock() and
 rcu_read_unlock() are used to protect updates, however, this
-approach provides the same potential simplifications that garbage
-collectors do.
+approach can provide the same simplifications to certain types
+of lockless algorithms that garbage collectors do.

 1. Does the update code have proper mutual exclusion?

@@ -49,12 +49,12 @@ over a rather long period of time, but improvements are always welcome!
 them -- even x86 allows later loads to be reordered to precede
 earlier stores), and be prepared to explain why this added
 complexity is worthwhile. If you choose #c, be prepared to
-explain how this single task does not become a major bottleneck on
-big multiprocessor machines (for example, if the task is updating
-information relating to itself that other tasks can read, there
-by definition can be no bottleneck). Note that the definition
-of "large" has changed significantly: Eight CPUs was "large"
-in the year 2000, but a hundred CPUs was unremarkable in 2017.
+explain how this single task does not become a major bottleneck
+on large systems (for example, if the task is updating information
+relating to itself that other tasks can read, there by definition
+can be no bottleneck). Note that the definition of "large" has
+changed significantly: Eight CPUs was "large" in the year 2000,
+but a hundred CPUs was unremarkable in 2017.

 2. Do the RCU read-side critical sections make proper use of
 rcu_read_lock() and friends? These primitives are needed

@@ -97,33 +97,38 @@ over a rather long period of time, but improvements are always welcome!

 b. Proceed as in (a) above, but also maintain per-element
 locks (that are acquired by both readers and writers)
-that guard per-element state. Of course, fields that
-the readers refrain from accessing can be guarded by
-some other lock acquired only by updaters, if desired.
+that guard per-element state. Fields that the readers
+refrain from accessing can be guarded by some other lock
+acquired only by updaters, if desired.

-This works quite well, also.
+This also works quite well.

 c. Make updates appear atomic to readers. For example,
 pointer updates to properly aligned fields will
 appear atomic, as will individual atomic primitives.
 Sequences of operations performed under a lock will *not*
 appear to be atomic to RCU readers, nor will sequences
-of multiple atomic primitives.
+of multiple atomic primitives. One alternative is to
+move multiple individual fields to a separate structure,
+thus solving the multiple-field problem by imposing an
+additional level of indirection.

 This can work, but is starting to get a bit tricky.

-d. Carefully order the updates and the reads so that
-readers see valid data at all phases of the update.
-This is often more difficult than it sounds, especially
-given modern CPUs' tendency to reorder memory references.
-One must usually liberally sprinkle memory barriers
-(smp_wmb(), smp_rmb(), smp_mb()) through the code,
-making it difficult to understand and to test.
+d. Carefully order the updates and the reads so that readers
+see valid data at all phases of the update. This is often
+more difficult than it sounds, especially given modern
+CPUs' tendency to reorder memory references. One must
+usually liberally sprinkle memory-ordering operations
+through the code, making it difficult to understand and
+to test. Where it works, it is better to use things
+like smp_store_release() and smp_load_acquire(), but in
+some cases the smp_mb() full memory barrier is required.

-It is usually better to group the changing data into
-a separate structure, so that the change may be made
-to appear atomic by updating a pointer to reference
-a new structure containing updated values.
+As noted earlier, it is usually better to group the
+changing data into a separate structure, so that the
+change may be made to appear atomic by updating a pointer
+to reference a new structure containing updated values.

 4. Weakly ordered CPUs pose special challenges. Almost all CPUs
 are weakly ordered -- even x86 CPUs allow later loads to be
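Item 3 above recommends grouping fields that must change together into a separate structure so the change can be published atomically. A minimal, hedged sketch of that idiom follows; the demo_config structure, its fields, and the lock name are invented for illustration::

        #include <linux/rcupdate.h>
        #include <linux/slab.h>
        #include <linux/spinlock.h>

        struct demo_config {
                int threshold;
                int timeout_ms;
                struct rcu_head rcu;
        };

        static struct demo_config __rcu *demo_cfg;
        static DEFINE_SPINLOCK(demo_cfg_lock);  /* serializes updaters */

        /* Readers see either the old or the new configuration, never a mix. */
        static int demo_get_threshold(void)
        {
                struct demo_config *cfg;
                int ret;

                rcu_read_lock();
                cfg = rcu_dereference(demo_cfg);
                ret = cfg ? cfg->threshold : -1;
                rcu_read_unlock();
                return ret;
        }

        /* Updater builds a new structure, then publishes it in one pointer store. */
        static int demo_set(int threshold, int timeout_ms)
        {
                struct demo_config *newc, *oldc;

                newc = kmalloc(sizeof(*newc), GFP_KERNEL);
                if (!newc)
                        return -ENOMEM;
                newc->threshold = threshold;
                newc->timeout_ms = timeout_ms;

                spin_lock(&demo_cfg_lock);
                oldc = rcu_dereference_protected(demo_cfg,
                                lockdep_is_held(&demo_cfg_lock));
                rcu_assign_pointer(demo_cfg, newc);
                spin_unlock(&demo_cfg_lock);

                if (oldc)
                        kfree_rcu(oldc, rcu);
                return 0;
        }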
@@ -188,26 +193,29 @@ over a rather long period of time, but improvements are always welcome!
 when publicizing a pointer to a structure that can
 be traversed by an RCU read-side critical section.

-5. If call_rcu() or call_srcu() is used, the callback function will
-be called from softirq context. In particular, it cannot block.
-If you need the callback to block, run that code in a workqueue
-handler scheduled from the callback. The queue_rcu_work()
-function does this for you in the case of call_rcu().
+5. If any of call_rcu(), call_srcu(), call_rcu_tasks(),
+call_rcu_tasks_rude(), or call_rcu_tasks_trace() is used,
+the callback function may be invoked from softirq context,
+and in any case with bottom halves disabled. In particular,
+this callback function cannot block. If you need the callback
+to block, run that code in a workqueue handler scheduled from
+the callback. The queue_rcu_work() function does this for you
+in the case of call_rcu().

 6. Since synchronize_rcu() can block, it cannot be called
 from any sort of irq context. The same rule applies
-for synchronize_srcu(), synchronize_rcu_expedited(), and
-synchronize_srcu_expedited().
+for synchronize_srcu(), synchronize_rcu_expedited(),
+synchronize_srcu_expedited(), synchronize_rcu_tasks(),
+synchronize_rcu_tasks_rude(), and synchronize_rcu_tasks_trace().

 The expedited forms of these primitives have the same semantics
-as the non-expedited forms, but expediting is both expensive and
-(with the exception of synchronize_srcu_expedited()) unfriendly
-to real-time workloads. Use of the expedited primitives should
-be restricted to rare configuration-change operations that would
-not normally be undertaken while a real-time workload is running.
-However, real-time workloads can use rcupdate.rcu_normal kernel
-boot parameter to completely disable expedited grace periods,
-though this might have performance implications.
+as the non-expedited forms, but expediting is more CPU intensive.
+Use of the expedited primitives should be restricted to rare
+configuration-change operations that would not normally be
+undertaken while a real-time workload is running. Note that
+IPI-sensitive real-time workloads can use the rcupdate.rcu_normal
+kernel boot parameter to completely disable expedited grace
+periods, though this might have performance implications.

 In particular, if you find yourself invoking one of the expedited
 primitives repeatedly in a loop, please do everyone a favor:
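Item 5 above points at queue_rcu_work() for callbacks that must block. A hedged sketch of that arrangement follows; the demo_obj structure and function names are invented for illustration::

        #include <linux/slab.h>
        #include <linux/workqueue.h>

        struct demo_obj {
                struct rcu_work rwork;
                /* ... payload ... */
        };

        /* Runs in process context after a grace period, so it may sleep. */
        static void demo_reclaim_workfn(struct work_struct *work)
        {
                struct demo_obj *obj = container_of(to_rcu_work(work),
                                                    struct demo_obj, rwork);

                /* Blocking cleanup (mutexes, I/O, ...) is fine here. */
                kfree(obj);
        }

        static void demo_obj_release(struct demo_obj *obj)
        {
                INIT_RCU_WORK(&obj->rwork, demo_reclaim_workfn);
                queue_rcu_work(system_wq, &obj->rwork);
        }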
@@ -215,8 +223,9 @@ over a rather long period of time, but improvements are always welcome!
 a single non-expedited primitive to cover the entire batch.
 This will very likely be faster than the loop containing the
 expedited primitive, and will be much much easier on the rest
-of the system, especially to real-time workloads running on
-the rest of the system.
+of the system, especially to real-time workloads running on the
+rest of the system. Alternatively, instead use asynchronous
+primitives such as call_rcu().

 7. As of v4.20, a given kernel implements only one RCU flavor, which
 is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.

@@ -239,7 +248,8 @@ over a rather long period of time, but improvements are always welcome!
 the corresponding readers must use rcu_read_lock_trace() and
 rcu_read_unlock_trace(). If an updater uses call_rcu_tasks_rude()
 or synchronize_rcu_tasks_rude(), then the corresponding readers
-must use anything that disables interrupts.
+must use anything that disables preemption, for example,
+preempt_disable() and preempt_enable().

 Mixing things up will result in confusion and broken kernels, and
 has even resulted in an exploitable security issue. Therefore,

@@ -253,15 +263,16 @@ over a rather long period of time, but improvements are always welcome!
 that this usage is safe is that readers can use anything that
 disables BH when updaters use call_rcu() or synchronize_rcu().

-8. Although synchronize_rcu() is slower than is call_rcu(), it
-usually results in simpler code. So, unless update performance is
-critically important, the updaters cannot block, or the latency of
-synchronize_rcu() is visible from userspace, synchronize_rcu()
-should be used in preference to call_rcu(). Furthermore,
-kfree_rcu() usually results in even simpler code than does
-synchronize_rcu() without synchronize_rcu()'s multi-millisecond
-latency. So please take advantage of kfree_rcu()'s "fire and
-forget" memory-freeing capabilities where it applies.
+8. Although synchronize_rcu() is slower than is call_rcu(),
+it usually results in simpler code. So, unless update
+performance is critically important, the updaters cannot block,
+or the latency of synchronize_rcu() is visible from userspace,
+synchronize_rcu() should be used in preference to call_rcu().
+Furthermore, kfree_rcu() and kvfree_rcu() usually result
+in even simpler code than does synchronize_rcu() without
+synchronize_rcu()'s multi-millisecond latency. So please take
+advantage of kfree_rcu()'s and kvfree_rcu()'s "fire and forget"
+memory-freeing capabilities where it applies.

 An especially important property of the synchronize_rcu()
 primitive is that it automatically self-limits: if grace periods
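As a concrete illustration of the "fire and forget" style recommended in item 8, here is a hedged sketch using kfree_rcu(); the demo_item structure is invented for illustration::

        #include <linux/rculist.h>
        #include <linux/slab.h>

        struct demo_item {
                int key;
                struct list_head list;
                struct rcu_head rcu;    /* needed by kfree_rcu() */
        };

        /* Caller holds the update-side lock protecting the list. */
        static void demo_item_del(struct demo_item *p)
        {
                list_del_rcu(&p->list);
                /* Frees p after a grace period; no callback function needed. */
                kfree_rcu(p, rcu);
        }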
@@ -271,8 +282,8 @@ over a rather long period of time, but improvements are always welcome!
 cases where grace periods are delayed, as failing to do so can
 result in excessive realtime latencies or even OOM conditions.

-Ways of gaining this self-limiting property when using call_rcu()
-include:
+Ways of gaining this self-limiting property when using call_rcu(),
+kfree_rcu(), or kvfree_rcu() include:

 a. Keeping a count of the number of data-structure elements
 used by the RCU-protected data structure, including

@@ -304,18 +315,21 @@ over a rather long period of time, but improvements are always welcome!
 here is that superuser already has lots of ways to crash
 the machine.

-d. Periodically invoke synchronize_rcu(), permitting a limited
-number of updates per grace period. Better yet, periodically
-invoke rcu_barrier() to wait for all outstanding callbacks.
+d. Periodically invoke rcu_barrier(), permitting a limited
+number of updates per grace period.

-The same cautions apply to call_srcu() and kfree_rcu().
+The same cautions apply to call_srcu(), call_rcu_tasks(),
+call_rcu_tasks_rude(), and call_rcu_tasks_trace(). This is
+why there is an srcu_barrier(), rcu_barrier_tasks(),
+rcu_barrier_tasks_rude(), and rcu_barrier_tasks_trace(),
+respectively.

-Note that although these primitives do take action to avoid memory
-exhaustion when any given CPU has too many callbacks, a determined
-user could still exhaust memory. This is especially the case
-if a system with a large number of CPUs has been configured to
-offload all of its RCU callbacks onto a single CPU, or if the
-system has relatively little free memory.
+Note that although these primitives do take action to avoid
+memory exhaustion when any given CPU has too many callbacks,
+a determined user or administrator can still exhaust memory.
+This is especially the case if a system with a large number of
+CPUs has been configured to offload all of its RCU callbacks onto
+a single CPU, or if the system has relatively little free memory.

 9. All RCU list-traversal primitives, which include
 rcu_dereference(), list_for_each_entry_rcu(), and

@@ -344,14 +358,14 @@ over a rather long period of time, but improvements are always welcome!
 and you don't hold the appropriate update-side lock, you *must*
 use the "_rcu()" variants of the list macros. Failing to do so
 will break Alpha, cause aggressive compilers to generate bad code,
-and confuse people trying to read your code.
+and confuse people trying to understand your code.

 11. Any lock acquired by an RCU callback must be acquired elsewhere
-with softirq disabled, e.g., via spin_lock_irqsave(),
-spin_lock_bh(), etc. Failing to disable softirq on a given
-acquisition of that lock will result in deadlock as soon as
-the RCU softirq handler happens to run your RCU callback while
-interrupting that acquisition's critical section.
+with softirq disabled, e.g., via spin_lock_bh(). Failing to
+disable softirq on a given acquisition of that lock will result
+in deadlock as soon as the RCU softirq handler happens to run
+your RCU callback while interrupting that acquisition's critical
+section.

 12. RCU callbacks can be and are executed in parallel. In many cases,
 the callback code simply wrappers around kfree(), so that this
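Item 11's rule is easiest to see in a hedged sketch: any lock that an RCU callback acquires must be acquired with bottom halves disabled everywhere else. The demo_item structure, lock, and counter below are invented for illustration::

        #include <linux/rcupdate.h>
        #include <linux/slab.h>
        #include <linux/spinlock.h>

        struct demo_item {
                struct rcu_head rcu;
        };

        static DEFINE_SPINLOCK(demo_lock);
        static int demo_pending;

        /* Invoked from softirq context by RCU. */
        static void demo_rcu_cb(struct rcu_head *rhp)
        {
                spin_lock(&demo_lock);          /* BH is already disabled here */
                demo_pending--;
                spin_unlock(&demo_lock);
                kfree(container_of(rhp, struct demo_item, rcu));
        }

        /* Process-context path that shares demo_lock with the callback. */
        static void demo_item_retire(struct demo_item *p)
        {
                spin_lock_bh(&demo_lock);       /* plain spin_lock() could deadlock */
                demo_pending++;
                spin_unlock_bh(&demo_lock);
                call_rcu(&p->rcu, demo_rcu_cb);
        }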
@@ -372,7 +386,17 @@ over a rather long period of time, but improvements are always welcome!
 for some real-time workloads, this is the whole point of using
 the rcu_nocbs= kernel boot parameter.

-13. Unlike other forms of RCU, it *is* permissible to block in an
+In addition, do not assume that callbacks queued in a given order
+will be invoked in that order, even if they all are queued on the
+same CPU. Furthermore, do not assume that same-CPU callbacks will
+be invoked serially. For example, in recent kernels, CPUs can be
+switched between offloaded and de-offloaded callback invocation,
+and while a given CPU is undergoing such a switch, its callbacks
+might be concurrently invoked by that CPU's softirq handler and
+that CPU's rcuo kthread. At such times, that CPU's callbacks
+might be executed both concurrently and out of order.
+
+13. Unlike most flavors of RCU, it *is* permissible to block in an
 SRCU read-side critical section (demarked by srcu_read_lock()
 and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
 Please note that if you don't need to sleep in read-side critical
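A hedged sketch of the SRCU usage described in item 13, with an invented srcu_struct; unlike the other RCU flavors, the read-side critical section may sleep::

        #include <linux/srcu.h>

        DEFINE_SRCU(demo_srcu);

        static void demo_srcu_reader(void)
        {
                int idx;

                idx = srcu_read_lock(&demo_srcu);
                /* May block here, for example on a mutex or a memory allocation. */
                srcu_read_unlock(&demo_srcu, idx);
        }

        static void demo_srcu_updater(void)
        {
                /* Waits only for readers of demo_srcu, not for all of RCU. */
                synchronize_srcu(&demo_srcu);
        }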
@@ -412,6 +436,12 @@ over a rather long period of time, but improvements are always welcome!
 never sends IPIs to other CPUs, so it is easier on
 real-time workloads than is synchronize_rcu_expedited().

+It is also permissible to sleep in RCU Tasks Trace read-side
+critical sections, which are delimited by rcu_read_lock_trace() and
+rcu_read_unlock_trace(). However, this is a specialized flavor
+of RCU, and you should not use it without first checking with
+its current users. In most cases, you should instead use SRCU.
+
 Note that rcu_assign_pointer() relates to SRCU just as it does to
 other forms of RCU, but instead of rcu_dereference() you should
 use srcu_dereference() in order to avoid lockdep splats.

@@ -442,50 +472,62 @@ over a rather long period of time, but improvements are always welcome!
 find problems as follows:

 CONFIG_PROVE_LOCKING:
-check that accesses to RCU-protected data
-structures are carried out under the proper RCU
-read-side critical section, while holding the right
-combination of locks, or whatever other conditions
-are appropriate.
+check that accesses to RCU-protected data structures
+are carried out under the proper RCU read-side critical
+section, while holding the right combination of locks,
+or whatever other conditions are appropriate.

 CONFIG_DEBUG_OBJECTS_RCU_HEAD:
-check that you don't pass the
-same object to call_rcu() (or friends) before an RCU
-grace period has elapsed since the last time that you
-passed that same object to call_rcu() (or friends).
+check that you don't pass the same object to call_rcu()
+(or friends) before an RCU grace period has elapsed
+since the last time that you passed that same object to
+call_rcu() (or friends).

 __rcu sparse checks:
-tag the pointer to the RCU-protected data
-structure with __rcu, and sparse will warn you if you
-access that pointer without the services of one of the
-variants of rcu_dereference().
+tag the pointer to the RCU-protected data structure
+with __rcu, and sparse will warn you if you access that
+pointer without the services of one of the variants
+of rcu_dereference().

 These debugging aids can help you find problems that are
 otherwise extremely difficult to spot.

-17. If you register a callback using call_rcu() or call_srcu(), and
-pass in a function defined within a loadable module, then it in
-necessary to wait for all pending callbacks to be invoked after
-the last invocation and before unloading that module. Note that
-it is absolutely *not* sufficient to wait for a grace period!
-The current (say) synchronize_rcu() implementation is *not*
-guaranteed to wait for callbacks registered on other CPUs.
-Or even on the current CPU if that CPU recently went offline
-and came back online.
+17. If you pass a callback function defined within a module to one of
+call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(),
+or call_rcu_tasks_trace(), then it is necessary to wait for all
+pending callbacks to be invoked before unloading that module.
+Note that it is absolutely *not* sufficient to wait for a grace
+period! For example, synchronize_rcu() implementation is *not*
+guaranteed to wait for callbacks registered on other CPUs via
+call_rcu(). Or even on the current CPU if that CPU recently
+went offline and came back online.

 You instead need to use one of the barrier functions:

 - call_rcu() -> rcu_barrier()
 - call_srcu() -> srcu_barrier()
+- call_rcu_tasks() -> rcu_barrier_tasks()
+- call_rcu_tasks_rude() -> rcu_barrier_tasks_rude()
+- call_rcu_tasks_trace() -> rcu_barrier_tasks_trace()

 However, these barrier functions are absolutely *not* guaranteed
-to wait for a grace period. In fact, if there are no call_rcu()
-callbacks waiting anywhere in the system, rcu_barrier() is within
-its rights to return immediately.
+to wait for a grace period. For example, if there are no
+call_rcu() callbacks queued anywhere in the system, rcu_barrier()
+can and will return immediately.

-So if you need to wait for both an RCU grace period and for
-all pre-existing call_rcu() callbacks, you will need to execute
-both rcu_barrier() and synchronize_rcu(), if necessary, using
-something like workqueues to execute them concurrently.
+So if you need to wait for both a grace period and for all
+pre-existing callbacks, you will need to invoke both functions,
+with the pair depending on the flavor of RCU:
+
+- Either synchronize_rcu() or synchronize_rcu_expedited(),
+  together with rcu_barrier()
+- Either synchronize_srcu() or synchronize_srcu_expedited(),
+  together with srcu_barrier()
+- synchronize_rcu_tasks() and rcu_barrier_tasks()
+- synchronize_rcu_tasks_rude() and rcu_barrier_tasks_rude()
+- synchronize_rcu_tasks_trace() and rcu_barrier_tasks_trace()
+
+If necessary, you can use something like workqueues to execute
+the requisite pair of functions concurrently.

 See rcubarrier.rst for more information.
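Item 17's rule about module unload is easiest to see in a hedged sketch: before the module text goes away, wait for a grace period and then for all callbacks that might still reference functions in the module. The demo names below, including demo_unpublish_all(), are hypothetical::

        #include <linux/module.h>
        #include <linux/rcupdate.h>

        /* Hypothetical: removes every object whose RCU callback points at
         * code in this module and queues call_rcu() on each of them. */
        void demo_unpublish_all(void);

        static void __exit demo_exit(void)
        {
                demo_unpublish_all();
                /*
                 * synchronize_rcu() guarantees that pre-existing readers are
                 * done; rcu_barrier() guarantees that the queued callbacks
                 * have been invoked. Only then may the module text disappear.
                 */
                synchronize_rcu();
                rcu_barrier();
        }
        module_exit(demo_exit);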
Documentation/RCU/index.rst:

@@ -9,7 +9,6 @@ RCU concepts
 .. toctree::
    :maxdepth: 3

-   arrayRCU
    checklist
    lockdep
    lockdep-splat
Documentation/RCU/listRCU.rst:

@@ -3,11 +3,10 @@
 Using RCU to Protect Read-Mostly Linked Lists
 =============================================

-One of the best applications of RCU is to protect read-mostly linked lists
-(``struct list_head`` in list.h). One big advantage of this approach
-is that all of the required memory barriers are included for you in
-the list macros. This document describes several applications of RCU,
-with the best fits first.
+One of the most common uses of RCU is protecting read-mostly linked lists
+(``struct list_head`` in list.h). One big advantage of this approach is
+that all of the required memory ordering is provided by the list macros.
+This document describes several list-based RCU use cases.


 Example 1: Read-mostly list: Deferred Destruction

@@ -35,7 +34,8 @@ The code traversing the list of all processes typically looks like::
         }
         rcu_read_unlock();

-The simplified code for removing a process from a task list is::
+The simplified and heavily inlined code for removing a process from a
+task list is::

         void release_task(struct task_struct *p)
         {

@@ -45,39 +45,48 @@ The simplified code for removing a process from a task list is::
                 call_rcu(&p->rcu, delayed_put_task_struct);
         }

-When a process exits, ``release_task()`` calls ``list_del_rcu(&p->tasks)`` under
-``tasklist_lock`` writer lock protection, to remove the task from the list of
-all tasks. The ``tasklist_lock`` prevents concurrent list additions/removals
-from corrupting the list. Readers using ``for_each_process()`` are not protected
-with the ``tasklist_lock``. To prevent readers from noticing changes in the list
-pointers, the ``task_struct`` object is freed only after one or more grace
-periods elapse (with the help of call_rcu()). This deferring of destruction
-ensures that any readers traversing the list will see valid ``p->tasks.next``
-pointers and deletion/freeing can happen in parallel with traversal of the list.
-This pattern is also called an **existence lock**, since RCU pins the object in
-memory until all existing readers finish.
+When a process exits, ``release_task()`` calls ``list_del_rcu(&p->tasks)``
+via __exit_signal() and __unhash_process() under ``tasklist_lock``
+writer lock protection. The list_del_rcu() invocation removes
+the task from the list of all tasks. The ``tasklist_lock``
+prevents concurrent list additions/removals from corrupting the
+list. Readers using ``for_each_process()`` are not protected with the
+``tasklist_lock``. To prevent readers from noticing changes in the list
+pointers, the ``task_struct`` object is freed only after one or more
+grace periods elapse, with the help of call_rcu(), which is invoked via
+put_task_struct_rcu_user(). This deferring of destruction ensures that
+any readers traversing the list will see valid ``p->tasks.next`` pointers
+and deletion/freeing can happen in parallel with traversal of the list.
+This pattern is also called an **existence lock**, since RCU refrains
+from invoking the delayed_put_task_struct() callback function until
+all existing readers finish, which guarantees that the ``task_struct``
+object in question will remain in existence until after the completion
+of all RCU readers that might possibly have a reference to that object.


 Example 2: Read-Side Action Taken Outside of Lock: No In-Place Updates
 ----------------------------------------------------------------------

-The best applications are cases where, if reader-writer locking were
-used, the read-side lock would be dropped before taking any action
-based on the results of the search. The most celebrated example is
-the routing table. Because the routing table is tracking the state of
-equipment outside of the computer, it will at times contain stale data.
-Therefore, once the route has been computed, there is no need to hold
-the routing table static during transmission of the packet. After all,
-you can hold the routing table static all you want, but that won't keep
-the external Internet from changing, and it is the state of the external
-Internet that really matters. In addition, routing entries are typically
-added or deleted, rather than being modified in place.
+Some reader-writer locking use cases compute a value while holding
+the read-side lock, but continue to use that value after that lock is
+released. These use cases are often good candidates for conversion
+to RCU. One prominent example involves network packet routing.
+Because the packet-routing data tracks the state of equipment outside
+of the computer, it will at times contain stale data. Therefore, once
+the route has been computed, there is no need to hold the routing table
+static during transmission of the packet. After all, you can hold the
+routing table static all you want, but that won't keep the external
+Internet from changing, and it is the state of the external Internet
+that really matters. In addition, routing entries are typically added
+or deleted, rather than being modified in place. This is a rare example
+of the finite speed of light and the non-zero size of atoms actually
+helping make synchronization be lighter weight.

-A straightforward example of this use of RCU may be found in the
-system-call auditing support. For example, a reader-writer locked
+A straightforward example of this type of RCU use case may be found in
+the system-call auditing support. For example, a reader-writer locked
 implementation of ``audit_filter_task()`` might be as follows::

-        static enum audit_state audit_filter_task(struct task_struct *tsk)
+        static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
         {
                 struct audit_entry *e;
                 enum audit_state state;

@@ -86,6 +95,8 @@ implementation of ``audit_filter_task()`` might be as follows::
                 /* Note: audit_filter_mutex held by caller. */
                 list_for_each_entry(e, &audit_tsklist, list) {
                         if (audit_filter_rules(tsk, &e->rule, NULL, &state)) {
+                                if (state == AUDIT_STATE_RECORD)
+                                        *key = kstrdup(e->rule.filterkey, GFP_ATOMIC);
                                 read_unlock(&auditsc_lock);
                                 return state;
                         }

@@ -101,7 +112,7 @@ you are turning auditing off, it is OK to audit a few extra system calls.

 This means that RCU can be easily applied to the read side, as follows::

-        static enum audit_state audit_filter_task(struct task_struct *tsk)
+        static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
         {
                 struct audit_entry *e;
                 enum audit_state state;

@@ -110,6 +121,8 @@ This means that RCU can be easily applied to the read side, as follows::
                 /* Note: audit_filter_mutex held by caller. */
                 list_for_each_entry_rcu(e, &audit_tsklist, list) {
                         if (audit_filter_rules(tsk, &e->rule, NULL, &state)) {
+                                if (state == AUDIT_STATE_RECORD)
+                                        *key = kstrdup(e->rule.filterkey, GFP_ATOMIC);
                                 rcu_read_unlock();
                                 return state;
                         }

@@ -118,13 +131,15 @@ This means that RCU can be easily applied to the read side, as follows::
                 return AUDIT_BUILD_CONTEXT;
         }

-The ``read_lock()`` and ``read_unlock()`` calls have become rcu_read_lock()
-and rcu_read_unlock(), respectively, and the list_for_each_entry() has
-become list_for_each_entry_rcu(). The **_rcu()** list-traversal primitives
-insert the read-side memory barriers that are required on DEC Alpha CPUs.
+The read_lock() and read_unlock() calls have become rcu_read_lock()
+and rcu_read_unlock(), respectively, and the list_for_each_entry()
+has become list_for_each_entry_rcu(). The **_rcu()** list-traversal
+primitives add READ_ONCE() and diagnostic checks for incorrect use
+outside of an RCU read-side critical section.

 The changes to the update side are also straightforward. A reader-writer lock
-might be used as follows for deletion and insertion::
+might be used as follows for deletion and insertion in these simplified
+versions of audit_del_rule() and audit_add_rule()::

         static inline int audit_del_rule(struct audit_rule *rule,
                                          struct list_head *list)

@@ -188,16 +203,16 @@ Following are the RCU equivalents for these two functions::
                 return 0;
         }

-Normally, the ``write_lock()`` and ``write_unlock()`` would be replaced by a
+Normally, the write_lock() and write_unlock() would be replaced by a
 spin_lock() and a spin_unlock(). But in this case, all callers hold
 ``audit_filter_mutex``, so no additional locking is required. The
-``auditsc_lock`` can therefore be eliminated, since use of RCU eliminates the
+auditsc_lock can therefore be eliminated, since use of RCU eliminates the
 need for writers to exclude readers.

 The list_del(), list_add(), and list_add_tail() primitives have been
 replaced by list_del_rcu(), list_add_rcu(), and list_add_tail_rcu().
-The **_rcu()** list-manipulation primitives add memory barriers that are needed on
-weakly ordered CPUs (most of them!). The list_del_rcu() primitive omits the
+The **_rcu()** list-manipulation primitives add memory barriers that are
+needed on weakly ordered CPUs. The list_del_rcu() primitive omits the
 pointer poisoning debug-assist code that would otherwise cause concurrent
 readers to fail spectacularly.

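A hedged, self-contained distillation of the conversion described above: readers traverse with the _rcu() list primitives under rcu_read_lock(), while updaters serialize with a spinlock and defer freeing. The demo_rule structure and list names are invented for illustration::

        #include <linux/rculist.h>
        #include <linux/slab.h>
        #include <linux/spinlock.h>

        struct demo_rule {
                int id;
                struct list_head list;
                struct rcu_head rcu;
        };

        static LIST_HEAD(demo_rules);
        static DEFINE_SPINLOCK(demo_rules_lock);

        /* Reader: lockless traversal under RCU protection. */
        static bool demo_rule_exists(int id)
        {
                struct demo_rule *r;
                bool found = false;

                rcu_read_lock();
                list_for_each_entry_rcu(r, &demo_rules, list) {
                        if (r->id == id) {
                                found = true;
                                break;
                        }
                }
                rcu_read_unlock();
                return found;
        }

        /* Updater: removal under the spinlock, freeing deferred past readers. */
        static int demo_rule_del(int id)
        {
                struct demo_rule *r;

                spin_lock(&demo_rules_lock);
                list_for_each_entry(r, &demo_rules, list) {
                        if (r->id == id) {
                                list_del_rcu(&r->list);
                                spin_unlock(&demo_rules_lock);
                                kfree_rcu(r, rcu);
                                return 0;
                        }
                }
                spin_unlock(&demo_rules_lock);
                return -ENOENT;
        }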
@@ -238,7 +253,9 @@ need to be filled in)::
|
|||||||
The RCU version creates a copy, updates the copy, then replaces the old
|
The RCU version creates a copy, updates the copy, then replaces the old
|
||||||
entry with the newly updated entry. This sequence of actions, allowing
|
entry with the newly updated entry. This sequence of actions, allowing
|
||||||
concurrent reads while making a copy to perform an update, is what gives
|
concurrent reads while making a copy to perform an update, is what gives
|
||||||
RCU (*read-copy update*) its name. The RCU code is as follows::
|
RCU (*read-copy update*) its name.
|
||||||
|
|
||||||
|
The RCU version of audit_upd_rule() is as follows::
|
||||||
|
|
||||||
static inline int audit_upd_rule(struct audit_rule *rule,
|
static inline int audit_upd_rule(struct audit_rule *rule,
|
||||||
struct list_head *list,
|
struct list_head *list,
|
||||||
@@ -267,6 +284,9 @@ RCU (*read-copy update*) its name. The RCU code is as follows::
|
|||||||
Again, this assumes that the caller holds ``audit_filter_mutex``. Normally, the
|
Again, this assumes that the caller holds ``audit_filter_mutex``. Normally, the
|
||||||
writer lock would become a spinlock in this sort of code.
|
writer lock would become a spinlock in this sort of code.
|
||||||
|
|
||||||
|
The update_lsm_rule() does something very similar, for those who would
|
||||||
|
prefer to look at real Linux-kernel code.
|
||||||
|
|
||||||
Another use of this pattern can be found in the openswitch driver's *connection
|
Another use of this pattern can be found in the openswitch driver's *connection
|
||||||
tracking table* code in ``ct_limit_set()``. The table holds connection tracking
|
tracking table* code in ``ct_limit_set()``. The table holds connection tracking
|
||||||
entries and has a limit on the maximum entries. There is one such table
|
entries and has a limit on the maximum entries. There is one such table
|
||||||
@@ -281,9 +301,10 @@ Example 4: Eliminating Stale Data
|
|||||||
---------------------------------
|
---------------------------------
|
||||||
|
|
||||||
The auditing example above tolerates stale data, as do most algorithms
|
The auditing example above tolerates stale data, as do most algorithms
|
||||||
that are tracking external state. Because there is a delay from the
|
that are tracking external state. After all, given there is a delay
|
||||||
time the external state changes before Linux becomes aware of the change,
|
from the time the external state changes before Linux becomes aware
|
||||||
additional RCU-induced staleness is generally not a problem.
|
of the change, and so as noted earlier, a small quantity of additional
|
||||||
|
RCU-induced staleness is generally not a problem.
|
||||||
|
|
||||||
However, there are many examples where stale data cannot be tolerated.
|
However, there are many examples where stale data cannot be tolerated.
|
||||||
One example in the Linux kernel is the System V IPC (see the shm_lock()
|
One example in the Linux kernel is the System V IPC (see the shm_lock()
|
||||||
@@ -302,7 +323,7 @@ Quick Quiz:
|
|||||||
|
|
||||||
If the system-call audit module were to ever need to reject stale data, one way
|
If the system-call audit module were to ever need to reject stale data, one way
|
||||||
to accomplish this would be to add a ``deleted`` flag and a ``lock`` spinlock to the
|
to accomplish this would be to add a ``deleted`` flag and a ``lock`` spinlock to the
|
||||||
audit_entry structure, and modify ``audit_filter_task()`` as follows::
|
``audit_entry`` structure, and modify audit_filter_task() as follows::
|
||||||
|
|
||||||
static enum audit_state audit_filter_task(struct task_struct *tsk)
|
static enum audit_state audit_filter_task(struct task_struct *tsk)
|
||||||
{
|
{
|
||||||
@@ -319,6 +340,8 @@ audit_entry structure, and modify ``audit_filter_task()`` as follows::
|
|||||||
return AUDIT_BUILD_CONTEXT;
|
return AUDIT_BUILD_CONTEXT;
|
||||||
}
|
}
|
||||||
rcu_read_unlock();
|
rcu_read_unlock();
|
||||||
|
if (state == AUDIT_STATE_RECORD)
|
||||||
|
*key = kstrdup(e->rule.filterkey, GFP_ATOMIC);
|
||||||
return state;
|
return state;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -326,12 +349,6 @@ audit_entry structure, and modify ``audit_filter_task()`` as follows::
|
|||||||
return AUDIT_BUILD_CONTEXT;
|
return AUDIT_BUILD_CONTEXT;
|
||||||
}
|
}
|
||||||
|
|
||||||
Note that this example assumes that entries are only added and deleted.
|
|
||||||
Additional mechanism is required to deal correctly with the update-in-place
|
|
||||||
performed by ``audit_upd_rule()``. For one thing, ``audit_upd_rule()`` would
|
|
||||||
need additional memory barriers to ensure that the list_add_rcu() was really
|
|
||||||
executed before the list_del_rcu().
|
|
||||||
|
|
||||||
The ``audit_del_rule()`` function would need to set the ``deleted`` flag under the
|
The ``audit_del_rule()`` function would need to set the ``deleted`` flag under the
|
||||||
spinlock as follows::
|
spinlock as follows::
|
||||||
|
|
||||||
@@ -357,24 +374,32 @@ spinlock as follows::
|
|||||||
|
|
||||||
This too assumes that the caller holds ``audit_filter_mutex``.
|
This too assumes that the caller holds ``audit_filter_mutex``.
|
||||||
|
|
||||||
|
Note that this example assumes that entries are only added and deleted.
|
||||||
|
Additional mechanism is required to deal correctly with the update-in-place
|
||||||
|
performed by audit_upd_rule(). For one thing, audit_upd_rule() would
|
||||||
|
need to hold the locks of both the old ``audit_entry`` and its replacement
|
||||||
|
while executing the list_replace_rcu().
|
||||||
|
|
||||||
|
|
||||||
Example 5: Skipping Stale Objects
|
 Example 5: Skipping Stale Objects
 ---------------------------------

-For some usecases, reader performance can be improved by skipping stale objects
-during read-side list traversal if the object in concern is pending destruction
-after one or more grace periods. One such example can be found in the timerfd
-subsystem. When a ``CLOCK_REALTIME`` clock is reprogrammed - for example due to
-setting of the system time, then all programmed timerfds that depend on this
-clock get triggered and processes waiting on them to expire are woken up in
-advance of their scheduled expiry. To facilitate this, all such timers are added
-to an RCU-managed ``cancel_list`` when they are setup in
+For some use cases, reader performance can be improved by skipping
+stale objects during read-side list traversal, where stale objects
+are those that will be removed and destroyed after one or more grace
+periods. One such example can be found in the timerfd subsystem. When a
+``CLOCK_REALTIME`` clock is reprogrammed (for example due to setting
+of the system time) then all programmed ``timerfds`` that depend on
+this clock get triggered and processes waiting on them are awakened in
+advance of their scheduled expiry. To facilitate this, all such timers
+are added to an RCU-managed ``cancel_list`` when they are setup in
 ``timerfd_setup_cancel()``::

 	static void timerfd_setup_cancel(struct timerfd_ctx *ctx, int flags)
 	{
 		spin_lock(&ctx->cancel_lock);
-		if ((ctx->clockid == CLOCK_REALTIME &&
+		if ((ctx->clockid == CLOCK_REALTIME ||
+		     ctx->clockid == CLOCK_REALTIME_ALARM) &&
 		    (flags & TFD_TIMER_ABSTIME) && (flags & TFD_TIMER_CANCEL_ON_SET)) {
 			if (!ctx->might_cancel) {
 				ctx->might_cancel = true;
@@ -382,13 +407,16 @@ to an RCU-managed ``cancel_list`` when they are setup in
 				list_add_rcu(&ctx->clist, &cancel_list);
 				spin_unlock(&cancel_lock);
 			}
+		} else {
+			__timerfd_remove_cancel(ctx);
 		}
 		spin_unlock(&ctx->cancel_lock);
 	}

-When a timerfd is freed (fd is closed), then the ``might_cancel`` flag of the
-timerfd object is cleared, the object removed from the ``cancel_list`` and
-destroyed::
+When a timerfd is freed (fd is closed), then the ``might_cancel``
+flag of the timerfd object is cleared, the object removed from the
+``cancel_list`` and destroyed, as shown in this simplified and inlined
+version of timerfd_release()::

 	int timerfd_release(struct inode *inode, struct file *file)
 	{
@@ -403,7 +431,10 @@ destroyed::
 		}
 		spin_unlock(&ctx->cancel_lock);

-		hrtimer_cancel(&ctx->t.tmr);
+		if (isalarm(ctx))
+			alarm_cancel(&ctx->t.alarm);
+		else
+			hrtimer_cancel(&ctx->t.tmr);
 		kfree_rcu(ctx, rcu);
 		return 0;
 	}
@@ -416,6 +447,7 @@ objects::

 	void timerfd_clock_was_set(void)
 	{
+		ktime_t moffs = ktime_mono_to_real(0);
 		struct timerfd_ctx *ctx;
 		unsigned long flags;

@@ -424,7 +456,7 @@ objects::
 			if (!ctx->might_cancel)
 				continue;
 			spin_lock_irqsave(&ctx->wqh.lock, flags);
-			if (ctx->moffs != ktime_mono_to_real(0)) {
+			if (ctx->moffs != moffs) {
 				ctx->moffs = KTIME_MAX;
 				ctx->ticks++;
 				wake_up_locked_poll(&ctx->wqh, EPOLLIN);
@@ -434,10 +466,10 @@ objects::
 		rcu_read_unlock();
 	}

-The key point here is, because RCU-traversal of the ``cancel_list`` happens
-while objects are being added and removed to the list, sometimes the traversal
-can step on an object that has been removed from the list. In this example, it
-is seen that it is better to skip such objects using a flag.
+The key point is that because RCU-protected traversal of the
+``cancel_list`` happens concurrently with object addition and removal,
+sometimes the traversal can access an object that has been removed from
+the list. In this example, a flag is used to skip such objects.

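The pattern generalizes beyond timerfd. The following is a minimal,
self-contained sketch of it under illustrative names: ``struct item``,
``item_list``, ``item_lock`` and ``process()`` are made up for this
example and are not taken from the timerfd code::

	static LIST_HEAD(item_list);
	static DEFINE_SPINLOCK(item_lock);

	struct item {
		struct list_head node;
		bool defunct;		/* set before removal; readers skip */
		struct rcu_head rcu;
	};

	void process(struct item *p);	/* illustrative per-object work */

	/* Reader: skip objects that are already pending destruction. */
	void visit_items(void)
	{
		struct item *p;

		rcu_read_lock();
		list_for_each_entry_rcu(p, &item_list, node) {
			if (READ_ONCE(p->defunct))
				continue;	/* stale object */
			process(p);
		}
		rcu_read_unlock();
	}

	/* Updater: mark stale, unlink, then defer freeing by a grace period. */
	void remove_item(struct item *p)
	{
		spin_lock(&item_lock);
		WRITE_ONCE(p->defunct, true);
		list_del_rcu(&p->node);
		spin_unlock(&item_lock);
		kfree_rcu(p, rcu);
	}
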
 Summary

@@ -17,7 +17,9 @@ state::
 	rcu_read_lock_held() for normal RCU.
 	rcu_read_lock_bh_held() for RCU-bh.
 	rcu_read_lock_sched_held() for RCU-sched.
+	rcu_read_lock_any_held() for any of normal RCU, RCU-bh, and RCU-sched.
 	srcu_read_lock_held() for SRCU.
+	rcu_read_lock_trace_held() for RCU Tasks Trace.

 These functions are conservative, and will therefore return 1 if they
 aren't certain (for example, if CONFIG_DEBUG_LOCK_ALLOC is not set).
@@ -53,6 +55,8 @@ checking of rcu_dereference() primitives:
 	is invoked by both SRCU readers and updaters.
 rcu_dereference_raw(p):
 	Don't check. (Use sparingly, if at all.)
+rcu_dereference_raw_check(p):
+	Don't do lockdep at all. (Use sparingly, if at all.)
 rcu_dereference_protected(p, c):
 	Use explicit check expression "c", and omit all barriers
 	and compiler constraints. This is useful when the data
@@ -468,6 +468,9 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool

+config ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
+	bool
+
 config HAVE_ALIGNED_STRUCT_PAGE
 	bool
 	help
@@ -31,6 +31,7 @@ config ARM64
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_KEEPINITRD
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
+	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	select ARCH_HAS_PTE_DEVMAP
 	select ARCH_HAS_PTE_SPECIAL
@@ -10,6 +10,7 @@ config LOONGARCH
 	select ARCH_ENABLE_MEMORY_HOTPLUG
 	select ARCH_ENABLE_MEMORY_HOTREMOVE
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
+	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_INLINE_READ_LOCK if !PREEMPTION
@@ -73,6 +73,7 @@ config S390
 	select ARCH_HAS_GIGANTIC_PAGE
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_MEM_ENCRYPT
+	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_SCALED_CPUTIME
 	select ARCH_HAS_SET_MEMORY
@@ -81,6 +81,7 @@ config X86
 	select ARCH_HAS_KCOV if X86_64
 	select ARCH_HAS_MEM_ENCRYPT
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
+	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	select ARCH_HAS_PMEM_API if X86_64
 	select ARCH_HAS_PTE_DEVMAP if X86_64
@@ -312,7 +312,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 	 * Ensure that all tasks observe the host state change before the
 	 * host_failed change.
 	 */
-	call_rcu(&scmd->rcu, scsi_eh_inc_host_failed);
+	call_rcu_hurry(&scmd->rcu, scsi_eh_inc_host_failed);
 }

 /**
@@ -416,7 +416,7 @@ static __always_inline void guest_context_enter_irqoff(void)
 	 */
 	if (!context_tracking_guest_enter()) {
 		instrumentation_begin();
-		rcu_virt_note_context_switch(smp_processor_id());
+		rcu_virt_note_context_switch();
 		instrumentation_end();
 	}
 }
@@ -108,6 +108,15 @@ static inline int rcu_preempt_depth(void)

 #endif /* #else #ifdef CONFIG_PREEMPT_RCU */

+#ifdef CONFIG_RCU_LAZY
+void call_rcu_hurry(struct rcu_head *head, rcu_callback_t func);
+#else
+static inline void call_rcu_hurry(struct rcu_head *head, rcu_callback_t func)
+{
+	call_rcu(head, func);
+}
+#endif
+
 /* Internal to kernel */
 void rcu_init(void);
 extern int rcu_scheduler_active;
@@ -340,6 +349,11 @@ static inline int rcu_read_lock_any_held(void)
 	return !preemptible();
 }

+static inline int debug_lockdep_rcu_enabled(void)
+{
+	return 0;
+}
+
 #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */

 #ifdef CONFIG_PROVE_RCU
@@ -142,12 +142,10 @@ static inline int rcu_needs_cpu(void)
  * Take advantage of the fact that there is only one CPU, which
  * allows us to ignore virtualization-based context switches.
  */
-static inline void rcu_virt_note_context_switch(int cpu) { }
+static inline void rcu_virt_note_context_switch(void) { }
 static inline void rcu_cpu_stall_reset(void) { }
 static inline int rcu_jiffies_till_stall_check(void) { return 21 * HZ; }
 static inline void rcu_irq_exit_check_preempt(void) { }
-#define rcu_is_idle_cpu(cpu) \
-	(is_idle_task(current) && !in_nmi() && !in_hardirq() && !in_serving_softirq())
 static inline void exit_rcu(void) { }
 static inline bool rcu_preempt_need_deferred_qs(struct task_struct *t)
 {
@@ -27,7 +27,7 @@ void rcu_cpu_stall_reset(void);
  * wrapper around rcu_note_context_switch(), which allows TINY_RCU
  * to save a few bytes. The caller must have disabled interrupts.
  */
-static inline void rcu_virt_note_context_switch(int cpu)
+static inline void rcu_virt_note_context_switch(void)
 {
 	rcu_note_context_switch(false);
 }
@@ -87,8 +87,6 @@ bool poll_state_synchronize_rcu_full(struct rcu_gp_oldstate *rgosp);
 void cond_synchronize_rcu(unsigned long oldstate);
 void cond_synchronize_rcu_full(struct rcu_gp_oldstate *rgosp);

-bool rcu_is_idle_cpu(int cpu);
-
 #ifdef CONFIG_PROVE_RCU
 void rcu_irq_exit_check_preempt(void);
 #else
@@ -76,6 +76,17 @@
  * rcu_read_lock before reading the address, then rcu_read_unlock after
  * taking the spinlock within the structure expected at that address.
  *
+ * Note that it is not possible to acquire a lock within a structure
+ * allocated with SLAB_TYPESAFE_BY_RCU without first acquiring a reference
+ * as described above. The reason is that SLAB_TYPESAFE_BY_RCU pages
+ * are not zeroed before being given to the slab, which means that any
+ * locks must be initialized after each and every kmem_struct_alloc().
+ * Alternatively, make the ctor passed to kmem_cache_create() initialize
+ * the locks at page-allocation time, as is done in __i915_request_ctor(),
+ * sighand_ctor(), and anon_vma_ctor(). Such a ctor permits readers
+ * to safely acquire those ctor-initialized locks under rcu_read_lock()
+ * protection.
+ *
  * Note that SLAB_TYPESAFE_BY_RCU was originally named SLAB_DESTROY_BY_RCU.
  */
 /* Defer freeing slabs to RCU */
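A minimal sketch of the ctor approach recommended in the comment above,
assuming a hypothetical ``struct foo`` and cache name that are not taken
from the kernel sources; the constructor initializes the lock when the
slab populates its backing pages, so the lock remains valid across
free/reallocation cycles within the cache::

	#include <linux/slab.h>
	#include <linux/spinlock.h>

	struct foo {
		spinlock_t lock;	/* stays initialized across reuse */
		/* per-allocation fields are (re)set after kmem_cache_alloc() */
	};

	static struct kmem_cache *foo_cache;

	static void foo_ctor(void *addr)
	{
		struct foo *f = addr;

		spin_lock_init(&f->lock);
	}

	static int __init foo_cache_init(void)
	{
		foo_cache = kmem_cache_create("foo_cache", sizeof(struct foo), 0,
					      SLAB_TYPESAFE_BY_RCU, foo_ctor);
		return foo_cache ? 0 : -ENOMEM;
	}
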
@@ -64,6 +64,20 @@ unsigned long get_state_synchronize_srcu(struct srcu_struct *ssp);
 unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp);
 bool poll_state_synchronize_srcu(struct srcu_struct *ssp, unsigned long cookie);

+#ifdef CONFIG_NEED_SRCU_NMI_SAFE
+int __srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp);
+void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx) __releases(ssp);
+#else
+static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp)
+{
+	return __srcu_read_lock(ssp);
+}
+static inline void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
+{
+	__srcu_read_unlock(ssp, idx);
+}
+#endif /* CONFIG_NEED_SRCU_NMI_SAFE */
+
 #ifdef CONFIG_SRCU
 void srcu_init(void);
 #else /* #ifdef CONFIG_SRCU */
@@ -104,6 +118,18 @@ static inline int srcu_read_lock_held(const struct srcu_struct *ssp)

 #endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */

+#define SRCU_NMI_UNKNOWN 0x0
+#define SRCU_NMI_UNSAFE 0x1
+#define SRCU_NMI_SAFE 0x2
+
+#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_TREE_SRCU)
+void srcu_check_nmi_safety(struct srcu_struct *ssp, bool nmi_safe);
+#else
+static inline void srcu_check_nmi_safety(struct srcu_struct *ssp,
+					 bool nmi_safe) { }
+#endif
+
+
 /**
  * srcu_dereference_check - fetch SRCU-protected pointer for later dereferencing
  * @p: the pointer to fetch and protect for later dereferencing
@@ -161,17 +187,36 @@ static inline int srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp)
 {
 	int retval;

+	srcu_check_nmi_safety(ssp, false);
 	retval = __srcu_read_lock(ssp);
 	rcu_lock_acquire(&(ssp)->dep_map);
 	return retval;
 }

+/**
+ * srcu_read_lock_nmisafe - register a new reader for an SRCU-protected structure.
+ * @ssp: srcu_struct in which to register the new reader.
+ *
+ * Enter an SRCU read-side critical section, but in an NMI-safe manner.
+ * See srcu_read_lock() for more information.
+ */
+static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp)
+{
+	int retval;
+
+	srcu_check_nmi_safety(ssp, true);
+	retval = __srcu_read_lock_nmisafe(ssp);
+	rcu_lock_acquire(&(ssp)->dep_map);
+	return retval;
+}
+
 /* Used by tracing, cannot be traced and cannot invoke lockdep. */
 static inline notrace int
 srcu_read_lock_notrace(struct srcu_struct *ssp) __acquires(ssp)
 {
 	int retval;

+	srcu_check_nmi_safety(ssp, false);
 	retval = __srcu_read_lock(ssp);
 	return retval;
 }
@@ -187,14 +232,32 @@ static inline void srcu_read_unlock(struct srcu_struct *ssp, int idx)
 	__releases(ssp)
 {
 	WARN_ON_ONCE(idx & ~0x1);
+	srcu_check_nmi_safety(ssp, false);
 	rcu_lock_release(&(ssp)->dep_map);
 	__srcu_read_unlock(ssp, idx);
 }

+/**
+ * srcu_read_unlock_nmisafe - unregister a old reader from an SRCU-protected structure.
+ * @ssp: srcu_struct in which to unregister the old reader.
+ * @idx: return value from corresponding srcu_read_lock().
+ *
+ * Exit an SRCU read-side critical section, but in an NMI-safe manner.
+ */
+static inline void srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
+	__releases(ssp)
+{
+	WARN_ON_ONCE(idx & ~0x1);
+	srcu_check_nmi_safety(ssp, true);
+	rcu_lock_release(&(ssp)->dep_map);
+	__srcu_read_unlock_nmisafe(ssp, idx);
+}
+
 /* Used by tracing, cannot be traced and cannot call lockdep. */
 static inline notrace void
 srcu_read_unlock_notrace(struct srcu_struct *ssp, int idx) __releases(ssp)
 {
+	srcu_check_nmi_safety(ssp, false);
 	__srcu_read_unlock(ssp, idx);
 }

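A likely usage pattern for the NMI-safe readers declared above is
sketched here; all readers of a given srcu_struct must consistently use
either the NMI-safe or the plain flavor, which is what
srcu_check_nmi_safety() enforces under CONFIG_PROVE_RCU. The
srcu_struct, ``struct log_buf`` and ``emit()`` names are illustrative
assumptions, not anything from the kernel sources::

	#include <linux/srcu.h>

	struct log_buf;					/* illustrative */
	void emit(struct log_buf *b, const char *msg);	/* illustrative */

	DEFINE_STATIC_SRCU(log_srcu);
	static struct log_buf __rcu *cur_log;

	/* NMI context (for example, a lockless printk-style path). */
	void log_from_nmi(const char *msg)
	{
		struct log_buf *b;
		int idx;

		idx = srcu_read_lock_nmisafe(&log_srcu);
		b = srcu_dereference(cur_log, &log_srcu);
		if (b)
			emit(b, msg);
		srcu_read_unlock_nmisafe(&log_srcu, idx);
	}

	/* Process context: the same srcu_struct must use the same flavor. */
	void log_from_task(const char *msg)
	{
		struct log_buf *b;
		int idx;

		idx = srcu_read_lock_nmisafe(&log_srcu);
		b = srcu_dereference(cur_log, &log_srcu);
		if (b)
			emit(b, msg);
		srcu_read_unlock_nmisafe(&log_srcu, idx);
	}
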
@@ -23,8 +23,9 @@ struct srcu_struct;
  */
 struct srcu_data {
 	/* Read-side state. */
-	unsigned long srcu_lock_count[2];	/* Locks per CPU. */
-	unsigned long srcu_unlock_count[2];	/* Unlocks per CPU. */
+	atomic_long_t srcu_lock_count[2];	/* Locks per CPU. */
+	atomic_long_t srcu_unlock_count[2];	/* Unlocks per CPU. */
+	int srcu_nmi_safety;			/* NMI-safe srcu_struct structure? */

 	/* Update-side state. */
 	spinlock_t __private lock ____cacheline_internodealigned_in_smp;
@@ -72,6 +72,9 @@ config TREE_SRCU
 	help
 	  This option selects the full-fledged version of SRCU.

+config NEED_SRCU_NMI_SAFE
+	def_bool HAVE_NMI && !ARCH_HAS_NMI_SAFE_THIS_CPU_OPS && !TINY_SRCU
+
 config TASKS_RCU_GENERIC
 	def_bool TASKS_RCU || TASKS_RUDE_RCU || TASKS_TRACE_RCU
 	select SRCU
@@ -311,4 +314,12 @@ config TASKS_TRACE_RCU_READ_MB
 	  Say N here if you hate read-side memory barriers.
 	  Take the default if you are unsure.

+config RCU_LAZY
+	bool "RCU callback lazy invocation functionality"
+	depends on RCU_NOCB_CPU
+	default n
+	help
+	  To save power, batch RCU callbacks and flush after delay, memory
+	  pressure, or callback list growing too big.
+
 endmenu # "RCU Subsystem"
@@ -474,6 +474,14 @@ enum rcutorture_type {
 	INVALID_RCU_FLAVOR
 };

+#if defined(CONFIG_RCU_LAZY)
+unsigned long rcu_lazy_get_jiffies_till_flush(void);
+void rcu_lazy_set_jiffies_till_flush(unsigned long j);
+#else
+static inline unsigned long rcu_lazy_get_jiffies_till_flush(void) { return 0; }
+static inline void rcu_lazy_set_jiffies_till_flush(unsigned long j) { }
+#endif
+
 #if defined(CONFIG_TREE_RCU)
 void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
 			    unsigned long *gp_seq);
@@ -95,6 +95,7 @@ torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
 torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable");
 torture_param(int, kfree_rcu_test, 0, "Do we run a kfree_rcu() scale test?");
 torture_param(int, kfree_mult, 1, "Multiple of kfree_obj size to allocate.");
+torture_param(int, kfree_by_call_rcu, 0, "Use call_rcu() to emulate kfree_rcu()?");

 static char *scale_type = "rcu";
 module_param(scale_type, charp, 0444);
@@ -175,7 +176,7 @@ static struct rcu_scale_ops rcu_ops = {
 	.get_gp_seq = rcu_get_gp_seq,
 	.gp_diff = rcu_seq_diff,
 	.exp_completed = rcu_exp_batches_completed,
-	.async = call_rcu,
+	.async = call_rcu_hurry,
 	.gp_barrier = rcu_barrier,
 	.sync = synchronize_rcu,
 	.exp_sync = synchronize_rcu_expedited,
@@ -659,6 +660,14 @@ struct kfree_obj {
 	struct rcu_head rh;
 };

+/* Used if doing RCU-kfree'ing via call_rcu(). */
+static void kfree_call_rcu(struct rcu_head *rh)
+{
+	struct kfree_obj *obj = container_of(rh, struct kfree_obj, rh);
+
+	kfree(obj);
+}
+
 static int
 kfree_scale_thread(void *arg)
 {
@@ -696,6 +705,11 @@ kfree_scale_thread(void *arg)
 		if (!alloc_ptr)
 			return -ENOMEM;

+		if (kfree_by_call_rcu) {
+			call_rcu(&(alloc_ptr->rh), kfree_call_rcu);
+			continue;
+		}
+
 		// By default kfree_rcu_test_single and kfree_rcu_test_double are
 		// initialized to false. If both have the same value (false or true)
 		// both are randomly tested, otherwise only the one with value true
@@ -767,11 +781,58 @@ kfree_scale_shutdown(void *arg)
 	return -EINVAL;
 }

+// Used if doing RCU-kfree'ing via call_rcu().
+static unsigned long jiffies_at_lazy_cb;
+static struct rcu_head lazy_test1_rh;
+static int rcu_lazy_test1_cb_called;
+static void call_rcu_lazy_test1(struct rcu_head *rh)
+{
+	jiffies_at_lazy_cb = jiffies;
+	WRITE_ONCE(rcu_lazy_test1_cb_called, 1);
+}
+
 static int __init
 kfree_scale_init(void)
 {
-	long i;
 	int firsterr = 0;
+	long i;
+	unsigned long jif_start;
+	unsigned long orig_jif;
+
+	// Also, do a quick self-test to ensure laziness is as much as
+	// expected.
+	if (kfree_by_call_rcu && !IS_ENABLED(CONFIG_RCU_LAZY)) {
+		pr_alert("CONFIG_RCU_LAZY is disabled, falling back to kfree_rcu() for delayed RCU kfree'ing\n");
+		kfree_by_call_rcu = 0;
+	}
+
+	if (kfree_by_call_rcu) {
+		/* do a test to check the timeout. */
+		orig_jif = rcu_lazy_get_jiffies_till_flush();
+
+		rcu_lazy_set_jiffies_till_flush(2 * HZ);
+		rcu_barrier();
+
+		jif_start = jiffies;
+		jiffies_at_lazy_cb = 0;
+		call_rcu(&lazy_test1_rh, call_rcu_lazy_test1);
+
+		smp_cond_load_relaxed(&rcu_lazy_test1_cb_called, VAL == 1);
+
+		rcu_lazy_set_jiffies_till_flush(orig_jif);
+
+		if (WARN_ON_ONCE(jiffies_at_lazy_cb - jif_start < 2 * HZ)) {
+			pr_alert("ERROR: call_rcu() CBs are not being lazy as expected!\n");
+			WARN_ON_ONCE(1);
+			return -1;
+		}
+
+		if (WARN_ON_ONCE(jiffies_at_lazy_cb - jif_start > 3 * HZ)) {
+			pr_alert("ERROR: call_rcu() CBs are being too lazy!\n");
+			WARN_ON_ONCE(1);
+			return -1;
+		}
+	}
+
 	kfree_nrealthreads = compute_real(kfree_nthreads);
 	/* Start up the kthreads. */
@@ -784,7 +845,9 @@ kfree_scale_init(void)
 		schedule_timeout_uninterruptible(1);
 	}

-	pr_alert("kfree object size=%zu\n", kfree_mult * sizeof(struct kfree_obj));
+	pr_alert("kfree object size=%zu, kfree_by_call_rcu=%d\n",
+		 kfree_mult * sizeof(struct kfree_obj),
+		 kfree_by_call_rcu);

 	kfree_reader_tasks = kcalloc(kfree_nrealthreads, sizeof(kfree_reader_tasks[0]),
 				     GFP_KERNEL);
@@ -357,6 +357,10 @@ struct rcu_torture_ops {
 	bool (*poll_gp_state_exp)(unsigned long oldstate);
 	void (*cond_sync_exp)(unsigned long oldstate);
 	void (*cond_sync_exp_full)(struct rcu_gp_oldstate *rgosp);
+	unsigned long (*get_comp_state)(void);
+	void (*get_comp_state_full)(struct rcu_gp_oldstate *rgosp);
+	bool (*same_gp_state)(unsigned long oldstate1, unsigned long oldstate2);
+	bool (*same_gp_state_full)(struct rcu_gp_oldstate *rgosp1, struct rcu_gp_oldstate *rgosp2);
 	unsigned long (*get_gp_state)(void);
 	void (*get_gp_state_full)(struct rcu_gp_oldstate *rgosp);
 	unsigned long (*get_gp_completed)(void);
@@ -510,7 +514,7 @@ static unsigned long rcu_no_completed(void)

 static void rcu_torture_deferred_free(struct rcu_torture *p)
 {
-	call_rcu(&p->rtort_rcu, rcu_torture_cb);
+	call_rcu_hurry(&p->rtort_rcu, rcu_torture_cb);
 }

 static void rcu_sync_torture_init(void)
@@ -535,6 +539,10 @@ static struct rcu_torture_ops rcu_ops = {
 	.deferred_free = rcu_torture_deferred_free,
 	.sync = synchronize_rcu,
 	.exp_sync = synchronize_rcu_expedited,
+	.same_gp_state = same_state_synchronize_rcu,
+	.same_gp_state_full = same_state_synchronize_rcu_full,
+	.get_comp_state = get_completed_synchronize_rcu,
+	.get_comp_state_full = get_completed_synchronize_rcu_full,
 	.get_gp_state = get_state_synchronize_rcu,
 	.get_gp_state_full = get_state_synchronize_rcu_full,
 	.get_gp_completed = get_completed_synchronize_rcu,
@@ -551,7 +559,7 @@ static struct rcu_torture_ops rcu_ops = {
 	.start_gp_poll_exp_full = start_poll_synchronize_rcu_expedited_full,
 	.poll_gp_state_exp = poll_state_synchronize_rcu,
 	.cond_sync_exp = cond_synchronize_rcu_expedited,
-	.call = call_rcu,
+	.call = call_rcu_hurry,
 	.cb_barrier = rcu_barrier,
 	.fqs = rcu_force_quiescent_state,
 	.stats = NULL,
@@ -615,10 +623,14 @@ static struct rcu_torture_ops rcu_busted_ops = {
 DEFINE_STATIC_SRCU(srcu_ctl);
 static struct srcu_struct srcu_ctld;
 static struct srcu_struct *srcu_ctlp = &srcu_ctl;
+static struct rcu_torture_ops srcud_ops;

 static int srcu_torture_read_lock(void) __acquires(srcu_ctlp)
 {
-	return srcu_read_lock(srcu_ctlp);
+	if (cur_ops == &srcud_ops)
+		return srcu_read_lock_nmisafe(srcu_ctlp);
+	else
+		return srcu_read_lock(srcu_ctlp);
 }

 static void
@@ -642,7 +654,10 @@ srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)

 static void srcu_torture_read_unlock(int idx) __releases(srcu_ctlp)
 {
-	srcu_read_unlock(srcu_ctlp, idx);
+	if (cur_ops == &srcud_ops)
+		srcu_read_unlock_nmisafe(srcu_ctlp, idx);
+	else
+		srcu_read_unlock(srcu_ctlp, idx);
 }

 static int torture_srcu_read_lock_held(void)
@@ -848,7 +863,7 @@ static void rcu_tasks_torture_deferred_free(struct rcu_torture *p)

 static void synchronize_rcu_mult_test(void)
 {
-	synchronize_rcu_mult(call_rcu_tasks, call_rcu);
+	synchronize_rcu_mult(call_rcu_tasks, call_rcu_hurry);
 }

 static struct rcu_torture_ops tasks_ops = {
@@ -1258,13 +1273,15 @@ static void rcu_torture_write_types(void)
 	} else if (gp_normal && !cur_ops->deferred_free) {
 		pr_alert("%s: gp_normal without primitives.\n", __func__);
 	}
-	if (gp_poll1 && cur_ops->start_gp_poll && cur_ops->poll_gp_state) {
+	if (gp_poll1 && cur_ops->get_comp_state && cur_ops->same_gp_state &&
+	    cur_ops->start_gp_poll && cur_ops->poll_gp_state) {
 		synctype[nsynctypes++] = RTWS_POLL_GET;
 		pr_info("%s: Testing polling GPs.\n", __func__);
 	} else if (gp_poll && (!cur_ops->start_gp_poll || !cur_ops->poll_gp_state)) {
 		pr_alert("%s: gp_poll without primitives.\n", __func__);
 	}
-	if (gp_poll_full1 && cur_ops->start_gp_poll_full && cur_ops->poll_gp_state_full) {
+	if (gp_poll_full1 && cur_ops->get_comp_state_full && cur_ops->same_gp_state_full
+	    && cur_ops->start_gp_poll_full && cur_ops->poll_gp_state_full) {
 		synctype[nsynctypes++] = RTWS_POLL_GET_FULL;
 		pr_info("%s: Testing polling full-state GPs.\n", __func__);
 	} else if (gp_poll_full && (!cur_ops->start_gp_poll_full || !cur_ops->poll_gp_state_full)) {
@@ -1339,14 +1356,18 @@
 	struct rcu_gp_oldstate cookie_full;
 	int expediting = 0;
 	unsigned long gp_snap;
+	unsigned long gp_snap1;
 	struct rcu_gp_oldstate gp_snap_full;
+	struct rcu_gp_oldstate gp_snap1_full;
 	int i;
 	int idx;
 	int oldnice = task_nice(current);
+	struct rcu_gp_oldstate rgo[NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE];
 	struct rcu_torture *rp;
 	struct rcu_torture *old_rp;
 	static DEFINE_TORTURE_RANDOM(rand);
 	bool stutter_waited;
+	unsigned long ulo[NUM_ACTIVE_RCU_POLL_OLDSTATE];

 	VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
 	if (!can_expedite)
@@ -1463,20 +1484,43 @@
 			break;
 		case RTWS_POLL_GET:
 			rcu_torture_writer_state = RTWS_POLL_GET;
+			for (i = 0; i < ARRAY_SIZE(ulo); i++)
+				ulo[i] = cur_ops->get_comp_state();
 			gp_snap = cur_ops->start_gp_poll();
 			rcu_torture_writer_state = RTWS_POLL_WAIT;
-			while (!cur_ops->poll_gp_state(gp_snap))
+			while (!cur_ops->poll_gp_state(gp_snap)) {
+				gp_snap1 = cur_ops->get_gp_state();
+				for (i = 0; i < ARRAY_SIZE(ulo); i++)
+					if (cur_ops->poll_gp_state(ulo[i]) ||
+					    cur_ops->same_gp_state(ulo[i], gp_snap1)) {
+						ulo[i] = gp_snap1;
+						break;
+					}
+				WARN_ON_ONCE(i >= ARRAY_SIZE(ulo));
 				torture_hrtimeout_jiffies(torture_random(&rand) % 16,
 							  &rand);
+			}
 			rcu_torture_pipe_update(old_rp);
 			break;
 		case RTWS_POLL_GET_FULL:
 			rcu_torture_writer_state = RTWS_POLL_GET_FULL;
+			for (i = 0; i < ARRAY_SIZE(rgo); i++)
+				cur_ops->get_comp_state_full(&rgo[i]);
 			cur_ops->start_gp_poll_full(&gp_snap_full);
 			rcu_torture_writer_state = RTWS_POLL_WAIT_FULL;
-			while (!cur_ops->poll_gp_state_full(&gp_snap_full))
+			while (!cur_ops->poll_gp_state_full(&gp_snap_full)) {
+				cur_ops->get_gp_state_full(&gp_snap1_full);
+				for (i = 0; i < ARRAY_SIZE(rgo); i++)
+					if (cur_ops->poll_gp_state_full(&rgo[i]) ||
+					    cur_ops->same_gp_state_full(&rgo[i],
+									&gp_snap1_full)) {
+						rgo[i] = gp_snap1_full;
+						break;
+					}
+				WARN_ON_ONCE(i >= ARRAY_SIZE(rgo));
 				torture_hrtimeout_jiffies(torture_random(&rand) % 16,
 							  &rand);
+			}
 			rcu_torture_pipe_update(old_rp);
 			break;
 		case RTWS_POLL_GET_EXP:
@@ -3388,13 +3432,13 @@ static void rcu_test_debug_objects(void)
 	/* Try to queue the rh2 pair of callbacks for the same grace period. */
 	preempt_disable(); /* Prevent preemption from interrupting test. */
 	rcu_read_lock(); /* Make it impossible to finish a grace period. */
-	call_rcu(&rh1, rcu_torture_leak_cb); /* Start grace period. */
+	call_rcu_hurry(&rh1, rcu_torture_leak_cb); /* Start grace period. */
 	local_irq_disable(); /* Make it harder to start a new grace period. */
-	call_rcu(&rh2, rcu_torture_leak_cb);
-	call_rcu(&rh2, rcu_torture_err_cb); /* Duplicate callback. */
+	call_rcu_hurry(&rh2, rcu_torture_leak_cb);
+	call_rcu_hurry(&rh2, rcu_torture_err_cb); /* Duplicate callback. */
 	if (rhp) {
-		call_rcu(rhp, rcu_torture_leak_cb);
-		call_rcu(rhp, rcu_torture_err_cb); /* Another duplicate callback. */
+		call_rcu_hurry(rhp, rcu_torture_leak_cb);
+		call_rcu_hurry(rhp, rcu_torture_err_cb); /* Another duplicate callback. */
 	}
 	local_irq_enable();
 	rcu_read_unlock();
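The polled grace-period primitives that the torture writer above now
exercises more heavily follow a simple snapshot-then-poll shape. A
minimal sketch of their intended use is shown here; ``do_cleanup()``
and the surrounding driver logic are illustrative assumptions::

	#include <linux/rcupdate.h>

	void do_cleanup(void);			/* illustrative */

	static unsigned long cleanup_cookie;

	/* Snapshot a cookie; any later full grace period will satisfy it. */
	void start_deferred_cleanup(void)
	{
		cleanup_cookie = start_poll_synchronize_rcu();
	}

	/* Non-blocking check from some later context. */
	void maybe_finish_cleanup(void)
	{
		if (!poll_state_synchronize_rcu(cleanup_cookie))
			return;	/* not yet; retry later, or block with cond_synchronize_rcu() */
		do_cleanup();
	}
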
@@ -417,7 +417,7 @@ static unsigned long srcu_readers_lock_idx(struct srcu_struct *ssp, int idx)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_lock_count[idx]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[idx]);
 	}
 	return sum;
 }
@@ -429,13 +429,18 @@ static unsigned long srcu_readers_lock_idx(struct srcu_struct *ssp, int idx)
 static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
 {
 	int cpu;
+	unsigned long mask = 0;
 	unsigned long sum = 0;

 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_unlock_count[idx]);
+		sum += atomic_long_read(&cpuc->srcu_unlock_count[idx]);
+		if (IS_ENABLED(CONFIG_PROVE_RCU))
+			mask = mask | READ_ONCE(cpuc->srcu_nmi_safety);
 	}
+	WARN_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && (mask & (mask >> 1)),
+		  "Mixed NMI-safe readers for srcu_struct at %ps.\n", ssp);
 	return sum;
 }

@@ -503,10 +508,10 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
 	for_each_possible_cpu(cpu) {
 		struct srcu_data *cpuc = per_cpu_ptr(ssp->sda, cpu);

-		sum += READ_ONCE(cpuc->srcu_lock_count[0]);
-		sum += READ_ONCE(cpuc->srcu_lock_count[1]);
-		sum -= READ_ONCE(cpuc->srcu_unlock_count[0]);
-		sum -= READ_ONCE(cpuc->srcu_unlock_count[1]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[0]);
+		sum += atomic_long_read(&cpuc->srcu_lock_count[1]);
+		sum -= atomic_long_read(&cpuc->srcu_unlock_count[0]);
+		sum -= atomic_long_read(&cpuc->srcu_unlock_count[1]);
 	}
 	return sum;
 }
@@ -626,6 +631,29 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
 }
 EXPORT_SYMBOL_GPL(cleanup_srcu_struct);

+#ifdef CONFIG_PROVE_RCU
+/*
+ * Check for consistent NMI safety.
+ */
+void srcu_check_nmi_safety(struct srcu_struct *ssp, bool nmi_safe)
+{
+	int nmi_safe_mask = 1 << nmi_safe;
+	int old_nmi_safe_mask;
+	struct srcu_data *sdp;
+
+	/* NMI-unsafe use in NMI is a bad sign */
+	WARN_ON_ONCE(!nmi_safe && in_nmi());
+	sdp = raw_cpu_ptr(ssp->sda);
+	old_nmi_safe_mask = READ_ONCE(sdp->srcu_nmi_safety);
+	if (!old_nmi_safe_mask) {
+		WRITE_ONCE(sdp->srcu_nmi_safety, nmi_safe_mask);
+		return;
+	}
+	WARN_ONCE(old_nmi_safe_mask != nmi_safe_mask, "CPU %d old state %d new state %d\n", sdp->cpu, old_nmi_safe_mask, nmi_safe_mask);
+}
+EXPORT_SYMBOL_GPL(srcu_check_nmi_safety);
+#endif /* CONFIG_PROVE_RCU */
+
 /*
  * Counts the new reader in the appropriate per-CPU element of the
  * srcu_struct.
@@ -636,7 +664,7 @@ int __srcu_read_lock(struct srcu_struct *ssp)
 	int idx;

 	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
-	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
+	this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
 	smp_mb(); /* B */  /* Avoid leaking the critical section. */
 	return idx;
 }
@@ -650,10 +678,45 @@ EXPORT_SYMBOL_GPL(__srcu_read_lock);
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
 {
 	smp_mb(); /* C */  /* Avoid leaking the critical section. */
-	this_cpu_inc(ssp->sda->srcu_unlock_count[idx]);
+	this_cpu_inc(ssp->sda->srcu_unlock_count[idx].counter);
 }
 EXPORT_SYMBOL_GPL(__srcu_read_unlock);

+#ifdef CONFIG_NEED_SRCU_NMI_SAFE
+
+/*
+ * Counts the new reader in the appropriate per-CPU element of the
+ * srcu_struct, but in an NMI-safe manner using RMW atomics.
+ * Returns an index that must be passed to the matching srcu_read_unlock().
+ */
+int __srcu_read_lock_nmisafe(struct srcu_struct *ssp)
+{
+	int idx;
+	struct srcu_data *sdp = raw_cpu_ptr(ssp->sda);
+
+	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
+	atomic_long_inc(&sdp->srcu_lock_count[idx]);
+	smp_mb__after_atomic(); /* B */  /* Avoid leaking the critical section. */
+	return idx;
+}
+EXPORT_SYMBOL_GPL(__srcu_read_lock_nmisafe);
+
+/*
+ * Removes the count for the old reader from the appropriate per-CPU
+ * element of the srcu_struct. Note that this may well be a different
+ * CPU than that which was incremented by the corresponding srcu_read_lock().
+ */
+void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx)
+{
+	struct srcu_data *sdp = raw_cpu_ptr(ssp->sda);
+
+	smp_mb__before_atomic(); /* C */  /* Avoid leaking the critical section. */
+	atomic_long_inc(&sdp->srcu_unlock_count[idx]);
+}
+EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe);
+
+#endif // CONFIG_NEED_SRCU_NMI_SAFE
+
 /*
  * Start an SRCU grace period.
  */
@@ -1090,7 +1153,12 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	int ss_state;

 	check_init_srcu_struct(ssp);
-	idx = srcu_read_lock(ssp);
+	/*
+	 * While starting a new grace period, make sure we are in an
+	 * SRCU read-side critical section so that the grace-period
+	 * sequence number cannot wrap around in the meantime.
+	 */
+	idx = __srcu_read_lock_nmisafe(ssp);
 	ss_state = smp_load_acquire(&ssp->srcu_size_state);
 	if (ss_state < SRCU_SIZE_WAIT_CALL)
 		sdp = per_cpu_ptr(ssp->sda, 0);
@@ -1123,7 +1191,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 		srcu_funnel_gp_start(ssp, sdp, s, do_norm);
 	else if (needexp)
 		srcu_funnel_exp_start(ssp, sdp_mynode, s);
-	srcu_read_unlock(ssp, idx);
+	__srcu_read_unlock_nmisafe(ssp, idx);
 	return s;
 }

@@ -1427,13 +1495,13 @@ void srcu_barrier(struct srcu_struct *ssp)
 	/* Initial count prevents reaching zero until all CBs are posted. */
 	atomic_set(&ssp->srcu_barrier_cpu_cnt, 1);

-	idx = srcu_read_lock(ssp);
+	idx = __srcu_read_lock_nmisafe(ssp);
 	if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_BARRIER)
 		srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, 0));
 	else
 		for_each_possible_cpu(cpu)
 			srcu_barrier_one_cpu(ssp, per_cpu_ptr(ssp->sda, cpu));
-	srcu_read_unlock(ssp, idx);
+	__srcu_read_unlock_nmisafe(ssp, idx);

 	/* Remove the initial count, at which point reaching zero can happen. */
 	if (atomic_dec_and_test(&ssp->srcu_barrier_cpu_cnt))
@@ -1687,8 +1755,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 			struct srcu_data *sdp;

 			sdp = per_cpu_ptr(ssp->sda, cpu);
-			u0 = data_race(sdp->srcu_unlock_count[!idx]);
-			u1 = data_race(sdp->srcu_unlock_count[idx]);
+			u0 = data_race(atomic_long_read(&sdp->srcu_unlock_count[!idx]));
+			u1 = data_race(atomic_long_read(&sdp->srcu_unlock_count[idx]));

 			/*
 			 * Make sure that a lock is always counted if the corresponding
@@ -1696,8 +1764,8 @@ void srcu_torture_stats_print(struct srcu_struct *ssp, char *tt, char *tf)
 			 */
 			smp_rmb();

-			l0 = data_race(sdp->srcu_lock_count[!idx]);
-			l1 = data_race(sdp->srcu_lock_count[idx]);
+			l0 = data_race(atomic_long_read(&sdp->srcu_lock_count[!idx]));
+			l1 = data_race(atomic_long_read(&sdp->srcu_lock_count[idx]));

 			c0 = l0 - u0;
 			c1 = l1 - u1;
@@ -44,7 +44,7 @@ static void rcu_sync_func(struct rcu_head *rhp);

 static void rcu_sync_call(struct rcu_sync *rsp)
 {
-	call_rcu(&rsp->cb_head, rcu_sync_func);
+	call_rcu_hurry(&rsp->cb_head, rcu_sync_func);
 }

 /**
@@ -728,7 +728,7 @@ static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
 		if (rtsi > 0 && !reported && time_after(j, lastinfo + rtsi)) {
 			lastinfo = j;
 			rtsi = rtsi * rcu_task_stall_info_mult;
-			pr_info("%s: %s grace period %lu is %lu jiffies old.\n",
+			pr_info("%s: %s grace period number %lu (since boot) is %lu jiffies old.\n",
 				__func__, rtp->kname, rtp->tasks_gp_seq, j - rtp->gp_start);
 		}
 	}
@@ -44,7 +44,7 @@ static struct rcu_ctrlblk rcu_ctrlblk = {

 void rcu_barrier(void)
 {
-	wait_rcu_gp(call_rcu);
+	wait_rcu_gp(call_rcu_hurry);
 }
 EXPORT_SYMBOL(rcu_barrier);

@@ -301,12 +301,6 @@ static bool rcu_dynticks_in_eqs(int snap)
|
|||||||
return !(snap & RCU_DYNTICKS_IDX);
|
return !(snap & RCU_DYNTICKS_IDX);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Return true if the specified CPU is currently idle from an RCU viewpoint. */
|
|
||||||
bool rcu_is_idle_cpu(int cpu)
|
|
||||||
{
|
|
||||||
return rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu));
|
|
||||||
}
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Return true if the CPU corresponding to the specified rcu_data
|
* Return true if the CPU corresponding to the specified rcu_data
|
||||||
* structure has spent some time in an extended quiescent state since
|
* structure has spent some time in an extended quiescent state since
|
||||||
@@ -2108,7 +2102,7 @@ int rcutree_dying_cpu(unsigned int cpu)
|
|||||||
if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
|
if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
blkd = !!(rnp->qsmask & rdp->grpmask);
|
blkd = !!(READ_ONCE(rnp->qsmask) & rdp->grpmask);
|
||||||
trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
|
trace_rcu_grace_period(rcu_state.name, READ_ONCE(rnp->gp_seq),
|
||||||
blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
|
blkd ? TPS("cpuofl-bgp") : TPS("cpuofl"));
|
||||||
return 0;
|
return 0;
|
||||||
@@ -2418,7 +2412,7 @@ void rcu_force_quiescent_state(void)
|
|||||||
struct rcu_node *rnp_old = NULL;
|
struct rcu_node *rnp_old = NULL;
|
||||||
|
|
||||||
/* Funnel through hierarchy to reduce memory contention. */
|
/* Funnel through hierarchy to reduce memory contention. */
|
||||||
rnp = __this_cpu_read(rcu_data.mynode);
|
rnp = raw_cpu_read(rcu_data.mynode);
|
||||||
for (; rnp != NULL; rnp = rnp->parent) {
|
for (; rnp != NULL; rnp = rnp->parent) {
|
||||||
ret = (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) ||
|
ret = (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) ||
|
||||||
!raw_spin_trylock(&rnp->fqslock);
|
!raw_spin_trylock(&rnp->fqslock);
|
||||||
@@ -2730,47 +2724,8 @@ static void check_cb_ovld(struct rcu_data *rdp)
|
|||||||
raw_spin_unlock_rcu_node(rnp);
|
raw_spin_unlock_rcu_node(rnp);
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
static void
|
||||||
* call_rcu() - Queue an RCU callback for invocation after a grace period.
|
__call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy)
|
||||||
* @head: structure to be used for queueing the RCU updates.
|
|
||||||
* @func: actual callback function to be invoked after the grace period
|
|
||||||
*
|
|
||||||
* The callback function will be invoked some time after a full grace
|
|
||||||
* period elapses, in other words after all pre-existing RCU read-side
|
|
||||||
* critical sections have completed. However, the callback function
|
|
||||||
* might well execute concurrently with RCU read-side critical sections
|
|
||||||
* that started after call_rcu() was invoked.
|
|
||||||
*
|
|
||||||
* RCU read-side critical sections are delimited by rcu_read_lock()
|
|
||||||
* and rcu_read_unlock(), and may be nested. In addition, but only in
|
|
||||||
* v5.0 and later, regions of code across which interrupts, preemption,
|
|
||||||
* or softirqs have been disabled also serve as RCU read-side critical
|
|
||||||
* sections. This includes hardware interrupt handlers, softirq handlers,
|
|
||||||
* and NMI handlers.
|
|
||||||
*
|
|
||||||
* Note that all CPUs must agree that the grace period extended beyond
|
|
||||||
* all pre-existing RCU read-side critical section. On systems with more
|
|
||||||
* than one CPU, this means that when "func()" is invoked, each CPU is
|
|
||||||
* guaranteed to have executed a full memory barrier since the end of its
|
|
||||||
* last RCU read-side critical section whose beginning preceded the call
|
|
||||||
* to call_rcu(). It also means that each CPU executing an RCU read-side
|
|
||||||
* critical section that continues beyond the start of "func()" must have
|
|
||||||
* executed a memory barrier after the call_rcu() but before the beginning
|
|
||||||
* of that RCU read-side critical section. Note that these guarantees
|
|
||||||
* include CPUs that are offline, idle, or executing in user mode, as
|
|
||||||
* well as CPUs that are executing in the kernel.
|
|
||||||
*
|
|
||||||
* Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
|
|
||||||
* resulting RCU callback function "func()", then both CPU A and CPU B are
|
|
||||||
* guaranteed to execute a full memory barrier during the time interval
|
|
||||||
* between the call to call_rcu() and the invocation of "func()" -- even
|
|
||||||
* if CPU A and CPU B are the same CPU (but again only if the system has
|
|
- * more than one CPU).
- *
- * Implementation of these memory-ordering guarantees is described here:
- * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.
- */
-void call_rcu(struct rcu_head *head, rcu_callback_t func)
 {
     static atomic_t doublefrees;
     unsigned long flags;
@@ -2811,7 +2766,7 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
     }
 
     check_cb_ovld(rdp);
-    if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
+    if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy))
         return; // Enqueued onto ->nocb_bypass, so just leave.
     // If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
     rcu_segcblist_enqueue(&rdp->cblist, head);
@@ -2833,8 +2788,84 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
         local_irq_restore(flags);
     }
 }
-EXPORT_SYMBOL_GPL(call_rcu);
+
+#ifdef CONFIG_RCU_LAZY
+/**
+ * call_rcu_hurry() - Queue RCU callback for invocation after grace period, and
+ * flush all lazy callbacks (including the new one) to the main ->cblist while
+ * doing so.
+ *
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all pre-existing RCU read-side
+ * critical sections have completed.
+ *
+ * Use this API instead of call_rcu() if you don't want the callback to be
+ * invoked after very long periods of time, which can happen on systems without
+ * memory pressure and on systems which are lightly loaded or mostly idle.
+ * This function will cause callbacks to be invoked sooner than later at the
+ * expense of extra power. Other than that, this function is identical to, and
+ * reuses call_rcu()'s logic. Refer to call_rcu() for more details about memory
+ * ordering and other functionality.
+ */
+void call_rcu_hurry(struct rcu_head *head, rcu_callback_t func)
+{
+    return __call_rcu_common(head, func, false);
+}
+EXPORT_SYMBOL_GPL(call_rcu_hurry);
+#endif
+
+/**
+ * call_rcu() - Queue an RCU callback for invocation after a grace period.
+ * By default the callbacks are 'lazy' and are kept hidden from the main
+ * ->cblist to prevent starting of grace periods too soon.
+ * If you desire grace periods to start very soon, use call_rcu_hurry().
+ *
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all pre-existing RCU read-side
+ * critical sections have completed. However, the callback function
+ * might well execute concurrently with RCU read-side critical sections
+ * that started after call_rcu() was invoked.
+ *
+ * RCU read-side critical sections are delimited by rcu_read_lock()
+ * and rcu_read_unlock(), and may be nested. In addition, but only in
+ * v5.0 and later, regions of code across which interrupts, preemption,
+ * or softirqs have been disabled also serve as RCU read-side critical
+ * sections. This includes hardware interrupt handlers, softirq handlers,
+ * and NMI handlers.
+ *
+ * Note that all CPUs must agree that the grace period extended beyond
+ * all pre-existing RCU read-side critical section. On systems with more
+ * than one CPU, this means that when "func()" is invoked, each CPU is
+ * guaranteed to have executed a full memory barrier since the end of its
+ * last RCU read-side critical section whose beginning preceded the call
+ * to call_rcu(). It also means that each CPU executing an RCU read-side
+ * critical section that continues beyond the start of "func()" must have
+ * executed a memory barrier after the call_rcu() but before the beginning
+ * of that RCU read-side critical section. Note that these guarantees
+ * include CPUs that are offline, idle, or executing in user mode, as
+ * well as CPUs that are executing in the kernel.
+ *
+ * Furthermore, if CPU A invoked call_rcu() and CPU B invoked the
+ * resulting RCU callback function "func()", then both CPU A and CPU B are
+ * guaranteed to execute a full memory barrier during the time interval
+ * between the call to call_rcu() and the invocation of "func()" -- even
+ * if CPU A and CPU B are the same CPU (but again only if the system has
+ * more than one CPU).
+ *
+ * Implementation of these memory-ordering guarantees is described here:
+ * Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst.
+ */
+void call_rcu(struct rcu_head *head, rcu_callback_t func)
+{
+    return __call_rcu_common(head, func, IS_ENABLED(CONFIG_RCU_LAZY));
+}
+EXPORT_SYMBOL_GPL(call_rcu);
 
 /* Maximum number of jiffies to wait before draining a batch. */
 #define KFREE_DRAIN_JIFFIES (5 * HZ)
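The kerneldoc added above distinguishes the default (now lazy) call_rcu() from the new call_rcu_hurry(). A minimal usage sketch, not part of this patch: struct foo, free_foo_rcu(), and the two remove helpers are purely illustrative, and the usual <linux/rcupdate.h>, <linux/list.h>, and <linux/slab.h> includes are assumed.

    struct foo {
        struct list_head list;
        struct rcu_head rcu;
    };

    static void free_foo_rcu(struct rcu_head *rhp)
    {
        /* Runs only after a full grace period, whichever variant queued it. */
        kfree(container_of(rhp, struct foo, rcu));
    }

    static void remove_foo(struct foo *fp)
    {
        list_del_rcu(&fp->list);
        /* Default path: the callback may be batched for a while to save power. */
        call_rcu(&fp->rcu, free_foo_rcu);
    }

    static void remove_foo_urgent(struct foo *fp)
    {
        list_del_rcu(&fp->list);
        /* Latency- or memory-sensitive path: start a grace period promptly. */
        call_rcu_hurry(&fp->rcu, free_foo_rcu);
    }

Either variant provides the same ordering and grace-period guarantees; only the timeliness of callback invocation differs.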
@@ -3509,7 +3540,7 @@ void synchronize_rcu(void)
         if (rcu_gp_is_expedited())
             synchronize_rcu_expedited();
         else
-            wait_rcu_gp(call_rcu);
+            wait_rcu_gp(call_rcu_hurry);
         return;
     }
 
@@ -3896,6 +3927,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
 {
     unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
     unsigned long lseq = READ_ONCE(rdp->barrier_seq_snap);
+    bool wake_nocb = false;
+    bool was_alldone = false;
 
     lockdep_assert_held(&rcu_state.barrier_lock);
     if (rcu_seq_state(lseq) || !rcu_seq_state(gseq) || rcu_seq_ctr(lseq) != rcu_seq_ctr(gseq))
@@ -3904,7 +3937,14 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
     rdp->barrier_head.func = rcu_barrier_callback;
     debug_rcu_head_queue(&rdp->barrier_head);
     rcu_nocb_lock(rdp);
-    WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
+    /*
+     * Flush bypass and wakeup rcuog if we add callbacks to an empty regular
+     * queue. This way we don't wait for bypass timer that can reach seconds
+     * if it's fully lazy.
+     */
+    was_alldone = rcu_rdp_is_offloaded(rdp) && !rcu_segcblist_pend_cbs(&rdp->cblist);
+    WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
+    wake_nocb = was_alldone && rcu_segcblist_pend_cbs(&rdp->cblist);
     if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
         atomic_inc(&rcu_state.barrier_cpu_count);
     } else {
@@ -3912,6 +3952,8 @@ static void rcu_barrier_entrain(struct rcu_data *rdp)
         rcu_barrier_trace(TPS("IRQNQ"), -1, rcu_state.barrier_sequence);
     }
     rcu_nocb_unlock(rdp);
+    if (wake_nocb)
+        wake_nocb_gp(rdp, false);
     smp_store_release(&rdp->barrier_seq_snap, gseq);
 }
 
@@ -4278,8 +4320,6 @@ void rcu_report_dead(unsigned int cpu)
     // Do any dangling deferred wakeups.
     do_nocb_deferred_wakeup(rdp);
 
-    /* QS for any half-done expedited grace period. */
-    rcu_report_exp_rdp(rdp);
     rcu_preempt_deferred_qs(current);
 
     /* Remove outgoing CPU from mask in the leaf rcu_node structure. */
@@ -4327,7 +4367,7 @@ void rcutree_migrate_callbacks(int cpu)
     my_rdp = this_cpu_ptr(&rcu_data);
     my_rnp = my_rdp->mynode;
     rcu_nocb_lock(my_rdp); /* irqs already disabled. */
-    WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies));
+    WARN_ON_ONCE(!rcu_nocb_flush_bypass(my_rdp, NULL, jiffies, false));
     raw_spin_lock_rcu_node(my_rnp); /* irqs already disabled. */
     /* Leverage recent GPs and set GP for new callbacks. */
     needwake = rcu_advance_cbs(my_rnp, rdp) ||
@@ -263,14 +263,16 @@ struct rcu_data {
     unsigned long last_fqs_resched;    /* Time of last rcu_resched(). */
     unsigned long last_sched_clock;    /* Jiffies of last rcu_sched_clock_irq(). */
 
+    long lazy_len;            /* Length of buffered lazy callbacks. */
     int cpu;
 };
 
 /* Values for nocb_defer_wakeup field in struct rcu_data. */
 #define RCU_NOCB_WAKE_NOT    0
 #define RCU_NOCB_WAKE_BYPASS    1
-#define RCU_NOCB_WAKE        2
-#define RCU_NOCB_WAKE_FORCE    3
+#define RCU_NOCB_WAKE_LAZY    2
+#define RCU_NOCB_WAKE        3
+#define RCU_NOCB_WAKE_FORCE    4
 
 #define RCU_JIFFIES_TILL_FORCE_QS (1 + (HZ > 250) + (HZ > 500))
                     /* For jiffies_till_first_fqs and */
@@ -439,10 +441,12 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp);
 static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
 static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
 static void rcu_init_one_nocb(struct rcu_node *rnp);
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force);
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-                  unsigned long j);
+                  unsigned long j, bool lazy);
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-                bool *was_alldone, unsigned long flags);
+                bool *was_alldone, unsigned long flags,
+                bool lazy);
 static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
                  unsigned long flags);
 static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level);
@@ -937,7 +937,7 @@ void synchronize_rcu_expedited(void)
 
     /* If expedited grace periods are prohibited, fall back to normal. */
     if (rcu_gp_is_normal()) {
-        wait_rcu_gp(call_rcu);
+        wait_rcu_gp(call_rcu_hurry);
         return;
     }
 
@@ -256,6 +256,31 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
     return __wake_nocb_gp(rdp_gp, rdp, force, flags);
 }
 
+/*
+ * LAZY_FLUSH_JIFFIES decides the maximum amount of time that
+ * can elapse before lazy callbacks are flushed. Lazy callbacks
+ * could be flushed much earlier for a number of other reasons
+ * however, LAZY_FLUSH_JIFFIES will ensure no lazy callbacks are
+ * left unsubmitted to RCU after those many jiffies.
+ */
+#define LAZY_FLUSH_JIFFIES (10 * HZ)
+static unsigned long jiffies_till_flush = LAZY_FLUSH_JIFFIES;
+
+#ifdef CONFIG_RCU_LAZY
+// To be called only from test code.
+void rcu_lazy_set_jiffies_till_flush(unsigned long jif)
+{
+    jiffies_till_flush = jif;
+}
+EXPORT_SYMBOL(rcu_lazy_set_jiffies_till_flush);
+
+unsigned long rcu_lazy_get_jiffies_till_flush(void)
+{
+    return jiffies_till_flush;
+}
+EXPORT_SYMBOL(rcu_lazy_get_jiffies_till_flush);
+#endif
+
 /*
  * Arrange to wake the GP kthread for this NOCB group at some future
  * time when it is safe to do so.
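The two accessors added above are exported for test code only. A hedged sketch of how a test module might temporarily shorten the lazy flush interval; the save/restore helpers are illustrative and not part of this series:

    #ifdef CONFIG_RCU_LAZY
    static unsigned long saved_jiffies_till_flush;

    static void example_shorten_lazy_flush(void)
    {
        saved_jiffies_till_flush = rcu_lazy_get_jiffies_till_flush();
        /* Flush lazy callbacks after one second instead of the default ten. */
        rcu_lazy_set_jiffies_till_flush(HZ);
    }

    static void example_restore_lazy_flush(void)
    {
        rcu_lazy_set_jiffies_till_flush(saved_jiffies_till_flush);
    }
    #endif /* CONFIG_RCU_LAZY */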
@@ -269,10 +294,14 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
     raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
 
     /*
-     * Bypass wakeup overrides previous deferments. In case
-     * of callback storm, no need to wake up too early.
+     * Bypass wakeup overrides previous deferments. In case of
+     * callback storms, no need to wake up too early.
      */
-    if (waketype == RCU_NOCB_WAKE_BYPASS) {
+    if (waketype == RCU_NOCB_WAKE_LAZY &&
+        rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT) {
+        mod_timer(&rdp_gp->nocb_timer, jiffies + jiffies_till_flush);
+        WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
+    } else if (waketype == RCU_NOCB_WAKE_BYPASS) {
         mod_timer(&rdp_gp->nocb_timer, jiffies + 2);
         WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
     } else {
@@ -293,12 +322,16 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
  * proves to be initially empty, just return false because the no-CB GP
  * kthread may need to be awakened in this case.
  *
+ * Return true if there was something to be flushed and it succeeded, otherwise
+ * false.
+ *
  * Note that this function always returns true if rhp is NULL.
  */
-static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-                     unsigned long j)
+static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
+                     unsigned long j, bool lazy)
 {
     struct rcu_cblist rcl;
+    struct rcu_head *rhp = rhp_in;
 
     WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
     rcu_lockdep_assert_cblist_protected(rdp);
@@ -310,7 +343,20 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
     /* Note: ->cblist.len already accounts for ->nocb_bypass contents. */
     if (rhp)
         rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
+
+    /*
+     * If the new CB requested was a lazy one, queue it onto the main
+     * ->cblist so that we can take advantage of the grace-period that will
+     * happen regardless. But queue it onto the bypass list first so that
+     * the lazy CB is ordered with the existing CBs in the bypass list.
+     */
+    if (lazy && rhp) {
+        rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
+        rhp = NULL;
+    }
     rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
+    WRITE_ONCE(rdp->lazy_len, 0);
+
     rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
     WRITE_ONCE(rdp->nocb_bypass_first, j);
     rcu_nocb_bypass_unlock(rdp);
@@ -326,13 +372,13 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
  * Note that this function always returns true if rhp is NULL.
  */
 static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-                  unsigned long j)
+                  unsigned long j, bool lazy)
 {
     if (!rcu_rdp_is_offloaded(rdp))
         return true;
     rcu_lockdep_assert_cblist_protected(rdp);
     rcu_nocb_bypass_lock(rdp);
-    return rcu_nocb_do_flush_bypass(rdp, rhp, j);
+    return rcu_nocb_do_flush_bypass(rdp, rhp, j, lazy);
 }
 
 /*
@@ -345,7 +391,7 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
     if (!rcu_rdp_is_offloaded(rdp) ||
         !rcu_nocb_bypass_trylock(rdp))
         return;
-    WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j));
+    WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j, false));
 }
 
 /*
@@ -367,12 +413,14 @@ static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
  * there is only one CPU in operation.
  */
 static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
-                bool *was_alldone, unsigned long flags)
+                bool *was_alldone, unsigned long flags,
+                bool lazy)
 {
     unsigned long c;
     unsigned long cur_gp_seq;
     unsigned long j = jiffies;
     long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+    bool bypass_is_lazy = (ncbs == READ_ONCE(rdp->lazy_len));
 
     lockdep_assert_irqs_disabled();
 
@@ -417,24 +465,29 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
     // If there hasn't yet been all that many ->cblist enqueues
     // this jiffy, tell the caller to enqueue onto ->cblist. But flush
     // ->nocb_bypass first.
-    if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy) {
+    // Lazy CBs throttle this back and do immediate bypass queuing.
+    if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy && !lazy) {
         rcu_nocb_lock(rdp);
         *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
         if (*was_alldone)
             trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
                         TPS("FirstQ"));
-        WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
+
+        WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j, false));
         WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
         return false; // Caller must enqueue the callback.
     }
 
     // If ->nocb_bypass has been used too long or is too full,
     // flush ->nocb_bypass to ->cblist.
-    if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
+    if ((ncbs && !bypass_is_lazy && j != READ_ONCE(rdp->nocb_bypass_first)) ||
+        (ncbs && bypass_is_lazy &&
+         (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_till_flush))) ||
         ncbs >= qhimark) {
         rcu_nocb_lock(rdp);
-        if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
-            *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+        *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+
+        if (!rcu_nocb_flush_bypass(rdp, rhp, j, lazy)) {
             if (*was_alldone)
                 trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
                             TPS("FirstQ"));
@@ -447,7 +500,12 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
             rcu_advance_cbs_nowake(rdp->mynode, rdp);
             rdp->nocb_gp_adv_time = j;
         }
-        rcu_nocb_unlock_irqrestore(rdp, flags);
+
+        // The flush succeeded and we moved CBs into the regular list.
+        // Don't wait for the wake up timer as it may be too far ahead.
+        // Wake up the GP thread now instead, if the cblist was empty.
+        __call_rcu_nocb_wake(rdp, *was_alldone, flags);
+
         return true; // Callback already enqueued.
     }
 
@@ -457,13 +515,24 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
     ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
     rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
     rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
+
+    if (lazy)
+        WRITE_ONCE(rdp->lazy_len, rdp->lazy_len + 1);
+
     if (!ncbs) {
         WRITE_ONCE(rdp->nocb_bypass_first, j);
         trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
     }
     rcu_nocb_bypass_unlock(rdp);
     smp_mb(); /* Order enqueue before wake. */
-    if (ncbs) {
+    // A wake up of the grace period kthread or timer adjustment
+    // needs to be done only if:
+    // 1. Bypass list was fully empty before (this is the first
+    //    bypass list entry), or:
+    // 2. Both of these conditions are met:
+    //    a. The bypass list previously had only lazy CBs, and:
+    //    b. The new CB is non-lazy.
+    if (ncbs && (!bypass_is_lazy || lazy)) {
         local_irq_restore(flags);
     } else {
         // No-CBs GP kthread might be indefinitely asleep, if so, wake.
@@ -491,8 +560,10 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
                  unsigned long flags)
                  __releases(rdp->nocb_lock)
 {
+    long bypass_len;
     unsigned long cur_gp_seq;
     unsigned long j;
+    long lazy_len;
     long len;
     struct task_struct *t;
 
@@ -506,9 +577,16 @@ static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
     }
     // Need to actually to a wakeup.
     len = rcu_segcblist_n_cbs(&rdp->cblist);
+    bypass_len = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+    lazy_len = READ_ONCE(rdp->lazy_len);
     if (was_alldone) {
         rdp->qlen_last_fqs_check = len;
-        if (!irqs_disabled_flags(flags)) {
+        // Only lazy CBs in bypass list
+        if (lazy_len && bypass_len == lazy_len) {
+            rcu_nocb_unlock_irqrestore(rdp, flags);
+            wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_LAZY,
+                       TPS("WakeLazy"));
+        } else if (!irqs_disabled_flags(flags)) {
             /* ... if queue was empty ... */
             rcu_nocb_unlock_irqrestore(rdp, flags);
             wake_nocb_gp(rdp, false);
@@ -599,12 +677,12 @@ static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu)
|
|||||||
static void nocb_gp_wait(struct rcu_data *my_rdp)
|
static void nocb_gp_wait(struct rcu_data *my_rdp)
|
||||||
{
|
{
|
||||||
bool bypass = false;
|
bool bypass = false;
|
||||||
long bypass_ncbs;
|
|
||||||
int __maybe_unused cpu = my_rdp->cpu;
|
int __maybe_unused cpu = my_rdp->cpu;
|
||||||
unsigned long cur_gp_seq;
|
unsigned long cur_gp_seq;
|
||||||
unsigned long flags;
|
unsigned long flags;
|
||||||
bool gotcbs = false;
|
bool gotcbs = false;
|
||||||
unsigned long j = jiffies;
|
unsigned long j = jiffies;
|
||||||
|
bool lazy = false;
|
||||||
bool needwait_gp = false; // This prevents actual uninitialized use.
|
bool needwait_gp = false; // This prevents actual uninitialized use.
|
||||||
bool needwake;
|
bool needwake;
|
||||||
bool needwake_gp;
|
bool needwake_gp;
|
||||||
@@ -634,24 +712,43 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
|
|||||||
* won't be ignored for long.
|
* won't be ignored for long.
|
||||||
*/
|
*/
|
||||||
list_for_each_entry(rdp, &my_rdp->nocb_head_rdp, nocb_entry_rdp) {
|
list_for_each_entry(rdp, &my_rdp->nocb_head_rdp, nocb_entry_rdp) {
|
||||||
|
long bypass_ncbs;
|
||||||
|
bool flush_bypass = false;
|
||||||
|
long lazy_ncbs;
|
||||||
|
|
||||||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
|
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
|
||||||
rcu_nocb_lock_irqsave(rdp, flags);
|
rcu_nocb_lock_irqsave(rdp, flags);
|
||||||
lockdep_assert_held(&rdp->nocb_lock);
|
lockdep_assert_held(&rdp->nocb_lock);
|
||||||
bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
|
bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
|
||||||
if (bypass_ncbs &&
|
lazy_ncbs = READ_ONCE(rdp->lazy_len);
|
||||||
|
|
||||||
|
if (bypass_ncbs && (lazy_ncbs == bypass_ncbs) &&
|
||||||
|
(time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_till_flush) ||
|
||||||
|
bypass_ncbs > 2 * qhimark)) {
|
||||||
|
flush_bypass = true;
|
||||||
|
} else if (bypass_ncbs && (lazy_ncbs != bypass_ncbs) &&
|
||||||
(time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
|
(time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
|
||||||
bypass_ncbs > 2 * qhimark)) {
|
bypass_ncbs > 2 * qhimark)) {
|
||||||
// Bypass full or old, so flush it.
|
flush_bypass = true;
|
||||||
(void)rcu_nocb_try_flush_bypass(rdp, j);
|
|
||||||
bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
|
|
||||||
} else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) {
|
} else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) {
|
||||||
rcu_nocb_unlock_irqrestore(rdp, flags);
|
rcu_nocb_unlock_irqrestore(rdp, flags);
|
||||||
continue; /* No callbacks here, try next. */
|
continue; /* No callbacks here, try next. */
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (flush_bypass) {
|
||||||
|
// Bypass full or old, so flush it.
|
||||||
|
(void)rcu_nocb_try_flush_bypass(rdp, j);
|
||||||
|
bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
|
||||||
|
lazy_ncbs = READ_ONCE(rdp->lazy_len);
|
||||||
|
}
|
||||||
|
|
||||||
if (bypass_ncbs) {
|
if (bypass_ncbs) {
|
||||||
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
|
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
|
||||||
TPS("Bypass"));
|
bypass_ncbs == lazy_ncbs ? TPS("Lazy") : TPS("Bypass"));
|
||||||
bypass = true;
|
if (bypass_ncbs == lazy_ncbs)
|
||||||
|
lazy = true;
|
||||||
|
else
|
||||||
|
bypass = true;
|
||||||
}
|
}
|
||||||
rnp = rdp->mynode;
|
rnp = rdp->mynode;
|
||||||
|
|
||||||
@@ -699,12 +796,20 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
|
|||||||
my_rdp->nocb_gp_gp = needwait_gp;
|
my_rdp->nocb_gp_gp = needwait_gp;
|
||||||
my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
|
my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
|
||||||
|
|
||||||
if (bypass && !rcu_nocb_poll) {
|
// At least one child with non-empty ->nocb_bypass, so set
|
||||||
// At least one child with non-empty ->nocb_bypass, so set
|
// timer in order to avoid stranding its callbacks.
|
||||||
// timer in order to avoid stranding its callbacks.
|
if (!rcu_nocb_poll) {
|
||||||
wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
|
// If bypass list only has lazy CBs. Add a deferred lazy wake up.
|
||||||
TPS("WakeBypassIsDeferred"));
|
if (lazy && !bypass) {
|
||||||
|
wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_LAZY,
|
||||||
|
TPS("WakeLazyIsDeferred"));
|
||||||
|
// Otherwise add a deferred bypass wake up.
|
||||||
|
} else if (bypass) {
|
||||||
|
wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
|
||||||
|
TPS("WakeBypassIsDeferred"));
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (rcu_nocb_poll) {
|
if (rcu_nocb_poll) {
|
||||||
/* Polling, so trace if first poll in the series. */
|
/* Polling, so trace if first poll in the series. */
|
||||||
if (gotcbs)
|
if (gotcbs)
|
||||||
@@ -1030,7 +1135,7 @@ static long rcu_nocb_rdp_deoffload(void *arg)
|
|||||||
* return false, which means that future calls to rcu_nocb_try_bypass()
|
* return false, which means that future calls to rcu_nocb_try_bypass()
|
||||||
* will refuse to put anything into the bypass.
|
* will refuse to put anything into the bypass.
|
||||||
*/
|
*/
|
||||||
WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
|
WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false));
|
||||||
/*
|
/*
|
||||||
* Start with invoking rcu_core() early. This way if the current thread
|
* Start with invoking rcu_core() early. This way if the current thread
|
||||||
* happens to preempt an ongoing call to rcu_core() in the middle,
|
* happens to preempt an ongoing call to rcu_core() in the middle,
|
||||||
@@ -1207,47 +1312,87 @@ int rcu_nocb_cpu_offload(int cpu)
|
|||||||
}
|
}
|
||||||
EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
|
EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
|
||||||
|
|
||||||
|
static unsigned long
|
||||||
|
lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
|
||||||
|
{
|
||||||
|
int cpu;
|
||||||
|
unsigned long count = 0;
|
||||||
|
|
||||||
|
/* Snapshot count of all CPUs */
|
||||||
|
for_each_possible_cpu(cpu) {
|
||||||
|
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
|
||||||
|
|
||||||
|
count += READ_ONCE(rdp->lazy_len);
|
||||||
|
}
|
||||||
|
|
||||||
|
return count ? count : SHRINK_EMPTY;
|
||||||
|
}
|
||||||
|
|
||||||
|
static unsigned long
|
||||||
|
lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
|
||||||
|
{
|
||||||
|
int cpu;
|
||||||
|
unsigned long flags;
|
||||||
|
unsigned long count = 0;
|
||||||
|
|
||||||
|
/* Snapshot count of all CPUs */
|
||||||
|
for_each_possible_cpu(cpu) {
|
||||||
|
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
|
||||||
|
int _count = READ_ONCE(rdp->lazy_len);
|
||||||
|
|
||||||
|
if (_count == 0)
|
||||||
|
continue;
|
||||||
|
rcu_nocb_lock_irqsave(rdp, flags);
|
||||||
|
WRITE_ONCE(rdp->lazy_len, 0);
|
||||||
|
rcu_nocb_unlock_irqrestore(rdp, flags);
|
||||||
|
wake_nocb_gp(rdp, false);
|
||||||
|
sc->nr_to_scan -= _count;
|
||||||
|
count += _count;
|
||||||
|
if (sc->nr_to_scan <= 0)
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
return count ? count : SHRINK_STOP;
|
||||||
|
}
|
||||||
|
|
||||||
|
static struct shrinker lazy_rcu_shrinker = {
|
||||||
|
.count_objects = lazy_rcu_shrink_count,
|
||||||
|
.scan_objects = lazy_rcu_shrink_scan,
|
||||||
|
.batch = 0,
|
||||||
|
.seeks = DEFAULT_SEEKS,
|
||||||
|
};
|
||||||
|
|
||||||
void __init rcu_init_nohz(void)
|
void __init rcu_init_nohz(void)
|
||||||
{
|
{
|
||||||
int cpu;
|
int cpu;
|
||||||
bool need_rcu_nocb_mask = false;
|
|
||||||
bool offload_all = false;
|
|
||||||
struct rcu_data *rdp;
|
struct rcu_data *rdp;
|
||||||
|
const struct cpumask *cpumask = NULL;
|
||||||
#if defined(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL)
|
|
||||||
if (!rcu_state.nocb_is_setup) {
|
|
||||||
need_rcu_nocb_mask = true;
|
|
||||||
offload_all = true;
|
|
||||||
}
|
|
||||||
#endif /* #if defined(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL) */
|
|
||||||
|
|
||||||
#if defined(CONFIG_NO_HZ_FULL)
|
#if defined(CONFIG_NO_HZ_FULL)
|
||||||
if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask)) {
|
if (tick_nohz_full_running && !cpumask_empty(tick_nohz_full_mask))
|
||||||
need_rcu_nocb_mask = true;
|
cpumask = tick_nohz_full_mask;
|
||||||
offload_all = false; /* NO_HZ_FULL has its own mask. */
|
#endif
|
||||||
}
|
|
||||||
#endif /* #if defined(CONFIG_NO_HZ_FULL) */
|
|
||||||
|
|
||||||
if (need_rcu_nocb_mask) {
|
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_DEFAULT_ALL) &&
|
||||||
|
!rcu_state.nocb_is_setup && !cpumask)
|
||||||
|
cpumask = cpu_possible_mask;
|
||||||
|
|
||||||
|
if (cpumask) {
|
||||||
if (!cpumask_available(rcu_nocb_mask)) {
|
if (!cpumask_available(rcu_nocb_mask)) {
|
||||||
if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) {
|
if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) {
|
||||||
pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n");
|
pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n");
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
cpumask_or(rcu_nocb_mask, rcu_nocb_mask, cpumask);
|
||||||
rcu_state.nocb_is_setup = true;
|
rcu_state.nocb_is_setup = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (!rcu_state.nocb_is_setup)
|
if (!rcu_state.nocb_is_setup)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
#if defined(CONFIG_NO_HZ_FULL)
|
if (register_shrinker(&lazy_rcu_shrinker, "rcu-lazy"))
|
||||||
if (tick_nohz_full_running)
|
pr_err("Failed to register lazy_rcu shrinker!\n");
|
||||||
cpumask_or(rcu_nocb_mask, rcu_nocb_mask, tick_nohz_full_mask);
|
|
||||||
#endif /* #if defined(CONFIG_NO_HZ_FULL) */
|
|
||||||
|
|
||||||
if (offload_all)
|
|
||||||
cpumask_setall(rcu_nocb_mask);
|
|
||||||
|
|
||||||
if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
|
if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
|
||||||
pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
|
pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
|
||||||
@@ -1284,6 +1429,7 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
|
|||||||
raw_spin_lock_init(&rdp->nocb_gp_lock);
|
raw_spin_lock_init(&rdp->nocb_gp_lock);
|
||||||
timer_setup(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer, 0);
|
timer_setup(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer, 0);
|
||||||
rcu_cblist_init(&rdp->nocb_bypass);
|
rcu_cblist_init(&rdp->nocb_bypass);
|
||||||
|
WRITE_ONCE(rdp->lazy_len, 0);
|
||||||
mutex_init(&rdp->nocb_gp_kthread_mutex);
|
mutex_init(&rdp->nocb_gp_kthread_mutex);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -1564,14 +1710,19 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
|
|||||||
{
|
{
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
|
||||||
|
{
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
|
static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
|
||||||
unsigned long j)
|
unsigned long j, bool lazy)
|
||||||
{
|
{
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
|
static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
|
||||||
bool *was_alldone, unsigned long flags)
|
bool *was_alldone, unsigned long flags, bool lazy)
|
||||||
{
|
{
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
@@ -1221,11 +1221,13 @@ static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
  * We don't include outgoingcpu in the affinity set, use -1 if there is
  * no outgoing CPU. If there are no CPUs left in the affinity set,
  * this function allows the kthread to execute on any CPU.
+ *
+ * Any future concurrent calls are serialized via ->boost_kthread_mutex.
  */
 static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
 {
     struct task_struct *t = rnp->boost_kthread_task;
-    unsigned long mask = rcu_rnp_online_cpus(rnp);
+    unsigned long mask;
     cpumask_var_t cm;
     int cpu;
 
@@ -1234,6 +1236,7 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
     if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
         return;
     mutex_lock(&rnp->boost_kthread_mutex);
+    mask = rcu_rnp_online_cpus(rnp);
     for_each_leaf_node_possible_cpu(rnp, cpu)
         if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
             cpu != outgoingcpu)
@@ -1771,7 +1771,7 @@ bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork)
|
|||||||
|
|
||||||
if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
|
if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
|
||||||
rwork->wq = wq;
|
rwork->wq = wq;
|
||||||
call_rcu(&rwork->rcu, rcu_work_rcufn);
|
call_rcu_hurry(&rwork->rcu, rcu_work_rcufn);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@@ -230,7 +230,8 @@ static void __percpu_ref_switch_to_atomic(struct percpu_ref *ref,
             percpu_ref_noop_confirm_switch;
 
     percpu_ref_get(ref);    /* put after confirmation */
-    call_rcu(&ref->data->rcu, percpu_ref_switch_to_atomic_rcu);
+    call_rcu_hurry(&ref->data->rcu,
+               percpu_ref_switch_to_atomic_rcu);
 }
 
 static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
@@ -174,7 +174,7 @@ void dst_release(struct dst_entry *dst)
             net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
                          __func__, dst, newrefcnt);
         if (!newrefcnt)
-            call_rcu(&dst->rcu_head, dst_destroy_rcu);
+            call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
     }
 }
 EXPORT_SYMBOL(dst_release);
@@ -234,13 +234,20 @@ static void inet_free_ifa(struct in_ifaddr *ifa)
     call_rcu(&ifa->rcu_head, inet_rcu_free_ifa);
 }
 
+static void in_dev_free_rcu(struct rcu_head *head)
+{
+    struct in_device *idev = container_of(head, struct in_device, rcu_head);
+
+    kfree(rcu_dereference_protected(idev->mc_hash, 1));
+    kfree(idev);
+}
+
 void in_dev_finish_destroy(struct in_device *idev)
 {
     struct net_device *dev = idev->dev;
 
     WARN_ON(idev->ifa_list);
     WARN_ON(idev->mc_list);
-    kfree(rcu_dereference_protected(idev->mc_hash, 1));
 #ifdef NET_REFCNT_DEBUG
     pr_debug("%s: %p=%s\n", __func__, idev, dev ? dev->name : "NIL");
 #endif
@@ -248,7 +255,7 @@ void in_dev_finish_destroy(struct in_device *idev)
     if (!idev->dead)
         pr_err("Freeing alive in_device %p\n", idev);
     else
-        kfree(idev);
+        call_rcu(&idev->rcu_head, in_dev_free_rcu);
 }
 EXPORT_SYMBOL(in_dev_finish_destroy);
 
@@ -298,12 +305,6 @@ out_kfree:
     goto out;
 }
 
-static void in_dev_rcu_put(struct rcu_head *head)
-{
-    struct in_device *idev = container_of(head, struct in_device, rcu_head);
-    in_dev_put(idev);
-}
-
 static void inetdev_destroy(struct in_device *in_dev)
 {
     struct net_device *dev;
@@ -328,7 +329,7 @@ static void inetdev_destroy(struct in_device *in_dev)
     neigh_parms_release(&arp_tbl, in_dev->arp_parms);
     arp_ifdown(dev);
 
-    call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
+    in_dev_put(in_dev);
 }
 
 int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)
@@ -30,9 +30,8 @@ else
 fi
 scenarios="`echo $scenariosarg | sed -e "s/\<CFLIST\>/$defaultconfigs/g"`"
 
-T=/tmp/config2latex.sh.$$
+T=`mktemp -d /tmp/config2latex.sh.XXXXXX`
 trap 'rm -rf $T' 0
-mkdir $T
 
 cat << '---EOF---' >> $T/p.awk
 END {
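This hunk and the script hunks that follow replace predictable $$-based temporary names with mktemp(1), which creates the directory atomically with an unpredictable suffix, so the separate mkdir step disappears. A standalone sketch of the pattern; the script name is illustrative:

    #!/bin/sh
    # Create a private temporary directory and remove it on exit or interrupt.
    T="`mktemp -d ${TMPDIR-/tmp}/example.sh.XXXXXX`"
    trap 'rm -rf $T' 0 2

    echo "scratch files go in $T"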
@@ -29,9 +29,8 @@ else
     exit 1
 fi
 
-T=${TMPDIR-/tmp}/config_override.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/config_override.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 sed < $override -e 's/^/grep -v "/' -e 's/=.*$/="/' |
 awk '
@@ -7,9 +7,8 @@
 #
 # Authors: Paul E. McKenney <paulmck@linux.ibm.com>
 
-T=${TMPDIR-/tmp}/abat-chk-config.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/configcheck.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 cat $1 > $T/.config
 
@@ -15,9 +15,8 @@
 #
 # Authors: Paul E. McKenney <paulmck@linux.ibm.com>
 
-T=${TMPDIR-/tmp}/configinit.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/configinit.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 # Capture config spec file.
 
@@ -12,9 +12,8 @@
 scriptname=$0
 args="$*"
 
-T=${TMPDIR-/tmp}/kvm-again.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-again.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 if ! test -d tools/testing/selftests/rcutorture/bin
 then
@@ -51,27 +50,56 @@ RCUTORTURE="`pwd`/tools/testing/selftests/rcutorture"; export RCUTORTURE
 PATH=${RCUTORTURE}/bin:$PATH; export PATH
 . functions.sh
 
+bootargs=
 dryrun=
 dur=
 default_link="cp -R"
-rundir="`pwd`/tools/testing/selftests/rcutorture/res/`date +%Y.%m.%d-%H.%M.%S-again`"
+resdir="`pwd`/tools/testing/selftests/rcutorture/res"
+rundir="$resdir/`date +%Y.%m.%d-%H.%M.%S-again`"
+got_datestamp=
+got_rundir=
+
 startdate="`date`"
 starttime="`get_starttime`"
 
 usage () {
     echo "Usage: $scriptname $oldrun [ arguments ]:"
+    echo "       --bootargs kernel-boot-arguments"
+    echo "       --datestamp string"
     echo "       --dryrun"
     echo "       --duration minutes | <seconds>s | <hours>h | <days>d"
     echo "       --link hard|soft|copy"
     echo "       --remote"
     echo "       --rundir /new/res/path"
+    echo "Command line: $scriptname $args"
     exit 1
 }
 
 while test $# -gt 0
 do
     case "$1" in
+    --bootargs|--bootarg)
+        checkarg --bootargs "(list of kernel boot arguments)" "$#" "$2" '.*' '^--'
+        bootargs="$bootargs $2"
+        shift
+        ;;
+    --datestamp)
+        checkarg --datestamp "(relative pathname)" "$#" "$2" '^[a-zA-Z0-9._/-]*$' '^--'
+        if test -n "$got_rundir" || test -n "$got_datestamp"
+        then
+            echo Only one of --datestamp or --rundir may be specified
+            usage
+        fi
+        got_datestamp=y
+        ds=$2
+        rundir="$resdir/$ds"
+        if test -e "$rundir"
+        then
+            echo "--datestamp $2: Already exists."
+            usage
+        fi
+        shift
+        ;;
     --dryrun)
         dryrun=1
         ;;
@@ -113,6 +141,12 @@ do
         ;;
     --rundir)
        checkarg --rundir "(absolute pathname)" "$#" "$2" '^/' '^error'
+        if test -n "$got_rundir" || test -n "$got_datestamp"
+        then
+            echo Only one of --datestamp or --rundir may be specified
+            usage
+        fi
+        got_rundir=y
        rundir=$2
        if test -e "$rundir"
        then
@@ -122,8 +156,11 @@ do
        shift
        ;;
     *)
-        echo Unknown argument $1
-        usage
+        if test -n "$1"
+        then
+            echo Unknown argument $1
+            usage
+        fi
        ;;
     esac
     shift
@@ -156,7 +193,7 @@ do
     qemu_cmd_dir="`dirname "$i"`"
     kernel_dir="`echo $qemu_cmd_dir | sed -e 's/\.[0-9]\+$//'`"
     jitter_dir="`dirname "$kernel_dir"`"
-    kvm-transform.sh "$kernel_dir/bzImage" "$qemu_cmd_dir/console.log" "$jitter_dir" $dur < $T/qemu-cmd > $i
+    kvm-transform.sh "$kernel_dir/bzImage" "$qemu_cmd_dir/console.log" "$jitter_dir" $dur "$bootargs" < $T/qemu-cmd > $i
     if test -n "$arg_remote"
     then
         echo "# TORTURE_KCONFIG_GDB_ARG=''" >> $i
@@ -7,9 +7,8 @@
 #
 # Usage: kvm-assign-cpus.sh /path/to/sysfs
 
-T=/tmp/kvm-assign-cpus.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-assign-cpus.sh.XXXXXX`"
 trap 'rm -rf $T' 0 2
-mkdir $T
 
 sysfsdir=${1-/sys/devices/system/node}
 if ! cd "$sysfsdir" > $T/msg 2>&1
@@ -23,9 +23,8 @@ then
 fi
 resdir=${2}
 
-T=${TMPDIR-/tmp}/test-linux.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-build.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 cp ${config_template} $T/config
 cat << ___EOF___ >> $T/config
@@ -18,9 +18,8 @@ then
     exit 1
 fi
 
-T=${TMPDIR-/tmp}/kvm-end-run-stats.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-end-run-stats.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 RCUTORTURE="`pwd`/tools/testing/selftests/rcutorture"; export RCUTORTURE
 PATH=${RCUTORTURE}/bin:$PATH; export PATH
@@ -30,7 +30,7 @@ do
         resdir=`echo $i | sed -e 's,/$,,' -e 's,/[^/]*$,,'`
         head -1 $resdir/log
     fi
-    TORTURE_SUITE="`cat $i/../torture_suite`"
+    TORTURE_SUITE="`cat $i/../torture_suite`" ; export TORTURE_SUITE
     configfile=`echo $i | sed -e 's,^.*/,,'`
     rm -f $i/console.log.*.diags
     case "${TORTURE_SUITE}" in
@@ -34,19 +34,18 @@ fi
 shift
 
 # Pathnames:
-# T:      /tmp/kvm-remote.sh.$$
-# resdir: /tmp/kvm-remote.sh.$$/res
-# rundir: /tmp/kvm-remote.sh.$$/res/$ds ("-remote" suffix)
+# T:      /tmp/kvm-remote.sh.NNNNNN where "NNNNNN" is set by mktemp
+# resdir: /tmp/kvm-remote.sh.NNNNNN/res
+# rundir: /tmp/kvm-remote.sh.NNNNNN/res/$ds ("-remote" suffix)
 # oldrun: `pwd`/tools/testing/.../res/$otherds
 #
 # Pathname segments:
-# TD:      kvm-remote.sh.$$
+# TD:      kvm-remote.sh.NNNNNN
 # ds:      yyyy.mm.dd-hh.mm.ss-remote
 
-TD=kvm-remote.sh.$$
-T=${TMPDIR-/tmp}/$TD
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-remote.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
+TD="`basename "$T"`"
 
 resdir="$T/res"
 ds=`date +%Y.%m.%d-%H.%M.%S`-remote
@@ -13,9 +13,8 @@
 #
 # Authors: Paul E. McKenney <paulmck@kernel.org>
 
-T=${TMPDIR-/tmp}/kvm-test-1-run-batch.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-test-1-run-batch.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 echo ---- Running batch $*
 # Check arguments
@@ -17,9 +17,8 @@
 #
 # Authors: Paul E. McKenney <paulmck@kernel.org>
 
-T=${TMPDIR-/tmp}/kvm-test-1-run-qemu.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-test-1-run-qemu.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 resdir="$1"
 if ! test -d "$resdir"
@@ -109,7 +108,7 @@ do
     if test $kruntime -lt $seconds
     then
         echo Completed in $kruntime vs. $seconds >> $resdir/Warnings 2>&1
-        grep "^(qemu) qemu:" $resdir/kvm-test-1-run.sh.out >> $resdir/Warnings 2>&1
+        grep "^(qemu) qemu:" $resdir/kvm-test-1-run*.sh.out >> $resdir/Warnings 2>&1
         killpid="`sed -n "s/^(qemu) qemu: terminating on signal [0-9]* from pid \([0-9]*\).*$/\1/p" $resdir/Warnings`"
         if test -n "$killpid"
         then
@@ -25,9 +25,8 @@
 #
 # Authors: Paul E. McKenney <paulmck@linux.ibm.com>
 
-T=${TMPDIR-/tmp}/kvm-test-1-run.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm-test-1-run.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 . functions.sh
 . $CONFIGFRAG/ver_functions.sh
@@ -3,10 +3,14 @@
 #
 # Transform a qemu-cmd file to allow reuse.
 #
-# Usage: kvm-transform.sh bzImage console.log jitter_dir [ seconds ] < qemu-cmd-in > qemu-cmd-out
+# Usage: kvm-transform.sh bzImage console.log jitter_dir seconds [ bootargs ] < qemu-cmd-in > qemu-cmd-out
 #
 #    bzImage: Kernel and initrd from the same prior kvm.sh run.
 #    console.log: File into which to place console output.
+#    jitter_dir: Jitter directory for TORTURE_JITTER_START and
+#        TORTURE_JITTER_STOP environment variables.
+#    seconds: Run duaration for *.shutdown_secs module parameter.
+#    bootargs: New kernel boot parameters.  Beware of Robert Tables.
 #
 # The original qemu-cmd file is provided on standard input.
 # The transformed qemu-cmd file is on standard output.
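Given the updated usage line, an invocation with the new positional bootargs argument might look like the following; the result-directory paths, duration, and boot parameters are illustrative only:

    kvm-transform.sh /path/to/res/TREE01/bzImage \
        /path/to/res/TREE01/console.log \
        /path/to/res/jitter 600 \
        "rcutorture.onoff_interval=1000" \
        < /path/to/res/TREE01/qemu-cmd > qemu-cmd.new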
||||||
@@ -17,6 +21,9 @@
|
|||||||
#
|
#
|
||||||
# Authors: Paul E. McKenney <paulmck@kernel.org>
|
# Authors: Paul E. McKenney <paulmck@kernel.org>
|
||||||
|
|
||||||
|
T=`mktemp -d /tmp/kvm-transform.sh.XXXXXXXXXX`
|
||||||
|
trap 'rm -rf $T' 0 2
|
||||||
|
|
||||||
image="$1"
|
image="$1"
|
||||||
if test -z "$image"
|
if test -z "$image"
|
||||||
then
|
then
|
||||||
@@ -41,9 +48,17 @@ then
|
|||||||
echo "Invalid duration, should be numeric in seconds: '$seconds'"
|
echo "Invalid duration, should be numeric in seconds: '$seconds'"
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
bootargs="$5"
|
||||||
|
|
||||||
|
# Build awk program.
|
||||||
|
echo "BEGIN {" > $T/bootarg.awk
|
||||||
|
echo $bootargs | tr -s ' ' '\012' |
|
||||||
|
awk -v dq='"' '/./ { print "\tbootarg[" NR "] = " dq $1 dq ";" }' >> $T/bootarg.awk
|
||||||
|
echo $bootargs | tr -s ' ' '\012' | sed -e 's/=.*$//' |
|
||||||
|
awk -v dq='"' '/./ { print "\tbootpar[" NR "] = " dq $1 dq ";" }' >> $T/bootarg.awk
|
||||||
|
cat >> $T/bootarg.awk << '___EOF___'
|
||||||
|
}
|
||||||
|
|
||||||
awk -v image="$image" -v consolelog="$consolelog" -v jitter_dir="$jitter_dir" \
|
|
||||||
-v seconds="$seconds" '
|
|
||||||
/^# seconds=/ {
|
/^# seconds=/ {
|
||||||
if (seconds == "")
|
if (seconds == "")
|
||||||
print $0;
|
print $0;
|
||||||
@@ -70,13 +85,7 @@ awk -v image="$image" -v consolelog="$consolelog" -v jitter_dir="$jitter_dir" \
|
|||||||
{
|
{
|
||||||
line = "";
|
line = "";
|
||||||
for (i = 1; i <= NF; i++) {
|
for (i = 1; i <= NF; i++) {
|
||||||
if ("" seconds != "" && $i ~ /\.shutdown_secs=[0-9]*$/) {
|
if (line == "") {
|
||||||
sub(/[0-9]*$/, seconds, $i);
|
|
||||||
if (line == "")
|
|
||||||
line = $i;
|
|
||||||
else
|
|
||||||
line = line " " $i;
|
|
||||||
} else if (line == "") {
|
|
||||||
line = $i;
|
line = $i;
|
||||||
} else {
|
} else {
|
||||||
line = line " " $i;
|
line = line " " $i;
|
||||||
@@ -87,7 +96,44 @@ awk -v image="$image" -v consolelog="$consolelog" -v jitter_dir="$jitter_dir" \
|
|||||||
} else if ($i == "-kernel") {
|
} else if ($i == "-kernel") {
|
||||||
i++;
|
i++;
|
||||||
line = line " " image;
|
line = line " " image;
|
||||||
|
} else if ($i == "-append") {
|
||||||
|
for (i++; i <= NF; i++) {
|
||||||
|
arg = $i;
|
||||||
|
lq = "";
|
||||||
|
rq = "";
|
||||||
|
if ("" seconds != "" && $i ~ /\.shutdown_secs=[0-9]*$/)
|
||||||
|
sub(/[0-9]*$/, seconds, arg);
|
||||||
|
if (arg ~ /^"/) {
|
||||||
|
lq = substr(arg, 1, 1);
|
||||||
|
arg = substr(arg, 2);
|
||||||
|
}
|
||||||
|
if (arg ~ /"$/) {
|
||||||
|
rq = substr(arg, length($i), 1);
|
||||||
|
arg = substr(arg, 1, length($i) - 1);
|
||||||
|
}
|
||||||
|
par = arg;
|
||||||
|
gsub(/=.*$/, "", par);
|
||||||
|
j = 1;
|
||||||
|
while (bootpar[j] != "") {
|
||||||
|
if (bootpar[j] == par) {
|
||||||
|
arg = "";
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
j++;
|
||||||
|
}
|
||||||
|
if (line == "")
|
||||||
|
line = lq arg;
|
||||||
|
else
|
||||||
|
line = line " " lq arg;
|
||||||
|
}
|
||||||
|
for (j in bootarg)
|
||||||
|
line = line " " bootarg[j];
|
||||||
|
line = line rq;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
print line;
|
print line;
|
||||||
}'
|
}
|
||||||
|
___EOF___
|
||||||
|
|
||||||
|
awk -v image="$image" -v consolelog="$consolelog" -v jitter_dir="$jitter_dir" \
|
||||||
|
-v seconds="$seconds" -f $T/bootarg.awk
|
||||||
|
|||||||
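The hunks above teach kvm-transform.sh to take a fifth "bootargs" argument and to generate a small awk program ($T/bootarg.awk) that rewrites qemu's -append string: any kernel boot parameter named in bootargs replaces a same-named parameter already present, and the remaining overrides are appended at the end. The stand-alone sketch below shows the same override idea in a single awk invocation rather than a generated program; the parameter values are made up for the demo and are not taken from the patch.

    #!/bin/sh
    # Demo of the override rule: parameters in $overrides replace same-named
    # parameters in $append, and leftover overrides are appended.  Demo values only.
    overrides="rcutorture.shutdown_secs=30 nohz_full=2-7"
    append="console=ttyS0 rcutorture.shutdown_secs=600 rcupdate.rcu_cpu_stall_suppress=1"
    echo $append | awk -v overrides="$overrides" '
    BEGIN {
        n = split(overrides, bootarg, " ");
        for (j = 1; j <= n; j++) {
            bootpar[j] = bootarg[j];
            sub(/=.*$/, "", bootpar[j]);   # keep only the parameter name
        }
    }
    {
        line = "";
        for (i = 1; i <= NF; i++) {
            par = $i;
            sub(/=.*$/, "", par);
            keep = 1;
            for (j = 1; j <= n; j++)
                if (bootpar[j] == par)
                    keep = 0;              # this parameter is overridden below
            if (keep)
                line = (line == "") ? $i : line " " $i;
        }
        for (j = 1; j <= n; j++)
            line = line " " bootarg[j];    # append the overrides
        print line;
    }'
    # Prints (one line): console=ttyS0 rcupdate.rcu_cpu_stall_suppress=1
    # rcutorture.shutdown_secs=30 nohz_full=2-7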
@@ -14,9 +14,8 @@
 scriptname=$0
 args="$*"
 
-T=${TMPDIR-/tmp}/kvm.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/kvm.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 cd `dirname $scriptname`/../../../../../
 
@@ -15,9 +15,8 @@
 
 F=$1
 title=$2
-T=${TMPDIR-/tmp}/parse-build.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/parse-build.sh.XXXXXX`"
 trap 'rm -rf $T' 0
-mkdir $T
 
 . functions.sh
 
@@ -206,9 +206,8 @@ ds="`date +%Y.%m.%d-%H.%M.%S`-torture"
 startdate="`date`"
 starttime="`get_starttime`"
 
-T=/tmp/torture.sh.$$
+T="`mktemp -d ${TMPDIR-/tmp}/torture.sh.XXXXXX`"
 trap 'rm -rf $T' 0 2
-mkdir $T
 
 echo " --- " $scriptname $args | tee -a $T/log
 echo " --- Results directory: " $ds | tee -a $T/log
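kvm.sh, parse-build.sh, and torture.sh all switch their scratch directory from a predictable process-ID-based name plus a separate mkdir to a single mktemp -d call. A minimal sketch of the idiom, using a placeholder script name:

    # mktemp -d both chooses an unpredictable name and creates the directory in
    # one step, so there is no window for another process to pre-create or
    # symlink the path.  "example.sh" is a placeholder name.
    T="`mktemp -d ${TMPDIR-/tmp}/example.sh.XXXXXX`"
    trap 'rm -rf "$T"' 0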
@@ -278,6 +277,8 @@ function torture_one {
 	then
 		cat $T/$curflavor.out | tee -a $T/log
 		echo retcode=$retcode | tee -a $T/log
+	else
+		echo $resdir > $T/last-resdir
 	fi
 	if test "$retcode" == 0
 	then
@@ -303,10 +304,12 @@ function torture_set {
 	shift
 	curflavor=$flavor
 	torture_one "$@"
+	mv $T/last-resdir $T/last-resdir-nodebug || :
 	if test "$do_kasan" = "yes"
 	then
 		curflavor=${flavor}-kasan
 		torture_one "$@" --kasan
+		mv $T/last-resdir $T/last-resdir-kasan || :
 	fi
 	if test "$do_kcsan" = "yes"
 	then
@@ -317,6 +320,7 @@ function torture_set {
 			cur_kcsan_kmake_args="$kcsan_kmake_args"
 		fi
 		torture_one "$@" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" $kcsan_kmake_tag $cur_kcsan_kmake_args --kcsan
+		mv $T/last-resdir $T/last-resdir-kcsan || :
 	fi
 }
 
@@ -326,20 +330,34 @@ then
 	echo " --- allmodconfig:" Start `date` | tee -a $T/log
 	amcdir="tools/testing/selftests/rcutorture/res/$ds/allmodconfig"
 	mkdir -p "$amcdir"
-	echo " --- make clean" > "$amcdir/Make.out" 2>&1
+	echo " --- make clean" | tee $amcdir/log > "$amcdir/Make.out" 2>&1
 	make -j$MAKE_ALLOTED_CPUS clean >> "$amcdir/Make.out" 2>&1
-	echo " --- make allmodconfig" >> "$amcdir/Make.out" 2>&1
-	cp .config $amcdir
-	make -j$MAKE_ALLOTED_CPUS allmodconfig >> "$amcdir/Make.out" 2>&1
-	echo " --- make " >> "$amcdir/Make.out" 2>&1
-	make -j$MAKE_ALLOTED_CPUS >> "$amcdir/Make.out" 2>&1
-	retcode="$?"
-	echo $retcode > "$amcdir/Make.exitcode"
-	if test "$retcode" == 0
+	retcode=$?
+	buildphase='"make clean"'
+	if test "$retcode" -eq 0
+	then
+		echo " --- make allmodconfig" | tee -a $amcdir/log >> "$amcdir/Make.out" 2>&1
+		cp .config $amcdir
+		make -j$MAKE_ALLOTED_CPUS allmodconfig >> "$amcdir/Make.out" 2>&1
+		retcode=$?
+		buildphase='"make allmodconfig"'
+	fi
+	if test "$retcode" -eq 0
+	then
+		echo " --- make " | tee -a $amcdir/log >> "$amcdir/Make.out" 2>&1
+		make -j$MAKE_ALLOTED_CPUS >> "$amcdir/Make.out" 2>&1
+		retcode="$?"
+		echo $retcode > "$amcdir/Make.exitcode"
+		buildphase='"make"'
+	fi
+	if test "$retcode" -eq 0
 	then
 		echo "allmodconfig($retcode)" $amcdir >> $T/successes
+		echo Success >> $amcdir/log
 	else
 		echo "allmodconfig($retcode)" $amcdir >> $T/failures
+		echo " --- allmodconfig Test summary:" >> $amcdir/log
+		echo " --- Summary: Exit code $retcode from $buildphase, see Make.out" >> $amcdir/log
 	fi
 fi
 
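The allmodconfig hunk replaces the unconditional make clean / make allmodconfig / make sequence with stages that each run only when the previous stage succeeded, while $buildphase records which stage produced the exit code left in $retcode and the outcome lands in the per-run log. The pattern in miniature, with placeholder commands rather than the real kernel build:

    retcode=0
    buildphase=none
    for step in "make clean" "make allmodconfig" "make"   # placeholder stages
    do
        if test "$retcode" -eq 0
        then
            sh -c "$step" >> Make.out 2>&1
            retcode=$?
            buildphase="\"$step\""
        fi
    done
    test "$retcode" -eq 0 || echo "Exit code $retcode from $buildphase, see Make.out"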
@@ -379,11 +397,48 @@ then
 else
 	primlist=
 fi
+firsttime=1
+do_kasan_save="$do_kasan"
+do_kcsan_save="$do_kcsan"
 for prim in $primlist
 do
-	torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
-	torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
+	if test -n "$firsttime"
+	then
+		torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
+		torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
+		mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
+		if test -f "$T/last-resdir-kasan"
+		then
+			mv $T/last-resdir-kasan $T/first-resdir-kasan || :
+		fi
+		if test -f "$T/last-resdir-kcsan"
+		then
+			mv $T/last-resdir-kcsan $T/first-resdir-kcsan || :
+		fi
+		firsttime=
+		do_kasan=
+		do_kcsan=
+	else
+		torture_bootargs=
+		for i in $T/first-resdir-*
+		do
+			case "$i" in
+			*-nodebug)
+				torture_suffix=
+				;;
+			*-kasan)
+				torture_suffix="-kasan"
+				;;
+			*-kcsan)
+				torture_suffix="-kcsan"
+				;;
+			esac
+			torture_set "refscale-$prim$torture_suffix" tools/testing/selftests/rcutorture/bin/kvm-again.sh "`cat "$i"`" --duration 5 --bootargs "refscale.scale_type=$prim"
+		done
+	fi
 done
+do_kasan="$do_kasan_save"
+do_kcsan="$do_kcsan_save"
 
 if test "$do_rcuscale" = yes
 then
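This hunk (and the matching rcuscale hunk below) stops rebuilding the kernel for every refscale primitive. torture_one now records its results directory (on success) in $T/last-resdir, torture_set renames that record per build flavor (last-resdir-nodebug/-kasan/-kcsan), and the first pass through the loop stashes those as first-resdir-*; every later primitive re-uses the stashed build through kvm-again.sh with only the boot arguments changed, and KASAN/KCSAN builds likewise happen only on that first pass. A condensed sketch of the control flow; the primitive list is illustrative and the real invocations carry more flags and full paths:

    firsttime=1
    for prim in rcu srcu refcnt   # illustrative primitive list
    do
        if test -n "$firsttime"
        then
            torture_set "refscale-$prim" kvm.sh --torture refscale --trust-make
            mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
            firsttime=
        else
            torture_set "refscale-$prim" kvm-again.sh "`cat $T/first-resdir-nodebug`" \
                --bootargs "refscale.scale_type=$prim"
        fi
    done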
@@ -391,11 +446,48 @@ then
 else
 	primlist=
 fi
+firsttime=1
+do_kasan_save="$do_kasan"
+do_kcsan_save="$do_kcsan"
 for prim in $primlist
 do
-	torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
-	torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
+	if test -n "$firsttime"
+	then
+		torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
+		torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
+		mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
+		if test -f "$T/last-resdir-kasan"
+		then
+			mv $T/last-resdir-kasan $T/first-resdir-kasan || :
+		fi
+		if test -f "$T/last-resdir-kcsan"
+		then
+			mv $T/last-resdir-kcsan $T/first-resdir-kcsan || :
+		fi
+		firsttime=
+		do_kasan=
+		do_kcsan=
+	else
+		torture_bootargs=
+		for i in $T/first-resdir-*
+		do
+			case "$i" in
+			*-nodebug)
+				torture_suffix=
+				;;
+			*-kasan)
+				torture_suffix="-kasan"
+				;;
+			*-kcsan)
+				torture_suffix="-kcsan"
+				;;
+			esac
+			torture_set "rcuscale-$prim$torture_suffix" tools/testing/selftests/rcutorture/bin/kvm-again.sh "`cat "$i"`" --duration 5 --bootargs "rcuscale.scale_type=$prim"
+		done
+	fi
 done
+do_kasan="$do_kasan_save"
+do_kcsan="$do_kcsan_save"
 
 if test "$do_kvfree" = "yes"
 then
@@ -458,7 +550,10 @@ if test -n "$tdir" && test $compress_concurrency -gt 0
 then
 	# KASAN vmlinux files can approach 1GB in size, so compress them.
 	echo Looking for K[AC]SAN files to compress: `date` > "$tdir/log-xz" 2>&1
-	find "$tdir" -type d -name '*-k[ac]san' -print > $T/xz-todo
+	find "$tdir" -type d -name '*-k[ac]san' -print > $T/xz-todo-all
+	find "$tdir" -type f -name 're-run' -print | sed -e 's,/re-run,,' |
+		grep -e '-k[ac]san$' > $T/xz-todo-copy
+	sort $T/xz-todo-all $T/xz-todo-copy | uniq -u > $T/xz-todo
 	ncompresses=0
 	batchno=1
 	if test -s $T/xz-todo
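The first find now produces the complete list of KASAN/KCSAN result directories, the second lists those that were merely re-run (their vmlinux files were already compressed by the original run), and sort | uniq -u takes the difference: because every re-run directory also appears in the complete list, the lines occurring exactly once are exactly the directories that still need compressing. A tiny demonstration with made-up scenario names:

    printf '%s\n' TREE01-kasan TREE02-kcsan TREE03-kasan > /tmp/xz-todo-all
    printf '%s\n' TREE03-kasan > /tmp/xz-todo-copy
    sort /tmp/xz-todo-all /tmp/xz-todo-copy | uniq -u
    # Prints TREE01-kasan and TREE02-kcsan, the directories not yet compressed.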
@@ -490,6 +585,24 @@ then
 			echo Waiting for final batch $batchno of $ncompresses compressions `date` | tee -a "$tdir/log-xz" | tee -a $T/log
 		fi
 		wait
+		if test -s $T/xz-todo-copy
+		then
+			# The trick here is that we need corresponding
+			# vmlinux files from corresponding scenarios.
+			echo Linking vmlinux.xz files to re-use scenarios `date` | tee -a "$tdir/log-xz" | tee -a $T/log
+			dirstash="`pwd`"
+			for i in `cat $T/xz-todo-copy`
+			do
+				cd $i
+				find . -name vmlinux -print > $T/xz-todo-copy-vmlinux
+				for v in `cat $T/xz-todo-copy-vmlinux`
+				do
+					rm -f "$v"
+					cp -l `cat $i/re-run`/"$i/$v".xz "`dirname "$v"`"
+				done
+				cd "$dirstash"
+			done
+		fi
 		echo Size after compressing $n2compress files: `du -sh $tdir | awk '{ print $1 }'` `date` 2>&1 | tee -a "$tdir/log-xz" | tee -a $T/log
 		echo Total duration `get_starttime_duration $starttime`. | tee -a $T/log
 	else
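For scenarios that were re-run rather than rebuilt, the new block deletes the duplicate uncompressed vmlinux and hard-links the vmlinux.xz that the original run already produced, so each kernel is compressed only once. The essential step, with illustrative paths:

    orig_run=/repo/res/2022.11.30-12.00.00-torture/TREE01-kasan   # original build
    rerun=/repo/res/2022.12.01-09.00.00-torture/TREE01-kasan      # re-used scenario
    rm -f "$rerun/vmlinux"
    cp -l "$orig_run/vmlinux.xz" "$rerun/"   # hard link, same filesystem, no extra space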