* freezer:
af_unix: use freezable blocking calls in read
sigtimedwait: use freezable blocking call
nanosleep: use freezable blocking call
futex: use freezable blocking call
select: use freezable blocking call
epoll: use freezable blocking call
binder: use freezable blocking calls
freezer: add new freezable helpers using freezer_do_not_count()
freezer: convert freezable helpers to static inline where possible
freezer: convert freezable helpers to freezer_do_not_count()
freezer: skip waking up tasks with PF_FREEZER_SKIP set
freezer: shorten freezer sleep time using exponential backoff
lockdep: check that no locks held at freeze time
lockdep: remove task argument from debug_check_no_locks_held
freezer: add unsafe versions of freezable helpers for CIFS
freezer: add unsafe versions of freezable helpers for NFS
Pull vfs fixes from Al Viro:
"Several fixes for bugs caught while looking through f_pos (ab)users"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
aout32 coredump compat fix
splice: don't pass the address of ->f_pos to methods
mconsole: we'd better initialize pos before passing it to vfs_read()...
There were a few noticeable formatting issues in core cpufreq code.
This cleans them up to make code look better. The changes include:
- Whitespace cleanup.
- Rearrangements of code.
- Multiline comments fixes.
- Formatting changes to fit 80 columns.
Copyright information in cpufreq.c is also updated to include my name
for 2013.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cpufreq governors' stop and start operations should be carried out
in sequence. Otherwise, there will be unexpected behavior, like in
the example below.
Suppose there are 4 CPUs and policy->cpu=CPU0, CPU1/2/3 are linked
to CPU0. The normal sequence is:
1) Current governor is userspace. An application tries to set the
governor to ondemand. It will call __cpufreq_set_policy() in
which it will stop the userspace governor and then start the
ondemand governor.
2) Current governor is userspace. The online of CPU3 runs on CPU0.
It will call cpufreq_add_policy_cpu() in which it will first
stop the userspace governor, and then start it again.
If the sequence of the above two cases interleaves, it becomes:
1) Application stops userspace governor
2) Hotplug stops userspace governor
which is a problem, because the governor shouldn't be stopped twice
in a row. What happens next is:
3) Application starts ondemand governor
4) Hotplug starts a governor
In step 4, the hotplug is supposed to start the userspace governor,
but now the governor has been changed by the application to ondemand,
so the ondemand governor is started once again, which is incorrect.
The solution is to prevent policy governors from being stopped
multiple times in a row. A governor should only be stopped once for
one policy. After it has been stopped, no more governor stop
operations should be executed.
Also add a mutex to serialize governor operations.
[rjw: Changelog. And you owe me a beverage of my choice.]
Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit a938da06 introduced a useful little log message to tell
users/debuggers which wakeup source aborted a suspend. However,
this message is only printed if the abort happens during the
in-kernel suspend path (after writing /sys/power/state).
The full specification of the /sys/power/wakeup_count facility
allows user-space power managers to double-check if wakeups have
already happened before it actually tries to suspend (e.g. while it
was running user-space pre-suspend hooks), by writing the last known
wakeup_count value to /sys/power/wakeup_count. This patch changes
the sysfs handler for that node to also print said log message if
that write fails, so that we can figure out the offending wakeup
source for both kinds of suspend aborts.
Signed-off-by: Julius Werner <jwerner@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull scheduler fixes from Ingo Molnar:
"Two smaller fixes - plus a context tracking tracing fix that is a bit
bigger"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tracing/context-tracking: Add preempt_schedule_context() for tracing
sched: Fix clear NOHZ_BALANCE_KICK
sched/x86: Construct all sibling maps if smt
Pull perf fixes from Ingo Molnar:
"Four fixes. The mmap ones are unfortunately larger than desired -
fuzzing uncovered bugs that needed perf context life time management
changes to fix properly"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix broken PEBS-LL support on SNB-EP/IVB-EP
perf: Fix mmap() accounting hole
perf: Fix perf mmap bugs
kprobes: Fix to free gone and unused optprobes
Pull timer fixes from Thomas Gleixner:
- Fix inconstinant clock usage in virtual time accounting
- Fix a build error in KVM caused by the NOHZ work
- Remove a pointless timekeeping duty assignment which breaks NOHZ
- Use a proper notifier return value to avoid random behaviour
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick: Remove useless timekeeping duty attribution to broadcast source
nohz: Fix notifier return val that enforce timekeeping
kvm: Move guest entry/exit APIs to context_tracking
vtime: Use consistent clocks among nohz accounting
Dave Jones hit the following bug report:
===============================
[ INFO: suspicious RCU usage. ]
3.10.0-rc2+ #1 Not tainted
-------------------------------
include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
other info that might help us debug this:
RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
2 locks held by cc1/63645:
#0: (&rq->lock){-.-.-.}, at: [<ffffffff816b39fd>] __schedule+0xed/0x9b0
#1: (rcu_read_lock){.+.+..}, at: [<ffffffff8109d645>] cpuacct_charge+0x5/0x1f0
CPU: 1 PID: 63645 Comm: cc1 Not tainted 3.10.0-rc2+ #1 [loadavg: 40.57 27.55 13.39 25/277 64369]
Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
0000000000000000 ffff88010f78fcf8 ffffffff816ae383 ffff88010f78fd28
ffffffff810b698d ffff88011c092548 000000000023d073 ffff88011c092500
0000000000000001 ffff88010f78fd60 ffffffff8109d7c5 ffffffff8109d645
Call Trace:
[<ffffffff816ae383>] dump_stack+0x19/0x1b
[<ffffffff810b698d>] lockdep_rcu_suspicious+0xfd/0x130
[<ffffffff8109d7c5>] cpuacct_charge+0x185/0x1f0
[<ffffffff8109d645>] ? cpuacct_charge+0x5/0x1f0
[<ffffffff8108dffc>] update_curr+0xec/0x240
[<ffffffff8108f528>] put_prev_task_fair+0x228/0x480
[<ffffffff816b3a71>] __schedule+0x161/0x9b0
[<ffffffff816b4721>] preempt_schedule+0x51/0x80
[<ffffffff816b4800>] ? __cond_resched_softirq+0x60/0x60
[<ffffffff816b6824>] ? retint_careful+0x12/0x2e
[<ffffffff810ff3cc>] ftrace_ops_control_func+0x1dc/0x210
[<ffffffff816be280>] ftrace_call+0x5/0x2f
[<ffffffff816b681d>] ? retint_careful+0xb/0x2e
[<ffffffff816b4805>] ? schedule_user+0x5/0x70
[<ffffffff816b4805>] ? schedule_user+0x5/0x70
[<ffffffff816b6824>] ? retint_careful+0x12/0x2e
------------[ cut here ]------------
What happened was that the function tracer traced the schedule_user() code
that tells RCU that the system is coming back from userspace, and to
add the CPU back to the RCU monitoring.
Because the function tracer does a preempt_disable/enable_notrace() calls
the preempt_enable_notrace() checks the NEED_RESCHED flag. If it is set,
then preempt_schedule() is called. But this is called before the user_exit()
function can inform the kernel that the CPU is no longer in user mode and
needs to be accounted for by RCU.
The fix is to create a new preempt_schedule_context() that checks if
the kernel is still in user mode and if so to switch it to kernel mode
before calling schedule. It also switches back to user mode coming back
from schedule in need be.
The only user of this currently is the preempt_enable_notrace(), which is
only used by the tracing subsystem.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1369423420.6828.226.camel@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CONFIG_ACPI_HOTPLUG_MEMORY has been changed to bool (y/n), and
its module option is no longer valid. So, stop using
CONFIG_ACPI_HOTPLUG_MEMORY_MODULE.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thanks to commit f91eb62f71 ("init: scream bloody murder if interrupts
are enabled too early"), "bloody murder" is now being screamed.
With a MIPS OCTEON config, we use on_each_cpu() in our
irq_chip.irq_bus_sync_unlock() function. This gets called in early as a
result of the time_init() call. Because the !SMP version of
on_each_cpu() unconditionally enables irqs, we get:
WARNING: at init/main.c:560 start_kernel+0x250/0x410()
Interrupts were enabled early
CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-rc5-Cavium-Octeon+ #801
Call Trace:
show_stack+0x68/0x80
warn_slowpath_common+0x78/0xb0
warn_slowpath_fmt+0x38/0x48
start_kernel+0x250/0x410
Suggested fix: Do what we already do in the SMP version of
on_each_cpu(), and use local_irq_save/local_irq_restore. Because we
need a flags variable, make it a static inline to avoid name space
issues.
[ Change from v1: Convert on_each_cpu to a static inline function, add
#include <linux/irqflags.h> to avoid build breakage on some files.
on_each_cpu_mask() and on_each_cpu_cond() suffer the same problem as
on_each_cpu(), but they are not causing !SMP bugs for me, so I will
defer changing them to a less urgent patch. ]
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull RCU fixes from Paul McKenney:
"I must confess that this past merge window was not RCU's best showing.
This series contains three more fixes for RCU regressions:
1. A fix to __DECLARE_TRACE_RCU() that causes it to act as an
interrupt from idle rather than as a task switch from idle.
This change is needed due to the recent use of _rcuidle()
tracepoints that can be invoked from interrupt handlers as well
as from idle. Without this fix, invoking _rcuidle() tracepoints
from interrupt handlers results in splats and (more seriously)
confusion on RCU's part as to whether a given CPU is idle or not.
This confusion can in turn result in too-short grace periods and
therefore random memory corruption.
2. A fix to a subtle deadlock that could result due to RCU doing
a wakeup while holding one of its rcu_node structure's locks.
Although the probability of occurrence is low, it really
does happen. The fix, courtesy of Steven Rostedt, uses
irq_work_queue() to avoid the deadlock.
3. A fix to a silent deadlock (invisible to lockdep) due to the
interaction of timeouts posted by RCU debug code enabled by
CONFIG_PROVE_RCU_DELAY=y, grace-period initialization, and CPU
hotplug operations. This will not occur in production kernels,
but really does occur in randconfig testing. Diagnosis courtesy
of Steven Rostedt"
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
rcu: Fix deadlock with CPU hotplug, RCU GP init, and timer migration
rcu: Don't call wakeup() with rcu_node structure ->lock held
trace: Allow idle-safe tracepoints to be called from irq
Pull networking update from David Miller:
1) Fix dump iterator in nfnl_acct_dump() and ctnl_timeout_dump() to
dump all objects properly, from Pablo Neira Ayuso.
2) xt_TCPMSS must use the default MSS of 536 when no MSS TCP option is
present. Fix from Phil Oester.
3) qdisc_get_rtab() looks for an existing matching rate table and uses
that instead of creating a new one. However, it's key matching is
incomplete, it fails to check to make sure the ->data[] array is
identical too. Fix from Eric Dumazet.
4) ip_vs_dest_entry isn't fully initialized before copying back to
userspace, fix from Dan Carpenter.
5) Fix ubuf reference counting regression in vhost_net, from Jason
Wang.
6) When sock_diag dumps a socket filter back to userspace, we have to
translate it out of the kernel's internal representation first.
From Nicolas Dichtel.
7) davinci_mdio holds a spinlock while calling pm_runtime, which
sleeps. Fix from Sebastian Siewior.
8) Timeout check in sh_eth_check_reset is off by one, from Sergei
Shtylyov.
9) If sctp socket init fails, we can NULL deref during cleanup. Fix
from Daniel Borkmann.
10) netlink_mmap() does not propagate errors properly, from Patrick
McHardy.
11) Disable powersave and use minstrel by default in ath9k. From Sujith
Manoharan.
12) Fix a regression in that SOCK_ZEROCOPY is not set on tuntap sockets
which prevents vhost from being able to use zerocopy. From Jason
Wang.
13) Fix race between port lookup and TX path in team driver, from Jiri
Pirko.
14) Missing length checks in bluetooth L2CAP packet parsing, from Johan
Hedberg.
15) rtlwifi fails to connect to networking using any encryption method
other than WPA2. Fix from Larry Finger.
16) Fix iwlegacy build due to incorrect CONFIG_* ifdeffing for power
management stuff. From Yijing Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
b43: stop format string leaking into error msgs
ath9k: Use minstrel rate control by default
Revert "ath9k_hw: Update rx gain initval to improve rx sensitivity"
ath9k: Disable PowerSave by default
net: wireless: iwlegacy: fix build error for il_pm_ops
rtlwifi: Fix a false leak indication for PCI devices
wl12xx/wl18xx: scan all 5ghz channels
wl12xx: increase minimum singlerole firmware version required
wl12xx: fix minimum required firmware version for wl127x multirole
rtlwifi: rtl8192cu: Fix problem in connecting to WEP or WPA(1) networks
mwifiex: debugfs: Fix out of bounds array access
Bluetooth: Fix mgmt handling of power on failures
Bluetooth: Fix missing length checks for L2CAP signalling PDUs
Bluetooth: btmrvl: support Marvell Bluetooth device SD8897
Bluetooth: Fix checks for LE support on LE-only controllers
team: fix checks in team_get_first_port_txable_rcu()
team: move add to port list before port enablement
team: check return value of team_get_port_by_index_rcu() for NULL
tuntap: set SOCK_ZEROCOPY flag during open
netlink: fix error propagation in netlink_mmap()
...
Pull block layer fixes from Jens Axboe:
"Outside of bcache (which really isn't super big), these are all
few-liners. There are a few important fixes in here:
- Fix blk pm sleeping when holding the queue lock
- A small collection of bcache fixes that have been done and tested
since bcache was included in this merge window.
- A fix for a raid5 regression introduced with the bio changes.
- Two important fixes for mtip32xx, fixing an oops and potential data
corruption (or hang) due to wrong bio iteration on stacked devices."
* 'for-linus' of git://git.kernel.dk/linux-block:
scatterlist: sg_set_buf() argument must be in linear mapping
raid5: Initialize bi_vcnt
pktcdvd: silence static checker warning
block: remove refs to XD disks from documentation
blkpm: avoid sleep when holding queue lock
mtip32xx: Correctly handle bio->bi_idx != 0 conditions
mtip32xx: Fix NULL pointer dereference during module unload
bcache: Fix error handling in init code
bcache: clarify free/available/unused space
bcache: drop "select CLOSURES"
bcache: Fix incompatible pointer type warning
There is div64_long() to handle the s64/long division, but no mocro do
u64/ul division. It is necessary in some scenarios, so add this
function.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alex Shi <alex.shi@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we have a page fault for the address which is backed by a hugepage
under migration, the kernel can't wait correctly and do busy looping on
hugepage fault until the migration finishes. As a result, users who try
to kick hugepage migration (via soft offlining, for example) occasionally
experience long delay or soft lockup.
This is because pte_offset_map_lock() can't get a correct migration entry
or a correct page table lock for hugepage. This patch introduces
migration_entry_wait_huge() to solve this.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@vger.kernel.org> [2.6.35+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The dmesg_restrict sysctl currently covers the syslog method for access
dmesg, however /dev/kmsg isn't covered by the same protections. Most
people haven't noticed because util-linux dmesg(1) defaults to using the
syslog method for access in older versions. With util-linux dmesg(1)
defaults to reading directly from /dev/kmsg.
To fix /dev/kmsg, let's compare the existing interfaces and what they
allow:
- /proc/kmsg allows:
- open (SYSLOG_ACTION_OPEN) if CAP_SYSLOG since it uses a destructive
single-reader interface (SYSLOG_ACTION_READ).
- everything, after an open.
- syslog syscall allows:
- anything, if CAP_SYSLOG.
- SYSLOG_ACTION_READ_ALL and SYSLOG_ACTION_SIZE_BUFFER, if
dmesg_restrict==0.
- nothing else (EPERM).
The use-cases were:
- dmesg(1) needs to do non-destructive SYSLOG_ACTION_READ_ALLs.
- sysklog(1) needs to open /proc/kmsg, drop privs, and still issue the
destructive SYSLOG_ACTION_READs.
AIUI, dmesg(1) is moving to /dev/kmsg, and systemd-journald doesn't
clear the ring buffer.
Based on the comments in devkmsg_llseek, it sounds like actions besides
reading aren't going to be supported by /dev/kmsg (i.e.
SYSLOG_ACTION_CLEAR), so we have a strict subset of the non-destructive
syslog syscall actions.
To this end, move the check as Josh had done, but also rename the
constants to reflect their new uses (SYSLOG_FROM_CALL becomes
SYSLOG_FROM_READER, and SYSLOG_FROM_FILE becomes SYSLOG_FROM_PROC).
SYSLOG_FROM_READER allows non-destructive actions, and SYSLOG_FROM_PROC
allows destructive actions after a capabilities-constrained
SYSLOG_ACTION_OPEN check.
- /dev/kmsg allows:
- open if CAP_SYSLOG or dmesg_restrict==0
- reading/polling, after open
Addresses https://bugzilla.redhat.com/show_bug.cgi?id=903192
[akpm@linux-foundation.org: use pr_warn_once()]
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Josh Boyer <jwboyer@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are instances in the kernel where we would like to disable CPU
hotplug (from sysfs) during some important operation. Today the freezer
code depends on this and the code to do it was kinda tailor-made for
that.
Restructure the code and make it generic enough to be useful for other
usecases too.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russ Anderson <rja@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
should be checked if "cur" is txable, not "port".
Introduced by commit 6e88e1357c "team: use function team_port_txable()
for determing enabled and up port"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit bf4d1b5 (cpuidle: support multiple drivers) introduced support
for using multiple cpuidle drivers at the same time. It added a
couple of new APIs to register the driver per CPU, but that led to
some unnecessary code complexity related to the kernel config options
deciding whether or not the multiple driver support is enabled. The
code has to work as it did before when the multiple driver support is
not enabled and the multiple driver support has to be compatible with
the previously existing API.
Remove the new API, not used by any driver in the tree yet (but
needed for the HMP cpuidle drivers that will be submitted soon), and
add a new cpumask pointer to the cpuidle driver structure that will
point to the mask of CPUs handled by the given driver. That will
allow the cpuidle_[un]register_driver() API to be used for the
multiple driver support along with the cpuidle_[un]register()
functions added recently.
[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Filters need to be translated to real BPF code for userland, like SO_GETFILTER.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__DECLARE_TRACE_RCU() currently creates an _rcuidle() tracepoint which
may safely be invoked from what RCU considers to be an idle CPU.
However, these _rcuidle() tracepoints may -not- be invoked from the
handler of an irq taken from idle, because rcu_idle_enter() zeroes
RCU's nesting-level counter, so that the rcu_irq_exit() returning to
idle will trigger a WARN_ON_ONCE().
This commit therefore substitutes rcu_irq_enter() for rcu_idle_exit()
and rcu_irq_exit() for rcu_idle_enter() in order to make the _rcuidle()
tracepoints usable from irq handlers as well as from process context.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
SmartReflex driver interface is natively divided to two parts:
- external SmartReflex interface
- interface between SmartReflex driver and SmartReflex Class
Functions which belong to AVS class interface can use
struct omap_sr* instead of struct voltatedomain*, to provide a
direct connection between SR driver and SR class. This allows
us to optimize and not do additional lookups where none is
required.
sr_enable() and sr_disable() are interface functions between
SR driver and SR class. They are typically used by Class driver
to enable/disable SmartReflex hardware module.
Now they take struct omap_sr* as input parameter.
Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
SmartReflex driver interface is natively divided to two parts:
- external SmartReflex interface
- interface between SmartReflex driver and SmartReflex Class
Functions which belong to AVS class interface can use
struct omap_sr* instead of struct voltatedomain*, to provide a
direct connection between SR driver and SR class. This allows
us to optimize and not do additional lookups where none is
required.
sr_configure_minmax() is interface function between SR driver
and SR class. It is typically used by Class driver to
configure MINMAXAVG module inside SmartReflex module.
Now it takes struct omap_sr* as input parameter.
Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
SmartReflex driver interface is natively divided to two parts:
- external SmartReflex interface
- interface between SmartReflex driver and SmartReflex Class
Functions which belong to AVS class interface can use
struct omap_sr* instead of struct voltatedomain*, to provide a
direct connection between SR driver and SR class. This allows
us to optimize and not do additional lookups where none is
required.
sr_disable_errgen() and sr_configure_errgen() are interface
functions between SR driver and SR class. They are typically
used by Class driver to configure error generator module during
SmartReflex enable/disable sequence.
Now they take struct omap_sr* as input parameter.
Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
The first two fix the case where full RCU debugging is enabled, enabling
function tracing causes a live lock of the system. This is due to the added
debug checks in rcu_dereference_raw() that is used by the function tracer.
These checks are also traced by the function tracer as well as cause enough
overhead to the function tracer to slow down the system enough that
the time to finish an interrupt can take longer than when the next
interrupt is triggered, causing a live lock from the timer interrupt.
Talking this over with Paul McKenney, we came up with a fix that adds
a new rcu_dereference_raw_notrace() that does not perform these added checks,
and let the function tracer use that.
The third commit fixes a failed compile when branch tracing is enabled,
due to the conversion of the trace_test_buffer() selftest that the
branch trace wasn't converted for.
The forth patch fixes a bug caught by the RCU lockdep code where a
rcu_read_lock() is performed when rcu is disabled (either going to
or from idle, or user space). This happened on the irqsoff tracer
as it calls task_uid(). The fix here was to use current_uid() when
possible that doesn't use rcu locking. Which luckily, is always used
when irqsoff calls this code.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJRsQZhAAoJEOdOSU1xswtMquIH/0zyrqrLTnkc5MsNnnJ8kH5R
z1cULts4FqBTUNZ1hdb3BTOu4zywjREIkWfM9qqpBmq9Mq6PBxX7gxWTqYvD4jiX
EatiiCKa7Fyddx4iHJNfvtWgKVYt9WKSNeloRugS9h7NxIZ1wpz21DUpENFQzW2f
jWRnq/AKXFmZ0vn1953mPePtRsg61RYpb7DCkTE1gtUnvL43wMd/Mo6p6BLMEG26
1dDK6EWO/uewl8A4oP5JZYP+AP5Ckd4x1PuQK682AtQw+8S6etaGfeJr0WZmKQoD
0aDZ/NXXSNKChlUFGJusBNJCWryONToa+sdiKuk1h/lW/k9Mail/FChiHBzMiwk=
=uvlD
-----END PGP SIGNATURE-----
Merge tag 'trace-fixes-v3.10-rc3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This contains 4 fixes.
The first two fix the case where full RCU debugging is enabled,
enabling function tracing causes a live lock of the system. This is
due to the added debug checks in rcu_dereference_raw() that is used by
the function tracer. These checks are also traced by the function
tracer as well as cause enough overhead to the function tracer to slow
down the system enough that the time to finish an interrupt can take
longer than when the next interrupt is triggered, causing a live lock
from the timer interrupt.
Talking this over with Paul McKenney, we came up with a fix that adds
a new rcu_dereference_raw_notrace() that does not perform these added
checks, and let the function tracer use that.
The third commit fixes a failed compile when branch tracing is
enabled, due to the conversion of the trace_test_buffer() selftest
that the branch trace wasn't converted for.
The forth patch fixes a bug caught by the RCU lockdep code where a
rcu_read_lock() is performed when rcu is disabled (either going to or
from idle, or user space). This happened on the irqsoff tracer as it
calls task_uid(). The fix here was to use current_uid() when possible
that doesn't use rcu locking. Which luckily, is always used when
irqsoff calls this code."
* tag 'trace-fixes-v3.10-rc3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Use current_uid() for critical time tracing
tracing: Fix bad parameter passed in branch selftest
ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends
rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu()
I broke them in this commit:
commit 1be374a051
Author: Andy Lutomirski <luto@amacapital.net>
Date: Wed May 22 14:07:44 2013 -0700
net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg
This patch adds __sys_sendmsg and __sys_sendmsg as common helpers that accept
MSG_CMSG_COMPAT and blocks MSG_CMSG_COMPAT at the syscall entrypoints. It
also reverts some unnecessary checks in sys_socketcall.
Apparently I was suffering from underscore blindness the first time around.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix timeouts with direct mode authentication in mac80211, from
Stanislaw Gruszka.
2) Aggregation sessions can deadlock in ath9k, from Felix Fietkau.
3) Netfilter's xt_addrtype doesn't work with ipv6 due to route lookups
creating undesirable cache entries, from Florian Westphal.
4) Fix netfilter's ipt_ULOG from generating non-NULL terminated
strings.
5) Fix netdev transmit queue crashes in mac80211, from Johannes Berg.
6) Fix copy and paste error in 802.11 stack that broke reporting of
64-bit station tx statistics, from Felix Fietkau.
7) When qlge_probe fails, it leaks the netdev. Fix from Wei Yongjun.
8) SKB control block (where we store the IP options information,
amongst other things) must be cleared properly otherwise ICMP
sending can crash for IP tunnels. Fix from Eric Dumazet.
9) Verification of Energy Efficient Ether support was coded wrongly,
the test was inversed. Fix from Giuseppe CAVALLARO.
10) TCP handles redirects improperly because the wrong flow key is used
for the route lookup. From Michal Kubecek.
11) Don't interpret MSG_CMSG_COMPAT from userspace, fix from Andy
Lutomirski.
12) The new AF_VSOCK was missing from the lockdep string table, fix from
Federico Vaga.
13) be2net doesn't handle checksumming of IP fragments properly, from
Somnath Kotur.
14) Fix several bugs in the device address list code that lead to
crashes and other misbehaviors. From Jay Vosburgh.
15) Fix ipv6 segmentation handling of fragmented GRE tunnel traffic,
from Pravin B Shalr.
16) Fix usage of stale policies in IPSEC layer, from Paul Moore.
17) Fix team driver dump of ports when there are a large number of them,
from Jiri Pirko.
18) Fix softlockups in UDP ipv4 socket lookup causes by and error in the
hlist_nulls_for_each_entry_rcu() macro. From Eric Dumazet.
19) Fix several regressions added by the high rate accuracy changes to
the htb packet scheduler. From Eric Dumazet.
20) Fix DMA'ing onto the stack in esd_usb2 and peak_usb CAN drivers,
from Olivier Sobrie and Marc Kleine-Budde.
21) Fix unremovable network devices due to missing route pointer
installation in the per-device ipv6 address list entries. From Gao
feng.
22) Apply the tg3 5719 DMA workaround on 5720 chips as well, otherwise
we get stalls. From Nithin Sujir.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (68 commits)
net_sched: htb: do not mix 1ns and 64ns time units
net: fix sk_buff head without data area
tg3: Add read dma workaround for 5720
net: ethernet: xilinx_emaclite: set protocol selector bits when writing ANAR
bnx2x: Fix bridged GSO for 57710/57711 chips
net: fec: add fallback to random MAC address
bnx2x: fix TCP offload for tunneling ipv4 over ipv6
ipv6: assign rt6_info to inet6_ifaddr in init_loopback
net/mlx4_core: Keep VF assigned MAC in the PF admin table
net/mlx4_en: Handle unassigned VF MAC address correctly
net/mlx4_core: Return -EPROBE_DEFER when a VF is probed before PF is sufficiently initialized
net/mlx4_en: Fix adaptive moderation cq update
net: can: peak_usb: Do not do dma on the stack
net: can: esd_usb2: Do not do dma on the stack
net: can: kvaser_usb: fix reception on "USBcan Pro" and "USBcan R" type hardware.
net_sched: restore "overhead xxx" handling
net: force a reload of first item in hlist_nulls_for_each_entry_rcu
hyperv: Fix vlan_proto setting in netvsc_recv_callback()
team: fix port list dump for big number of ports
list: introduce list_first_entry_or_null
...
The "index" field of struct cpufreq_frequency_table was never an
index and isn't used at all by the cpufreq core. It only is useful
for cpufreq drivers for their internal purposes.
Many people nowadays blindly set it in ascending order with the
assumption that the core will use it, which is a mistake.
Rename it to "driver_data" as that's what its purpose is. All of its
users are updated accordingly.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The "runtime idle" helper routine, rpm_idle(), currently ignores
return values from .runtime_idle() callbacks executed by it.
However, it turns out that many subsystems use
pm_generic_runtime_idle() which checks the return value of the
driver's callback and executes pm_runtime_suspend() for the device
unless that value is not 0. If that logic is moved to rpm_idle()
instead, pm_generic_runtime_idle() can be dropped and its users
will not need any .runtime_idle() callbacks any more.
Moreover, the PCI, SCSI, and SATA subsystems' .runtime_idle()
routines, pci_pm_runtime_idle(), scsi_runtime_idle(), and
ata_port_runtime_idle(), respectively, as well as a few drivers'
ones may be simplified if rpm_idle() calls rpm_suspend() after 0 has
been returned by the .runtime_idle() callback executed by it.
To reduce overall code bloat, make the changes described above.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Pull cgroup fixes from Tejun Heo:
- Fix for yet another xattr bug which may lead to NULL deref.
- A subtle bug in for_each_descendant_pre(). This bug requires quite
specific conditions to trigger and isn't too likely to actually
happen in the wild, but maybe that just makes it that much more
nastier.
- A warning message added for silly cgroup re-mount (not -o remount,
but unmount followed by mount) behavior.
* 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: warn about mismatching options of a new mount of an existing hierarchy
cgroup: fix a subtle bug in descendant pre-order walk
cgroup: initialize xattr before calling d_instantiate()
Roman Gushchin discovered that udp4_lib_lookup2() was not reloading
first item in the rcu protected list, in case the loop was restarted.
This produced soft lockups as in https://lkml.org/lkml/2013/4/16/37
rcu_dereference(X)/ACCESS_ONCE(X) seem to not work as intended if X is
ptr->field :
In some cases, gcc caches the value or ptr->field in a register.
Use a barrier() to disallow such caching, as documented in
Documentation/atomic_ops.txt line 114
Thanks a lot to Roman for providing analysis and numerous patches.
Diagnosed-by: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the definitions of offline_pages() and remove_memory()
for CONFIG_MEMORY_HOTREMOVE to memory_hotplug.h, where they belong,
and make them static inline.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now that the memory offlining should be taken care of by the
companion device offlining code in acpi_scan_hot_remove(), the
ACPI memory hotplug driver doesn't need to offline it in
remove_memory() any more. Moreover, since the return value of
remove_memory() is not used, it's better to make it be a void
function and trigger a BUG() if the memory scheduled for removal is
not offline.
Change the code in accordance with the above observations.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Since offline_memory_block(mem) is functionally equivalent to
device_offline(&mem->dev), make the only caller of the former use
the latter instead and drop offline_memory_block() entirely.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
As noted by Tang Chen, the last_online field in struct memory_block
introduced by commit 4960e05 (Driver core: Introduce offline/online
callbacks for memory blocks) is not really necessary, because
online_pages() restores the previous state if passed ONLINE_KEEP as
the last argument. Therefore, remove that field along with the code
referring to it.
References: http://marc.info/?l=linux-kernel&m=136919777305599&w=2
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Tang Chen <tangchen@cn.fujitsu.com>
udp6 over GRE tunnel does not work after to GRE tso changes. GRE
tso handler passes inner packet but keeps track of outer header
start in SKB_GSO_CB(skb)->mac_offset. udp6 fragment need to
take care of outer header, which start at the mac_offset, while
adding fragment header.
This bug is introduced by commit 68c3316311 (GRE: Add TCP
segmentation offload for GRE).
Reported-by: Dmitry Kravkov <dkravkov@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kvm_host.h header file doesn't handle well
inclusion when archs don't support KVM.
This results in build crashes for such archs when they
want to implement context tracking because this subsystem
includes kvm_host.h in order to implement the
guest_enter/exit APIs but it doesn't handle KVM off case.
To fix this, move the guest_enter()/guest_exit()
declarations and generic implementation to the context
tracking headers. These generic APIs actually belong to
this subsystem, besides other domains boundary tracking
like user_enter() et al.
KVM now properly becomes a user of this library, not the
other buggy way around.
Reported-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>