Commit graph

14459 commits

Author SHA1 Message Date
Paul Mackerras
f3dfd2656d perfcounters: fix a few minor cleanliness issues
This fixes three issues noticed by Arnd Bergmann:

- Add #ifdef __KERNEL__ and move some things around in perf_counter.h
  to make sure only the bits that userspace needs are exported to
  userspace.

- Use __u64, __s64, __u32 types in the structs exported to userspace
  rather than u64, s64, u32.

- Make the sys_perf_counter_open syscall available to the SPUs on
  Cell platforms.

And one issue that I noticed in looking at the code again:

- Wrap the perf_counter_open syscall with SYSCALL_DEFINE4 so we get
  the proper handling of int arguments on ppc64 (and some other 64-bit
  architectures).

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-26 22:43:46 +11:00
Ingo Molnar
0b6de00922 Merge branch 'x86/apic' into perfcounters/core
Conflicts:
	arch/x86/kernel/cpu/perfctr-watchdog.c
2009-02-17 17:20:11 +01:00
Ingo Molnar
494df596f9 Merge branches 'x86/acpi', 'x86/apic', 'x86/cpudetect', 'x86/headers', 'x86/paravirt', 'x86/urgent' and 'x86/xen'; commit 'v2.6.29-rc5' into x86/core 2009-02-17 12:07:00 +01:00
Yinghai Lu
970ec1a821 [IA64] fix __apci_unmap_table
Impact: fix build error

to fix:

  tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
  tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here
  tip/arch/ia64/kernel/acpi.c:203: error: conflicting types for '__acpi_unmap_table'
  tip/include/linux/acpi.h:82: error: previous declaration of '__acpi_unmap_table' was here

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-16 00:43:24 +01:00
Ingo Molnar
b69bc39674 Merge commit 'v2.6.29-rc5' into x86/apic 2009-02-15 09:00:18 +01:00
Ingo Molnar
22796b1572 Merge branch 'core/header-fixes' into x86/headers
Conflicts:
	arch/x86/include/asm/setup.h
2009-02-13 21:05:03 +01:00
Linus Torvalds
b51ebdc40c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ASoC: Only register AC97 bus if it's not done already
  ALSA: hda - Add snd_hda_multi_out_dig_cleanup()
  ALSA: hda - Add missing terminator in slave dig-out array
  ALSA: hda - Change HP dv7 (103c:30f4) quirk from hp-m4 to hp-dv5 model
  ALSA: hda - Register (new) devices at reconfig
  ALSA: mtpav - Fix initial value for input hwport
  ALSA: hda - add id for Intel IbexPeak integrated HDMI codec
  ALSA: hda - compute checksum in HDMI audio infoframe
  ALSA: hda - enable HDMI audio pin out at module loading time
  ALSA: hda - allow multi-channel HDMI audio playback when ELD is not present
  ASoC: Update SDP3430 machine driver for snd_soc_card
  ALSA: hda - Add quirk for Asus z37e (1043:8284)
  sound: Remove OSSlib stuff from linux/soundcard.h
  ASoC: WM8990: Fix kcontrol's private value use in put callback
  ASoC: TLV320AIC3X: Fix kcontrol's private value use in put callback
2009-02-13 08:19:11 -08:00
Paul Mackerras
c07c99b672 perfcounters: make context switch and migration software counters work again
Jaswinder Singh Rajput reported that commit 23a185ca8a caused the
context switch and migration software counters to report zero always.
With that commit, the software counters only count events that occur
between sched-in and sched-out for a task.  This is necessary for the
counter enable/disable prctls and ioctls to work.  However, the
context switch and migration counts are incremented after sched-out
for one task and before sched-in for the next.  Since the increment
doesn't occur while a task is scheduled in (as far as the software
counters are concerned) it doesn't count towards any counter.

Thus the context switch and migration counters need to count events
that occur at any time, provided the counter is enabled, not just
those that occur while the task is scheduled in (from the perf_counter
subsystem's point of view).  The problem though is that the software
counter code can't tell the difference between being enabled and being
scheduled in, and between being disabled and being scheduled out,
since we use the one pair of enable/disable entry points for both.
That is, the high-level disable operation simply arranges for the
counter to not be scheduled in any more, and the high-level enable
operation arranges for it to be scheduled in again.

One way to solve this would be to have sched_in/out operations in the
hw_perf_counter_ops struct as well as enable/disable.  However, this
takes a simpler approach: it adds a 'prev_state' field to the
perf_counter struct that allows a counter's enable method to know
whether the counter was previously disabled or just inactive
(scheduled out), and therefore whether the enable method is being
called as a result of a high-level enable or a schedule-in operation.

This then allows the context switch, migration and page fault counters
to reset their hw.prev_count value in their enable functions only if
they are called as a result of a high-level enable operation.
Although page faults would normally only occur while the counter is
scheduled in, this changes the page fault counter code too in case
there are ever circumstances where page faults get counted against a
task while its counters are not scheduled in.

Reported-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-13 12:20:38 +01:00
Ingo Molnar
b1864e9a1a Merge branch 'x86/core' into perfcounters/core
Conflicts:
	arch/x86/Kconfig
	arch/x86/kernel/apic.c
	arch/x86/kernel/setup_percpu.c
2009-02-13 09:49:38 +01:00
Ingo Molnar
7032e86967 Merge branches 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core 2009-02-13 09:47:32 +01:00
Ingo Molnar
a56cdcb662 Merge branches 'x86/acpi', 'x86/asm', 'x86/cpudetect', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/header-fixes', 'x86/headers' and 'x86/minor-fixes' into x86/core 2009-02-13 09:46:36 +01:00
Ingo Molnar
ab639f3593 Merge branch 'core/percpu' into x86/core 2009-02-13 09:45:09 +01:00
Ingo Molnar
f8a6b2b9ce Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/kernel/acpi/boot.c
	arch/x86/mm/fault.c
2009-02-13 09:44:22 +01:00
Ingo Molnar
e9c4ffb11f Merge branch 'linus' into perfcounters/core
Conflicts:
	arch/x86/kernel/acpi/boot.c
2009-02-13 09:34:07 +01:00
Linus Torvalds
37bed90094 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
  wimax: fix oops in wimax_dev_get_by_genl_info() when looking up non-wimax iface
  net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2
  netxen: fix compile waring "label ‘set_32_bit_mask’ defined but not used" on IA64 platform
  bnx2: Update version to 1.9.2 and copyright.
  bnx2: Fix jumbo frames error handling.
  bnx2: Update 5709 firmware.
  bnx2: Update 5706/5708 firmware.
  3c505: do not set pcb->data.raw beyond its size
  Documentation/connector/cn_test.c: don't use gfp_any()
  net: don't use in_atomic() in gfp_any()
  IRDA: cnt is off by 1
  netxen: remove pcie workaround
  sun3: print when lance_open() fails
  qlge: bugfix: Add missing rx buf clean index on early exit.
  qlge: bugfix: Fix RX scaling values.
  qlge: bugfix: Fix TSO breakage.
  qlge: bugfix: Add missing dev_kfree_skb_any() call.
  qlge: bugfix: Add missing put_page() call.
  qlge: bugfix: Fix fatal error recovery hang.
  qlge: bugfix: Use netif_receive_skb() and vlan_hwaccel_receive_skb().
  ...
2009-02-12 17:47:15 -08:00
Heiko Carstens
6c5979631b syscall define: fix uml compile bug
With the new system call defines we get this on uml:

arch/um/sys-i386/built-in.o: In function `sys_call_table':
(.rodata+0x308): undefined reference to `sys_sigprocmask'

Reason for this is that uml passes the preprocessor option
-Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel.
This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to
SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system
call named sys_kernel_sigprocmask.  However sys_sigprocmask is missing
because of this.

To avoid macro expansion for the system call name just concatenate the
name at first define instead of carrying it through severel levels.
This was pointed out by Al Viro.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-11 14:25:36 -08:00
Li Zefan
cfebe563bd cgroups: fix lockdep subclasses overflow
I enabled all cgroup subsystems when compiling kernel, and then:
 # mount -t cgroup -o net_cls xxx /mnt
 # mkdir /mnt/0

This showed up immediately:
 BUG: MAX_LOCKDEP_SUBCLASSES too low!
 turning off the locking correctness validator.

It's caused by the cgroup hierarchy lock:
	for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
		struct cgroup_subsys *ss = subsys[i];
		if (ss->root == root)
			mutex_lock_nested(&ss->hierarchy_mutex, i);
	}

Now we have 9 cgroup subsystems, and the above 'i' for net_cls is 8, but
MAX_LOCKDEP_SUBCLASSES is 8.

This patch uses different lockdep keys for different subsystems.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-11 14:25:36 -08:00
Linus Torvalds
94dba89533 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers: fix TIMER_ABSTIME for process wide cpu timers
  timers: split process wide cpu clocks/timers, fix
  x86: clean up hpet timer reinit
  timers: split process wide cpu clocks/timers, remove spurious warning
  timers: split process wide cpu clocks/timers
  signal: re-add dead task accumulation stats.
  x86: fix hpet timer reinit for x86_64
  sched: fix nohz load balancer on cpu offline
2009-02-11 08:24:32 -08:00
Markus Metzger
9f339e7028 x86, ptrace, mm: fix double-free on race
Ptrace_detach() races with __ptrace_unlink() if the traced task is
reaped while detaching. This might cause a double-free of the BTS
buffer.

Change the ptrace_detach() path to only do the memory accounting in
ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace()
which will be called from __ptrace_unlink().

The fix follows a proposal from Oleg Nesterov.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 15:44:20 +01:00
Peter Zijlstra
4da94d49b2 timers: fix TIMER_ABSTIME for process wide cpu timers
The POSIX timer interface allows for absolute time expiry values through the
TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock
every time we start it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 14:04:21 +01:00
Peter Zijlstra
3fccfd67df timers: split process wide cpu clocks/timers, fix
To decrease the chance of a missed enable, always enable the timer when we
sample it, we'll always disable it when we find that there are no active timers
in the jiffy tick.

This fixes a flood of warnings reported by Mike Galbraith.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-11 14:04:19 +01:00
Ingo Molnar
ffc0467293 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters into perfcounters/core 2009-02-11 09:22:14 +01:00
Ingo Molnar
95fd4845ed Merge commit 'v2.6.29-rc4' into perfcounters/core
Conflicts:
	arch/x86/kernel/setup_percpu.c
	arch/x86/mm/fault.c
	drivers/acpi/processor_idle.c
	kernel/irq/handle.c
2009-02-11 09:22:04 +01:00
Paul Mackerras
0475f9ea8e perf_counters: allow users to count user, kernel and/or hypervisor events
Impact: new perf_counter feature

This extends the perf_counter_hw_event struct with bits that specify
that events in user, kernel and/or hypervisor mode should not be
counted (i.e. should be excluded), and adds code to program the PMU
mode selection bits accordingly on x86 and powerpc.

For software counters, we don't currently have the infrastructure to
distinguish which mode an event occurs in, so we currently fail the
counter initialization if the setting of the hw_event.exclude_* bits
would require us to distinguish.  Context switches and CPU migrations
are currently considered to occur in kernel mode.

On x86, this changes the previous policy that only root can count
kernel events.  Now non-root users can count kernel events or exclude
them.  Non-root users still can't use NMI events, though.  On x86 we
don't appear to have any way to control whether hypervisor events are
counted or not, so hw_event.exclude_hv is ignored.

On powerpc, the selection of whether to count events in user, kernel
and/or hypervisor mode is PMU-wide, not per-counter, so this adds a
check that the hw_event.exclude_* settings are the same as other events
on the PMU.  Counters being added to a group have to have the same
settings as the other hardware counters in the group.  Counters and
groups can only be enabled in hw_perf_group_sched_in or power_perf_enable
if they have the same settings as any other counters already on the
PMU.  If we are not running on a hypervisor, the exclude_hv setting
is ignored (by forcing it to 0) since we can't ever get any
hypervisor events.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-11 15:06:59 +11:00
Chuck Ebbert
e672f7db76 pkt_sched: type should be __u32 in header
Using u32 in this header breaks the build of iptables.

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10 17:18:17 -08:00
Stefan Richter
1db8508cf4 hugetlbfs: fix build failure with !CONFIG_HUGETLBFS
Fix regression due to 5a6fe12595,
"Do not account for the address space used by hugetlbfs using VM_ACCOUNT"
which added an argument to the function hugetlb_file_setup() but not to
the macro hugetlb_file_setup().

Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-10 14:56:59 -08:00
Linus Torvalds
29ef01179d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
  bridge: Fix LRO crash with tun
  IPv6: fix to set device name when new IPv6 over IPv6 tunnel device is created.
  gianfar: Fix boot hangs while bringing up gianfar ethernet
  netfilter: xt_sctp: sctp chunk mapping doesn't work
  netfilter: ctnetlink: fix echo if not subscribed to any multicast group
  netfilter: ctnetlink: allow changing NAT sequence adjustment in creation
  netfilter: nf_conntrack_ipv6: don't track ICMPv6 negotiation message
  netfilter: fix tuple inversion for Node information request
  netxen: fix msi-x interrupt handling
  de2104x: force correct order when writing to rx ring
  tun: Fix unicast filter overflow
  drivers/isdn: introduce missing kfree
  drivers/atm: introduce missing kfree
  sunhme: Don't match PCI devices in SBUS probe.
  9p: fix endian issues [attempt 3]
  net_dma: call dmaengine_get only if NET_DMA enabled
  3c509: Fix resume from hibernation for PnP mode.
  sungem: Soft lockup in sungem on Netra AC200 when switching interface up
  RxRPC: Fix a potential NULL dereference
  r8169: Don't update statistics counters when interface is down
  ...
2009-02-10 11:48:11 -08:00
Mel Gorman
5a6fe12595 Do not account for the address space used by hugetlbfs using VM_ACCOUNT
When overcommit is disabled, the core VM accounts for pages used by anonymous
shared, private mappings and special mappings. It keeps track of VMAs that
should be accounted for with VM_ACCOUNT and VMAs that never had a reserve
with VM_NORESERVE.

Overcommit for hugetlbfs is much riskier than overcommit for base pages
due to contiguity requirements. It avoids overcommiting on both shared and
private mappings using reservation counters that are checked and updated
during mmap(). This ensures (within limits) that hugepages exist in the
future when faults occurs or it is too easy to applications to be SIGKILLed.

As hugetlbfs makes its own reservations of a different unit to the base page
size, VM_ACCOUNT should never be set. Even if the units were correct, we would
double account for the usage in the core VM and hugetlbfs. VM_NORESERVE may
be set because an application can request no reserves be made for hugetlbfs
at the risk of getting killed later.

With commit fc8744adc8, VM_NORESERVE and
VM_ACCOUNT are getting unconditionally set for hugetlbfs-backed mappings. This
breaks the accounting for both the core VM and hugetlbfs, can trigger an
OOM storm when hugepage pools are too small lockups and corrupted counters
otherwise are used. This patch brings hugetlbfs more in line with how the
core VM treats VM_NORESERVE but prevents VM_ACCOUNT being set.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-10 10:48:42 -08:00
Tejun Heo
6cd61c0baa elf: add ELF_CORE_COPY_KERNEL_REGS()
ELF core dump is used for both user land core dump and kernel crash
dump.  Depending on architecture, register might need to be accessed
differently for userland and kernel.  Allow architectures to define
ELF_CORE_COPY_KERNEL_REGS() and use different operation for kernel
register dump.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-10 00:41:26 +01:00
Arnd Bergmann
43a990765a sound: Remove OSSlib stuff from linux/soundcard.h
Removed OSSlib stuff from linux/soundcard.h to fix the warnings for
'make headers_check'.

This patch breaks building against OSSlib with the kernel headers
instead of its own headers. It should still work with any
version of the library from the 2003 onwards which provide
their own headers for the latest interface.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-10 00:02:54 +01:00
Linus Torvalds
d7c41b6165 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: scatterwalk - Avoid flush_dcache_page on slab pages
  crypto: shash - Fix tfm destruction
  crypto: api - Fix zeroing on free
  crypto: shash - Fix module refcount
  crypto: api - Fix algorithm test race that broke aead initialisation
2009-02-09 08:52:02 -08:00
Kyle McMartin
a5ef7ca0e2 x86: spinlocks: define dummy __raw_spin_is_contended
Architectures other than mips and x86 are not using ticket spinlocks.
Therefore, the contention on the lock is meaningless, since there is
nobody known to be waiting on it (arguably /fairly/ unfair locks).

Dummy it out to return 0 on other architectures.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-09 08:15:39 -08:00
Ingo Molnar
249d51b53a Merge commit 'v2.6.29-rc4' into core/percpu
Conflicts:
	arch/x86/mach-voyager/voyager_smp.c
	arch/x86/mm/fault.c
2009-02-09 14:58:11 +01:00
Yinghai Lu
7d97277b75 acpi/x86: introduce __apci_map_table, v4
to prevent wrongly overwriting fixmap that still want to use.

ACPI used to rely on low mappings being all linearly mapped and
grew a habit: it never really unmapped certain kinds of tables
after use.

This can cause problems - for example the hypothetical case
when some spurious access still references it.

v2: remove prev_map and prev_size in __apci_map_table
v3: let acpi_os_unmap_memory() call early_iounmap too, so remove extral calling to
early_acpi_os_unmap_memory
v4: fix typo in one acpi_get_table_with_size calling

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 13:35:07 +01:00
Paul Mackerras
23a185ca8a perf_counters: make software counters work as per-cpu counters
Impact: kernel crash fix

Yanmin Zhang reported that using a PERF_COUNT_TASK_CLOCK software
counter as a per-cpu counter would reliably crash the system, because
it calls __task_delta_exec with a null pointer.  The page fault,
context switch and cpu migration counters also won't function
correctly as per-cpu counters since they reference the current task.

This fixes the problem by redirecting the task_clock counter to the
cpu_clock counter when used as a per-cpu counter, and by implementing
per-cpu page fault, context switch and cpu migration counters.

Along the way, this:

- Initializes counter->ctx earlier, in perf_counter_alloc, so that
  sw_perf_counter_init can use it
- Adds code to kernel/sched.c to count task migrations into each
  cpu, in rq->nr_migrations_in
- Exports the per-cpu context switch and task migration counts
  via new functions added to kernel/sched.c
- Makes sure that if sw_perf_counter_init fails, we don't try to
  initialize the counter as a hardware counter.  Since the user has
  passed a negative, non-raw event type, they clearly don't intend
  for it to be interpreted as a hardware event.

Reported-by: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 12:47:16 +01:00
Brian Gerst
d3770449d3 percpu: make PER_CPU_BASE_SECTION overridable by arches
Impact: bug fix

IA-64 needs to put percpu data in the seperate section even on UP.
Fixes regression caused by "percpu: refactor percpu.h"

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-09 10:30:29 +01:00
Cornelia Huck
766ccb9ed4 async: Rename _special -> _domain for clarity.
Rename the async_*_special() functions to async_*_domain(), which
describes the purpose of these functions much better.
[Broke up long lines to silence checkpatch]

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2009-02-08 09:56:11 -08:00
Jaswinder Singh Rajput
0fb807c3e5 unconditionally include asm/types.h from linux/types.h
Reported-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-08 11:00:25 +05:30
Linus Torvalds
e83102cab0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI PM: make the PM core more careful with drivers using the new PM framework
  PCI PM: Read power state from device after trying to change it on resume
  PCI PM: Do not disable and enable bridges during suspend-resume
  PCI: PCIe portdrv: Simplify suspend and resume
  PCI PM: Fix saving of device state in pci_legacy_suspend
  PCI PM: Check if the state has been saved before trying to restore it
  PCI PM: Fix handling of devices without drivers
  PCI: return error on failure to read PCI ROMs
  PCI: properly clean up ASPM link state on device remove
2009-02-07 10:46:30 -08:00
Rusty Russell
7f9a50a5b8 module: remove over-zealous check in __module_get()
Impact: fix spurious BUG_ON() triggered under load

module_refcount() isn't reliable outside stop_machine(), as demonstrated
by Karsten Keil <kkeil@suse.de>, networking can trigger it under load
(an inc on one cpu and dec on another while module_refcount() is tallying
 can give false results, for example).

Almost noone should be using __module_get, but that's another issue.

Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-07 08:33:01 -08:00
David S. Miller
b4bd07c20b net_dma: call dmaengine_get only if NET_DMA enabled
Based upon a patch from Atsushi Nemoto <anemo@mba.ocn.ne.jp>

--------------------
The commit 649274d993 ("net_dma:
acquire/release dma channels on ifup/ifdown") added unconditional call
of dmaengine_get() to net_dma.  The API should be called only if
NET_DMA was enabled.
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dan Williams <dan.j.williams@intel.com>
2009-02-06 22:06:43 -08:00
Jaswinder Singh Rajput
527bdfee18 make linux/types.h as assembly safe
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
2009-02-06 20:47:58 +05:30
Ingo Molnar
7d8e23df69 timers: split process wide cpu clocks/timers, remove spurious warning
Mike Galbraith reported that the new warning in thread_group_cputimer()
triggers en masse with Amarok running.

Oleg Nesterov observed:

  Can't fastpath_timer_check()->thread_group_cputimer() have the
  false warning too? Suppose we had the timer, then posix_cpu_timer_del()
  removes this timer, but task_cputime_zero(&sig->cputime_expires) still
  not true.

Remove the spurious debug warning.

Reported-by: Mike Galbraith <efault@gmx.de>
Explained-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-06 14:57:51 +01:00
Ingo Molnar
9d45cf9e36 Merge branch 'x86/urgent' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic merge:
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:30:01 +01:00
Ingo Molnar
a146649bc1 smp, generic: introduce arch_disable_smp_support(), build fix
This function should be provided on UP too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:27:57 +01:00
Ingo Molnar
65a4e574d2 smp, generic: introduce arch_disable_smp_support() instead of disable_ioapic_setup()
Impact: cleanup

disable_ioapic_setup() in init/main.c is ugly as the function is
x86-specific. The #ifdef inline prototype there is ugly too.

Replace it with a generic arch_disable_smp_support() function - which
has a weak alias for non-x86 architectures and for non-ioapic x86 builds.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:27:54 +01:00
Johannes Weiner
777c6c5f1f wait: prevent exclusive waiter starvation
With exclusive waiters, every process woken up through the wait queue must
ensure that the next waiter down the line is woken when it has finished.

Interruptible waiters don't do that when aborting due to a signal.  And if
an aborting waiter is concurrently woken up through the waitqueue, noone
will ever wake up the next waiter.

This has been observed with __wait_on_bit_lock() used by
lock_page_killable(): the first contender on the queue was aborting when
the actual lock holder woke it up concurrently.  The aborted contender
didn't acquire the lock and therefor never did an unlock followed by
waking up the next waiter.

Add abort_exclusive_wait() which removes the process' wait descriptor from
the waitqueue, iff still queued, or wakes up the next waiter otherwise.
It does so under the waitqueue lock.  Racing with a wake up means the
aborting process is either already woken (removed from the queue) and will
wake up the next waiter, or it will remove itself from the queue and the
concurrent wake up will apply to the next waiter after it.

Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
__wait_on_bit_lock() when they were interrupted by other means than a wake
up through the queue.

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Mentored-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>		["after some testing"]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-05 12:56:48 -08:00
Andrea Righi
1f5e31d7e5 fbmem: don't call copy_from/to_user() with mutex held
Avoid calling copy_from/to_user() with fb_info->lock mutex held in fbmem
ioctl().

fb_mmap() is called under mm->mmap_sem (A) held, that also acquires
fb_info->lock (B); fb_ioctl() takes fb_info->lock (B) and does
copy_from/to_user() that might acquire mm->mmap_sem (A), causing a
deadlock.

NOTE: it doesn't push down the fb_info->lock in each own driver's
fb_ioctl(), so there are still potential deadlocks elsewhere.

Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-05 12:56:46 -08:00
Peter Zijlstra
ac7b900490 generic swap(): don't return a value from swap()
The swap() macro is accidentally retuning the value of its first argument.
Change it into a doesn't-return-anything macro before someone goes and
relies upon this behaviour.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-05 12:56:46 -08:00
Peter Zijlstra
4cd4c1b40d timers: split process wide cpu clocks/timers
Change the process wide cpu timers/clocks so that we:

 1) don't mess up the kernel with too many threads,
 2) don't have a per-cpu allocation for each process,
 3) have no impact when not used.

In order to accomplish this we're going to split it into two parts:

 - clocks; which can take all the time they want since they run
           from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID)

 - timers; which need constant time sampling but since they're
           explicity used, the user can pay the overhead.

The clock readout will go back to a full sum of the thread group, while the
timers will run of a global 'clock' that only runs when needed, so only
programs that make use of the facility pay the price.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 13:04:33 +01:00