Commit graph

45583 commits

Author SHA1 Message Date
Linus Torvalds
4aa705b18b ARM: SoC: platform support for v4.2
Our SoC branch usually contains expanded support for new SoCs and
 other core platform code. Some highlights from this round:
 
 - sunxi: SMP support for A23 SoC
 - socpga: big-endian support
 - pxa: conversion to common clock framework
 - bcm: SMP support for BCM63138
 - imx: support new I.MX7D SoC
 - zte: basic support for ZX296702 SoC
 
  Conflicts:
 	arch/arm/mach-socfpga/core.h
 
 Trivial remove/remove conflict with our cleanup branch.
 Resolution: remove both sides
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVi4RMAAoJEFk3GJrT+8Zl6/kP/1Rv9O++1Kxua6R54Og6AF1J
 0miFr2fnUrUWUYg/NVbseRH5bBe6N6ir3SQMfde8W2/QibEjOoEwSwrle+mC/eiq
 CE0x0gtyRvXMrMU/FWkOvbmmw9uv5oz1z3IHZV6AiecNuSMLUBPfamryikQ8C+d1
 O/QZtX543tJQJDOBihO5cuhoVVM37UX0unNmqGsyswlyqTPF8FxcIJAYVNtnxjmj
 AFaOB0nDJKLKFTiX2Ype2wOxxJX1lrLatNo4W4T+YaaK+i1uCOhgTdSN+n49K7YA
 KNDFEgZFQqT8VMJyG+eJVeYF+cI7yWQ7lBzIftPUjPk/7+dIHBjWPz2QdjVz3U38
 kxncf4S9xGAF5G2rcKe4mFrfT3Y8QLWQpA/jFs06yLwW1O3Hlfq3DzMdGNcF7hth
 17LOP8namn9+NepZEp/vAlFzRRypxWWtbkPNBIItkImC6zn0IiGjBy50DE1io27W
 hmQcnMb7d+0wWl2Y8OmR2lZSB97JiRZkRYMCVHVt+0zGJzp4prLvl9wbjh1VXkPv
 ERCDJ9nCmZsl7ZVmIXMI7KNXYuPNp7R/QAzCvuSUueswF0qxTAQ0VSSBwRMqvQsQ
 UUNC6p63VnjUeMUdn2EBsUQZ0Uqw3t2U5TtvooHNt9FkiGsSpwjWrvVD+LItaPoJ
 GPeeJrJaYQsDvTrO8wjU
 =ZtPK
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC platform support updates from Kevin Hilman:
 "Our SoC branch usually contains expanded support for new SoCs and
  other core platform code.  Some highlights from this round:

   - sunxi: SMP support for A23 SoC
   - socpga: big-endian support
   - pxa: conversion to common clock framework
   - bcm: SMP support for BCM63138
   - imx: support new I.MX7D SoC
   - zte: basic support for ZX296702 SoC"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (134 commits)
  ARM: zx: Add basic defconfig support for ZX296702
  ARM: dts: zx: add an initial zx296702 dts and doc
  clk: zx: add clock support to zx296702
  dt-bindings: Add #defines for ZTE ZX296702 clocks
  ARM: socfpga: fix build error due to secondary_startup
  MAINTAINERS: ARM64: EXYNOS: Extend entry for ARM64 DTS
  ARM: ep93xx: simone: support for SPI-based MMC/SD cards
  MAINTAINERS: update Shawn's email to use kernel.org one
  ARM: socfpga: support suspend to ram
  ARM: socfpga: add CPU_METHOD_OF_DECLARE for Arria 10
  ARM: socfpga: use CPU_METHOD_OF_DECLARE for socfpga_cyclone5
  ARM: EXYNOS: register power domain driver from core_initcall
  ARM: EXYNOS: use PS_HOLD based poweroff for all supported SoCs
  ARM: SAMSUNG: Constify platform_device_id
  ARM: EXYNOS: Constify irq_domain_ops
  ARM: EXYNOS: add coupled cpuidle support for Exynos3250
  ARM: EXYNOS: add exynos_get_boot_addr() helper
  ARM: EXYNOS: add exynos_set_boot_addr() helper
  ARM: EXYNOS: make exynos_core_restart() less verbose
  ARM: EXYNOS: fix exynos_boot_secondary() return value on timeout
  ...
2015-06-26 11:34:35 -07:00
Linus Torvalds
c11d716218 ARM: SoC cleanups for v4.2
A relatively small setup of cleanups this time around, and similar to last time
 the bulk of it is removal of legacy board support:
 
 - OMAP: removal of legacy (non-DT) booting for several platforms
 - i.MX: remove some legacy board files
 
 Conflicts: None
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVi4RJAAoJEFk3GJrT+8ZlS78P/28n9U/KBoBkHWFhLmsgAUC/
 X06CfjxcFuRndLoj96jxhWdaf65GwFsDPCsWfI270X7kYlu08dn2AlLOOHxNf1Ou
 DvfqXz8dBl3z8pg0VHcZzUaV9CwfbvIHbalD2rLQ26sF6vRfHrF7AC+ITTi2TEpY
 CMX7LF1igX8nnRRZpl0Oya8Uvr8vsjuuvSq6I2GFav/JhZpYhz19FqbVPtu13Kuw
 +AZrwvgn5KwSxYw6cDjwWgbSBBuZi2m22LJxrs8z/ckRVA0ErpKLD83GZ5DeA+Ii
 m+f0xv/EE59AnfCnlpnZizxeQQ7BSVAZbaSp4GuXgLtg+1zJZP9tsJ8gLGuSLIo4
 TMMcB7K2VQMs4orAEvd0B7QB2WEtu4NDgKmD7k7tSy0uCMqxY/ItYDFVLpLRz9+r
 cnB469H312MJuk+7eH324n62lWhlQ8h1D2zHXMDCo4kvgZg/Qcbl6joj34B6oTqz
 Pa1RoJwrh4CsihDF/2aUt1IOZ+4mm0hhg4gocZEyqvf7Xdya2oFiyoUQCLnPQfbd
 Vvw7JMxnIW9iQbYzu4eHlZin3TK53osmIepj7pnlrmdLcM046fLONDxsg9JovsrS
 +TwHE00DZdAgpH/z1aUo8Ft4xO60FkPK2YcOCUvKZKi3mIHenvZZQo+s28suNni1
 Q1dz9aZWNvaunmRl4EcH
 =0zof
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC cleanups from Kevin Hilman:
 "A relatively small setup of cleanups this time around, and similar to
  last time the bulk of it is removal of legacy board support:

   - OMAP: removal of legacy (non-DT) booting for several platforms

   - i.MX: remove some legacy board files"

* tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (36 commits)
  ARM: fix EFM32 build breakage caused by cpu_resume_arm
  ARM: 8389/1: Add cpu_resume_arm() for firmwares that resume in ARM state
  ARM: v7 setup function should invalidate L1 cache
  mach-omap2: Remove use of deprecated marco, PTR_RET in devices.c
  ARM: OMAP2+: Remove calls to deprecacted marco,PTR_RET in the files,fb.c and pmu.c
  ARM: OMAP2+: Constify irq_domain_ops
  ARM: OMAP2+: use symbolic defines for console loglevels instead of numbers
  ARM: at91: remove useless Makefile.boot
  ARM: at91: remove at91rm9200_sdramc.h
  ARM: at91: remove mach/at91_ramc.h and mach/at91rm9200_mc.h
  ARM: at91/pm: use the atmel-mc syscon defines
  pcmcia: at91_cf: Use syscon to configure the MC/smc
  ARM: at91: declare the at91rm9200 memory controller as a syscon
  mfd: syscon: Add Atmel MC (Memory Controller) registers definition
  ARM: at91: drop sam9_smc.c
  ata: at91: use syscon to configure the smc
  ARM: ux500: delete static resource defines
  ARM: ux500: rename ux500_map_io
  ARM: ux500: look up PRCMU resource from DT
  ARM: ux500: kill off L2CC static map
  ...
2015-06-26 11:08:27 -07:00
Linus Torvalds
47a469421d Merge branch 'akpm' (patches from Andrew)
Merge second patchbomb from Andrew Morton:

 - most of the rest of MM

 - lots of misc things

 - procfs updates

 - printk feature work

 - updates to get_maintainer, MAINTAINERS, checkpatch

 - lib/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (96 commits)
  exit,stats: /* obey this comment */
  coredump: add __printf attribute to cn_*printf functions
  coredump: use from_kuid/kgid when formatting corename
  fs/reiserfs: remove unneeded cast
  NILFS2: support NFSv2 export
  fs/befs/btree.c: remove unneeded initializations
  fs/minix: remove unneeded cast
  init/do_mounts.c: add create_dev() failure log
  kasan: remove duplicate definition of the macro KASAN_FREE_PAGE
  fs/efs: femove unneeded cast
  checkpatch: emit "NOTE: <types>" message only once after multiple files
  checkpatch: emit an error when there's a diff in a changelog
  checkpatch: validate MODULE_LICENSE content
  checkpatch: add multi-line handling for PREFER_ETHER_ADDR_COPY
  checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr()
  checkpatch: fix processing of MEMSET issues
  checkpatch: suggest using ether_addr_equal*()
  checkpatch: avoid NOT_UNIFIED_DIFF errors on cover-letter.patch files
  checkpatch: remove local from codespell path
  checkpatch: add --showfile to allow input via pipe to show filenames
  ...
2015-06-26 09:52:05 -07:00
Linus Torvalds
c13c810063 RTC for 4.2
Core:
  - Coding style and whitespace fixes (interface, Makefile and Kconfig)
  - New rtc_tm_sub() helper
  - New CONFIG_RTC_SYSTOHC_DEVICE option
  - Removed rtc_set_mmss()
 
 New drivers:
  - Mediatek MT6397
  - Cortina Gemini
 
 Drivers:
  - Year 2106 fixes for isl1208, pcf8563 and sunxi
  - update author email for at32ap700x and efi
  - ds1307: alarm fix
  - efi: use correct EFI 'epoch'
  - hym8563: make irq optional
  - imxdi: cleanups and better handling of the security/tamper monitoring
  - snvs: fix wakealarm
  - Compilation fixes or warning removal for gemini, mt6397, palmas, pfc8563
  - Trivial cleanups for ab8500, ds1216, ds1286, ds1672, ep93xx,
 hid-sensor-time, max6900, max8998, max77686, max77802, mc13xxx, mv, mxc, s3c,
 spear, v3020
  - Kconfig fixes for stmp3xxx and xgene
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJVi7VHAAoJEKbNnwlvZCyzv+YP/1sMAsrusvx3cUEmhJurNk5H
 +vPUdogbr8df9voLkHqTaK1B5+LrL5EXYu+y/6VOwaLIiYv7P9k/eOeZ44fraidc
 H7xnDQ7cOGzE5LpjPg8+o56rh3rGDF1UdZBwbFTgfv4gmZi+GAf+2MEUam+dVVtF
 K9+7Xu77A6EgW+h2uHix8GVtflHdRk/Vxjxl0uOR9BM9O883N5dYYZ7sjenfG65D
 dctAEo+U1xUCw3GCIQ63DvD0joThQk7IB9xJmK/lE5nIv4tbGp5PV8VSNkjKxbFz
 1wa9SR1dhBdjbJ6ZeHh8N5EJ0d5MrssW1blsb7/06MHTu0KYgCndU5EEDOE2pVqD
 GJooYa6RsNqz/KElJ7/0z4T+DJxW7BWaZYRIwv4jpuA4/Y74O5FDw0ULOqxhB8zd
 kVgfWb0feNQikaZio8gOyzIBkw3CBZBu+f6HMJGaeYWSzTNsDLgA8ee7mUawMP31
 ljOxOuPMiTGUbAczK9URo/acZYWNyCOQdm85FxCuwCuPe48m7CyTU6wWBX0Vwj5q
 y3sHH5stcykCesneNj+IJ8v/knqo9d6M43CbCbwBeo0DJskuXF6jjQ9QLBNZP9U0
 7qQa7isL3j3J7T5EouJIBQH2yy4SFiffVOZOPbJ8l+iRUPZUMdOEs2VkQs8BKIZ+
 znu/Ov941h8/HbyI6rcl
 =mugF
 -----END PGP SIGNATURE-----

Merge tag 'rtc-v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "Core:
   - Coding style and whitespace fixes (interface, Makefile and Kconfig)
   - New rtc_tm_sub() helper
   - New CONFIG_RTC_SYSTOHC_DEVICE option
   - Removed rtc_set_mmss()

  New drivers:
   - Mediatek MT6397
   - Cortina Gemini

  Drivers:
   - Year 2106 fixes for isl1208, pcf8563 and sunxi
   - update author email for at32ap700x and efi
   - ds1307: alarm fix
   - efi: use correct EFI 'epoch'
   - hym8563: make irq optional
   - imxdi: cleanups and better handling of the security/tamper monitoring
   - snvs: fix wakealarm
   - Compilation fixes or warning removal for gemini, mt6397, palmas, pfc8563
   - Trivial cleanups for ab8500, ds1216, ds1286, ds1672, ep93xx,
     hid-sensor-time, max6900, max8998, max77686, max77802, mc13xxx, mv,
     mxc, s3c, spear, v3020
   - Kconfig fixes for stmp3xxx and xgene"

* tag 'rtc-v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (48 commits)
  rtc: remove useless I2C dependencies
  rtc: whitespace fixes
  rtc: Properly sort Makefile
  MAINTAINERS: Add RTC subsystem repository
  rtc: pfc8563: fix uninitialized variable warning
  rtc: ds1307: Enable the mcp794xx alarm after programming time
  rtc: hym8563: make the irq optional
  rtc: gemini: fix cocci warnings
  rtc: mv: correct 24 hour error message
  rtc: mv: use BIT()
  rtc: efi: use correct EFI 'epoch'
  rtc: interface: Remove rtc_set_mmss()
  sparc: time: Replace update_persistent_clock() with CONFIG_RTC_SYSTOHC
  rtc: NTP: Add CONFIG_RTC_SYSTOHC_DEVICE for NTP synchronization
  rtc: sunxi: Replace deprecated rtc_tm_to_time()
  rtc: isl1208: Replace deprecated rtc_tm_to_time()
  rtc: Introduce rtc_tm_sub() helper function
  rtc: pcf8563: Replace deprecated rtc_time_to_tm() and rtc_tm_to_time()
  rtc: palmas: Initialise bb_charging flag before using it
  rtc: simplify use of devm_ioremap_resource
  ...
2015-06-25 18:55:33 -07:00
Linus Torvalds
9390bd0d14 Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar.

* 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox/bcm2835: Fix mailbox full detection.
  dt: mailbox: Remove 'mbox-names property is discouraged' message from binding
  mailbox: Add ability for clients to request channels by name
  mailbox: Enable BCM2835 mailbox support
  dt/bindings: Add binding for the BCM2835 mailbox driver
  mailbox: Fix up error handling in mbox_request_channel()
  mailbox: Make mbox_chan_ops const
  mailbox: altera: Add dependency on HAS_IOMEM
2015-06-25 18:38:40 -07:00
Linus Torvalds
0db9723cac Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
 "Specifics:

   - enhance Thermal Framework with several new capabilities:

       * use power estimates
       * compute weights with relative integers instead of percentages
       * allow governors to have private data in thermal zones
       * export thermal zone parameters through sysfs

     Thanks to the ARM thermal team (Javi, Punit, KP).

   - introduce a new thermal governor: power allocator.  First in kernel
     closed loop PI(D) controller for thermal control.  Thanks to ARM
     thermal team.

   - enhance OF thermal to allow thermal zones to have sustainable power
     HW specification.  Thanks to Punit.

   - introduce thermal driver for Intel Quark SoC x1000platform.  Thanks
     to Ong, Boon Leong.

   - introduce QPNP PMIC temperature alarm driver.  Thanks to Ivan T. I.

   - introduce thermal driver for Hisilicon hi6220.  Thanks to
     kongxinwei.

   - enhance Exynos thermal driver to handle Exynos5433 TMU.  Thanks to
     Chanwoo C.

   - TI thermal driver now has a better implementation for EOCZ bit.
     From Pavel M.

   - add id for Skylake processors in int340x processor thermal driver.

   - a couple of small fixes and cleanups."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (36 commits)
  thermal: hisilicon: add new hisilicon thermal sensor driver
  dt-bindings: Document the hi6220 thermal sensor bindings
  thermal: of-thermal: add support for reading coefficients property
  thermal: support slope and offset coefficients
  thermal: power_allocator: round the division when divvying up power
  thermal: exynos: Add the support for Exynos5433 TMU
  thermal: cpu_cooling: Fix power calculation when CPUs are offline
  thermal: cpu_cooling: Remove cpu_dev update on policy CPU update
  thermal: export thermal_zone_parameters to sysfs
  thermal: cpu_cooling: Check memory allocation of power_table
  ti-soc-thermal: request temperature periodically if hw can't do that itself
  ti-soc-thermal: implement eocz bit to make driver useful on omap3
  cleanup ti-soc-thermal
  thermal: remove stale THERMAL_POWER_ACTOR select
  thermal: Default OF created trip points to writable
  thermal: core: Add Kconfig option to enable writable trips
  thermal: x86_pkg_temp: drop const for thermal_zone_parameters
  of: thermal: Introduce sustainable power for a thermal zone
  thermal: add trace events to the power allocator governor
  thermal: introduce the Power Allocator governor
  ...
2015-06-25 17:51:55 -07:00
Linus Torvalds
f7b08217c7 Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull DMI updates from Jean Delvare:
 "The most important change is the new sysfs interface to the DMI table,
  which will let user-space tools (such as dmidecode) access the table
  without relying on /dev/mem"

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi: struct dmi_header should be packed
  firmware: dmi_scan: Coding style cleanups
  Documentation: ABI: sysfs-firmware-dmi: add -entries suffix to file name
  firmware: dmi_scan: add SBMIOS entry and DMI tables
  firmware: dmi_scan: Trim DMI table length before exporting it
  firmware: dmi_scan: Rename dmi_table to dmi_decode_table
  firmware: dmi: List my quilt tree
  firmware: dmi_scan: Only honor end-of-table for 64-bit tables
2015-06-25 17:14:01 -07:00
Rasmus Villemoes
94df290404 lib/string.c: introduce strreplace()
Strings are sometimes sanitized by replacing a certain character (often
'/') by another (often '!').  In a few places, this is done the same way
Schlemiel the Painter would do it.  Others are slightly smarter but still
do multiple strchr() calls.  Introduce strreplace() to do this using a
single function call and a single pass over the string.

One would expect the return value to be one of three things: void, s, or
the number of replacements made.  I chose the fourth, returning a pointer
to the end of the string.  This is more likely to be useful (for example
allowing the caller to avoid a strlen call).

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:40 -07:00
Vasily Averin
3ea4331c60 check_syslog_permissions() cleanup
Patch fixes drawbacks in heck_syslog_permissions() noticed by AKPM:
"from_file handling makes me cry.

That's not a boolean - it's an enumerated value with two values
currently defined.

But the code in check_syslog_permissions() treats it as a boolean and
also hardwires the knowledge that SYSLOG_FROM_PROC == 1 (or == `true`).

And the name is wrong: it should be called from_proc to match
SYSLOG_FROM_PROC."

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:39 -07:00
Tejun Heo
6fe29354be printk: implement support for extended console drivers
printk log_buf keeps various metadata for each message including its
sequence number and timestamp.  The metadata is currently available only
through /dev/kmsg and stripped out before passed onto console drivers.  We
want this metadata to be available to console drivers too so that console
consumers can get full information including the metadata and dictionary,
which among other things can be used to detect whether messages got lost
in transit.

This patch implements support for extended console drivers.  Consoles can
indicate that they want extended messages by setting the new CON_EXTENDED
flag and they'll be fed messages formatted the same way as /dev/kmsg.

 "<level>,<sequnum>,<timestamp>,<contflag>;<message text>\n"

If extended consoles exist, in-kernel fragment assembly is disabled.  This
ensures that all messages emitted to consoles have full metadata including
sequence number.  The contflag carries enough information to reassemble
the fragments from the reader side trivially.  Note that this only affects
/dev/kmsg.  Regular console and /proc/kmsg outputs are not affected by
this change.

* Extended message formatting for console drivers is enabled iff there
  are registered extended consoles.

* Comment describing /dev/kmsg message format updated to add missing
  contflag field and help distinguishing variable from verbatim terms.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Miller <davem@davemloft.net>
Cc: Kay Sievers <kay@vrfy.org>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Tejun Heo
d43ff430f4 printk: guard the amount written per line by devkmsg_read()
This patchset updates netconsole so that it can emit messages with the
same header as used in /dev/kmsg which gives neconsole receiver full log
information which enables things like structured logging and detection
of lost messages.

This patch (of 7):

devkmsg_read() uses 8k buffer and assumes that the formatted output
message won't overrun which seems safe given LOG_LINE_MAX, the current use
of dict and the escaping method being used; however, we're planning to use
devkmsg formatting wider and accounting for the buffer size properly isn't
that complicated.

This patch defines CONSOLE_EXT_LOG_MAX as 8192 and updates devkmsg_read()
so that it limits output accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Miller <davem@davemloft.net>
Cc: Kay Sievers <kay@vrfy.org>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Josh Triplett
3033f14ab7 clone: support passing tls argument via C rather than pt_regs magic
clone has some of the quirkiest syscall handling in the kernel, with a
pile of special cases, historical curiosities, and architecture-specific
calling conventions.  In particular, clone with CLONE_SETTLS accepts a
parameter "tls" that the C entry point completely ignores and some
assembly entry points overwrite; instead, the low-level arch-specific
code pulls the tls parameter out of the arch-specific register captured
as part of pt_regs on entry to the kernel.  That's a massive hack, and
it makes the arch-specific code only work when called via the specific
existing syscall entry points; because of this hack, any new clone-like
system call would have to accept an identical tls argument in exactly
the same arch-specific position, rather than providing a unified system
call entry point across architectures.

The first patch allows architectures to handle the tls argument via
normal C parameter passing, if they opt in by selecting
HAVE_COPY_THREAD_TLS.  The second patch makes 32-bit and 64-bit x86 opt
into this.

These two patches came out of the clone4 series, which isn't ready for
this merge window, but these first two cleanup patches were entirely
uncontroversial and have acks.  I'd like to go ahead and submit these
two so that other architectures can begin building on top of this and
opting into HAVE_COPY_THREAD_TLS.  However, I'm also happy to wait and
send these through the next merge window (along with v3 of clone4) if
anyone would prefer that.

This patch (of 2):

clone with CLONE_SETTLS accepts an argument to set the thread-local
storage area for the new thread.  sys_clone declares an int argument
tls_val in the appropriate point in the argument list (based on the
various CLONE_BACKWARDS variants), but doesn't actually use or pass along
that argument.  Instead, sys_clone calls do_fork, which calls
copy_process, which calls the arch-specific copy_thread, and copy_thread
pulls the corresponding syscall argument out of the pt_regs captured at
kernel entry (knowing what argument of clone that architecture passes tls
in).

Apart from being awful and inscrutable, that also only works because only
one code path into copy_thread can pass the CLONE_SETTLS flag, and that
code path comes from sys_clone with its architecture-specific
argument-passing order.  This prevents introducing a new version of the
clone system call without propagating the same architecture-specific
position of the tls argument.

However, there's no reason to pull the argument out of pt_regs when
sys_clone could just pass it down via C function call arguments.

Introduce a new CONFIG_HAVE_COPY_THREAD_TLS for architectures to opt into,
and a new copy_thread_tls that accepts the tls parameter as an additional
unsigned long (syscall-argument-sized) argument.  Change sys_clone's tls
argument to an unsigned long (which does not change the ABI), and pass
that down to copy_thread_tls.

Architectures that don't opt into copy_thread_tls will continue to ignore
the C argument to sys_clone in favor of the pt_regs captured at kernel
entry, and thus will be unable to introduce new versions of the clone
syscall.

Patch co-authored by Josh Triplett and Thiago Macieira.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thiago Macieira <thiago.macieira@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Joe Perches
8c7fbe5795 stddef.h: move offsetofend inside #ifndef/#endif guard, neaten
Commit 3876488444 ("include/stddef.h: Move offsetofend() from vfio.h
to a generic kernel header") added offsetofend outside the normal
include #ifndef/#endif guard.  Move it inside.

Miscellanea:

o remove unnecessary blank line
o standardize offsetof macros whitespace style

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Daniel Borkmann
b86a50c3b5 compiler-intel: fix wrong compiler barrier() macro
Cleanup commit 73679e5082 ("compiler-intel.h: Remove duplicate
definition") removed the double definition of __memory_barrier()
intrinsics.

However, in doing so, it also removed the preceding #undef barrier by
accident, meaning, the actual barrier() macro from compiler-gcc.h with
inline asm is still in place as __GNUC__ is provided.

Subsequently, barrier() can never be defined as __memory_barrier() from
compiler.h since it already has a definition in place and if we trust
the comment in compiler-intel.h, ecc doesn't support gcc specific asm
statements.

I don't have an ecc at hand (unsure if that's still used in the field?)
and only found this by accident during code review, a revert of that
cleanup would be simplest option.

Fixes: 73679e5082 ("compiler-intel.h: Remove duplicate definition")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Cc: Pranith Kumar <bobby.prani@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: mancha security <mancha1@zoho.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Joe Perches
cb984d101b compiler-gcc: integrate the various compiler-gcc[345].h files
As gcc major version numbers are going to advance rather rapidly in the
future, there's no real value in separate files for each compiler
version.

Deduplicate some of the macros #defined in each file too.

Neaten comments using normal kernel commenting style.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Alan Modra <amodra@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:38 -07:00
Joe Perches
f6d133f877 compiler-gcc.h: neatening
- Move the inline and noinline blocks together

 - Comment neatening

 - Alignment of __attribute__ uses

 - Consistent naming of __must_be_array macro argument

 - Multiline macro neatening

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Alan Modra <amodra@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:37 -07:00
Dan Streetman
479305fd71 zpool: remove zpool_evict()
Remove zpool_evict() helper function.  As zbud is currently the only
zpool implementation that supports eviction, add zpool and zpool_ops
references to struct zbud_pool and directly call zpool_ops->evict(zpool,
handle) on eviction.

Currently zpool provides the zpool_evict helper which locks the zpool
list lock and searches through all pools to find the specific one
matching the caller, and call the corresponding zpool_ops->evict
function.  However, this is unnecessary, as the zbud pool can simply
keep a reference to the zpool that created it, as well as the zpool_ops,
and directly call the zpool_ops->evict function, when it needs to evict
a page.  This avoids a spinlock and list search in zpool for each
eviction.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25 17:00:37 -07:00
Linus Torvalds
64e22b8685 Merge branch 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata updates from Tejun Heo:

 - a number of libata core changes to better support NCQ TRIM.

 - ahci now supports MSI-X in single IRQ mode to support a new
   controller which doesn't implement MSI or INTX.

 - ahci now supports edge-triggered IRQ mode to support a new controller
   which for some odd reason did edge-triggered IRQ.

 - the usual controller support additions and changes.

* 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (27 commits)
  libata: Do not blacklist Micron M500DC
  ata: ahci_mvebu: add suspend/resume support
  ahci, msix: Fix build error for !PCI_MSI
  ahci: Add support for Cavium's ThunderX host controller
  ahci: Add generic MSI-X support for single interrupts to SATA PCI driver
  libata: finally use __initconst in ata_parse_force_one()
  drivers: ata: add support for Ceva sata host controller
  devicetree:bindings: add devicetree bindings for ceva ahci
  ahci: added support for Freescale AHCI sata
  ahci: Store irq number in struct ahci_host_priv
  ahci: Move interrupt enablement code to a separate function
  Doc: libata: Fix spelling typo found in libata.xml
  ata:sata_nv - Change 1 to true for bool type variable.
  ata: add Broadcom AHCI SATA3 driver for STB chips
  Documentation: devicetree: add Broadcom SATA binding
  libata: Fix regression when the NCQ Send and Receive log page is absent
  ata: hpt366: fix constant cast warning
  ata: ahci_xgene: potential NULL dereference in probe
  ata: ahci_xgene: Add AHCI Support for 2nd HW version of APM X-Gene SoC AHCI SATA Host controller.
  libahci: Add support to handle HOST_IRQ_STAT as edge trigger latch.
  ...
2015-06-25 16:49:21 -07:00
Linus Torvalds
e4bc13adfd Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block
Pull cgroup writeback support from Jens Axboe:
 "This is the big pull request for adding cgroup writeback support.

  This code has been in development for a long time, and it has been
  simmering in for-next for a good chunk of this cycle too.  This is one
  of those problems that has been talked about for at least half a
  decade, finally there's a solution and code to go with it.

  Also see last weeks writeup on LWN:

        http://lwn.net/Articles/648292/"

* 'for-4.2/writeback' of git://git.kernel.dk/linux-block: (85 commits)
  writeback, blkio: add documentation for cgroup writeback support
  vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB
  writeback: do foreign inode detection iff cgroup writeback is enabled
  v9fs: fix error handling in v9fs_session_init()
  bdi: fix wrong error return value in cgwb_create()
  buffer: remove unusued 'ret' variable
  writeback: disassociate inodes from dying bdi_writebacks
  writeback: implement foreign cgroup inode bdi_writeback switching
  writeback: add lockdep annotation to inode_to_wb()
  writeback: use unlocked_inode_to_wb transaction in inode_congested()
  writeback: implement unlocked_inode_to_wb transaction and use it for stat updates
  writeback: implement [locked_]inode_to_wb_and_lock_list()
  writeback: implement foreign cgroup inode detection
  writeback: make writeback_control track the inode being written back
  writeback: relocate wb[_try]_get(), wb_put(), inode_{attach|detach}_wb()
  mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use
  writeback: implement memcg writeback domain based throttling
  writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes
  writeback: implement memcg wb_domain
  writeback: update wb_over_bg_thresh() to use wb_domain aware operations
  ...
2015-06-25 16:00:17 -07:00
Linus Torvalds
ad90fb9751 Merge branch 'for-4.2/sg' of git://git.kernel.dk/linux-block
Pull asm/scatterlist.h removal from Jens Axboe:
 "We don't have any specific arch scatterlist anymore, since parisc
  finally switched over.  Kill the include"

* 'for-4.2/sg' of git://git.kernel.dk/linux-block:
  remove scatterlist.h generation from arch Kbuild files
  remove <asm/scatterlist.h>
2015-06-25 15:22:36 -07:00
Linus Torvalds
6a398a3ef4 Merge branch 'for-4.2/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "This contains:

   - a few race fixes for null_blk, from Akinobu Mita.

   - a series of fixes for mtip32xx, from Asai Thambi and Selvan Mani at
     Micron.

   - NVMe:
        * Fix for missing error return on allocation failure, from Axel
          Lin.

        * Code consolidation and cleanups from Christoph.

        * Memory barrier addition, syncing queue count and queue
          pointers. From Jon Derrick.

        * Various fixes from Keith, an addition to support user
          issue reset from sysfs or ioctl, and automatic namespace
          rescan.

        * Fix from Matias, avoiding losing some request flags when
          marking the request failfast.

   - small cleanups and sparse fixups for ps3vram.  From Geert
     Uytterhoeven and Geoff Lavand.

   - s390/dasd dead code removal, from Jarod Wilson.

   - a set of fixes and optimizations for loop, from Ming Lei.

   - conversion to blkdev_reread_part() of loop, dasd, ndb.  From Ming
     Lei.

   - updates to cciss.  From Tomas Henzl"

* 'for-4.2/drivers' of git://git.kernel.dk/linux-block: (44 commits)
  mtip32xx: Fix accessing freed memory
  block: nvme-scsi: Catch kcalloc failure
  NVMe: Fix IO for extended metadata formats
  nvme: don't overwrite req->cmd_flags on sync cmd
  mtip32xx: increase wait time for hba reset
  mtip32xx: fix minor number
  mtip32xx: remove unnecessary sleep in mtip_ftl_rebuild_poll()
  mtip32xx: fix crash on surprise removal of the drive
  mtip32xx: Abort I/O during secure erase operation
  mtip32xx: fix incorrectly setting MTIP_DDF_SEC_LOCK_BIT
  mtip32xx: remove unused variable 'port->allocated'
  mtip32xx: fix rmmod issue
  MAINTAINERS: Update ps3vram block driver
  block/ps3vram: Remove obsolete reference to MTD
  block/ps3vram: Fix sparse warnings
  NVMe: Automatic namespace rescan
  NVMe: Memory barrier before queue_count is incremented
  NVMe: add sysfs and ioctl controller reset
  null_blk: restart request processing on completion handler
  null_blk: prevent timer handler running on a different CPU where started
  ...
2015-06-25 15:12:50 -07:00
Linus Torvalds
bfffa1cc9d Merge branch 'for-4.2/core' of git://git.kernel.dk/linux-block
Pull core block IO update from Jens Axboe:
 "Nothing really major in here, mostly a collection of smaller
  optimizations and cleanups, mixed with various fixes.  In more detail,
  this contains:

   - Addition of policy specific data to blkcg for block cgroups.  From
     Arianna Avanzini.

   - Various cleanups around command types from Christoph.

   - Cleanup of the suspend block I/O path from Christoph.

   - Plugging updates from Shaohua and Jeff Moyer, for blk-mq.

   - Eliminating atomic inc/dec of both remaining IO count and reference
     count in a bio.  From me.

   - Fixes for SG gap and chunk size support for data-less (discards)
     IO, so we can merge these better.  From me.

   - Small restructuring of blk-mq shared tag support, freeing drivers
     from iterating hardware queues.  From Keith Busch.

   - A few cfq-iosched tweaks, from Tahsin Erdogan and me.  Makes the
     IOPS mode the default for non-rotational storage"

* 'for-4.2/core' of git://git.kernel.dk/linux-block: (35 commits)
  cfq-iosched: fix other locations where blkcg_to_cfqgd() can return NULL
  cfq-iosched: fix sysfs oops when attempting to read unconfigured weights
  cfq-iosched: move group scheduling functions under ifdef
  cfq-iosched: fix the setting of IOPS mode on SSDs
  blktrace: Add blktrace.c to BLOCK LAYER in MAINTAINERS file
  block, cgroup: implement policy-specific per-blkcg data
  block: Make CFQ default to IOPS mode on SSDs
  block: add blk_set_queue_dying() to blkdev.h
  blk-mq: Shared tag enhancements
  block: don't honor chunk sizes for data-less IO
  block: only honor SG gap prevention for merges that contain data
  block: fix returnvar.cocci warnings
  block, dm: don't copy bios for request clones
  block: remove management of bi_remaining when restoring original bi_end_io
  block: replace trylock with mutex_lock in blkdev_reread_part()
  block: export blkdev_reread_part() and __blkdev_reread_part()
  suspend: simplify block I/O handling
  block: collapse bio bit space
  block: remove unused BIO_RW_BLOCK and BIO_EOF flags
  block: remove BIO_EOPNOTSUPP
  ...
2015-06-25 14:29:53 -07:00
Linus Torvalds
d857da7b70 A very large number of cleanups and bug fixes --- in particular for
the ext4 encryption patches, which is a new feature added in the last
 merge window.  Also fix a number of long-standing xfstest failures.
 (Quota writes failing due to ENOSPC, a race between truncate and
 writepage in data=journalled mode that was causing generic/068 to
 fail, and other corner cases.)
 
 Also add support for FALLOC_FL_INSERT_RANGE, and improve jbd2
 performance eliminating locking when a buffer is modified more than
 once during a transaction (which is very common for allocation
 bitmaps, for example), in which case the state of the journalled
 buffer head doesn't need to change.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVi3PeAAoJEPL5WVaVDYGj+I0H/jRPexvyvnGfxiqs1sxIlbSk
 cwewFJSsuKsy/pGYdmHvozWZyWGGORc89NrxoNwdbG+axvHbgUWt/3+vF+rzmaek
 vX4v9QvCEo4PfpRgzbnYJFhbxGMJtwci887sq1o/UoNXikFYT2kz8rpdf0++eO5W
 /GJNRA5ZUY0L0eeloUILAMrBr7KjtkI2oXwOZt5q68jh7B3n3XdNQXyEiQS/28aK
 QYcFrqA/e2Fiuk6l5OSGBCP38mySu+x0nBTLT5LFwwrUBnoZvGtdjM6Sj/yADDDn
 uP/Zpq56aLzkFRwwItrDaF26BIf2MhIH/WUYs65CraEGxjMaiPuzAudGA/iUVL8=
 =1BdR
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "A very large number of cleanups and bug fixes --- in particular for
  the ext4 encryption patches, which is a new feature added in the last
  merge window.  Also fix a number of long-standing xfstest failures.
  (Quota writes failing due to ENOSPC, a race between truncate and
  writepage in data=journalled mode that was causing generic/068 to
  fail, and other corner cases.)

  Also add support for FALLOC_FL_INSERT_RANGE, and improve jbd2
  performance eliminating locking when a buffer is modified more than
  once during a transaction (which is very common for allocation
  bitmaps, for example), in which case the state of the journalled
  buffer head doesn't need to change"

[ I renamed "ext4_follow_link()" to "ext4_encrypted_follow_link()" in
  the merge resolution, to make it clear that that function is _only_
  used for encrypted symlinks.  The function doesn't actually work for
  non-encrypted symlinks at all, and they use the generic helpers
                                         - Linus ]

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (52 commits)
  ext4: set lazytime on remount if MS_LAZYTIME is set by mount
  ext4: only call ext4_truncate when size <= isize
  ext4: make online defrag error reporting consistent
  ext4: minor cleanup of ext4_da_reserve_space()
  ext4: don't retry file block mapping on bigalloc fs with non-extent file
  ext4: prevent ext4_quota_write() from failing due to ENOSPC
  ext4: call sync_blockdev() before invalidate_bdev() in put_super()
  jbd2: speedup jbd2_journal_dirty_metadata()
  jbd2: get rid of open coded allocation retry loop
  ext4: improve warning directory handling messages
  jbd2: fix ocfs2 corrupt when updating journal superblock fails
  ext4: mballoc: avoid 20-argument function call
  ext4: wait for existing dio workers in ext4_alloc_file_blocks()
  ext4: recalculate journal credits as inode depth changes
  jbd2: use GFP_NOFS in jbd2_cleanup_journal_tail()
  ext4: use swap() in mext_page_double_lock()
  ext4: use swap() in memswap()
  ext4: fix race between truncate and __ext4_journalled_writepage()
  ext4 crypto: fail the mount if blocksize != pagesize
  ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate
  ...
2015-06-25 14:06:55 -07:00
Jean Delvare
9ea650c804 firmware: dmi: struct dmi_header should be packed
Apparently the compiler does fine without it, but it feels safer and
clearer to add the missing attribute.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
2015-06-25 09:06:57 +02:00
Ivan Khoronzhuk
d7f96f97c4 firmware: dmi_scan: add SBMIOS entry and DMI tables
Some utils, like dmidecode and smbios, need to access SMBIOS entry
table area in order to get information like SMBIOS version, size, etc.
Currently it's done via /dev/mem. But for situation when /dev/mem
usage is disabled, the utils have to use dmi sysfs instead, which
doesn't represent SMBIOS entry and adds code/delay redundancy when direct
access for table is needed.

So this patch creates dmi/tables and adds SMBIOS entry point to allow
utils in question to work correctly without /dev/mem. Also patch adds
raw dmi table to simplify dmi table processing in user space, as
proposed by Jean Delvare.

Tested-by: Roy Franz <roy.franz@linaro.org>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@globallogic.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
2015-06-25 09:06:56 +02:00
Linus Torvalds
aefbef10e3 Merge branch 'akpm' (patches from Andrew)
Merge first patchbomb from Andrew Morton:

 - a few misc things

 - ocfs2 udpates

 - kernel/watchdog.c feature work (took ages to get right)

 - most of MM.  A few tricky bits are held up and probably won't make 4.2.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (91 commits)
  mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()
  mm, thp: respect MPOL_PREFERRED policy with non-local node
  tmpfs: truncate prealloc blocks past i_size
  mm/memory hotplug: print the last vmemmap region at the end of hot add memory
  mm/mmap.c: optimization of do_mmap_pgoff function
  mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan
  mm: kmemleak: avoid deadlock on the kmemleak object insertion error path
  mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup()
  mm: kmemleak: fix delete_object_*() race when called on the same memory block
  mm: kmemleak: allow safe memory scanning during kmemleak disabling
  memcg: convert mem_cgroup->under_oom from atomic_t to int
  memcg: remove unused mem_cgroup->oom_wakeups
  frontswap: allow multiple backends
  x86, mirror: x86 enabling - find mirrored memory ranges
  mm/memblock: allocate boot time data structures from mirrored memory
  mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute
  mm: do not ignore mapping_gfp_mask in page cache allocation paths
  mm/cma.c: fix typos in comments
  mm/oom_kill.c: print points as unsigned int
  mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages
  ...
2015-06-24 20:47:21 -07:00
Linus Torvalds
cfcc0ad47f Merge tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
 "New features:
   - per-file encryption (e.g., ext4)
   - FALLOC_FL_ZERO_RANGE
   - FALLOC_FL_COLLAPSE_RANGE
   - RENAME_WHITEOUT

  Major enhancement/fixes:
   - recovery broken superblocks
   - enhance f2fs_trim_fs with a discard_map
   - fix a race condition on dentry block allocation
   - fix a deadlock during summary operation
   - fix a missing fiemap result

  .. and many minor bug fixes and clean-ups were done"

* tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (83 commits)
  f2fs: do not trim preallocated blocks when truncating after i_size
  f2fs crypto: add alloc_bounce_page
  f2fs crypto: fix to handle errors likewise ext4
  f2fs: drop the volatile_write flag only
  f2fs: skip committing valid superblock
  f2fs: setting discard option in parse_options()
  f2fs: fix to return exact trimmed size
  f2fs: support FALLOC_FL_INSERT_RANGE
  f2fs: hide common code in f2fs_replace_block
  f2fs: disable the discard option when device doesn't support
  f2fs crypto: remove alloc_page for bounce_page
  f2fs: fix a deadlock for summary page lock vs. sentry_lock
  f2fs crypto: clean up error handling in f2fs_fname_setup_filename
  f2fs crypto: avoid f2fs_inherit_context for symlink
  f2fs crypto: do not set encryption policy for non-directory by ioctl
  f2fs crypto: allow setting encryption policy once
  f2fs crypto: check context consistent for rename2
  f2fs: avoid duplicated code by reusing f2fs_read_end_io
  f2fs crypto: use per-inode tfm structure
  f2fs: recovering broken superblock during mount
  ...
2015-06-24 20:38:29 -07:00
Linus Torvalds
14738e0331 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov:
 "Thanks to Samuel Thibault input device (keyboard) LEDs are no longer
  hardwired within the input core but use LED subsystem and so allow use
  of different triggers; Hans de Goede did a large update for the ALPS
  touchpad driver; we have new TI drv2665 haptics driver and DA9063
  OnKey driver, and host of other drivers got various fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (55 commits)
  Input: pixcir_i2c_ts - fix receive error
  MAINTAINERS: remove non existent input mt git tree
  Input: improve usage of gpiod API
  tty/vt/keyboard: define LED triggers for VT keyboard lock states
  tty/vt/keyboard: define LED triggers for VT LED states
  Input: export LEDs as class devices in sysfs
  Input: cyttsp4 - use swap() in cyttsp4_get_touch()
  Input: goodix - do not explicitly set evbits in input device
  Input: goodix - export id and version read from device
  Input: goodix - fix variable length array warning
  Input: goodix - fix alignment issues
  Input: add OnKey driver for DA9063 MFD part
  Input: elan_i2c - add product IDs FW names
  Input: elan_i2c - add support for multi IC type and iap format
  Input: focaltech - report finger width to userspace
  tty: remove platform_sysrq_reset_seq
  Input: synaptics_i2c - use proper boolean values
  Input: psmouse - use true instead of 1 for boolean values
  Input: cyapa - fix a few typos in comments
  Input: stmpe-ts - enforce device tree only mode
  ...
2015-06-24 19:56:58 -07:00
Linus Torvalds
93a4b1b946 Here is the bulk of pin control changes for the v4.2 series:
- Core functionality:
   - Enable exclusive pin ownership: it is possible to flag a pin
     controller so that GPIO and other functions cannot use a single
     pin simultaneously.
 
 - New drivers:
   - NXP LPC18xx System Control Unit pin controller
   - Imagination Pistachio SoC pin controller
 
 - New subdrivers:
   - Freescale i.MX7d SoC
   - Intel Sunrisepoint-H PCH
   - Renesas PFC R8A7793
   - Renesas PFC R8A7794
   - Mediatek MT6397, MT8127
   - SiRF Atlas 7
   - Allwinner A33
   - Qualcomm MSM8660
   - Marvell Armada 395
   - Rockchip RK3368
 
 - Cleanups:
   - A big cleanup of the Marvell MVEBU driver rectifying it to
     correspond to reality
   - Drop platform device probing from the SH PFC driver, we are now a
     DT only shop for SuperH
   - Drop obsolte multi-platform check for SH PFC
   - Various janitorial: constification, grammar etc
 
 - Improvements:
   - The AT91 GPIO portions now supports the set_multiple() feature
   - Split out SPI pins on the Xilinx Zynq
   - Support DTs without specific function nodes in the i.MX driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVin37AAoJEEEQszewGV1zIlQP/i6+C47z3OV67hYAOmlGoynl
 wsdTFbyp+GIPl3N1r0lRzxOfQsuc9t93iDMrC5ssN9VFaj8MgH/j3XKWf5A55iVn
 u7nNQzIFjzTwl58/Pu4oM+d9l5i26o44teFKh3xI4aup4AFed3+lDkQtRipgo29c
 V4y+6SaQxQ46e2qaOAM20gEagm2a8EvChn1Zo/HLQnnmZcKBxgObJna7iTZWm+fN
 LzyBWtczFYPxfQ9IqYzklyeou4ohfrcHzqN71IEtmGMXxob+i04QS9FQXaPitgBG
 UORjwFVh8690n3ETQobjLrylOF5F/3+RdCGqanYOLgaJ0aix4+EByLz9FbxLPnJk
 4Utijk2SKxLUb3dXZIfpwKtmPmvLJkFqwSazN5WDIg9Rjqz/H1p9UTWP0cfPRwJa
 9INDZeK833kjYdtK6UMBpuNFkgGtpKTlhMX/cI78KYsEwVgK8r69b7uNr+2OUMgh
 4i7dbHgb5/NpHlUlacVPTBvXf7C1iQ//vqh0Oc20lp/mAY1tVGuYRHno6QVyRtfS
 DmCNPtbAgCa9FmP/t5NA8a3wana2ObTT2NCNMGEue7tJxVX4YaLpwIAEnUSHSJOQ
 seI8HT2M1yEiSes9V+OuigHt3pKk68fMe0ZqDkovcd4QBlub6WTAPXWrXpbHtBCo
 k+hT8TlDYaDbQkNDzXtg
 =UyKm
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "Here is the bulk of pin control changes for the v4.2 series: Quite a
  lot of new SoC subdrivers and two new main drivers this time, apart
  from that business as usual.

  Details:

  Core functionality:
   - Enable exclusive pin ownership: it is possible to flag a pin
     controller so that GPIO and other functions cannot use a single pin
     simultaneously.

  New drivers:
   - NXP LPC18xx System Control Unit pin controller
   - Imagination Pistachio SoC pin controller

  New subdrivers:
   - Freescale i.MX7d SoC
   - Intel Sunrisepoint-H PCH
   - Renesas PFC R8A7793
   - Renesas PFC R8A7794
   - Mediatek MT6397, MT8127
   - SiRF Atlas 7
   - Allwinner A33
   - Qualcomm MSM8660
   - Marvell Armada 395
   - Rockchip RK3368

  Cleanups:
   - A big cleanup of the Marvell MVEBU driver rectifying it to
     correspond to reality
   - Drop platform device probing from the SH PFC driver, we are now a
     DT only shop for SuperH
   - Drop obsolte multi-platform check for SH PFC
   - Various janitorial: constification, grammar etc

  Improvements:
   - The AT91 GPIO portions now supports the set_multiple() feature
   - Split out SPI pins on the Xilinx Zynq
   - Support DTs without specific function nodes in the i.MX driver"

* tag 'pinctrl-v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (99 commits)
  pinctrl: rockchip: add support for the rk3368
  pinctrl: rockchip: generalize perpin driver-strength setting
  pinctrl: sh-pfc: r8a7794: add SDHI pin groups
  pinctrl: sh-pfc: r8a7794: add MMCIF pin groups
  pinctrl: sh-pfc: add R8A7794 PFC support
  pinctrl: make pinctrl_register() return proper error code
  pinctrl: mvebu: armada-39x: add support for Armada 395 variant
  pinctrl: mvebu: armada-39x: add missing SATA functions
  pinctrl: mvebu: armada-39x: add missing PCIe functions
  pinctrl: mvebu: armada-38x: add ptp functions
  pinctrl: mvebu: armada-38x: add ua1 functions
  pinctrl: mvebu: armada-38x: add nand functions
  pinctrl: mvebu: armada-38x: add sata functions
  pinctrl: mvebu: armada-xp: add dram functions
  pinctrl: mvebu: armada-xp: add nand rb function
  pinctrl: mvebu: armada-xp: add spi1 function
  pinctrl: mvebu: armada-39x: normalize ref clock naming
  pinctrl: mvebu: armada-xp: rename spi to spi0
  pinctrl: mvebu: armada-370: align spi1 clock pin naming
  pinctrl: mvebu: armada-370: align VDD cpu-pd pin naming with datasheet
  ...
2015-06-24 19:21:02 -07:00
Linus Torvalds
d59b92f93d == Changes to existing drivers ==
- Supply MODULE_DEVICE_TABLE() to ensure probing
  - Constify struct; da9052_bl
  - Enable compile test; lcd_l4f00242t03, lcd_lms283fg05, backlight_gpio
  - Suspend/resume bugfix; lp855x_bl
  - devm_gpiod_get_optional() API fixup; pwm_bl
  - Error handling fixup; backlight
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJViXR7AAoJEFGvii+H/HdhZPsP/3kxXQNCt2voEj96MFLXWLwq
 1cbWbPboslNkdYPY9ygLTIqop/NOoEwiZMLE2Ea3uBDFewLU27vEX4waG+ILITjG
 /KxBAUXnNvFTrv9X+5hiOm6D+6kLk2M2eEygSC6f9QgBXP4VjApkVvpehdDKWT5x
 X11wlfx/TYhLcj5iggyW39fACp8Aig7LvOigS7fjfhwn1PAjWVw6NLrxmIlysWbH
 8qMfL0u8Ks5BYSh4xr5ATrB6OLx5Hu3mv9d8AK8o4XsRXOrFtR/dMThOLcJq/bFi
 4Vp4roi/30RSpow1yKPS3+TBRFbn2PG+6G6GVbWCO/uVkQiaxteyM79P0gEy5Y2a
 8WvV3vOMYY1/FszCOIfrJbj4No5/Bc2fObLXYDursLYMOPhUNrWeyRxbLfHTpKR3
 kim8XFGzLE5qFLqQWheqkHDq24y1iz6fl4YEZ8avf1rnDfzNJx8fnHk2uXZrW6ru
 HdjXbGC4pht1j6uM+DDROZ3iM5+2AMb/ASPLSCqslXXG82BCva3wasrMd3RttEQN
 bUltoWgWAMonIYoNx3CYwOGS9sWFllq1b0dTl1qDQPRT55sR4zcMZbMp16A0whJ7
 bJQMflbkgWDCeiMg5vXrImxmBBwfAe6IQ3yfEMUNNf+CzJxVnGjvpWvDWHo5iqhd
 Ul8lq/XdnXvpSSUugWhM
 =8cYG
 -----END PGP SIGNATURE-----

Merge tag 'backlight-for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 "Changes to existing drivers:

   - supply MODULE_DEVICE_TABLE() to ensure probing
   - constify struct; da9052_bl
   - enable compile test; lcd_l4f00242t03, lcd_lms283fg05, backlight_gpio
   - suspend/resume bugfix; lp855x_bl
   - devm_gpiod_get_optional() API fixup; pwm_bl
   - error handling fixup; backlight"

* tag 'backlight-for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: Change the return type of backlight_update_status() to int
  backlight: pwm_bl: Simplify usage of devm_gpiod_get_optional
  backlight: lp855x: Don't clear level on suspend/blank
  backlight: Allow compile test of GPIO consumers if !GPIOLIB
  video: backlight: da9052: Constify platform_device_id
  gpio-backlight: Discover driver during boot time
2015-06-24 18:57:00 -07:00
Larry Finger
8a8c35fadf mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()
Beginning at commit d52d3997f8 ("ipv6: Create percpu rt6_info"), the
following INFO splat is logged:

  ===============================
  [ INFO: suspicious RCU usage. ]
  4.1.0-rc7-next-20150612 #1 Not tainted
  -------------------------------
  kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
  other info that might help us debug this:
  rcu_scheduler_active = 1, debug_locks = 0
   3 locks held by systemd/1:
   #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40
   #1:  (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540
   #2:  (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540
  stack backtrace:
  CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
  Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20   04/17/2014
  Call Trace:
    dump_stack+0x4c/0x6e
    lockdep_rcu_suspicious+0xe7/0x120
    ___might_sleep+0x1d5/0x1f0
    __might_sleep+0x4d/0x90
    kmem_cache_alloc+0x47/0x250
    create_object+0x39/0x2e0
    kmemleak_alloc_percpu+0x61/0xe0
    pcpu_alloc+0x370/0x630

Additional backtrace lines are truncated.  In addition, the above splat
is followed by several "BUG: sleeping function called from invalid
context at mm/slub.c:1268" outputs.  As suggested by Martin KaFai Lau,
these are the clue to the fix.  Routine kmemleak_alloc_percpu() always
uses GFP_KERNEL for its allocations, whereas it should follow the gfp
from its callers.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: <stable@vger.kernel.org>	[3.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:46 -07:00
Dan Streetman
d1dc6f1bcf frontswap: allow multiple backends
Change frontswap single pointer to a singly linked list of frontswap
implementations.  Update Xen tmem implementation as register no longer
returns anything.

Frontswap only keeps track of a single implementation; any
implementation that registers second (or later) will replace the
previously registered implementation, and gets a pointer to the previous
implementation that the new implementation is expected to pass all
frontswap functions to if it can't handle the function itself.  However
that method doesn't really make much sense, as passing that work on to
every implementation adds unnecessary work to implementations; instead,
frontswap should simply keep a list of all registered implementations
and try each implementation for any function.  Most importantly, neither
of the two currently existing frontswap implementations in the kernel
actually do anything with any previous frontswap implementation that
they replace when registering.

This allows frontswap to successfully manage multiple implementations by
keeping a list of them all.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:45 -07:00
Tony Luck
b05b9f5f9d x86, mirror: x86 enabling - find mirrored memory ranges
UEFI GetMemoryMap() uses a new attribute bit to mark mirrored memory
address ranges.  See UEFI 2.5 spec pages 157-158:

  http://www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf

On EFI enabled systems scan the memory map and tell memblock about any
mirrored ranges.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xiexiuqi <xiexiuqi@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:45 -07:00
Tony Luck
a3f5bafcc0 mm/memblock: allocate boot time data structures from mirrored memory
Try to allocate all boot time kernel data structures from mirrored
memory.

If we run out of mirrored memory print warnings, but fall back to using
non-mirrored memory to make sure that we still boot.

By number of bytes, most of what we allocate at boot time is the page
structures.  64 bytes per 4K page on x86_64 ...  or about 1.5% of total
system memory.  For workloads where the bulk of memory is allocated to
applications this may represent a useful improvement to system
availability since 1.5% of total memory might be a third of the memory
allocated to the kernel.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xiexiuqi <xiexiuqi@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:45 -07:00
Tony Luck
fc6daaf931 mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute
Some high end Intel Xeon systems report uncorrectable memory errors as a
recoverable machine check.  Linux has included code for some time to
process these and just signal the affected processes (or even recover
completely if the error was in a read only page that can be replaced by
reading from disk).

But we have no recovery path for errors encountered during kernel code
execution.  Except for some very specific cases were are unlikely to ever
be able to recover.

Enter memory mirroring. Actually 3rd generation of memory mirroing.

Gen1: All memory is mirrored
	Pro: No s/w enabling - h/w just gets good data from other side of the
	     mirror
	Con: Halves effective memory capacity available to OS/applications

Gen2: Partial memory mirror - just mirror memory begind some memory controllers
	Pro: Keep more of the capacity
	Con: Nightmare to enable. Have to choose between allocating from
	     mirrored memory for safety vs. NUMA local memory for performance

Gen3: Address range partial memory mirror - some mirror on each memory
      controller
	Pro: Can tune the amount of mirror and keep NUMA performance
	Con: I have to write memory management code to implement

The current plan is just to use mirrored memory for kernel allocations.
This has been broken into two phases:

1) This patch series - find the mirrored memory, use it for boot time
   allocations

2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
   unused mirrored memory from mm/memblock.c and only give it out to
   select kernel allocations (this is still being scoped because
   page_alloc.c is scary).

This patch (of 3):

Add extra "flags" to memblock to allow selection of memory based on
attribute.  No functional changes

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Xiexiuqi <xiexiuqi@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:44 -07:00
Aneesh Kumar K.V
8809aa2d28 mm: clarify that the function operates on hugepage pte
We have confusing functions to clear pmd, pmd_clear_* and pmd_clear.  Add
_huge_ to pmdp_clear functions so that we are clear that they operate on
hugepage pte.

We don't bother about other functions like pmdp_set_wrprotect,
pmdp_clear_flush_young, because they operate on PTE bits and hence
indicate they are operating on hugepage ptes

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:44 -07:00
Xie XiuQi
cc3e2af42e memory-failure: change type of action_result's param 3 to enum
Change type of action_result's param 3 to enum for type consistency,
and rename mf_outcome to mf_result for clearly.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Jim Davis <jim.epost@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:43 -07:00
Xie XiuQi
cc637b1704 memory-failure: export page_type and action result
Export 'outcome' and 'action_page_type' to mm.h, so we could use
this emnus outside.

This patch is preparation for adding trace events for memory-failure
recovery action.

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Jim Davis <jim.epost@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:43 -07:00
Johannes Weiner
dc56401fc9 mm: oom_kill: simplify OOM killer locking
The zonelist locking and the oom_sem are two overlapping locks that are
used to serialize global OOM killing against different things.

The historical zonelist locking serializes OOM kills from allocations with
overlapping zonelists against each other to prevent killing more tasks
than necessary in the same memory domain.  Only when neither tasklists nor
zonelists from two concurrent OOM kills overlap (tasks in separate memcgs
bound to separate nodes) are OOM kills allowed to execute in parallel.

The younger oom_sem is a read-write lock to serialize OOM killing against
the PM code trying to disable the OOM killer altogether.

However, the OOM killer is a fairly cold error path, there is really no
reason to optimize for highly performant and concurrent OOM kills.  And
the oom_sem is just flat-out redundant.

Replace both locking schemes with a single global mutex serializing OOM
kills regardless of context.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:43 -07:00
Johannes Weiner
16e951966f mm: oom_kill: clean up victim marking and exiting interfaces
Rename unmark_oom_victim() to exit_oom_victim().  Marking and unmarking
are related in functionality, but the interface is not symmetrical at
all: one is an internal OOM killer function used during the killing, the
other is for an OOM victim to signal its own death on exit later on.
This has locking implications, see follow-up changes.

While at it, rename mark_tsk_oom_victim() to mark_oom_victim(), which
is easier on the eye.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:43 -07:00
Naoya Horiguchi
ead07f6a86 mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling
memory_failure() can run in 2 different mode (specified by
MF_COUNT_INCREASED) in page refcount perspective.  When
MF_COUNT_INCREASED is set, memory_failure() assumes that the caller
takes a refcount of the target page.  And if cleared, memory_failure()
takes it in it's own.

In current code, however, refcounting is done differently in each caller.
For example, madvise_hwpoison() uses get_user_pages_fast() and
hwpoison_inject() uses get_page_unless_zero().  So this inconsistent
refcounting causes refcount failure especially for thp tail pages.
Typical user visible effects are like memory leak or
VM_BUG_ON_PAGE(!page_count(page)) in isolate_lru_page().

To fix this refcounting issue, this patch introduces get_hwpoison_page()
to handle thp tail pages in the same manner for each caller of hwpoison
code.

memory_failure() might fail to split thp and in such case it returns
without completing page isolation.  This is not good because PageHWPoison
on the thp is still set and there's no easy way to unpoison such thps.  So
this patch try to roll back any action to the thp in "non anonymous thp"
case and "thp split failed" case, expecting an MCE(SRAR) generated by
later access afterward will properly free such thps.

[akpm@linux-foundation.org: fix CONFIG_HWPOISON_INJECT=m]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:42 -07:00
Kirill A. Shutemov
c761471b58 mm: avoid tail page refcounting on non-THP compound pages
Reintroduce 8d63d99a5d ("mm: avoid tail page refcounting on non-THP
compound pages") after removing bogus VM_BUG_ON_PAGE() in
put_unrefcounted_compound_page().

THP uses tail page refcounting to be able to split huge pages at any time.
 Tail page refcounting is not needed for other users of compound pages and
it's harmful because of overhead.

We try to exclude non-THP pages from tail page refcounting using
__compound_tail_refcounted() check.  It excludes most common non-THP
compound pages: SL*B and hugetlb, but it doesn't catch rest of __GFP_COMP
users -- drivers.

And it's not only about overhead.

Drivers might want to use compound pages to get refcounting semantics
suitable for mapping high-order pages to userspace.  But tail page
refcounting breaks it.

Tail page refcounting uses ->_mapcount in tail pages to store GUP pins on
them.  It means GUP pins would affect page_mapcount() for tail pages.
It's not a problem for THP, because it never maps tail pages.  But unlike
THP, drivers map parts of compound pages with PTEs and it makes
page_mapcount() be called for tail pages.

In particular, GUP pins would shift PSS up and affect /proc/kpagecount for
such pages.  But, I'm not aware about anything which can lead to crash or
other serious misbehaviour.

Since currently all THP pages are anonymous and all drivers pages are not,
we can fix the __compound_tail_refcounted() check by requiring PageAnon()
to enable tail page refcounting.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:42 -07:00
Rasmus Villemoes
a9919c7935 mm: only define hashdist variable when needed
For !CONFIG_NUMA, hashdist will always be 0, since it's setter is
otherwise compiled out.  So we can save 4 bytes of data and some .text
(although mostly in __init functions) by only defining it for
CONFIG_NUMA.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:41 -07:00
Laurent Dufour
4abad2ca4a mm: new arch_remap() hook
Some architectures would like to be triggered when a memory area is moved
through the mremap system call.

This patch introduces a new arch_remap() mm hook which is placed in the
path of mremap, and is called before the old area is unmapped (and the
arch_unmap() hook is called).

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:41 -07:00
Laurent Dufour
2ae416b142 mm: new mm hook framework
CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu).  This includes remapping
the vDSO to the place it has at checkpoint time.

However some architectures like powerpc are keeping a reference to the
vDSO base address to build the signal return stack frame by calling the
vDSO sigreturn service.  So once the vDSO has been moved, this reference
is no more valid and the signal frame built later are not usable.

This patch serie is introducing a new mm hook framework, and a new
arch_remap hook which is called when mremap is done and the mm lock still
hold.  The next patch is adding the vDSO remap and unmap tracking to the
powerpc architecture.

This patch (of 3):

This patch introduces a new set of header file to manage mm hooks:
- per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
- a generic header (include/linux/mm-arch-hooks.h)

The architecture which need to overwrite a hook as to redefine it in its
header file, while architecture which doesn't need have nothing to do.

The default hooks are defined in the generic header and are used in the
case the architecture is not defining it.

In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
be moved here.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:41 -07:00
Rasmus Villemoes
1ed58b6051 linux/slab.h: fix three off-by-one typos in comment
The first is a keyboard-off-by-one, the other two the ordinary mathy kind.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:41 -07:00
Gavin Guo
4066c33d03 mm/slab_common: support the slub_debug boot option on specific object size
The slub_debug=PU,kmalloc-xx cannot work because in the
create_kmalloc_caches() the s->name is created after the
create_kmalloc_cache() is called.  The name is NULL in the
create_kmalloc_cache() so the kmem_cache_flags() would not set the
slub_debug flags to the s->flags.  The fix here set up a kmalloc_names
string array for the initialization purpose and delete the dynamic name
creation of kmalloc_caches.

[akpm@linux-foundation.org: s/kmalloc_names/kmalloc_info/, tweak comment text]
Signed-off-by: Gavin Guo <gavin.guo@canonical.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:40 -07:00
Chris Metcalf
fe4ba3c343 watchdog: add watchdog_cpumask sysctl to assist nohz
Change the default behavior of watchdog so it only runs on the
housekeeping cores when nohz_full is enabled at build and boot time.
Allow modifying the set of cores the watchdog is currently running on
with a new kernel.watchdog_cpumask sysctl.

In the current system, the watchdog subsystem runs a periodic timer that
schedules the watchdog kthread to run.  However, nohz_full cores are
designed to allow userspace application code running on those cores to
have 100% access to the CPU.  So the watchdog system prevents the
nohz_full application code from being able to run the way it wants to,
thus the motivation to suppress the watchdog on nohz_full cores, which
this patchset provides by default.

However, if we disable the watchdog globally, then the housekeeping
cores can't benefit from the watchdog functionality.  So we allow
disabling it only on some cores.  See Documentation/lockup-watchdogs.txt
for more information.

[jhubbard@nvidia.com: fix a watchdog crash in some configurations]
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:40 -07:00
Chris Metcalf
b5242e98c1 smpboot: allow excluding cpus from the smpboot threads
This patch series allows the watchdog to run by default only on the
housekeeping cores when nohz_full is in effect; this seems to be a good
compromise short of turning it off completely (since the nohz_full cores
can't tolerate a watchdog).

To provide customizability, we add /proc/sys/kernel/watchdog_cpumask so
that the set of cores running the watchdog can be tuned to different
values after bootup.

To implement this customizability, we add a new
smpboot_update_cpumask_percpu_thread() API to the smpboot_thread
subsystem that lets us park or unpark "unwanted" threads.

And now that threads can be parked for long periods of time, we tweak the
/proc/<pid>/stat and /proc/<pid>/status code so parked threads aren't
reported as running, which is otherwise confusing.

This patch (of 3):

This change allows some cores to be excluded from running the
smp_hotplug_thread tasks.  The following commit to update
kernel/watchdog.c to use this functionality is the motivating example, and
more information on the motivation is provided there.

A new smp_hotplug_thread field is introduced, "cpumask", which is cpumask
field managed by the smpboot subsystem that indicates whether or not the
given smp_hotplug_thread should run on that core; the cpumask is checked
when deciding whether to unpark the thread.

To limit the cpumask to less than cpu_possible, you must call
smpboot_update_cpumask_percpu_thread() after registering.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:40 -07:00
Fabian Frederick
5286d20c4e configfs: unexport/make static config_item_init()
config_item_init() is only used in item.c

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:39 -07:00