Commit graph

42289 commits

Author SHA1 Message Date
Eric Dumazet
b193722731 net: reorganize sk_buff for faster __copy_skb_header()
With proliferation of bit fields in sk_buff, __copy_skb_header() became
quite expensive, showing as the most expensive function in a GSO
workload.

__copy_skb_header() performance is also critical for non GSO TCP
operations, as it is used from skb_clone()

This patch carefully moves all the fields that were not copied in a
separate zone : cloned, nohdr, fclone, peeked, head_frag, xmit_more

Then I moved all other fields and all other copied fields in a section
delimited by headers_start[0]/headers_end[0] section so that we
can use a single memcpy() call, inlined by compiler using long
word load/stores.

I also tried to make all copies in the natural orders of sk_buff,
to help hardware prefetching.

I made sure sk_buff size did not change.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-29 12:27:20 -04:00
Will Deacon
c02607aad2 iommu: introduce domain attribute for nesting IOMMUs
Some IOMMUs, such as the ARM SMMU, support two stages of translation.
The idea behind such a scheme is to allow a guest operating system to
use the IOMMU for DMA mappings in the first stage of translation, with
the hypervisor then installing mappings in the second stage to provide
isolation of the DMA to the physical range assigned to that virtual
machine.

In order to allow IOMMU domains to be used for second-stage translation,
this patch adds a new iommu_attr (IOMMU_ATTR_NESTING) for setting
second-stage domains prior to device attach. The attribute can also be
queried to see if a domain is actually making use of nesting.

Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2014-09-29 10:05:06 -06:00
Tomas Winkler
1f180359f4 mei: remove include to pci header from mei module files
Remove inclusion of linux/pci.h in mei layer
however we need to include the headers that before
got included implicitly from linux/pci.h.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-29 11:56:02 -04:00
Sergei Shtylyov
0043325495 usb: hcd: add generic PHY support
Add the generic PHY support, analogous to the USB PHY support. Intended it to be
used with the PCI EHCI/OHCI drivers and the xHCI platform driver.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-29 11:54:02 -04:00
Antoine Tenart
3d46e73dfd usb: rename phy to usb_phy in HCD
The USB PHY member of the HCD structure is renamed to 'usb_phy' and
modifications are done in all drivers accessing it.
This is in preparation to adding the generic PHY support.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
[Sergei: added missing 'drivers/usb/misc/lvstest.c' file, resolved rejects,
updated changelog.]
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-29 11:52:59 -04:00
Tero Kristo
c08ee14cc6 clk: ti: change clock init to use generic of_clk_init
Previously, the TI clock driver initialized all the clocks hierarchically
under each separate clock provider node. Now, each clock that requires
IO access will instead check their parent node to find out which IO range
to use.

This patch allows the TI clock driver to use a few new features provided
by the generic of_clk_init, and also allows registration of clock nodes
outside the clock hierarchy (for example, any external clocks.)

Signed-off-by: Tero Kristo <t-kristo@ti.com>
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Stefan Assmann <sassmann@kpanic.de>
Acked-by: Tony Lindgren <tony@atomide.com>
2014-09-29 11:51:13 +03:00
Eric Dumazet
3d9a0d2f82 dql: dql_queued() should write first to reduce bus transactions
While doing high throughput test on a BQL enabled NIC,
I found a very high cost in ndo_start_xmit() when accessing BQL data.

It turned out the problem was caused by compiler trying to be
smart, but involving a bad MESI transaction :

  0.05 │  mov    0xc0(%rax),%edi    // LOAD dql->num_queued
  0.48 │  mov    %edx,0xc8(%rax)    // STORE dql->last_obj_cnt = count
 58.23 │  add    %edx,%edi
  0.58 │  cmp    %edi,0xc4(%rax)
  0.76 │  mov    %edi,0xc0(%rax)    // STORE dql->num_queued += count
  0.72 │  js     bd8

I got an incredible 10 % gain [1] by making sure cpu do not attempt
to get the cache line in Shared mode, but directly requests for
ownership.

New code :
	mov    %edx,0xc8(%rax)  // STORE dql->last_obj_cnt = count
	add    %edx,0xc0(%rax)  // RMW   dql->num_queued += count
	mov    0xc4(%rax),%ecx  // LOAD dql->adj_limit
	mov    0xc0(%rax),%edx  // LOAD dql->num_queued
	cmp    %edx,%ecx

The TX completion was running from another cpu, with high interrupts
rate.

Note that I am using barrier() as a soft hint, as mb() here could be
too heavy cost.

[1] This was a netperf TCP_STREAM with TSO disabled, but GSO enabled.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-29 00:04:55 -04:00
Dan Williams
7bced39751 net_dma: simple removal
Per commit "7787380336 net_dma: mark broken" net_dma is no longer used
and there is no plan to fix it.

This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards.
Reverting the remainder of the net_dma induced changes is deferred to
subsequent patches.

Marked for stable due to Roman's report of a memory leak in
dma_pin_iovec_pages():

    https://lkml.org/lkml/2014/9/3/177

Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: David Whipple <whipple@securedatainnovations.ch>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: <stable@vger.kernel.org>
Reported-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2014-09-28 07:05:16 -07:00
Linus Torvalds
1e3827bf8a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Assorted fixes + unifying __d_move() and __d_materialise_dentry() +
  minimal regression fix for d_path() of victims of overwriting rename()
  ported on top of that"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Don't exchange "short" filenames unconditionally.
  fold swapping ->d_name.hash into switch_names()
  fold unlocking the children into dentry_unlock_parents_for_move()
  kill __d_materialise_dentry()
  __d_materialise_dentry(): flip the order of arguments
  __d_move(): fold manipulations with ->d_child/->d_subdirs
  don't open-code d_rehash() in d_materialise_unique()
  pull rehashing and unlocking the target dentry into __d_materialise_dentry()
  ufs: deal with nfsd/iget races
  fuse: honour max_read and max_write in direct_io mode
  shmem: fix nlink for rename overwrite directory
2014-09-27 17:05:14 -07:00
Linus Torvalds
6111da3432 Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "This is quite late but these need to be backported anyway.

  This is the fix for a long-standing cpuset bug which existed from
  2009.  cpuset makes use of PF_SPREAD_{PAGE|SLAB} flags to modify the
  task's memory allocation behavior according to the settings of the
  cpuset it belongs to; unfortunately, when those flags have to be
  changed, cpuset did so directly even whlie the target task is running,
  which is obviously racy as task->flags may be modified by the task
  itself at any time.  This obscure bug manifested as corrupt
  PF_USED_MATH flag leading to a weird crash.

  The bug is fixed by moving the flag to task->atomic_flags.  The first
  two are prepatory ones to help defining atomic_flags accessors and the
  third one is the actual fix"

* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags
  sched: add macros to define bitops for task atomic flags
  sched: fix confusing PFA_NO_NEW_PRIVS constant
2014-09-27 16:45:33 -07:00
Mike Turquette
4dc7ed32f3 Allwinner Clocks Additions for 3.18
The most important part of this serie is the addition of the phase API to
 handle the MMC clocks in the Allwinner SoCs.
 
 Apart from that, the A23 gained a new mbus driver, and there's a fix for a
 incorrect divider table on the APB0 clock.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUJm3zAAoJEBx+YmzsjxAgxJwQAJk3+Oq3J54jzRxKLGjUpfy9
 Ma9p/78ZSnYlYWrEn62vzu7sGeMJsPo4Lsmy+Hch2r765+PzFZw9oDaxjFT5poQy
 Mv8F7Uyetc99sGAfmg/fKnzgQpp1t+9+kB42cV6lzjXolqX/ACcIjzFOzROXEF9B
 2bnQ3RwXqvQhKKryDBg9+hJYt1R15d4SxQ7Rn6lb6WsZTxjGVO0cvvU3tp4QGQgg
 ZDUkJNLzLYdMK9XUNyqreatmz+HMxL5vYHeEWFz388ECp9DRUPT3MqlQcUqgSLlD
 eMqQPOnd5p5ZEUdB8qAAtf4kIbQTaVa7/4u37sE/+fogw6Pq/6a2Jqppl9aJWD7I
 PDFjxSMl77W5mQZSEanbc0a0qmqAqtZokDusP0bc0ETSZzmPVvohjW5Fa9Awyi0j
 PeN2bTglaFDPsHxKlQ31HF/e/almXkpiIXegeG0e/3VrGSrghFMQtqLEUXgVPu10
 4PV8x7O2ib1VVAowwOb10qGv0fLGC8UCqL9zXVNlCy268ijjKMlNyK3U1sllphba
 fWBYgtg9+1YHONI1SewuYibAqROC7ICDXiqDkJVb6UWmO39HBcOFDb3HJ0EIj8T4
 9v1clkVy1vONIqfvi1SeTekLovpROOxhxGtyXTpdx5qdlVhBjkEsNVHc5jh6BPHr
 o9TlBnnmIPajvF9wMN+H
 =ZkI9
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-clocks-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into clk-next

Allwinner Clocks Additions for 3.18

The most important part of this serie is the addition of the phase API to
handle the MMC clocks in the Allwinner SoCs.

Apart from that, the A23 gained a new mbus driver, and there's a fix for a
incorrect divider table on the APB0 clock.
2014-09-27 12:52:33 -07:00
Martin K. Petersen
2341c2f8c3 block: Add T10 Protection Information functions
The T10 Protection Information format is also used by some devices that
do not go through the SCSI layer (virtual block devices, NVMe). Relocate
the relevant functions to a block layer library that can be used without
involving SCSI.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:59 -06:00
Martin K. Petersen
4eaf99bead block: Don't merge requests if integrity flags differ
We'd occasionally merge requests with conflicting integrity flags.
Introduce a merge helper which checks that the requests have compatible
integrity payloads.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:57 -06:00
Martin K. Petersen
aae7df5019 block: Integrity checksum flag
Make the choice of checksum a per-I/O property by introducing a flag
that can be inspected by the SCSI layer. There are several reasons for
this:

 1. It allows us to switch choice of checksum without unloading and
    reloading the HBA driver.

 2. During error recovery we need to be able to tell the HBA that
    checksums read from disk should not be verified and converted to IP
    checksums.

 3. For error injection purposes we need to be able to write a bad guard
    tag to storage. Since the storage device only supports T10 CRC we
    need to be able to disable IP checksum conversion on the HBA.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:55 -06:00
Martin K. Petersen
b1f0138857 block: Relocate bio integrity flags
Move flags affecting the integrity code out of the bio bi_flags and into
the block integrity payload.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:54 -06:00
Martin K. Petersen
3aec2f41a8 block: Add a disk flag to block integrity profile
So far we have relied on the app tag size to determine whether a disk
has been formatted with T10 protection information or not. However, not
all target devices provide application tag storage.

Add a flag to the block integrity profile that indicates whether the
disk has been formatted with protection information.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:51 -06:00
Martin K. Petersen
8288f496eb block: Add prefix to block integrity profile flags
Add a BLK_ prefix to the integrity profile flags. Also rename the flags
to be more consistent with the generate/verify terminology in the rest
of the integrity code.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:51 -06:00
Martin K. Petersen
1859308853 block: Clean up the code used to generate and verify integrity metadata
Instead of the "operate" parameter we pass in a seed value and a pointer
to a function that can be used to process the integrity metadata. The
generation function is changed to have a return value to fit into this
scheme.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:51 -06:00
Martin K. Petersen
3be91c4a3d block: Deprecate the use of the term sector in the context of block integrity
The protection interval is not necessarily tied to the logical block
size of a block device. Stop using the terms "sector" and "sectors".

Going forward we will use the term "seed" to describe the initial
reference tag value for a given I/O. "Interval" will be used to describe
the portion of the data buffer that a given piece of protection
information is associated with.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:50 -06:00
Martin K. Petersen
5f9378fa9c block: Remove bip_buf
bip_buf is not really needed so we can remove it.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:50 -06:00
Martin K. Petersen
8492b68bc4 block: Remove integrity tagging functions
None of the filesystems appear interested in using the integrity tagging
feature. Potentially because very few storage devices actually permit
using the application tag space.

Remove the tagging functions.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:50 -06:00
Martin K. Petersen
180b2f95dd block: Replace bi_integrity with bi_special
For commands like REQ_COPY we need a way to pass extra information along
with each bio. Like integrity metadata this information must be
available at the bottom of the stack so bi_private does not suffice.

Rename the existing bi_integrity field to bi_special and make it a union
so we can have different bio extensions for each class of command.

We previously used bi_integrity != NULL as a way to identify whether a
bio had integrity metadata or not. Introduce a REQ_INTEGRITY to be the
indicator now that bi_special can contain different things.

In addition, bio_integrity(bio) will now return a pointer to the
integrity payload (when applicable).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:46 -06:00
Martin K. Petersen
e7258c1a26 block: Get rid of bdev_integrity_enabled()
bdev_integrity_enabled() is only used by bio_integrity_enabled().
Combine these two functions.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:44 -06:00
Paolo Bonzini
e77d99d4a4 Changes for KVM for arm/arm64 for 3.18
This includes a bunch of changes:
  - Support read-only memory slots on arm/arm64
  - Various changes to fix Sparse warnings
  - Correctly detect write vs. read Stage-2 faults
  - Various VGIC cleanups and fixes
  - Dynamic VGIC data strcuture sizing
  - Fix SGI set_clear_pend offset bug
  - Fix VTTBR_BADDR Mask
  - Correctly report the FSC on Stage-2 faults
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUJWAdAAoJEEtpOizt6ddy9cMH+gIoUPnRJLe+PPcOOyxOx6pr
 +CnD/zAd0sLvxZLP/LBOzu99H3YrbO5kwI/172/8G1zUNI2hp6YxEEJaBCTHrz6l
 RwgLy7a3EMMY51nJo5w2dkFUo8cUX9MsHqMpl2Xb7Dvo2ZHp+nDqRjwRY6yi+t4V
 dWSJTRG6X+DIWyysij6jBtfKU6MpU+4NW3Zdk1fapf8QDkn+cBtV5X2QcmERCaIe
 A1j9hiGi43KA3XWeeePU3aVaxC2XUhTayP8VsfVxoNG2manaS6lqjmbif5ghs/0h
 rw7R3/Aj0MJny2zT016MkvKJKRukuVRD6e1lcYghqnSJhL2FossowZ9fHRADpqU=
 =QgU8
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next

Changes for KVM for arm/arm64 for 3.18

This includes a bunch of changes:
 - Support read-only memory slots on arm/arm64
 - Various changes to fix Sparse warnings
 - Correctly detect write vs. read Stage-2 faults
 - Various VGIC cleanups and fixes
 - Dynamic VGIC data strcuture sizing
 - Fix SGI set_clear_pend offset bug
 - Fix VTTBR_BADDR Mask
 - Correctly report the FSC on Stage-2 faults

Conflicts:
	virt/kvm/eventfd.c
	[duplicate, different patch where the kvm-arm version broke x86.
	 The kvm tree instead has the right one]
2014-09-27 11:03:33 +02:00
Maxime Ripard
9824cf73c3 clk: Add a function to retrieve phase
The current phase API doesn't look into the actual hardware to get the phase
value, but will rather get it from a variable only set by the set_phase
function.

This will cause issue when the client driver will never call the set_phase
function, where we can end up having a reported phase that will not match what
the hardware has been programmed to by the bootloader or what phase is
programmed out of reset.

Add a new get_phase function for the drivers to implement so that we can get
this value.

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Hans de Goede <hdegoede@redhat.com>
2014-09-27 08:57:51 +02:00
Maxime Ripard
355bb165cd clk: Include of.h in clock-provider.h
CLK_OF_DECLARE relies on OF_DECLARE_1 that is defined in of.h. Fixes build
errors when one use CLK_OF_DECLARE but doesn't include of.h

Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
2014-09-27 08:57:50 +02:00
Mike Turquette
e59c5371fb clk: introduce clk_set_phase function & callback
A common operation for a clock signal generator is to shift the phase of
that signal. This patch introduces a new function to the clk.h API to
dynamically adjust the phase of a clock signal. Additionally this patch
introduces support for the new function in the common clock framework
via the .set_phase call back in struct clk_ops.

Signed-off-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Hans de Goede <hdegoede@redhat.com>
2014-09-27 08:57:38 +02:00
Miklos Szeredi
2c80929c4c fuse: honour max_read and max_write in direct_io mode
The third argument of fuse_get_user_pages() "nbytesp" refers to the number of
bytes a caller asked to pack into fuse request. This value may be lesser
than capacity of fuse request or iov_iter.  So fuse_get_user_pages() must
ensure that *nbytesp won't grow.

Now, when helper iov_iter_get_pages() performs all hard work of extracting
pages from iov_iter, it can be done by passing properly calculated
"maxsize" to the helper.

The other caller of iov_iter_get_pages() (dio_refill_pages()) doesn't need
this capability, so pass LONG_MAX as the maxsize argument here.

Fixes: c9c37e2e63 ("fuse: switch to iov_iter_get_pages()")
Reported-by: Werner Baumann <werner.baumann@onlinehome.de>
Tested-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-26 21:16:51 -04:00
Jyri Sarha
c873d14d30 clk: add gpio gated clock
The added gpio-gate-clock is a basic clock that can be enabled and
disabled trough a gpio output. The DT binding document for the clock
is also added. For EPROBE_DEFER handling the registering of the clock
has to be delayed until of_clk_get() call time.

Signed-off-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
2014-09-26 16:51:42 -07:00
Mike Turquette
b6b2fe5b6e Tegra clk updates for 3.18
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJUHBvYAAoJEFFb18rurjwTyaoP/37XBxRXfN2mL+WroUC7rb4m
 qb7PKg0RTOjjWV6Dp2EzmXThxsy0sGRpML9ZNp6LYIqPOH4sBHNeRK0+Qf1dB1XZ
 /DySj4oqEgJ+x8GHTqceM5x4C5fwDrF4sl5uacYIZcA42ih3nY331zlJaNPpkSpx
 iP91SJsQghK9OpdnOTl8oOV9p6PIjFfn712GPIRL7PBIBkVswdkJHufozzvcX+mI
 O30/FGEGbQCmCs1pFXufFNwJQcOy0qK5zyZPhGbzyji9gkCXz4OQOqWLdE1kXbWY
 CDfolp0Xdr+hLN/6YkQBEkZSjvKzLZf8OOYmVsXWXplVDjLf9/2p/Kq33p8JGKUa
 R8BC17MtmKjNgJ7g0Zo870meRWDcsLOfr60Wu4MIhzSJKSVDp6oq0ZqYay10/ZuF
 iMv49Mcpbgl4d0D6DgJztMPYwQZrMOkTLByGo+s/UxB+gY2XtvUO/6mPchYfhYUX
 M7y28Im4j5pmTKw5WhulfDWTBUPul8tvqGLdB4i9zKOA0iszby7pnGzjKfZd4RYg
 FD5Byzd225X92R/kKxWPTv4SQBM7flVkfvJLbOV2SiedoCHcyLXJjtSgaFlE3XYY
 gUt9uqM5cwiw6hwPVzP1su02419L6cZYkCWOlVMNlMZ3fjd6js6NUtNjxH30tU7H
 TrwJ4tAYaIkuTw0jBnFE
 =WIPU
 -----END PGP SIGNATURE-----

Merge tag 'tegra-clk-3.18' of git://nv-tegra.nvidia.com/user/pdeschrijver/linux into clk-next

Tegra clk updates for 3.18
2014-09-26 16:09:39 -07:00
Eric Dumazet
f4a775d144 net: introduce __skb_header_release()
While profiling TCP stack, I noticed one useless atomic operation
in tcp_sendmsg(), caused by skb_header_release().

It turns out all current skb_header_release() users have a fresh skb,
that no other user can see, so we can avoid one atomic operation.

Introduce __skb_header_release() to clearly document this.

This gave me a 1.5 % improvement on TCP_RR workload.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:40:06 -04:00
David S. Miller
57219dc7bf Merge tag 'master-2014-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-09-22

Please pull this batch of updates intended for the 3.18 stream...

For the mac80211 bits, Johannes says:

"This time, I have some rate minstrel improvements, support for a very
small feature from CCX that Steinar reverse-engineered, dynamic ACK
timeout support, a number of changes for TDLS, early support for radio
resource measurement and many fixes. Also, I'm changing a number of
places to clear key memory when it's freed and Intel claims copyright
for code they developed."

For the bluetooth bits, Johan says:

"Here are some more patches intended for 3.18. Most of them are cleanups
or fixes for SMP. The only exception is a fix for BR/EDR L2CAP fixed
channels which should now work better together with the L2CAP
information request procedure."

For the iwlwifi bits, Emmanuel says:

"I fix here dvm which was broken by my last pull request. Arik
continues to work on TDLS and Luca solved a few issues in CT-Kill. Eyal
keeps digging into rate scaling code, more to come soon. Besides this,
nothing really special here."

Beyond that, there are the usual big batches of updates to ath9k, b43,
mwifiex, and wil6210 as well as a handful of other bits here and there.
Also, rtlwifi gets some btcoexist attention from Larry.

Please let me know if there are problems!
====================

Had to adjust the wil6210 code to comply with Joe Perches's recent
change in net-next to make the netdev_*() routines return void instead
of 'int'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:39:24 -04:00
Joe Perches
6ea754eb76 net: Change netdev_<level> logging functions to return void
No caller or macro uses the return value so make all
the functions return void.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:17:17 -04:00
Alexei Starovoitov
17a5267067 bpf: verifier (add verifier core)
This patch adds verifier core which simulates execution of every insn and
records the state of registers and program stack. Every branch instruction seen
during simulation is pushed into state stack. When verifier reaches BPF_EXIT,
it pops the state from the stack and continues until it reaches BPF_EXIT again.
For program:
1: bpf_mov r1, xxx
2: if (r1 == 0) goto 5
3: bpf_mov r0, 1
4: goto 6
5: bpf_mov r0, 2
6: bpf_exit
The verifier will walk insns: 1, 2, 3, 4, 6
then it will pop the state recorded at insn#2 and will continue: 5, 6

This way it walks all possible paths through the program and checks all
possible values of registers. While doing so, it checks for:
- invalid instructions
- uninitialized register access
- uninitialized stack access
- misaligned stack access
- out of range stack access
- invalid calling convention
- instruction encoding is not using reserved fields

Kernel subsystem configures the verifier with two callbacks:

- bool (*is_valid_access)(int off, int size, enum bpf_access_type type);
  that provides information to the verifer which fields of 'ctx'
  are accessible (remember 'ctx' is the first argument to eBPF program)

- const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
  returns argument constraints of kernel helper functions that eBPF program
  may call, so that verifier can checks that R1-R5 types match the prototype

More details in Documentation/networking/filter.txt and in kernel/bpf/verifier.c

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:15 -04:00
Alexei Starovoitov
0246e64d9a bpf: handle pseudo BPF_LD_IMM64 insn
eBPF programs passed from userspace are using pseudo BPF_LD_IMM64 instructions
to refer to process-local map_fd. Scan the program for such instructions and
if FDs are valid, convert them to 'struct bpf_map' pointers which will be used
by verifier to check access to maps in bpf_map_lookup/update() calls.
If program passes verifier, convert pseudo BPF_LD_IMM64 into generic by dropping
BPF_PSEUDO_MAP_FD flag.

Note that eBPF interpreter is generic and knows nothing about pseudo insns.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:15 -04:00
Alexei Starovoitov
51580e798c bpf: verifier (add docs)
this patch adds all of eBPF verfier documentation and empty bpf_check()

The end goal for the verifier is to statically check safety of the program.

Verifier will catch:
- loops
- out of range jumps
- unreachable instructions
- invalid instructions
- uninitialized register access
- uninitialized stack access
- misaligned stack access
- out of range stack access
- invalid calling convention

More details in Documentation/networking/filter.txt

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:14 -04:00
Alexei Starovoitov
09756af468 bpf: expand BPF syscall with program load/unload
eBPF programs are similar to kernel modules. They are loaded by the user
process and automatically unloaded when process exits. Each eBPF program is
a safe run-to-completion set of instructions. eBPF verifier statically
determines that the program terminates and is safe to execute.

The following syscall wrapper can be used to load the program:
int bpf_prog_load(enum bpf_prog_type prog_type,
                  const struct bpf_insn *insns, int insn_cnt,
                  const char *license)
{
    union bpf_attr attr = {
        .prog_type = prog_type,
        .insns = ptr_to_u64(insns),
        .insn_cnt = insn_cnt,
        .license = ptr_to_u64(license),
    };

    return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
}
where 'insns' is an array of eBPF instructions and 'license' is a string
that must be GPL compatible to call helper functions marked gpl_only

Upon succesful load the syscall returns prog_fd.
Use close(prog_fd) to unload the program.

User space tests and examples follow in the later patches

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:14 -04:00
Alexei Starovoitov
db20fd2b01 bpf: add lookup/update/delete/iterate methods to BPF maps
'maps' is a generic storage of different types for sharing data between kernel
and userspace.

The maps are accessed from user space via BPF syscall, which has commands:

- create a map with given type and attributes
  fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
  returns fd or negative error

- lookup key in a given map referenced by fd
  err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->value
  returns zero and stores found elem into value or negative error

- create or update key/value pair in a given map
  err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->value
  returns zero or negative error

- find and delete element by key in a given map
  err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key

- iterate map elements (based on input key return next_key)
  err = bpf(BPF_MAP_GET_NEXT_KEY, union bpf_attr *attr, u32 size)
  using attr->map_fd, attr->key, attr->next_key

- close(fd) deletes the map

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:14 -04:00
Alexei Starovoitov
749730ce42 bpf: enable bpf syscall on x64 and i386
done as separate commit to ease conflict resolution

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:14 -04:00
Alexei Starovoitov
99c55f7d47 bpf: introduce BPF syscall and maps
BPF syscall is a multiplexor for a range of different operations on eBPF.
This patch introduces syscall with single command to create a map.
Next patch adds commands to access maps.

'maps' is a generic storage of different types for sharing data between kernel
and userspace.

Userspace example:
/* this syscall wrapper creates a map with given type and attributes
 * and returns map_fd on success.
 * use close(map_fd) to delete the map
 */
int bpf_create_map(enum bpf_map_type map_type, int key_size,
                   int value_size, int max_entries)
{
    union bpf_attr attr = {
        .map_type = map_type,
        .key_size = key_size,
        .value_size = value_size,
        .max_entries = max_entries
    };

    return bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}

'union bpf_attr' is backwards compatible with future extensions.

More details in Documentation/networking/filter.txt and in manpage

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:14 -04:00
Sebastian Reichel
1670d8569e Immutable branch with restart handler patches for v3.18
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUJQ8/AAoJEMsfJm/On5mBMNgP+QEUHpRKJaOGU3jX/ftHH/t3
 EoNUx7lZt6Q0c9MB2ySAxILYpWUujc9N0tDkRDyW7mTWunF8gEGiRN+iKaSbzcUN
 Y4VffRAbxBasIaBqRtpDl08ycODh6Xu1t8sAao03DdhnMNLGNNO79s3UFHsubdTC
 cXx9mfYR/2SHV/0BXiFvKi8ovdqUspdp9cyZO/qc0PVFGbsADx3MNGGzkvWfgvcE
 6vXnKnUkZrNl5JPiG77kTKZnDsjEMXggmA9DGWKijFCJjGIbuLiuIDf63Zp+eQ52
 mJMRA+ViP/dDgAxY1dkWBcF5nOBT1vTYwLfy69jEoQeHzcomiHVoDKmCSBOpeAEH
 G8VoasWKWYpYnlcOJb+XgkA3QTe6mOPgAPzNsbYr0Ep7hMFw66mOQgKbgi6k4Qts
 HHimG9pnBYpPlBUfvNh+6K4dHAm0C2IyoZyMhKWsyFH6hkhS8TVM8j0gPR8rTTmk
 0a9/e2vxcFnfBe3UAJaqzWRVFsBkOHrTNpG1hvID3Oq8IeywSBXw2VMSR93+mwaB
 sa/GCZKlqHGpOfmtILlhiXQX0E/tTHmcrI2VqyCpX0J2CW+MiGvkcGOwKHOJciSA
 Cj9D68y837QU/DCpMQ6ec/5wqWqZKz8yQb8kxb6vJcL19JcVKdAiPzbuOI49C3Ux
 YxDWoUutzDfVoUD5RhcJ
 =cP1w
 -----END PGP SIGNATURE-----

Merge tag 'tags/restart-handler-for-v3.18' into next

Immutable branch with restart handler patches for v3.18
2014-09-26 19:45:11 +02:00
Pablo Neira Ayuso
34666d467c netfilter: bridge: move br_netfilter out of the core
Jesper reported that br_netfilter always registers the hooks since
this is part of the bridge core. This harms performance for people that
don't need this.

This patch modularizes br_netfilter so it can be rmmod'ed, thus,
the hooks can be unregistered. I think the bridge netfilter should have
been a separated module since the beginning, Patrick agreed on that.

Note that this is breaking compatibility for users that expect that
bridge netfilter is going to be available after explicitly 'modprobe
bridge' or via automatic load through brctl.

However, the damage can be easily undone by modprobing br_netfilter.
The bridge core also spots a message to provide a clue to people that
didn't notice that this has been deprecated.

On top of that, the plan is that nftables will not rely on this software
layer, but integrate the connection tracking into the bridge layer to
enable stateful filtering and NAT, which is was bridge netfilter users
seem to require.

This patch still keeps the fake_dst_ops in the bridge core, since this
is required by when the bridge port is initialized. So we can safely
modprobe/rmmod br_netfilter anytime.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Florian Westphal <fw@strlen.de>
2014-09-26 18:42:31 +02:00
Pablo Neira Ayuso
7276ca3fa2 netfilter: bridge: nf_bridge_copy_header as static inline in header
Move nf_bridge_copy_header() as static inline in netfilter_bridge.h
header file. This patch prepares the modularization of the br_netfilter
code.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-26 18:42:30 +02:00
Sebastian Andrzej Siewior
d74d5d1b72 tty: serial: 8250_core: add run time pm
While comparing the OMAP-serial and the 8250 part of this I noticed that
the latter does not use run time-pm. Here are the pieces. It is
basically a get before first register access and a last_busy + put after
last access. This has to be enabled from userland _and_ UART_CAP_RPM is
required for this.
The runtime PM can usually work transparently in the background however
there is one exception to this: After serial8250_tx_chars() completes
there still may be unsent bytes in the FIFO (depending on CPU speed vs
baud rate + flow control). Even if the TTY-buffer is empty we do not
want RPM to disable the device because it won't send the remaining
bytes. Instead we leave serial8250_tx_chars() with RPM enabled and wait
for the FIFO empty interrupt. Once we enter serial8250_tx_chars() with
an empty buffer we know that the FIFO is empty and since we are not going
to send anything, we can disable the device.
That xchg() is to ensure that serial8250_tx_chars() can be called
multiple times and only the first invocation will actually invoke the
runtime PM function. So that the last invocation of __stop_tx() will
disable runtime pm.

NOTE: do not enable RPM on the device unless you know what you do! If
the device goes idle, it won't be woken up by incomming RX data _unless_
there is a wakeup irq configured which is usually the RX pin configure
for wakeup via the reset module. The RX activity will then wake up the
device from idle. However the first character is garbage and lost. The
following bytes will be received once the device is up in time. On the
beagle board xm (omap3) it takes approx 13ms from the first wakeup byte
until the first byte that is received properly if the device was in
core-off.

v5…v8:
	- drop RPM from serial8250_set_mctrl() it will be used in
	  restore path which already has RPM active and holds
	  dev->power.lock
v4…v5:
	- add a wrapper around rpm function and introduce UART_CAP_RPM
	  to ensure RPM put is invoked after the TX FIFO is empty.
v3…v4:
	- added runtime to the console code
	- removed device_may_wakeup() from serial8250_set_sleep()

Cc: mika.westerberg@linux.intel.com
Reviewed-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-26 18:01:56 +02:00
Sebastian Andrzej Siewior
234abab143 tty: serial: 8250_core: allow to set ->throttle / ->unthrottle callbacks
The OMAP UART provides support for HW assisted flow control. What is
missing is the support to throttle / unthrottle callbacks which are used
by the omap-serial driver at the moment.
This patch adds the callbacks. It should be safe to add them since they
are only invoked from the serial_core (uart_throttle()) if the feature
flags are set.

Reviewed-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-26 18:01:56 +02:00
Linus Torvalds
d2865c7d4e Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "A CONFIG_STACK_GROWSUP=y fix, and a hotplug llc CPU mask fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix unreleased llc_shared_mask bit during CPU hotplug
  sched: Fix end_of_stack() and location of stack canary for architectures using CONFIG_STACK_GROWSUP
2014-09-26 08:38:09 -07:00
Mika Westerberg
6ab3430129 mfd: Add ACPI support
If an MFD device is backed by ACPI namespace, we should allow subdevice
drivers to access their corresponding ACPI companion devices through normal
means (e.g using ACPI_COMPANION()).

This patch adds such support to the MFD core. If the MFD parent device
does not specify any ACPI _HID/_CID for the child device, the child
device will share the parent ACPI companion device. Otherwise the child
device will be assigned with the corresponding ACPI companion, if found
in the namespace below the parent.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-09-26 08:24:05 +01:00
Jeff Lance
f0933a60d1 mfd: ti_am335x_tscadc: Update logic in CTRL register for 5-wire TS
The logic in AFE_Pen_Ctrl bitmask in the CTRL register is different for five
wire versus four or eight wire touschscreens. This patch should fix this for
five-wire touch screens. There should be no change needed here for four and
eight wire tousch screens.

Signed-off-by: Jeff Lance <j-lance1@ti.com>
[bigeasy: keep the change mfd only]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-09-26 08:24:03 +01:00
Mark Brown
0b496b4c95 mfd: tps65217: Tell regmap what registers are valid
Allow regmap to provide debugfs access to the register map by telling it
what registers are valid.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-09-26 08:23:50 +01:00
Guodong Xu
8bdf87b400 mfd: Add HI6421 PMIC Core driver
This adds driver to support HiSilicon Hi6421 PMIC. Hi6421 includes multi-
functions, such as regulators, codec, ADCs, Coulomb counter, etc.
This driver includes core APIs _only_.

Drivers for individul components, like voltage regulators, are
implemented in corresponding driver directories and files.

Registers in Hi6421 are memory mapped, so using regmap-mmio API.

Signed-off-by: Guodong Xu <guodong.xu@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2014-09-26 08:23:43 +01:00