Commit graph

34408 commits

Author SHA1 Message Date
Neil Horman
bd7c4b604a netpoll: convert mutex into a semaphore
Bart Van Assche recently reported a warning to me:

<IRQ>  [<ffffffff8103d79f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff8103d7fa>] warn_slowpath_null+0x1a/0x20
[<ffffffff814761dd>] mutex_trylock+0x16d/0x180
[<ffffffff813968c9>] netpoll_poll_dev+0x49/0xc30
[<ffffffff8136a2d2>] ? __alloc_skb+0x82/0x2a0
[<ffffffff81397715>] netpoll_send_skb_on_dev+0x265/0x410
[<ffffffff81397c5a>] netpoll_send_udp+0x28a/0x3a0
[<ffffffffa0541843>] ? write_msg+0x53/0x110 [netconsole]
[<ffffffffa05418bf>] write_msg+0xcf/0x110 [netconsole]
[<ffffffff8103eba1>] call_console_drivers.constprop.17+0xa1/0x1c0
[<ffffffff8103fb76>] console_unlock+0x2d6/0x450
[<ffffffff8104011e>] vprintk_emit+0x1ee/0x510
[<ffffffff8146f9f6>] printk+0x4d/0x4f
[<ffffffffa0004f1d>] scsi_print_command+0x7d/0xe0 [scsi_mod]

This resulted from my commit ca99ca14c which introduced a mutex_trylock
operation in a path that could execute in interrupt context.  When mutex
debugging is enabled, the above warns the user when we are in fact
exectuting in interrupt context
interrupt context.

After some discussion, It seems that a semaphore is the proper mechanism to use
here.  While mutexes are defined to be unusable in interrupt context, no such
condition exists for semaphores (save for the fact that the non blocking api
calls, like up and down_trylock must be used when in irq context).

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
CC: Bart Van Assche <bvanassche@acm.org>
CC: David Miller <davem@davemloft.net>
CC: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-01 15:00:24 -04:00
David S. Miller
14d3692f04 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
The following patchset contains relevant updates for the Netfilter
tree, they are:

* Enhancements for ipset: Add the counter extension for sets, this
  information can be used from the iptables set match, to change
  the matching behaviour. Jozsef required to add the extension
  infrastructure and moved the existing timeout support upon it.
  This also includes a change in net/sched/em_ipset to adapt it to
  the new extension structure.

* Enhancements for performance boosting in nfnetlink_queue: Add new
  configuration flags that allows user-space to receive big packets (GRO)
  and to disable checksumming calculation. This were proposed by Eric
  Dumazet during the Netfilter Workshop 2013 in Copenhagen. Florian
  Westphal was kind enough to find the time to materialize the proposal.

* A sparse fix from Simon, he noticed it in the SCTP NAT helper, the fix
  required a change in the interface of sctp_end_cksum.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-29 14:29:06 -04:00
Jozsef Kadlecsik
6e01781d1c netfilter: ipset: set match: add support to match the counters
The new revision of the set match supports to match the counters
and to suppress updating the counters at matching too.

At the set:list types, the updating of the subcounters can be
suppressed as well.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29 20:09:03 +02:00
Jozsef Kadlecsik
34d666d489 netfilter: ipset: Introduce the counter extension in the core
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29 20:08:59 +02:00
Jozsef Kadlecsik
1feab10d7e netfilter: ipset: Unified hash type generation
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29 20:08:56 +02:00
Jozsef Kadlecsik
4d73de38c2 netfilter: ipset: Unified bitmap type generation
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29 20:08:54 +02:00
Jozsef Kadlecsik
075e64c041 netfilter: ipset: Introduce extensions to elements in the core
Introduce extensions to elements in the core and prepare timeout as
the first one.

This patch also modifies the em_ipset classifier to use the new
extension struct layout.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29 20:08:54 +02:00
Jozsef Kadlecsik
8672d4d1a0 netfilter: ipset: Move often used IPv6 address masking function to header file
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29 20:08:50 +02:00
Jozsef Kadlecsik
43c56e595b netfilter: ipset: Make possible to test elements marked with nomatch
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-29 20:08:44 +02:00
Pravin B Shelar
5f5624cf15 ipv6: Kill ipv6 dependency of icmpv6_send().
Following patch adds icmp-registration module for ipv6.  It allows
ipv6 protocol to register icmp_sender which is used for sending
ipv6 icmp msgs.  This extra layer allows us to kill ipv6 dependency
for sending icmp packets.

This patch also fixes ip_tunnel compilation problem when ip_tunnel
is statically compiled in kernel but ipv6 is module

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-29 13:54:36 -04:00
Nicolas Dichtel
e8d9612c18 sock_diag: allow to dump bpf filters
This patch allows to dump BPF filters attached to a socket with
SO_ATTACH_FILTER.
Note that we check CAP_SYS_ADMIN before allowing to dump this info.

For now, only AF_PACKET sockets use this feature.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-29 13:21:30 -04:00
Rony Efraim
2cccb9e4f3 net/mlx4: Add support to get VF config
Support getting VF config.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-26 23:29:14 -04:00
Rony Efraim
e6b6a23163 net/mlx4: Add VF MAC spoof checking support
Add ndo_set_vf_spoofchk support

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-26 23:29:14 -04:00
Rony Efraim
3f7fb021d0 net/mlx4: Add set VF default vlan ID and priority support
Add support to ndo_set_vf_vlan in the driver. Once this call is used the vport
is considered to be in VST mode. In this mode, the PPF driver configures
Ethernet QPs created by this VF to use this vlan id and priority. Currently
RoCE isn't supported on that mode.

The special values of VID=4095 or VID=0,UP=0 are considered as VGT.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-26 23:29:13 -04:00
Rony Efraim
8f7ba3ca12 net/mlx4: Add set VF mac address support
Add ndo_set_vf_mac support which allows to set the MAC address
for mlx4 VF Ethernet NICs from the host.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-26 23:29:13 -04:00
Alexander Duyck
5a8eb24292 pci: Add SRIOV helper function to determine if VFs are assigned to guest
This function is meant to add a helper function that will determine if a PF
has any VFs that are currently assigned to a guest.  We currently have been
implementing this function per driver, and going forward I would like to avoid
that by making this function generic and using this helper.

v2: Removed extern from declaration of pci_vfs_assigned in pci.h and return
    0 if SR-IOV is disabled with is inline with other PCI SRIOV functions.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-04-24 19:31:58 -07:00
Amir Vadai
ec693d4701 net/mlx4_en: Add HW timestamping (TS) support
The patch allows to enable/disable HW timestamping for incoming and/or
outgoing packets. It adds and initializes all structs and callbacks
needed by kernel TS API.
To enable/disable HW timestamping appropriate ioctl should be used.
Currently HWTSTAMP_FILTER_ALL/NONE and HWTSAMP_TX_ON/OFF only are
supported.
When enabling TS on receive flow - VLAN stripping will be disabled.
Also were made all relevant changes in RX/TX flows to consider TS request
and plant HW timestamps into relevant structures.
mlx4_ib was fixed to compile with new mlx4_cq_alloc() signature.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-24 16:30:14 -04:00
Eugenia Emantayev
ddd8a6c12d net/mlx4_core: Read HCA frequency and map internal clock
Read HCA frequency, read PCI clock bar and offset, map internal clock to
PCI bar.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-24 16:30:13 -04:00
Eugenia Emantayev
d998735f44 net/mlx4_core: Add timestamping device capability
Add new device capability for timestamping support and query FW to retrieve it.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-24 16:30:13 -04:00
John W. Linville
6ed0e321a0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-04-24 10:54:20 -04:00
John W. Linville
ec094144cd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
2013-04-23 14:09:39 -04:00
David S. Miller
6e0895c2ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be_main.c
	drivers/net/ethernet/intel/igb/igb_main.c
	drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
	include/net/scm.h
	net/batman-adv/routing.c
	net/ipv4/tcp_input.c

The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
cleanup in net-next to now pass cred structs around.

The be2net driver had a bug fix in 'net' that overlapped with the VLAN
interface changes by Patrick McHardy in net-next.

An IGB conflict existed because in 'net' the build_skb() support was
reverted, and in 'net-next' there was a comment style fix within that
code.

Several batman-adv conflicts were resolved by making sure that all
calls to batadv_is_my_mac() are changed to have a new bat_priv first
argument.

Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
rewrite in 'net-next', mostly overlapping changes.

Thanks to Stephen Rothwell and Antonio Quartulli for help with several
of these merge resolutions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-22 20:32:51 -04:00
Johannes Berg
a42c74ee60 Merge remote-tracking branch 'wireless-next/master' into mac80211-next 2013-04-22 15:31:43 +02:00
Patrick McHardy
9fae27b337 net: vlan: fix dummy function signatures for CONFIG_VLAN=n
Fix up some function signatures for CONFIG_VLAN=n that were missed during
the 802.1ad support patches.

Found by the kbuild robot.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-21 15:56:59 -04:00
Linus Torvalds
830ac8524f Merge branch 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull kdump fixes from Peter Anvin:
 "The kexec/kdump people have found several problems with the support
  for loading over 4 GiB that was introduced in this merge cycle.  This
  is partly due to a number of design problems inherent in the way the
  various pieces of kdump fit together (it is pretty horrifically manual
  in many places.)

  After a *lot* of iterations this is the patchset that was agreed upon,
  but of course it is now very late in the cycle.  However, because it
  changes both the syntax and semantics of the crashkernel option, it
  would be desirable to avoid a stable release with the broken
  interfaces."

I'm not happy with the timing, since originally the plan was to release
the final 3.9 tomorrow.  But apparently I'm doing an -rc8 instead...

* 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kexec: use Crash kernel for Crash kernel low
  x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low
  x86, kdump: Retore crashkernel= to allocate under 896M
  x86, kdump: Set crashkernel_low automatically
2013-04-20 18:40:36 -07:00
Linus Torvalds
db93f8b420 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "Three groups of fixes:

   1. Make sure we don't execute the early microcode patching if family
      < 6, since it would touch MSRs which don't exist on those
      families, causing crashes.

   2. The Xen partial emulation of HyperV can be dealt with more
      gracefully than just disabling the driver.

   3. More EFI variable space magic.  In particular, variables hidden
      from runtime code need to be taken into account too."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Verify the family before dispatching microcode patching
  x86, hyperv: Handle Xen emulation of Hyper-V more gracefully
  x86,efi: Implement efi_no_storage_paranoia parameter
  efi: Export efi_query_variable_store() for efivars.ko
  x86/Kconfig: Make EFI select UCS2_STRING
  efi: Distinguish between "remaining space" and actually used space
  efi: Pass boot services variable info to runtime code
  Move utf16 functions to kernel core and rename
  x86,efi: Check max_size only if it is non-zero.
  x86, efivars: firmware bug workarounds should be in platform code
2013-04-20 18:38:48 -07:00
Linus Torvalds
c437d888c7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:

 1) ax88796 does 64-bit divides which causes link errors on ARM, fix
    from Arnd Bergmann.

 2) Once an improper offload setting is detected on an SKB we don't rate
    limit the log message so we can very easily live lock.  From Ben
    Greear.

 3) Openvswitch cannot report vport configuration changes reliably
    because it didn't preallocate the netlink notification message
    before changing state.  From Jesse Gross.

 4) The effective UID/GID SCM credentials fix, from Linus.

 5) When a user explicitly asks for wireless authentication, cfg80211
    isn't told about the AP detachment leaving inconsistent state.  Fix
    from Johannes Berg.

 6) Fix self-MAC checks in batman-adv on multi-mesh nodes, from Antonio
    Quartulli.

 7) Revert build_skb() change sin IGB driver, can result in memory
    corruption.  From Alexander Duyck.

 8) Fix setting VLANs on virtual functions in IXGBE, from Greg Rose.

 9) Fix TSO races in qlcnic driver, from Sritej Velaga.

10) In bnx2x the kernel driver and UNDI firmware can try to program the
    chip at the same time, resulting in corruption.  Add proper
    synchronization.  From Dmitry Kravkov.

11) Fix corruption of status block in firmware ram in bxn2x, from Ariel
    Elior.

12) Fix load balancing hash regression of bonding driver in forwarding
    configurations, from Eric Dumazet.

13) Fix TS ECR regression in TCP by calling tcp_replace_ts_recent() in
    all the right spots, from Eric Dumazet.

14) Fix several bonding bugs having to do with address manintainence,
    including not removing address when configuration operations
    encounter errors, missed locking on the address lists, missing
    refcounting on VLAN objects, etc.  All from Nikolay Aleksandrov.

15) Add workarounds for firmware bugs in LTE qmi_wwan devices, wherein
    the devices fail to add a proper ethernet header while on LTE
    networks but otherwise properly do so on 2G and 3G ones.  From Bjørn
    Mork.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
  net: fix incorrect credentials passing
  net: rate-limit warn-bad-offload splats.
  net: ax88796: avoid 64 bit arithmetic
  qlge: Update version to 1.00.00.32.
  qlge: Fix ethtool autoneg advertising.
  qlge: Fix receive path to drop error frames
  net: qmi_wwan: prevent duplicate mac address on link (firmware bug workaround)
  net: qmi_wwan: fixup destination address (firmware bug workaround)
  net: qmi_wwan: fixup missing ethernet header (firmware bug workaround)
  bonding: in bond_mc_swap() bond's mc addr list is walked without lock
  bonding: disable netpoll on enslave failure
  bonding: primary_slave & curr_active_slave are not cleaned on enslave failure
  bonding: vlans don't get deleted on enslave failure
  bonding: mc addresses don't get deleted on enslave failure
  pkt_sched: fix error return code in fw_change_attrs()
  irda: small read past the end of array in debug code
  tcp: call tcp_replace_ts_recent() from tcp_ack()
  netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too
  netfilter: ipset: bitmap:ip,mac: fix listing with timeout
  bonding: fix l23 and l34 load balancing in forwarding path
  ...
2013-04-20 18:21:05 -07:00
H. Peter Anvin
c0a9f451e4 Merge remote-tracking branch 'efi/urgent' into x86/urgent
Matt Fleming (1):
      x86, efivars: firmware bug workarounds should be in platform
      code

Matthew Garrett (3):
      Move utf16 functions to kernel core and rename
      efi: Pass boot services variable info to runtime code
      efi: Distinguish between "remaining space" and actually used
      space

Richard Weinberger (2):
      x86,efi: Check max_size only if it is non-zero.
      x86,efi: Implement efi_no_storage_paranoia parameter

Sergey Vlasov (2):
      x86/Kconfig: Make EFI select UCS2_STRING
      efi: Export efi_query_variable_store() for efivars.ko

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-19 17:09:03 -07:00
Daniel Borkmann
6e94d1ef37 net: socket: move ktime2ts to ktime header api
Currently, ktime2ts is a small helper function that is only used in
net/socket.c. Move this helper into the ktime API as a small inline
function, so that i) it's maintained together with ktime routines,
and ii) also other files can make use of it. The function is named
ktime_to_timespec_cond() and placed into the generic part of ktime,
since we internally make use of ktime_to_timespec(). ktime_to_timespec()
itself does not check the ktime variable for zero, hence, we name
this function ktime_to_timespec_cond() for only a conditional
conversion, and adapt its users to it.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 16:39:13 -04:00
Patrick McHardy
3ab1f683bf nfnetlink: add support for memory mapped netlink
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:58:36 -04:00
Patrick McHardy
ec464e5dc5 netfilter: rename netlink related "pid" variables to "portid"
Get rid of the confusing mix of pid and portid and use portid consistently
for all netlink related socket identities.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:58:36 -04:00
Patrick McHardy
f9c2288837 netlink: implement memory mapped recvmsg()
Add support for mmap'ed recvmsg(). To allow the kernel to construct messages
into the mapped area, a dataless skb is allocated and the data pointer is
set to point into the ring frame. This means frames will be delivered to
userspace in order of allocation instead of order of transmission. This
usually doesn't matter since the order is either not determinable by
userspace or message creation/transmission is serialized. The only case
where this can have a visible difference is nfnetlink_queue. Userspace
can't assume mmap'ed messages have ordered IDs anymore and needs to check
this if using batched verdicts.

For non-mapped sockets, nothing changes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:57:58 -04:00
Patrick McHardy
9652e931e7 netlink: add mmap'ed netlink helper functions
Add helper functions for looking up mmap'ed frame headers, reading and
writing their status, allocating skbs with mmap'ed data areas and a poll
function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:57:57 -04:00
Patrick McHardy
0ebd0ac5ff net: add function to allocate sk_buff head without data area
Add a function to allocate a sk_buff head without any data. This will
be used by memory mapped netlink to attach data from the mmaped area
to the skb.

Additionally change skb_release_all() to check whether the skb has a
data area to allow the skb destructor to clear the data pointer in case
only a head has been allocated.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:57:57 -04:00
Patrick McHardy
e32123e598 netlink: rename ssk to sk in struct netlink_skb_params
Memory mapped netlink needs to store the receiving userspace socket
when sending from the kernel to userspace. Rename 'ssk' to 'sk' to
avoid confusion.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:57:56 -04:00
Patrick McHardy
8ad227ff89 net: vlan: add 802.1ad support
Add support for 802.1ad VLAN devices. This mainly consists of checking for
ETH_P_8021AD in addition to ETH_P_8021Q in a couple of places and check
offloading capabilities based on the used protocol.

Configuration is done using "ip link":

# ip link add link eth0 eth0.1000 \
	type vlan proto 802.1ad id 1000
# ip link add link eth0.1000 eth0.1000.1000 \
	type vlan proto 802.1q id 1000

52:54:00:12:34:56 > 92:b1:54:28:e4:8c, ethertype 802.1Q (0x8100), length 106: vlan 1000, p 0, ethertype 802.1Q, vlan 1000, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto ICMP (1), length 84)
    20.1.0.2 > 20.1.0.1: ICMP echo request, id 3003, seq 8, length 64
92:b1:54:28:e4:8c > 52:54:00:12:34:56, ethertype 802.1Q-QinQ (0x88a8), length 106: vlan 1000, p 0, ethertype 802.1Q, vlan 1000, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 47944, offset 0, flags [none], proto ICMP (1), length 84)
    20.1.0.1 > 20.1.0.2: ICMP echo reply, id 3003, seq 8, length 64

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:46:06 -04:00
Patrick McHardy
86a9bad3ab net: vlan: add protocol argument to packet tagging functions
Add a protocol argument to the VLAN packet tagging functions. In case of HW
tagging, we need that protocol available in the ndo_start_xmit functions,
so it is stored in a new field in the skb. The new field fits into a hole
(on 64 bit) and doesn't increase the sks's size.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:46:06 -04:00
Patrick McHardy
1fd9b1fc31 net: vlan: prepare for 802.1ad support
Make the encapsulation protocol value a property of VLAN devices and change
the device lookup functions to take the protocol value into account.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:45:27 -04:00
Patrick McHardy
80d5c3689b net: vlan: prepare for 802.1ad VLAN filtering offload
Change the rx_{add,kill}_vid callbacks to take a protocol argument in
preparation of 802.1ad support. The protocol argument used so far is
always htons(ETH_P_8021Q).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:45:27 -04:00
Patrick McHardy
f646968f8f net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_*
Rename the hardware VLAN acceleration features to include "CTAG" to indicate
that they only support CTAGs. Follow up patches will introduce 802.1ad
server provider tagging (STAGs) and require the distinction for hardware not
supporting acclerating both.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:45:26 -04:00
John W. Linville
5a22483e5a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-04-18 15:01:30 -04:00
Linus Torvalds
0a82a8d132 Revert "block: add missing block_bio_complete() tracepoint"
This reverts commit 3a366e614d.

Wanlong Gao reports that it causes a kernel panic on his machine several
minutes after boot. Reverting it removes the panic.

Jens says:
 "It's not quite clear why that is yet, so I think we should just revert
  the commit for 3.9 final (which I'm assuming is pretty close).

  The wifi is crap at the LSF hotel, so sending this email instead of
  queueing up a revert and pull request."

Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Requested-by: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-18 09:00:26 -07:00
Yinghai Lu
55a20ee780 x86, kdump: Retore crashkernel= to allocate under 896M
Vivek found old kexec-tools does not work new kernel anymore.

So change back crashkernel= back to old behavoir, and add crashkernel_high=
to let user decide if buffer could be above 4G, and also new kexec-tools will
be needed.

-v2: let crashkernel=X override crashkernel_high=
    update description about _high will be ignored by crashkernel=X
-v3: update description about kernel-parameters.txt according to Vivek.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-4-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17 12:35:33 -07:00
Yinghai Lu
c729de8fce x86, kdump: Set crashkernel_low automatically
Chao said that kdump does does work well on his system on 3.8
without extra parameter, even iommu does not work with kdump.
And now have to append crashkernel_low=Y in first kernel to make
kdump work.

We have now modified crashkernel=X to allocate memory beyong 4G (if
available) and do not allocate low range for crashkernel if the user
does not specify that with crashkernel_low=Y.  This causes regression
if iommu is not enabled.  Without iommu, swiotlb needs to be setup in
first 4G and there is no low memory available to second kernel.

Set crashkernel_low automatically if the user does not specify that.

For system that does support IOMMU with kdump properly, user could
specify crashkernel_low=0 to save that 72M low ram.

-v3: add swiotlb_size() according to Konrad.
-v4: add comments what 8M is for according to hpa.
     also update more crashkernel_low= in kernel-parameters.txt
-v5: update changelog according to Vivek.
-v6: Change description about swiotlb referring according to HATAYAMA.

Reported-by: WANG Chao <chaowang@redhat.com>
Tested-by: WANG Chao <chaowang@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-2-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-17 12:35:32 -07:00
David S. Miller
92cf1f23cc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:

====================
A number of improvements for net-next/3.10.

Highlights include:

 * Properly exposing linux/openvswitch.h to userspace after the uapi
   changes.

 * Simplification of locking. It immediately makes things simpler to
   reason about and avoids holding RTNL mutex for longer than
   necessary. In the near future it will also enable tunnel
   registration and more fine-grained locking.

 * Miscellaneous cleanups and simplifications.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-17 13:30:32 -04:00
Linus Torvalds
fca83168aa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix erroneous netfilter drop of SIP packets generated by some Cisco
    phones, from Patrick McHardy.

 2) Fix netfilter IPSET refcounting in list_set_add(), from Jozsef
    Kadlecsik.

 3) Fix TCP syncookies route lookup key, we don't use the same values we
    would use for the usual SYN receive processing, from Dmitry Popov.

 4) Fix NULL deref in bond_slave_netdev_event(), from Nikolay
    Aleksandrov.

 5) When bonding enslave fails, we can forget to clear the IFF_BONDING
    bit, fix also from Nikolay Aleksandrov.

 6) skb->csum_start is 16-bits, which is almost always just fine.  But
    if we reallocate the headroom of an SKB this can push the
    skb->csum_start value outside of it's valid range.  This can easily
    happen when collapsing multiple SKBs from the retransmit queue
    together.

    Fix from Thomas Graf.

 7) Fix NULL deref in be2net driver due to missing check of
    __vlan_put_tag() return value, from Ivan Vecera.

 8) tun_set_iff() returns zero instead of error code on failure, fix
    from Wei Yongjun.

 9) Like GARP, 802 MRP needs to hold the app->lock when adding MAD
    events and queueing PDUs.  Fix from David Ward.

10) Build fix, MVMDIO needs PHYLIB, from Thomas Petazzoni..

11) Fix mac80211 static with ipv6 modular build, from Cong Wang.

12) If userland specifies a path cost explicitly, do not override it
    when the carrier state changes.  From Stephen Hemminger.

13) mvnets calculates the TX queue to use incorrectly resulting in
    garbage pointer derefs and crashes, fix from Willy Tarreau.

14) cdc_mbim does erroneous sizeof(ETH_HLEN).  Fix from Bjorn Mork.

15) IP fragmentation can leak a refcount-less route out from an RCU
    protected section.  This results in crashes and all sorts of hard to
    diagnose behavior.  Fix from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits)
  qlcnic: fix beaconing test for 82xx adapter
  net: drop dst before queueing fragments
  net: fec: fix regression in link change accounting
  net: cdc_mbim: remove bogus sizeof()
  drivers: net: ethernet: cpsw: get slave VLAN id from slave node instead of cpsw node
  net: mvneta: fix improper tx queue usage in mvneta_tx()
  esp4: fix error return code in esp_output()
  bridge: make user modified path cost sticky
  ipv6: statically link register_inet6addr_notifier()
  net: mvmdio: add select PHYLIB
  net/802/mrp: fix possible race condition when calling mrp_pdu_queue()
  tuntap: fix error return code in tun_set_iff()
  be2net: take care of __vlan_put_tag return value
  can: sja1000: fix handling on dt properties on little endian systems
  can: mcp251x: add missing IRQF_ONESHOT to request_threaded_irq
  netfilter: nf_nat: fix race when unloading protocol modules
  tcp: Reallocate headroom if it would overflow csum_start
  stmmac: prevent interrupt loop with MMC RX IPC Counter
  bonding: IFF_BONDING is not stripped on enslave failure
  bonding: fix netdev event NULL pointer dereference
  ...
2013-04-17 08:50:59 -07:00
Linus Torvalds
b4cbb197c7 vm: add vm_iomap_memory() helper function
Various drivers end up replicating the code to mmap() their memory
buffers into user space, and our core memory remapping function may be
very flexible but it is unnecessarily complicated for the common cases
to use.

Our internal VM uses pfn's ("page frame numbers") which simplifies
things for the VM, and allows us to pass physical addresses around in a
denser and more efficient format than passing a "phys_addr_t" around,
and having to shift it up and down by the page size.  But it just means
that drivers end up doing that shifting instead at the interface level.

It also means that drivers end up mucking around with internal VM things
like the vma details (vm_pgoff, vm_start/end) way more than they really
need to.

So this just exports a function to map a certain physical memory range
into user space (using a phys_addr_t based interface that is much more
natural for a driver) and hides all the complexity from the driver.
Some drivers will still end up tweaking the vm_page_prot details for
things like prefetching or cacheability etc, but that's actually
relevant to the driver, rather than caring about what the page offset of
the mapping is into the particular IO memory region.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-16 16:45:45 -07:00
Sascha Herrmann
43b5abe064 at86rf230: add irq type configuration option
Add option to at86rf230 platform data to configure the type of the
interrupt used by the driver. The irq polarity of the device will
be configured accordingly.

Signed-off-by: Sascha Herrmann <sascha@ps.nvbi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-16 16:34:07 -04:00
Johannes Berg
b2e506bfc4 mac80211: parse VHT channel switch IEs
VHT introduces multiple IEs that need to be parsed for a
wide bandwidth channel switch. Two are (currently) needed
in mac80211:
 * wide bandwidth channel switch element
 * channel switch wrapper element

The former is contained in the latter for beacons and probe
responses, but not for the spectrum management action frames
so the IE parser needs a new argument to differentiate them.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-04-16 15:29:45 +02:00
Johannes Berg
1b3a2e494b mac80211: handle extended channel switch announcement
Handle the (public) extended channel switch announcement
action frames. Parts of the data in these frames isn't
really in IEs, but put it into the elems struct anyway
to simplify the handling.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-04-16 15:29:45 +02:00