Commit graph

36499 commits

Author SHA1 Message Date
Antonio Quartulli
f316318157 batman-adv: deselect current GW on client mode switch off
When switching from gw_mode client to either off or server
the current selected gateway has to be deselected.
In this way when client mode is enabled again a gateway
re-election is forced and a GW_ADD event is consequently
sent.

The current behaviour instead is to keep the current gateway
leading to no GW_ADD event when gw_mode client is selected
for a second time

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2014-01-08 20:49:40 +01:00
Antonio Quartulli
ebf38fb7ab batman-adv: remove FSF address from GPL disclaimer
As suggested by checkpatch, remove all the references to the
FSF address since the kernel already has one reference in
its documentation.

In this way it is easier to update it in case of future
changes.

Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-01-08 20:49:39 +01:00
Antonio Quartulli
3fba7325bb batman-adv: don't switch byte order too often if not needed
If possible, operations like ntohs/ntohl should not be
performed too often. Use a variable to locally store the
converted value and then use it.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2014-01-08 20:49:39 +01:00
Antonio Quartulli
a48bcacdb3 batman-adv: properly rename define in distributed arp table header file
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2014-01-08 20:49:38 +01:00
John W. Linville
300e5fd160 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-01-08 13:44:29 -05:00
John W. Linville
2eff7c791a This is the first NFC fixes pull request for 3.13.
It only contains one fix for a regression introduced with commit
 e29a9e2ae1. Without this fix, we can not establish a p2p link in
 target mode. Only initiator mode works.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQIcBAABAgAGBQJSywyaAAoJEIqAPN1PVmxK3eMP/2XKZa2qzcMZfWZBLEMXQHce
 UO36BqZF0nre0ZUZQmaXzM5L0PM/UNxvhf2DsLx3s54/Tk/o0wHYSM/7GfX58dkX
 YAYSbG3viQM9L1dRhgfGQwRxXUcd8M85fLUH9SdJeUzCIgxXEDFnlpkeaKPE+UaM
 PgCHRXbW+cDje4DSO5JXVzIRFzsCtAwgd5bx6u/5bM5PLzxGwHJHkHe+lZ9b4Hbe
 ZLsqbU0HRy0rB8hmpro6NcrIoNHEXYupdd1gwslb88jdA+BDOUAZj7htcBBcC9Lw
 7D8RQTI6YNDsOzcLJyWmmfqTKd1j3RWPcTSuWkAGTvy04VOhGEc1LcgOOVLvhsLZ
 Hw412d0lYOiIwtjeIwS1etY42+f7tPOHOuhWFO3EQX0/1fIQ2H9V18DkM2qFPCWT
 L5GwV70YWbYfeRF1i3kGWKN15qLh9/toxB6rgE8eM2vCCGJXbt52yz9oSNM3QvZf
 2rHnzAHCEKGv4xS7oHhJSnkwH3Nd7WR6gjWatbfswrjtaDU2PKlsL703Uxk1gIPz
 o3itmJv/Ej1j6TqUpUz2Vs/sr4dltCDiD7tiaiv8zprHn0LZcNyp1/E+p31pMHBP
 8IW3BXxpBh3S8gpJ932LVAYe40ymQ0m1ZhssddbH1CC90jfs6m/HfaUC+GsgBX/x
 0iaxEH1jwwRIZC9p1Qso
 =YkKJ
 -----END PGP SIGNATURE-----

Merge tag 'nfc-fixes-3.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes

Samuel Ortiz <sameo@linux.intel.com> says:

"This is the first NFC fixes pull request for 3.13.

It only contains one fix for a regression introduced with commit
e29a9e2ae1. Without this fix, we can not establish a p2p link in
target mode. Only initiator mode works."

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-01-08 13:36:17 -05:00
Ying Xue
da7c224b1b net: xfrm: xfrm_policy: silence compiler warning
Fix below compiler warning:

net/xfrm/xfrm_policy.c:1644:12: warning: ‘xfrm_dst_alloc_copy’ defined but not used [-Wunused-function]

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 22:45:26 -05:00
Jon Paul Maloy
581465fa28 tipc: make link start event synchronous
When a link is created we delay the start event by launching it
to be executed later in a tasklet. As we hold all the
necessary locks at the moment of creation, and there is no risk
of deadlock or contention, this delay serves no purpose in the
current code.

We remove this obsolete indirection step, and the associated function
link_start(). At the same time, we rename the function tipc_link_stop()
to the more appropriate tipc_link_purge_queues().

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 18:44:26 -05:00
Ying Xue
f9a2c80b8b tipc: introduce new spinlock to protect struct link_req
Currently, only 'bearer_lock' is used to protect struct link_req in
the function disc_timeout(). This is unsafe, since the member fields
'num_nodes' and 'timer_intv' might be accessed by below three different
threads simultaneously, none of them grabbing bearer_lock in the
critical region:

link_activate()
  tipc_bearer_add_dest()
    tipc_disc_add_dest()
      req->num_nodes++;

tipc_link_reset()
  tipc_bearer_remove_dest()
    tipc_disc_remove_dest()
      req->num_nodes--
      disc_update()
        read req->num_nodes
	write req->timer_intv

disc_timeout()
  read req->num_nodes
  read/write req->timer_intv

Without lock protection, the only symptom of a race is that discovery
messages occasionally may not be sent out. This is not fatal, since such
messages are best-effort anyway. On the other hand, since discovery
messages are not time critical, adding a protecting lock brings no
serious overhead either. So we add a new, dedicated spinlock in
order to guarantee absolute data consistency in link_req objects.
This also helps reduce the overall role of the bearer_lock, which
we want to remove completely in a later commit series.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 18:44:25 -05:00
Jon Paul Maloy
b9d4c33935 tipc: remove 'has_redundant_link' flag from STATE link protocol messages
The flag 'has_redundant_link' is defined only in RESET and ACTIVATE
protocol messages. Due to an ambiguity in the protocol specification it
is currently also transferred in STATE messages. Its value is used to
initialize a link state variable, 'permit_changeover', which is used
to inhibit futile link failover attempts when it is known that the
peer node has no working links at the moment, although the local node
may still think it has one.

The fact that 'has_redundant_link' incorrectly is read from STATE
messages has the effect that 'permit_changeover' sometimes gets a wrong
value, and permanently blocks any links from being re-established. Such
failures can only occur in in dual-link systems, and are extremely rare.
This bug seems to have always been present in the code.

Furthermore, since commit b4b5610223
("tipc: Ensure both nodes recognize loss of contact between them"),
the 'permit_changeover' field serves no purpose any more. The task of
enforcing 'lost contact' cycles at both peer endpoints is now taken
by a new mechanism, using the flags WAIT_NODE_DOWN and WAIT_PEER_DOWN
in struct tipc_node to abort unnecessary failover attempts.

We therefore remove the 'has_redundant_link' flag from STATE messages,
as well as the now redundant 'permit_changeover' variable.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 18:44:25 -05:00
Jon Paul Maloy
170b3927b4 tipc: rename functions related to link failover and improve comments
The functionality related to link addition and failover is unnecessarily
hard to understand and maintain. We try to improve this by renaming
some of the functions, at the same time adding or improving the
explanatory comments around them. Names such as "tipc_rcv()" etc. also
align better with what is used in other networking components.

The changes in this commit are purely cosmetic, no functional changes
are made.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 18:44:25 -05:00
David S. Miller
a04c0e2c0d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains two patches:

* fix the IRC NAT helper which was broken when adding (incomplete) IPv6
  support, from Daniel Borkmann.

* Refine the previous bugtrap that Jesper added to catch problems for the
  usage of the sequence adjustment extension in IPVs in Dec 16th, it may
  spam messages in case of finding a real bug.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 18:38:17 -05:00
Daniel Borkmann
be7928d20b net: xfrm: xfrm_policy: fix inline not at beginning of declaration
Fix three warnings related to:

  net/xfrm/xfrm_policy.c:1644:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
  net/xfrm/xfrm_policy.c:1656:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
  net/xfrm/xfrm_policy.c:1668:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]

Just removing the inline keyword is sufficient as the compiler will
decide on its own about inlining or not.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 18:34:00 -05:00
Patrick McHardy
9638f33ecf netfilter: nft_ct: load both IPv4 and IPv6 conntrack modules for NFPROTO_INET
The ct expression can currently not be used in the inet family since
we don't have a conntrack module for NFPROTO_INET, so
nf_ct_l3proto_try_module_get() fails. Add some manual handling to
load the modules for both NFPROTO_IPV4 and NFPROTO_IPV6 if the
ct expression is used in the inet family.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:57:32 +01:00
Patrick McHardy
4566bf2706 netfilter: nft_meta: add l4proto support
For L3-proto independant rules we need to get at the L4 protocol value
directly. Add it to the nft_pktinfo struct and use the meta expression
to retrieve it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:57:31 +01:00
Patrick McHardy
124edfa9e0 netfilter: nf_tables: add nfproto support to meta expression
Needed by multi-family tables to distinguish IPv4 and IPv6 packets.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:57:30 +01:00
Patrick McHardy
1d49144c0a netfilter: nf_tables: add "inet" table for IPv4/IPv6
This patch adds a new table family and a new filter chain that you can
use to attach IPv4 and IPv6 rules. This should help to simplify
rule-set maintainance in dual-stack setups.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:57:25 +01:00
Patrick McHardy
115a60b173 netfilter: nf_tables: add support for multi family tables
Add support to register chains to multiple hooks for different address
families for mixed IPv4/IPv6 tables.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2014-01-07 23:55:46 +01:00
Patrick McHardy
c9484874e7 netfilter: nf_tables: add hook ops to struct nft_pktinfo
Multi-family tables need the AF from the hook ops. Add a pointer to the
hook ops and replace usage of the hooknum member in struct nft_pktinfo.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:50:43 +01:00
Patrick McHardy
3b088c4bc0 netfilter: nf_tables: make chain types override the default AF functions
Currently the AF-specific hook functions override the chain-type specific
hook functions. That doesn't make too much sense since the chain types
are a special case of the AF-specific hooks.

Make the AF-specific hook functions the default and make the optional
chain type hooks override them.

As a side effect, the necessary code restructuring reduces the code size,
f.i. in case of nf_tables_ipv4.o:

  nf_tables_ipv4_init_net   |  -24
  nft_do_chain_ipv4         | -113
 2 functions changed, 137 bytes removed, diff: -137

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:50:43 +01:00
Pablo Neira Ayuso
688d18636f netfilter: nft_reject: fix compilation warning if NF_TABLES_IPV6 is disabled
net/netfilter/nft_reject.c: In function 'nft_reject_eval':
net/netfilter/nft_reject.c:37:14: warning: unused variable 'net' [-Wunused-variable]

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:50:43 +01:00
Jerry Chu
bf5a755f5e net-gre-gro: Add GRE support to the GRO stack
This patch built on top of Commit 299603e837
("net-gro: Prepare GRO stack for the upcoming tunneling support") to add
the support of the standard GRE (RFC1701/RFC2784/RFC2890) to the GRO
stack. It also serves as an example for supporting other encapsulation
protocols in the GRO stack in the future.

The patch supports version 0 and all the flags (key, csum, seq#) but
will flush any pkt with the S (seq#) flag. This is because the S flag
is not support by GSO, and a GRO pkt may end up in the forwarding path,
thus requiring GSO support to break it up correctly.

Currently the "packet_offload" structure only contains L3 (ETH_P_IP/
ETH_P_IPV6) GRO offload support so the encapped pkts are limited to
IP pkts (i.e., w/o L2 hdr). But support for other protocol type can
be easily added, so is the support for GRE variations like NVGRE.

The patch also support csum offload. Specifically if the csum flag is on
and the h/w is capable of checksumming the payload (CHECKSUM_COMPLETE),
the code will take advantage of the csum computed by the h/w when
validating the GRE csum.

Note that commit 60769a5dcd "ipv4: gre:
add GRO capability" already introduces GRO capability to IPv4 GRE
tunnels, using the gro_cells infrastructure. But GRO is done after
GRE hdr has been removed (i.e., decapped). The following patch applies
GRO when pkts first come in (before hitting the GRE tunnel code). There
is some performance advantage for applying GRO as early as possible.
Also this approach is transparent to other subsystem like Open vSwitch
where GRE decap is handled outside of the IP stack hence making it
harder for the gro_cells stuff to apply. On the other hand, some NICs
are still not capable of hashing on the inner hdr of a GRE pkt (RSS).
In that case the GRO processing of pkts from the same remote host will
all happen on the same CPU and the performance may be suboptimal.

I'm including some rough preliminary performance numbers below. Note
that the performance will be highly dependent on traffic load, mix as
usual. Moreover it also depends on NIC offload features hence the
following is by no means a comprehesive study. Local testing and tuning
will be needed to decide the best setting.

All tests spawned 50 copies of netperf TCP_STREAM and ran for 30 secs.
(super_netperf 50 -H 192.168.1.18 -l 30)

An IP GRE tunnel with only the key flag on (e.g., ip tunnel add gre1
mode gre local 10.246.17.18 remote 10.246.17.17 ttl 255 key 123)
is configured.

The GRO support for pkts AFTER decap are controlled through the device
feature of the GRE device (e.g., ethtool -K gre1 gro on/off).

1.1 ethtool -K gre1 gro off; ethtool -K eth0 gro off
thruput: 9.16Gbps
CPU utilization: 19%

1.2 ethtool -K gre1 gro on; ethtool -K eth0 gro off
thruput: 5.9Gbps
CPU utilization: 15%

1.3 ethtool -K gre1 gro off; ethtool -K eth0 gro on
thruput: 9.26Gbps
CPU utilization: 12-13%

1.4 ethtool -K gre1 gro on; ethtool -K eth0 gro on
thruput: 9.26Gbps
CPU utilization: 10%

The following tests were performed on a different NIC that is capable of
csum offload. I.e., the h/w is capable of computing IP payload csum
(CHECKSUM_COMPLETE).

2.1 ethtool -K gre1 gro on (hence will use gro_cells)

2.1.1 ethtool -K eth0 gro off; csum offload disabled
thruput: 8.53Gbps
CPU utilization: 9%

2.1.2 ethtool -K eth0 gro off; csum offload enabled
thruput: 8.97Gbps
CPU utilization: 7-8%

2.1.3 ethtool -K eth0 gro on; csum offload disabled
thruput: 8.83Gbps
CPU utilization: 5-6%

2.1.4 ethtool -K eth0 gro on; csum offload enabled
thruput: 8.98Gbps
CPU utilization: 5%

2.2 ethtool -K gre1 gro off

2.2.1 ethtool -K eth0 gro off; csum offload disabled
thruput: 5.93Gbps
CPU utilization: 9%

2.2.2 ethtool -K eth0 gro off; csum offload enabled
thruput: 5.62Gbps
CPU utilization: 8%

2.2.3 ethtool -K eth0 gro on; csum offload disabled
thruput: 7.69Gbps
CPU utilization: 8%

2.2.4 ethtool -K eth0 gro on; csum offload enabled
thruput: 8.96Gbps
CPU utilization: 5-6%

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 16:21:31 -05:00
Benjamin Poirier
cdb3f4a31b net: Do not enable tx-nocache-copy by default
There are many cases where this feature does not improve performance or even
reduces it.

For example, here are the results from tests that I've run using 3.12.6 on one
Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
from the Xeon, but they're similar on the i7. All numbers report the
mean±stddev over 10 runs of 10s.

1) latency tests similar to what is described in "c6e1a0d net: Allow no-cache
copy from user on transmit"
There is no statistically significant difference between tx-nocache-copy
on/off.
nic irqs spread out (one queue per cpu)

200x netperf -r 1400,1
tx-nocache-copy off
        692000±1000 tps
        50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
tx-nocache-copy on
        693000±1000 tps
        50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7

200x netperf -r 14000,14000
tx-nocache-copy off
        86450±80 tps
        50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
tx-nocache-copy on
        86110±60 tps
        50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20

2) single stream throughput tests
tx-nocache-copy leads to higher service demand

                        throughput  cpu0        cpu1        demand
                        (Gb/s)      (Gcycle)    (Gcycle)    (cycle/B)

nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)

tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004

nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)

tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002

As a second example, here are some results from Eric Dumazet with latest
net-next.
tx-nocache-copy also leads to higher service demand

(cpu is Intel(R) Xeon(R) CPU X5660  @ 2.80GHz)

lpq83:~# ./ethtool -K eth0 tx-nocache-copy on
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9407.44   2.50     -1.00    0.522   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       4282.648396 task-clock                #    0.423 CPUs utilized
             9,348 context-switches          #    0.002 M/sec
                88 CPU-migrations            #    0.021 K/sec
               355 page-faults               #    0.083 K/sec
    11,812,797,651 cycles                    #    2.758 GHz                     [82.79%]
     9,020,522,817 stalled-cycles-frontend   #   76.36% frontend cycles idle    [82.54%]
     4,579,889,681 stalled-cycles-backend    #   38.77% backend  cycles idle    [67.33%]
     6,053,172,792 instructions              #    0.51  insns per cycle
                                             #    1.49  stalled cycles per insn [83.64%]
       597,275,583 branches                  #  139.464 M/sec                   [83.70%]
         8,960,541 branch-misses             #    1.50% of all branches         [83.65%]

      10.128990264 seconds time elapsed

lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9412.45   2.15     -1.00    0.449   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       2847.375441 task-clock                #    0.281 CPUs utilized
            11,632 context-switches          #    0.004 M/sec
                49 CPU-migrations            #    0.017 K/sec
               354 page-faults               #    0.124 K/sec
     7,646,889,749 cycles                    #    2.686 GHz                     [83.34%]
     6,115,050,032 stalled-cycles-frontend   #   79.97% frontend cycles idle    [83.31%]
     1,726,460,071 stalled-cycles-backend    #   22.58% backend  cycles idle    [66.55%]
     2,079,702,453 instructions              #    0.27  insns per cycle
                                             #    2.94  stalled cycles per insn [83.22%]
       363,773,213 branches                  #  127.757 M/sec                   [83.29%]
         4,242,732 branch-misses             #    1.17% of all branches         [83.51%]

      10.128449949 seconds time elapsed

CC: Tom Herbert <therbert@google.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 16:20:19 -05:00
Erik Hugne
732256b933 tipc: correctly unlink packets from deferred packet queue
When we pull a received packet from a link's 'deferred packets' queue
for processing, its 'next' pointer is not cleared, and still refers to
the next packet in that queue, if any. This is incorrect, but caused
no harm before commit 40ba3cdf54 ("tipc:
message reassembly using fragment chain") was introduced. After that
commit, it may sometimes lead to the following oops:

general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: tipc
CPU: 4 PID: 0 Comm: swapper/4 Tainted: G        W 3.13.0-rc2+ #6
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
task: ffff880017af4880 ti: ffff880017aee000 task.ti: ffff880017aee000
RIP: 0010:[<ffffffff81710694>]  [<ffffffff81710694>] skb_try_coalesce+0x44/0x3d0
RSP: 0018:ffff880016603a78  EFLAGS: 00010212
RAX: 6b6b6b6bd6d6d6d6 RBX: ffff880013106ac0 RCX: ffff880016603ad0
RDX: ffff880016603ad7 RSI: ffff88001223ed00 RDI: ffff880013106ac0
RBP: ffff880016603ab8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88001223ed00
R13: ffff880016603ad0 R14: 000000000000058c R15: ffff880012297650
FS:  0000000000000000(0000) GS:ffff880016600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000805b000 CR3: 0000000011f5d000 CR4: 00000000000006e0
Stack:
 ffff880016603a88 ffffffff810a38ed ffff880016603aa8 ffff88001223ed00
 0000000000000001 ffff880012297648 ffff880016603b68 ffff880012297650
 ffff880016603b08 ffffffffa0006c51 ffff880016603b08 00ffffffa00005fc
Call Trace:
 <IRQ>
 [<ffffffff810a38ed>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffffa0006c51>] tipc_link_recv_fragment+0xd1/0x1b0 [tipc]
 [<ffffffffa0007214>] tipc_recv_msg+0x4e4/0x920 [tipc]
 [<ffffffffa00016f0>] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
 [<ffffffffa000177c>] tipc_l2_rcv_msg+0xcc/0x250 [tipc]
 [<ffffffffa00016f0>] ? tipc_l2_rcv_msg+0x40/0x250 [tipc]
 [<ffffffff8171e65b>] __netif_receive_skb_core+0x80b/0xd00
 [<ffffffff8171df94>] ? __netif_receive_skb_core+0x144/0xd00
 [<ffffffff8171eb76>] __netif_receive_skb+0x26/0x70
 [<ffffffff8171ed6d>] netif_receive_skb+0x2d/0x200
 [<ffffffff8171fe70>] napi_gro_receive+0xb0/0x130
 [<ffffffff815647c2>] e1000_clean_rx_irq+0x2c2/0x530
 [<ffffffff81565986>] e1000_clean+0x266/0x9c0
 [<ffffffff81985f7b>] ? notifier_call_chain+0x2b/0x160
 [<ffffffff8171f971>] net_rx_action+0x141/0x310
 [<ffffffff81051c1b>] __do_softirq+0xeb/0x480
 [<ffffffff819817bb>] ? _raw_spin_unlock+0x2b/0x40
 [<ffffffff810b8c42>] ? handle_fasteoi_irq+0x72/0x100
 [<ffffffff81052346>] irq_exit+0x96/0xc0
 [<ffffffff8198cbc3>] do_IRQ+0x63/0xe0
 [<ffffffff81981def>] common_interrupt+0x6f/0x6f
 <EOI>

This happens when the last fragment of a message has passed through the
the receiving link's 'deferred packets' queue, and at least one other
packet was added to that queue while it was there. After the fragment
chain with the complete message has been successfully delivered to the
receiving socket, it is released. Since 'next' pointer of the last
fragment in the released chain now is non-NULL, we get the crash shown
above.

We fix this by clearing the 'next' pointer of all received packets,
including those being pulled from the 'deferred' queue, before they
undergo any further processing.

Fixes: 40ba3cdf54 ("tipc: message reassembly using fragment chain")
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reported-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 16:15:24 -05:00
J. Bruce Fields
bba0f88bf7 minor svcauth_gss.c cleanup 2014-01-07 16:01:16 -05:00
Jiri Pirko
dfd1582d1e ipv4: loopback device: ignore value changes after device is upped
When lo is brought up, new ifa is created. Then, devconf and neigh values
bitfield should be set so later changes of default values would not
affect lo values.

Note that the same behaviour is in ipv6. Also note that this is likely
not an issue in many distros (for example Fedora 19) because userspace
sets address to lo manually before bringing it up.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 15:55:17 -05:00
FX Le Bail
509aba3b0d IPv6: add the option to use anycast addresses as source addresses in echo reply
This change allows to follow a recommandation of RFC4942.

- Add "anycast_src_echo_reply" sysctl to control the use of anycast addresses
  as source addresses for ICMPv6 echo reply. This sysctl is false by default
  to preserve existing behavior.
- Add inline check ipv6_anycast_destination().
- Use them in icmpv6_echo_reply().

Reference:
RFC4942 - IPv6 Transition/Coexistence Security Considerations
   (http://tools.ietf.org/html/rfc4942#section-2.1.6)

2.1.6. Anycast Traffic Identification and Security

   [...]
   To avoid exposing knowledge about the internal structure of the
   network, it is recommended that anycast servers now take advantage of
   the ability to return responses with the anycast address as the
   source address if possible.

Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 15:51:39 -05:00
Li RongQing
657e5d1965 ipv6: pcpu_tstats.syncp should be initialised in ip6_vti.c
initialise pcpu_tstats.syncp to kill the calltrace
[   11.973950] Call Trace:
[   11.973950]  [<819bbaff>] dump_stack+0x48/0x60
[   11.973950]  [<819bbaff>] dump_stack+0x48/0x60
[   11.973950]  [<81078dcf>] __lock_acquire.isra.22+0x1bf/0xc10
[   11.973950]  [<81078dcf>] __lock_acquire.isra.22+0x1bf/0xc10
[   11.973950]  [<81079fa7>] lock_acquire+0x77/0xa0
[   11.973950]  [<81079fa7>] lock_acquire+0x77/0xa0
[   11.973950]  [<817ca7ab>] ? dev_get_stats+0xcb/0x130
[   11.973950]  [<817ca7ab>] ? dev_get_stats+0xcb/0x130
[   11.973950]  [<8183862d>] ip_tunnel_get_stats64+0x6d/0x230
[   11.973950]  [<8183862d>] ip_tunnel_get_stats64+0x6d/0x230
[   11.973950]  [<817ca7ab>] ? dev_get_stats+0xcb/0x130
[   11.973950]  [<817ca7ab>] ? dev_get_stats+0xcb/0x130
[   11.973950]  [<811cf8c1>] ? __nla_reserve+0x21/0xd0
[   11.973950]  [<811cf8c1>] ? __nla_reserve+0x21/0xd0
[   11.973950]  [<817ca7ab>] dev_get_stats+0xcb/0x130
[   11.973950]  [<817ca7ab>] dev_get_stats+0xcb/0x130
[   11.973950]  [<817d5409>] rtnl_fill_ifinfo+0x569/0xe20
[   11.973950]  [<817d5409>] rtnl_fill_ifinfo+0x569/0xe20
[   11.973950]  [<810352e0>] ? kvm_clock_read+0x20/0x30
[   11.973950]  [<810352e0>] ? kvm_clock_read+0x20/0x30
[   11.973950]  [<81008e38>] ? sched_clock+0x8/0x10
[   11.973950]  [<81008e38>] ? sched_clock+0x8/0x10
[   11.973950]  [<8106ba45>] ? sched_clock_local+0x25/0x170
[   11.973950]  [<8106ba45>] ? sched_clock_local+0x25/0x170
[   11.973950]  [<810da6bd>] ? __kmalloc+0x3d/0x90
[   11.973950]  [<810da6bd>] ? __kmalloc+0x3d/0x90
[   11.973950]  [<817b8c10>] ? __kmalloc_reserve.isra.41+0x20/0x70
[   11.973950]  [<817b8c10>] ? __kmalloc_reserve.isra.41+0x20/0x70
[   11.973950]  [<810da81a>] ? slob_alloc_node+0x2a/0x60
[   11.973950]  [<810da81a>] ? slob_alloc_node+0x2a/0x60
[   11.973950]  [<817b919a>] ? __alloc_skb+0x6a/0x2b0
[   11.973950]  [<817b919a>] ? __alloc_skb+0x6a/0x2b0
[   11.973950]  [<817d8795>] rtmsg_ifinfo+0x65/0xe0
[   11.973950]  [<817d8795>] rtmsg_ifinfo+0x65/0xe0
[   11.973950]  [<817cbd31>] register_netdevice+0x531/0x5a0
[   11.973950]  [<817cbd31>] register_netdevice+0x531/0x5a0
[   11.973950]  [<81892b87>] ? ip6_tnl_get_cap+0x27/0x90
[   11.973950]  [<81892b87>] ? ip6_tnl_get_cap+0x27/0x90
[   11.973950]  [<817cbdb6>] register_netdev+0x16/0x30
[   11.973950]  [<817cbdb6>] register_netdev+0x16/0x30
[   11.973950]  [<81f574a6>] vti6_init_net+0x1c4/0x1d4
[   11.973950]  [<81f574a6>] vti6_init_net+0x1c4/0x1d4
[   11.973950]  [<81f573af>] ? vti6_init_net+0xcd/0x1d4
[   11.973950]  [<81f573af>] ? vti6_init_net+0xcd/0x1d4
[   11.973950]  [<817c16df>] ops_init.constprop.11+0x17f/0x1c0
[   11.973950]  [<817c16df>] ops_init.constprop.11+0x17f/0x1c0
[   11.973950]  [<817c1779>] register_pernet_operations.isra.9+0x59/0x90
[   11.973950]  [<817c1779>] register_pernet_operations.isra.9+0x59/0x90
[   11.973950]  [<817c18d1>] register_pernet_device+0x21/0x60
[   11.973950]  [<817c18d1>] register_pernet_device+0x21/0x60
[   11.973950]  [<81f574b6>] ? vti6_init_net+0x1d4/0x1d4
[   11.973950]  [<81f574b6>] ? vti6_init_net+0x1d4/0x1d4
[   11.973950]  [<81f574c7>] vti6_tunnel_init+0x11/0x68
[   11.973950]  [<81f574c7>] vti6_tunnel_init+0x11/0x68
[   11.973950]  [<81f572a1>] ? mip6_init+0x73/0xb4
[   11.973950]  [<81f572a1>] ? mip6_init+0x73/0xb4
[   11.973950]  [<81f0cba4>] do_one_initcall+0xbb/0x15b
[   11.973950]  [<81f0cba4>] do_one_initcall+0xbb/0x15b
[   11.973950]  [<811a00d8>] ? sha_transform+0x528/0x1150
[   11.973950]  [<811a00d8>] ? sha_transform+0x528/0x1150
[   11.973950]  [<81f0c544>] ? repair_env_string+0x12/0x51
[   11.973950]  [<81f0c544>] ? repair_env_string+0x12/0x51
[   11.973950]  [<8105c30d>] ? parse_args+0x2ad/0x440
[   11.973950]  [<8105c30d>] ? parse_args+0x2ad/0x440
[   11.973950]  [<810546be>] ? __usermodehelper_set_disable_depth+0x3e/0x50
[   11.973950]  [<810546be>] ? __usermodehelper_set_disable_depth+0x3e/0x50
[   11.973950]  [<81f0cd27>] kernel_init_freeable+0xe3/0x182
[   11.973950]  [<81f0cd27>] kernel_init_freeable+0xe3/0x182
[   11.973950]  [<81f0c532>] ? do_early_param+0x7a/0x7a
[   11.973950]  [<81f0c532>] ? do_early_param+0x7a/0x7a
[   11.973950]  [<819b5b1b>] kernel_init+0xb/0x100
[   11.973950]  [<819b5b1b>] kernel_init+0xb/0x100
[   11.973950]  [<819cebf7>] ret_from_kernel_thread+0x1b/0x28
[   11.973950]  [<819cebf7>] ret_from_kernel_thread+0x1b/0x28
[   11.973950]  [<819b5b10>] ? rest_init+0xc0/0xc0
[   11.973950]  [<819b5b10>] ? rest_init+0xc0/0xc0

Before 469bdcefdc ("ipv6: fix the use of pcpu_tstats in ip6_vti.c"),
the pcpu_tstats.syncp is not used to pretect the 64bit elements of
pcpu_tstats, so not appear this calltrace.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 14:12:46 -05:00
Thierry Escande
b711ad524b NFC: digital: Set rf tech and crc functions when receiving a PSL_REQ
This patch sets the correct rf tech value and crc functions in target
mode when receiving a PSL_REQ, as done when receiving an ATR_REQ.

Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-01-07 18:48:12 +01:00
Thierry Escande
48e1044515 NFC: digital: Set current target active on activate_target() call
The curr_protocol field of nfc_digital_dev structure used to determine
if a target is currently active was set too soon, immediately when a
target is found. This is not good since there is no other way than
deactivate_target() to reset curr_protocol and if activate_target() is
not called, the target remains active and it's not possible to put the
device in poll mode anymore.

With this patch curr_protocol is set when nfc core activates a target,
puts a device up, or when an ATR_REQ is received in target mode.

Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-01-07 18:48:12 +01:00
Heikki Krogerus
2111cad655 net: rfkill: gpio: convert to descriptor-based GPIO interface
Convert to the safer gpiod_* family of API functions.

Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-01-07 18:45:20 +01:00
Emmanuel Grumbach
349b196044 mac80211: allow to set smps mode to OFF in AP mode
In managed mode, we should not ask for OFF mode because the
power settings may still require DYNAMIC. In AP mode, this
should be allowed since the default settings is OFF and
AUTOMATIC is not allowed.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-01-07 16:25:49 +01:00
Johannes Berg
3c2723f503 mac80211: clean up prepare_for_handlers() return value
Using an int with 0/1 is not very common, make the function
return a bool instead with the same values (false/true).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-01-07 16:23:24 +01:00
Emmanuel Grumbach
40791942ec mac80211: simplify code in ieee80211_prepare_and_rx_handle
No need to assign the return value of prepare_for_handlers
to a variable if the only usage is to test it.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-01-07 16:22:15 +01:00
Emmanuel Grumbach
87ee475ef6 mac80211: clean up garbage in comment
Not clear how this landed here.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-01-07 16:21:56 +01:00
Masanari Iida
8faaaead62 treewide: fix comments and printk msgs
This patch fixed several typo in printk from various
part of kernel source.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-01-07 15:06:07 +01:00
Claudio Takahasi
e825eb1d7e Bluetooth: Fix 6loWPAN peer lookup
This patch fixes peer address lookup for 6loWPAN over Bluetooth Low
Energy links.

ADDR_LE_DEV_PUBLIC, and ADDR_LE_DEV_RANDOM are the values allowed for
"dst_type" field in the hci_conn struct for LE links.

Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2014-01-07 11:32:15 -02:00
Claudio Takahasi
b071a62099 Bluetooth: Fix setting Universal/Local bit
This patch fixes the Bluetooth Low Energy Address type checking when
setting Universal/Local bit for the 6loWPAN network device or for the
peer device connection.

ADDR_LE_DEV_PUBLIC or ADDR_LE_DEV_RANDOM are the values allowed for
"src_type" and "dst_type" in the hci_conn struct. The Bluetooth link
type can be obtainned reading the "type" field in the same struct.

Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2014-01-07 11:32:11 -02:00
Eric Dumazet
438e38fadc gre_offload: statically build GRE offloading support
GRO/GSO layers can be enabled on a node, even if said
node is only forwarding packets.

This patch permits GSO (and upcoming GRO) support for GRE
encapsulated packets, even if the host has no GRE tunnel setup.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 20:28:34 -05:00
David S. Miller
39b6b2992f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:

====================
[GIT net-next] Open vSwitch

Open vSwitch changes for net-next/3.14. Highlights are:
 * Performance improvements in the mechanism to get packets to userspace
   using memory mapped netlink and skb zero copy where appropriate.
 * Per-cpu flow stats in situations where flows are likely to be shared
   across CPUs. Standard flow stats are used in other situations to save
   memory and allocation time.
 * A handful of code cleanups and rationalization.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 19:48:38 -05:00
Amitkumar Karwar
22c15bf30b NFC: NCI: Add set_config API
This API can be used by drivers to send their custom
configuration using SET_CONFIG NCI command to the device.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-01-07 01:32:40 +01:00
Amitkumar Karwar
86e8586ed5 NFC: NCI: Add setup handler
Some drivers require special configuration while initializing.
This patch adds setup handler for this custom configuration.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-01-07 01:32:40 +01:00
Amitkumar Karwar
1907299867 NFC: NCI: Don't reverse local general bytes
Local general bytes returned by nfc_get_local_general_bytes()
are already in correct order. We don't need to reverse them.

Remove local_gb[] local array as it's not needed any more.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-01-07 01:32:40 +01:00
Stephen Hemminger
443cd88c8a ovs: make functions local
Several functions and datastructures could be local
Found with 'make namespacecheck'

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-01-06 15:54:39 -08:00
Thomas Graf
09c5e6054e openvswitch: Compute checksum in skb_gso_segment() if needed
The copy & csum optimization is no longer present with zerocopy
enabled. Compute the checksum in skb_gso_segment() directly by
dropping the HW CSUM capability from the features passed in.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-01-06 15:53:24 -08:00
Thomas Graf
bda56f143c openvswitch: Use skb_zerocopy() for upcall
Use of skb_zerocopy() can avoid the expensive call to memcpy()
when copying the packet data into the Netlink skb. Completes
checksum through skb_checksum_help() if not already done in
GSO segmentation.

Zerocopy is only performed if user space supported unaligned
Netlink messages. memory mapped netlink i/o is preferred over
zerocopy if it is set up.

Cost of upcall is significantly reduced from:
+   7.48%       vhost-8471  [k] memcpy
+   5.57%     ovs-vswitchd  [k] memcpy
+   2.81%       vhost-8471  [k] csum_partial_copy_generic

to:
+   5.72%     ovs-vswitchd  [k] memcpy
+   3.32%       vhost-5153  [k] memcpy
+   0.68%       vhost-5153  [k] skb_zerocopy

(megaflows disabled)

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-01-06 15:53:17 -08:00
Thomas Graf
8055a89cfa openvswitch: Pass datapath into userspace queue functions
Allows removing the net and dp_ifindex argument and simplify the
code.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-01-06 15:53:07 -08:00
Thomas Graf
44da5ae5fb openvswitch: Drop user features if old user space attempted to create datapath
Drop user features if an outdated user space instance that does not
understand the concept of user_features attempted to create a new
datapath.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-01-06 15:53:00 -08:00
Thomas Graf
43d4be9cb5 openvswitch: Allow user space to announce ability to accept unaligned Netlink messages
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-01-06 15:52:53 -08:00
Thomas Graf
af2806f8f9 net: Export skb_zerocopy() to zerocopy from one skb to another
Make the skb zerocopy logic written for nfnetlink queue available for
use by other modules.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-01-06 15:52:42 -08:00