openslx/kernel-qcow2-linux.git - In-kernel qcow2 (Kernel part)

	Commit message (Collapse)	Author	Age	Files	Lines
*	xfrm: remove eth_proto value from xfrm_state_afinfo	Florian Westphal	2019-06-06	4	-20/+14
\| \| \| \| \| \| \| \| \| \| \| \|	xfrm_prepare_input needs to lookup the state afinfo backend again to fetch the address family ethernet protocol value. There are only two address families, so a switch statement is simpler. While at it, use u8 for family and proto and remove the owner member -- its not used anywhere. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
*	xfrm: remove state and template sort indirections from xfrm_state_afinfo	Florian Westphal	2019-06-06	4	-137/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No module dependency, placing this in xfrm_state.c avoids need for an indirection. This also removes the state spinlock -- I don't see why we would need to hold it during sorting. This in turn allows to remove the 'net' argument passed to xfrm_tmpl_sort. Last, remove the EXPORT_SYMBOL, there are no modular callers. For the CONFIG_IPV6=m case, vmlinux size increase is about 300 byte. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
*	xfrm: remove init_flags indirection from xfrm_state_afinfo	Florian Westphal	2019-06-05	3	-23/+3
\| \| \| \| \| \| \|	There is only one implementation of this function; just call it directly. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
*	xfrm: remove init_temprop indirection from xfrm_state_afinfo	Florian Westphal	2019-06-05	4	-43/+20
\| \| \| \| \| \| \| \|	same as previous patch: just place this in the caller, no need to have an indirection for a structure initialization. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
*	xfrm: remove init_tempsel indirection from xfrm_state_afinfo	Florian Westphal	2019-06-05	4	-49/+49
\| \| \| \| \| \| \|	Simple initialization, handle it in the caller. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
*	Merge branch 'net-dsa-mv88e6xxx-support-for-mv88e6250'	David S. Miller	2019-06-05	12	-6/+333
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rasmus Villemoes says: ==================== net: dsa: mv88e6xxx: support for mv88e6250 This adds support for the mv88e6250 chip. Initially based on the mv88e6240, this time around, I've been through each ->ops callback and checked that it makes sense, either replacing with a 6250 specific variant or dropping it if no equivalent functionality seems to exist for the 6250. Along the way, I found a few oddities in the existing code, mostly sent as separate patches/questions. The one relevant to the 6250 is the ieee_pri_map callback, where the existing mv88e6085_g1_ieee_pri_map() is actually wrong for many of the existing users. I've put the mv88e6250_g1_ieee_pri_map() patch first in case some of the existing chips get switched over to use that and it is deemed important enough for -stable. v4: - fix style issue in 1/10 - add Andrew's reviewed-by to 1,6,7,8,9,10. v3: - rebase on top of net-next/master - add reviewed-bys to patches unchanged from v2 (2,3,4,5) - add 6250-specific ->ieee_pri_map, ->port_set_speed, ->port_link_state (1,6,7) - in addition, use mv88e6065_phylink_validate for ->phylink_validate, and don't implement ->port_get_cmode, ->port_set_jumbo_size, ->port_disable_learn_limit, ->rmu_disable - drop ptp support - add patch adding the compatible string to the DT binding (9) - add small refactoring patch (10) v2: - rebase on top of net-next/master - add reviewed-by to two patches unchanged from v1 (2,3) - add separate watchdog_ops ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: refactor mv88e6352_g1_reset	Rasmus Villemoes	2019-06-05	1	-13/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new mv88e6250_g1_reset() is identical to mv88e6352_g1_reset() except for the call of mv88e6352_g1_wait_ppu_polling(), so refactor the 6352 version in term of the 6250 one. No functional change. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	dt-bindings: net: dsa: marvell: add "marvell,mv88e6250" compatible string	Rasmus Villemoes	2019-06-05	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mv88e6250 has port_base_addr 0x8 or 0x18 (depending on configuration pins), so it constitutes a new family and hence needs its own compatible string. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: add support for mv88e6250	Rasmus Villemoes	2019-06-05	5	-0/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for the Marvell 88E6250. I've checked that each member in the ops-structure makes sense, and basic switchdev functionality works fine. It uses the new dual_chip option, and since its port registers start at SMI address 0x08 or 0x18 (i.e., always sw_addr + 0x08), we need to introduce a new compatible string in order for the auto-identification in mv88e6xxx_detect() to work. The chip has four per port 16-bits statistics registers, two of which correspond to the existing "sw_in_filtered" and "sw_out_filtered" (but at offsets 0x13 and 0x10 rather than 0x12 and 0x13, because why should this be easy...). Wiring up those four statistics seems to require introducing a STATS_TYPE_PORT_6250 bit or similar, which seems a tad ugly, so for now this just allows access to the STATS_TYPE_BANK0 ones. The chip does have ptp support, and the existing mv88e6352_{gpio,avb,ptp}_ops at first glance seem like they would work out-of-the-box, but for simplicity (and lack of testing) I'm eliding this. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: implement port_link_state for mv88e6250	Rasmus Villemoes	2019-06-05	2	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mv88e6250 has a rather different way of reporting the link, speed and duplex status. A simple difference is that the link bit is bit 12 rather than bit 11 of the port status register. It gets more complicated for speed and duplex, which do not have separate fields. Instead, there's a four-bit PortMode field, and decoding that depends on whether it's a phy or mii port. For the phy ports, only four of the 16 values have defined meaning; the rest are called "reserved", so returning {SPEED,DUPLEX}_UNKNOWN seems reasonable. For the mii ports, most possible values are documented (0x3 and 0x5 are reserved), but I'm unable to make sense of them all. Since the bits simply reflect the Px_MODE[3:0] configuration pins, just support the subset that I'm certain about. Support for other setups can be added later. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: implement port_set_speed for mv88e6250	Rasmus Villemoes	2019-06-05	2	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The data sheet also mentions the possibility of selecting 200 Mbps for the MII ports (ports 5 and 6) by setting the ForceSpd field to 0x2 (aka MV88E6065_PORT_MAC_CTL_SPEED_200). However, there's a note that "actual speed is determined by bit 8 above", and flipping back a page, one finds that bits 13:8 are reserved... So without further information on what bit 8 means, let's stick to supporting just 10 and 100 Mbps on all ports. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: implement watchdog_ops for mv88e6250	Rasmus Villemoes	2019-06-05	2	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MV88E6352_G2_WDOG_CTL_* bits almost, but not quite, describe the watchdog control register on the mv88e6250. Among those actually referenced in the code, only QC_ENABLE differs (bit 6 rather than bit 5). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: implement vtu_getnext and vtu_loadpurge for mv88e6250	Rasmus Villemoes	2019-06-05	2	-0/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are almost identical to the 6185 variants, but have fewer bits for the FID. Bit 10 of the VTU_OP register (offset 0x05) is the VidPolicy bit, which one should probably preserve in mv88e6xxx_g1_vtu_op(), instead of always writing a 0. However, on the 6352 family, that bit is located at bit 12 in the VTU FID register (offset 0x02), and is always unconditionally cleared by the mv88e6xxx_g1_vtu_fid_write() function. Since nothing in the existing driver seems to know or care about that bit, it seems reasonable to not add the boilerplate to preserve it for the 6250 (which would require adding a chip-specific vtu_op function, or adding chip-quirks to the existing one). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: prepare mv88e6xxx_g1_atu_op() for the mv88e6250	Rasmus Villemoes	2019-06-05	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the currently supported chips have .num_databases either 256 or 4096, so this patch does not change behaviour for any of those. The mv88e6250, however, has .num_databases == 64, and it does not put the upper two bits in ATU control 13:12, but rather in ATU Operation 9:8. So change the logic to prepare for supporting mv88e6250. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: introduce support for two chips using direct smi addressing	Rasmus Villemoes	2019-06-05	2	-1/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 88e6250 (as well as 6220, 6071, 6070, 6020) do not support multi-chip (indirect) addressing. However, one can still have two of them on the same mdio bus, since the device only uses 16 of the 32 possible addresses, either addresses 0x00-0x0F or 0x10-0x1F depending on the ADDR4 pin at reset [since ADDR4 is internally pulled high, the latter is the default]. In order to prepare for supporting the 88e6250 and friends, introduce mv88e6xxx_info::dual_chip to allow having a non-zero sw_addr while still using direct addressing. Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: dsa: mv88e6xxx: add mv88e6250_g1_ieee_pri_map	Rasmus Villemoes	2019-06-05	2	-0/+8
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Quite a few of the existing supported chips that use mv88e6085_g1_ieee_pri_map as ->ieee_pri_map (including, incidentally, mv88e6085 itself) actually have a reset value of 0xfa50 in the G1_IEEE_PRI register. The data sheet for the mv88e6095, however, does describe a reset value of 0xfa41. So rather than changing the value in the existing callback, introduce a new variant with the 0xfa50 value. That will be used by the upcoming mv88e6250, and existing chips can be switched over one by one, preferably double-checking both the data sheet and actual hardware in each case - if anybody actually feels this is important enough to care. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	vmxnet3: turn off lro when rxcsum is disabled	Ronak Doshi	2019-06-05	3	-2/+16
\| \| \| \| \| \| \| \| \| \| \|	Currently, when rx csum is disabled, vmxnet3 driver does not turn off lro, which can cause performance issues if user does not turn off lro explicitly. This patch adds fix_features support which is used to turn off LRO whenever RXCSUM is disabled. Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Rishi Mehta <rmehta@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'net-add-struct-nexthop-to-fib-info'	David S. Miller	2019-06-05	20	-175/+698
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	David Ahern says: ==================== net: add struct nexthop to fib{6}_info Set 10 of 11 to improve route scalability via support for nexthops as standalone objects for fib entries. https://lwn.net/Articles/763950/ This sets adds 'struct nexthop' to fib_info and fib6_info. IPv4 already handles multiple fib_nh entries in a single fib_info, so the conversion to use a nexthop struct is fairly mechanical. IPv6 using a nexthop struct with a fib6_info impacts a lot of core logic which is built around the assumption of a single, builtin fib6_nh per fib6_info. To make this easier to review, this set adds nexthop to fib6_info and adds checks in most places fib6_info is used. The next set finishes the IPv6 conversion, walking through the places that need to consider all fib6_nh within a nexthop struct. Offload drivers - mlx5, mlxsw and rocker - are changed to fail FIB entries using nexthop objects. That limitation can be removed once the drivers are updated to properly support separate nexthops. This set starts by adding accessors for fib_nh and fib_nhs in a fib_info. This makes it easier to extract the number of nexthops in the fib entry and a specific fib_nh once the entry references a struct nexthop. Patch 2 converts more of IPv4 code to use fib_nh_common allowing a struct nexthop to use a fib6_nh with an IPv4 entry. Patches 3 and 4 add 'struct nexthop' to fib{6}_info and update references to both take a different path when it is set. New exported functions are added to the nexthop code to validate a nexthop struct when configured for use with a fib entry. IPv4 is allowed to use a nexthop with either v4 or v6 entries. IPv6 is limited to v6 entries only. In both cases list_heads track the fib entries using a nexthop struct for fast correlation on events (e.g., device events or nexthop events like delete or replace). The last 3 patches add hooks to drivers listening for FIB notificationas. All 3 of them reject the routes as unsupported, returning an error message to the user via extack. For mlxsw at least this is a stop gap measure until the driver is updated for proper support. Functional tests for nexthops have already been committed. Those tests will be active after the next patch set which makes the code paths created by this set and the next one live. Existing code paths moved to the else branch of 'if (f{6}i->nh)' checks are covered by existing tests under selftests/net. v3 - remove ip6_create_rt_rcu from ip6_pol_route in patch 4 and use pcpu routes for REJECT routes with the blackhole nexthop (request from Wei) v2 - no code changes from v1 - commit messages for first 4 patches updated ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	rocker: Fail attempts to use routes with nexthop objects	David Ahern	2019-06-05	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fail attempts to use nexthop objects with routes until support can be properly added. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	mlx5: Fail attempts to use routes with nexthop objects	David Ahern	2019-06-05	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fail attempts to use nexthop objects with routes until support can be properly added. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	mlxsw: Fail attempts to use routes with nexthop objects	David Ahern	2019-06-05	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fail attempts to use nexthop objects with routes until support can be properly added. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv6: Plumb support for nexthop object in a fib6_info	David Ahern	2019-06-05	8	-36/+260
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add struct nexthop and nh_list list_head to fib6_info. nh_list is the fib6_info side of the nexthop <-> fib_info relationship. Since a fib6_info referencing a nexthop object can not have 'sibling' entries (the old way of doing multipath routes), the nh_list is a union with fib6_siblings. Add f6i_list list_head to 'struct nexthop' to track fib6_info entries using a nexthop instance. Update __remove_nexthop_fib to walk f6_list and delete fib entries using the nexthop. Add a few nexthop helpers for use when a nexthop is added to fib6_info: - nexthop_fib6_nh - return first fib6_nh in a nexthop object - fib6_info_nh_dev moved to nexthop.h and updated to use nexthop_fib6_nh if the fib6_info references a nexthop object - nexthop_path_fib6_result - similar to ipv4, select a path within a multipath nexthop object. If the nexthop is a blackhole, set fib6_result type to RTN_BLACKHOLE, and set the REJECT flag Update the fib6_info references to check for nh and take a different path as needed: - rt6_qualify_for_ecmp - if a fib entry uses a nexthop object it can NOT be coalesced with other fib entries into a multipath route - rt6_duplicate_nexthop - use nexthop_cmp if either fib6_info references a nexthop - addrconf (host routes), RA's and info entries (anything configured via ndisc) does not use nexthop objects - fib6_info_destroy_rcu - put reference to nexthop object - fib6_purge_rt - drop fib6_info from f6i_list - fib6_select_path - update to use the new nexthop_path_fib6_result when fib entry uses a nexthop object - rt6_device_match - update to catch use of nexthop object as a blackhole and set fib6_type and flags. - ip6_route_info_create - don't add space for fib6_nh if fib entry is going to reference a nexthop object, take a reference to nexthop object, disallow use of source routing - rt6_nlmsg_size - add space for RTA_NH_ID - add rt6_fill_node_nexthop to add nexthop data on a dump As with ipv4, most of the changes push existing code into the else branch of whether the fib entry uses a nexthop object. Update the nexthop code to walk f6i_list on a nexthop deleted to remove fib entries referencing it. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv4: Plumb support for nexthop object in a fib_info	David Ahern	2019-06-05	5	-36/+229
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add 'struct nexthop' and nh_list list_head to fib_info. nh_list is the fib_info side of the nexthop <-> fib_info relationship. Add fi_list list_head to 'struct nexthop' to track fib_info entries using a nexthop instance. Add __remove_nexthop_fib and add it to __remove_nexthop to walk the new list_head and mark those fib entries as dead when the nexthop is deleted. Add a few nexthop helpers for use when a nexthop is added to fib_info: - nexthop_cmp to determine if 2 nexthops are the same - nexthop_path_fib_result to select a path for a multipath 'struct nexthop' - nexthop_fib_nhc to select a specific fib_nh_common within a multipath 'struct nexthop' Update existing fib_info_nhc to use nexthop_fib_nhc if a fib_info uses a 'struct nexthop', and mark fib_info_nh as only used for the non-nexthop case. Update the fib_info functions to check for fi->nh and take a different path as needed: - free_fib_info_rcu - put the nexthop object reference - fib_release_info - remove the fib_info from the nexthop's fi_list - nh_comp - use nexthop_cmp when either fib_info references a nexthop object - fib_info_hashfn - use the nexthop id for the hashing vs the oif of each fib_nh in a fib_info - fib_nlmsg_size - add space for the RTA_NH_ID attribute - fib_create_info - verify nexthop reference can be taken, verify nexthop spec is valid for fib entry, and add fib_info to fi_list for a nexthop - fib_select_multipath - use the new nexthop_path_fib_result to select a path when nexthop objects are used - fib_table_lookup - if the 'struct nexthop' is a blackhole nexthop, treat it the same as a fib entry using 'blackhole' The bulk of the changes are in fib_semantics.c and most of that is moving the existing change_nexthops into an else branch. Update the nexthop code to walk fi_list on a nexthop deleted to remove fib entries referencing it. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv4: Prepare for fib6_nh from a nexthop object	David Ahern	2019-06-05	7	-37/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert more IPv4 code to use fib_nh_common over fib_nh to enable routes to use a fib6_nh based nexthop. In the end, only code not using a nexthop object in a fib_info should directly access fib_nh in a fib_info without checking the famiy and going through fib_nh_common. Those functions will be marked when it is not directly evident. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv4: Use accessors for fib_info nexthop data	David Ahern	2019-06-05	12	-80/+132
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use helpers to access fib_nh and fib_nhs fields of a fib_info. Drop the fib_dev macro which is an alias for the first nexthop. Replacements: fi->fib_dev --> fib_info_nh(fi, 0)->fib_nh_dev fi->fib_nh --> fib_info_nh(fi, 0) fi->fib_nh[i] --> fib_info_nh(fi, i) fi->fib_nhs --> fib_info_num_path(fi) where fib_info_nh(fi, i) returns fi->fib_nh[nhsel] and fib_info_num_path returns fi->fib_nhs. Move the existing fib_info_nhc to nexthop.h and define the new ones there. A later patch adds a check if a fib_info uses a nexthop object, and defining the helpers in nexthop.h avoid circular header dependencies. After this all remaining open coded references to fi->fib_nhs and fi->fib_nh are in: - fib_create_info and helpers used to lookup an existing fib_info entry, and - the netdev event functions fib_sync_down_dev and fib_sync_up. The latter two will not be reused for nexthops, and the fib_create_info will be updated to handle a nexthop in a fib_info. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: Always allocate pcpu memory in a fib6_nh	David Ahern	2019-06-04	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A recent commit had an unintended side effect with reject routes: rt6i_pcpu is expected to always be initialized for all fib6_info except the null entry. The commit mentioned below skips it for reject routes and ends up leaking references to the loopback device. For example, ip netns add foo ip -netns foo li set lo up ip -netns foo -6 ro add blackhole 2001:db8:1::1 ip netns exec foo ping6 2001:db8:1::1 ip netns del foo ends up spewing: unregister_netdevice: waiting for lo to become free. Usage count = 3 The fib_nh_common_init is not needed for reject routes (no ipv4 caching or encaps), so move the alloc_percpu_gfp after it and adjust the goto label. Fixes: f40b6ae2b612 ("ipv6: Move pcpu cached routes to fib6_nh") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	hinic: add LRO support	Xue Chaojing	2019-06-04	10	-13/+348
\| \| \| \| \| \| \| \| \| \|	This patch adds LRO support for the HiNIC driver. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Xue Chaojing <xuechaojing@huawei.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'bond-mpls'	David S. Miller	2019-06-04	2	-0/+12
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ariel Levkovich says: ==================== Support MPLS features in bonding and vlan net devices Netdevice HW MPLS features are not passed from device driver's netdevice to upper netdevice, specifically VLAN and bonding netdevice which are created by the kernel when needed. This prevents enablement and usage of HW offloads, such as TSO and checksumming for MPLS tagged traffic when running via VLAN or bonding interface. The patches introduce changes to the initialization steps of the VLAN and bonding netdevices to inherit the MPLS features from lower netdevices to allow the HW offloads. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: vlan: Inherit MPLS features from parent device	Ariel Levkovich	2019-06-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During the creation of the VLAN interface net device, the various device features and offloads are being set based on the parent device's features. The code initiates the basic, vlan and encapsulation features but doesn't address the MPLS features set and they remain blank. As a result, all device offloads that have significant performance effect are disabled for MPLS traffic going via this VLAN device such as checksumming and TSO. This patch makes sure that MPLS features are also set for the VLAN device based on the parent which will allow HW offloads of checksumming and TSO to be performed on MPLS tagged packets. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: bonding: Inherit MPLS features from slave devices	Ariel Levkovich	2019-06-04	1	-0/+11
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When setting the bonding interface net device features, the kernel code doesn't address the slaves' MPLS features and doesn't inherit them. Therefore, HW offloads that enhance performance such as checksumming and TSO are disabled for MPLS tagged traffic flowing via the bonding interface. The patch add the inheritance of the MPLS features from the slave devices with a similar logic to setting the bonding device's VLAN and encapsulation features. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'net-tls-small-general-improvements'	David S. Miller	2019-06-04	8	-65/+75
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jakub Kicinski says: ==================== net/tls: small general improvements This series cleans up and improves the tls code, mostly the offload parts. First a slight performance optimization - avoiding unnecessary re- -encryption of records in patch 1. Next patch 2 makes the code more resilient by checking for errors in skb_copy_bits(). Next commit removes a warning which can be triggered in normal operation, (especially for devices explicitly making use of the fallback path). Next two paths change the condition checking around the call to tls_device_decrypted() to make it easier to extend. Remaining commits are centered around reorganizing struct tls_context for better cache utilization. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: don't pass version to tls_advance_record_sn()	Jakub Kicinski	2019-06-04	3	-13/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All callers pass prot->version as the last parameter of tls_advance_record_sn(), yet tls_advance_record_sn() itself needs a pointer to prot. Pass prot from callers. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: reorganize struct tls_context	Jakub Kicinski	2019-06-04	1	-11/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	struct tls_context is slightly badly laid out. If we reorder things right we can save 16 bytes (320 -> 304) but also make all fast path data fit into two cache lines (one read only and one read/write, down from four cache lines). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: use version from prot	Jakub Kicinski	2019-06-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ctx->prot holds the same information as per-direction contexts. Almost all code gets TLS version from this structure, convert the last two stragglers, this way we can improve the cache utilization by moving the per-direction data into cold cache lines. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: don't re-check msg decrypted status in tls_device_decrypted()	Jakub Kicinski	2019-06-04	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tls_device_decrypted() is only called from decrypt_skb_update(), when ctx->decrypted == false, there is no need to re-check the bit. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: don't look for decrypted frames on non-offloaded sockets	Jakub Kicinski	2019-06-04	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the RX config of a TLS socket is SW, there is no point iterating over the fragments and checking if frame is decrypted. It will always be fully encrypted. Note that in fully encrypted case the function doesn't actually touch any offload-related state, so it's safe to call for TLS_SW, today. Soon we will introduce code which can only be called for offloaded contexts. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: remove false positive warning	Jakub Kicinski	2019-06-04	2	-21/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's possible that TCP stack will decide to retransmit a packet right when that packet's data gets acked, especially in presence of packet reordering. This means that packets may be in flight, even though tls_device code has already freed their record state. Make fill_sg_in() and in turn tls_sw_fallback() not generate a warning in that case, and quietly proceed to drop such frames. Make the exit path from tls_sw_fallback() drop monitor friendly, for users to be able to troubleshoot dropped retransmissions. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: check return values from skb_copy_bits() and skb_store_bits()	Jakub Kicinski	2019-06-04	1	-6/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In light of recent bugs, we should make a better effort of checking return values. In theory none of the functions should fail today. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net/tls: fully initialize the msg wrapper skb	Jakub Kicinski	2019-06-04	3	-6/+28
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If strparser gets cornered into starting a new message from an sk_buff which already has frags, it will allocate a new skb to become the "wrapper" around the fragments of the message. This new skb does not inherit any metadata fields. In case of TLS offload this may lead to unnecessarily re-encrypting the message, as skb->decrypted is not set for the wrapper skb. Try to be conservative and copy all fields of old skb strparser's user may reasonably need. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: mscc: ocelot: Fix some struct initializations	Nathan Chancellor	2019-06-04	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Clang warns: drivers/net/ethernet/mscc/ocelot_ace.c:335:37: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct ocelot_vcap_u64 payload = { 0 }; ^ {} drivers/net/ethernet/mscc/ocelot_ace.c:336:28: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct vcap_data data = { 0 }; ^ {} drivers/net/ethernet/mscc/ocelot_ace.c:683:37: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct ocelot_ace_rule del_ace = { 0 }; ^ {} drivers/net/ethernet/mscc/ocelot_ace.c:743:28: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct vcap_data data = { 0 }; ^ {} 4 warnings generated. One way to fix these warnings is to add additional braces like Clang suggests; however, there has been a bit of push back from some maintainers[1][2], who just prefer memset as it is unambiguous, doesn't depend on a particular compiler version[3], and properly initializes all subobjects. Do that here so there are no more warnings. [1]: https://lore.kernel.org/lkml/022e41c0-8465-dc7a-a45c-64187ecd9684@amd.com/ [2]: https://lore.kernel.org/lkml/20181128.215241.702406654469517539.davem@davemloft.net/ [3]: https://lore.kernel.org/lkml/20181116150432.2408a075@redhat.com/ Fixes: b596229448dd ("net: mscc: ocelot: Add support for tcam") Link: https://github.com/ClangBuiltLinux/linux/issues/505 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: ipv4: fix rcu lockdep splat due to wrong annotation	Florian Westphal	2019-06-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	syzbot triggered following splat when strict netlink validation is enabled: net/ipv4/devinet.c:1766 suspicious rcu_dereference_check() usage! This occurs because we hold RTNL mutex, but no rcu read lock. The second call site holds both, so just switch to the _rtnl variant. Reported-by: syzbot+bad6e32808a3a97b1515@syzkaller.appspotmail.com Fixes: 2638eb8b50cf ("net: ipv4: provide __rcu annotation for ifa_list") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'net-expose-flash-update-status-to-user'	David S. Miller	2019-06-04	17	-91/+358
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jiri Pirko says: ==================== expose flash update status to user When user is flashing device using devlink, he currenly does not see any information about what is going on, percentages, etc. Drivers, for example mlxsw and mlx5, have notion about the progress and what is happening. This patchset exposes this progress information to userspace. Example output for existing flash command: $ devlink dev flash pci/0000:01:00.0 file firmware.bin Preparing to flash Flashing 100% Flashing done See this console recording which shows flashing FW on a Mellanox Spectrum device: https://asciinema.org/a/247926 Please see individual patches for changelog. v2->v3 only adds tags and the last selftest patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	selftests: add basic netdevsim devlink flash testing	Jiri Pirko	2019-06-04	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Utilizes the devlink flash code. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	netdevsim: implement fake flash updating with notifications	Jiri Pirko	2019-06-04	2	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	mlxsw: Implement flash update status notifications	Jiri Pirko	2019-06-04	1	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement mlxfw status_notify op by passing notification down to devlink. Also notify about flash update begin and end. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	mlxfw: Introduce status_notify op and call it to notify about the status	Jiri Pirko	2019-06-04	2	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add new op status_notify which is called to update the user about flashing status. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	devlink: allow driver to update progress of flash update	Jiri Pirko	2019-06-04	3	-0/+115
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce a function to be called from drivers during flash. It sends notification to userspace about flash update progress. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	mlxfw: Propagate error messages through extack	Jiri Pirko	2019-06-04	6	-20/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the error messages are printed to dmesg. Propagate them also to directly to user doing the flashing through extack. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	mlx5: Move firmware flash implementation to devlink	Jiri Pirko	2019-06-04	4	-46/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Benefit from the devlink flash update implementation and ethtool fallback to it and move firmware flash implementation there. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	mlxsw: Move firmware flash implementation to devlink	Jiri Pirko	2019-06-04	3	-26/+41
\|/ \| \| \| \| \| \| \| \|	Benefit from the devlink flash update implementation and ethtool fallback to it and move firmware flash implementation there. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>