net/mlx5e: TX latency optimization to save DMA reads

A regular TX WQE execution involves two or more DMA reads -
one to fetch the WQE, and another one per WQE gather entry.

These DMA reads obviously increase the TX latency.
There are two mlx5 mechanisms to bypass these DMA reads:
1) Inline WQE
2) Blue Flame (BF)

An inline WQE contains a whole packet, thus saves the DMA read/s
of the regular WQE gather entry/s. Inline WQE support was already
added in the previous commit.

A BF WQE is written directly to the device I/O mapped memory, thus
enables saving the DMA read that fetches the WQE.

The BF WQE I/O write must be in cache line granularity, thus uses
the CPU write combining mechanism.
A BF WQE I/O write acts also as a TX doorbell for notifying the
device of new TX WQEs.
A BF WQE is written to the same I/O mapped address as the regular TX
doorbell, thus this address is being mapped twice - once by ioremap()
and once by io_mapping_map_wc().

While both mechanisms reduce the TX latency, they both consume more CPU
cycles than a regular WQE:
- A BF WQE must still be written to host memory, in addition to being
  written directly to the device I/O mapped memory.
- An inline WQE involves copying the SKB data into it.

To handle this tradeoff, we introduce here a heuristic algorithm that
strives to avoid using these two mechanisms in case the TX queue is
being back-pressured by the device, and limit their usage rate otherwise.

An inline WQE will always be "Blue Flamed" (written directly to the
device I/O mapped memory) while a BF WQE may not be inlined (may contain
gather entries).

Preliminary testing using netperf UDP_RR shows that the latency goes down
from 17.5us to 16.9us, while the message rate (tested with pktgen) stays
the same.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
author: Achiad Shochat 2015-07-23 22:35:59 +0200
committer: David S. Miller 2015-07-27 09:29:17 +0200
commit: 88a85f99e51fb2373259ab83c8bb130a9bbf3804 (patch)
tree: f27980b2cdd3c9601a5953c7d268eabc18452a2e /drivers/net/ethernet/mellanox/mlx5/core/en_main.c
parent: net/mlx5e: Support TX packet copy into WQE (diff)
1 files changed, 7 insertions, 5 deletions
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index c55fad431cbf..4a87e9dcf52c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -514,6 +514,7 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 
 	sq->wq.db       = &sq->wq.db[MLX5_SND_DBR];
 	sq->uar_map     = sq->uar.map;
+	sq->uar_bf_map  = sq->uar.bf_map;
 	sq->bf_buf_size = (1 << MLX5_CAP_GEN(mdev, log_bf_reg_size)) / 2;
 	sq->max_inline  = param->max_inline;
 
@@ -524,11 +525,12 @@ static int mlx5e_create_sq(struct mlx5e_channel *c,
 	txq_ix = c->ix + tc * priv->params.num_channels;
 	sq->txq = netdev_get_tx_queue(priv->netdev, txq_ix);
 
-	sq->pdev    = c->pdev;
-	sq->mkey_be = c->mkey_be;
-	sq->channel = c;
-	sq->tc      = tc;
-	sq->edge    = (sq->wq.sz_m1 + 1) - MLX5_SEND_WQE_MAX_WQEBBS;
+	sq->pdev      = c->pdev;
+	sq->mkey_be   = c->mkey_be;
+	sq->channel   = c;
+	sq->tc        = tc;
+	sq->edge      = (sq->wq.sz_m1 + 1) - MLX5_SEND_WQE_MAX_WQEBBS;
+	sq->bf_budget = MLX5E_SQ_BF_BUDGET;
 	priv->txq_to_sq_map[txq_ix] = sq;
 
 	return 0;
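
The heuristic described in the commit message is implemented outside en_main.c, so this diff only shows the budget being initialized. The following is a minimal userspace sketch of how such a budget could gate the two mechanisms, not the driver's actual code. All names prefixed with sketch_ or SKETCH_ are hypothetical, the budget value is an illustrative assumption, and the refill point (when the queue is seen drained) is an assumption as well. It shows three rules taken from the message: no BF while the queue is back-pressured, a limited BF usage rate otherwise, and inlining only for WQEs that are also Blue Flamed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative value only; the driver defines MLX5E_SQ_BF_BUDGET elsewhere. */
#define SKETCH_BF_BUDGET 16

/* Hypothetical, reduced stand-in for the send queue state. */
struct sketch_sq {
	uint16_t bf_budget; /* BF writes still allowed before the next refill */
	bool     stopped;   /* true while the device back-pressures the queue */
};

static void sketch_sq_init(struct sketch_sq *sq)
{
	sq->bf_budget = SKETCH_BF_BUDGET;
	sq->stopped   = false;
}

/* Decide whether the next WQE takes the BF path; consume budget if it does. */
static bool sketch_use_bf(struct sketch_sq *sq)
{
	assert(sq->bf_budget <= SKETCH_BF_BUDGET);

	/* Under back-pressure, or with the budget spent, use a regular doorbell. */
	if (sq->stopped || sq->bf_budget == 0)
		return false;

	sq->bf_budget--;
	return true;
}

/* An inline WQE is always Blue Flamed, so inlining requires bf to be true. */
static bool sketch_use_inline(bool bf, size_t pkt_len, size_t max_inline)
{
	return bf && pkt_len <= max_inline;
}

/* Assumed refill point: the queue was observed drained by the device. */
static void sketch_on_queue_drained(struct sketch_sq *sq)
{
	sq->stopped   = false;
	sq->bf_budget = SKETCH_BF_BUDGET;
}
```

With these assumptions, a burst of back-to-back sends uses BF for at most SKETCH_BF_BUDGET WQEs and then falls back to the regular doorbell until the queue drains, which bounds the extra CPU cycles spent on the I/O writes and SKB copies.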