author | Jiong Wang | 2017-12-01 06:32:58 +0100 |
---|---|---|
committer | Daniel Borkmann | 2017-12-01 20:59:20 +0100 |
commit | 9879a3814beb3b1350755475e67a8d92ba1f7e4b | |
tree | 69118717e6a3ddd0d068fe43f4f6084183358191 /drivers/net/ethernet/netronome/nfp/nfp_asm.h | |
parent | nfp: bpf: factor out is_mbpf_load & is_mbpf_store | |
nfp: bpf: implement memory bulk copy for length within 32-bytes
For NFP, we want to re-group a sequence of load/store pairs lowered from
memcpy/memmove into a single memory bulk operation, which can then be
accelerated using the NFP CPP bus.
This patch extends the existing load/store auxiliary information by adding
two new fields:
struct bpf_insn *paired_st;
s16 ldst_gather_len;
Both fields are supposed to be carried by the load instruction at the
head of the sequence. "paired_st" is the corresponding store instruction at
the head and "ldst_gather_len" is the gathered length.
If "ldst_gather_len" is negative, then the sequence is doing memory
load/store in descending order, otherwise it is in ascending order. We need
this information to detect overlapped memory access.
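
The sign convention above can be illustrated with a small stand-alone sketch. This is not the driver's code: only "paired_st" and "ldst_gather_len" come from the patch description, while struct bpf_insn_stub, struct ldst_aux and the helper functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct bpf_insn. */
struct bpf_insn_stub {
	int16_t off;	/* memory offset used by the load or store */
};

/* Sketch of the auxiliary data; only the two field names are from the patch. */
struct ldst_aux {
	struct bpf_insn_stub *paired_st;	/* store at the head of the sequence */
	int16_t ldst_gather_len;		/* signed gathered length in bytes */
};

/* A negative gathered length marks a descending load/store sequence. */
static bool ldst_is_descending(const struct ldst_aux *aux)
{
	return aux->ldst_gather_len < 0;
}

/* Number of bytes covered by the sequence, regardless of direction. */
static unsigned int ldst_gather_bytes(const struct ldst_aux *aux)
{
	return (unsigned int)abs(aux->ldst_gather_len);
}

/*
 * Generic half-open interval overlap test (an assumption about how an
 * overlap check could look, not the driver's logic): [src, src + len)
 * and [dst, dst + len) overlap when each starts before the other ends.
 */
static bool ranges_overlap(long src, long dst, long len)
{
	assert(len > 0);
	return src < dst + len && dst < src + len;
}
```

Keeping the direction in the sign of a single field means no extra flag is needed; a consumer takes the absolute value for the size and tests the sign for the order.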
This patch then optimizes memory bulk copy when the copy length is within
32 bytes.
The strategy of read/write used is:

  * Read.
    Use read32 (direct_ref), always.

  * Write.
    - length <= 8-bytes
      write8 (direct_ref).
    - length <= 32-bytes and is 4-byte aligned
      write32 (direct_ref).
    - length <= 32-bytes but is not 4-byte aligned
      write8 (indirect_ref).
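
The write selection listed above can be sketched as a plain decision function. This is an illustration only: the enum and function names are hypothetical, and "4-byte aligned" is read here as "the length is a multiple of 4", which is an assumption about the wording.

```c
#include <assert.h>

/* Hypothetical labels for the three write paths in the list above. */
enum write_strategy {
	WRITE8_DIRECT,		/* write8 (direct_ref) */
	WRITE32_DIRECT,		/* write32 (direct_ref) */
	WRITE8_INDIRECT,	/* write8 (indirect_ref) */
};

/* Pick a write path for a copy of len bytes, 1 <= len <= 32. */
static enum write_strategy pick_write_strategy(unsigned int len)
{
	assert(len > 0 && len <= 32);

	if (len <= 8)
		return WRITE8_DIRECT;
	if (len % 4 == 0)	/* assumed meaning of "4-byte aligned" */
		return WRITE32_DIRECT;
	return WRITE8_INDIRECT;
}
```

For example, a 16-byte copy would take the write32 path, while a 13-byte copy would fall through to the indirect write8 path.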
NOTE: the optimization should not change program semantics. The destination
register of the last load instruction should contain the same value before
and after this optimization.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Diffstat (limited to 'drivers/net/ethernet/netronome/nfp/nfp_asm.h')
-rw-r--r-- | drivers/net/ethernet/netronome/nfp/nfp_asm.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
index 6ff842a15e5d..98803f9f40b6 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -220,6 +220,7 @@ struct cmd_tgt_act {
 enum cmd_tgt_map {
 	CMD_TGT_READ8,
 	CMD_TGT_WRITE8_SWAP,
+	CMD_TGT_WRITE32_SWAP,
 	CMD_TGT_READ32,
 	CMD_TGT_READ32_LE,
 	CMD_TGT_READ32_SWAP,
@@ -241,6 +242,9 @@ enum cmd_ctx_swap {
 	CMD_CTX_NO_SWAP = 3,
 };
 
+#define CMD_OVE_LEN	BIT(7)
+#define CMD_OV_LEN	GENMASK(12, 8)
+
 #define OP_LCSR_BASE		0x0fc00000000ULL
 #define OP_LCSR_A_SRC		0x000000003ffULL
 #define OP_LCSR_B_SRC		0x000000ffc00ULL
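
The two new macros describe a single bit (bit 7) and a 5-bit field (bits 12..8). The sketch below shows how such a pair could be combined into one value in user space. BIT() and GENMASK() are re-created locally as 32-bit stand-ins for the kernel macros; CMD_OV_LEN_SHIFT and cmd_override_len() are hypothetical helpers; and the names are read as "override enable" plus "override length", which is an assumption. The exact value the hardware expects in the field is not stated in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT() and GENMASK() (32-bit only). */
#define BIT(n)		(1U << (n))
#define GENMASK(h, l)	(((~0U) >> (31 - (h))) & ((~0U) << (l)))

#define CMD_OVE_LEN	BIT(7)		/* from the patch */
#define CMD_OV_LEN	GENMASK(12, 8)	/* from the patch */

/* Lowest bit of CMD_OV_LEN; hypothetical helper constant. */
#define CMD_OV_LEN_SHIFT	8

/* Set the single bit and place a 5-bit value into bits 12..8. */
static uint32_t cmd_override_len(uint32_t field)
{
	assert(field <= (CMD_OV_LEN >> CMD_OV_LEN_SHIFT));
	return CMD_OVE_LEN | ((field << CMD_OV_LEN_SHIFT) & CMD_OV_LEN);
}
```

In kernel code the shift-and-mask step would typically be written with FIELD_PREP() from linux/bitfield.h instead of a hand-written shift constant.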