author      Reza Arbab        2020-07-17 00:56:55 +0200
committer   David Gibson      2020-07-20 01:21:39 +0200
commit      a6030d7e0b35a23c82e4a765b53dc3847bcdb4d1
tree        0476fd6630d17bd727a20779aeae771348bd7011  /hw/ppc/spapr_pci_nvlink2.c
parent      spapr_pci: Robustify support of PCI bridges
spapr: Add a new level of NUMA for GPUs
NUMA nodes corresponding to GPU memory currently have the same
affinity/distance as normal memory nodes. Add a third NUMA associativity
reference point enabling us to give GPU nodes more distance.
This is guest visible information, which shouldn't change under a
running guest across migration between different qemu versions, so make
the change effective only in new (pseries > 5.0) machine types.
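The hunks below cover only hw/ppc/spapr_pci_nvlink2.c (the diffstat on this page is filtered to that file); the third reference point itself is advertised to the guest from hw/ppc/spapr.c. A minimal sketch of that half of the idea, where the names refpoints, nr_refpoints and smc->pre_5_1_assoc_refpoints are assumptions for illustration rather than text taken from the hunks shown here:

    /* Sketch: advertise a third NUMA associativity reference point (0x2)
     * so GPU memory nodes can sit at a greater distance, while older
     * machine types keep the original two-entry property.
     * Field and flag names are assumed, not copied from this page. */
    uint32_t refpoints[] = {
        cpu_to_be32(0x4),
        cpu_to_be32(0x4),
        cpu_to_be32(0x2),    /* new level, used by the GPU memory nodes */
    };
    uint32_t nr_refpoints = ARRAY_SIZE(refpoints);

    if (smc->pre_5_1_assoc_refpoints) {
        nr_refpoints = 2;    /* pseries <= 5.0: keep the old layout */
    }

    _FDT(fdt_setprop(fdt, rtas, "ibm,associativity-reference-points",
                     refpoints, nr_refpoints * sizeof(refpoints[0])));

Truncating the property to two entries for old machine types is what keeps the guest-visible layout stable when a guest started on an older QEMU is migrated in.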
Before, `numactl -H` output in a guest with 4 GPUs (nodes 2-5):
node distances:
node   0   1   2   3   4   5
  0:  10  40  40  40  40  40
  1:  40  10  40  40  40  40
  2:  40  40  10  40  40  40
  3:  40  40  40  10  40  40
  4:  40  40  40  40  10  40
  5:  40  40  40  40  40  10
After:
node distances:
node   0   1   2   3   4   5
  0:  10  40  80  80  80  80
  1:  40  10  80  80  80  80
  2:  80  80  10  80  80  80
  3:  80  80  80  10  80  80
  4:  80  80  80  80  10  80
  5:  80  80  80  80  80  10
These are the same distances as on the host, mirroring the change made
to host firmware in skiboot commit f845a648b8cb ("numa/associativity:
Add a new level of NUMA for GPU's").
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Message-Id: <20200716225655.24289-1-arbab@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Diffstat (limited to 'hw/ppc/spapr_pci_nvlink2.c')
-rw-r--r--   hw/ppc/spapr_pci_nvlink2.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/hw/ppc/spapr_pci_nvlink2.c b/hw/ppc/spapr_pci_nvlink2.c
index dd8cd6db96..76ae77ebc8 100644
--- a/hw/ppc/spapr_pci_nvlink2.c
+++ b/hw/ppc/spapr_pci_nvlink2.c
@@ -362,9 +362,9 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, void *fdt)
                                                     &error_abort);
         uint32_t associativity[] = {
             cpu_to_be32(0x4),
-            SPAPR_GPU_NUMA_ID,
-            SPAPR_GPU_NUMA_ID,
-            SPAPR_GPU_NUMA_ID,
+            cpu_to_be32(nvslot->numa_id),
+            cpu_to_be32(nvslot->numa_id),
+            cpu_to_be32(nvslot->numa_id),
             cpu_to_be32(nvslot->numa_id)
         };
         uint64_t size = object_property_get_uint(nv_mrobj, "size", NULL);
@@ -375,6 +375,13 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, void *fdt)
         _FDT(off);
         _FDT((fdt_setprop_string(fdt, off, "device_type", "memory")));
         _FDT((fdt_setprop(fdt, off, "reg", mem_reg, sizeof(mem_reg))));
+
+        if (sphb->pre_5_1_assoc) {
+            associativity[1] = SPAPR_GPU_NUMA_ID;
+            associativity[2] = SPAPR_GPU_NUMA_ID;
+            associativity[3] = SPAPR_GPU_NUMA_ID;
+        }
+
         _FDT((fdt_setprop(fdt, off, "ibm,associativity", associativity,
                           sizeof(associativity))));
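The sphb->pre_5_1_assoc flag tested above is not defined in this file, and the hunks that wire it up fall outside this filtered view. A rough sketch of the expected plumbing, assuming the machine-class flag and the 5.0 compat hook are named as below (illustrative only, not copied from the omitted hunks):

    /* hw/ppc/spapr.c (sketch): pseries-5.0 and older keep the old layout */
    static void spapr_machine_5_0_class_options(MachineClass *mc)
    {
        SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);

        spapr_machine_5_1_class_options(mc);
        smc->pre_5_1_assoc_refpoints = true;    /* assumed flag name */
    }

    /* hw/ppc/spapr_pci.c (sketch): copy the flag onto each PHB at realize */
    sphb->pre_5_1_assoc = smc->pre_5_1_assoc_refpoints;

With the flag set, the GPU memory nodes fall back to the old SPAPR_GPU_NUMA_ID entries, so an existing pseries-5.0 guest sees an unchanged device tree across migration.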