| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
Reuse the code that creates I/O device page mappings to create the
coherent DMA mapping of the 32-bit address space on demand, instead of
constructing this mapping as part of the initial page table.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The data cache must be invalidated twice for RX DMA buffers: once
before passing ownership to the DMA device (in case the cache happens
to contain dirty data that will be written back at an undefined future
point), and once after receiving ownership from the DMA device (in
case the CPU happens to have speculatively accessed data in the buffer
while it was owned by the hardware).
Only the used portion of the buffer needs to be invalidated after
completion, since we do not care about data within the unused portion.
Update the DMA API to include the used length as an additional
parameter to dma_unmap(), and add the necessary second cache
invalidation pass to the RISC-V DMA API implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Add a RISC-V assembly language implementation of TCP/IP checksumming,
which is around 50x faster than the generic algorithm. The main loop
checksums aligned xlen-bit words, using almost entirely compressible
instructions and accumulating carries in a separate register to allow
folding to be deferred until after all loops have completed.
Experimentation on a C910 CPU suggests that this achieves around four
bytes per clock cycle, which is comparable to the x86 implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
| |
Provide an implementation of dma_map() that performs cache clean or
invalidation as required, and an implementation of dma_alloc() that
returns virtual addresses within the coherent mapping of the 32-bit
physical address space.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On platforms where DMA devices are not in the same coherency domain as
the CPU cache, it is necessary to be able to explicitly clean the
cache (i.e. force data to be written back to main memory) and
invalidate the cache (i.e. discard any cached data and force a
subsequent read from main memory).
Add support for cache management via the standard Zicbom extension or
the T-Head cache management operations extension, with the supported
extension detected on first use.
Support cache management operations only on I/O buffers, since these
are guaranteed to not share cachelines with other data.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
| |
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
iPXE drivers have been written with the implicit assumption that MMIO
writes are allowed to be posted but that an MMIO register read or
write after another MMIO register write will always observe the
effects of the first write.
For example: after having written a byte to the transmit holding
register (THR) of a 16550 UART, it is expected that any subsequent
read of the line status register (LSR) will observe a value consistent
with the occurrence of the write.
RISC-V does not seem to provide any ordering guarantees between
accesses to different registers within the same MMIO device. Add
fences as part of the MMIO accessors to provide the assumed
guarantees.
Use "fence io, io" before each MMIO read or write to enforce full
serialisation of MMIO accesses with respect to each other. This is
almost certainly more conservative than is strictly necessary.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The RISC-V "fence" instruction encoding includes bits for predecessor
and successor input and output operations, separate from read and
write operations. It is up to the CPU implementation to decide what
counts as I/O space rather than memory space for the purposes of this
instruction.
Since we do not expect fencing to be performance-critical, keep
everything as simple and reliable as possible by using the unadorned
"fence" instruction (equivalent to "fence iorw, iorw").
Add a memory clobber to ensure that the compiler does not reorder the
barrier. (The volatile qualifier seems to already prevent reordering
in practice, but this is not guaranteed according to the compiler
documentation.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the 64-bit paging schemes (Sv39, Sv48, and Sv57), we identity-map
as much of the physical address space as is possible. Experimentation
shows that this is not sufficient to provide access to all I/O
devices. For example: the Sipeed Lichee Pi 4A includes a CPU that
supports only Sv39, but places I/O devices at the top of a 40-bit
address space.
Add support for creating I/O page table entries on demand to map I/O
devices, based on the existing design used for x86_64 BIOS.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
| |
Fall back to attempting the legacy SBI console and shutdown calls if
the standard calls fail.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
| |
Use the word-at-a-time variable-length memcpy() implementation when
performing an overlapping copy in the forwards direction, since this
is guaranteed to be safe and likely to be substantially faster than
the existing bytewise copy.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The RISC-V and AArch64 bare-metal kernel images share a common header
format, and require essentially the same execution environment: loaded
close to the start of RAM, entered with paging disabled, and passed a
pointer to a flattened device tree that describes the hardware and any
boot arguments.
Implement basic support for executing bare-metal RISC-V and AArch64
kernel images. The (trivial) AArch64-specific code path is untested
since we do not yet have the ability to build for any bare-metal
AArch64 platforms. Constructing and passing an initramfs image is not
yet supported.
Rename the IMAGE_BZIMAGE build configuration option to IMAGE_LKRN,
since "bzImage" is specific to x86. To retain backwards compatibility
with existing local build configurations, we leave IMAGE_BZIMAGE as
the enabled option in config/default/pcbios.h and treat IMAGE_LKRN as
a synonym for IMAGE_BZIMAGE when building for x86 BIOS.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
| |
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
iPXE does not make use of any thread-local storage. Use the otherwise
unused thread pointer register ("tp") to hold the current value of
the virtual address offset, rather than using a global variable.
This ensures that virt_offset can be made valid even during very early
initialisation (when iPXE may be executing directly from read-only
memory and so cannot update a global variable).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
| |
Expose the bit shifted out as a result of shifting a big integer left
or right.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
| |
Expose the effective carry (or borrow) out flag from big integer
addition and subtraction, and use this to elide an explicit bit test
when performing x25519 reduction.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The seed CSR defined by the Zkr extension is accessible only in M-mode
by default. Older versions of OpenSBI (prior to version 1.4) do not
set mseccfg.sseed, with the result that attempts to access the seed
CSR from S-mode will raise an illegal instruction exception.
Add a facility for testing the accessibility of arbitrary CSRs, and
use it to check that the seed CSR is accessible before reporting the
seed CSR entropy source as being functional.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add basic support for running directly on top of SBI, with no UEFI
firmware present. Build as e.g.:
make CROSS=riscv64-linux-gnu- bin-riscv64/ipxe.sbi
The resulting binary can be tested in QEMU using e.g.:
qemu-system-riscv64 -M virt -cpu max -serial stdio \
-kernel bin-riscv64/ipxe.sbi
No drivers or executable binary formats are supported yet, but the
unit test suite may be run successfully.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
| |
The Zkr entropy source extension defines a potentially unprivileged
seed CSR that can be read to obtain 16 bits of entropy input, with a
mandated requirement that 256 entropy input bits read from the seed
CSR will contain at least 128 bits of min-entropy.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The Zicntr extension defines an unprivileged wall-clock time CSR that
roughly matches the behaviour of an invariant TSC on x86. The nominal
frequency of this timer may be read from the "timebase-frequency"
property of the CPU node in the device tree.
Add a timer source using RDTIME to provide implementations of udelay()
and currticks(), modelled on the existing RDTSC-based timer for x86.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
RISC-V seems to allow for direct discovery of CPU features only from
M-mode (e.g. by setting up a trap handler and then attempting to
access a CSR), with S-mode code expected to read the resulting
constructed ISA description from the device tree.
Add the ability to check for the presence of named extensions listed
in the "riscv,isa" property of the device tree node corresponding to
the boot hart.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
| |
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
| |
Add the ability to issue Supervisor Binary Interface (SBI) calls via
the ECALL instruction, and use the SBI DBCN extension to implement a
debug console.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
| |
Every architecture uses the same implementation for bigint_is_set(),
and there is no reason to suspect that a future CPU architecture will
provide a more efficient way to implement this operation.
Simplify the code by providing a single architecture-independent
implementation of bigint_is_set().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
| |
The big integer shift operations are misleadingly described as
rotations since the original x86 implementations are essentially
trivial loops around the relevant rotate-through-carry instruction.
The overall operation performed is a shift rather than a rotation.
Update the function names and descriptions to reflect this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
An n-bit multiplication product may be added to up to two n-bit
integers without exceeding the range of a (2n)-bit integer:
(2^n - 1)*(2^n - 1) + (2^n - 1) + (2^n - 1) = 2^(2n) - 1
Exploit this to perform big integer multiplication in constant time
without requiring the caller to provide temporary carry space.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All consumers of profile_timestamp() currently treat the value as an
unsigned long. Only the elapsed number of ticks is ever relevant: the
absolute value of the timestamp is not used. Profiling is used to
measure short durations that are generally fewer than a million CPU
cycles, for which an unsigned long is easily large enough.
Standardise the return type of profile_timestamp() as unsigned long
across all CPU architectures. This allows 32-bit architectures such
as i386 and riscv32 to omit all logic associated with retrieving the
upper 32 bits of the 64-bit hardware counter, which simplifies the
code and allows riscv32 and riscv64 to share the same implementation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Big integer multiplication currently performs immediate carry
propagation from each step of the long multiplication, relying on the
fact that the overall result has a known maximum value to minimise the
number of carries performed without ever needing to explicitly check
against the result buffer size.
This is not a constant-time algorithm, since the number of carries
performed will be a function of the input values. We could make it
constant-time by always continuing to propagate the carry until
reaching the end of the result buffer, but this would introduce a
large number of redundant zero carries.
Require callers of bigint_multiply() to provide a temporary carry
storage buffer, of the same size as the result buffer. This allows
the carry-out from the accumulation of each double-element product to
be accumulated in the temporary carry space, and then added in via a
single call to bigint_add() after the multiplication is complete.
Since the structure of big integer multiplication is identical across
all current CPU architectures, provide a single shared implementation
of bigint_multiply(). The architecture-specific operation then
becomes the multiplication of two big integer elements and the
accumulation of the double-element product.
Note that any intermediate carry arising from accumulating the lower
half of the double-element product may be added to the upper half of
the double-element product without risk of overflow, since the result
of multiplying two n-bit integers can never have all n bits set in its
upper half. This simplifies the carry calculations for architectures
such as RISC-V and LoongArch64 that do not have a carry flag.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|
|
|
Add support for building iPXE as a 64-bit or 32-bit RISC-V binary, for
either UEFI or Linux userspace platforms. For example:
# RISC-V 64-bit UEFI
make CROSS=riscv64-linux-gnu- bin-riscv64-efi/ipxe.efi
# RISC-V 32-bit UEFI
make CROSS=riscv64-linux-gnu- bin-riscv32-efi/ipxe.efi
# RISC-V 64-bit Linux
make CROSS=riscv64-linux-gnu- bin-riscv64-linux/tests.linux
qemu-riscv64 -L /usr/riscv64-linux-gnu/sys-root \
./bin-riscv64-linux/tests.linux
# RISC-V 32-bit Linux
make CROSS=riscv64-linux-gnu- SYSROOT=/usr/riscv32-linux-gnu/sys-root \
bin-riscv32-linux/tests.linux
qemu-riscv32 -L /usr/riscv32-linux-gnu/sys-root \
./bin-riscv32-linux/tests.linux
Signed-off-by: Michael Brown <mcb30@ipxe.org>
|