Adding Files

main
FPGALover 2024-02-24 00:25:27 -08:00
parent dd8356b94e
commit 63dd9360f3
351 changed files with 80886 additions and 0 deletions

641
CHANGELOG.md 100644

@ -0,0 +1,641 @@
## Project Change Log
[![release](https://img.shields.io/github/v/release/stnolting/neorv32?longCache=true&style=flat-square&logo=GitHub)](https://github.com/stnolting/neorv32/releases)
![commits-since-latest-release](https://img.shields.io/github/commits-since/stnolting/neorv32/latest?longCache=true&style=flat-square&logo=GitHub)
This project uses [semantic versioning](https://semver.org).
The **version identifier** uses an additional custom element (`MAJOR.MINOR.PATCH.custom`)
to track individual changes. The identifier is incremented by every core hardware modification
and also by major software/project changes.
The version identifier is globally defined by the `hw_version_c` constant in the main VHDL
[package file](https://github.com/stnolting/neorv32/blob/main/rtl/core/neorv32_package.vhd).
Software can determine this version by reading the RISC-V-compatible `mimpid` CSR, which uses
an 8x4-bit BCD (binary-coded decimal) representation. Example:
```
mimpid = 0x01040312 -> Version 01.04.03.12 -> v1.4.3.12
```
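The mapping in the example above can be sketched in a few lines of host-side Python. This is only an illustrative sketch, not part of the NEORV32 software framework: the function names `decode_mimpid` and `encode_mimpid` are hypothetical, and the code assumes that each of the four version fields occupies one byte holding two BCD digits, as the example suggests.

```python
def decode_mimpid(mimpid: int) -> str:
    """Turn a 32-bit mimpid value (8 BCD digits) into 'vMAJOR.MINOR.PATCH.custom'."""
    if not 0 <= mimpid <= 0xFFFFFFFF:
        raise ValueError("mimpid must fit into 32 bits")
    fields = []
    for shift in (24, 16, 8, 0):  # most significant byte (MAJOR) first
        byte = (mimpid >> shift) & 0xFF
        high, low = byte >> 4, byte & 0xF  # two 4-bit BCD digits per field
        if high > 9 or low > 9:
            raise ValueError(f"invalid BCD digit in field 0x{byte:02x}")
        fields.append(high * 10 + low)
    return "v" + ".".join(str(field) for field in fields)


def encode_mimpid(major: int, minor: int, patch: int, custom: int) -> int:
    """Inverse direction: pack four decimal fields (0..99 each) into BCD."""
    value = 0
    for field in (major, minor, patch, custom):
        if not 0 <= field <= 99:
            raise ValueError("each version field must be in the range 0..99")
        value = (value << 8) | ((field // 10) << 4) | (field % 10)
    return value


if __name__ == "__main__":
    print(decode_mimpid(0x01040312))            # v1.4.3.12
    print(hex(encode_mimpid(1, 9, 5, 8)))       # 0x1090508
```

Because every field is limited to two BCD digits, a scheme like this can only represent field values from 0 to 99; the sketch rejects anything outside that range instead of silently truncating it.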
### Version History
* :bug: bug-fix
* :sparkles: new feature
* :test_tube: new experimental feature
* :warning: changes that might impact compatibility with previous versions
* :lock: security/safety-related
* :rocket: official release
* Date format is "dd.mm.yyyy"

| Date | Version | Comment | Link |
|:----:|:-------:|:--------|:----:|
| 23.02.2024 | 1.9.5.8 | optimize FIFO component to improve technology mapping (inferring blockRAM for "async read" configuration); :bug: fix SLINK status flag delay | [#828](https://github.com/stnolting/neorv32/pull/828) |
| 23.02.2024 | 1.9.5.7 | fix FIFO synthesis issue (Vivado cannot infer block RAM nor LUT-RAM) | [#827](https://github.com/stnolting/neorv32/pull/827) |
| 20.02.2024 | 1.9.5.6 | :bug: fix bug in `mip.firq` CSR access; `mip.firq` bits are now read-write - software can trigger FIRQs by writing `1` to the corresponding CSR bit | [#821](https://github.com/stnolting/neorv32/pull/821) |
| 19.02.2024 | 1.9.5.5 | SLINK: add native hardware support for AXI-stream's "tlast" signal | [#815](https://github.com/stnolting/neorv32/pull/815) |
| 19.02.2024 | 1.9.5.4 | :warning: remove support of `Smcntrpmf` ISA extension (counter privilege mode filtering) | [#814](https://github.com/stnolting/neorv32/pull/814) |
| 17.02.2024 | 1.9.5.3 | :warning: reworked CPU's hardware performance monitor (HPMs) events | [#811](https://github.com/stnolting/neorv32/pull/811) |
| 16.02.2024 | 1.9.5.2 | :warning: **revert** support for page faults (keep that in mmu branch for now) | [#809](https://github.com/stnolting/neorv32/pull/809) |
| 16.02.2024 | 1.9.5.1 | :sparkles: add two new generics to exclude certain PMP modes from synthesis | [#808](https://github.com/stnolting/neorv32/pull/808) |
| 16.02.2024 | [**:rocket:1.9.5**](https://github.com/stnolting/neorv32/releases/tag/v1.9.5) | **New release** | |
| 15.02.2024 | 1.9.4.13 | allow the DMA to issue a FENCE operation | [#807](https://github.com/stnolting/neorv32/pull/807) |
| 14.02.2024 | 1.9.4.12 | :bug: close another illegal compressed instruction encoding loophole | [#806](https://github.com/stnolting/neorv32/pull/806) |
| 11.02.2024 | 1.9.4.11 | :bug: fix several FPU bugs and design flaws | [#794](https://github.com/stnolting/neorv32/pull/794) |
| 11.02.2024 | 1.9.4.10 | minor additions to previous version (1.9.4.9): fix HPM configuration read-back | [#804](https://github.com/stnolting/neorv32/pull/804) |
| 10.02.2024 | 1.9.4.9 | fixing HPM configuration's null range assertions | [#803](https://github.com/stnolting/neorv32/pull/803) |
| 10.02.2024 | 1.9.4.8 | :bug: fix missing fence pass-through in caches | [#802](https://github.com/stnolting/neorv32/pull/802) |
| 09.02.2024 | 1.9.4.7 | :warning: integrate fence signal into CPU bus, remove top entity's fence signals | [#800](https://github.com/stnolting/neorv32/pull/800) |
| 09.02.2024 | 1.9.4.6 | :sparkles: add configurable XIP cache | [#799](https://github.com/stnolting/neorv32/pull/799) |
| 09.02.2024 | 1.9.4.5 | :bug: close further illegal compressed instruction encoding loopholes | [#797](https://github.com/stnolting/neorv32/pull/797) |
| 04.02.2024 | 1.9.4.4 | :bug: fix minor bug: CPU instruction bus privilege signal did not remain stable during the entire request | [#792](https://github.com/stnolting/neorv32/pull/792) |
| 03.02.2024 | 1.9.4.3 | :bug: fix minor bug: CPU instruction bus privilege signal was hardwired to "user-mode" | [#790](https://github.com/stnolting/neorv32/pull/790) |
| 01.02.2024 | 1.9.4.2 | :sparkles: add support for page fault exceptions (yet unused) | [#786](https://github.com/stnolting/neorv32/pull/786) |
| 31.01.2024 | 1.9.4.1 | fix trap priority | [#784](https://github.com/stnolting/neorv32/pull/784) |
| 31.01.2024 | [**:rocket:1.9.4**](https://github.com/stnolting/neorv32/releases/tag/v1.9.4) | **New release** | |
| 31.01.2024 | 1.9.3.10 | close illegal compressed instruction decoding loophole | [#783](https://github.com/stnolting/neorv32/pull/783) |
| 29.01.2024 | 1.9.3.9 | :test_tube: extend switchable clock domain (CPU bus switch, i-cache, d-cache) | [#780](https://github.com/stnolting/neorv32/pull/780) |
| 29.01.2024 | 1.9.3.8 | top entity input ports now have default values `'L'` or `'H'` modeling a pull-down or pull-up resistor in case they are not explicitly assigned during instantiation | [#779](https://github.com/stnolting/neorv32/pull/779) |
| 28.01.2024 | 1.9.3.7 | FIFO module _NULL assertion_ fix | [#778](https://github.com/stnolting/neorv32/pull/778) |
| 27.01.2024 | 1.9.3.6 | improve CPU's front end (instruction fetch) increasing overall performance | [#777](https://github.com/stnolting/neorv32/pull/777) |
| 27.01.2024 | 1.9.3.5 | :bug: fix typo that renders the clock gating (added in v1.9.3.4) useless: CPU sleep output stuck at zero | [#776](https://github.com/stnolting/neorv32/pull/776) |
| 24.01.2024 | 1.9.3.4 | :sparkles: add optional CPU clock gating (via new generic `CLOCK_GATING_EN`): shut down the CPU clock during sleep mode; :warning: add new HDL design file for the clock gate (`neorv32_clockgate.vhd`) | [#775](https://github.com/stnolting/neorv32/pull/775) |
| 23.01.2024 | 1.9.3.3 | :bug: remove compressed floating point load/store operations as they are **not** supported by `Zfinx` | [#771](https://github.com/stnolting/neorv32/pull/771) |
| 20.01.2024 | 1.9.3.2 | optimize bus switch; minor RTL and comment edits | [#769](https://github.com/stnolting/neorv32/pull/769) |
| 14.01.2024 | 1.9.3.1 | minor rtl cleanups and optimizations | [#764](https://github.com/stnolting/neorv32/pull/764) |
| 11.01.2024 | [**:rocket:1.9.3**](https://github.com/stnolting/neorv32/releases/tag/v1.9.3) | **New release** | |
| 10.01.2024 | 1.9.2.11 | minor HDL fix (introduced in v1.9.2.9) | [#763](https://github.com/stnolting/neorv32/pull/763) |
| 10.01.2024 | 1.9.2.10 | re-add MTIME system time output to processor top (`mtime_time_o`) | [#762](https://github.com/stnolting/neorv32/pull/762) |
| 08.01.2024 | 1.9.2.9 | minor rtl code cleanups | [#760](https://github.com/stnolting/neorv32/pull/760) |
| 06.01.2024 | 1.9.2.8 | :sparkles: add timer-capture mode to General Purpose Timer (GPTMR); :warning: remove "single-shot" mode, change control register layout | [#759](https://github.com/stnolting/neorv32/pull/759) |
| 19.12.2023 | 1.9.2.7 | minor rtl code cleanups, edits and optimization; :lock: reset `mtvec`, `mepc` and `dpc` CSRs to CPU boot address (`CPU_BOOT_ADDR` CPU generic) | [#755](https://github.com/stnolting/neorv32/pull/755) |
| 19.12.2023 | 1.9.2.6 | rework FIFO component fixing problems with inferring block RAM | [#754](https://github.com/stnolting/neorv32/pull/754) |
| 11.12.2023 | 1.9.2.5 | clean-up software framework | [#752](https://github.com/stnolting/neorv32/pull/752) |
| 09.12.2023 | 1.9.2.4 | minor rtl code cleanups | [#747](https://github.com/stnolting/neorv32/pull/747) |
| 09.12.2023 | 1.9.2.3 | refine behavior of CPU's sleep state & signal | [#746](https://github.com/stnolting/neorv32/pull/746) |
| 05.12.2023 | 1.9.2.2 | reset `mstatus.mpp` to "machine-mode" | [#745](https://github.com/stnolting/neorv32/pull/745) |
| 02.12.2023 | 1.9.2.1 | :sparkles: add RISC-V `Zicond` ISA extension (integer conditional operations) | [#743](https://github.com/stnolting/neorv32/pull/743) |
| 01.12.2023 | [**:rocket:1.9.2**](https://github.com/stnolting/neorv32/releases/tag/v1.9.2) | **New release** | |
| 01.12.2023 | 1.9.1.9 | add `menvcfg[h]` CSRs | [#741](https://github.com/stnolting/neorv32/pull/741) |
| 30.11.2023 | 1.9.1.8 | :sparkles: :bug: upgrade RISC-V hardware trigger module (`Sdtrig` ISA extension) to spec. version v1.0 (fixing several minor bugs) | [#739](https://github.com/stnolting/neorv32/pull/739) |
| 25.11.2023 | 1.9.1.7 | cleanup/update assertions and auto-adjusting of invalid generic configurations | [#738](https://github.com/stnolting/neorv32/pull/738) |
| 25.11.2023 | 1.9.1.6 | :sparkles: add option for "ASIC style" register file that provides a full/dedicated hardware reset | [#736](https://github.com/stnolting/neorv32/pull/736) |
| 23.11.2023 | 1.9.1.5 | clean-up & rework CPU branch logic | [#735](https://github.com/stnolting/neorv32/pull/735) |
| 21.11.2023 | 1.9.1.4 | :bug: fix bug in handling of "misaligned instruction exception" | [#734](https://github.com/stnolting/neorv32/pull/734) |
| 20.11.2023 | 1.9.1.3 | :bug: fix wiring of FPU exception flags | [#733](https://github.com/stnolting/neorv32/pull/733) |
| 18.11.2023 | 1.9.1.2 | add XIP clock divider to fine-tune SPI frequency | [#731](https://github.com/stnolting/neorv32/pull/731) |
| 18.11.2023 | 1.9.1.1 | (re-)add SPI high-speed mode, :bug: fix bug in SPI shift register - introduced in v1.9.0.9 | [#730](https://github.com/stnolting/neorv32/pull/730) |
| 14.11.2023 | [**:rocket:1.9.1**](https://github.com/stnolting/neorv32/releases/tag/v1.9.1) | **New release** | |
| 11.11.2023 | 1.9.0.9 | :test_tube: add full hardware reset for **all** flip flops in CPU/processor | [#724](https://github.com/stnolting/neorv32/pull/724) |
| 09.11.2023 | 1.9.0.8 | minor rtl code cleanups | [#723](https://github.com/stnolting/neorv32/pull/723) |
| 04.11.2023 | 1.9.0.7 | upgrade true random number generator to [neoTRNG version 3](https://github.com/stnolting/neoTRNG) | [#721](https://github.com/stnolting/neorv32/pull/721) |
| 31.10.2023 | 1.9.0.6 | update crt0's early-boot trap handler | [#719](https://github.com/stnolting/neorv32/pull/719) |
| 30.10.2023 | 1.9.0.5 | minor rtl cleanups and code beautification | [#718](https://github.com/stnolting/neorv32/pull/718) |
| 28.10.2023 | 1.9.0.4 | :warning: :sparkles: move FreeRTOS port and demo to a new repository: https://github.com/stnolting/neorv32-freertos | [#716](https://github.com/stnolting/neorv32/pull/716) |
| 18.10.2023 | 1.9.0.3 | :warning: remove top's `CPU_EXTENSION_RISCV_Zifencei` generic - `Zifencei` ISA extension is now always enabled | [#709](https://github.com/stnolting/neorv32/pull/709) |
| 16.10.2023 | 1.9.0.2 | minor CPU control cleanups and optimizations (branch system) | [#707](https://github.com/stnolting/neorv32/pull/707) |
| 13.10.2023 | 1.9.0.1 | update software framework to GCC-13.2.0 | [#705](https://github.com/stnolting/neorv32/pull/705) |
| 13.10.2023 | [**:rocket:1.9.0**](https://github.com/stnolting/neorv32/releases/tag/v1.9.0) | **New release** | |
| 13.10.2023 | 1.8.9.9 | minor hardware edits and optimizations | [#703](https://github.com/stnolting/neorv32/pull/703) |
| 07.10.2023 | 1.8.9.8 | add "transfer done" flag to DMA | [#699](https://github.com/stnolting/neorv32/pull/699) |
| 04.10.2023 | 1.8.9.7 | :warning: rework internal bus protocol | [#697](https://github.com/stnolting/neorv32/pull/697) |
| 29.09.2023 | 1.8.9.6 | optimize PMP logic (reducing area requirements) | [#695](https://github.com/stnolting/neorv32/pull/695) |
| 29.09.2023 | 1.8.9.5 | minor CPU optimizations and code clean-ups | [#694](https://github.com/stnolting/neorv32/pull/694) |
| 23.09.2023 | 1.8.9.4 | :sparkles: added vectored trap handling mode of `mtvec` for reduced latency from IRQ to ISR | [#691](https://github.com/stnolting/neorv32/pull/691) |
| 22.09.2023 | 1.8.9.3 | :lock: **watchdog**: add reset password and optional "strict" mode for increased safety | [#692](https://github.com/stnolting/neorv32/pull/692) |
| 15.09.2023 | 1.8.9.2 | :warning: rework CFU CSRs; minor rtl edits | [#690](https://github.com/stnolting/neorv32/pull/690) |
| 11.09.2023 | 1.8.9.1 | minor rtl edits and updates | [#684](https://github.com/stnolting/neorv32/pull/684) |
| 09.09.2023 | [**:rocket:1.8.9**](https://github.com/stnolting/neorv32/releases/tag/v1.8.9) | **New release** | |
| 08.09.2023 | 1.8.8.9 | removed unused `mcontext` and `scontext` CSRs (`Sdtrig` ISA extension); CPU optimizations (area and timing) | [#683](https://github.com/stnolting/neorv32/pull/683) |
| 02.09.2023 | 1.8.8.8 | :sparkles: add option to implement **up to 2^32 CFU-internal user-defined CSRs** (via indirect CSR access) | [#681](https://github.com/stnolting/neorv32/pull/681) |
| 02.09.2023 | 1.8.8.7 | :lock: (re-)add **execution monitor**: raise an exception if a multi-cycle ALU operation does not complete within a bound amount of time | [#680](https://github.com/stnolting/neorv32/pull/680) |
| 01.09.2023 | 1.8.8.6 | minor rtl edits and cleanups | [#679](https://github.com/stnolting/neorv32/pull/679) |
| 30.08.2023 | 1.8.8.5 | remove "branch prediction" logic - core is smaller and _even faster_ without it | [#678](https://github.com/stnolting/neorv32/pull/678) |
| 25.08.2023 | 1.8.8.4 | add new generic to downgrade on-chip debugger's debug module back to spec. version 0.13 (`DM_LEGACY_MODE` generic) | [#677](https://github.com/stnolting/neorv32/pull/677) |
| 23.08.2023 | 1.8.8.3 | :test_tube: add experimental `Smcntrpmf` ISA extension (counter privilege mode filtering; spec. is frozen but not yet ratified); remove unused `menvcfg` CSRs | [#676](https://github.com/stnolting/neorv32/pull/676) |
| 19.08.2023 | 1.8.8.2 | :warning: constrain `mtval` CSR; add support for `mtinst` CSR (trap instruction) | [#674](https://github.com/stnolting/neorv32/pull/674) |
| 19.08.2023 | 1.8.8.1 | :test_tube: update RTE to support easy emulation of instructions; add example program to showcase how to emulate unaligned memory accesses | [#673](https://github.com/stnolting/neorv32/pull/673) |
| 18.08.2023 | [**:rocket:1.8.8**](https://github.com/stnolting/neorv32/releases/tag/v1.8.8) | **New release** | |
| 17.08.2023 | 1.8.7.9 | minor rtl edits and cleanups | [#672](https://github.com/stnolting/neorv32/pull/672) |
| 13.08.2023 | 1.8.7.8 | :warning: constrain/optimize `mtval` and `mcounteren` CSRs | [#671](https://github.com/stnolting/neorv32/pull/671) |
| 12.08.2023 | 1.8.7.7 | remove _unratified_ `Zicond` ISA extension; minor rtl code cleanups and optimizations | [#670](https://github.com/stnolting/neorv32/pull/670) |
| 05.08.2023 | 1.8.7.6 | :bug: fix bug: HPM width configurations below 32 bit fail | [#665](https://github.com/stnolting/neorv32/pull/665) |
| 04.08.2023 | 1.8.7.5 | :warning: major code edits/cleanups and file renaming | [#664](https://github.com/stnolting/neorv32/pull/664) |
| 29.07.2023 | 1.8.7.4 | RTL cleanup and optimizations (fewer synthesis warnings, lower resource requirements) | [#660](https://github.com/stnolting/neorv32/pull/660) |
| 28.07.2023 | 1.8.7.3 | :warning: reworked **SYSINFO** module; clean-up address space layout; clean-up assertion notes | [#659](https://github.com/stnolting/neorv32/pull/659) |
| 27.07.2023 | 1.8.7.2 | :bug: make sure that IMEM/DMEM size is always a power of two | [#658](https://github.com/stnolting/neorv32/pull/658) |
| 27.07.2023 | 1.8.7.1 | :warning: remove `CUSTOM_ID` generic; cleanup and re-layout `NEORV32_SYSINFO.SOC` bits; (:bug:) fix gateway's generics (`positive` -> `natural` as these generics are allowed to be zero) | [#657](https://github.com/stnolting/neorv32/pull/657) |
| 26.07.2023 | [**:rocket:1.8.7**](https://github.com/stnolting/neorv32/releases/tag/v1.8.7) | **New release** | |
| 24.07.2023 | 1.8.6.10 | :bug: fixing some LR/SC design flaws | [#654](https://github.com/stnolting/neorv32/pull/654) |
| 23.07.2023 | 1.8.6.9 | optimize bus system and customization options | [#653](https://github.com/stnolting/neorv32/pull/653) |
| 22.07.2023 | 1.8.6.8 | minor rtl edits | [#652](https://github.com/stnolting/neorv32/pull/652) |
| 21.07.2023 | 1.8.6.7 | :sparkles: add support for **RISC-V A ISA Extension** (atomic memory accesses; `lr.w`/`sc.w` only!) | [#651](https://github.com/stnolting/neorv32/pull/651) |
| 18.07.2023 | 1.8.6.6 | :warning: remove 32-bit data dump mode from UART0/1 sim-mode | [#650](https://github.com/stnolting/neorv32/pull/650) |
| 16.07.2023 | 1.8.6.5 | :warning: **rework SoC bus system & memory map - part 3**: re-enforce PMAs (physical memory attributes) | [#648](https://github.com/stnolting/neorv32/pull/648) |
| 15.07.2023 | 1.8.6.4 | :warning: **rework SoC bus system & memory map - part 2**: move IO address decoding to central IO switch; add i-cache uncached accesses | [#648](https://github.com/stnolting/neorv32/pull/648) |
| 14.07.2023 | 1.8.6.3 | :warning: **rework SoC bus system & memory map - part 1**: add central bus gateway to control core accesses to the main address regions | [#648](https://github.com/stnolting/neorv32/pull/648) |
| 14.07.2023 | 1.8.6.2 | minor rtl edits | [#646](https://github.com/stnolting/neorv32/pull/646) |
| 30.06.2023 | 1.8.6.1 | minor rtl edits, cleanups and optimizations | [#641](https://github.com/stnolting/neorv32/pull/641) |
| 27.06.2023 | [**:rocket:1.8.6**](https://github.com/stnolting/neorv32/releases/tag/v1.8.6) | **New release** | |
| 24.06.2023 | 1.8.5.9 | :test_tube: VHDL code: use entity instantiation instead of component instantiation | [#637](https://github.com/stnolting/neorv32/pull/637) |
| 24.06.2023 | 1.8.5.8 | optimize CPU control logic; closed further invalid instruction word detection holes | [#636](https://github.com/stnolting/neorv32/pull/636) |
| 23.06.2023 | 1.8.5.7 | :warning: remove **buskeeper's status register** | [#635](https://github.com/stnolting/neorv32/pull/635) |
| 17.06.2023 | 1.8.5.6 | :sparkles: add new **Cyclic Redundancy Check module (CRC)** | [#632](https://github.com/stnolting/neorv32/pull/632) |
| 03.06.2023 | 1.8.5.5 | :sparkles: re-add (simplified) **Stream Link Interface (SLINK)** | [#628](https://github.com/stnolting/neorv32/pull/628) |
| 03.06.2023 | 1.8.5.4 | :warning: rearrange bits in **SYSINFO** | [#627](https://github.com/stnolting/neorv32/pull/627) |
| 02.06.2023 | 1.8.5.3 | :bug: executable generation: fix address continuity between `.text` and `.rodata` segments | [#626](https://github.com/stnolting/neorv32/pull/626) |
| 19.05.2023 | 1.8.5.2 | :sparkles: add automatic trigger mode to **DMA** (trigger transfer if a processor-internal peripheral issues an interrupt request) | [#618](https://github.com/stnolting/neorv32/pull/618) |
| 18.05.2023 | 1.8.5.1 | software can now retrieve the configured FIFO size of the **TRNG** | [#616](https://github.com/stnolting/neorv32/pull/616) |
| 18.05.2023 | [**:rocket:1.8.5**](https://github.com/stnolting/neorv32/releases/tag/v1.8.5) | **New release** | |
| 18.05.2023 | 1.8.4.9 | remove `is_simulation` flag from SYSINFO; add programmable interrupt to **TRNG** module | [#615](https://github.com/stnolting/neorv32/pull/615) |
| 12.05.2023 | 1.8.4.8 | `mtval` CSR now provides the address of `ebreak` exceptions (re-added temporarily to pass RISC-V ISA tests) | [#611](https://github.com/stnolting/neorv32/pull/611) |
| 03.05.2023 | 1.8.4.7 | :bug: fix bug in FPU (terminate FPU sub-module operations if an exception has been raised) | [#609](https://github.com/stnolting/neorv32/pull/609) |
| 02.05.2023 | 1.8.4.6 | make SDI FIFO access entirely synchronous; upgrade processor memory modules; update test setup wrappers | [#608](https://github.com/stnolting/neorv32/pull/608) |
| 30.04.2023 | 1.8.4.5 | rework processor-internal bus system | [#607](https://github.com/stnolting/neorv32/pull/607) |
| 27.04.2023 | 1.8.4.4 | minor hardware edits and switching activity optimizations of CPU bus unit | [#605](https://github.com/stnolting/neorv32/pull/605) |
| 25.04.2023 | 1.8.4.3 | :bug: fix bug in **DMA** (corrupted write-back when there are bus wait cycles - e.g. when no caches are implemented) | [#601](https://github.com/stnolting/neorv32/pull/601) |
| 24.04.2023 | 1.8.4.2 | minor rtl edits; shorten critical path of d-cache setup | [#599](https://github.com/stnolting/neorv32/pull/599) |
| 22.04.2023 | 1.8.4.1 | :sparkles: add optional **direct memory access controller (DMA)** | [#593](https://github.com/stnolting/neorv32/pull/593) |
| 21.04.2023 | [**:rocket:1.8.4**](https://github.com/stnolting/neorv32/releases/tag/v1.8.4) | **New release** | |
| 21.04.2023 | 1.8.3.9 | :bug: fix timeout bug in **FPU** normalizer | [#592](https://github.com/stnolting/neorv32/pull/592) |
| 19.04.2023 | 1.8.3.8 | minor processor bus system optimizations and clean-ups | [#591](https://github.com/stnolting/neorv32/pull/591) |
| 15.04.2023 | 1.8.3.7 | :bug: :warning: `wfi` and XIRQ bug fixes; massive RTL code cleanup and optimization of CPU control | [#586](https://github.com/stnolting/neorv32/pull/586) |
| 14.04.2023 | 1.8.3.6 | [UARTs] software can now retrieve the configured RX/TX FIFO sizes from the `DATA` register | [#581](https://github.com/stnolting/neorv32/pull/581) |
| 13.04.2023 | 1.8.3.5 | :bug: fixed bug in FPU control logic (introduced in some earlier clean-up commit); minor code edits and optimizations | [#578](https://github.com/stnolting/neorv32/pull/578) |
| 07.04.2023 | 1.8.3.4 | rtl edits and cleanups | [#571](https://github.com/stnolting/neorv32/pull/571) |
| 05.04.2023 | 1.8.3.3 | update **external interrupt controller (XIRQ)** | [#570](https://github.com/stnolting/neorv32/pull/570) |
| 05.04.2023 | 1.8.3.2 | `time` CSR struggles (again) and logic optimization | [#569](https://github.com/stnolting/neorv32/pull/569) |
| 01.04.2023 | 1.8.3.1 | :sparkles: add full `NA4` and `NAPOT` support to the (now) RISC-V-compatible **physical memory protection (PMP)** | [#566](https://github.com/stnolting/neorv32/pull/566) |
| 31.03.2023 | [**:rocket:1.8.3**](https://github.com/stnolting/neorv32/releases/tag/v1.8.3) | **New release** | |
| 29.03.2023 | 1.8.2.9 | :warning: remove `CPU_EXTENSION_RISCV_Zicsr` generic - `Zicsr` ISA extension is always enabled; optimize bus switch; VHDL code cleanups | [#562](https://github.com/stnolting/neorv32/pull/562) |
| 25.03.2023 | 1.8.2.8 | :test_tube: add configurable data cache (**dCACHE**) | [#560](https://github.com/stnolting/neorv32/pull/560) |
| 24.03.2023 | 1.8.2.7 | :sparkles: add full support of `mcounteren` CSR; cleanup counter and PMP CSRs; i-cache optimization | [#559](https://github.com/stnolting/neorv32/pull/559) |
| 18.03.2023 | 1.8.2.6 | add new generic `JEDEC_ID` (official JEDEC identifier; used for `mvendorid` CSR); further generics cleanups | [#557](https://github.com/stnolting/neorv32/pull/557) |
| 17.03.2023 | 1.8.2.5 | add RISC-V `time[h]` CSRs (part of the `Zicntr` ISA extension) | [#556](https://github.com/stnolting/neorv32/pull/556) |
| 17.03.2023 | 1.8.2.4 | re-add VHDL process names | [#555](https://github.com/stnolting/neorv32/pull/555) |
| 15.03.2023 | 1.8.2.3 | rtl reworks, cleanups and optimizations | [#550](https://github.com/stnolting/neorv32/pull/550) |
| 11.03.2023 | 1.8.2.2 | :sparkles: add support for RISC-V `Zicond` ISA extension (conditional operations) | [#546](https://github.com/stnolting/neorv32/pull/546) |
| 10.03.2023 | 1.8.2.1 | rtl code edits, clean-ups and minor optimizations (improve branch prediction) | [#545](https://github.com/stnolting/neorv32/pull/545) |
| 10.03.2023 | [**:rocket:1.8.2**](https://github.com/stnolting/neorv32/releases/tag/v1.8.2) | **New release** | |
| 09.03.2023 | 1.8.1.10 | :warning: move tri-state drivers (ONEWIRE and TWI) out of the core | [#543](https://github.com/stnolting/neorv32/pull/543) |
| 08.03.2023 | 1.8.1.9 | reintegrate **UART** RTS/CTS hardware flow-control | [#541](https://github.com/stnolting/neorv32/pull/541) |
| 07.03.2023 | 1.8.1.8 | update smart LED controller **NEOLED** | [#536](https://github.com/stnolting/neorv32/pull/536) |
| 05.03.2023 | 1.8.1.7 | :warning: rework and update **UART0 & UART1** | [#533](https://github.com/stnolting/neorv32/pull/533) |
| 04.03.2023 | 1.8.1.6 | :warning: rework and update **SPI** module | [#530](https://github.com/stnolting/neorv32/pull/530) |
| 02.03.2023 | 1.8.1.5 | minor general purpose timer (GPTMR) code edits | [#529](https://github.com/stnolting/neorv32/pull/529) |
| 02.03.2023 | 1.8.1.4 | :bug: fix timeout bug in **FPU** (conversion and add/sub instructions) | [#528](https://github.com/stnolting/neorv32/pull/528) |
| 25.02.2023 | 1.8.1.3 | :sparkles: add new processor module: **Serial Data Interface (SDI)** - a SPI _device-class_ interface | [#505](https://github.com/stnolting/neorv32/pull/505) |
| 24.02.2023 | 1.8.1.2 | :warning: rename top interface signals of **XIP** and **SPI** modules | [#504](https://github.com/stnolting/neorv32/pull/504) |
| 23.02.2023 | 1.8.1.1 | CFS: add another 32 interface registers (now having 64 memory-mapped registers for custom usage) | [#503](https://github.com/stnolting/neorv32/pull/503) |
| 23.02.2023 | [**:rocket:1.8.1**](https://github.com/stnolting/neorv32/releases/tag/v1.8.1) | **New release** | |
| 22.02.2023 | 1.8.0.10 | :warning: **remove stream link interface (SLINK)** | [#502](https://github.com/stnolting/neorv32/pull/502) |
| 19.02.2023 | 1.8.0.9 | :warning: constrain number of **PWM** channels to 12 (was 60); change base address of PWM module | [#501](https://github.com/stnolting/neorv32/pull/501) |
| 18.02.2023 | 1.8.0.8 | :bug: fix minor bug in CPU's co-processor monitor; minor VHDL clean-ups and edits | [#500](https://github.com/stnolting/neorv32/pull/500) |
| 13.02.2023 | 1.8.0.7 | minor CPU optimization and fixes | [#497](https://github.com/stnolting/neorv32/pull/497) |
| 11.02.2023 | 1.8.0.6 | :warning: replace boolean `IO_GPIO_EN` generic by natural `IO_GPIO_NUM` generic to fine-tune GPIO pin number | [#491](https://github.com/stnolting/neorv32/pull/491) |
| 10.02.2023 | 1.8.0.5 | :test_tube: add CPU co-processor monitor (to auto-terminate operation if a co-processor operation takes too long) | [#490](https://github.com/stnolting/neorv32/pull/490) |
| 10.02.2023 | 1.8.0.4 | replace CPU-internal control bus by a VHDL `record` (much cleaner code); minor control optimizations; add 6th CPU co-processor slot (yet unused) | [#489](https://github.com/stnolting/neorv32/pull/489) |
| 05.02.2023 | 1.8.0.3 | CPU control optimizations | [#487](https://github.com/stnolting/neorv32/pull/487) |
| 04.02.2023 | 1.8.0.2 | fix RISC-V-incompatible behavior of `mip` CSR | [#486](https://github.com/stnolting/neorv32/pull/486) |
| 01.02.2023 | 1.8.0.1 | clean-up CPU's interrupt controller; fix race condition in FIRQ trigger/acknowledge | [#484](https://github.com/stnolting/neorv32/pull/484) |
| 25.01.2023 | [**:rocket:1.8.0**](https://github.com/stnolting/neorv32/releases/tag/v1.8.0) | **New release** | |
| 21.01.2023 | 1.7.9.10 | update software framework; :bug: fix bug in constructor calling in `crt0` start-up code | [#478](https://github.com/stnolting/neorv32/pull/478) |
| 15.01.2023 | 1.7.9.9 | :warning: rework **CPU counters**; remove `mtime_i/o` top entity ports; remove `time[h]` CSRs | [#477](https://github.com/stnolting/neorv32/pull/477) |
| 14.01.2023 | 1.7.9.8 | minor CPU control edits, optimizations and fixes | [#476](https://github.com/stnolting/neorv32/pull/476) |
| 10.01.2023 | 1.7.9.7 | :warning: rework **watchdog timer (WDT)** | [#474](https://github.com/stnolting/neorv32/pull/474) |
| 06.01.2023 | 1.7.9.6 | update [neoTRNG v2](https://github.com/stnolting/neoTRNG) | [#472](https://github.com/stnolting/neorv32/pull/472) |
| 06.01.2023 | 1.7.9.5 | CPU control: logic optimization and fix minor bug in trigger module | [#470](https://github.com/stnolting/neorv32/pull/470) |
| 04.01.2023 | 1.7.9.4 | update **on-chip debugger**: :test_tube: remove debug module's `haltsum0` register; rework DMI to comply with RISC-V debug spec.; minor edits, updates and fixes | [#468](https://github.com/stnolting/neorv32/pull/468) |
| 23.12.2022 | 1.7.9.3 | :warning: add explicit `Sdext` and `Sdtrig` ISA extension generics (replacing `DEBUG`); :sparkles: trigger-module can now also be used by machine-mode software without the on-chip debugger, add minimal example program `sw/example/demo_trigger_module` | [#465](https://github.com/stnolting/neorv32/pull/465) |
| 23.12.2022 | 1.7.9.2 | :sparkles: upgrade the **on-chip debugger (OCD)** to spec. version 1.0; major logic and debugging response time optimizations | [#463](https://github.com/stnolting/neorv32/pull/463) |
| 22.12.2022 | 1.7.9.1 | remove signal initialization (in reset generator) as some FPGAs do not support FF initialization via bitstream | [#464](https://github.com/stnolting/neorv32/pull/464) |
| 21.12.2022 | [**:rocket:1.7.9**](https://github.com/stnolting/neorv32/releases/tag/v1.7.9) | **New release** | |
| 21.12.2022 | 1.7.8.11 | CPU: remove explicit reset-to-don't-care; branch and CSR access check logic optimizations; close further illegal instruction encoding hole | [#462](https://github.com/stnolting/neorv32/pull/462) |
| 20.12.2022 | 1.7.8.10 | SOC: rework r/w access logic; split read and write accesses into two processes; removed explicit reset-to-don't-care | [#461](https://github.com/stnolting/neorv32/pull/461) |
| 18.12.2022 | 1.7.8.9 | `mtval` is no longer read-only and can now be written by machine-mode software | [#460](https://github.com/stnolting/neorv32/pull/460) |
| 17.12.2022 | 1.7.8.8 | :bug: fix incorrect value written to `mepc` when encountering an "instruction access fault" exception | [#458](https://github.com/stnolting/neorv32/pull/458) |
| 16.12.2022 | 1.7.8.7 | :bug: fix **instruction cache** block invalidation when a bus access error occurs during memory block fetch (after cache miss) | [#457](https://github.com/stnolting/neorv32/pull/457) |
| 16.12.2022 | 1.7.8.6 | :test_tube: optimized park-loop code (**on-chip debugger firmware**) providing slightly faster debugging response; added explicit address generics for defining debug mode entry points | [#456](https://github.com/stnolting/neorv32/pull/456) |
| 13.12.2022 | 1.7.8.5 | code cleanup of FIFO module; improved **instruction prefetch buffer (IPB)** - IPB depth can be as small as "1" and will be adjusted automatically when enabling the `C` ISA extension; update hardware implementation results | [#455](https://github.com/stnolting/neorv32/pull/455) |
| 09.12.2022 | 1.7.8.4 | :sparkles: new option to add custom **R5-type** (4 source registers, 1 destination register) instructions to **Custom Functions Unit (CFU)** | [#452](https://github.com/stnolting/neorv32/pull/452) |
| 08.12.2022 | 1.7.8.3 | :bug: fix interrupt behavior when in user-mode; minor core rtl fixes; do not check register specifiers in CFU instructions (i.e. using registers above `x15` when `E` ISA extension is enabled) | [#450](https://github.com/stnolting/neorv32/pull/450) |
| 03.12.2022 | 1.7.8.2 | :sparkles: new option to add custom **R4-type** RISC-V instructions to **Custom Functions Unit (CFU)**; rework CFU hardware module, intrinsic library and example program | [#449](https://github.com/stnolting/neorv32/pull/449) |
| 01.12.2022 | 1.7.8.1 | package cleanup | [#447](https://github.com/stnolting/neorv32/pull/447) |
| 28.11.2022 | [**:rocket:1.7.8**](https://github.com/stnolting/neorv32/releases/tag/v1.7.8) | **New release** | |
| 14.11.2022 | 1.7.7.9 | minor rtl edits and code optimizations | [#442](https://github.com/stnolting/neorv32/pull/442) |
| 05.11.2022 | 1.7.7.8 | minor rtl edits | [#441](https://github.com/stnolting/neorv32/pull/441) |
| 03.11.2022 | 1.7.7.7 | :sparkles: **TWI** module: add fine-grained clock configuration, add clock stretching configuration flag | [#440](https://github.com/stnolting/neorv32/pull/440) |
| 01.11.2022 | 1.7.7.6 | :warning: rework **SPI module** | [#438](https://github.com/stnolting/neorv32/pull/438) |
| 24.10.2022 | 1.7.7.5 | :test_tube: remove weird Quartus latch warnings by modifying VHDL coding style | [#434](https://github.com/stnolting/neorv32/pull/434) |
| 19.10.2022 | 1.7.7.4 | optimize UART's `RTS` (hardware flow control) behavior | [#433](https://github.com/stnolting/neorv32/pull/433) |
| 15.10.2022 | 1.7.7.3 | :bug: fix bug in `is_power_of_two_f` VHDL function (thanks Alan!) | [#428](https://github.com/stnolting/neorv32/pull/428) |
| 12.10.2022 | 1.7.7.2 | add dedicated hardware reset to _all_ CPU counters (`[m]cycle[h]`, `[m]instret[h]`, `[m]hpmcounter[h]`); :sparkles: **all CSRs now provide a dedicated hardware reset** | [#426](https://github.com/stnolting/neorv32/pull/426) |
| 09.10.2022 | 1.7.7.1 | fix Quartus synthesis issue (VHDL): make sure reset state is the _first_ entry in a state list | [#423](https://github.com/stnolting/neorv32/pull/423) |
| 24.09.2022 | [**:rocket:1.7.7**](https://github.com/stnolting/neorv32/releases/tag/v1.7.7) | **New release** | |
| 23.09.2022 | 1.7.6.10 | cleanup native data path size (remove `data_width_c` package constant); initial preparations to **support RV64 ISA extension** somewhere in the future | [#417](https://github.com/stnolting/neorv32/pull/417) |
| 18.09.2022 | 1.7.6.9 | :bug: fixed instruction decoding collision in **`B` ISA extensions** - `B` extension is now fully operational and verified (see [neorv32-riscof](https://github.com/stnolting/neorv32-riscof))! | [#413](https://github.com/stnolting/neorv32/pull/413) |
| 13.09.2022 | 1.7.6.8 | :bug: bug fix: clearing `mie`'s FIRQ bits did not clear the according _pending_ FIRQs | [#411](https://github.com/stnolting/neorv32/pull/411) |
| 12.09.2022 | 1.7.6.7 | minor rtl edits and cleanups | [#410](https://github.com/stnolting/neorv32/pull/410) |
| 10.09.2022 | 1.7.6.6 | :warning: set `mtval` to _zero_ on any illegal instruction exception - removes redundancies, simplifies hardware | [#409](https://github.com/stnolting/neorv32/pull/409) |
| 09.09.2022 | 1.7.6.5 | minor rtl edits; add "output gate" to FIFO component | [#408](https://github.com/stnolting/neorv32/pull/408) |
| 08.09.2022 | 1.7.6.4 | :warning: cleanup CPU standard counters and remove _CPU_CNT_WIDTH_ generic | [#407](https://github.com/stnolting/neorv32/pull/407) |
| 07.09.2022 | 1.7.6.3 | minor rtl edits and cleanups | [#406](https://github.com/stnolting/neorv32/pull/406) |
| 03.09.2022 | 1.7.6.2 | cleanup hardware reset logic | [#405](https://github.com/stnolting/neorv32/pull/405) |
| 02.09.2022 | 1.7.6.1 | :sparkles: add new processor module: **1-Wire Interface Controller** (ONEWIRE) | [#402](https://github.com/stnolting/neorv32/pull/402) |
| 28.08.2022 | [**:rocket:1.7.6**](https://github.com/stnolting/neorv32/releases/tag/v1.7.6) | **New release** | |
| 27.08.2022 | 1.7.5.9 | fix minor core rtl issues that were found while experimenting with a low-level netlist of the processor | [#398](https://github.com/stnolting/neorv32/pull/398) |
| 26.08.2022 | 1.7.5.8 | cleanup **crt0** start-up code: remove setup of `mcountern` and `mcountinhibit` CSRs | [#397](https://github.com/stnolting/neorv32/pull/397) |
| 24.08.2022 | 1.7.5.7 | minor rtl cleanups | [#396](https://github.com/stnolting/neorv32/pull/396) |
| 20.08.2022 | 1.7.5.6 | :sparkles: update software framework to GCC 12.1.0 (new prebuilt toolchains available!) | [#391](https://github.com/stnolting/neorv32/pull/391) |
| 18.08.2022 | 1.7.5.5 | :lock: add **TRNG** read data protection | [#389](https://github.com/stnolting/neorv32/pull/389) |
| 18.08.2022 | 1.7.5.4 | minor rtl cleanup in **PWM** module | [#388](https://github.com/stnolting/neorv32/pull/388) |
| 17.08.2022 | 1.7.5.3 | optimized **CPU front-end** - faster instruction fetch | [#387](https://github.com/stnolting/neorv32/pull/387) |
| 16.08.2022 | 1.7.5.2 | relocate TWI tri-state drivers | [#386](https://github.com/stnolting/neorv32/pull/386) |
| 15.08.2022 | 1.7.5.1 | change base address of **BUSKEEPER** | [#385](https://github.com/stnolting/neorv32/pull/385) |
| 15.08.2022 | [**:rocket:1.7.5**](https://github.com/stnolting/neorv32/releases/tag/v1.7.5) | **New release** | |
| 14.08.2022 | 1.7.4.10 | cleanup of FIFO rtl component | [#384](https://github.com/stnolting/neorv32/pull/384) |
| 13.08.2022 | 1.7.4.9 | minor rtl cleanups and optimizations | [#383](https://github.com/stnolting/neorv32/pull/383) |
| 01.08.2022 | 1.7.4.8 | :sparkles: add configurable data FIFO to **SPI** module | [#381](https://github.com/stnolting/neorv32/pull/381) |
| 31.07.2022 | 1.7.4.7 | :warning: rework **SLINK** module | [#377](https://github.com/stnolting/neorv32/pull/377) |
| 25.07.2022 | 1.7.4.6 | :warning: simplify memory configuration of **linker script**; :sparkles: add in-console configuration option | [#375](https://github.com/stnolting/neorv32/pull/375) |
| 22.07.2022 | 1.7.4.5 | add `CUSTOM_ID` generic; update bootloader | [#374](https://github.com/stnolting/neorv32/pull/374) |
| 21.07.2022 | 1.7.4.4 | :lock: specify **physical memory attributes (PMA)** | [#372](https://github.com/stnolting/neorv32/pull/372) |
| 18.07.2022 | 1.7.4.3 | minor rtl edits and updates | [#369](https://github.com/stnolting/neorv32/pull/369) |
| 15.07.2022 | 1.7.4.2 | :bug: fixed PMP configuration error when `PMP_NUM_REGIONS` = 0 | [#368](https://github.com/stnolting/neorv32/pull/368) |
| 15.07.2022 | 1.7.4.1 | :bug: fix permanent stall of `[m]cycle[h]` and `[m]instret[h]` counter if _HPM_NUM_CNTS_ = 0; :bug: fixed bug in Wishbone `we` signal when _ASYNC_TX_ mode enabled; hardwire `dcsr.mprven` to 1 | [#367](https://github.com/stnolting/neorv32/pull/367) |
| 14.07.2022 | [**:rocket:1.7.4**](https://github.com/stnolting/neorv32/releases/tag/v1.7.4) | **New release** | |
| 14.07.2022 | 1.7.3.11 | reset all "core" CSRs to all-zero | [#366](https://github.com/stnolting/neorv32/pull/366) |
| 13.07.2022 | 1.7.3.10 | :bug: reworked/fixed **physical memory protection**; :sparkles: added `mstatus.MPRV` flag | [#365](https://github.com/stnolting/neorv32/pull/365) |
| 12.07.2022 | 1.7.3.9 | clean-up and rework **bootloader**; :sparkles: add "boot via XIP" option | [#364](https://github.com/stnolting/neorv32/pull/364) |
| 11.07.2022 | 1.7.3.8 | **physical memory protection (PMP)**: locking entry `i` in TOR mode will now also prevent write access to `pmpaddr(i-1)` (RISC-V compatibility) | [#363](https://github.com/stnolting/neorv32/pull/363) |
| 09.07.2022 | 1.7.3.7 | :bug: fixed **bootloader's** byte order when using the flash for application storage: :warning: was BIG-endian, is now also LITTLE-endian | [#362](https://github.com/stnolting/neorv32/pull/362) |
| 08.07.2022 | 1.7.3.6 | :test_tube: added burst mode option to **XIP module** to accelerate consecutive flash read accesses; :warning: fixed XIP endianness: was BIG-endian and is now LITTLE-endian | [#361](https://github.com/stnolting/neorv32/pull/361) |
| 08.07.2022 | 1.7.3.5 | Update "raw" executable generation options of makefile and image generator | [#360](https://github.com/stnolting/neorv32/pull/360) |
| 05.07.2022 | 1.7.3.4 | add "infrastructure" for cached (burst) bus accesses | [#359](https://github.com/stnolting/neorv32/pull/359) |
| 01.07.2022 | 1.7.3.3 | minor rtl cleanups | [#357](https://github.com/stnolting/neorv32/pull/357) |
| 29.06.2022 | 1.7.3.2 | :test_tube: add experimental core complex wrapper for integration into the [**LiteX**](https://github.com/enjoy-digital/litex) SoC builder framework | [#353](https://github.com/stnolting/neorv32/pull/353) |
| 28.06.2022 | 1.7.3.1 | :bug: fix bug that caused a permanent CPU stall when executing an illegal load/store instruction | [#356](https://github.com/stnolting/neorv32/pull/356) |
| 23.06.2022 | [**:rocket:1.7.3**](https://github.com/stnolting/neorv32/releases/tag/v1.7.3) | **New release** _two years NEORV32!_ :tada: | |
| 21.06.2022 | 1.7.2.10 | :sparkles: add option to implement an asynchronous **Wishbone** TX path; add new top generic `MEM_EXT_ASYNC_TX` | [#352](https://github.com/stnolting/neorv32/pull/352) |
| 17.06.2022 | 1.7.2.9 | minor rtl code clean-ups/optimization of **CPU core** and **Neoled** module | [#351](https://github.com/stnolting/neorv32/pull/351) |
| 16.06.2022 | 1.7.2.8 | :warning: rework **SLINK** module, add support for T_LAST signals | [#349](https://github.com/stnolting/neorv32/pull/349) |
| 11.06.2022 | 1.7.2.7 | reworked processor **reset system**; :warning: changed behavior of **watchdog's** "lock" bit; add watchdog "access password" | [#345](https://github.com/stnolting/neorv32/pull/345) |
| 10.06.2022 | 1.7.2.6 | **Wishbone** interface now _gates_ all outgoing signals (= signals remain stable if there is no active Wishbone access) | [#344](https://github.com/stnolting/neorv32/pull/344) |
| 09.06.2022 | 1.7.2.5 | reworked **TWI** module fixing several interface timing issues; :warning: removed "START condition done interrupt" and "STOP condition done interrupt" | [#340](https://github.com/stnolting/neorv32/pull/340) |
| 06.06.2022 | 1.7.2.4 | split executable images into package and body | [#338](https://github.com/stnolting/neorv32/pull/338) |
| 04.06.2022 | 1.7.2.3 | :bug: fixed bug in **SPI** and **XIP** modules: phase offset between SPI clock and SPI data | [#336](https://github.com/stnolting/neorv32/pull/336) |
| 03.06.2022 | 1.7.2.2 | :sparkles: (finally) added a **dedicated hardware reset** to all IO/peripheral devices | [#334](https://github.com/stnolting/neorv32/pull/334) |
| 02.06.2022 | 1.7.2.1 | :sparkles: add **watchdog** pause flag to stop watchdog timeout counter when CPU is in sleep mode | [#331](https://github.com/stnolting/neorv32/pull/331) |
| 02.06.2022 | [**:rocket:1.7.2**](https://github.com/stnolting/neorv32/releases/tag/v1.7.2) | **New release** | |
| 01.06.2022 | 1.7.1.11 | :bug: fixed bug in **debugger's** single-stepping mode (bug introduced with version 1.7.1.9) | [#329](https://github.com/stnolting/neorv32/pull/329) |
| 29.05.2022 | 1.7.1.10 | rework **bootloader's** "SPI flash presence detection"; added new option (`SPI_FLASH_ADDR_BYTES`) to customize the bootloader SPI flash address width (16-, 24- or 32-bit) | [#321](https://github.com/stnolting/neorv32/pull/321) |
| 29.05.2022 | 1.7.1.9 | :bug: fixed bug in **CPU trap logic**: collision of synchronous and asynchronous exceptions | [#327](https://github.com/stnolting/neorv32/pull/327) |
| 19.05.2022 | 1.7.1.8 | :bug: fixed bug in **XIP** address conversion logic: sub-word read accesses (half-word, byte) returned wrong data | [#320](https://github.com/stnolting/neorv32/pull/320) |
| 17.05.2022 | 1.7.1.7 | :sparkles: add optional/configurable data FIFO to **TRNG**; new top generic `IO_TRNG_FIFO` | [#316](https://github.com/stnolting/neorv32/pull/316) |
| 13.05.2022 | 1.7.1.6 | :bug: fixed bug in **BUSKEEPER** timeout logic | [#315](https://github.com/stnolting/neorv32/pull/315) |
| 10.05.2022 | 1.7.1.5 | code clean-up and minor optimization of `B` extension (bit-manipulation) CPU co-processor | [#312](https://github.com/stnolting/neorv32/pull/312) |
| 06.05.2022 | 1.7.1.4 | :sparkles: upgrade TRNG module to new [neoTRNG v2](https://github.com/stnolting/neoTRNG) | [#311](https://github.com/stnolting/neorv32/pull/311) |
| 05.05.2022 | 1.7.1.3 | :bug: bug fix in CPU counter overflow logic (`cycle` and `instret` counters); minor optimization of CPU execution unit | [#310](https://github.com/stnolting/neorv32/pull/310) |
| 28.04.2022 | 1.7.1.2 | add flag to `mxisa` CSR to check if _this_ is a simulation (bit 20: _CSR_MXISA_IS_SIM_); add flag to `mxisa` CSR to check if all CPU core registers have a dedicated reset (bit 21: _CSR_MXISA_HW_RESET_) | [#309](https://github.com/stnolting/neorv32/pull/309) |
| 27.04.2022 | 1.7.1.1 | :warning: **removed RISC-V `A` ISA extension** (atomic memory accesses); removed Wishbone "lock" signal | [#308](https://github.com/stnolting/neorv32/pull/308) |
| 25.04.2022 | [**:rocket:1.7.1**](https://github.com/stnolting/neorv32/releases/tag/v1.7.1) | **New release** | |
| 23.04.2022 | 1.7.0.9 | :bug: fixed minor bug in HPM event logic: imprecise "taken branch" (_HPMCNT_EVENT_TBRANCH_) event | |
| 23.04.2022 | 1.7.0.8 | :sparkles: add simple branch prediction (predict "always taken") to CPU front-end to reduce branch penalty (less wait cycles) | [#306](https://github.com/stnolting/neorv32/pull/306) |
| 22.04.2022 | 1.7.0.7 | reworked CPU's MUL/DIV unit (`M`-extension): less area and shorter critical path | [#305](https://github.com/stnolting/neorv32/pull/305) |
| 21.04.2022 | 1.7.0.6 | further VHDL code clean-ups and minor optimizations | [#303](https://github.com/stnolting/neorv32/pull/303) |
| 19.04.2022 | 1.7.0.5 | minor clean-up and optimization of CPU's bus unit | |
| 13.04.2022 | 1.7.0.4 | improve timing of CPU's barrel shifter (`FAST_SHIFT_EN` = true) by moving the register stage | [#301](https://github.com/stnolting/neorv32/pull/301) |
| 12.04.2022 | 1.7.0.3 | CPU front-end is now controlled by a _synchronous_ state machine (all outgoing signals are driven by registers), reducing critical path of memory system & reducing area costs; :warning: `CPU_IPB_ENTRIES` now has to be >= 2 | [#300](https://github.com/stnolting/neorv32/pull/300) |
| 11.04.2022 | 1.7.0.2 | cleanup of CPU front-end (instruction fetch); cleaner code, less area costs | [#299](https://github.com/stnolting/neorv32/pull/299) |
| 10.04.2022 | 1.7.0.1 | rework handling of `x0` register (`zero`): shortens critical path and reduces area costs | [#298](https://github.com/stnolting/neorv32/pull/298) |
| 08.04.2022 | [**:rocket:1.7.0**](https://github.com/stnolting/neorv32/releases/tag/v1.7.0) | **New release** | |
| 08.04.2022 | 1.6.9.11 | :bug: fixed bug in interrupt setup of **`crt0` start-up code** | [#297](https://github.com/stnolting/neorv32/pull/297) |
| 08.04.2022 | 1.6.9.10 | rework compressed instruction (`C` ISA extension) de-compressor: :lock: closed further illegal compressed instruction holes; code clean-ups; `mtval` CSR now shows the decompressed 32-bit instruction when executing an illegal compressed instruction; minor RTL code cleanups (removing legacy stuff); [PR #296](https://github.com/stnolting/neorv32/pull/296) |
| 07.04.2022 | 1.6.9.9 | AND-gate CSR read address: reduces **CPU switching activity** (= dynamic power consumption) and even reduces area costs; [PR #295](https://github.com/stnolting/neorv32/pull/295) |
| 06.04.2022 | 1.6.9.8 | :bug: fixed instruction decoding collision in CPU `B` extension; :lock: closed further illegal instruction encoding holes; optimized illegal instruction detection logic; [PR #294](https://github.com/stnolting/neorv32/pull/294) |
| 04.04.2022 | 1.6.9.7 | **major CPU logic optimization**: reduced area costs and shortened critical path (higher f_max!); :bug: fixed rare bug in RTE core (if C-extension is not implemented); :lock: closed further illegal instruction encoding holes; [PR #293](https://github.com/stnolting/neorv32/pull/293) |
| 01.04.2022 | 1.6.9.6 | rework **CPU front-end**: instruction issue engine; much cleaner code, slightly less HW required; [PR #292](https://github.com/stnolting/neorv32/pull/292) |
| 29.03.2022 | 1.6.9.5 | minor clock generator edits: reset **clock generator** explicitly if not being used by _any_ peripheral/IO device |
| 19.03.2022 | 1.6.9.4 | :test_tube: change usage of VHDL `*_reduce_f` functions for signals that might affect gate-level simulations; [PR #290](https://github.com/stnolting/neorv32/pull/290) |
| 19.03.2022 | 1.6.9.3 | :bug: fixed minor bug in **FPU** - incorrect/missing reset (even if reset to `'-'`) of some registers |
| 18.03.2022 | 1.6.9.2 | fixed minor bug in **TRNG** interface handshake (that marked the _same_ RND value as "valid" several times); minor optimization of **processor's reset generator** |
| 14.03.2022 | 1.6.9.1 | `mtval` CSR is set to zero for software breakpoints (`[c.]ebreak` instruction(s)) - this is permitted by the RISC-V machine ISA spec. v1.12; [PR #289](https://github.com/stnolting/neorv32/pull/289) |
| 09.03.2022 | [**:rocket:1.6.9**](https://github.com/stnolting/neorv32/releases/tag/v1.6.9) | **New release** |
| 09.03.2022 | 1.6.8.12 | CPU core: minor code clean-up |
| 08.03.2022 | 1.6.8.11 | clean-up of CPU's privilege mode logic |
| 07.03.2022 | 1.6.8.10 | added compressed floating-point instructions (`Zfinx` ISA extensions); minor optimization of compressed instruction decoding logic |
| 05.03.2022 | 1.6.8.9 | CPU core: minor optimizations, code clean-ups and edits; :sparkles: added RISC-V `mstatus.TW` bit to allow/disallow execution of `wfi` instruction in user mode; [PR #285](https://github.com/stnolting/neorv32/pull/285) |
| 02.03.2022 | 1.6.8.8 | :bug: fixed bug in layout of CPU's `pmpaddr` CSRs (**physical memory protection**); [PR #283](https://github.com/stnolting/neorv32/pull/283) |
| 01.03.2022 | 1.6.8.7 | CPU core: minor optimizations, code clean-ups and edits |
| 26.02.2022 | 1.6.8.6 | :warning: :lock: **reworked Physical Memory Protection (PMP)**: replacing `NAPOT` mode by `TOR` mode and fixing several minor PMP CSR-access bugs; maximum number of PMP regions is now limited to 16 entries; :warning: removed **BUSKEEPER's NULL address check** (introduced in version `1.6.5.4`) - use a single PMP entry instead; see [PR #281](https://github.com/stnolting/neorv32/pull/281) |
| 25.02.2022 | 1.6.8.5 | minor BUSMUX (bus multiplexer for CPU's instruction and data buses) and CPU control edits (pipeline front-end) |
| 24.02.2022 | 1.6.8.4 | :bug: **fixed bug in `mip` CSR** (introduced in version `1.6.4.6` with [#236](https://github.com/stnolting/neorv32/pull/236)): to clear/ack a pending interrupt software needs to **clear** the according `mip` bit; see [PR #280](https://github.com/stnolting/neorv32/pull/280) |
| 24.02.2022 | 1.6.8.3 | reworked CPU's data path (use a few _wide_ multiplexers instead of many small ones); [PR #279](https://github.com/stnolting/neorv32/pull/279) |
| 23.02.2022 | 1.6.8.2 | CPU logic optimizations (less area): simplified CPU co-processor interface; minor optimization of bus unit access arbiters; optimized `M` extension's (mul/div co-processor) divider unit |
| 18.02.2022 | 1.6.8.1 | minor CPU control logic optimizations: simplified execute engine; faster execution of SYSTEM instructions (one cycle less) |
| 17.02.2022 | [**:rocket:1.6.8**](https://github.com/stnolting/neorv32/releases/tag/v1.6.8) | **New release** |
| 17.02.2022 | 1.6.7.10 | hardwired `dcsr.stopcount` to `1`: all standard counters (`[m]cycle[h]` and `[m]instret[h]`, but **NOT** `[m]time[h]`!!) and all hardware performance monitor (HPM) counters are _stopped_ when the CPU is in debug mode; [PR #277](https://github.com/stnolting/neorv32/pull/277) |
| 16.02.2022 | 1.6.7.9 | :warning: **added custom `mxisa` CSR replacing SYSINFO's `NEORV32_SYSINFO.CPU` memory-mapped register**: bit-positions remain but names and the actual access mechanism (CSR vs. memory-mapped) have changed! see [PR #276](https://github.com/stnolting/neorv32/pull/276) |
| 11.02.2022 | 1.6.7.8 | :test_tube: added newlib's system calls (stubs) and linker script symbols for heap memory to support **dynamic memory allocation** (e.g. `malloc`) and even **standard IO functions** like `printf`; see [PR #275](https://github.com/stnolting/neorv32/pull/275) |
| 10.02.2022 | 1.6.7.7 | :test_tube: added **RISC-V hardware trigger module** to CPU - allows setting _hardware breakpoints_ (via gdb's `hb`/`hbreak` command) to debug code from ROM; see [PR #274](https://github.com/stnolting/neorv32/pull/274); :bug: minor bug fix in `ebreak` instruction's `dcsr.cause` value (was 0b010 but has to be 0b001) |
| 08.02.2022 | 1.6.7.6 | :warning: renamed default branch of repository to `main` |
| 07.02.2022 | 1.6.7.5 | removed default values for bi-directional top entity ports `twi_sda_io` and `twi_scl_io` |
| 05.02.2022 | 1.6.7.4 | added `err_o` signal to **IMEM** module; if the IMEM is implemented as true ROM any write attempt will raise a _store access fault_ exception (with a `[DEVICE_ERR]` error); see [PR #273](https://github.com/stnolting/neorv32/pull/273) |
| 03.02.2022 | 1.6.7.3 | :test_tube: using `LTO` (link-time-optimization) option for **bootloader**; improved bootloader user console; see [PR #268](https://github.com/stnolting/neorv32/pull/268) |
| 31.01.2022 | 1.6.7.2 | :bug: fixed minor bug in **bootloader's MTIME handling** (bootloader crashed if `Zicntr` ISA extension not enabled), fixed minor issues in MTIME and `time` CSRs handling; added MTIME example program; see [PR #267](https://github.com/stnolting/neorv32/pull/267) |
| 30.01.2022 | 1.6.7.1 | :sparkles: added **`Zxcfu` ISA extension for user-defined custom RISC-V instructions**; see [PR #264](https://github.com/stnolting/neorv32/pull/264) |
| 28.01.2022 | [**:rocket:1.6.7**](https://github.com/stnolting/neorv32/releases/tag/v1.6.7) | **New release** |
| 28.01.2022 | 1.6.6.10 | :bug: fixed bug in **bit-manipulation co-processor**: decoding collision between `cpop` and `rol` instructions; :bug: fixed bug in co-processor arbitration when an illegal instruction is detected; added four additional (yet unused) **CPU** co-processor slots; [PR #262](https://github.com/stnolting/neorv32/pull/262) |
| 27.01.2022 | 1.6.6.9 | reworked **CFS** "user" logic; added CFS demo program; see [PR #261](https://github.com/stnolting/neorv32/pull/261) |
| 27.01.2022 | 1.6.6.8 | :sparkles: added support for RISC-V bit-manipulation (`B`) **carry-less multiplication instructions `Zbc`** sub-extension; added test cases and intrinsics; the NEORV32 bit-manipulation ISA extension (`B`) now fully complies with the RISC-V specs. v0.93; see [PR #260](https://github.com/stnolting/neorv32/pull/260) |
| 26.01.2022 | 1.6.6.7 | :sparkles: added support for RISC-V bit-manipulation (`B`) **single-bit instructions `Zbs`** sub-extension; added test cases and intrinsics; see [PR #259](https://github.com/stnolting/neorv32/pull/259) |
| 26.01.2022 | 1.6.6.6 | minor logic optimizations in **CPU control unit** |
| 25.01.2022 | 1.6.6.5 | :lock: **on-chip debugger:** the memory-mapped registers of the debug module (DM) are only accessible/visible when the CPU is actually in debug mode; any access outside of debug mode will now raise a bus exception |
| 22.01.2022 | 1.6.6.4 | minor logic optimizations in **CPU control unit**, minor improvement of critical path |
| 21.01.2022 | 1.6.6.3 | reworked **CPU's instruction issue engine** (area optimization: ~100 LUTs less on an Intel Cyclone IV), [PR #256](https://github.com/stnolting/neorv32/pull/256); minor CPU control unit code clean-ups and logic optimizations |
| 18.01.2022 | 1.6.6.2 | :warning: moved `setups` folder to new [neorv32-setups](https://github.com/stnolting/neorv32-setups) repository, [PR #254](https://github.com/stnolting/neorv32/pull/254) |
| 18.01.2022 | 1.6.6.1 | minor **MTIME** VHDL code clean-up; minor logic optimization of **CPU's bus unit** |
| 17.01.2022 | [**:rocket:1.6.6**](https://github.com/stnolting/neorv32/releases/tag/v1.6.6) | **New release** |
| 14.01.2022 | 1.6.5.9 | **GPIO** module: write accesses to the GPIO module's "input" registers will now raise a bus exception; [PR #255](https://github.com/stnolting/neorv32/pull/255) |
| 11.01.2022 | 1.6.5.8 | minor rtl code clean-ups and edits in `rtl/core`; any write access to the SYSINFO module will now show up as a BUSKEEPER's "DEVICE_ERR" |
| 08.01.2022 | 1.6.5.7 | :bug: fixed bug in BUSKEEPER's error type logic (introduced in version `1.6.5.4`); removed "unexpected ERR/ACK" error codes; [PR #253](https://github.com/stnolting/neorv32/pull/253) |
| 07.01.2022 | 1.6.5.6 | :sparkles: **XIP & SPI: added high-speed SPI mode** (SPI clocking at half of the processor clock), see [PR #251](https://github.com/stnolting/neorv32/pull/251) |
| 06.01.2022 | 1.6.5.5 | :warning: optimized/reworked XIP (execute in place) module, see [PR #249](https://github.com/stnolting/neorv32/pull/249) |
| 04.01.2022 | 1.6.5.4 | **BUSKEEPER** can now optionally check for NULL address accesses (address `0x00000000`), see [PR #247](https://github.com/stnolting/neorv32/pull/247) |
| 02.01.2022 | 1.6.5.3 | :sparkles: **added Execute In Place (XIP) module** allowing code to be directly executed from an external SPI flash, see [PR #244](https://github.com/stnolting/neorv32/pull/244) |
| 02.01.2022 | 1.6.5.2 | :bug: fixed minor bug in CPU's instruction fetch unit (only issue new instruction fetch request when the previous one has been completed) |
| 16.12.2021 | [**:rocket:1.6.5**](https://github.com/stnolting/neorv32/releases/tag/v1.6.5) | **New release** |
| 15.12.2021 | 1.6.4.10 | minor logic optimization of CPU's pipeline front-end (instruction fetch and instruction issue) |
| 14.12.2021 | 1.6.4.9 | optimized CPU's multiplication/division co-processor: divisions are 1 cycle faster, fast-multiplications (when using DSPs) are 1 cycle faster, slightly less resource utilization, see [PR #240](https://github.com/stnolting/neorv32/pull/240) |
| 11.12.2021 | 1.6.4.8 | watchdog: added new _DBEN_ and _HALF_ flags to control register (enable WDT during debugging, check timeout counter level), see [PR #239](https://github.com/stnolting/neorv32/pull/239) |
| 10.12.2021 | 1.6.4.7 | optimized CPU's multiplication/division co-processor: all mul/div operations are 1 cycle faster + slightly less resource utilization, see [PR #238](https://github.com/stnolting/neorv32/pull/238) |
| 08.12.2021 | 1.6.4.6 | :warning: reworked **Fast Interrupt Requests (FIRQ)** system, see [PR #236](https://github.com/stnolting/neorv32/pull/236) |
| 03.12.2021 | 1.6.4.5 | added _SYSINFO_SOC_IS_SIM_ flag to SYSINFO to check if processor is being simulated (not guaranteed, depends on the toolchain's 'pragma' support), see [PR #231](https://github.com/stnolting/neorv32/pull/231) |
| 03.12.2021 | 1.6.4.4 | :bug: fixed bug in **Wishbone** bus interface: timeout configurations (via `MEM_EXT_TIMEOUT` generic) that are a power of two (e.g. 256) caused _immediate_ timeouts; timeout counter was one bit short; same problem for processor-internal bus monitor (BUSKEEPER); see [PR #230](https://github.com/stnolting/neorv32/pull/230) |
| 02.12.2021 | 1.6.4.3 | :warning: removed legacy software compatibility wrappers (`sw/lib/include/neorv32_legacy.h` and `neorv32_uart_*` functions) |
| 28.11.2021 | 1.6.4.2 | :bug: fixed bug in **UART[0/1]** overrun flag (was not set/cleared correctly); fixed bug in UART0 enable function `neorv32_uart0_enable()` |
| 28.11.2021 | 1.6.4.1 | (:warning:) bootloader now stores executable in _little-endian_ byte-order to SPI flash |
| 26.11.2021 | [**:rocket:1.6.4**](https://github.com/stnolting/neorv32/releases/tag/v1.6.4) | **New release** |
| 22.11.2021 | 1.6.3.11 | on-chip debugger: reworked JTAG signal input/output synchronization logic (see [PR #216](https://github.com/stnolting/neorv32/pull/216)) |
| 22.11.2021 | 1.6.3.10 | reworked **TRNG** (less hardware requirements, improved quality), see [PR #212](https://github.com/stnolting/neorv32/pull/212) and [stnolting/neoTRNG](https://github.com/stnolting/neoTRNG) |
| 21.11.2021 | 1.6.3.9 | minor rtl edits: configuring an IMEM or DMEM size (`MEM_INT_IMEM_SIZE` / `MEM_INT_DMEM_SIZE` generic) of 0 will now exclude the according memory from synthesis (and also clears the according `NEORV32_SYSINFO.SOC` flags) |
| 18.11.2021 | 1.6.3.8 | TWI: removed TWI_CTRL_CKSTEN flag (enable clock stretching) from control registers, clock-stretching is now _always_ enabled |
| 14.11.2021 | 1.6.3.7 | major control unit and ALU logic optimizations, reduced hardware footprint; :lock: closed further illegal instruction encoding holes (system environment instructions, ALU and ALU-immediate instructions, FENCE instructions); [PR #204](https://github.com/stnolting/neorv32/pull/204) |
| 10.11.2021 | 1.6.3.6 | optimized BUSKEEPER: removed redundant logic - bus keeper now also shows an external interface access timeout (if implemented) as "timeout error"; removed _BUSKEEPER_ERR_SRC_ status flag; :warning: added `err_o` (fault access operation) to the custom functions subsystem (CFS) |
| 09.11.2021 | 1.6.3.5 | :warning: reworked IRQ trigger logic of SPI, TWI, UART0, UART1, NEOLED and SLINK; FIRQs now only trigger **once** when the programmed interrupt condition is met instead of triggering **all the time** (see [PR #202](https://github.com/stnolting/neorv32/pull/202)) |
| 06.11.2021 | 1.6.3.4 | :bug: fixed bug in **WISHBONE** interface: _pipelined_ Wishbone mode did not clear STB after first transfer cycle |
| 05.11.2021 | 1.6.3.3 | :bug: fixed bug in general purpose timer **GPTMR** - clock prescaler had no effect, the timer was always counting at full processor clock speed; minor watchdog (WDT) code edits |
| 04.11.2021 | 1.6.3.2 | added optional _alternative_ IMEM and DMEM architecture-only design files (in `rtl/core/mem`); these are not device-specific ("cyclone 2") as they do not use any FPGA-specific primitives or macros - just a different HDL style for describing memories is used (see [PR #198](https://github.com/stnolting/neorv32/pull/198) and [Issue #197](https://github.com/stnolting/neorv32/issues/197)) |
| 03.11.2021 | 1.6.3.1 | :sparkles: added new peripheral module - **General Purpose 32-bit Timer `GPTMR`** ([see PR #195](https://github.com/stnolting/neorv32/pull/195)) |
| 02.11.2021 | [**:rocket:1.6.3**](https://github.com/stnolting/neorv32/releases/tag/v1.6.3) | **New release** |
| 01.11.2021 | 1.6.2.13 | added new top generics to explicitly control implementation of `Zicntr` (CPU base counters) and `Zihpm` (hardware performance monitors), see [PR #192](https://github.com/stnolting/neorv32/pull/192) |
| 30.10.2021 | 1.6.2.12 | :sparkles: :lock: added memory-mapped register to BUSKEEPER module - software can now retrieve the actual cause of an instruction / data-load / data-store bus access fault exception (access timeout or device error); see [PR #191](https://github.com/stnolting/neorv32/pull/191) |
| 28.10.2021 | 1.6.2.11 | :sparkles: added `Zba` bit-manipulation sub-extension; :warning: removed configuration option for `B` sub-extensions: removed `CPU_EXTENSION_RISCV_Zbb` generic and according SYSINFO flag, added new `CPU_EXTENSION_RISCV_B` generic (to implement bit-manipulation `B` ISA extension with _all_ currently supported subsets), see [PR #190](https://github.com/stnolting/neorv32/pull/190) |
| 27.10.2021 | 1.6.2.10 | :bug: CPU control unit: fixed _imprecise_ illegal instruction exceptions - `MEPC` and `MTVAL` did not reflect the correct exception-causing data for illegal ALU-class (non-multi-cycle like `SUB`) operations; optimized critical path of exception logic (illegal compressed instruction detection) |
| 27.10.2021 | 1.6.2.9 | CPU control unit: minor logic optimization - `fence.i` instruction needs 1 cycle less to execute, reduced HW footprint of control engine, shortened CPU's critical path (PC update logic) |
| 26.10.2021 | 1.6.2.8 | :bug: bootloader: fixed bug in stack pointer initialization (introduced in version `1.6.2.7`); minor SPI unit VHDL code clean-up |
| 24.10.2021 | 1.6.2.7 | minor control unit fixes (add logic to check that both half-words of an unaligned 32-bit instruction did not cause any bus exceptions); minor ALU logic optimization; optimized `crt0.S`: bootloader stack pointer initialization (is now done based on the actual physical memory configuration) - bootloader is now even more independent of the actual platform configuration |
| 24.10.2021 | 1.6.2.6 | :bug: **fixed HW bug** introduced in version `1.6.2.4` (write access arbitration in BUSMUX) |
| 21.10.2021 | 1.6.2.5 | minor code edits; improved stability of UART receiver's start-bit detection (more "spike"-resistant) |
| 21.10.2021 | 1.6.2.4 | minor VHDL code fixes, clean-ups, optimizations and comment typo fixes (:lipstick:) |
| 20.10.2021 | 1.6.2.3 | SPI: minor VHDL code optimization and clean-up; NOTE: all serial interfaces (SPI, TWI, UARTs) allow terminating a running transmission by clearing the enable flag in the module's control register |
| 18.10.2021 | 1.6.2.2 | :bug: `*_reduce_f` VHDL functions did not work for single-bit operands (see [PR #186](https://github.com/stnolting/neorv32/pull/186)) |
| 18.10.2021 | 1.6.2.1 | :sparkles: SPI: added option to configure _clock polarity_ - the SPI module now supports all standard clock modes (0,1,2,3) (see [PR #185](https://github.com/stnolting/neorv32/pull/185)); logic optimization of SPI module |
| 17.10.2021 | [**:rocket:1.6.2**](https://github.com/stnolting/neorv32/releases/tag/v1.6.2) | **New release** |
| 17.10.2021 | 1.6.1.13 | :warning: :warning: main software makefile: modified behavior of `MARCH` and `MABI` variables - the `-march` and `-mabi` flags are no longer required/allowed (example: overriding makefile's default `MARCH` is now done using `make MARCH=rv32imac ...`) ([see PR #184](https://github.com/stnolting/neorv32/pull/184)) |
| 15.10.2021 | 1.6.1.12 | :warning: Custom Functions Subsystem (CFS): removed `sleep` input (indicating CPU is in sleep mode); minor CPU control logic optimization |
| 15.10.2021 | 1.6.1.11 | :sparkles: UARTs: added optional configurable RX and TX FIFOs, added fine-grained RX/TX IRQ configuration options (see [PR #183](https://github.com/stnolting/neorv32/pull/183)) |
| 14.10.2021 | 1.6.1.10 | :sparkles: SLINK: added fine-grained, per-link interrupt configuration (see [PR #182](https://github.com/stnolting/neorv32/pull/182)) |
| 13.10.2021 | 1.6.1.9 | :sparkles: NEOLED: added new control register bit _NEOLED_CTRL_IRQ_CONF_ to configure IRQ condition: `0` = IRQ if FIFO is less than half-full, `1` = IRQ if FIFO is empty; :information_source: IRQ behavior is fully backwards compatible if _NEOLED_CTRL_IRQ_CONF_ is ignored (kept zero) |
| 12.10.2021 | 1.6.1.8 | added dedicated `half_o` signal to FIFO component (FIFO _at least_ half-full), simplifies half-full test logic in FIFO-utilizing modules (area footprint and critical path); minor logic/hardware optimization of NEOLED module |
| 09.10.2021 | 1.6.1.7 | :warning: reworked _fast interrupt requests_ (FIRQ) CPU interrupt system: fast interrupt requests are now also high-level-triggered (like the RISC-V standard interrupts) and stay asserted until explicitly acknowledged by software ([PR #176](https://github.com/stnolting/neorv32/pull/176)) |
| 06.10.2021 | 1.6.1.6 | :bug: fixed bugs in signal assignments and processor configuration of `setups/radiant/UPduino_v3` setup; minor CPU HPM counter fix (architecture condition for "multi-cycle ALU wait cycle" HPM event) |
| 05.10.2021 | 1.6.1.5 | :sparkles: :lock: the CPU now ensures that _all_ illegal instructions _do not commit_ any potential architecture state changes (like writing registers or triggering memory accesses); CPU logic optimization (smaller footprint) |
| 04.10.2021 | 1.6.1.4 | moved CPU's comparator logic from register file unit to ALU unit (to allow easier replacement of register file design unit by technology-optimized one) |
| 03.10.2021 | 1.6.1.3 | :bug: fixed UART signal connection in `rtl/system_integration` wrappers |
| 01.10.2021 | 1.6.1.2 | :warning: removed `mstatus.TW` (timeout wait) bit, `wfi` instruction is now always allowed to be executed in less-privileged modes; minor CPU control unit logic optimizations |
| 01.10.2021 | 1.6.1.1 | on-chip-debugger: `wfi` instruction acts as a simple `nop` when _in_ debug mode or during single-stepping |
| 28.09.2021 | [**:rocket:1.6.1**](https://github.com/stnolting/neorv32/releases/tag/v1.6.1) | **New release** |
| 28.09.2021 | 1.6.0.13 | :bug: fixed elementary bug in MTIME comparator logic (interrupt condition `mtime >= mtimecmp` was not always evaluated correctly) |
| 28.09.2021 | 1.6.0.12 | fixed CPU's IRQ prioritization: (re-)enter debug mode interrupts have to be evaluated _before_ all other interrupts |
| 27.09.2021 | 1.6.0.11 | :warning: `Zifencei` extension is _required_ for the on-chip debugger; executing `fence.i` without having the `Zifencei` extension enabled will now cause an illegal instruction exception |
| 22.09.2021 | 1.6.0.10 | reworked CPU/software handshake of external interrupt controller `XIRQ` to avoid "external IRQ -> CPU IRQ" race conditions |
| 22.09.2021 | 1.6.0.9 | if `CPU_CNT_WIDTH` generic (actual width of `[m]cycle` and `[m]instret` counters) is less than 64 the remaining bits are now just hardwired to zero ignoring any write access instead of causing an exception; minor CPU hardware optimizations |
| 22.09.2021 | 1.6.0.8 | :bug: fixed bug introduced in previous version: misaligned instruction address - PC and all instruction address-related registers need to have bit 0 hardwired to zero, misaligned instructions can only appear if NOT using `C` ISA extension |
| 21.09.2021 | 1.6.0.7 | :warning: **reworked CPU trap/exception system** (in order to comply with RISC-V specs.): removed non-maskable interrupt (`NMI`, top signal `nm_irq_i`); reworked CPU trap prioritization (sync before async); RISC-V interrupts (`MTI`, `MSI`, `MEI`) are now high-level-triggered and require to stay asserted until they are explicitly acknowledged; fixed minor bug in misaligned instruction check logic (PC(0) = '1' will always cause a misalignment exception); updated trap/interrupt-related documentation |
| 20.09.2021 | 1.6.0.6 | the NEORV32's `misa`, `mip` and `mtval` CSRs are _read-only_; however, write accesses to these CSRs _do not raise an illegal instruction exception_ (anymore) to be compatible to the RISC-V specs. |
| 19.09.2021 | 1.6.0.5 | added `menvcfg[h]` CSRs (only available if `U` ISA extension is enabled; not used yet - hardwired to zero, but required by RISC-V spec.) |
| 18.09.2021 | 1.6.0.4 | :warning: :warning: **major change** modified low-level hardware access (memory-mapped registers) [PR #158](https://github.com/stnolting/neorv32/pull/158): now using `struct`-based access concept (IO module = `struct`, interface registers = members of struct) instead of `#define` single-pointers (inspired by https://blog.feabhas.com/2019/01/peripheral-register-access-using-c-structs-part-1/), format: `NEORV32_<module_name>.<register_name>`; renamed all control registers and bits from `*CT*` to `*CTRL*`; added `sw/lib/include/neorv32_legacy.h` compatibility layer (maps deprecated "defines" to according struct registers, provides old control register/bit names, _do not use for new designs!_) |
| 16.09.2021 | 1.6.0.3 | :bug: fixed another missing IRQ signal connection (NMI) in `system_integration` wrappers |
| 15.09.2021 | 1.6.0.2 | :warning: **split** processor-internal memory VHDL sources (IMEM and DMEM) into separated files ([#151](https://github.com/stnolting/neorv32/pull/151)): entity-only (`rtl/core/neorv32_*mem.entity.vhd`) and _default_ architecture-only (`rtl/core/mem/neorv32_*mem.default.vhd`); allows easy replacement by optimized platform-specific architectures |
| 13.09.2021 | 1.6.0.1 | :bug: fixed missing IRQ signal assignments (MSW and XIRQ) in AXI4-lite top wrapper |
| 11.09.2021 | [**:rocket:1.6.0**](https://github.com/stnolting/neorv32/releases/tag/v1.6.0) | **New release** |
| 11.09.2021 | 1.5.9.9 | removed `mstatus.SD` flag (is always 0 for `Zfinx` extension as the current state is already defined entirely by the `x` register file); tied `mstatus.fs` as it must not affect trapping of `Zfinx` instructions (according to RISC-V specs.) |
| 09.09.2021 | 1.5.9.8 | added flags to `SYSINFO` module to determine configuration of `FAST_MUL_EN` and `FAST_SHIFT_EN` generics by software |
| 09.09.2021 | 1.5.9.7 | `FAST_SHIFT_EN` option will now also implement full-parallel computation logic (like barrel shifters) for _all_ `Zbb` shift-related instructions (population count, count leading/trailing zeros, circular shifts) |
| 08.09.2021 | 1.5.9.6 | :sparkles: added support for RISC-V `Zbb` CPU extension (**basic bit-manipulation operations**), enabled via new top generic `CPU_EXTENSION_RISCV_Zbb`; added example software project providing a `Zbb` "intrinsic" library |
| 08.09.2021 | 1.5.9.5 | :bug: fixed missing `flash_sdi_i` in Radiant-related example setups and processor wrappers |
| 19.08.2021 | 1.5.9.4 | :warning: removed custom `mzext` CPU CSR, moved all information flags to new `SYSINFO_CPU` register in the system information memory module (`SYSINFO`) |
| 19.08.2021 | 1.5.9.3 | :warning: removed top's `USER_CODE` generic |
| 18.08.2021 | 1.5.9.2 | fixed `Zifencei` test of `riscv-arch-test` port |
| 16.08.2021 | 1.5.9.2 | minor CPU control logic optimizations |
| 15.08.2021 | 1.5.9.1 | :bug: fixed bug in `mret` instruction that caused an exception if user mode was not implemented (bug caused by modifications in v1.5.8.8) |
| 14.08.2021 | 1.5.9.0 | Added new designated test setups: [`rtl/test_setups`](https://github.com/stnolting/neorv32/tree/main/rtl/test_setups), :books: [_UG: General Hardware Setup_](https://stnolting.github.io/neorv32/ug/#_general_hardware_setup) |
| 13.08.2021 | [**:rocket:1.5.9**](https://github.com/stnolting/neorv32/releases/tag/v1.5.9) | **New release** |
| 08.08.2021 | 1.5.8.9 | reworked CPU register file logic: any write access to `x0` will be masked to actually write zero - no special treatment by the CPU control unit required anymore; slightly less hardware resources required; first instruction after hardware reset should write `x0` (_any_ value; implemented in start-up code `crt0.S`) |
| 07.08.2021 | 1.5.8.8 | :bug: fixed bug in execution (trapping) of `xRET` instructions: `dret` (return from debug-mode handler) has to raise an illegal instruction exception if executed outside of debug-mode, `mret` (return from machine-mode handler) has to raise an illegal instruction exception if executed in lower-privileged modes (lower than machine-mode) |
| 05.08.2021 | 1.5.8.7 | :sparkles: added `mstatus.FS` and `mstatus.SD` CSR bits: control the state of the FPU (`Zfinx`) extension; supported states for `mstatus.FS`: `00` = _off_, `11` = _dirty_; writing other states will always set _dirty_ state; note that all FPU instructions including FPU CSR access instructions will raise an illegal instruction exception if `mstatus.FS` = _off_ |
| 03.08.2021 | 1.5.8.6 | :bug: fixed bug in linker script [#134](https://github.com/stnolting/neorv32/issues/134): `.rodata.*` "sub"-sections were missing, caused wrong linking of implicit constants (like strings); added `mconfigptr` CSR (RISC-V priv. ISA spec. v1.12-draft ;read-only): holds a pointer to a platform/system configuration structure - not actually used yet |
| 30.07.2021 | 1.5.8.5 | fixed minor bug in top entity / AXI4 wrapper (Vivado "issue": generic defaults need a _fixed-size_ initialization value) [#113](https://github.com/stnolting/neorv32/issues/133) |
| 26.07.2021 | 1.5.8.4 | :bug: **fixed major bug in CPU interrupt system**: interrupts during memory accesses (load/store instruction) terminated those memory accesses violating the crucial "instruction atomicity" concept: traps (interrupts and exceptions) must only intervene _between_ instructions |
| 25.07.2021 | 1.5.8.3 | :sparkles: added `mstatus.TW` CSR flag (when set, executing the `wfi` instruction outside of machine-mode will raise an illegal instruction exception); flag is hardwired to zero if user mode is not implemented |
| 25.07.2021 | 1.5.8.2 | :bug: fixed bug in `E` ISA extension: extension could not be enabled due to missing generic propagation; clean-up of generic defaults: only the processor top entity provides defaults for the configuration generics |
| 24.07.2021 | 1.5.8.1 | machine-level interrupts (top entity signals; "external" `mext_irq_i`, "software" `msw_irq_i`, "mtime" `mtime_irq_i` and "non-maskable" `nm_irq_i`) now trigger on rising edges; exposed advanced external bus interface configuration options as new top entity generics (moved from package constants): `MEM_EXT_PIPE_MODE`, `MEM_EXT_BIG_ENDIAN`, `MEM_EXT_ASYNC_RX` |
| 22.07.2021 | [**:rocket:1.5.8**](https://github.com/stnolting/neorv32/releases/tag/v1.5.8) | **New release** |
| 22.07.2021 | 1.5.7.16 | (re-)added `mstatush` CSR (all bits are hardwired to zero: writes are ignored, reads will always return zero) - CSR address is assigned to comply with RISC-V priv. arch. spec. 1.12 |
| 21.07.2021 | 1.5.7.15 | :bug: fixed minor bug in SLINK module (signals were missing in sensitivity lists); :warning: simplified NEOLED interrupt system (now triggered if TX FIFO fill level falls below half-full), added option to send LED strobe command ("RESET"), added FIFO status signals to status register, simplified FIFO access logic, added new top generic `IO_NEOLED_TX_FIFO` to configure NEOLED FIFO depth |
| 18.07.2021 | 1.5.7.14 | exposed new generic `CPU_IPB_ENTRIES` to configure size of CPU instruction prefetch buffer |
| 18.07.2021 | 1.5.7.13 | clean-up of processor top entity: using more sophisticated default values for all input signals and generics (all generics are "off" by default; input signals use `L` for control lines and `U` for data lines by default) |
| 14.07.2021 | 1.5.7.12 | reworked SLINK interrupt concept (now using FIFO fill level "half-full" as interrupt condition, see [#122](https://github.com/stnolting/neorv32/issues/122)); added fill level output to processor FIFO component |
| 09.07.2021 | 1.5.7.11 | :bug: fixed minor bug in FIFO component (mapping might fail if `FIFO_DEPTH` = 1); fixed broken `sw/example/demo_freeRTOS` makefile (all freeRTOS includes were missing) |
| 03.07.2021 | 1.5.7.10 | :sparkles: added new component: **External Interrupt Controller (XIRQ)**: up to 32 external interrupt channels `xirq_i` (via `XIRQ_NUM_CH` generic), configurable trigger (via `XIRQ_TRIGGER_TYPE` and `XIRQ_TRIGGER_POLARITY` generics), prioritized or non-prioritized servicing |
| 02.07.2021 | 1.5.7.9 | relocated base addresses of watchdog timer (WDT) and true-random number generator (TRNG); removed CPU's `firq_ack_o` signal (was not used at all) |
| 30.06.2021 | 1.5.7.8 | :warning: increased GPIO port size from 32-bit to 64-bit; relocated GPIO base address; removed GPIO.input pin-change interrupt |
| 29.06.2021 | 1.5.7.7 | :sparkles: added new processor module **stream link interface (SLINK)**: up to 8 individual RX and TX stream links, compatible to AXI4-Stream base protocol; added software driver files; added documentation |
| 27.06.2021 | 1.5.7.6 | :bug: fixed bug in CFS (custom functions subsystem) address map layout |
| 27.06.2021 | 1.5.7.5 | :warning: removed numerically-controlled oscillator (NCO, `neorv32_nco.vhd`) module as it appears to be an over-engineered clock-generator without many use cases (if you really need this module, you can wrap it within the custom functions subsystem CFS) |
| 27.06.2021 | 1.5.7.4 | :warning: removed top's fast IRQ (FIRQ) inputs `soc_firq_i`: the FIRQs are reserved for processor-internal usage only, use the `mext_irq_i` RISC-V external interrupt signal for all external interrupt applications (via dedicated interrupt controller), a follow-up version of the project will introduce a customizable external interrupt controller; sourced-out FIFOs into new HDL component `neorv32_fifo.vhd` |
| 26.06.2021 | 1.5.7.3 | edit of v1.5.7.2: RISC-V spec claims to leave destination registers of trapping load operation unchanged (do _not_ set to zero); minor CPU control logic optimizations; :sparkles: reworked bootloader to provide several new configuration and customization options |
| 25.06.2021 | 1.5.7.2 | optimized instruction execution FSM: less hardware utilization, :lock: now _ensures_ to write ZERO to destination register if there is an exception during a load operation; made default bootloader even more HW configuration independent (GPIO, SPI and MTIME are optional; UART is optional but highly recommended); |
| 24.06.2021 | 1.5.7.1 | :sparkles: added RISC-V `Zmmul` ISA extension (via `CPU_EXTENSION_RISCV_Zmmul` generic; default = _false_): implements only the integer multiplication instructions sub-set of the `M` extension; for size-constrained setups, requires ~50% less hardware resources than the `M` extension |
| 23.06.2021 | [**:rocket:1.5.7**](https://github.com/stnolting/neorv32/releases/tag/v1.5.7) | **New release** _one year NEORV32!_ :tada: |
| 21.06.2021 | 1.5.6.14 | :bug: fixed bug in debugger "park loop": `fence.i` instruction was missing before executing the DM's program buffer - this caused execution of outdated instructions from the program buffer if the **instruction cache** is implemented |
| 21.06.2021 | 1.5.6.13 | removed `TINY_SHIFT_EN` generic; clean-up of CPU co-processor system: removed "dummy co-processor" for CSR read access, moved CPU shifter core into new co-processor; simplified default (bit-serial) shifter logic (single bit-shifts only) and multi-cycle instruction decode logic |
| 18.06.2021 | 1.5.6.12 | clean-up of CPU co-processor system (removed unused co-processor slots 4,5,6,7) |
| 15.06.2021 | 1.5.6.11 | made bootloader more configuration-independent: bootloader now only uses the first 512 bytes of internal/external DMEM for runtime data - hence, the DMEM size is no longer relevant as long as it is greater than or equal to 512 bytes |
| 14.06.2021 | 1.5.6.10 | :sparkles: physical size of bootloader ROM (BOOTROM) is automatically determined during synthesis based on the size of the initialization image, max physical size is 32kB; simplified BOOTROM access check logic; added size check when using IMEM as ROM (check if application image fits); simplified linker script: _logical_ instruction address space is 2GB now, no need to adapt this to hardware configuration, hardware checks if application fits into _physical_ memory size (which is configured via generics) |
| 13.06.2021 | 1.5.6.9 | :warning: reworked boot configuration: removed `MEM_INT_IMEM_ROM` and `BOOTLOADER_EN` generics, replaced by single `INT_BOOTLOADER_EN` generic (type boolean): _true_ = implement processor-internal (default) bootloader, implement processor-internal IMEM (if implemented) as RAM; _false_ = boot from processor-internal IMEM implemented (if enabled) as pre-initialized ROM; reworked IMEM, DMEM and BOOTROM memory architecture; reworked image generator and generated application image files (now using unconstrained array as init images + unified array/memory types) |
| 12.06.2021 | 1.5.6.8 | :bug: fixed bug in instruction cache (cache controller might have missed resync/"clear-and-reload" requests from `fence.i` instructions); minor project/repo clean-ups |
| 08.06.2021 | 1.5.6.7 | clean-up of Wishbone interface module (dead code removal); added new package constant `wb_rx_buffer_c` to configure SYNC (default) or ASYNC Wishbone RX path (allows trade-off between performance/latency and timing closure) |
| 06.06.2021 | 1.5.6.6 | :bug: fixed bug in PWM base address configuration; :warning: removed user-level access to HPM counters via `hpmcounter3[h]`:`hpmcounter31[h]` CSRs, hardwired according `mcounteren` bits to zero: HPM can only be used in machine mode; reworked 64-bit counters (`cycle`, `instret`, `hpmcounter` + `mtime`) overflow logic: now using dedicated CARRY chain instead of overflow detector (can improve timing); |
| 05.06.2021 | 1.5.6.5 | removed debug mode's `stepie` flag (used to allow interrupts during single-stepping) as the debugger can emulate interrupts |
| 04.06.2021 | 1.5.6.4 | :warning: removed `IO_PWM_EN` generic, replaced by `IO_PWM_NUM_CH` generic - PWM controller now supports implementation of up to 60 channels via `IO_PWM_NUM_CH` (`IO_PWM_NUM_CH` = 0 will omit the PWM controller); :bug: fixed minor bug in `minstreth` counter logic |
| 04.06.2021 | 1.5.6.3 | :warning: increased processor-internal IO size from 256 bytes to 512 bytes; relocated base address of CFS |
| 03.06.2021 | 1.5.6.2 | :warning: The `B` ISA extension (bit manipulation) has been (temporarily) removed from the project. See [B ISA Extension](https://github.com/stnolting/neorv32/projects/7) project board. |
| 03.06.2021 | 1.5.6.1 | CPU/HPM counter size configuration (`CPU_CNT_WIDTH` and `HPM_CNT_WIDTH` generics) can now be 0-bit (no counters implemented at all) to 64-bit (full-scale / RISC-V standard) wide |
| 01.06.2021 | **:rocket:1.5.6.0** | **New release** |
| 01.06.2021 | 1.5.5.13 | :warning: fixed project's endianness inconsistency (issue [#50](https://github.com/stnolting/neorv32/issues/50)) - CPU and processor are **little-endian**; changed image generator (`sw/image_gen`) and bootloader to generate/use little-endian executables; external memory interface is little-endian by default; removed `mstatus.ube` bit (reads as zero now); removed `mstatush` CSR |
| 31.05.2021 | 1.5.5.12 | `mret` instruction now clears `mstatus.mpp` (according to _new_ RISC-V privileged specs.) |
| 31.05.2021 | 1.5.5.11 | :warning: `mtval` CSR is now read-only; a write access will raise an illegal instruction exception |
| 30.05.2021 | 1.5.5.10 | :bug: fixed bug in processor's reset system (system reset stuck at `0` if on-chip debugger not implemented); reworked processor's reset generator system; VHDL code clean-up; reworked SoC's bus infrastructure (now using array of records for module bus response) |
| 28.05.2021 | 1.5.5.9 | integrated DBMEM (debug memory) component into DM (debug module); removing now-obsolete `neorv32_debug_dbmem.vhd` component |
| 22.05.2021 | 1.5.5.8 | :sparkles: **on-chip debugger (OCD)**: added debug module (`DM`) component; **OCD is operational now** (but still experimental) |
| 22.05.2021 | 1.5.5.7 | :bug: fixed bug in internal memory monitoring: accessing an unused address which is not re-directed to the external bus interface (because WISHBONE module is disabled) caused the CPU to stall since that bus access was not correctly monitored and aborted by the BUS_KEEPER |
| 21.05.2021 | 1.5.5.6 | **on-chip debugger**: added debug transport module (`DTM`) component |
| 20.05.2021 | 1.5.5.5 | added system time output `mtime_o` (64-bit) driven by processor-internal _MTIME_ unit (idea [#29](https://github.com/stnolting/neorv32/discussions/29)) |
| 20.05.2021 | 1.5.5.4 | **on-chip debugger**: added debug memory (`DBMEM`) component |
| 20.05.2021 | 1.5.5.3 | added flag (SYSINFO.FEATURES) to allow software to discover if on-chip debugger is implemented (`SYSINFO_FEATURES_OCD`); added documentation [https://stnolting.github.io/neorv32/#_on_chip_debugger_ocd](https://stnolting.github.io/neorv32/#_on_chip_debugger_ocd) |
| 19.05.2021 | 1.5.5.2 | :sparkles: added **RISC-V CPU Debug Mode**, compatible to [RISC-V debug spec](https://github.com/riscv/riscv-debug-spec); new CSRs: `dcsr`, `dpc`, `dscratch`; new instructions: `dret`; :warning: debug mode is still **work-in-progress** and not operational yet! updated documentation CI [#26](https://github.com/stnolting/neorv32/pull/26), contributed by [umarcor](https://github.com/umarcor) :+1:; `fence.i` will no longer trap if executed but not implemented (`CPU_EXTENSION_RISCV_Zifencei` = false) |
| 13.05.2021 | 1.5.5.1 | added [`UPduino_v3`](https://github.com/stnolting/neorv32/tree/main/boards/UPduino_v3) example setup; renamed signal in watchdog module (`rtl/core/neorv32_wdt.vhd`) - collision with reserved keyword in vhdl-2008 (fixing issue [#24](https://github.com/stnolting/neorv32/issues/24)) |
| 10.05.2021 | **:rocket:1.5.5.0** | **New release** |
| 10.05.2021 | 1.5.4.12 | :warning: `mip` CSR is now read-only (pending IRQs can be cleared by disabling (and re-enabling) the according `mie` bit), writing to `mip` will raise an illegal instruction exception; :sparkles: added non-maskable interrupt (NMI), top entity port `nm_irq_i`; added new NMI to NEORV32 runtime environment |
| 09.05.2021 | 1.5.4.11 | added new flags to `mzext` CSR: *CSR_MZEXT_PMP* (set if at least 1 PMP region is implemented at all), *CSR_MZEXT_HPM* (set if at least 1 HPM counter is implemented) |
| 03.05.2021 | 1.5.4.10 | minor code clean-ups; moved FIRQ synchronization registers to top, removed sync FFs for processor-internal sources; |
| 30.04.2021 | 1.5.4.9 | moved definitions of IO area from `crt0.S` to linker script; reworked CPU's CSR access system - highly reducing area overhead (removing decoding logic of not implemented CSRs by heavily using VHDL's `NULL` statement in `case` constructs) |
| 29.04.2021 | 1.5.4.8 | minor edits in CPU instruction fetch engine; reduced **processor-internal bus timeout** (`max_proc_int_response_time_c`) to 15 cycles; added flag to SYSINFO module (`SYSINFO_FEATURES_HW_RESET`) to check if a dedicated hardware reset of all core registers is implemented (via package's `dedicated_reset_c` constant) |
| 28.04.2021 | 1.5.4.7 | :bug: fixed bug in instruction cache (iCACHE) when using two sets - `ICACHE_ASSOCIATIVITY` = 2: cache was corrupting the non-active set |
| 26.04.2021 | 1.5.4.6 | optimized CPU's instruction fetch unit: less overhead for branches, reduced unit's hardware complexity |
| 25.04.2021 | 1.5.4.5 | :sparkles: :warning: removed `cancel` signals from processor-internal bus system; removed CPU's internal bus access timeout counter; added new top generic: `MEM_EXT_TIMEOUT` - type `natural`, default = 255; used to configure optional auto-timeout of Wishbone interface (if an **external** device is not responding within `MEM_EXT_TIMEOUT` clock cycles); set to zero to disable auto-timeout (required to comply with AXI4-Lite specs. when using the top's AXI wrapper) |
| 25.04.2021 | 1.5.4.3 | :sparkles: converted NEORV32.pdf data sheet to [`asciidoc` using asciidoctor](https://asciidoctor.org/); added data sheet sources to [`docs/src_adoc`](https://github.com/stnolting/neorv32/blob/main/docs/src_adoc) |
| 21.04.2021 | 1.5.4.3 | :warning: :bug: reworked *atomic memory access* system due to conceptual design errors: new system will make atomic LR/SC combinations fail when there is a trap (like a context switch) between the two instructions; new system prohibits SC from writing to memory if exclusive access fails; removed top's `wb_tag_i` signal, pruned one bit of top's `wb_tag_o` signal (atomic access), added top's `wb_lock_o` signal; updated sections in NEORV32.pdf regarding atomic memory accesses |
| 19.04.2021 | 1.5.4.1 | added register stage to `MTIME.time` write access to improve timing closure |
| 17.04.2021 | **:rocket:1.5.4.0** | **New release** |
| 16.04.2021 | 1.5.3.13 | :warning: added new top configuration generic `TINY_SHIFT_EN` (type = `boolean`, default = `false`) to configure a tiny single-bit (iterative) shifter for CPU ALU shift operations (for highly area-constrained setups) |
| 16.04.2021 | 1.5.3.12 | :sparkles: reworked reset system of the complete CPU: by default most registers (= "uncritical registers") **do not** provide an initialization via hardware reset; a **defined reset value** can be enabled by setting a constant from the main VHDL package (`rtl/core/neorv32_package.vhd`): `constant dedicated_reset_c : boolean := false;` (set `true` to enable CPU-wide dedicated register reset); see new section "2.11. CPU Hardware Reset" of NEORV32.pdf for more information |
| 14.04.2021 | 1.5.3.11 | minor rtl edits to allow synthesis using [`ghdl-yosys-plugin`](https://github.com/ghdl/ghdl-yosys-plugin) (:construction: work in progress :construction:) |
| 13.04.2021 | 1.5.3.10 | :bug: fixed bug when configuring `HPM_CNT_WIDTH` less than 32; :warning: added new generic `CPU_CNT_WIDTH` to configure total size of CPU's `cycle` and `instret` CSRs (default = 64-bit); added `Zxnocnt` (no counters) and `Zxscnt` (small counters) flags to `mzext` CSR to check if `CPU_CNT_WIDTH` is zero or less than 64, respectively; :bug: fixed bug in `crt0.S` start-up code: stack pointer has to be initialized before an exception can occur; updated `cpu_test` example program |
| 11.04.2021 | 1.5.3.9 | :warning: reworked CPU reset system (in addition to modifications in v1.5.3.7): default reset: most registers are "initialized" with '-' (don't care) since no real reset is required; however, a "real" reset can be configured using the package's `def_rst_val_c` constant that defines the reset value for all "uncritical registers" (see new NEORV32.pdf section 2.11 "CPU Hardware Reset") |
| 09.04.2021 | 1.5.3.8 | optimized CPU control: register write back during multi-cycle ALU operation only when result is really available (reducing switching activity; avoids possible source operand corruption); optimized `M` extension's co-processor: multiplications and divisions are 2 cycles faster |
| 08.04.2021 | 1.5.3.7 | :bug: fixed bug in HPM event configuration via `mhpmevent*` CSRs - there was a CSR address decoding overlap between the HPM event CSRs and the machine trap setup CSRs (introduced in version 1.5.3.6); :warning: reworked CPU core CSRs: most CSRs are not reset by hardware and need explicit initialization (done by crt0.S start-up code) |
| 02.04.2021 | 1.5.3.6 | :bug: fixed bug in external memory interface (`neorv32_wishbone.vhd`) that caused bus exceptions when using external memories with very high access latencies (race condition in bus timeouts); VHDL code clean-up |
| 30.03.2021 | 1.5.3.5 | added new top generic `HPM_CNT_WIDTH` (type `natural`, default=40) to configure the total bit width of the hardware performance monitor (HPM) counters (min 1, max 64); modified `crt0.S`: stops all counters (incl. HPMs), no user-level access to ANY counter; `neorv32.h`: added missing `mcounteren` and `mcountinhibit` CSR bit definitions |
| 28.03.2021 | 1.5.3.4 | default "test setup" `rtl/top_templates/neorv32_test_setup.vhd`: disabled PMP, implementing 4 HPM counters; :sparkles: added [`boards`](https://github.com/stnolting/neorv32/tree/main/boards) folder for exemplary FPGA setups |
| 27.03.2021 | 1.5.3.3 | minor optimization in CPU control engine; FPU comparator now uses comparators results from main ALU (reduces FPU hardware footprint) |
| 26.03.2021 | 1.5.3.2 | :sparkles: **added single-precision floating-point unit (FPU)** `rtl/core/neorv32_cpu_cp_fpu.vhd` implementing the `Zfinx` CPU extension; added/updated `Zfinx` **intrinsic library** and verification framework: [`sw/example/floating_point_test`](https://github.com/stnolting/neorv32/tree/main/sw/example/floating_point_test); added co-processor timeout counter to CPU to auto-terminate co-processor operations (for debugging only; default=deactivated) |
| 25.03.2021 | 1.5.3.1 | :bug: fixed bug in invalid floating-point instruction detection (caused CPU to stall if executing an invalid floating-point operation); intrinsic core library (mainly used for not-yet-supported CPU extensions like `B` and `Zfinx`): clean-up, added R3 instruction type |
| 24.03.2021 | **:rocket:1.5.3.0** | **New release** |
| 23.03.2021 | 1.5.2.9 | :sparkles: added new top generic to enable single-precision floating-point extensions `Zfinx`: `CPU_EXTENSION_RISCV_Zfinx` - type `boolean`, default = `false`, :warning: **extension is not yet operational!**; bootloader now shows available `Z*` extensions (from `mzext` CSR) like `Zifencei` |
| 21.03.2021 | 1.5.2.8 | :bug: fixed problem with linking `math.h` library in makefile; added floating-point-related global definitions to main VHDL package; added intrinsic core library file `sw/lib/include/neorv32_intrinsics.h` - intrinsic library support for CPU extensions, which are not yet supported by the upstream GCC, are based on this |
| 18.03.2021 | 1.5.2.7 | :bug: fixed bug in `sw/common/crt0.S` dummy exception handler (wrong order of register push/pop); changed upcoming floating-point extension (originally `F` extension) to `Zfinx` extension (-> [RISC-V `Zfinx` spec](https://github.com/riscv/riscv-zfinx)) - updated CPU infrastructure |
| 16.03.2021 | 1.5.2.6 | reworked atomic/exclusive memory access interface: removed CPU's `d_bus_lock_o` and `i_bus_lock_o` signals (were always zero anyway); removed top's `wb_lock_o` signal; added *exclusive access request* to Wishbone tag signal `wb_tag_o` (is now one bit wider); added more details to NEORV32.pdf regarding exclusive/atomic memory accesses (interface/protocol) |
| 09.03.2021 | 1.5.2.5 | added bit-manipulation `Zba` sub-extension (shifted-adds: `SH1ADD` `SH2ADD` `SH3ADD`) |
| 07.03.2021 | 1.5.2.4 | :sparkles: added new IO/peripheral module: **Smart LED Interface (NEOLED)** to interface intelligent LEDs (WS2812/WS2811/NeoPixel(c) compatible; supports RGB and RGBW LEDs in *parallel*) with internal TX buffer; new top generics: `IO_NEOLED_EN`: implement NEOLED interface when *true*; new top signals: `neoled_o`: single-wire async. serial data interface; FIFO re-fill interrupt via *fast interrupt request channel 9* `FIRQ9`; added new "NEOLED" section to data sheet; added SW driver library and simple NEOLED example program (`sw/example/demo_neopixel`) |
| 06.03.2021 | 1.5.2.3 | clean-up of CPU control code: fixed minor bug in F-extension's instruction decoding; changed coding style for CSR write access (old version might have caused "inferring latch..." warning in Intel Quartus); fixed default values for CSRs when according extensions are disabled |
| 04.03.2021 | 1.5.2.2 | added two new generics to configure CFS IO conduit sizes (implementing [issue #13](https://github.com/stnolting/neorv32/issues/13)): `IO_CFS_IN_SIZE` - type: `positive`, configures the size of `cfs_in_i` signal; `IO_CFS_OUT_SIZE` - type: `positive`, configures the size of `cfs_out_o` signal; minor edits to floating-point CPU infrastructure |
| 03.03.2021 | 1.5.2.1 | added CPU core infrastructure for *upcoming* single-precision floating-point extension `F`; :warning: **floating-point extension is NOT OPERATIONAL YET!** added new rtl file for the floating-point unit `rtl/core/neorv32_cpu_cp_fpu.vhd` (blank template!) |
| 01.03.2021 | **:rocket:1.5.2.0** | **New release** |
| 27.02.2021 | 1.5.1.11 | :bug: fixed several small bugs in *bitmanipulation extension* instruction decoding (not all `B` instructions triggered an *illegal instruction exception* when B-extension = disabled) |
| 25.02.2021 | 1.5.1.10 | :bug: fixed bugs in UART RTS/CTS hardware control flow - the new setup was verified on real hardware; added double-buffering to UART RX engine |
| 24.02.2021 | 1.5.1.9 | `mcounteren` CSR is hardwired to zero if user mode is not implemented (`CPU_EXTENSION_RISCV_U` = false); added `Zbs` (single-bit operations) sub-extension to bit-manipulation unit |
| 22.02.2021 | 1.5.1.8 | added programmable *RTS/CTS hardware flow control* to UARTs; new top signals: `uart0_rts_o`, `uart0_cts_i`, `uart1_rts_o`, `uart1_cts_i`; UART.TX engine will only start sending (if `CTS` flow control is activated) if `uart*_cts_i` is asserted (low-active); UART.RX engine signals (if `RTS` flow control is activated) via `uart*_rts_o` if it is ready to receive new data (low-active); added hw flow control parameter to uart setup functions `neorv32_uart*_setup()` |
| 20.02.2021 | 1.5.1.7 | removed `err_o` signal from custom functions subsystem `CFS`; processor *SoC fast interrupt input* `soc_firq_i` reduced to 6 channels (was 8) - mapped to CPU's `FIRQ_10` - `FIRQ_15`; added individual fast IRQs for `UART1` "RX complete" and "TX complete" conditions (-> `FIRQ_4` & `FIRQ_5`); changed FIRQ channels of TWI/SPI/GPIO interrupts |
| 18.02.2021 | 1.5.1.6 | added register buffer for enable signals to processor-internal clock generator; :bug: fixed bug in `sw/example/demo_twi` program: TWI clock speed message was wrong (factor 1/4 was missing) |
| 17.02.2021 | 1.5.1.5 | added a second independent UART: new UART is *secondary UART* `UART1`, the "old" UART is now the *primary UART* `UART0`; by default the **primary UART (UART0) is used for all user interface connection**; reworked *fast interrupt* `FIRQ` assignment/priority list - added UART1 RTX (receive *or* send done) fast interrupt; added hardware driver functions for new `UART1` - the "old" `neorv32_uart_*` function calls will map to the primary UART `UART0` for compatibility; renamed compiler flag to enable UART "simulation mode": `UART_SIM_MODE` -> `UART0_SIM_MODE` for primary UART, `UART1_SIM_MODE` for secondary UART (`UART_SIM_MODE` is still supported for compatibility and maps to `UART0_SIM_MODE`); added second simulation UART receiver for `UART1` to testbench; renamed UART simulation output files: `neorv32.testbench_uart.out` -> `neorv32.testbench_uart0.out` (testbench UART0 receiver), new: `neorv32.testbench_uart1.out` (testbench UART1 receiver), `neorv32.uart.sim_mode.text.out` and `neorv32.uart.sim_mode.data.out` -> `neorv32.uart0.sim_mode.text.out` and `neorv32.uart0.sim_mode.data.out` (for `UART0`), new `neorv32.uart1.sim_mode.text.out` and `neorv32.uart1.sim_mode.data.out` (for `UART1`) |
| 13.02.2021 | 1.5.1.4 | `HW_THREAD_ID` generic is now of type `natural`; `mret` instruction now requires an additional cycle to execute; logic optimization of CPU's control logic -> smaller hardware footprint and higher f_max; updated CPU synthesis results; removed top module's generic initialization using `(others => '0')` (targeting [issue #8](https://github.com/stnolting/neorv32/issues/8)) |
| 09.02.2021 | 1.5.1.3 | modified CPU architecture: now using a "pseudo" ALU co-processor to get the result of a CSR read operation into data path, removing one input from register file input mux -> shorter critical path |
| 08.02.2021 | 1.5.1.2 | added new peripheral/IO module: **Numerically-Controlled Oscillator `NCO`**: three independent channels, 20-bit phase accu, 20-bit tuning word, fixed 50% duty cycle mode or pulsed mode; added according HW drivers and example program |
| 07.02.2021 | **:rocket:1.5.1.0** | **New release** |
| 05.02.2021 | 1.5.0.11 | :bug: fixed error in atomic instruction `LR.W` |
| 05.02.2021 | 1.5.0.10 | CPU now provides 16 fast interrupt request lines (`FIRQ0 .. FIRQ15`) with according `mie`/`mip` CSR bits and `mcause` trap codes; removed IRQ enable flags from SPI, UART & TWI; reworked processor-internal interrupt system - assignment/priority list; UART now features individual IRQs for "RX-done" and "TX-done" conditions; changed bit order in TWI control register |
| 29.01.2021 | 1.5.0.9 | removed custom function units `CFU0` & `CFU1`; :sparkles: replaced them by new *Custom Functions Subsystem `CFS`*, which provides up to 32x32-bit memory-mapped registers; new configuration generics: `IO_CFS_EN`, `IO_CFS_CONFIG`; new top entity signals: `cfs_in_i`, `cfs_out_o`; increased processor's IO area from 128 bytes to 256 bytes, now starting at `0xFFFFFF00` |
| 28.01.2021 | 1.5.0.8 | added *critical limit* for number of implemented PMP regions: When implementing more PMP regions than a certain critical limit, an additional register stage is automatically inserted into the CPU's memory interfaces, increasing the latency of instruction fetches and data access by +1 cycle. The critical limit can be adapted for custom use by a constant from the main VHDL package file (rtl/core/neorv32_package.vhd). The default value is 8: `constant pmp_num_regions_critical_c : natural := 8;` |
| 27.01.2021 | 1.5.0.7 | added four additional *fast interrupt* channels `FIRQ4..7`, available via processor's top `soc_firq_i(3:0)` signal for custom platform use; fixed minor error in UART setup function (baud rate prescaler calculation for very high baud rates) |
| 26.01.2021 | 1.5.0.6 | minor logic optimization of CPU's `B` extension co-processor (reducing area); minor logic optimization of `HPM` triggers (reducing area); reworked CPU's co-processor interface; minor logic optimization of branch condition check (to shorten critical path) |
| 23.01.2021 | 1.5.0.5 | reworked true random number generator `TRNG`: architecture is now based on several simple ring oscillators with incrementing length; changed control register bits; updated according driver functions and demo program |
| 22.01.2021 | 1.5.0.4 | :bug: fixed BUG in bootloader (that caused it to immediately crash after reset if SPI/MTIME/GPIO peripherals were not implemented); reworked watchdog timer `WDT`: removed watchdog access password, added option to lock configuration until next system reset, changed control register bits - updated driver functions and demo/test programs |
| 17.01.2021 | 1.5.0.3 | CPU data register file can now be mapped to a **single** "true dual-port" block RAM by the synthesizer (requiring only 1024 memory bits instead of 2048); :bug: fixed typo error in `sim/rtl_modules/neorv32_imem.vhd`; modified `M` co-processor (due to register file read access modification), reduced switching activity when co-processor is idle; logic/arithmetic operations of `B` extension only require 3 cycles now, reduced switching activity when co-processor is idle |
| 15.01.2021 | 1.5.0.2 | added instruction cache associativity configuration (number of sets); new configuration generic: `ICACHE_ASSOCIATIVITY` -> number of sets (1 = direct mapped, 2 = 2-way set-associative), has to be a power of two; if associativity is > 1 the used replacement policy is *least recently used (LRU)*; :bug: fixed bug in `sw/lib/source/neorv32_cpu.c` PMP.CFG configuration function |
| 14.01.2021 | 1.5.0.1 | added new HPM trigger event: multi-cycle ALU operation wait cycle (`HPMCNT_EVENT_WAIT_MC`); renamed `neorv32_cache.vhd` -> `neorv32_icache.vhd` |
| 10.01.2021 | **:rocket:1.5.0.0** | Renamed configuration generics: `*_USE` -> `*_EN` |
| 10.01.2021 | 1.4.9.10 | :sparkles: Added support for [**bit manipulation extension (`B`)**](https://github.com/riscv/riscv-bitmanip) - base subset `Zbb` only (:warning: RISC-V `B` (sub-)extensions are not officially ratified yet; compatible to version "0.94-draft"); enabled via new configuration constant `CPU_EXTENSION_RISCV_B` (default = false); supported `Zbb` instructions: `CLZ` `CTZ` `CPOP` `SEXT.B` `SEXT.H` `MIN[U]` `MAX[U]` `ANDN` `ORN` `XNOR` `ROL` `ROR` `RORI` `zext`(*pseudo-instruction* for `PACK rd, rs, zero`) `rev8`(*pseudo-instruction* for `GREVI rd, rs, -8`) `orc.b`(*pseudo-instruction* for `GORCI rd, rs, 7`); added `B` flag to `misa` CSR; added `Zbb` flag to `mzext` CSR |
| 03.01.2021 | 1.4.9.8 | Added HPM trigger for instruction issue wait cycle (caused by pipeline flush); all HPM counters do not increment if CPU is in sleep mode; fixed CoreMark timer overflow issues; `rtl/core/neorv32_busswitch.vhd`: removed wait states, less load/store wait cycles -> faster execution; updated CoreMark results |
| 02.01.2021 | 1.4.9.7 | :sparkles: added RISC-V hardware performance monitors (`HPM`); new CSRs: `mhpmevent*`(3..31), `[m]hpmcounter*[h]`(3..31), amount configurable via top's generic `HPM_NUM_CNTS`; supported counter events: active cycle, retired instruction, retired compressed instruction, instruction fetch memory wait cycle, load operation, store operation, load/store memory wait cycle, unconditional jump, conditional branch (all), conditional taken branch, entered trap, illegal instruction exception; PMP can now have up to 64 regions; number of regions configured via top's `PMP_NUM_REGIONS` generic; removed obsolete top's `PMP_USE` generic; removed PMP flag from `mzext` CSR; minimal region granularity (in bytes) configured via top's `PMP_MIN_GRANULARITY` generic, has to be a power of two and >= 8 bytes; :bug: fixed bug in sleep (`wfi`) instruction |
| 29.12.2020 | 1.4.9.5 | New UART features: "frame check" (test if stop bit is set), error indicated via `UART_DATA` reg's `UART_DATA_FERR` flag; configurable parity bit (`UART_CT.UART_CT_PMODE1:UART_CT_PMODE0`, 00=no parity; 10=even parity; 11=odd parity); parity error indicated via `UART_DATA` reg's `UART_DATA_PERR` flag; moved UART's RX overrun flag to `UART_DATA.UART_DATA_OVERR` |
| 26.12.2020 | 1.4.9.4 | removed `zicnt_en` option (was used to discard the standard RISC-V counters and timers from implementation); added missing `mcounteren` CSR (to allow read-access from user-level code to `cycle[h]` / `time[h]` / `[m]instret[h]` CSRs); available bits: 0: `CY`, 1: `TM`, 2: `IR`; added missing `mcountinhibit` CSR (to disable auto-increment of `[m]cycle[h]` / `[m]instret[h]` CSRs); available bits: 0: `CY`, 2: `IR`; :warning: renamed CSR bits C-code-aliases: `CPU_*` -> `CSR_*` |
| 25.12.2020 | 1.4.9.3 | Added missing `UBE` flag to `mstatus` CSR, indicates Endianness for load/stores in user mode (always set indicating BIG-endian mode), is a copy of `mstatush.mbe` |
| 23.12.2020 | 1.4.9.2 | :sparkles: added processor-internal instruction cache `rtl/core/neorv32_cache.vhd` (direct mapped); new configuration generics: `ICACHE_USE` (implement cache), `ICACHE_BLOCK_SIZE` (cache block/page/line size), `ICACHE_NUM_BLOCKS` (number of cache blocks); added `SYSINFO_CACHE` register to SYSINFO to check cache configuration by software |
| 20.12.2020 | 1.4.9.1 | :bug: fixed bug in CPU's instruction fetch engine (alignment_errors/bus_errors were not acknowledged correctly); added `BUS_TIMEOUT` generic to CPU (defines the amount of cycles after which an *unacknowledged* bus access will get terminated and raises a bus access fault exception) |
| 19.12.2020 | **:rocket:1.4.9.0** | Testbench: added memory-mapped triggers to trigger core's "machine software & external interrupts"; `sw/example/cpu_test`: removed CFU tests, added `MEI` and `MSI` tests; added **RISC-V-Compliance Test Framework** to repository (`riscv-compliance/`), core passes all `rv32` tests (riscv-compliance v2.1) |
| 18.12.2020 | 1.4.8.13 | Added additional simulation files: simulation-optimized IMEM-ROM (so far, this is only relevant for the *new* NEORV32 RISC-V Compliance test framework v2.0); **:sparkles: Processor now passes all `rv32` tests of the new [RISC-V Compliance Test Framework v2.0](https://github.com/riscv/riscv-compliance/releases/tag/v2.0) :sparkles:** |
| 16.12.2020 | 1.4.8.12 | :warning: fixed (another) bug in `mtval` CSR generation (wrong value for "breakpoint" trap); updated `mtval` value table in data sheet; fixed bug in load/store operation (introduced in version 1.4.8.10) |
| 16.12.2020 | 1.4.8.11 | :warning: fixed bug in `mtval` CSR generation (wrong values for some traps); fixed bug in `mip` CSR (writing zero to implemented bits now actually clears pending interrupts); fixed bug in IRQ priority encoding (machine software interrupt `MSI` comes before machine timer interrupt `MTI`) |
| 12.12.2020 | 1.4.8.10 | :warning: fixed wrong `trap_reset_c` encoding (in its expanded form it should be 0x80000000) and reset logic: hardware `mcause` register is now set to `trap_reset_c` after a hardware reset; crt0.S start-up code now sets `mcause` to `trap_reset_c` after finishing hardware setup |
| 11.12.2020 | 1.4.8.9 | Added option to exclude standard RISC-V performance counters (`[m]cycle[h]` and `[m]instret[h]`) for size-constrained implementations; disabled by setting VHDL package's `zicnt_en_c` constant to false; software can determine state of `zicnt_en_c` via `mzext` CSR's `CPU_MZEXT_ZICNT` bit; added new signal to processor top entity: `mtime_i`, this signal is used for updating the `time[h]` CSRs if the processor-internal MTIME unit is disabled (via `IO_MTIME_USE` = `false`) |
| 10.12.2020 | 1.4.8.8 | Added missing `mstatush` CSR (only bit `MBE` is implemented yet); added option to configure external bus interface for BIG- or little-endian byte-order, configured via VHDL package `xbus_big_endian_c` constant, default = BIG-endian, software can check endianness of the interface via SYSINFO's `SYSINFO_FEATURES(SYSINFO_FEATURES_MEM_EXT_ENDIAN)` flag; added `mstatush` CSR and endianness information to data sheet |
| 09.12.2020 | 1.4.8.7 | Added missing *environment call from U-mode* exception (via `ecall` instruction in user-mode); added environment call from U-mode to data sheet |
| 09.12.2020 | 1.4.8.6 | :warning: fixed bugs in ALU's co-processor interface: ATOMIC `A` extension could not be used without MULDIV `M` extension, CPU might have permanently stalled when executing an instruction from a disabled ISA extension; :lock: added security feature: illegal user-level CSR read access will always return zero; added new section *Execution Safety* to neorv32.pdf data sheet |
| 07.12.2020 | 1.4.8.5 | :warning: fixed bug in next-PC logic (introduced with version 1.4.8.1) that caused instruction fetch from memories with more than 1 cycle latency to fail |
| 05.12.2020 | 1.4.8.4 | :warning: fixed bug in physical memory protection (PMP): region size configuration was incorrect; removed `PMP_NUM_REGIONS` and `PMP_GRANULARITY` CPU/processor generics (PMP configuration now via package constants); reworked section *2.4. Instruction Sets and CPU Extensions* of neorv32.pdf |
| 04.12.2020 | 1.4.8.2 | Added PMA (physical memory attribute) to processor-internal IO region: `NO EXECUTE`; added *3.3.Address Space/Physical Memory Attributes (PMAs)* section to neorv32.pdf |
| 03.12.2020 | 1.4.8.1 | Optimized CPU program counter (PC) update logic and "next PC" computation (shortened critical path); updated bootloader (configuration option for direct-boot-from-SPI-flash only) and *customization* text in neorv32.pdf |
| 01.12.2020 | **:rocket:1.4.8.0** | :warning: fixed bug in CPU-internal co-processor interface; optimized multiplier unit (~1 faster); added CPU `A` (atomic) extension support (only `lr.w` and `sc.w` instructions yet); added `lock` signal to CPU and processor's external bus interface |
| 28.11.2020 | 1.4.7.6 | Split ALU core operations: shortened critical path - replaced ALU output 8:1 mux by a 4:1 mux |
| 26.11.2020 | 1.4.7.5 | Minor rtl clean-up; CSR access instructions are one cycle faster now (3 cycles now); system/environment instructions (`ecall` `ebreak` `mret` `wfi`) need one additional cycle (4 cycles now) |
| 25.11.2020 | 1.4.7.4 | :warning: fixed bug in `FENCE.I` instruction that corrupted instruction fetch when executing code from processor-external memory; default testbench (`sim/neorv32_tb.vhd`) now features external IMEM, external DMEM and external IO connected via external bus interface; simulation now allows CPU to execute code using external memories only (no internal IMEM/DMEM); optimized CPU's instruction fetch interface (no more unnecessary transfer cancel requests) |
| 20.11.2020 | 1.4.7.2 | :warning: fixed bug in CPU bus unit that caused a memory exception after reset in some cases; added second simulated external (Wishbone) memory to testbench (one memory for simulating an external IMEM, one memory for simulating external memory-mapped IO); external bus interface (`wishbone`) now makes sure that a canceled bus transfer is really understood by the accessed peripheral |
| 20.11.2020 | 1.4.7.1 | Removed deprecated "update_enable signal" from IMEM |
| 11.11.2020 | **:rocket:1.4.7.0** | Further optimized pipeline front-end: Jumps and branches are one cycle faster (+5% coremark performance); updated synthesis results; updated performance results; added `hello_world` example program |
| 07.11.2020 | 1.4.6.7 | Updated bootloader (size optimization) and changed processor version output; added project logo; minor data sheet edits |
| 03.11.2020 | 1.4.6.6 | Removed SPI module's *buggy* "LSB-first mode", SPI module now always sends data MSB-first; removed SPI.CTRL `SPI_CT_DIR` bit; modified bit order in SPI CTRL register; updated SPI SW library |
| 02.11.2020 | 1.4.6.5 | :warning: Fixed bug in CPU's illegal instruction detection logic; CPU rtl code optimizations - further reduced hardware footprint; rtl code clean-ups |
| 01.11.2020 | 1.4.6.4 | :warning: Fixed bug in `[m]instret[h]` and `[m]cycle[h]` carry logic; CPU hardware optimizations (area reduction, shortened critical path) |
| 29.10.2020 | 1.4.6.3 | rtl code clean-up; made preparations for additional co-processors |
| 25.10.2020 | 1.4.6.2 | Added tag signal (`wb_tag_o`) to processor's Wishbone bus; removed processor's `priv_o` - privilege level is now encoded in Wishbone *tag* signal; added a more sophisticated **FreeRTOS** example ("full_demo") |
| 24.10.2020 | **:rocket:1.4.6.0** | Completely reworked external memory interface (WISHBONE), removed now-obsolete processor generic `MEM_EXT_REG_STAGES`; added processor wrapper with **AXI4-Lite interface** |
| 22.10.2020 | 1.4.5.11 | TWI: Added new control register flag to enable/disable SCL clock stretching by peripheral devices |
| 22.10.2020 | 1.4.5.10 | Added `i_bus_priv_o` and `d_bus_priv_o` signals to CPU_top and `priv_o` to Processor_top to show privilege level of bus access (from `mstatus` MPP); :warning: Fixed bug in external memory interface [WISHBONE] (non-standard Wishbone components were able to corrupt processor-internal ACK/ERR signal logic) |
| 20.10.2020 | 1.4.5.9 | :warning: Fixed bug in CPU "sleep" instruction (`WFI` - wait for interrupt) |
| 20.10.2020 | 1.4.5.8 | *Machine timer interrupt* is available as processor input pin (`mtime_irq_i`) if internal `MTIME` is not implemented (`IO_MTIME_USE` = false) |
| 18.10.2020 | 1.4.5.7 | Added new IO peripheral/Device: Second CFU (CFU1); renamed old CFU to CFU0; CFU VHDL files: `neorv32_cfu0.vhd` & `neorv32_cfu1.vhd`; removed CFU interrupt |
| 17.10.2020 | 1.4.5.5 | New makefile target `upload` uploads an executable directly to the bootloader from the console |
| 17.10.2020 | 1.4.5.4 | Added new CPU/Processor generic `FAST_SHIFT_EN` (default = *false*) to enable implementation of a fast (but large) barrel shifter for accelerating CPU shift instructions; updated CoreMark performance results |
| 16.10.2020 | 1.4.5.2 | Added read-only flag to custom `mzext` CSR to check if physical memory protection (PMP) is implemented; added [C] `mzext` CSR name aliases to neorv32.h |
| 15.10.2020 | 1.4.5.1 | Fixed "imprecise exceptions": `mtval` did not always reflect the correct value according to the instruction that caused the exception; fixed bug in RTE: Debug trap handler was not showing the correct `mepc` value |
| 13.10.2020 | **:rocket:1.4.5.0** | An official *open-source RISC-V architecture ID* was assigned to the project: decimal = `19`, 32-bit hexadecimal = `0x00000013` - software can retrieve the ID from the `marchid` CSR |
| 12.10.2020 | 1.4.4.9 | Added *alignment flags* to makefiles: branch/jump/call targets are forced to be 32-bit aligned -> increases performance when using the `C` extension; added makefile flag listing to NEORV32.pdf; updated performance results for CPUs with `C` extension; `crt0.S` will initialize *all* registers with zero if not using `E` extension and not compiling bootloader |
| 11.10.2020 | 1.4.4.8 | Reworked pipeline frontend: Optimized fetch engine, added issue engine, faster instruction fetch after taken branches + reduced hardware requirements; updated synthesis and performance results |
| 11.10.2020 | 1.4.4.6 | Added option to configure external memory interface (Wishbone) to either use *standard/classic protocol* (default) or *pipelined protocol* (for better timing): via `wb_pipe_mode_c` constant in VHDL package file (`rtl/core/neorv32_package.vhd`); added help text to NEORV32.pdf section "3.4.4. Processor-External Memory Interface (WISHBONE)" |
| 08.10.2020 | 1.4.4.5 | Removed CPU's `BUS_TIMEOUT` and processor's `MEM_EXT_TIMEOUT` generics; instead, a global configuration `bus_timeout_c` in the VHDL package file is used now |
| 08.10.2020 | 1.4.4.4 | Removed `DEVNULL` device; all simulation output options from this device are now available as `SIM_MODE` in the `UART`; `mcause` CSR can now also be written; FIXED: trying to write a read-only CSR will cause an illegal instruction exception; for compatibility reasons any write access to the misa CSR will be ignored and will NOT cause an exception |
| 07.10.2020 | 1.4.4.2 | Simplified ALU's set of core operations; removed co-processor data mux right after ALU -> shorter critical path; CPU control VHDL code clean-up and CSR write logic optimization; optimized IMEM/DMEM access logic; added note regarding alignment of IMEM/DMEM |
| 05.10.2020 | **:rocket:1.4.4.0** | :warning: Fixed bug in external memory interface: Executing code from external memory was causing an instruction fetch stall |
| 02.10.2020 | 1.4.3.9 | `[m]cycleh` and `[m]instreth` CSRs are now 32-bit wide (-> fully RISC-V-compliant) |
| 01.10.2020 | 1.4.3.8 | Added CPU top entity wrapper with resolved port signals `rtl/top_templetes/neorv32_cpu_stdlogic.vhd`; optimized ALU core functions -> shorter critical path, less control overhead, reduced HW footprint |
| 27.09.2020 | 1.4.3.3 | Further improved ALU and control logic; CSR access instructions require one additional cycle now (to let side effects kick in); updated synthesis results; added CFU hardware driver dummy |
| 26.09.2020 | 1.4.3.2 | :warning: Fixed bug in `CSRRWI` instruction (introduced with version 1.4.3.1); further ALU operand logic optimizations; updated CPU data path figure |
| 25.09.2020 | 1.4.3.1 | Register file's `x0` is now a physical register; this register is initialized by the hardware and locked afterwards; removed "set to zero" stage -> smaller hardware footprint and shorter critical path; added processor top entity wrapper with resolved signals `rtl/top_templetes/neorv32_top_stdlogic.vhd` |
| 16.09.2020 | **:rocket:1.4.3.0** | Simplified memory configuration: removed processor top's memory space configuration generics (`MEM_ISPACE_BASE`, `MEM_ISPACE_SIZE`, `MEM_DSPACE_BASE`, `MEM_DSPACE_SIZE`); data/instruction space sizes are irrelevant for hardware; instruction/data space base addresses are fixed (but can be modified in NEORV32 VHDL package file); modified SYSINFO registers; adapted bootloader, crt0 start-up code and linker script; stack configuration is now done via linker script; reworked chapter "address space"; added CFU interrupt -> fast interrupt channel 1 (shared with GPIO) |
| 14.09.2020 | 1.4.2.0 | Removed option to disable CSR counters (via `CSR_COUNTERS_USE` generic) since these counters are mandatory according to the RISC-V specs; added new IO/peripheral device: custom functions unit (`CFU`) for tightly-coupled custom co-processors; improved timing of processor-internal clock generator; fixed wrong labels in address space figure and removed dedicated exception vectors box; added mask register to GPIO unit to specify which input pins can trigger a pin-change interrupt |
| 11.09.2020 | 1.4.0.4 | Reworked `TRNG` architecture and interface; added text regarding fast interrupt channels usage for the NEORV32 processor |
| 02.09.2020 | 1.4.0.2 | :warning: Fixed bugs in external memory interface; added option to define latency of simulated external memory in testbench; hardware configuration sanity checks will now only appear once in console; added more details to data sheet section 3.3. Address Space; fixed typos in MEM_*_BASE and MEM_*_SIZE generic names |
| 01.09.2020 | 1.4.0.1 | Using registers above `x15` when the `E` extension is enabled will now correctly cause an illegal instruction exception |
| 29.08.2020 | **:rocket:1.4.0.0** | Rearranged and reworked data sheet; added FreeRTOS port, demo & short referencing chapter; removed bootloader-specific linker scripts -> main linker script is used for both applications and bootloader; bootloader can now have `.data` and `.bss` sections; improved IMEM and BOOTROM memory initialization -> faster synthesis; image generator now constrains init array size to actual executable size; peripheral/IO devices can only be written in full word mode (= 32-bit); GPIO ports are now 32-bit wide |
| 23.08.2020 | 1.3.7.3 | Added custom `mzext` CSR to check for available Z* CPU extensions; multiplier's FAST_MUL mode is one cycle faster now; updated performance data |
| 20.08.2020 | 1.3.7.2 | Removed bootloader-specific crt0 -> bootloader now uses std crt0; makefiles now also support asm and cpp files; made linker scripts more general; renamed makefile target "compile" to "exe" ("compile" is still available for compatibility) |
| 14.08.2020 | **:rocket:1.3.7.0** | Simplified CPU fetch engine; added configurable CPU instruction prefetch buffer (ipb) FIFO; optimized CPU execute engine; updated performance data |
| 06.08.2020 | 1.3.6.5 | Added `FAST_MUL_EN` generic to enable mapping of the multiplier core to DSP blocks; ALU.shifter is no longer triggered when executing MULDIV operations; added benchmark results for DSP-based multiplier configurations; updated implementation and performance results; simplified makefiles using implicit libc definition; crt0 only initializes lowest 16 registers |
| 03.08.2020 | **:rocket:1.3.6.0** | Relocated `DEVNULL` (changed base address); minor edits, optimization and clean-ups |
| 30.07.2020 | 1.3.5.2 | Added register stage to PMP mask generation to shorten critical path; removed automatic IRQ enable/disable from RTE install/uninstall functions |
| 30.07.2020 | 1.3.5.1 | :warning: Fixed bug(s) in PMP mask generation; `misa.Z` flag is not yet defined by the RISC-V specs., hence it is read-only and read as zero |
| 29.07.2020 | 1.3.5.0 | Added user privilege level, enabled via new `CPU_EXTENSION_RISCV_U` generic; :warning: fixed error in `mstatus(mpie)` logic; implemented RISC-V spec.-compliant Physical Memory Protection (PMP); allows up to 8 regions but only NAPOT mode is supported yet |
| 25.07.2020 | 1.3.0.0 | `mcause` CSR is read-only now!; removed `CLIC`, added 4 fast IRQ channels to CPU with according flags in `mie` and `mip` and trap IDs; updated core libraries; updated NEORV32 RTE; highly reworked data sheet; updated synthesis and performance results |
| 21.07.2020 | 1.2.0.6 | Added doc section regarding the CPU's data and instruction interfaces; optimized CPU fetch engine; updated iCE40 synthesis results |
| 20.07.2020 | 1.2.0.5 | Less penalty for taken branches and jumps (2 cycles faster) |
| 19.07.2020 | 1.2.0.0 | CPU bus unit now has independent buses for instruction fetch and data access -> merged into single processor bus via new bus switch unit; doubled speed of ALU shifter unit again; all bits of `mcause` CSR can now be modified by application program (fully RISC-V-compliant); performance counters CSRs `[m]cycleh` and `[m]instreth` are only 20-bit wide; removed NEORV32-specific custom CSRs -> all processor-related information can be obtained from the new `SYSINFO` IO module (CPU is now more independent from processor configuration); changed IO address of `DEVNULL`; fixed bug in bootloader's trap handler; added `USER_CODE` generic to assign a custom user code that can be read by software (from `SYSINFO`) |
| 14.07.2020 | 1.1.0.0 | Added `fence_o` and `fencei_o` signals to top entity to show if a `fence` or `fencei` instruction is executed; added `mvendorid` and `marchid` CSRs (both are always zero); ALU shift unit is faster now; two lowest bits of `mtvec` are always zero; fixed wrong instruction exception priority; removed `HART_ID` generic -> `mhartid` CSR is always read as zero; performance counters (`[m]cycle[h]`, `[m]instret[h]` and `time[h]`) are also available in embedded mode but can be explicitly disabled via the `CSR_COUNTERS_USE` generic; `mcause` CSR only allows write access to bit 31 and bits 3:0; updated synthesis reports |
| 10.07.2020 | 1.0.6.0 | Non-taken branches are now 1 cycle faster; the `time[h]` CSR now correctly reflects the system time from the MTIME unit; fixed WFI instruction permanently stalling the CPU; `[m]cycle[h]` counters now stop counting when CPU is in sleep mode; `minstret[h]` and `mcycle[h]` now also allow write-access |
| 09.07.2020 | 1.0.5.0 | `X` flag of `misa` CSR is zero now; the default SPI flash boot address of the bootloader is now `0x0080000`; new exemplary FPGA utilization results for Intel, Lattice and Xilinx; `misa` CSR is read-only again, switching compressed extension on/off is pretty bad for the fetch engine; `mtval` and `mcause` CSRs now allow write accesses and are finally RISC-V-compliant; time low and high registers of `MTIME` peripheral can now also be written by user; `MTIME` registers only allow full-word write accesses |
| 06.07.2020 | 1.0.1.0 | Added missing `fence` instruction; added new generic to enable optional Zifencei CPU extension for instruction stream synchronization |
| 05.07.2020 | 1.0.0.0 | New CPU architecture: Fetch and execute engines; increased CPI; timer and counter CSRs are now all 64-bit wide; :warning: fixed CSR access errors; fixed `C.LW` decompression logic; `misa` flags `C` and `M` are now r/w -> compressed mode and multiplier/divider support can be switched on/off during runtime; PC(0) is now always zero; :warning: fixed bug in multiplier/divider co-processor; renamed SPI signals; added RISC-V compliance check information -> processor now passes the official RISC-V compliance tests |
| 25.06.2020 | 0.0.2.5 | Added `DEVNULL` device; added chapter regarding processor simulation; fixed/added links; fixed typos; added FPGA implementation results for iCE40 UP |
| 23.06.2020 | **:rocket:0.0.2.3** | Publication |

11
CITATION.cff 100644
View File

@ -0,0 +1,11 @@
cff-version: 1.2.0
message: "If you are using this project, please cite it as below."
authors:
  - family-names: "Nolting"
    given-names: "Stephan"
  - family-names: "All the Awesome Contributors"
    given-names: ""
title: "The NEORV32 RISC-V Processor"
doi: 10.5281/zenodo.5018888
date-released: 2023-08-06
url: "https://github.com/stnolting/neorv32"

76
CODE_OF_CONDUCT.md 100644
View File

@ -0,0 +1,76 @@
# Contributor Covenant Code of Conduct
## Our Pledge
In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to creating a positive environment
include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
## Scope
This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team via [_NEORV32 Issues_](https://github.com/stnolting/neorv32/issues). All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

28
CONTRIBUTING.md 100644
View File

@ -0,0 +1,28 @@
# Contributing
Any kind of help is highly appreciated! So if you have any questions, bug reports, ideas or if you want to
give some kind of feedback, feel free to [open a new issue](https://github.com/stnolting/neorv32/issues/new/choose)
or start a new [discussion](https://github.com/stnolting/neorv32/discussions). Also look out for issues and pull requests labeled with
[![help-wanted](https://img.shields.io/badge/-help%20wanted-brightgreen)](https://github.com/stnolting/neorv32/labels/help%20wanted) and
[![good-first-issue](https://img.shields.io/badge/-good%20first%20issue-purple)](https://github.com/stnolting/neorv32/labels/good%20first%20issue);
the latter one might be a good starting point for newcomers and beginners.
Please note that we have a [Code of Conduct](https://github.com/stnolting/neorv32/blob/main/CODE_OF_CONDUCT.md).
Please follow it in all your interactions with this project.
## Contributing Process
Here is a simple guideline if you'd like to contribute code modifications to this project:
1. [Fork](https://github.com/stnolting/neorv32/fork) this repository and clone the fork: `git clone https://github.com/<your-username>/neorv32.git`
2. In your local copy, create a feature branch in your fork: `git checkout -b awesome_new_feature_branch`
3. Create a new remote for the upstream repo: `git remote add upstream https://github.com/stnolting/neorv32`
4. Commit your modifications: `git commit -m "Awesome new feature!"`
5. Push to the branch: `git push origin awesome_new_feature_branch`
6. Create a new [pull request](https://github.com/stnolting/neorv32/pulls); please make sure that your feature branch is up-to-date
with the project's `main` branch; we will review it as soon as possible
7. If you like, discuss / show-case your work on the project's [discussion board](https://github.com/stnolting/neorv32/discussions)
If your pull request modifies the code base in a significant way (something more than just a typo fix) please also update
the version identifier (increment the least-significant number) and add a corresponding change log entry
(see [CHANGELOG.md](https://github.com/stnolting/neorv32/blob/main/CHANGELOG.md) for more information).
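
As an illustration of the version bump described above, here is a minimal Python sketch. The helper name `bump_custom` is hypothetical and not part of the repository; it only operates on the dotted `MAJOR.MINOR.PATCH.custom` string used in the change log. The `hw_version_c` constant in the VHDL package still has to be updated by hand.

```python
def bump_custom(version: str) -> str:
    """Increment the least-significant (custom) element of MAJOR.MINOR.PATCH.custom."""
    parts = version.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH.custom, got {version!r}")
    # Only the last element changes; MAJOR, MINOR and PATCH are left untouched.
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


if __name__ == "__main__":
    print(bump_custom("1.9.5.8"))
```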

63
do.py 100644
View File

@ -0,0 +1,63 @@
#!/usr/bin/env python3
# doit
from sys import executable, argv as sys_argv, exit as sys_exit
from os import environ
from pathlib import Path
from doit.action import CmdAction
from doit.cmd_base import ModuleTaskLoader
from doit.doit_cmd import DoitMain
DOIT_CONFIG = {"verbosity": 2, "action_string_formatting": "both"}
ROOT = Path(__file__).parent
def task_SoftwareFrameworkTests():
    return {
        "actions": [
            # Check toolchain
            "make -C sw/example/processor_check check",
            # Generate executables for all example projects
            "make -C sw/example clean_all exe",
            # Compile and install bootloader
            "make -C sw/bootloader clean_all info bootloader",
        ],
        "doc": "Build all sw/example/*; install bootloader",
    }


def task_Documentation():
    return {
        "actions": ["make -C docs {posargs}"],
        "doc": "Run a target in subdir 'docs'",
        "uptodate": [False],
        "pos_arg": "posargs",
    }


def task_DeployToGitHubPages():
    cwd = str(ROOT / "public")
    return {
        "actions": [
            CmdAction(cmd, cwd=cwd)
            for cmd in [
                "git init",
                "cp ../.git/config ./.git/config",
                "touch .nojekyll",
                "git add .",
                'git config --local user.email "push@gha"',
                'git config --local user.name "GHA"',
                "git commit -am '{posargs}'",
                "git push -u origin +HEAD:gh-pages",
            ]
        ],
        "doc": "Create a clean branch in subdir 'public' and push to branch 'gh-pages'",
        "pos_arg": "posargs",
    }


if __name__ == '__main__':
    sys_exit(DoitMain(ModuleTaskLoader(globals())).run(sys_argv[1:]))

2529
docs/Doxyfile 100644

File diff suppressed because it is too large

65
docs/Makefile 100644
View File

@ -0,0 +1,65 @@
.DEFAULT_GOAL := help

all: pdf html ug-pdf ug-html
	mkdir -p public/img/
	cp -vr figures/* public/img/

# Generate PDF datasheet
pdf:
	[ -f revnumber.txt ] && REVNUMBER='-a revnumber='"$$(cat revnumber.txt)" || unset REVNUMBER; \
	asciidoctor-pdf $$REVNUMBER \
	  -a allow-uri-read \
	  -a pdf-theme=neorv32-theme.yml \
	  -r asciidoctor-diagram \
	  datasheet/main.adoc \
	  --out-file public/pdf/NEORV32.pdf

# Generate HTML datasheet
html:
	[ -f revnumber.txt ] && REVNUMBER='-a revnumber='"$$(cat revnumber.txt)" || unset REVNUMBER; \
	asciidoctor $$REVNUMBER \
	  -r asciidoctor-diagram \
	  datasheet/index.adoc \
	  --out-file public/index.html

# Generate PDF user guide
ug-pdf:
	[ -f revnumber.txt ] && REVNUMBER='-a revnumber='"$$(cat revnumber.txt)" || unset REVNUMBER; \
	asciidoctor-pdf $$REVNUMBER \
	  -a allow-uri-read \
	  -a pdf-theme=neorv32-theme.yml \
	  -r asciidoctor-diagram \
	  userguide/main.adoc \
	  --out-file public/pdf/NEORV32_UserGuide.pdf

# Generate HTML user guide
ug-html:
	[ -f revnumber.txt ] && REVNUMBER='-a revnumber='"$$(cat revnumber.txt)" || unset REVNUMBER; \
	asciidoctor $$REVNUMBER \
	  -r asciidoctor-diagram \
	  userguide/index.adoc \
	  --out-file public/ug/index.html

# Generate DOXYGEN software documentation
doxygen:
	doxygen Doxyfile

# Generate revnumber.txt for overriding the revnumber attribute in 'pdf' and/or 'html'
revnumber:
	if [ `git tag -l | grep nightly` ]; then git tag -d nightly; fi
	git describe --long --tags | sed 's#\([^-]*-g\)#r\1#;' > revnumber.txt
	cat revnumber.txt

# Build 'pdf' and 'html' in an 'asciidoctor-wavedrom' container
container: revnumber
	docker run --rm -v /$(shell pwd)://documents/ btdi/asciidoctor make all

# Help
help:
	@echo "Targets:"
	@echo " help - show this text"
	@echo " pdf - build datasheet as pdf file (public/pdf/NEORV32.pdf)"
	@echo " html - build datasheet as HTML page (public/index.html)"
	@echo " ug-pdf - build user guide as pdf file (public/pdf/NEORV32_UserGuide.pdf)"
	@echo " ug-html - build user guide as HTML page (public/ug/index.html)"
	@echo " doxygen - build software documentation as HTML page (doxygen_build/html/index.html)"

34
docs/README.md 100644
View File

@ -0,0 +1,34 @@
## Project Documentation
### [`datasheet`](https://github.com/stnolting/neorv32/tree/main/docs/datasheet)
AsciiDoc sources for the NEORV32 data sheet. The online version of the data sheet is
available at [https://stnolting.github.io/neorv32](https://stnolting.github.io/neorv32).
### [`figures`](https://github.com/stnolting/neorv32/tree/main/docs/figures)
Figures and images used by the data sheet, user guide and the webpage(s). The corresponding
license(s) are listed in [`figures/license.md`](https://github.com/stnolting/neorv32/blob/main/docs/figures/license.md).
### [`icons`](https://github.com/stnolting/neorv32/tree/main/docs/icons)
Icons used by the data sheet and the user guide.
### [`references`](https://github.com/stnolting/neorv32/tree/main/docs/references)
Reference material like RISC-V and Wishbone specifications.
### [`sources`](https://github.com/stnolting/neorv32/tree/main/docs/sources)
Various sources for the images in `figures/`.
### [`userguide`](https://github.com/stnolting/neorv32/tree/main/docs/userguide)
AsciiDoc sources for the NEORV32 user guide. The online version of the user guide is
available at [https://stnolting.github.io/neorv32/ug](https://stnolting.github.io/neorv32/ug).

11
docs/attrs.adoc 100644
View File

@ -0,0 +1,11 @@
:keywords: neorv32, risc-v, riscv, rv32, fpga, soft-core, vhdl, microcontroller, cpu, soc, processor, gcc, openocd, gdb
:description: A size-optimized, customizable and highly extensible MCU-class 32-bit RISC-V soft-core CPU and microcontroller-like SoC written in platform-independent VHDL.
:revnumber: v1.9.5
:doctype: book
:sectnums:
:stem:
:reproducible:
:listing-caption: Listing
:toclevels: 3
:title-logo-image: neorv32_logo_riscv.png[pdfwidth=6.25in,align=center]
:favicon: img/icon.png

View File

@ -0,0 +1,4 @@
:icons: font
:imagesdir: ../figures
:toc: macro
:title-logo-image: image:neorv32_logo_riscv.png[pdfwidth=6.25in,align=center]

View File

@ -0,0 +1,14 @@
<<<
// ####################################################################################################################
include::overview.adoc[]
include::soc.adoc[]
include::cpu.adoc[]
include::software.adoc[]
include::on_chip_debugger.adoc[]
include::../legal.adoc[]

View File

@ -0,0 +1,978 @@
:sectnums:
== NEORV32 Central Processing Unit (CPU)
The NEORV32 CPU is an area-optimized RISC-V core implementing the `rv32i_zicsr_zifencei` base (privileged) ISA and
supporting several additional/optional ISA extensions. The CPU's microarchitecture is based on a von-Neumann
machine built upon a mixture of multi-cycle and pipelined execution schemes.
[NOTE]
This chapter assumes that the reader is familiar with the official
RISC-V _User_ and _Privileged Architecture_ specifications.
**Section Structure**
* <<_risc_v_compatibility>>
* <<_cpu_top_entity_signals>> and <<_cpu_top_entity_generics>>
* <<_architecture>> and <<_full_virtualization>>
* <<_instruction_sets_and_extensions>> and <<_custom_functions_unit_cfu>>
* <<_control_and_status_registers_csrs>>
* <<_traps_exceptions_and_interrupts>>
* <<_bus_interface>>
// ####################################################################################################################
:sectnums:
=== RISC-V Compatibility
The NEORV32 CPU passes the tests of the **official RISCOF RISC-V Architecture Test Framework**. This framework is used to check
RISC-V implementations for compatibility to the official RISC-V user/privileged ISA specifications. The NEORV32 port of this
test framework is available in a separate repository at GitHub: https://github.com/stnolting/neorv32-riscof
.Unsupported ISA Extensions
[TIP]
Executing instructions or accessing CSRs from yet unsupported ISA extensions will raise an illegal
instruction exception (see section <<_full_virtualization>>).
**Incompatibility Issues and Limitations**
.`time[h]` CSRs (Wall Clock Time)
[IMPORTANT]
The NEORV32 does not implement the `time[h]` registers. Any access to these registers will trap. It is
recommended that the trap handler software provides a means of accessing the platform-defined <<_machine_system_timer_mtime>>.
.No Hardware Support of Misaligned Memory Accesses
[IMPORTANT]
The CPU does not support resolving unaligned memory access by the hardware (this is not a
RISC-V-incompatibility issue but an important thing to know!). Any kind of unaligned memory access
will raise an exception to allow a _software-based_ emulation provided by the application. However, unaligned memory
access can be **emulated** using the NEORV32 runtime environment. See section <<_application_context_handling>>
for more information.
.No Atomic Read-Modify-Write Operations
[IMPORTANT]
The NEORV32 <<_a_isa_extension>> only supports the load-reservate (LR) and store-conditional (SC) instructions.
The remaining read-modify-write operations are not supported. However, these missing instructions can
be emulated. The NEORV32 <<_core_libraries>> provide an emulation wrapper for the missing AMO/read-modify-write
instructions that is based on LR/SC pairs. A demo/program can be found in `sw/example/atomic_test`.
<<<
// ####################################################################################################################
:sectnums:
=== CPU Top Entity - Signals
The following table shows all interface signals of the CPU top entity `rtl/core/neorv32_cpu.vhd`. The
type of all signals is _std_ulogic_ or _std_ulogic_vector_, respectively. The "Dir." column shows the signal
direction as seen from the CPU.
.NEORV32 CPU Signal List
[cols="<3,^3,^1,<5"]
[options="header", grid="rows"]
|=======================
| Signal | Width/Type | Dir | Description
4+^| **Global Signals**
| `clk_i` | 1 | in | Global clock line, all registers triggering on rising edge, this clock can be switched off during <<_sleep_mode>>
| `clk_aux_i` | 1 | in | Always-on clock, used to keep the sleep control active when `clk_i` is switched off
| `rstn_i` | 1 | in | Global reset, low-active
| `sleep_o` | 1 | out | CPU is in <<_sleep_mode>> when set
| `debug_o` | 1 | out | CPU is in <<_cpu_debug_mode,debug mode>> when set
4+^| **Interrupts (<<_traps_exceptions_and_interrupts>>)**
| `msi_i` | 1 | in | RISC-V machine software interrupt
| `mei_i` | 1 | in | RISC-V machine external interrupt
| `mti_i` | 1 | in | RISC-V machine timer interrupt
| `firq_i` | 16 | in | Custom fast interrupt request signals
| `dbi_i` | 1 | in | Request CPU to halt and enter debug mode (RISC-V <<_on_chip_debugger_ocd>>)
4+^| **Instruction <<_bus_interface>>**
| `ibus_req_o` | `bus_req_t` | out | Instruction fetch bus request
| `ibus_rsp_i` | `bus_rsp_t` | in | Instruction fetch bus response
4+^| **Data <<_bus_interface>>**
| `dbus_req_o` | `bus_req_t` | out | Data access (load/store) bus request
| `dbus_rsp_i` | `bus_rsp_t` | in | Data access (load/store) bus response
|=======================
.Bus Interface Protocol
[TIP]
See section <<_bus_interface>> for the instruction fetch and data access interface protocol and the
according interface types (`bus_req_t` and `bus_rsp_t`).
<<<
// ####################################################################################################################
:sectnums:
=== CPU Top Entity - Generics
Most of the CPU configuration generics are a subset of the actual Processor configuration generics
(see section <<_processor_top_entity_generics>>) and are not listed here. However, the CPU provides
some _specific_ generics that are used to configure the CPU for the NEORV32 processor setup. These generics
are assigned by the processor setup only and are not available for user defined configuration.
The specific generics are listed below.
.Table Abbreviations
[NOTE]
The generic type "suv(x:y)" defines a `std_ulogic_vector(x downto y)`.
.NEORV32 CPU-Exclusive Generic List
[cols="<4,^2,<8"]
[options="header",grid="rows"]
|=======================
| Name | Type | Description
| `CPU_BOOT_ADDR` | suv(31:0) | CPU reset address. See section <<_address_space>>.
| `CPU_DEBUG_PARK_ADDR` | suv(31:0) | "Park loop" entry address for the <<_on_chip_debugger_ocd>>, has to be 4-byte aligned.
| `CPU_DEBUG_EXC_ADDR` | suv(31:0) | "Exception" entry address for the <<_on_chip_debugger_ocd>>, has to be 4-byte aligned.
| `CPU_EXTENSION_RISCV_Sdext` | boolean | Implement RISC-V-compatible "debug" CPU operation mode required for the <<_on_chip_debugger_ocd>>.
| `CPU_EXTENSION_RISCV_Sdtrig` | boolean | Implement RISC-V-compatible trigger module. See section <<_on_chip_debugger_ocd>>.
|=======================
<<<
// ####################################################################################################################
:sectnums:
=== Architecture
image::neorv32_cpu.png[align=center]
The CPU implements a pipelined multi-cycle architecture: each instruction is executed as a series of consecutive
micro-operations. In order to increase performance, the CPU's front-end (instruction fetch) and back-end
(instruction execution) are decoupled via a FIFO (the instruction prefetch buffer). Thus, the front-end can already
fetch new instructions while the back-end is still processing the previously-fetched instructions.
Basically, the CPU's microarchitecture is somewhere between a classical pipelined architecture, where each stage
requires exactly one processing cycle (if not stalled) and a classical multi-cycle architecture, which executes
every single instruction (_including_ fetch) in a series of consecutive micro-operations. The combination of these
two design paradigms allows a higher instruction throughput than a pure multi-cycle approach (due to
overlapping operation of fetch and execute) at a reduced hardware footprint (due to the multi-cycle concept).
As a Von-Neumann machine, the CPU provides independent interfaces for instruction fetch and data access. However,
these two bus interfaces are merged into a single processor-internal bus via a prioritizing bus switch (data accesses
have higher priority). Hence, _all_ memory addresses including peripheral devices are mapped to a single unified 32-bit
<<_address_space>>.
[NOTE]
The CPU does not perform any speculative/out-of-order operations at all. Hence, it is not vulnerable to security issues
caused by speculative execution (like Spectre or Meltdown).
:sectnums:
==== CPU Register File
The data register file contains the general purpose architecture registers `x0` to `x31`. For the `rv32e` ISA only the lower
16 registers are implemented. Register zero (`x0`/`zero`) always reads as zero and any write access to it has no effect.
Up to four individual synchronous read ports allow fetching up to 4 register operands at once. The write and read accesses
are mutually exclusive as they happen in separate cycles. Hence, there is no need to consider things like "read-during-write"
behavior.
The register file provides two different implementation options configured via the top's `REGFILE_HW_RST` generic.
* `REGFILE_HW_RST = false` (default): In this configuration the register file is implemented as plain memory array without a
dedicated hardware reset. This architecture allows inferring FPGA block RAM for the entire register file resulting in minimal
logic utilization and optimal timing.
* `REGFILE_HW_RST = true`: This configuration is based on individual FFs that do provide a dedicated hardware reset.
Hence, the register file cannot be mapped to FPGA block RAM. This option should only be selected if the application requires a
reset of the register file (e.g. for security reasons) or if the design shall be synthesized for an **ASIC** implementation.
The state of this configuration generic can be checked by software via the <<_mxisa>> CSR.
.FPGA Implementation
[WARNING]
Enabling the `REGFILE_HW_RST` option for FPGA implementation is not recommended as this will massively increase the amount
of required logic resources.
.Implementation of the `zero` Register within FPGA Block RAM
[NOTE]
Register `zero` is also mapped to a _physical memory location_ within the register file's block RAM. By this, there is no need
to add a further multiplexer to "insert" zero if reading from register `zero` reducing logic requirements and shortening the
critical path. However, this also requires that the physical storage bits of register `zero` are explicitly initialized (set
to zero) by the hardware. This is done transparently by the CPU control requiring no additional processing overhead.
.Block RAM Ports
[NOTE]
The default register file configuration uses two access ports: a read-only port for reading register `rs2` (second source operand)
and a read/write port for reading register `rs1` (first source operand) and for writing processing results to register `rd`
(destination register). Hence, a simple dual-port RAM can be used to implement the entire register file. From a functional point
of view, read and write accesses to the register file never occur in the same clock cycle, so no bypass logic is required at all.
:sectnums:
==== CPU Arithmetic Logic Unit
The arithmetic/logic unit (ALU) is used for actual data processing as well as generating memory and branch addresses.
All "simple" <<_i_isa_extension>> computational instructions (like `add` and `or`) are implemented as plain combinatorial logic
requiring only a single cycle to complete. More sophisticated instructions like shift operations or multiplications are processed
by so-called "ALU co-processors".
The co-processors are implemented as iterative units that require several cycles to complete processing. Besides the base ISA's
shift instructions, the co-processors are used to implement all further processing-based ISA extensions (e.g. <<_m_isa_extension>>
and <<_b_isa_extension>>).
.Multi-Cycle Execution Monitor
[NOTE]
The CPU control will raise an illegal instruction exception if a multi-cycle functional unit (like the <<_custom_functions_unit_cfu>>)
does not complete processing in a bounded amount of time (configured via the package's `monitor_mc_tmo_c` constant; default = 512 clock cycles).
.Tuning Options
[TIP]
The ALU architecture can be tuned for an application-specific area-vs-performance trade-off. The `FAST_MUL_EN` and `FAST_SHIFT_EN`
generics can be used to implement performance-optimized DSP-block multipliers and barrel shifters, respectively. See sections <<_i_isa_extension>>,
<<_b_isa_extension>> and <<_m_isa_extension>> for specific examples.
:sectnums:
==== CPU Bus Unit
The bus unit takes care of handling data memory accesses via load and store instructions. It handles data adjustment when accessing
sub-word data quantities (16-bit or 8-bit) and performs sign-extension for signed load operations. The bus unit also includes the optional
<<_pmp_isa_extension>> that performs permission checks for all data and instruction accesses.
A list of the bus interface signals and a detailed description of the protocol can be found in section <<_bus_interface>>.
All bus interface signals are driven/buffered by registers; so even a complex SoC interconnection bus network will not
affect the maximum operating frequency.
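
The sub-word handling of the bus unit described above can be illustrated with a small behavioral model. This is a minimal Python sketch and not the actual VHDL implementation; the function names `byte_enable` and `load_extract` are hypothetical, and little-endian byte ordering within the 32-bit bus word is assumed.

```python
def byte_enable(addr: int, size: int) -> int:
    """Return the 4-bit byte-enable mask for an aligned access of `size` bytes."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    if addr % size != 0:
        # The real CPU raises an unaligned-address exception in this case.
        raise ValueError("unaligned access")
    return ((1 << size) - 1) << (addr & 0x3)


def load_extract(word: int, addr: int, size: int, signed: bool) -> int:
    """Pick the addressed sub-word out of a 32-bit bus word and extend it to 32 bits."""
    byte_enable(addr, size)  # reuse the size and alignment checks
    bits = 8 * size
    value = (word >> (8 * (addr & 0x3))) & ((1 << bits) - 1)
    if signed and value & (1 << (bits - 1)):
        # Sign-extension: replicate the sub-word's MSB into the upper bits.
        value |= (0xFFFFFFFF << bits) & 0xFFFFFFFF
    return value


if __name__ == "__main__":
    print(bin(byte_enable(0x1002, 2)))
    print(hex(load_extract(0x80FF1234, 0x3, 1, signed=True)))
```

Zero-extending loads (`lbu`, `lhu`) correspond to `signed=False` in this model.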
.Unaligned Accesses
[WARNING]
The CPU does not support a hardware-based handling of unaligned memory accesses! Any unaligned access will raise a bus load/store unaligned
address exception. The exception handler can be used to _emulate_ unaligned memory accesses in software.
See the NEORV32 Runtime Environment's <<_application_context_handling>> section for more information.
:sectnums:
==== CPU Control Unit
The CPU control unit is responsible for generating all the control signals for the different CPU modules.
The control unit is split into a "front-end" and a "back-end".
**Front-End**
The front-end is responsible for fetching instructions in chunks of 32 bits. This can be a single aligned 32-bit instruction,
two aligned 16-bit instructions or a mixture of those. The instructions including control and exception information are stored
to a FIFO queue - the instruction prefetch buffer (IPB). This FIFO has a depth of two entries by default but can be customized
via the `ipb_depth_c` VHDL package constant.
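
How a stream of 32-bit fetch chunks maps to 16-bit and 32-bit instructions can be sketched as follows. This is a simplified Python model, not the actual front-end logic; the function name `split_instructions` is hypothetical. It relies on the RISC-V encoding rule that an instruction whose two lowest bits are `11` is 32 bits wide, while any other value marks a compressed 16-bit instruction.

```python
def split_instructions(words):
    """Split a list of 32-bit fetch words into (width_in_bits, instruction) tuples."""
    # View the fetch stream as a sequence of little-endian 16-bit parcels.
    parcels = []
    for w in words:
        parcels.append(w & 0xFFFF)
        parcels.append((w >> 16) & 0xFFFF)

    out = []
    i = 0
    while i < len(parcels):
        low = parcels[i]
        if low & 0b11 != 0b11:
            # Compressed instruction: occupies a single parcel.
            out.append((16, low))
            i += 1
        elif i + 1 < len(parcels):
            # 32-bit instruction: may straddle two consecutive fetch words.
            out.append((32, (parcels[i + 1] << 16) | low))
            i += 2
        else:
            # Upper half has not been fetched yet.
            break
    return out


if __name__ == "__main__":
    # 0x00000013 is `addi x0, x0, 0` (nop), 0x0001 is `c.nop`.
    stream = [0x00000013, 0x00010001, 0x00130001, 0x00010000]
    for width, instr in split_instructions(stream):
        print(width, hex(instr))
```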
The FIFO allows the front-end to do "speculative" instruction fetches, as it keeps fetching the next consecutive instruction
all the time. This also decouples front-end (instruction fetch) and back-end (instruction execution) so both modules
can operate in parallel to increase performance. However, all potential side effects that are caused by this "speculative"
instruction fetch are already handled by the CPU front-end, ensuring a defined execution state while preventing
side-channel attacks.
**Back-End**
Instruction data from the instruction prefetch buffer is decompressed (if the `C` ISA extension is enabled) and sent to the
CPU back-end for actual execution. Execution is conducted by a state-machine that controls all of the CPU modules. The back-end also
includes the <<_control_and_status_registers_csrs>> as well as the trap controller.
==== Sleep Mode
The NEORV32 CPU provides a single sleep mode that can be entered to power-down the core reducing
dynamic power consumption. Sleep mode is entered by executing the `wfi` ("wait for interrupt") instruction.
[NOTE]
The `wfi` instruction will raise an illegal instruction exception when executed in user-mode
if `TW` in <<_mstatus>> is set. When executed in debug-mode or during single-stepping `wfi` will behave as
simple `nop` without entering sleep mode.
After executing the `wfi` instruction the CPU's `sleep_o` signal (<<_cpu_top_entity_signals>>) will become set
as soon as the CPU has fully halted ("CPU is sleeping"):
[start=1]
. The front-end (instruction fetch) is stopped. There is no pending instruction fetch bus access.
. The back-end (instruction execution) is stopped. There is no pending data bus access.
. There is no enabled interrupt pending.
CPU-external modules like memories, timers and peripheral interfaces are not affected by this. Furthermore, the CPU will
continue to buffer/enqueue incoming interrupts. The CPU will leave sleep mode as soon as any _enabled_ (via <<_mie>>)
interrupt source becomes _pending_ or if a debug session is started.
===== Power-Down Mode
Optionally, the sleep mode can also be used to shut down the CPU's main clock to further reduce power consumption
by halting the core's clock tree. This clock gating mode is enabled by the `CLOCK_GATING_EN` generic
(<<_processor_top_entity_generics>>). See section <<_processor_clocking>> for more information.
==== Full Virtualization
Just like the RISC-V ISA, the NEORV32 aims to provide _maximum virtualization_ capabilities on CPU and SoC level to
allow a high standard of **execution safety**. The CPU supports **all** traps specified by the official RISC-V
specifications. Thus, the CPU provides defined hardware fall-backs via traps for any expected and unexpected situations
(e.g. executing a malformed or not supported instruction or accessing a non-allocated memory address). For any kind
of trap the core is always in a defined and fully synchronized state throughout the whole system (i.e. there are no
out-of-order operations that might have to be reverted). This allows a defined and predictable execution behavior
at any time improving overall execution safety.
<<<
// ####################################################################################################################
:sectnums:
=== Bus Interface
The NEORV32 CPU provides separated instruction fetch and data access interfaces making it a **Harvard Architecture**:
the instruction fetch interface (`ibus_*` signals) is used for fetching instructions and the data access interface
(`dbus_*` signals) is used to access data via load and store operations. Each of these interfaces can access an address
space of up to 2^32^ bytes (4GB).
The bus interface uses two custom interface types: `bus_req_t` is used to propagate the bus access **requests**. These
signals are driven by the _accessing_ device (i.e. the CPU core). `bus_rsp_t` is used to return the bus **response** and
is driven by the _accessed_ device or bus system (i.e. a processor-internal memory or IO device).
.Bus Interface - Request Bus (`bus_req_t`)
[cols="^1,^1,<6"]
[options="header",grid="rows"]
|=======================
| Signal | Width | Description
| `addr` | 32 | Access address (byte addressing)
| `data` | 32 | Write data
| `ben` | 4 | Byte-enable for each byte in `data`
| `stb` | 1 | Request trigger ("strobe", single-shot)
| `rw` | 1 | Access direction (`0` = read, `1` = write)
| `src` | 1 | Access source (`0` = instruction fetch, `1` = load/store)
| `priv` | 1 | Set if privileged (M-mode) access
| `rvso` | 1 | Set if current access is a reservation-set operation (atomic `lr` or `sc` instruction)
| `fence` | 1 | Data/instruction fence operation; valid without `stb` being set
|=======================
.Bus Interface - Response Bus (`bus_rsp_t`)
[cols="^1,^1,<6"]
[options="header",grid="rows"]
|=======================
| Signal | Width | Description
| `data` | 32 | Read data (single-shot)
| `ack` | 1 | Transfer acknowledge / success (single-shot)
| `err` | 1 | Transfer error / fail (single-shot)
|=======================
:sectnums:
==== Bus Interface Protocol
Transactions are triggered entirely by the request bus. A new bus request is initiated by setting the _strobe_
signal `stb` high for exactly one cycle. All remaining signals of the bus are set together with `stb` and will
remain unchanged until the transaction is completed.
The transaction is completed when the accessed device returns a response via the response interface:
`ack` is high for exactly one cycle if the transaction was completed successfully. `err` is high for exactly
one cycle if the transaction failed to complete. These two signals are mutually exclusive. In case of a read
access the read data is returned together with the `ack` signal. Otherwise, the return data signal is
kept at all-zero allowing wired-or interconnection of all response buses.
The figure below shows three exemplary bus accesses:
[start=1]
. A read access to address `A_addr` returning `rdata` after several cycles (slow response; `ACK` arrives after several cycles).
. A write access to address `B_addr` writing `wdata` (fastest response; `ACK` arrives right in the next cycle).
. A failing read access to address `C_addr` (slow response; `ERR` arrives after several cycles).
.Three Exemplary Bus Transactions
image::bus_interface.png[700]
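The wired-or property of the response bus can be sketched in C as follows. This is a behavioral illustration only, not the actual interconnect RTL; the type mirrors the `bus_rsp_t` record and the function name `bus_rsp_merge` is an assumption made for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* C mirror of the response bus record (bus_rsp_t). */
typedef struct {
  uint32_t data; /* read data, all-zero while idle */
  bool ack;      /* transfer acknowledge (single-shot) */
  bool err;      /* transfer error (single-shot) */
} bus_rsp_t;

/* Merge the responses of n devices by OR-ing every field. This only works
 * because devices that are not responding drive all-zero outputs. */
bus_rsp_t bus_rsp_merge(const bus_rsp_t *rsp, size_t n) {
  bus_rsp_t out = {0u, false, false};
  for (size_t i = 0; i < n; i++) {
    out.data |= rsp[i].data;
    out.ack = out.ack || rsp[i].ack;
    out.err = out.err || rsp[i].err;
  }
  return out;
}
```

Because only the addressed device answers a request, the merged record equals the response of that single device.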
:sectnums:
==== Atomic Accesses
The load-reservate (`lr.w`) and store-conditional (`sc.w`) instructions from the <<_a_isa_extension>> execute as standard
load/store bus transactions but with the `rvso` ("reservation set operation") signal being set. It is the task of the
<<_reservation_set_controller>> to handle these LR/SC bus transactions accordingly.
.Reservation Set Controller
[NOTE]
See section <<_address_space>> / <<_reservation_set_controller>> for more information.
.Read-Modify-Write Operations
[IMPORTANT]
Read-modify-write operations (like an atomic swap / `amoswap.w`) are **not** supported. However, the NEORV32
<<_core_libraries>> provide an emulation wrapper for those unsupported instructions that is
based on LR/SC pairs. A demo/program can be found in `sw/example/atomic_test`.
The figure below shows three exemplary bus accesses (1 to 3 from left to right). The `req` signal record represents
the CPU-side of the bus interface. For easier understanding the current state of the reservation set is added as `rvs_valid` signal.
[start=1]
. A load-reservate (LR) instruction using `addr` as address. This instruction returns the loaded data `rdata` via `rsp.data`
and also registers a reservation for the address `addr` (`rvs_valid` becomes set).
. A store-conditional (SC) instruction attempts to write `wdata1` to address `addr`. This SC operation **succeeds**, so
`wdata1` is actually written to address `addr`. The successful operation is indicated by a **0** being returned via
`rsp.data` together with `ack`. As the LR/SC is completed the registered reservation is invalidated (`rvs_valid` becomes cleared).
. Another store-conditional (SC) instruction attempts to write `wdata2` to address `addr`. As the reservation set is already
invalidated (`rvs_valid` is `0`) the store access fails, so `wdata2` is **not** written to address `addr` at all. The failed
operation is indicated by a **1** being returned via `rsp.data` together with `ack`.
.Three Exemplary LR/SC Bus Transactions
image::bus_interface_atomic.png[700]
.SC Status
[NOTE]
The "normal" load data mechanism is used to return success/failure of the `sc.w` instruction to the CPU (via the LSB of `rsp.data`).
<<<
// ####################################################################################################################
:sectnums:
=== Instruction Sets and Extensions
The NEORV32 CPU provides several optional RISC-V and custom ISA extensions. The extensions can be enabled/configured
via the according <<_processor_top_entity_generics>>. This chapter gives a brief overview of the different ISA extensions.
.NEORV32 Instruction Set Extensions
[cols="<2,<5,<3"]
[options="header",grid="rows"]
|=======================
| Name | Description | <<_processor_top_entity_generics, Enabled by Generic>>
| <<_a_isa_extension,`A`>> | Atomic memory access instructions | `CPU_EXTENSION_RISCV_A`
| <<_b_isa_extension,`B`>> | Bit-manipulation instructions | `CPU_EXTENSION_RISCV_B`
| <<_c_isa_extension,`C`>> | Compressed (16-bit) instructions | `CPU_EXTENSION_RISCV_C`
| <<_e_isa_extension,`E`>> | Embedded CPU extension (reduced register file size) | `CPU_EXTENSION_RISCV_E`
| <<_i_isa_extension,`I`>> | Integer base ISA | Enabled if `CPU_EXTENSION_RISCV_E` is **not** enabled
| <<_m_isa_extension,`M`>> | Integer multiplication and division instructions | `CPU_EXTENSION_RISCV_M`
| <<_u_isa_extension,`U`>> | Less-privileged _user_ mode extension | `CPU_EXTENSION_RISCV_U`
| <<_x_isa_extension,`X`>> | Platform-specific / NEORV32-specific extension | Always enabled
| <<_zifencei_isa_extension,`Zifencei`>> | Instruction stream synchronization instruction | Always enabled
| <<_zfinx_isa_extension,`Zfinx`>> | Floating-point instructions using integer registers | `CPU_EXTENSION_RISCV_Zfinx`
| <<_zicntr_isa_extension,`Zicntr`>> | Base counters extension | `CPU_EXTENSION_RISCV_Zicntr`
| <<_zicond_isa_extension,`Zicond`>> | Integer conditional operations | `CPU_EXTENSION_RISCV_Zicond`
| <<_zicsr_isa_extension,`Zicsr`>> | Control and status register access instructions | Always enabled
| <<_zihpm_isa_extension,`Zihpm`>> | Hardware performance monitors extension | `CPU_EXTENSION_RISCV_Zihpm`
| <<_zmmul_isa_extension,`Zmmul`>> | Integer multiplication-only instructions | `CPU_EXTENSION_RISCV_Zmmul`
| <<_zxcfu_isa_extension,`Zxcfu`>> | Custom / user-defined instructions | `CPU_EXTENSION_RISCV_Zxcfu`
| <<_pmp_isa_extension,`PMP`>> | Physical memory protection extension | `PMP_NUM_REGIONS`
| <<_sdext_isa_extension,`Sdext`>> | External debug support extension | `ON_CHIP_DEBUGGER_EN`
| <<_sdtrig_isa_extension,`Sdtrig`>> | Trigger module extension | `ON_CHIP_DEBUGGER_EN`
|=======================
.RISC-V ISA Specifications
[TIP]
For more information regarding the RISC-V ISA extensions please refer to "The RISC-V Instruction Set Manual, Volume
I: Unprivileged ISA" and "The RISC-V Instruction Set Manual, Volume II: Privileged Architecture". A copy of the
specifications of all currently implemented ISA extensions can be found in the project's `docs/references` folder.
.Discovering ISA Extensions
[TIP]
Software can discover available ISA extensions via the <<_misa>> and <<_mxisa>> CSRs or by executing an instruction
and checking for an illegal instruction exception (i.e. <<_full_virtualization>>).
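For the single-letter extensions, the RISC-V privileged specification assigns one `misa` bit per letter (bit 0 = `A` up to bit 25 = `Z`). The following C sketch shows the corresponding check. The function name `misa_has_ext` is hypothetical, and the `misa` value is passed in as a plain argument here; on the target it would be obtained via a CSR read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the single-letter extension ext ('A'..'Z') is flagged in
 * the given misa value. Bit 0 corresponds to 'A', bit 25 to 'Z'. */
bool misa_has_ext(uint32_t misa, char ext) {
  if (ext < 'A' || ext > 'Z') {
    return false;
  }
  return ((misa >> (ext - 'A')) & 1u) != 0u;
}
```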
.ISA Extensions-Specific CSRs
[NOTE]
The <<_control_and_status_registers_csrs>> section lists the according ISA extensions for all CSRs.
==== `A` ISA Extension
The `A` ISA extension adds instructions and mechanisms for atomic memory access operations. Note that the NEORV32 `A`
only includes the _load-reservate_ (`lr.w`) and _store-conditional_ (`sc.w`) instructions - the remaining read-modify-write
instructions (like `amoswap`) are **not supported**. However, these missing instructions can be emulated using the
LR and SC operations.
.AMO Emulation
[NOTE]
The NEORV32 <<_core_libraries>> provide an emulation wrapper for the missing AMO/read-modify-write instructions that is
based on LR/SC pairs. A demo/program can be found in `sw/example/atomic_test`.
Atomic instructions allow an application to detect whether a certain memory location has been altered by another agent
(like another process running on the same CPU or a DMA access). Hence, they can be used to implement synchronization
mechanisms like mutexes and semaphores.
The NEORV32 `A` extension is enabled via the `CPU_EXTENSION_RISCV_A` generic (see <<_processor_top_entity_generics>>).
When enabled the following additional instructions are available.
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| Load-reservate word | `lr.w` | 5
| Store-conditional word | `sc.w` | 5
|=======================
The `lr.w` instruction loads one word from a word-aligned address and registers a _reservation set_. The `sc.w`
instruction stores a word to a word-aligned address only if the reservation set is still valid. Furthermore, the
`sc.w` operation returns the state of the reservation set (0 = reservation set still valid, data has been written;
1 = reservation set was broken, no data has been written). The reservation set is invalidated if another `lr.w` instruction
is executed or if any write access to the _reservated_ address takes place. Traps and/or CPU privilege level changes
do not modify current reservation sets.
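The LR/SC semantics described above, and the way a missing read-modify-write instruction can be emulated with an LR/SC retry loop, can be sketched as a small software model. This is a host-side illustration with a single guarded word; all names (`rvs_model_t`, `model_lr_w`, `model_sc_w`, `model_store`, `model_amoswap_w`) are hypothetical and it does not reproduce the emulation wrapper of the NEORV32 core libraries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of a single reservation set guarding one memory word. */
typedef struct {
  uint32_t mem;   /* the guarded memory word */
  bool rvs_valid; /* true while a reservation is registered */
} rvs_model_t;

/* lr.w: load the word and register a reservation. */
uint32_t model_lr_w(rvs_model_t *m) {
  m->rvs_valid = true;
  return m->mem;
}

/* An ordinary store to the reserved address breaks the reservation. */
void model_store(rvs_model_t *m, uint32_t wdata) {
  m->mem = wdata;
  m->rvs_valid = false;
}

/* sc.w: write only if the reservation is still valid. Returns 0 on
 * success and 1 on failure; no reservation remains afterwards. */
uint32_t model_sc_w(rvs_model_t *m, uint32_t wdata) {
  bool ok = m->rvs_valid;
  m->rvs_valid = false;
  if (ok) {
    m->mem = wdata;
  }
  return ok ? 0u : 1u;
}

/* Emulated atomic swap: repeat the LR/SC pair until the SC succeeds and
 * return the value that was replaced. */
uint32_t model_amoswap_w(rvs_model_t *m, uint32_t wdata) {
  uint32_t old;
  do {
    old = model_lr_w(m);
  } while (model_sc_w(m, wdata) != 0u);
  return old;
}
```

In this single-threaded model the loop always succeeds on the first pass; on real hardware the retry covers the case where another bus access breaks the reservation between the LR and the SC.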
.`aq` and `rl` Bits
[NOTE]
The instruction word's `aq` and `rl` memory ordering bits are not evaluated by the hardware at all.
.Atomic Memory Access on Hardware Level
[NOTE]
More information regarding the atomic memory accesses and the according reservation
sets can be found in section <<_reservation_set_controller>>.
.Cache Coherency
[IMPORTANT]
Atomic operations **always bypass** the CPU caches using direct/uncached accesses. Care must be taken
to maintain data cache coherency (e.g. by using the `fence` instruction).
==== `B` ISA Extension
The `B` ISA extension adds instructions for bit-manipulation operations.
This ISA extension is implemented as a multi-cycle ALU co-processor (`rtl/core/neorv32_cpu_cp_bitmanip.vhd`).
The NEORV32 `B` ISA extension includes the following sub-extensions:
* `Zba` - Address-generation instructions
* `Zbb` - Basic bit-manipulation instructions
* `Zbc` - Carry-less multiplication instructions
* `Zbs` - Single-bit instructions
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| Arithmetic/logic | `min[u]` `max[u]` `sext.b` `sext.h` `andn` `orn` `xnor` `zext.h`(pack) `rev8`(grevi) `orc.b`(gorci) | 4
| Shifts | `clz` `ctz` | 3 + 1..32; FAST_SHIFT: 4
| Shifts | `cpop` | 36; FAST_SHIFT: 4
| Shifts | `rol` `ror[i]` | 4 + _shift_amount_; FAST_SHIFT: 4
| Shifted-add | `sh1add` `sh2add` `sh3add` | 4
| Single-bit | `bset[i]` `bclr[i]` `binv[i]` `bext[i]` | 4
| Carry-less multiply | `clmul` `clmulh` `clmulr` | 36
|=======================
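For reference, the functional behavior of a few of these instructions can be written down in plain C. The sketch below follows the instruction definitions of the RISC-V bit-manipulation specification for RV32; the `ref_*` function names are hypothetical and the code says nothing about how the NEORV32 co-processor computes the results internally.

```c
#include <stdint.h>

/* clmul: lower 32 bits of the carry-less product of a and b. */
uint32_t ref_clmul(uint32_t a, uint32_t b) {
  uint32_t x = 0u;
  for (int i = 0; i < 32; i++) {
    if ((b >> i) & 1u) {
      x ^= a << i;
    }
  }
  return x;
}

/* clmulh: upper 32 bits of the carry-less product of a and b. */
uint32_t ref_clmulh(uint32_t a, uint32_t b) {
  uint32_t x = 0u;
  for (int i = 1; i < 32; i++) {
    if ((b >> i) & 1u) {
      x ^= a >> (32 - i);
    }
  }
  return x;
}

/* cpop: number of set bits in a. */
uint32_t ref_cpop(uint32_t a) {
  uint32_t n = 0u;
  for (; a != 0u; a >>= 1) {
    n += a & 1u;
  }
  return n;
}

/* sh2add: rs2 plus rs1 shifted left by two. */
uint32_t ref_sh2add(uint32_t rs1, uint32_t rs2) {
  return rs2 + (rs1 << 2);
}
```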
.Barrel Shifter
[TIP]
Shift operations can be accelerated (at the cost of additional logic resources) by enabling the `FAST_SHIFT_EN`
configuration option that will replace the (time-variant) bit-serial shifter by a (time-constant) barrel shifter.
==== `C` ISA Extension
The "compressed" ISA extension provides 16-bit encodings of commonly used instructions to reduce code space size.
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| ALU | `c.addi4spn` `c.nop` `c.add[i]` `c.li` `c.addi16sp` `c.lui` `c.and[i]` `c.sub` `c.xor` `c.or` `c.mv` | 2
| ALU | `c.srli` `c.srai` `c.slli` | 3 + 1..32; FAST_SHIFT: 4
| Branches | `c.beqz` `c.bnez` | taken: 6; not taken: 3
| Jumps / calls | `c.jal[r]` `c.j` `c.jr` | 6
| Memory access | `c.lw` `c.sw` `c.lwsp` `c.swsp` | 4
| System | `c.ebreak` | 3
|=======================
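In the RISC-V encoding scheme, a 16-bit (compressed) instruction is recognized by its two least-significant bits being different from `11`. A minimal C sketch of this check follows; the function name `is_compressed` is chosen for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the 16-bit parcel starts a compressed instruction,
 * i.e. if its two least-significant bits are not 0b11. */
bool is_compressed(uint16_t parcel) {
  return (parcel & 0x3u) != 0x3u;
}
```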
==== `E` ISA Extension
The "embedded" ISA extension reduces the size of the general purpose register file from 32 entries to 16 entries to
shrink hardware size. It provides the same instructions as the base `I` ISA extension.
[NOTE]
Due to the reduced register file size an alternate toolchain ABI (`ilp32e*`) is required.
==== `I` ISA Extension
The `I` ISA extension is the base RISC-V integer ISA that is always enabled.
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| ALU | `add[i]` `slt[i]` `slt[i]u` `xor[i]` `or[i]` `and[i]` `sub` `lui` `auipc` | 2
| ALU shifts | `sll[i]` `srl[i]` `sra[i]` | 3 + 1..32; FAST_SHIFT: 4
| Branches | `beq` `bne` `blt` `bge` `bltu` `bgeu` | taken: 6; not taken: 3
| Jump/call | `jal[r]` | 6
| Load/store | `lb` `lh` `lw` `lbu` `lhu` `sb` `sh` `sw` | 5
| System | `ecall` `ebreak` | 3
| Data fence | `fence` | 5
| System | `wfi` | 3
| System | `mret` | 5
| Illegal inst. | - | 3
|=======================
.`fence` Instruction
[NOTE]
The `fence` instruction word's _predecessor_ and _successor_ bits (used for memory ordering) are not evaluated
by the hardware at all. For the NEORV32 the `fence` instruction behaves exactly like the `fence.i` instruction
(see <<_zifencei_isa_extension>>). However, software should still use distinct `fence` and `fence.i` to provide
platform-compatibility and to indicate the actual intention of the according fence instruction(s).
.`wfi` Instruction
[NOTE]
The `wfi` instruction is used to enter <<_sleep_mode>>. Executing the `wfi` instruction in user-mode
will raise an illegal instruction exception if the `TW` bit of <<_mstatus>> is set.
.Barrel Shifter
[TIP]
The shift operations are implemented as a multi-cycle ALU co-processor (`rtl/core/neorv32_cpu_cp_shifter.vhd`).
These operations can be accelerated (at the cost of additional logic resources) by enabling the `FAST_SHIFT_EN`
configuration option that will replace the (time-variant) bit-serial shifter by a (time-constant) barrel shifter.
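The difference between the two shifter styles can be illustrated with a short C model. It is a conceptual sketch only (not cycle-accurate and not derived from the RTL): the serial variant needs one loop step per bit position, while the barrel variant produces the result in a single step. The function names are hypothetical.

```c
#include <stdint.h>

/* Bit-serial logical left shift: moves the operand by one position per
 * iteration. *iterations receives the number of loop steps, which grows
 * with the shift amount (time-variant behavior). */
uint32_t serial_sll(uint32_t value, unsigned shamt, unsigned *iterations) {
  unsigned n = 0u;
  for (shamt &= 31u; shamt > 0u; shamt--) {
    value <<= 1;
    n++;
  }
  *iterations = n;
  return value;
}

/* Barrel-style shift: the result is produced in one step regardless of
 * the shift amount (time-constant behavior). */
uint32_t barrel_sll(uint32_t value, unsigned shamt) {
  return value << (shamt & 31u);
}
```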
==== `M` ISA Extension
Hardware-accelerated integer multiplication and division operations are available via the RISC-V `M` ISA extension.
This ISA extension is implemented as a multi-cycle ALU co-processor (`rtl/core/neorv32_cpu_cp_muldiv.vhd`).
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| Multiplication | `mul` `mulh` `mulhsu` `mulhu` | 36; FAST_MUL: 4
| Division | `div` `divu` `rem` `remu` | 36
|=======================
.DSP Blocks
[TIP]
Multiplication operations can be accelerated (at the cost of additional logic resources) by enabling the `FAST_MUL_EN`
configuration option that will replace the (time-variant) bit-serial multiplier by (time-constant) FPGA DSP blocks.
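The results of the `M` instructions are defined by the RISC-V specification, including the corner cases of the division instructions (division by zero and signed overflow do not trap). The following C sketch models these semantics for RV32; the `ref_*` names are hypothetical, and the conversions between `uint32_t` and `int32_t` assume the usual two's-complement behavior.

```c
#include <stdint.h>

/* mulh: upper 32 bits of the signed x signed product. */
uint32_t ref_mulh(uint32_t rs1, uint32_t rs2) {
  int64_t p = (int64_t)(int32_t)rs1 * (int64_t)(int32_t)rs2;
  return (uint32_t)((uint64_t)p >> 32);
}

/* mulhsu: upper 32 bits of the signed x unsigned product. */
uint32_t ref_mulhsu(uint32_t rs1, uint32_t rs2) {
  int64_t p = (int64_t)(int32_t)rs1 * (int64_t)rs2;
  return (uint32_t)((uint64_t)p >> 32);
}

/* mulhu: upper 32 bits of the unsigned x unsigned product. */
uint32_t ref_mulhu(uint32_t rs1, uint32_t rs2) {
  return (uint32_t)(((uint64_t)rs1 * (uint64_t)rs2) >> 32);
}

/* div: signed division; x/0 yields all-ones, INT32_MIN / -1 yields INT32_MIN. */
uint32_t ref_div(uint32_t rs1, uint32_t rs2) {
  int32_t a = (int32_t)rs1;
  int32_t b = (int32_t)rs2;
  if (b == 0) return 0xFFFFFFFFu;
  if (a == INT32_MIN && b == -1) return rs1;
  return (uint32_t)(a / b);
}

/* rem: signed remainder; x%0 yields x, INT32_MIN % -1 yields 0. */
uint32_t ref_rem(uint32_t rs1, uint32_t rs2) {
  int32_t a = (int32_t)rs1;
  int32_t b = (int32_t)rs2;
  if (b == 0) return rs1;
  if (a == INT32_MIN && b == -1) return 0u;
  return (uint32_t)(a % b);
}
```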
==== `U` ISA Extension
In addition to the highest-privileged machine-mode, the user-mode ISA extension adds a second **less-privileged**
operation mode. Code executed in user-mode has reduced CSR access rights. Furthermore, user-mode accesses to the address space
(like peripheral/IO devices) can be constrained via the physical memory protection.
Any kind of privilege rights violation will raise an exception to allow <<_full_virtualization>>.
==== `X` ISA Extension
The NEORV32-specific ISA extension `X` is always enabled. The most important points of this extension are:
* The CPU provides 16 _fast interrupt request_ lines (`FIRQ`), which are controlled via custom bits in the <<_mie>>
and <<_mip>> CSRs. These extensions are mapped to CSR bits that are available for custom use according to the
RISC-V specs. Also, custom trap codes for <<_mcause>> are implemented.
* All undefined/unimplemented/malformed/illegal instructions do raise an illegal instruction exception (see <<_full_virtualization>>).
* There are <<_neorv32_specific_csrs>>.
==== `Zifencei` ISA Extension
The `Zifencei` CPU extension allows manual synchronization of the instruction stream. This extension is always enabled.
.NEORV32 Fence Instructions
[NOTE]
The NEORV32 treats both fence instructions (`fence` = data fence, `fence.i` = instruction fence) in exactly the same way.
Both instructions cause a flush of the CPU's instruction prefetch buffer and also send a fence request via the system
bus (see <<_bus_interface>>). This system bus fence operation will, for example, clear/flush all downstream caches.
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| Instruction fence | `fence.i` | 5
|=======================
==== `Zfinx` ISA Extension
The `Zfinx` floating-point extension is an _alternative_ to the standard `F` floating-point ISA extension.
It uses the integer register file `x` to store and operate on floating-point data
instead of a dedicated floating-point register file. Thus, the `Zfinx` extension requires
fewer hardware resources and features faster context changes. This also implies that there are NO dedicated `f`
register file-related load/store or move instructions. The `Zfinx` extension's floating-point unit is controlled
via dedicated <<_floating_point_csrs>>.
This ISA extension is implemented as a multi-cycle ALU co-processor (`rtl/core/neorv32_cpu_cp_fpu.vhd`).
.Fused Multiply-Add and Division Instructions
[WARNING]
Fused multiply-add instructions `f[n]m[add/sub].s` are not supported!
Division `fdiv.s` and square root `fsqrt.s` instructions are not supported yet!
.Subnormal Number
[WARNING]
Subnormal numbers ("de-normalized" numbers, i.e. exponent = 0) are not supported by the NEORV32 FPU.
Subnormal numbers are _flushed to zero_ setting them to +/- 0 before being processed by **any** FPU operation.
If a computational instruction generates a subnormal result it is also flushed to zero during normalization.
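The flush-to-zero behavior can be sketched on the raw IEEE 754 single-precision bit pattern. A value is subnormal if its exponent field is zero while its mantissa is non-zero; flushing keeps only the sign bit. The function name `ftz_f32` is hypothetical and the code is a behavioral illustration, not an extract of the FPU.

```c
#include <stdint.h>

/* Flush a subnormal single-precision bit pattern to zero while keeping
 * its sign. All other bit patterns are returned unchanged. */
uint32_t ftz_f32(uint32_t bits) {
  uint32_t exponent = (bits >> 23) & 0xFFu;
  uint32_t mantissa = bits & 0x007FFFFFu;
  if (exponent == 0u && mantissa != 0u) {
    return bits & 0x80000000u; /* +0 or -0 */
  }
  return bits;
}
```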
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| Arithmetic | `fadd.s` | 110
| Arithmetic | `fsub.s` | 112
| Arithmetic | `fmul.s` | 22
| Compare | `fmin.s` `fmax.s` `feq.s` `flt.s` `fle.s` | 13
| Conversion | `fcvt.w.s` `fcvt.wu.s` `fcvt.s.w` `fcvt.s.wu` | 48
| Misc | `fsgnj.s` `fsgnjn.s` `fsgnjx.s` `fclass.s` | 12
|=======================
==== `Zicntr` ISA Extension
The `Zicntr` ISA extension adds the basic <<_cycleh>>, <<_mcycleh>>, <<_instreth>> and <<_minstreth>>
counter CSRs. Section <<_machine_counter_and_timer_csrs>> shows a list of all `Zicntr`-related CSRs.
[NOTE]
The user-mode `time[h]` CSRs are **not implemented**. Any access will trap allowing the trap handler to
retrieve system time from the <<_machine_system_timer_mtime>>.
[NOTE]
This extension is stated as _mandatory_ by the RISC-V spec. However, area-constrained setups may remove
support for these counters.
==== `Zicond` ISA Extension
The `Zicond` ISA extension adds integer conditional move primitives that allow implementing branch-less
control flows. It is enabled by the top's `CPU_EXTENSION_RISCV_Zicond` generic.
This ISA extension is implemented as a multi-cycle ALU co-processor (`rtl/core/neorv32_cpu_cp_cond.vhd`).
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| Conditional | `czero.eqz` `czero.nez` | 3
|=======================
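The semantics of the two instructions, as defined by the RISC-V `Zicond` specification, and the way they combine into a branch-less select can be modeled in C as shown below. The function names are hypothetical; on the target, a compiler with `Zicond` support would emit the instructions directly.

```c
#include <stdint.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1 */
uint32_t czero_eqz(uint32_t rs1, uint32_t rs2) {
  return (rs2 == 0u) ? 0u : rs1;
}

/* czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1 */
uint32_t czero_nez(uint32_t rs1, uint32_t rs2) {
  return (rs2 != 0u) ? 0u : rs1;
}

/* Branch-less equivalent of "cond ? a : b": exactly one of the two
 * operands survives, so OR-ing both results yields the selected value. */
uint32_t select_branchless(uint32_t cond, uint32_t a, uint32_t b) {
  return czero_eqz(a, cond) | czero_nez(b, cond);
}
```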
==== `Zicsr` ISA Extension
This ISA extension provides instructions for accessing the <<_control_and_status_registers_csrs>> as well as further
privileged-architecture extensions. This extension is mandatory and cannot be disabled. Hence, there is no generic
for enabling/disabling this ISA extension.
[NOTE]
If `rd=x0` for the `csrrw[i]` instructions there will be no actual read access to the according CSR.
However, access privileges are still enforced so these instruction variants _do_ cause side-effects
(the RISC-V spec. states that these combinations "shall" not cause any side-effects).
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| System | `csrrw[i]` `csrrs[i]` `csrrc[i]` | 3
|=======================
==== `Zihpm` ISA Extension
In addition to the base counters the NEORV32 CPU provides up to 13 hardware performance monitors (HPM 3..15),
which can be used to benchmark applications. Each HPM consists of an N-bit wide counter (split in a high-word 32-bit
CSR and a low-word 32-bit CSR), where N is defined via the top's
`HPM_CNT_WIDTH` generic and a corresponding event configuration CSR. The event configuration
CSR defines the architectural events that lead to an increment of the associated HPM counter. See section
<<_hardware_performance_monitors_hpm_csrs>> for a list of all HPM-related CSRs and event configurations.
[TIP]
Auto-increment of the HPMs can be deactivated individually via the <<_mcountinhibit>> CSR.
==== `Zmmul` ISA Extension
This is a sub-extension of the <<_m_isa_extension>>. It implements only the multiplication operations
of the `M` extension and is intended for size-constrained setups that require hardware-based
integer multiplications but not hardware-based divisions, which will be computed entirely in software.
==== `Zxcfu` ISA Extension
The `Zxcfu` presents a NEORV32-specific ISA extension. It adds the <<_custom_functions_unit_cfu>> to
the CPU core, which allows adding custom RISC-V instructions to the processor core.
For detailed information regarding the CFU, its hardware and the according software interface
see section <<_custom_functions_unit_cfu>>.
Software can utilize the custom instructions by using _intrinsics_, which are basically inline assembly functions that
behave like regular C functions but that evaluate to a single custom instruction word (no calling overhead at all).
==== `PMP` ISA Extension
The NEORV32 physical memory protection (PMP, also known as `Smpmp` ISA extension) provides an elementary memory
protection mechanism that can be used to constrain read, write and execute rights of arbitrary memory regions.
The NEORV32 PMP is fully compatible with the RISC-V Privileged Architecture Specifications. In general, the PMP can
**grant permissions to user mode**, which by default has none, and can **revoke permissions from M-mode**, which
by default has full permissions. The PMP is configured via the <<_machine_physical_memory_protection_csrs>>.
Several <<_processor_top_entity_generics>> are provided to fine-tune the CPU's PMP capabilities:
* `PMP_NUM_REGIONS` defines the number of implemented PMP regions
* `PMP_MIN_GRANULARITY` defines the minimal granularity of each region
* `PMP_TOR_MODE_EN` controls the implementation of the top-of-region (TOR) mode
* `PMP_NAP_MODE_EN` controls the implementation of the naturally-aligned-power-of-two (NA4 and NAPOT) modes
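For NAPOT regions, the RISC-V privileged specification encodes base address and size together in a single `pmpaddr` value: the register holds address bits 33:2, and the number of trailing one-bits selects the region size. The following C sketch computes this value for a 32-bit address space. The function name `pmp_napot_encode` is hypothetical, and the sketch ignores any additional constraint imposed by `PMP_MIN_GRANULARITY`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes the pmpaddr value for a NAPOT region of the given size (bytes)
 * starting at base. The size must be a power of two of at least 8 bytes
 * and the base must be aligned to the size. Returns false otherwise. */
bool pmp_napot_encode(uint32_t base, uint32_t size, uint32_t *pmpaddr) {
  bool pow2 = (size != 0u) && ((size & (size - 1u)) == 0u);
  if (!pow2 || size < 8u || (base & (size - 1u)) != 0u) {
    return false;
  }
  *pmpaddr = (base >> 2) | ((size >> 3) - 1u);
  return true;
}
```

For example, a 4 KiB region at `0x80000000` results in a `pmpaddr` value of `0x200001ff`.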
.PMP Rules when in Debug Mode
[NOTE]
When in debug-mode all PMP rules are ignored making the debugger have maximum access rights.
[IMPORTANT]
Instruction fetches are also triggered when denied by a certain PMP rule. However, the fetched instruction(s)
will not be executed and will not change CPU core state.
==== `Sdext` ISA Extension
This ISA extension enables the RISC-V-compatible "external debug support" by implementing
the CPU "debug mode", which is required for the on-chip debugger.
See section <<_on_chip_debugger_ocd>> / <<_cpu_debug_mode>> for more information.
.Instructions and Timing
[cols="<2,<4,<3"]
[options="header", grid="rows"]
|=======================
| Class | Instructions | Execution cycles
| System | `dret` | 5
|=======================
==== `Sdtrig` ISA Extension
This ISA extension implements the RISC-V-compatible "trigger module".
See section <<_on_chip_debugger_ocd>> / <<_trigger_module>> for more information.
<<<
// ####################################################################################################################
include::cpu_cfu.adoc[]
<<<
// ####################################################################################################################
include::cpu_csr.adoc[]
<<<
// ####################################################################################################################
:sectnums:
==== Traps, Exceptions and Interrupts
In this document the following terminology is used (derived from the RISC-V trace specification
available at https://github.com/riscv-non-isa/riscv-trace-spec):
* **exception**: an unusual condition occurring at run time associated (i.e. _synchronous_) with an instruction in a RISC-V hart
* **interrupt**: an external _asynchronous_ event that may cause a RISC-V hart to experience an unexpected transfer of control
* **trap**: the transfer of control to a trap handler caused by either an _exception_ or an _interrupt_
Whenever an exception or interrupt is triggered, the CPU switches to machine-mode (if not already in machine-mode)
and continues operation at the address being stored in the <<_mtvec>> CSR. The cause of the trap can be determined via the
<<_mcause>> CSR. A list of all implemented `mcause` values and the according description can be found below in section
<<_neorv32_trap_listing>>. The address that reflects the current program counter when a trap was taken is stored to the
<<_mepc>> CSR. Additional information regarding the cause of the trap can be retrieved from the <<_mtval>> and <<_mtinst>> CSRs.
The traps are prioritized. If several _exceptions_ occur at once only the one with highest priority is triggered
while all remaining exceptions are ignored and discarded. If several _interrupts_ trigger at once, the one with highest priority
is serviced first while the remaining ones stay _pending_. After completing the interrupt handler the interrupt with
the second highest priority will get serviced and so on until no further interrupts are pending.
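The interrupt arbitration can be sketched as a priority scan over the pending interrupts, using the priority order from <<_neorv32_trap_listing>> (FIRQ 0 to 15 first, followed by MEI, MSI and MTI). The sketch assumes that each pending bit sits at the position given by the interrupt's trap code (FIRQ _i_ at bit 16+_i_, MEI at bit 11, MSI at bit 3, MTI at bit 7); the function name `next_interrupt` is hypothetical and the code is a model, not the CPU's actual trap controller.

```c
#include <stdint.h>

/* Returns the mcause value of the highest-priority pending interrupt,
 * or 0 if nothing is pending. Bit positions in pending are assumed to
 * match the interrupt trap codes. */
uint32_t next_interrupt(uint32_t pending) {
  /* FIRQ channels 0..15 have the highest interrupt priority */
  for (unsigned ch = 0u; ch < 16u; ch++) {
    if (pending & (1u << (16u + ch))) {
      return 0x80000000u | (16u + ch);
    }
  }
  if (pending & (1u << 11)) return 0x8000000Bu; /* MEI */
  if (pending & (1u << 3))  return 0x80000003u; /* MSI */
  if (pending & (1u << 7))  return 0x80000007u; /* MTI */
  return 0u;
}
```

Calling the function again after clearing the serviced bit yields the interrupt with the next-highest priority, which mirrors the "stay pending" behavior described above.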
.Interrupts when in User-Mode
[IMPORTANT]
If the core is currently operating in less privileged user-mode, interrupts are globally enabled
even if <<_mstatus>>.mie is cleared.
.Interrupt Signal Requirements - Standard RISC-V Interrupts
[IMPORTANT]
All standard RISC-V interrupt request signals are **high-active**. A request has to stay at high-level
until it is explicitly acknowledged by the CPU software (for example by writing to a specific memory-mapped register).
.Interrupt Signal Requirements - NEORV32-Specific Fast Interrupt Requests
[IMPORTANT]
The NEORV32-specific FIRQ request lines are triggered (= becoming pending) by a one-shot high-level.
.Instruction Atomicity
[NOTE]
All instructions execute as atomic operations - interrupts can only trigger _between_ consecutive instructions.
Even if there is a permanent interrupt request, exactly one instruction from the interrupted program will be executed before
another interrupt handler can start. This allows program progress even if there are permanent interrupt requests.
:sectnums:
===== Memory Access Exceptions
If a load operation causes any exception, the instruction's destination register is **not written** at all. Furthermore,
exceptions caused by a misaligned memory address or a physical memory protection fault do not trigger a memory access request at all.
For 32-bit-only instructions (= no `C` extension) the misaligned instruction exception is raised if bit 1 of the fetch
address is set (i.e. not on a 32-bit boundary). If the `C` extension is implemented there will **never** be a misaligned
instruction exception at all.
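The alignment rules can be expressed as two small predicates. This is a minimal sketch with hypothetical function names; `size` is the access size in bytes (1, 2 or 4) and `c_ext` states whether the `C` extension is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Data access: misaligned if the address is not a multiple of the access
 * size. Byte accesses (size = 1) can never be misaligned. */
bool data_misaligned(uint32_t addr, unsigned size) {
  return (addr & (size - 1u)) != 0u;
}

/* Instruction fetch: without the C extension an address with bit 1 set is
 * misaligned; with the C extension the exception never occurs. */
bool ifetch_misaligned(uint32_t addr, bool c_ext) {
  return !c_ext && ((addr & 2u) != 0u);
}
```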
:sectnums:
===== Custom Fast Interrupt Request Lines
As a custom extension, the NEORV32 CPU features 16 fast interrupt request (FIRQ) lines via the `firq_i` CPU top
entity signals. These interrupts have custom configuration and status flags in the <<_mie>> and <<_mip>> CSRs and also
provide custom trap codes in <<_mcause>>. These FIRQs are reserved for NEORV32 processor-internal usage only.
:sectnums:
===== NEORV32 Trap Listing
The following tables show all traps that are currently supported by the NEORV32 CPU. It also shows the prioritization
and the CSR side-effects.
**Table Annotations**
The "Prio." column shows the priority of each trap with the highest priority being 1. The "RTE Trap ID" aliases are
defined by the NEORV32 core library (the runtime environment _RTE_) and can be used in plain C code when interacting
with the pre-defined RTE function. The <<_mcause>>, <<_mepc>>, <<_mtval>> and <<_mtinst>> columns show the value being
written to the according CSRs when a trap is triggered:
* **I-PC** - address of intercepted instruction (instruction has _not_ been executed yet)
* **PC** - address of instruction that caused the trap (instruction has been executed)
* **ADR** - bad data memory access address that caused the trap
* **INS** - the transformed/decompressed instruction word that caused the trap
* **0** - zero
.NEORV32 Trap Listing
[cols="1,4,8,10,2,2,2"]
[options="header",grid="rows"]
|=======================
| Prio. | `mcause` | RTE Trap ID | Cause | `mepc` | `mtval` | `mtinst`
7+^| **Exceptions** (_synchronous_ to instruction execution)
| 1 | `0x00000001` | `TRAP_CODE_I_ACCESS` | instruction access fault | I-PC | 0 | INS
| 2 | `0x00000002` | `TRAP_CODE_I_ILLEGAL` | illegal instruction | PC | 0 | INS
| 3 | `0x00000000` | `TRAP_CODE_I_MISALIGNED` | instruction address misaligned | PC | 0 | INS
| 4 | `0x0000000b` | `TRAP_CODE_MENV_CALL` | environment call from M-mode | PC | 0 | INS
| 5 | `0x00000008` | `TRAP_CODE_UENV_CALL` | environment call from U-mode | PC | 0 | INS
| 6 | `0x00000003` | `TRAP_CODE_BREAKPOINT` | software breakpoint / trigger firing | PC | 0 | INS
| 7 | `0x00000006` | `TRAP_CODE_S_MISALIGNED` | store address misaligned | PC | ADR | INS
| 8 | `0x00000004` | `TRAP_CODE_L_MISALIGNED` | load address misaligned | PC | ADR | INS
| 9 | `0x00000007` | `TRAP_CODE_S_ACCESS` | store access fault | PC | ADR | INS
| 10 | `0x00000005` | `TRAP_CODE_L_ACCESS` | load access fault | PC | ADR | INS
7+^| **Interrupts** (_asynchronous_ to instruction execution)
| 11 | `0x80000010` | `TRAP_CODE_FIRQ_0` | fast interrupt request channel 0 | I-PC | 0 | 0
| 12 | `0x80000011` | `TRAP_CODE_FIRQ_1` | fast interrupt request channel 1 | I-PC | 0 | 0
| 13 | `0x80000012` | `TRAP_CODE_FIRQ_2` | fast interrupt request channel 2 | I-PC | 0 | 0
| 14 | `0x80000013` | `TRAP_CODE_FIRQ_3` | fast interrupt request channel 3 | I-PC | 0 | 0
| 15 | `0x80000014` | `TRAP_CODE_FIRQ_4` | fast interrupt request channel 4 | I-PC | 0 | 0
| 16 | `0x80000015` | `TRAP_CODE_FIRQ_5` | fast interrupt request channel 5 | I-PC | 0 | 0
| 17 | `0x80000016` | `TRAP_CODE_FIRQ_6` | fast interrupt request channel 6 | I-PC | 0 | 0
| 18 | `0x80000017` | `TRAP_CODE_FIRQ_7` | fast interrupt request channel 7 | I-PC | 0 | 0
| 19 | `0x80000018` | `TRAP_CODE_FIRQ_8` | fast interrupt request channel 8 | I-PC | 0 | 0
| 20 | `0x80000019` | `TRAP_CODE_FIRQ_9` | fast interrupt request channel 9 | I-PC | 0 | 0
| 21 | `0x8000001a` | `TRAP_CODE_FIRQ_10` | fast interrupt request channel 10 | I-PC | 0 | 0
| 22 | `0x8000001b` | `TRAP_CODE_FIRQ_11` | fast interrupt request channel 11 | I-PC | 0 | 0
| 23 | `0x8000001c` | `TRAP_CODE_FIRQ_12` | fast interrupt request channel 12 | I-PC | 0 | 0
| 24 | `0x8000001d` | `TRAP_CODE_FIRQ_13` | fast interrupt request channel 13 | I-PC | 0 | 0
| 25 | `0x8000001e` | `TRAP_CODE_FIRQ_14` | fast interrupt request channel 14 | I-PC | 0 | 0
| 26 | `0x8000001f` | `TRAP_CODE_FIRQ_15` | fast interrupt request channel 15 | I-PC | 0 | 0
| 27 | `0x8000000b` | `TRAP_CODE_MEI` | machine external interrupt (MEI) | I-PC | 0 | 0
| 28 | `0x80000003` | `TRAP_CODE_MSI` | machine software interrupt (MSI) | I-PC | 0 | 0
| 29 | `0x80000007` | `TRAP_CODE_MTI` | machine timer interrupt (MTI) | I-PC | 0 | 0
|=======================
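A trap handler typically starts by decoding `mcause`. As the table shows, the most significant bit distinguishes interrupts from exceptions, and the FIRQ channels occupy the interrupt codes 16 to 31. The following C sketch decodes a raw `mcause` value accordingly; the type `trap_info_t` and the function `decode_mcause` are hypothetical helpers and not part of the NEORV32 runtime environment.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
  bool is_interrupt; /* true: interrupt, false: exception */
  uint32_t code;     /* trap code without the interrupt flag */
  int firq_channel;  /* 0..15 for FIRQs, -1 otherwise */
} trap_info_t;

/* Split a raw mcause value into its components. */
trap_info_t decode_mcause(uint32_t mcause) {
  trap_info_t t;
  t.is_interrupt = (mcause >> 31) != 0u;
  t.code = mcause & 0x7FFFFFFFu;
  t.firq_channel = -1;
  if (t.is_interrupt && t.code >= 16u && t.code <= 31u) {
    t.firq_channel = (int)(t.code - 16u);
  }
  return t;
}
```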
.NEORV32 Trap Description
[cols="<3,<7"]
[options="header",grid="rows"]
|=======================
| Trap ID [C] | Triggered when ...
| `TRAP_CODE_I_ACCESS` | bus timeout, bus access error or <<_pmp_isa_extension,PMP>> rule violation during instruction fetch
| `TRAP_CODE_I_ILLEGAL` | trying to execute an invalid instruction word (malformed or not supported) or on a privilege violation
| `TRAP_CODE_I_MISALIGNED` | fetching a 32-bit instruction word that is not 32-bit-aligned (see note below)
| `TRAP_CODE_MENV_CALL` | executing `ecall` instruction in machine-mode
| `TRAP_CODE_UENV_CALL` | executing `ecall` instruction in user-mode
| `TRAP_CODE_BREAKPOINT` | executing `ebreak` instruction or if <<_trigger_module>> fires
| `TRAP_CODE_S_MISALIGNED` | storing data to an address that is not naturally aligned to the data size (half/word)
| `TRAP_CODE_L_MISALIGNED` | loading data from an address that is not naturally aligned to the data size (half/word)
| `TRAP_CODE_S_ACCESS` | bus timeout, bus access error or <<_pmp_isa_extension,PMP>> rule violation during store data operation
| `TRAP_CODE_L_ACCESS` | bus timeout, bus access error or <<_pmp_isa_extension,PMP>> rule violation during load data operation
| `TRAP_CODE_FIRQ_*` | caused by interrupt-condition of **processor-internal modules**, see <<_neorv32_specific_fast_interrupt_requests>>
| `TRAP_CODE_MEI` | machine external interrupt (via dedicated <<_processor_top_entity_signals>>)
| `TRAP_CODE_MSI` | machine software interrupt (via dedicated <<_processor_top_entity_signals>>)
| `TRAP_CODE_MTI` | machine timer interrupt (internal <<_machine_system_timer_mtime>> or via dedicated <<_processor_top_entity_signals>>)
|=======================
.Resumable Exceptions
[WARNING]
Note that not all exceptions are resumable. For example, the "instruction access fault" exception or the "instruction
address misaligned" exception are not resumable in most cases. These exceptions might indicate a fatal memory hardware failure.

View File

@ -0,0 +1,258 @@
<<<
:sectnums:
=== Custom Functions Unit (CFU)
The Custom Functions Unit is the central part of the <<_zxcfu_isa_extension>> and represents
the actual hardware module, which can be used to implement _custom RISC-V instructions_.
The CFU is intended for operations that are inefficient in terms of performance, latency, energy consumption or
program memory requirements when implemented entirely in software. Some potential application fields and exemplary
use-cases might include:
* **AI:** sub-word / vector / SIMD operations like processing all four bytes of a 32-bit data word in parallel
* **Cryptographic:** bit substitution and permutation
* **Communication:** conversions like binary to gray-code; multiply-add operations
* **Image processing:** look-up-tables for color space transformations
* implementing instructions from **other RISC-V ISA extensions** that are not yet supported by the NEORV32
[NOTE]
The CFU is not intended for complex and _CPU-independent_ functional units that implement complete accelerators
(like block-based AES encryption). These kinds of accelerators should be implemented as memory-mapped
<<_custom_functions_subsystem_cfs>>. A comparison of all NEORV32-specific chip-internal hardware extension
options is provided in the user guide section
https://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules].
:sectnums:
==== CFU Instruction Formats
The custom instructions executed by the CFU utilize a specific opcode space in the `rv32` 32-bit instruction
space that has been explicitly reserved for user-defined extensions by the RISC-V specifications ("Guaranteed
Non-Standard Encoding Space"). The NEORV32 CFU uses the `custom` opcodes to identify the instructions implemented
by the CFU and to differentiate between the different instruction formats. The according binary encoding of these
opcodes is shown below:
* `custom-0`: `0001011` RISC-V standard, used for <<_cfu_r3_type_instructions>>
* `custom-1`: `0101011` RISC-V standard, used for <<_cfu_r4_type_instructions>>
* `custom-2`: `1011011` NEORV32-specific, used for <<_cfu_r5_type_instructions>> type A
* `custom-3`: `1111011` NEORV32-specific, used for <<_cfu_r5_type_instructions>> type B
:sectnums:
===== CFU R3-Type Instructions
The R3-type CFU instructions operate on two source registers `rs1` and `rs2` and return the processing result to
the destination register `rd`. The actual operation can be defined by using the `funct7` and `funct3` bit fields.
These immediates can also be used to pass additional data to the CFU like offsets, look-up-table addresses or
shift-amounts. However, the actual functionality is entirely user-defined.
Example operation: `rd <= rs1 xnor rs2`
.CFU R3-type instruction format
image::cfu_r3type_instruction.png[align=center]
* `funct7`: 7-bit immediate (further operand data or function select)
* `rs2`: address of second source register (32-bit source data)
* `rs1`: address of first source register (32-bit source data)
* `funct3`: 3-bit immediate (further operand data or function select)
* `rd`: address of destination register (for the 32-bit processing result)
* `opcode`: `0001011` (RISC-V "custom-0" opcode)
.RISC-V compatibility
[NOTE]
The CFU R3-type instruction format is compliant with the RISC-V ISA specification.
.Instruction encoding space
[NOTE]
By using the `funct7` and `funct3` bit fields entirely for selecting the actual operation a total of 1024 custom
R3-type instructions can be implemented (7-bit + 3-bit = 10 bit -> 1024 different values).
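The field layout listed above can be sketched as a small host-side C helper that assembles an R3-type instruction word and extracts the combined 10-bit function selector. This is a minimal illustration that assumes the standard RISC-V R-type bit positions; the function names `cfu_r3_encode` and `cfu_r3_function_id` are hypothetical and not part of the NEORV32 software framework.

```c
#include <stdint.h>

/* Assemble a CFU R3-type instruction word (standard RISC-V R-type layout). */
uint32_t cfu_r3_encode(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                       uint32_t funct3, uint32_t rd)
{
    const uint32_t opcode = 0x0Bu; /* 0001011 = "custom-0" */
    return ((funct7 & 0x7Fu) << 25) | /* bits 31:25 */
           ((rs2    & 0x1Fu) << 20) | /* bits 24:20 */
           ((rs1    & 0x1Fu) << 15) | /* bits 19:15 */
           ((funct3 & 0x07u) << 12) | /* bits 14:12 */
           ((rd     & 0x1Fu) <<  7) | /* bits 11:7  */
           opcode;                    /* bits  6:0  */
}

/* Combine funct7 and funct3 into one 10-bit selector (0..1023). */
uint32_t cfu_r3_function_id(uint32_t instr)
{
    uint32_t funct7 = (instr >> 25) & 0x7Fu;
    uint32_t funct3 = (instr >> 12) & 0x07u;
    return (funct7 << 3) | funct3;
}
```

Concatenating the two immediates this way is only one possible convention; the CFU logic is free to interpret `funct7` and `funct3` independently.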
:sectnums:
===== CFU R4-Type Instructions
The R4-type CFU instructions operate on three source registers `rs1`, `rs2` and `rs3` and return the processing
result to the destination register `rd`. The actual operation can be defined by using the `funct3` bit field.
Alternatively, this immediate can also be used to pass additional data to the CFU like offsets, look-up-table
addresses or shift-amounts. However, the actual functionality is entirely user-defined.
Example operation: `rd <= (rs1 * rs2 + rs3)[31:0]`
.CFU R4-type instruction format
image::cfu_r4type_instruction.png[align=center]
* `rs3`: address of third source register (32-bit source data)
* `rs2`: address of second source register (32-bit source data)
* `rs1`: address of first source register (32-bit source data)
* `funct3`: 3-bit immediate (further operand data or function select)
* `rd`: address of destination register (for the 32-bit processing result)
* `opcode`: `0101011` (RISC-V "custom-1" opcode)
.RISC-V compatibility
[NOTE]
The CFU R4-type instruction format is compliant with the RISC-V ISA specification.
.Unused instruction bits
[NOTE]
The RISC-V ISA specification defines bits [26:25] of the R4-type instruction word to be all-zero. These bits
are ignored by the hardware (CFU and illegal instruction check logic) and should be set to all-zero to preserve
compatibility with future ISA spec. versions.
.Instruction encoding space
[NOTE]
By using the `funct3` bit field entirely for selecting the actual operation a total of 8 custom R4-type
instructions can be implemented (3-bit -> 8 different values).
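As a rough host-side sketch, the R4-type word can be assembled in the same way, leaving bits [26:25] at zero as recommended above. The second function is a plain C reference model of the example operation `rd <= (rs1 * rs2 + rs3)[31:0]` that could be used to cross-check hardware results. Both function names are hypothetical, and the bit positions assume the standard RISC-V R4-type layout.

```c
#include <stdint.h>

/* Assemble a CFU R4-type instruction word; bits 26:25 stay zero. */
uint32_t cfu_r4_encode(uint32_t rs3, uint32_t rs2, uint32_t rs1,
                       uint32_t funct3, uint32_t rd)
{
    const uint32_t opcode = 0x2Bu; /* 0101011 = "custom-1" */
    return ((rs3    & 0x1Fu) << 27) | /* bits 31:27 */
           ((rs2    & 0x1Fu) << 20) | /* bits 24:20 */
           ((rs1    & 0x1Fu) << 15) | /* bits 19:15 */
           ((funct3 & 0x07u) << 12) | /* bits 14:12 */
           ((rd     & 0x1Fu) <<  7) | /* bits 11:7  */
           opcode;                    /* bits  6:0  */
}

/* Software reference of the example multiply-add operation.
 * Unsigned 32-bit arithmetic wraps modulo 2^32, which matches
 * taking bits [31:0] of the full-width result. */
uint32_t cfu_r4_muladd_ref(uint32_t rs1, uint32_t rs2, uint32_t rs3)
{
    return rs1 * rs2 + rs3;
}
```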
:sectnums:
===== CFU R5-Type Instructions
The R5-type CFU instructions operate on four source registers `rs1`, `rs2`, `rs3` and `rs4` and return the
processing result to the destination register `rd`. As all bits of the instruction word are used to encode the
five registers and the opcode, no further immediate bits are available to specify the actual operation. There
are two different R5-type instructions with two different opcodes available. Hence, only two R5-type operations
can be implemented out of the box.
Example operation: `rd <= rs1 & rs2 & rs3 & rs4`
.CFU R5-type instruction A format
image::cfu_r5type_instruction_a.png[align=center]
.CFU R5-type instruction B format
image::cfu_r5type_instruction_b.png[align=center]
* `rs4.hi` & `rs4.lo`: address of fourth source register (32-bit source data)
* `rs3`: address of third source register (32-bit source data)
* `rs2`: address of second source register (32-bit source data)
* `rs1`: address of first source register (32-bit source data)
* `rd`: address of destination register (for the 32-bit processing result)
* `opcode`: `1011011` (RISC-V "custom-2" opcode) and/or `1111011` (RISC-V "custom-3" opcode)
.RISC-V compatibility
[IMPORTANT]
The RISC-V ISA specification does not specify a R5-type instruction format. Hence, this instruction
format is NEORV32-specific.
.Instruction encoding space
[IMPORTANT]
There are no immediate fields in the CFU R5-type instruction so the actual operation is specified entirely
by the opcode resulting in just two different operations out of the box. However, another CFU instruction
(like a R3-type instruction) can be used to "program" the actual operation of a R5-type instruction by
writing operation information to a CFU-internal "command" register.
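The "command register" idea can be illustrated with a small behavioral model in C. This is only a sketch of the concept, not the default CFU implementation: the selector values, the `cfu_model_t` type and the function names are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical operation selectors held in the command register. */
enum { R5_OP_AND = 0, R5_OP_OR = 1, R5_OP_XOR = 2 };

/* Behavioral model of the CFU-internal state. */
typedef struct {
    uint32_t command; /* models the CFU-internal "command" register */
} cfu_model_t;

/* Models an R3-type instruction that programs the command register. */
uint32_t cfu_model_r3_set_command(cfu_model_t *cfu, uint32_t rs1)
{
    cfu->command = rs1;
    return 0u; /* result is unused in this model */
}

/* Models R5-type instruction A: the operation depends on the command. */
uint32_t cfu_model_r5_a(const cfu_model_t *cfu, uint32_t rs1, uint32_t rs2,
                        uint32_t rs3, uint32_t rs4)
{
    switch (cfu->command) {
        case R5_OP_OR:  return rs1 | rs2 | rs3 | rs4;
        case R5_OP_XOR: return rs1 ^ rs2 ^ rs3 ^ rs4;
        default:        return rs1 & rs2 & rs3 & rs4; /* R5_OP_AND and fallback */
    }
}
```

With this scheme a single R5-type opcode can serve several operations at the cost of one extra instruction whenever the operation changes.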
:sectnums:
==== Using Custom Instructions in Software
The custom instructions provided by the CFU can be used in plain C code by using **intrinsics**. Intrinsics
behave like "normal" C functions but under the hood they are a set of macros that hide the complexity of inline assembly.
Using intrinsics removes the need to modify the compiler, built-in libraries or the assembler when using custom
instructions. Each intrinsic will be compiled into a single 32-bit instruction word providing maximum code efficiency.
.CFU Example Program
[TIP]
There is an example program for the CFU, which shows how to use the _default_ CFU hardware module.
This example program is located in `sw/example/demo_cfu`.
The NEORV32 software framework provides four pre-defined prototypes for custom instructions, which are defined in
`sw/lib/include/neorv32_cpu_cfu.h`:
.CFU instruction prototypes
[source,c]
----
neorv32_cfu_r3_instr(funct7, funct3, rs1, rs2) // R3-type instructions
neorv32_cfu_r4_instr(funct3, rs1, rs2, rs3) // R4-type instructions
neorv32_cfu_r5_instr_a(rs1, rs2, rs3, rs4) // R5-type instruction A
neorv32_cfu_r5_instr_b(rs1, rs2, rs3, rs4) // R5-type instruction B
----
The intrinsic functions always return a 32-bit value of type `uint32_t` (the processing result), which can be discarded
if not needed. Each intrinsic function requires several arguments depending on the instruction type/format:
* `funct7` - 7-bit immediate (R3-type only)
* `funct3` - 3-bit immediate (R3-type, R4-type)
* `rs1` - source operand 1, 32-bit (R3-type, R4-type, R5-type)
* `rs2` - source operand 2, 32-bit (R3-type, R4-type, R5-type)
* `rs3` - source operand 3, 32-bit (R4-type, R5-type)
* `rs4` - source operand 4, 32-bit (R5-type only)
The `funct3` and `funct7` bit-fields are used to pass 3-bit or 7-bit literals to the CFU. The `rs1`, `rs2`, `rs3`
and `rs4` arguments pass the actual data to the CFU. These register arguments can be populated with variables or
literals. The following example shows how to pass arguments:
.CFU instruction usage example
[source,c]
----
uint32_t tmp = some_function();
...
uint32_t res = neorv32_cfu_r3_instr(0b0000000, 0b101, tmp, 123);
uint32_t foo = neorv32_cfu_r4_instr(0b011, tmp, res, (uint32_t)some_array[i]);
uint32_t bar = neorv32_cfu_r5_instr_a(tmp, res, foo, tmp);
----
:sectnums:
==== CFU Control and Status Registers (CFU-CSRs)
The CPU provides up to four control and status registers (<<_cfureg, `cfureg*`>>) to be
used within the CFU. These CSRs are mapped to the "custom user-mode read/write" CSR address space, which is
explicitly reserved for platform-specific application by the RISC-V spec. For example, these CSRs can be used
to pass additional operands to the CFU, to obtain additional results, to check processing status or to program
operation modes.
.CFU CSR Access Example
[source,c]
----
neorv32_cpu_csr_write(CSR_CFUREG0, 0xabcdabcd); // write data to CFU CSR 0
uint32_t tmp = neorv32_cpu_csr_read(CSR_CFUREG3); // read data from CFU CSR 3
----
.Additional CFU-internal CSRs
[TIP]
If more than four CFU-internal CSRs are required the designer can implement an "indirect access mechanism" based
on just two of the default CSRs: one CSR is used to configure the index while the other is used as alias to exchange
data with the indexed CFU-internal CSR - this concept is similar to the RISC-V Indirect CSR Access Extension
Specification (`Smcsrind`).
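The indirect access mechanism described above might be modeled as follows. This is a host-side behavioral sketch under stated assumptions: one CSR acts as index register, another as data alias, and the number of CFU-internal registers (`CFU_NUM_INTERNAL_REGS`) as well as all type and function names are hypothetical.

```c
#include <stdint.h>

#define CFU_NUM_INTERNAL_REGS 16u /* hypothetical number of internal registers */

typedef struct {
    uint32_t index;                       /* written via the "index" CSR */
    uint32_t regs[CFU_NUM_INTERNAL_REGS]; /* CFU-internal registers */
} cfu_indirect_t;

/* Models a write to the index CSR; out-of-range indices wrap around. */
void cfu_write_index(cfu_indirect_t *c, uint32_t idx)
{
    c->index = idx % CFU_NUM_INTERNAL_REGS;
}

/* Models a write to the data alias CSR. */
void cfu_write_data(cfu_indirect_t *c, uint32_t value)
{
    c->regs[c->index] = value;
}

/* Models a read from the data alias CSR. */
uint32_t cfu_read_data(const cfu_indirect_t *c)
{
    return c->regs[c->index];
}
```

On the software side each access then takes two CSR operations: one to set the index and one to transfer the data.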
:sectnums:
==== Custom Instructions Hardware
The actual functionality of the CFU's custom instructions is defined by the user-defined logic inside
the CFU hardware module `rtl/core/neorv32_cpu_cp_cfu.vhd`.
CFU operations can be entirely combinatorial (like bit-reversal) so the result is available at the end of
the current clock cycle. Operations can also take several clock cycles to complete (like multiplications)
and may also include internal states and memories. The CFU's internal control unit takes care of
interfacing the custom user logic to the CPU pipeline.
.CFU Hardware Example & More Details
[TIP]
The default CFU hardware module already implements some exemplary instructions that are used for illustration
by the CFU example program. See the CFU's VHDL source file (`rtl/core/neorv32_cpu_cp_cfu.vhd`), which
is highly commented to explain the available signals, implementation options and the handshake with the CPU pipeline.
.CFU Hardware Resource Requirements
[NOTE]
Enabling the CFU and actually implementing R4-type and/or R5-type instructions (or more precisely, using
the according operands for the CFU hardware) will add one or two, respectively, additional read ports to
the core's register file significantly increasing resource requirements.
.CFU Access
[NOTE]
The CFU is accessible from all privilege modes (including CFU-internal registers accessed via the indirect CSR
access mechanism). It is the task of the CFU designers to add according access-constraining logic if certain CFU
states shall not be exposed to all privilege levels (e.g. encryption keys).
.CFU Execution Time
[NOTE]
The CFU has to complete computation within a **bound time window**. Otherwise, the CFU operation is terminated
by the hardware and an illegal instruction exception is raised. See section <<_cpu_arithmetic_logic_unit>>
for more information.
.CFU Exception
[NOTE]
The CFU can intentionally raise an illegal instruction exception by not asserting the `done` signal at all, causing an
execution timeout. For example this can be used to signal invalid configurations/operations to the runtime
environment. See the CFU's VHDL file for more information.

@ -0,0 +1,998 @@
<<<
:sectnums:
=== Control and Status Registers (CSRs)
The following table shows a summary of all available NEORV32 CSRs. The address field defines the CSR address for
the CSR access instructions. The "Name [ASM]" column provides the CSR name aliases that can be used in (inline) assembly.
The "Name [C]" column lists the name aliases that are defined by the NEORV32 core library. These can be used in plain C code.
The "Access" column shows the minimum privilege mode required for accessing the according CSR (`M` = machine-mode,
`U` = user-mode, `D` = debug-mode) and the read/write capabilities (`RW` = read-write, `RO` = read-only).
.Unused, Reserved, Unimplemented and Disabled CSRs
[IMPORTANT]
All CSRs and CSR bits that are not listed in the table below are _unimplemented_ and are _hardwired to zero_. Additionally,
CSRs that are unavailable ("disabled") because the according ISA extension is not enabled are also considered _unimplemented_
and are also hardwired to zero. Any access to such a CSR will raise an illegal instruction exception. All writable CSRs provide
**WARL** behavior (write any values; read legal values). Application software should always read back a CSR after writing
to check if the targeted bits can actually be modified.
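The write-then-read-back pattern can be sketched with a simplified host-side model. It assumes that WARL behavior reduces to a per-CSR mask of writable bits, which is a simplification (real hardware may also map illegal values to other legal ones). The type `warl_csr_t` and the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simplified WARL register: only bits set in writable_mask can change. */
typedef struct {
    uint32_t value;
    uint32_t writable_mask;
} warl_csr_t;

void warl_write(warl_csr_t *csr, uint32_t wdata)
{
    csr->value = (csr->value & ~csr->writable_mask) |
                 (wdata      &  csr->writable_mask);
}

uint32_t warl_read(const warl_csr_t *csr)
{
    return csr->value;
}

/* Write, read back, and report whether the bits of interest took effect. */
bool warl_write_verified(warl_csr_t *csr, uint32_t wdata, uint32_t bits_of_interest)
{
    warl_write(csr, wdata);
    return ((warl_read(csr) ^ wdata) & bits_of_interest) == 0u;
}
```

On real hardware the model functions would be replaced by the actual CSR write and read accesses; the comparison step stays the same.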
.NEORV32 Control and Status Registers (CSRs)
[cols="<2,<4,<5,^1,<11"]
[options="header"]
|=======================
| Address | Name [ASM] | Name [C] | Access | Description
5+^| **<<_floating_point_csrs>>**
| 0x001 | <<_fflags>> | `CSR_FFLAGS` | URW | Floating-point accrued exceptions
| 0x002 | <<_frm>> | `CSR_FRM` | URW | Floating-point dynamic rounding mode
| 0x003 | <<_fcsr>> | `CSR_FCSR` | URW | Floating-point control and status
5+^| **<<_machine_trap_setup_csrs>>**
| 0x300 | <<_mstatus>> | `CSR_MSTATUS` | MRW | Machine status register - low word
| 0x301 | <<_misa>> | `CSR_MISA` | MRW | Machine CPU ISA and extensions
| 0x304 | <<_mie>> | `CSR_MIE` | MRW | Machine interrupt enable register
| 0x305 | <<_mtvec>> | `CSR_MTVEC` | MRW | Machine trap-handler base address for ALL traps
| 0x306 | <<_mcounteren>> | `CSR_MCOUNTEREN` | MRW | Machine counter-enable register
| 0x310 | <<_mstatush>> | `CSR_MSTATUSH` | MRW | Machine status register - high word
5+^| **<<_machine_configuration_csrs>>**
| 0x30a | <<_menvcfg>> | `CSR_MENVCFG` | MRW | Machine environment configuration register - low word
| 0x31a | <<_menvcfgh>> | `CSR_MENVCFGH` | MRW | Machine environment configuration register - high word
5+^| **<<_machine_counter_setup_csrs>>**
| 0x320 | <<_mcountinhibit>> | `CSR_MCOUNTINHIBIT` | MRW | Machine counter-inhibit register
5+^| **<<_machine_trap_handling_csrs>>**
| 0x340 | <<_mscratch>> | `CSR_MSCRATCH` | MRW | Machine scratch register
| 0x341 | <<_mepc>> | `CSR_MEPC` | MRW | Machine exception program counter
| 0x342 | <<_mcause>> | `CSR_MCAUSE` | MRW | Machine trap cause
| 0x343 | <<_mtval>> | `CSR_MTVAL` | MRW | Machine trap value
| 0x344 | <<_mip>> | `CSR_MIP` | MRW | Machine interrupt pending register
| 0x34a | <<_mtinst>> | `CSR_MTINST` | MRW | Machine trap instruction
5+^| **<<_machine_physical_memory_protection_csrs>>**
| 0x3a0 .. 0x3a3 | <<_pmpcfg, `pmpcfg0`>> .. <<_pmpcfg, `pmpcfg3`>> | `CSR_PMPCFG0` .. `CSR_PMPCFG3` | MRW | Physical memory protection configuration registers
| 0x3b0 .. 0x3bf | <<_pmpaddr, `pmpaddr0`>> .. <<_pmpaddr, `pmpaddr15`>> | `CSR_PMPADDR0` .. `CSR_PMPADDR15` | MRW | Physical memory protection address registers
5+^| **<<_trigger_module_csrs>>**
| 0x7a0 | <<_tselect>> | `CSR_TSELECT` | MRW | Trigger select register
| 0x7a1 | <<_tdata1>> | `CSR_TDATA1` | MRW | Trigger data register 1
| 0x7a2 | <<_tdata2>> | `CSR_TDATA2` | MRW | Trigger data register 2
| 0x7a4 | <<_tinfo>> | `CSR_TINFO` | MRW | Trigger information register
5+^| **<<_cpu_debug_mode_csrs>>**
| 0x7b0 | <<_dcsr>> | - | DRW | Debug control and status register
| 0x7b1 | <<_dpc>> | - | DRW | Debug program counter
| 0x7b2 | <<_dscratch0>> | - | DRW | Debug scratch register 0
5+^| **<<_custom_functions_unit_cfu_csrs>>**
| 0x800 .. 0x803 | <<_cfureg, `cfureg0`>> .. <<_cfureg, `cfureg3`>> | `CSR_CFUREG0` .. `CSR_CFUREG3` | URW | Custom CFU registers 0 to 3
5+^| **<<_machine_counter_and_timer_csrs>>**
| 0xb00 | <<_mcycleh, `mcycle`>> | `CSR_MCYCLE` | MRW | Machine cycle counter low word
| 0xb02 | <<_minstreth, `minstret`>> | `CSR_MINSTRET` | MRW | Machine instruction-retired counter low word
| 0xb80 | <<_mcycleh, `mcycleh`>> | `CSR_MCYCLEH` | MRW | Machine cycle counter high word
| 0xb82 | <<_minstreth, `minstreth`>> | `CSR_MINSTRETH` | MRW | Machine instruction-retired counter high word
| 0xc00 | <<_cycleh, `cycle`>> | `CSR_CYCLE` | URO | Cycle counter low word
| 0xc02 | <<_instreth, `instret`>> | `CSR_INSTRET` | URO | Instruction-retired counter low word
| 0xc80 | <<_cycleh, `cycleh`>> | `CSR_CYCLEH` | URO | Cycle counter high word
| 0xc82 | <<_instreth, `instreth`>> | `CSR_INSTRETH` | URO | Instruction-retired counter high word
5+^| **<<_hardware_performance_monitors_hpm_csrs>>**
| 0x323 .. 0x32f | <<_mhpmevent, `mhpmevent3`>> .. <<_mhpmevent, `mhpmevent15`>> | `CSR_MHPMEVENT3` .. `CSR_MHPMEVENT15` | MRW | Machine performance-monitoring event select for counter 3..15
| 0xb03 .. 0xb0f | <<_mhpmcounterh, `mhpmcounter3`>> .. <<_mhpmcounterh, `mhpmcounter15`>> | `CSR_MHPMCOUNTER3` .. `CSR_MHPMCOUNTER15` | MRW | Machine performance-monitoring counter 3..15 low word
| 0xb83 .. 0xb8f | <<_mhpmcounterh, `mhpmcounter3h`>> .. <<_mhpmcounterh, `mhpmcounter15h`>> | `CSR_MHPMCOUNTER3H` .. `CSR_MHPMCOUNTER15H` | MRW | Machine performance-monitoring counter 3..15 high word
| 0xc03 .. 0xc0f | <<_hpmcounterh, `hpmcounter3`>> .. <<_hpmcounterh, `hpmcounter15`>> | `CSR_HPMCOUNTER3` .. `CSR_HPMCOUNTER15` | URO | User performance-monitoring counter 3..15 low word
| 0xc83 .. 0xc8f | <<_hpmcounterh, `hpmcounter3h`>> .. <<_hpmcounterh, `hpmcounter15h`>> | `CSR_HPMCOUNTER3H` .. `CSR_HPMCOUNTER15H` | URO | User performance-monitoring counter 3..15 high word
5+^| **<<_machine_information_csrs>>**
| 0xf11 | <<_mvendorid>> | `CSR_MVENDORID` | MRO | Machine vendor ID
| 0xf12 | <<_marchid>> | `CSR_MARCHID` | MRO | Machine architecture ID
| 0xf13 | <<_mimpid>> | `CSR_MIMPID` | MRO | Machine implementation ID / version
| 0xf14 | <<_mhartid>> | `CSR_MHARTID` | MRO | Machine hardware thread ID
| 0xf15 | <<_mconfigptr>> | `CSR_MCONFIGPTR` | MRO | Machine configuration pointer register
5+^| **<<_neorv32_specific_csrs>>**
| 0xfc0 | <<_mxisa>> | `CSR_MXISA` | MRO | NEORV32-specific "eXtended" machine CPU ISA and extensions
|=======================
<<<
// ####################################################################################################################
:sectnums:
==== Floating-Point CSRs
[discrete]
===== **`fflags`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Floating-point accrued exceptions
| Address | `0x001`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zfinx`
| Description | FPU status flags.
|=======================
.`fflags` CSR bits
[cols="^1,^1,<10"]
[options="header",grid="rows"]
|=======================
| Bit | R/W | Function
| 0 | r/w | **NX**: inexact
| 1 | r/w | **UF**: underflow
| 2 | r/w | **OF**: overflow
| 3 | r/w | **DZ**: division by zero
| 4 | r/w | **NV**: invalid operation
|=======================
{empty} +
[discrete]
===== **`frm`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Floating-point dynamic rounding mode
| Address | `0x002`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zfinx`
| Description | The `frm` CSR is used to configure the rounding mode of the FPU.
|=======================
.`frm` CSR bits
[cols="^1,^1,<10"]
[options="header",grid="rows"]
|=======================
| Bit | R/W | Function
| 2:0 | r/w | Rounding mode
|=======================
{empty} +
[discrete]
===== **`fcsr`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Floating-point control and status register
| Address | `0x003`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zfinx`
| Description | The `fcsr` provides combined access to the <<_fflags>> and <<_frm>> flags.
|=======================
.`fcsr` CSR bits
[cols="^1,^1,<10"]
[options="header",grid="rows"]
|=======================
| Bit | R/W | Function
| 4:0 | r/w | Accrued exception flags (<<_fflags>>)
| 7:5 | r/w | Rounding mode (<<_frm>>)
|=======================
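The relationship between the three floating-point CSRs can be expressed in a few lines of C. The helpers below are a minimal sketch that follows the bit positions in the table above; the function names are hypothetical.

```c
#include <stdint.h>

/* Build an fcsr value from rounding mode (3 bit) and exception flags (5 bit). */
uint32_t fcsr_pack(uint32_t frm, uint32_t fflags)
{
    return ((frm & 0x7u) << 5) | (fflags & 0x1Fu);
}

/* Extract the accrued exception flags, fcsr[4:0]. */
uint32_t fcsr_get_fflags(uint32_t fcsr)
{
    return fcsr & 0x1Fu;
}

/* Extract the rounding mode, fcsr[7:5]. */
uint32_t fcsr_get_frm(uint32_t fcsr)
{
    return (fcsr >> 5) & 0x7u;
}
```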
<<<
// ####################################################################################################################
:sectnums:
==== Machine Trap Setup CSRs
[discrete]
===== **`mstatus`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine status register - low word
| Address | `0x300`
| Reset value | `0x00001800`
| ISA | `Zicsr`
| Description | The `mstatus` CSR is used to configure general machine environment parameters.
|=======================
.`mstatus` CSR bits
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Function
| 3 | `CSR_MSTATUS_MIE` | r/w | **MIE**: Machine-mode interrupt enable flag
| 7 | `CSR_MSTATUS_MPIE` | r/w | **MPIE**: Previous machine-mode interrupt enable flag state
| 12:11 | `CSR_MSTATUS_MPP_H` : `CSR_MSTATUS_MPP_L` | r/w | **MPP**: Previous machine privilege mode, `11` = machine-mode "M", `00` = user-mode "U"; other values will fall-back to machine-mode
| 17 | `CSR_MSTATUS_MPRV` | r/w | **MPRV**: Effective privilege mode for load/stores; use `MPP` as effective privilege mode when set; hardwired to zero if user-mode not implemented
| 21 | `CSR_MSTATUS_TW` | r/w | **TW**: Trap on execution of `wfi` instruction in user mode when set; hardwired to zero if user-mode not implemented
|=======================
[NOTE]
If the core is in user-mode, machine-mode interrupts are globally **enabled** even if `mstatus.mie` is cleared:
"Interrupts for higher-privilege modes, y>x, are always globally enabled regardless of the setting of the global yIE
bit for the higher-privilege mode." - RISC-V ISA Spec.
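The rule quoted in the note above can be written down as a small decision function. This is a host-side sketch that uses the bit positions from the `mstatus` table; the `EX_` constants and the function names are local to the example and are not the NEORV32 core library definitions. Privilege levels are encoded as in `mstatus.MPP` (`3` = machine-mode, `0` = user-mode).

```c
#include <stdint.h>
#include <stdbool.h>

enum {
    EX_MSTATUS_MIE   = 3,  /* machine-mode interrupt enable */
    EX_MSTATUS_MPP_L = 11  /* low bit of the 2-bit MPP field */
};

enum { EX_PRIV_USER = 0, EX_PRIV_MACHINE = 3 };

/* Extract the previous privilege mode, mstatus[12:11]. */
uint32_t mstatus_get_mpp(uint32_t mstatus)
{
    return (mstatus >> EX_MSTATUS_MPP_L) & 0x3u;
}

/* Machine-mode interrupts are globally enabled if the core runs in a
 * lower privilege mode, or if mstatus.MIE is set while in machine-mode. */
bool machine_irq_globally_enabled(uint32_t mstatus, uint32_t current_priv)
{
    if (current_priv < EX_PRIV_MACHINE) {
        return true;
    }
    return ((mstatus >> EX_MSTATUS_MIE) & 1u) != 0u;
}
```

Applied to the reset value `0x00001800`, the first helper returns `3` (machine-mode) for `MPP`.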
{empty} +
[discrete]
===== **`misa`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | ISA and extensions
| Address | `0x301`
| Reset value | `DEFINED`, according to enabled ISA extensions
| ISA | `Zicsr`
| Description | The `misa` CSR provides information regarding the availability of basic RISC-V ISA extensions.
|=======================
[NOTE]
The NEORV32 `misa` CSR is read-only. Hence, active CPU extensions are entirely defined by pre-synthesis configurations
and cannot be switched on/off during runtime. For compatibility reasons any write access to this CSR is simply ignored and
will _not_ cause an illegal instruction exception.
.`misa` CSR bits
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Function
| 0 | `CSR_MISA_A_EXT` | r/- | **A**: CPU extension (atomic memory access) available, set when <<_a_isa_extension>> enabled
| 1 | `CSR_MISA_B_EXT` | r/- | **B**: CPU extension (bit-manipulation) available, set when <<_b_isa_extension>> enabled
| 2 | `CSR_MISA_C_EXT` | r/- | **C**: CPU extension (compressed instruction) available, set when <<_c_isa_extension>> enabled
| 4 | `CSR_MISA_E_EXT` | r/- | **E**: CPU extension (embedded) available, set when <<_e_isa_extension>> enabled
| 8 | `CSR_MISA_I_EXT` | r/- | **I**: CPU base ISA, cleared when <<_e_isa_extension>> enabled
| 12 | `CSR_MISA_M_EXT` | r/- | **M**: CPU extension (mul/div) available, set when <<_m_isa_extension>> enabled
| 20 | `CSR_MISA_U_EXT` | r/- | **U**: CPU extension (user mode) available, set when <<_u_isa_extension>> enabled
| 23 | `CSR_MISA_X_EXT` | r/- | **X**: bit is always set to indicate non-standard / NEORV32-specific extensions
| 31:30 | `CSR_MISA_MXL_HI_EXT` : `CSR_MISA_MXL_LO_EXT` | r/- | **MXL**: 32-bit architecture indicator (always `01`)
|=======================
[TIP]
Machine-mode software can discover available `Z*` _sub-extensions_ (like `Zicsr` or `Zfinx`) by checking the NEORV32-specific
<<_mxisa>> CSR.
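Since each single-letter extension maps to the `misa` bit with the index of its letter (`A` = 0, `B` = 1, ...), a value read from the CSR can be decoded as sketched below. The function names are hypothetical, and the input is expected to come from an actual `misa` read.

```c
#include <stdint.h>
#include <stdbool.h>

/* Check whether the single-letter extension (upper case) is flagged in misa. */
bool misa_has_ext(uint32_t misa, char letter)
{
    if (letter < 'A' || letter > 'Z') {
        return false;
    }
    return ((misa >> (letter - 'A')) & 1u) != 0u;
}

/* Extract the MXL field, misa[31:30]; 1 indicates a 32-bit architecture. */
uint32_t misa_get_mxl(uint32_t misa)
{
    return misa >> 30;
}
```

For a hypothetical configuration with `C`, `I`, `M` and `X` set, the CSR would read `0x40801104`.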
{empty} +
[discrete]
===== **`mie`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine interrupt-enable register
| Address | `0x304`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The `mie` CSR is used to enable/disable individual interrupt sources.
|=======================
.`mie` CSR bits
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Function
| 3 | `CSR_MIE_MSIE` | r/w | **MSIE**: Machine _software_ interrupt enable
| 7 | `CSR_MIE_MTIE` | r/w | **MTIE**: Machine _timer_ interrupt enable (from <<_machine_system_timer_mtime>>)
| 11 | `CSR_MIE_MEIE` | r/w | **MEIE**: Machine _external_ interrupt enable
| 31:16 | `CSR_MIE_FIRQ15E` : `CSR_MIE_FIRQ0E` | r/w | Fast interrupt channel 15..0 enable
|=======================
{empty} +
[discrete]
===== **`mtvec`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine trap-handler base address
| Address | `0x305`
| Reset value | `CPU_BOOT_ADDR`, CPU boot address, 4-byte aligned (see <<_cpu_top_entity_generics>> and <<_address_space>>)
| ISA | `Zicsr`
| Description | The `mtvec` CSR holds the trap vector configuration.
|=======================
.`mtvec` CSR bits
[cols="^1,^1,<10"]
[options="header",grid="rows"]
|=======================
| Bit | R/W | Function
| 1:0 | r/w | **MODE**: mode configuration, `00` = DIRECT, `01` = VECTORED. (Others will fall back to DIRECT mode.)
| 31:2 | r/w | **BASE**: in DIRECT mode = 4-byte aligned base address of trap base handler, _all_ traps set `pc` = `BASE`; in VECTORED mode = 128-byte aligned base address of trap vector table, interrupts cause a jump to `pc` = `BASE` + 4 * `mcause` and exceptions to `pc` = `BASE`.
|=======================
.Interrupt Latency
[TIP]
The vectored `mtvec` mode is useful for reducing the time between an interrupt request (IRQ) and the start of its service routine (ISR). Because software does not need to determine the interrupt cause, the latency can be reduced by a factor of 5 to 10, down to as few as _26_ cycles.
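The address selection described in the `mtvec` bit table can be modeled as follows. This is a simplified sketch: it assumes that software has written a suitably aligned base address and that the vector offset uses the 5-bit cause code from `mcause[4:0]`. The function name `trap_target` is hypothetical.

```c
#include <stdint.h>

/* Compute the address the CPU jumps to when a trap is taken. */
uint32_t trap_target(uint32_t mtvec, uint32_t mcause)
{
    uint32_t base         = mtvec & ~UINT32_C(3); /* BASE field, mtvec[31:2] */
    uint32_t mode         = mtvec & 0x3u;         /* MODE field, mtvec[1:0]  */
    uint32_t is_interrupt = mcause >> 31;         /* mcause[31]              */
    uint32_t code         = mcause & 0x1Fu;       /* mcause[4:0]             */

    if (mode == 1u && is_interrupt) {
        return base + 4u * code; /* VECTORED: one table entry per interrupt */
    }
    return base; /* DIRECT mode, exceptions, and fall-back for other modes */
}
```

With 32 possible cause codes and 4 bytes per entry the vector table spans 128 bytes, which matches the alignment requirement for VECTORED mode.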
{empty} +
[discrete]
===== **`mcounteren`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine counter enable
| Address | `0x306`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `U`
| Description | The `mcounteren` CSR is used to constrain user-mode access to the CPU's counter CSRs.
|=======================
.`mcounteren` CSR bits
[cols="^1,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | R/W | Function
| 0 | r/w (!) | **CY**: User-mode is allowed to read <<_cycleh>> CSRs when set
| 1 | r/- | **TM**: not implemented, hardwired to zero
| 2 | r/w (!) | **IR**: User-mode is allowed to read <<_instreth>> CSRs when set
| 15:3 | r/w (!) | **HPM**: user-mode is allowed to read <<_hpmcounterh>> CSRs when set
|=======================
[IMPORTANT]
Physically, the NEORV32's `mcounteren` CSR is implemented as a **single 1-bit register**. Setting _any_ bit of
the CSR will result in all bits being set. Hence, user-mode access can either be granted for **all** counter CSRs
or entirely denied, allowing access to **none** of the counter CSRs.
{empty} +
[discrete]
===== **`mstatush`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine status register - high word
| Address | `0x310`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The features of this CSR are not implemented yet. The register is read-only and always returns zero.
|=======================
<<<
// ####################################################################################################################
:sectnums:
==== Machine Trap Handling CSRs
[discrete]
===== **`mscratch`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Scratch register for machine trap handlers
| Address | `0x340`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The `mscratch` is a general-purpose machine-mode scratch register.
|=======================
{empty} +
[discrete]
===== **`mepc`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine exception program counter
| Address | `0x341`
| Reset value | `CPU_BOOT_ADDR`, CPU boot address, 4-byte aligned (see <<_cpu_top_entity_generics>> and <<_address_space>>)
| ISA | `Zicsr`
| Description | The `mepc` CSR provides the instruction address where execution has stopped/failed when
an interrupt is triggered / an exception is raised. See section <<_traps_exceptions_and_interrupts>> for a list of all legal values.
The `mret` instruction will return to the address stored in `mepc` by automatically moving `mepc` to the program counter.
|=======================
[NOTE]
`mepc[0]` is hardwired to zero. If IALIGN = 32 (i.e. <<_c_isa_extension>> is disabled) then `mepc[1]` is also hardwired to zero.
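The effect of the hardwired bits on a value written to `mepc` can be sketched as a simple masking function. The function name `mepc_legalize` and its boolean parameter are hypothetical and only model the behavior described in the note above.

```c
#include <stdint.h>
#include <stdbool.h>

/* Value that is read back from mepc after writing wdata. */
uint32_t mepc_legalize(uint32_t wdata, bool c_ext_enabled)
{
    if (c_ext_enabled) {
        return wdata & ~UINT32_C(1); /* IALIGN = 16: only bit 0 is zero */
    }
    return wdata & ~UINT32_C(3);     /* IALIGN = 32: bits 1:0 are zero  */
}
```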
{empty} +
[discrete]
===== **`mcause`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine trap cause
| Address | `0x342`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The `mcause` CSR shows the exact cause of a trap. See section <<_traps_exceptions_and_interrupts>> for a list of all legal values.
|=======================
.`mcause` CSR bits
[cols="^1,^1,<10"]
[options="header",grid="rows"]
|=======================
| Bit | R/W | Function
| 4:0 | r/w | **Exception code**: see <<_neorv32_trap_listing>>
| 31 | r/w | **Interrupt**: `1` if the trap is caused by an interrupt (`0` if the trap is caused by an exception)
|=======================
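A trap handler typically splits `mcause` into its two fields before dispatching. The helpers below are a minimal sketch based on the bit table above; the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if the trap was caused by an interrupt, mcause[31]. */
bool mcause_is_interrupt(uint32_t mcause)
{
    return (mcause >> 31) != 0u;
}

/* Exception/interrupt code, mcause[4:0]. */
uint32_t mcause_code(uint32_t mcause)
{
    return mcause & 0x1Fu;
}
```

For instance, the value `0x80000007` decodes to an interrupt with code 7, while `0x0000000b` decodes to an exception with code 11.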
{empty} +
[discrete]
===== **`mtval`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine trap value
| Address | `0x343`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The `mtval` CSR provides additional information why a trap was entered. See section <<_traps_exceptions_and_interrupts>> for more information.
|=======================
.Read-Only
[IMPORTANT]
Note that the NEORV32 `mtval` CSR is updated by the hardware only and cannot be written from software.
However, any write-access will be ignored and will not cause an exception to maintain RISC-V compatibility.
{empty} +
[discrete]
===== **`mip`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine interrupt pending
| Address | `0x344`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The `mip` CSR shows currently _pending_ machine-mode interrupt requests.
|=======================
.`mip` CSR bits
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Function
| 3 | `CSR_MIP_MSIP` | r/- | **MSIP**: Machine _software_ interrupt pending; _cleared by platform-defined mechanism_
| 7 | `CSR_MIP_MTIP` | r/- | **MTIP**: Machine _timer_ interrupt pending; _cleared by platform-defined mechanism_
| 11 | `CSR_MIP_MEIP` | r/- | **MEIP**: Machine _external_ interrupt pending; _cleared by platform-defined mechanism_
| 31:16 | `CSR_MIP_FIRQ15P` : `CSR_MIP_FIRQ0P` | r/w | **FIRQxP**: Fast interrupt channel 15..0 pending; has to be cleared manually by writing zero
|=======================
.FIRQ Channel Mapping
[TIP]
See section <<_neorv32_specific_fast_interrupt_requests>> for the mapping of the FIRQ channels and the according
interrupt-triggering processor module.
[NOTE]
The FIRQ channels can be triggered manually by software by writing `1` to the according `mip` bit.
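The bit arithmetic behind clearing and triggering FIRQ channels can be sketched as follows. The helpers only compute the value to be written to `mip`; the actual CSR access is left out. All function names are hypothetical.

```c
#include <stdint.h>

/* Bit mask of FIRQ channel 0..15 inside mip (bits 31:16). */
uint32_t mip_firq_mask(unsigned channel)
{
    return UINT32_C(1) << (16u + (channel & 15u));
}

/* Value to write to mip in order to clear one pending FIRQ. */
uint32_t mip_firq_clear_value(uint32_t mip, unsigned channel)
{
    return mip & ~mip_firq_mask(channel);
}

/* Value to write to mip in order to trigger one FIRQ by software. */
uint32_t mip_firq_trigger_value(uint32_t mip, unsigned channel)
{
    return mip | mip_firq_mask(channel);
}
```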
{empty} +
[discrete]
===== **`mtinst`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine trap instruction
| Address | `0x34a`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The `mtinst` CSR provides additional information why a trap was entered. See section <<_traps_exceptions_and_interrupts>> for more information.
|=======================
.Read-Only
[IMPORTANT]
The NEORV32 `mtinst` CSR is updated by the hardware only and cannot be written by software.
To maintain RISC-V compatibility, write accesses are ignored and do not raise an exception.
.Instruction Transformation
[IMPORTANT]
The RISC-V priv. spec. suggests that the instruction word written to `mtinst` by the hardware should be "transformed".
However, the NEORV32 `mtinst` CSR uses a simplified transformation scheme: if the trap-causing instruction is a
standard 32-bit instruction, `mtinst` contains the exact instruction word that caused the trap. If the trap-causing
instruction is a compressed instruction, `mtinst` contains the de-compressed 32-bit equivalent with bit 1 being cleared.
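As a minimal sketch of how a trap handler could use this scheme: standard 32-bit RISC-V instructions have both low opcode bits set, so a cleared bit 1 in `mtinst` marks a de-compressed 16-bit instruction. The helper names below are hypothetical, and the code is host-side Python for illustration only.

```python
def mtinst_was_compressed(mtinst):
    """True if bit 1 of `mtinst` is cleared (trap caused by a compressed instruction)."""
    return (mtinst & 0b10) == 0


def trapping_instruction_size(mtinst):
    """Size in bytes of the instruction that caused the trap (2 or 4)."""
    return 2 if mtinst_was_compressed(mtinst) else 4
```

A handler that wants to skip the trapping instruction could add `trapping_instruction_size(mtinst)` to the saved `mepc`. Note that this sketch assumes `mtinst` holds a valid instruction word; the reset value of zero would be reported as "compressed".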
<<<
// ####################################################################################################################
:sectnums:
==== Machine Configuration CSRs
[discrete]
===== **`menvcfg`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine environment configuration register - low word
| Address | `0x30a`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `U`
| Description | Currently, the features of this CSR are not supported. Hence, the entire register is hardwired to all-zero.
|=======================
{empty} +
[discrete]
===== **`menvcfgh`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine environment configuration register - high word
| Address | `0x31a`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `U`
| Description | Currently, the features of this CSR are not supported. Hence, the entire register is hardwired to all-zero.
|=======================
<<<
// ####################################################################################################################
:sectnums:
==== Machine Physical Memory Protection CSRs
The physical memory protection system is configured via the `PMP_NUM_REGIONS` and `PMP_MIN_GRANULARITY` top entity
generics. `PMP_NUM_REGIONS` defines the total number of implemented regions. Note that the maximum number of regions
is constrained to 16. Accessing a PMP-related CSR beyond `PMP_NUM_REGIONS` does **not** trigger an illegal instruction
exception. These CSRs are read-only (writes are ignored) and always return zero.
See section <<_pmp_isa_extension>> for more information.
[discrete]
===== **`pmpcfg`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | PMP region configuration registers
| Address | `0x3a0` (`pmpcfg0`)
| | `0x3a1` (`pmpcfg1`)
| | `0x3a2` (`pmpcfg2`)
| | `0x3a3` (`pmpcfg3`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `PMP`
| Description | Configuration of physical memory protection regions. Each region provides an individual 8-bit array in these CSRs.
|=======================
.`pmpcfg0` CSR Bits
[cols="^1,^2,^1,<11"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Function
| 0 | `PMPCFG_R` | r/w | **R**: Read permission
| 1 | `PMPCFG_W` | r/w | **W**: Write permission
| 2 | `PMPCFG_X` | r/w | **X**: Execute permission
| 4:3 | `PMPCFG_A_MSB` : `PMPCFG_A_LSB` | r/w | **A**: Mode configuration (`00` = OFF, `01` = TOR, `10` = NA4, `11` = NAPOT)
| 7 | `PMPCFG_L` | r/w | **L**: Lock bit, prevents further write accesses, also enforces access rights in machine-mode, can only be cleared by CPU reset
|=======================
.Implemented Modes
[NOTE]
In order to reduce the CPU size certain PMP modes (`A` bits) can be excluded from synthesis.
Use the `PMP_TOR_MODE_EN` and `PMP_NAP_MODE_EN` <<_processor_top_entity_generics>> to control
implementation of the according modes.
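The per-region bit fields of the `pmpcfg*` CSRs can be unpacked as in the following sketch. It assumes the standard RISC-V packing of four 8-bit region entries per 32-bit `pmpcfg` CSR; the function names are hypothetical and the code is host-side Python for illustration only.

```python
# Encoding of the `A` field (bits 4:3) as listed in the `pmpcfg0` table.
PMP_MODES = {0b00: "OFF", 0b01: "TOR", 0b10: "NA4", 0b11: "NAPOT"}


def pmpcfg_entry(pmpcfg_value, index):
    """Extract the 8-bit configuration of entry `index` (0..3) from one pmpcfg CSR."""
    if not 0 <= index <= 3:
        raise ValueError("each 32-bit pmpcfg CSR holds four entries")
    return (pmpcfg_value >> (8 * index)) & 0xFF


def decode_pmpcfg_entry(cfg_byte):
    """Decode one 8-bit region configuration into named fields."""
    return {
        "r": bool(cfg_byte & 0x01),
        "w": bool(cfg_byte & 0x02),
        "x": bool(cfg_byte & 0x04),
        "mode": PMP_MODES[(cfg_byte >> 3) & 0b11],
        "locked": bool(cfg_byte & 0x80),
    }
```

Whether the TOR and NAPOT/NA4 modes are actually usable depends on the synthesis configuration described in the note on implemented modes.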
{empty} +
[discrete]
===== **`pmpaddr`**
The `pmpaddr*` CSRs are used to configure the region's address boundaries.
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Physical memory protection address registers
| Address | `0x3b0` (`pmpaddr0`)
| | `0x3b1` (`pmpaddr1`)
| | `0x3b2` (`pmpaddr2`)
| | `0x3b3` (`pmpaddr3`)
| | `0x3b4` (`pmpaddr4`)
| | `0x3b5` (`pmpaddr5`)
| | `0x3b6` (`pmpaddr6`)
| | `0x3b7` (`pmpaddr7`)
| | `0x3b8` (`pmpaddr8`)
| | `0x3b9` (`pmpaddr9`)
| | `0x3ba` (`pmpaddr10`)
| | `0x3bb` (`pmpaddr11`)
| | `0x3bc` (`pmpaddr12`)
| | `0x3bd` (`pmpaddr13`)
| | `0x3be` (`pmpaddr14`)
| | `0x3bf` (`pmpaddr15`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `PMP`
| Description | Region address configuration. The two MSBs of each CSR are hardwired to zero (= bits 33:32 of the physical address).
|=======================
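For NAPOT regions, the address CSR encodes both base and size. The sketch below assumes the standard RISC-V encoding (the CSR holds address bits 33:2, and the number of trailing one-bits selects the region size); it does not model any additional constraint from `PMP_MIN_GRANULARITY`. The function names are hypothetical and the code is host-side Python for illustration only.

```python
def napot_pmpaddr(base, size):
    """Encode a naturally aligned power-of-two region as a `pmpaddr` value."""
    if size < 8 or size & (size - 1):
        raise ValueError("NAPOT size must be a power of two and at least 8 bytes")
    if base % size:
        raise ValueError("base address must be aligned to the region size")
    # Address bits 33:2 in the upper part, (log2(size) - 3) one-bits at the bottom.
    return ((base >> 2) | ((size >> 3) - 1)) & 0xFFFFFFFF


def napot_region(pmpaddr):
    """Decode a NAPOT `pmpaddr` value back into (base, size) in bytes."""
    ones = 0
    while (pmpaddr >> ones) & 1:
        ones += 1
    size = 1 << (ones + 3)
    # Clear the trailing ones and the zero above them, then restore bits 1:0.
    base = ((pmpaddr >> (ones + 1)) << (ones + 1)) << 2
    return base, size
```

For example, a 4 KiB region starting at `0x80000000` is encoded with nine trailing one-bits in the address CSR.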
<<<
// ####################################################################################################################
:sectnums:
==== Custom Functions Unit (CFU) CSRs
[discrete]
===== **`cfureg`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Custom (user-defined) CFU CSRs
| Address | `0x800` (`cfureg0`)
| | `0x801` (`cfureg1`)
| | `0x802` (`cfureg2`)
| | `0x803` (`cfureg3`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zxcfu`
| Description | User-defined CSRs to be used within the <<_custom_functions_unit_cfu>>.
|=======================
<<<
// ####################################################################################################################
:sectnums:
==== (Machine) Counter and Timer CSRs
.`time[h]` CSRs (Wall Clock Time)
[IMPORTANT]
The NEORV32 does not implement the user-mode `time[h]` registers. Any access to these registers will trap.
It is recommended that the trap handler software provides a means of accessing the platform-defined <<_machine_system_timer_mtime>>.
.Instruction Retired Counter Increment
[NOTE]
The `[m]instret[h]` counter always increments when an instruction enters the pipeline's execute stage, no matter
whether this instruction is actually going to retire or causes an exception.
[discrete]
===== **`cycle[h]`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Cycle counter
| Address | `0xc00` (`cycle`)
| | `0xc80` (`cycleh`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zicntr`
| Description | The `cycle[h]` CSRs are user-mode shadow copies of the corresponding <<_mcycleh>> CSRs. The user-mode
counters are read-only. Any write access will raise an illegal instruction exception.
|=======================
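Because the 64-bit counter is split into a low and a high CSR, a consistent read needs to guard against a carry between the two accesses. The following sketch shows the usual high/low/high pattern. The `read_csr` callback and the function name are hypothetical stand-ins for real CSR accesses, and the code is host-side Python for illustration only.

```python
def read_counter64(read_csr, low_name, high_name):
    """Read a 64-bit counter from two 32-bit CSRs without tearing.

    `read_csr` is a caller-supplied function that returns the current 32-bit
    value of the CSR with the given name.
    """
    while True:
        high = read_csr(high_name)
        low = read_csr(low_name)
        # If the high word changed, the low word wrapped in between: retry.
        if read_csr(high_name) == high:
            return (high << 32) | low
```

The same pattern applies to `instret[h]` and to the machine-mode counter pairs.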
{empty} +
[discrete]
===== **`instret[h]`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Instructions-retired counter
| Address | `0xc02` (`instret`)
| | `0xc82` (`instreth`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zicntr`
| Description | The `instret[h]` CSRs are user-mode shadow copies of the corresponding <<_minstreth>> CSRs. The user-mode
counters are read-only. Any write access will raise an illegal instruction exception.
|=======================
{empty} +
[discrete]
===== **`mcycle[h]`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine cycle counter
| Address | `0xb00` (`mcycle`)
| | `0xb80` (`mcycleh`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zicntr`
| Description | If not halted via the <<_mcountinhibit>> CSR, the `mcycle[h]` CSRs will increment with every active CPU clock
cycle (CPU not in sleep mode). These registers can be read and written by machine-mode software only.
|=======================
{empty} +
[discrete]
===== **`minstret[h]`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine instructions-retired counter
| Address | `0xb02` (`minstret`)
| | `0xb82` (`minstreth`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zicntr`
| Description | If not halted via the <<_mcountinhibit>> CSR, the `minstret[h]` CSRs will increment with every retired instruction.
These registers can be read and written by machine-mode software only.
|=======================
.Instruction Retiring
[IMPORTANT]
Note that **all** executed instructions increment the `[m]instret[h]` counters even if they do not retire
(e.g. if the instruction causes an exception).
<<<
// ####################################################################################################################
:sectnums:
==== Hardware Performance Monitors (HPM) CSRs
The actual number of implemented hardware performance monitors is configured via the `HPM_NUM_CNTS` top entity generic.
Note that all 13 HPM counter and configuration registers (`mhpmcounter*[h]` and `mhpmevent*`) are always available, but
only the actually configured ones are implemented as "real" physical registers - the remaining ones will be hardwired to zero.
Accessing an HPM-related CSR beyond `HPM_NUM_CNTS` does **not** trigger an illegal instruction exception.
These CSRs are read-only (writes are ignored) and always return zero.
The total counter width of the HPMs can be configured before synthesis via the `HPM_CNT_WIDTH` generic (0..64-bit).
If `HPM_CNT_WIDTH` is less than 64, all remaining MSB-aligned bits are hardwired to zero.
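The effect of a reduced `HPM_CNT_WIDTH` on the values software can observe may be modeled as follows. This is a simplified sketch of the "upper bits hardwired to zero" behavior, with hypothetical function names, written as host-side Python for illustration only.

```python
def hpm_visible_value(raw_count, cnt_width):
    """Value visible through a counter that implements only `cnt_width` bits."""
    if not 0 <= cnt_width <= 64:
        raise ValueError("counter width must be in 0..64")
    # Bits at and above cnt_width read as zero, so the counter wraps at 2**cnt_width.
    return raw_count & ((1 << cnt_width) - 1)


def hpm_csr_words(raw_count, cnt_width):
    """Return the (low, high) 32-bit words as read from `mhpmcounter*` and `mhpmcounter*h`."""
    value = hpm_visible_value(raw_count, cnt_width)
    return value & 0xFFFFFFFF, value >> 32
```

With a 40-bit configuration, for instance, the high word never exceeds `0xFF`, so long-running measurements need to account for wrap-around.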
[discrete]
===== **`mhpmevent`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine hardware performance monitor event select
| Address | `0x323` (`mhpmevent3`)
| | `0x324` (`mhpmevent4`)
| | `0x325` (`mhpmevent5`)
| | `0x326` (`mhpmevent6`)
| | `0x327` (`mhpmevent7`)
| | `0x328` (`mhpmevent8`)
| | `0x329` (`mhpmevent9`)
| | `0x32a` (`mhpmevent10`)
| | `0x32b` (`mhpmevent11`)
| | `0x32c` (`mhpmevent12`)
| | `0x32d` (`mhpmevent13`)
| | `0x32e` (`mhpmevent14`)
| | `0x32f` (`mhpmevent15`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zihpm`
| Description | The values in these CSRs define the architectural events that cause an increment of the corresponding `mhpmcounter*[h]` counter(s).
All available events are listed in the table below. If more than one event is selected, the counter will increment if _any_ of
the enabled events is observed (logical OR). Note that the counter will only increment by one step per clock
cycle even if more than one trigger event is observed.
|=======================
.`mhpmevent*` CSR Bits
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Event Description
4+^| **RISC-V-compatible**
| 0 | `HPMCNT_EVENT_CY` | r/w | active clock cycle (CPU not in <<_sleep_mode>>)
| 1 | `HPMCNT_EVENT_TM` | r/- | _not implemented_, hardwired to zero
| 2 | `HPMCNT_EVENT_IR` | r/w | any executed instruction (16-bit/compressed or 32-bit/uncompressed)
4+^| **NEORV32-specific**
| 3 | `HPMCNT_EVENT_COMPR` | r/w | any executed 16-bit/compressed (<<_c_isa_extension>>) instruction
| 4  | `HPMCNT_EVENT_WAIT_DIS` | r/w | instruction dispatch wait cycle (wait for instruction prefetch-buffer refill (<<_cpu_control_unit>> IPB);
caused by a fence instruction, a control flow transfer or an instruction fetch bus wait cycle)
| 5 | `HPMCNT_EVENT_WAIT_ALU` | r/w | any delay/wait cycle caused by a _multi-cycle_ <<_cpu_arithmetic_logic_unit>> operation
| 6 | `HPMCNT_EVENT_BRANCH` | r/w | any executed branch instruction (unconditional, conditional-taken or conditional-not-taken)
| 7 | `HPMCNT_EVENT_BRANCHED` | r/w | any control transfer operation (unconditional jump, taken conditional branch or trap entry/exit)
| 8 | `HPMCNT_EVENT_LOAD` | r/w | any executed load operation (including atomic memory operations, <<_a_isa_extension>>)
| 9 | `HPMCNT_EVENT_STORE` | r/w | any executed store operation (including atomic memory operations, <<_a_isa_extension>>)
| 10 | `HPMCNT_EVENT_WAIT_LSU` | r/w | any memory/bus/cache/etc. delay/wait cycle while executing any load or store operation (caused by a data bus wait cycle)
| 11 | `HPMCNT_EVENT_TRAP` | r/w | starting processing of any trap (<<_traps_exceptions_and_interrupts>>)
|=======================
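An event selection value for one of the `mhpmevent*` CSRs is simply the OR of the bits from the table above. The sketch below uses shortened event names as dictionary keys; these keys and the function name are illustrative choices, and the code is host-side Python, not part of the NEORV32 software framework.

```python
# Bit positions from the `mhpmevent*` CSR bits table (names shortened).
HPM_EVENT_BITS = {
    "CY": 0, "TM": 1, "IR": 2, "COMPR": 3, "WAIT_DIS": 4, "WAIT_ALU": 5,
    "BRANCH": 6, "BRANCHED": 7, "LOAD": 8, "STORE": 9, "WAIT_LSU": 10, "TRAP": 11,
}


def hpm_event_mask(*events):
    """Combine several events into one `mhpmevent*` value (logical OR)."""
    mask = 0
    for name in events:
        if name == "TM":
            # Bit 1 is hardwired to zero, so selecting it would have no effect.
            raise ValueError("the TM event is not implemented")
        mask |= 1 << HPM_EVENT_BITS[name]
    return mask
```

Selecting `"LOAD"` and `"STORE"` together, for example, would make a counter track all memory access instructions, with at most one increment per clock cycle.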
.Instruction Retiring ("Retired == Executed")
[IMPORTANT]
The CPU HPM/counter logic treats all executed instructions as "retired" even if they raise an exception,
cause an interrupt, trigger a privilege mode change or were not meant to retire (by the RISC-V spec.).
{empty} +
[discrete]
===== **`mhpmcounter[h]`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine hardware performance monitor (HPM) counter
| Address | `0xb03`, `0xb83` (`mhpmcounter3`, `mhpmcounter3h`)
| | `0xb04`, `0xb84` (`mhpmcounter4`, `mhpmcounter4h`)
| | `0xb05`, `0xb85` (`mhpmcounter5`, `mhpmcounter5h`)
| | `0xb06`, `0xb86` (`mhpmcounter6`, `mhpmcounter6h`)
| | `0xb07`, `0xb87` (`mhpmcounter7`, `mhpmcounter7h`)
| | `0xb08`, `0xb88` (`mhpmcounter8`, `mhpmcounter8h`)
| | `0xb09`, `0xb89` (`mhpmcounter9`, `mhpmcounter9h`)
| | `0xb0a`, `0xb8a` (`mhpmcounter10`, `mhpmcounter10h`)
| | `0xb0b`, `0xb8b` (`mhpmcounter11`, `mhpmcounter11h`)
| | `0xb0c`, `0xb8c` (`mhpmcounter12`, `mhpmcounter12h`)
| | `0xb0d`, `0xb8d` (`mhpmcounter13`, `mhpmcounter13h`)
| | `0xb0e`, `0xb8e` (`mhpmcounter14`, `mhpmcounter14h`)
| | `0xb0f`, `0xb8f` (`mhpmcounter15`, `mhpmcounter15h`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zihpm`
| Description | If not halted via the <<_mcountinhibit>> CSR the HPM counter CSR(s) increment whenever a
configured event from the according <<_mhpmevent>> CSR occurs. The counter registers are read/write for machine mode
and are not accessible for lower-privileged software.
|=======================
{empty} +
[discrete]
===== **`hpmcounter[h]`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | User hardware performance monitor (HPM) counter
| Address | `0xc03`, `0xc83` (`hpmcounter3`, `hpmcounter3h`)
| | `0xc04`, `0xc84` (`hpmcounter4`, `hpmcounter4h`)
| | `0xc05`, `0xc85` (`hpmcounter5`, `hpmcounter5h`)
| | `0xc06`, `0xc86` (`hpmcounter6`, `hpmcounter6h`)
| | `0xc07`, `0xc87` (`hpmcounter7`, `hpmcounter7h`)
| | `0xc08`, `0xc88` (`hpmcounter8`, `hpmcounter8h`)
| | `0xc09`, `0xc89` (`hpmcounter9`, `hpmcounter9h`)
| | `0xc0a`, `0xc8a` (`hpmcounter10`, `hpmcounter10h`)
| | `0xc0b`, `0xc8b` (`hpmcounter11`, `hpmcounter11h`)
| | `0xc0c`, `0xc8c` (`hpmcounter12`, `hpmcounter12h`)
| | `0xc0d`, `0xc8d` (`hpmcounter13`, `hpmcounter13h`)
| | `0xc0e`, `0xc8e` (`hpmcounter14`, `hpmcounter14h`)
| | `0xc0f`, `0xc8f` (`hpmcounter15`, `hpmcounter15h`)
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Zihpm`
| Description | The `hpmcounter*[h]` CSRs are user-mode shadow copies of the corresponding <<_mhpmcounterh>> CSRs. The user-mode
counter CSRs are read-only. Any write access will raise an illegal instruction exception.
|=======================
<<<
// ####################################################################################################################
:sectnums:
==== Machine Counter Setup CSRs
[discrete]
===== **`mcountinhibit`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine counter-inhibit register
| Address | `0x320`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | Set bit to halt the according counter CSR.
|=======================
.`mcountinhibit` CSR Bits
[cols="^1,^3,^1,<9"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Description
| 0    | `CSR_MCOUNTINHIBIT_CY` | r/w | **CY**: Set to `1` to halt `[m]cycle[h]`; hardwired to zero if `Zicntr` ISA extension is disabled
| 1    | - | r/- | **TM**: Hardwired to zero as `time[h]` CSRs are not implemented
| 2    | `CSR_MCOUNTINHIBIT_IR` | r/w | **IR**: Set to `1` to halt `[m]instret[h]`; hardwired to zero if `Zicntr` ISA extension is disabled
| 15:3 | `CSR_MCOUNTINHIBIT_HPM15` : `CSR_MCOUNTINHIBIT_HPM3` | r/w | **HPMx**: Set to `1` to halt `[m]hpmcounter*[h]`; hardwired to zero if `Zihpm` ISA extension is disabled
|=======================
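For the HPM part of this register, the inhibit bit index equals the counter number (3 to 15). A minimal sketch for building such a mask, with a hypothetical function name and written as host-side Python for illustration only:

```python
def hpm_inhibit_mask(counters):
    """Build a `mcountinhibit` mask that halts the given HPM counters (3..15)."""
    mask = 0
    for number in counters:
        if not 3 <= number <= 15:
            raise ValueError("HPM counter numbers range from 3 to 15")
        # Bit n of mcountinhibit halts HPM counter n.
        mask |= 1 << number
    return mask
```

Setting this mask before configuring the event CSRs and clearing it afterwards is a common way to start several counters at a well-defined point.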
<<<
// ####################################################################################################################
:sectnums:
==== Machine Information CSRs
[discrete]
===== **`mvendorid`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine vendor ID
| Address | `0xf11`
| Reset value | `DEFINED`
| ISA | `Zicsr`
| Description | Vendor ID (JEDEC identifier), assigned via the `VENDOR_ID` top generic (<<_processor_top_entity_generics>>).
|=======================
{empty} +
[discrete]
===== **`marchid`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine architecture ID
| Address | `0xf12`
| Reset value | `0x00000013`
| ISA | `Zicsr`
| Description | The `marchid` CSR is read-only and provides the NEORV32 official RISC-V open-source architecture ID
(decimal: 19, 32-bit hexadecimal: 0x00000013).
|=======================
{empty} +
[discrete]
===== **`mimpid`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine implementation ID
| Address | `0xf13`
| Reset value | `DEFINED`
| ISA | `Zicsr`
| Description | The `mimpid` CSR is read-only and provides the version of the
NEORV32 as BCD-coded number (example: `mimpid = 0x01020312` → 01.02.03.12 → version 1.2.3.12).
|=======================
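Decoding the BCD-coded version is a matter of splitting the 32-bit value into four bytes of two decimal digits each. The following sketch uses hypothetical function names and is host-side Python for illustration only.

```python
def decode_mimpid(mimpid):
    """Return the version as a tuple (major, minor, patch, custom)."""
    fields = []
    for shift in (24, 16, 8, 0):
        byte = (mimpid >> shift) & 0xFF
        # Each byte holds two BCD digits: high nibble = tens, low nibble = ones.
        fields.append((byte >> 4) * 10 + (byte & 0xF))
    return tuple(fields)


def version_string(mimpid):
    """Format the decoded version, for example 'v1.2.3.12'."""
    return "v" + ".".join(str(field) for field in decode_mimpid(mimpid))
```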
{empty} +
[discrete]
===== **`mhartid`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine hardware thread ID
| Address | `0xf14`
| Reset value | `DEFINED`
| ISA | `Zicsr`
| Description | The `mhartid` CSR is read-only and provides the core's hart ID,
which is assigned via the `HW_THREAD_ID` top generic (<<_processor_top_entity_generics>>).
|=======================
{empty} +
[discrete]
===== **`mconfigptr`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Machine configuration pointer register
| Address | `0xf15`
| Reset value | `0x00000000`
| ISA | `Zicsr`
| Description | The features of this CSR are not implemented yet. The register is read-only and always returns zero.
|=======================
<<<
// ####################################################################################################################
:sectnums:
==== NEORV32-Specific CSRs
[NOTE]
All NEORV32-specific CSRs are mapped to addresses that are explicitly reserved for custom **Machine-Mode, read-only** CSRs
(assured by the RISC-V privileged specifications). Hence, these CSRs can only be accessed when in machine-mode. Any access
outside of machine-mode will raise an illegal instruction exception.
[discrete]
===== **`mxisa`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name        | Machine extended ISA and extensions register
| Address | `0xfc0`
| Reset value | `DEFINED`
| ISA | `Zicsr` & `X`
| Description | The `mxisa` CSR is a NEORV32-specific read-only CSR that helps machine-mode software to
discover ISA sub-extensions and CPU configuration options.
|=======================
.`mxisa` CSR Bits
[cols="^1,^3,^1,<5"]
[options="header",grid="rows"]
|=======================
| Bit | Name [C] | R/W | Description
| 0 | `CSR_MXISA_ZICSR` | r/- | <<_zicsr_isa_extension>> available
| 1 | `CSR_MXISA_ZIFENCEI` | r/- | <<_zifencei_isa_extension>> available
| 2 | `CSR_MXISA_ZMMUL` | r/- | <<_zmmul_isa_extension>> available
| 3 | `CSR_MXISA_ZXCFU` | r/- | <<_zxcfu_isa_extension>> available
| 4 | - | r/- | _reserved_, hardwired to zero
| 5 | `CSR_MXISA_ZFINX` | r/- | <<_zfinx_isa_extension>> available
| 6 | `CSR_MXISA_ZICOND` | r/- | <<_zicond_isa_extension>> available
| 7 | `CSR_MXISA_ZICNTR` | r/- | <<_zicntr_isa_extension>> available
| 8 | `CSR_MXISA_PMP` | r/- | <<_pmp_isa_extension>> available
| 9 | `CSR_MXISA_ZIHPM` | r/- | <<_zihpm_isa_extension>> available
| 10 | `CSR_MXISA_SDEXT` | r/- | <<_sdext_isa_extension>> available
| 11 | `CSR_MXISA_SDTRIG` | r/- | <<_sdtrig_isa_extension>> available
| 19:12 | - | r/- | hardwired to zero
| 20 | `CSR_MXISA_IS_SIM` | r/- | set if CPU is being **simulated** (⚠️ not guaranteed)
| 28:21 | - | r/- | hardwired to zero
| 29 | `CSR_MXISA_RFHWRST` | r/- | full hardware reset of register file available when set (`REGFILE_HW_RST`)
| 30 | `CSR_MXISA_FASTMUL` | r/- | fast multiplication available when set (`FAST_MUL_EN`)
| 31 | `CSR_MXISA_FASTSHIFT` | r/- | fast shifts available when set (`FAST_SHIFT_EN`)
|=======================
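Feature discovery based on this table amounts to testing individual bits. The sketch below maps the bit positions listed above to short labels; the labels and the function name are illustrative choices, and the code is host-side Python, not part of the NEORV32 software framework.

```python
# Bit positions from the `mxisa` CSR bits table; reserved bits are omitted.
MXISA_FLAGS = {
    0: "Zicsr", 1: "Zifencei", 2: "Zmmul", 3: "Zxcfu", 5: "Zfinx", 6: "Zicond",
    7: "Zicntr", 8: "PMP", 9: "Zihpm", 10: "Sdext", 11: "Sdtrig",
    20: "IS_SIM", 29: "RFHWRST", 30: "FASTMUL", 31: "FASTSHIFT",
}


def decode_mxisa(value):
    """Return the labels of all flags set in a raw `mxisa` value, lowest bit first."""
    return [name for bit, name in sorted(MXISA_FLAGS.items()) if (value >> bit) & 1]
```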


@ -0,0 +1,25 @@
= The NEORV32 RISC-V Processor - Datasheet
include::../attrs.adoc[]
:title: [Datasheet] The NEORV32 RISC-V Processor
:icons: font
:imagesdir: img
:toc: left
:title-logo-image: neorv32_logo_riscv.png[pdfwidth=6.25in,align=center]
:favicon: img/icon.png
image::neorv32_logo.png[align=center,link="https://github.com/stnolting/neorv32"]
image::riscv_logo.png[width=350,align=center,link="https://riscv.org/"]
[.text-center]
https://github.com/stnolting/neorv32[image:https://img.shields.io/badge/GitHub-stnolting%2Fneorv32-ffbd00?style=flat-square&logo=github&[title='homepage']]
https://github.com/stnolting/neorv32/blob/main/LICENSE[image:https://img.shields.io/github/license/stnolting/neorv32?longCache=true&style=flat-square[title='license']]
https://gitter.im/neorv32/community[image:https://img.shields.io/badge/Chat-on%20gitter-4db797.svg?longCache=true&style=flat-square&logo=gitter&logoColor=e8ecef[title='chat on gitter']]
https://github.com/stnolting/neorv32/releases/tag/nightly[image:https://img.shields.io/badge/data%20sheet-PDF-ffbd00?longCache=true&style=flat-square&logo=asciidoctor[title='datasheet (pdf)']]
https://github.com/stnolting/neorv32/releases/tag/nightly[image:https://img.shields.io/badge/user%20guide-PDF-ffbd00?longCache=true&style=flat-square&logo=asciidoctor[title='userguide (pdf)']]
https://stnolting.github.io/neorv32/ug[image:https://img.shields.io/badge/-HTML-ffbd00?longCache=true&style=flat-square[title='userguide (html)']]
https://stnolting.github.io/neorv32/sw/files.html[image:https://img.shields.io/badge/doxygen-HTML-ffbd00?longCache=true&style=flat-square&logo=Doxygen[title='doxygen']] +
https://github.com/stnolting/neorv32/releases[image:https://img.shields.io/github/v/release/stnolting/neorv32?longCache=true&style=flat-square&logo=GitHub[title='release']]
https://github.com/stnolting/neorv32/releases[image:https://img.shields.io/github/commits-since/stnolting/neorv32/latest?longCache=true&style=flat-square&logo=GitHub[title='release-commits']]
include::content.adoc[]


@ -0,0 +1,19 @@
= The NEORV32 RISC-V Processor: Datasheet
include::../attrs.adoc[]
include::../attrs.main.adoc[]
<<<
// ####################################################################################################################
.**Documentation**
[TIP]
The online documentation of the project (a.k.a. the **data sheet**) is available on GitHub-pages: https://stnolting.github.io/neorv32/ +
+
The online documentation of the **software framework** is also available on GitHub-pages: https://stnolting.github.io/neorv32/sw/files.html
<<<
// ####################################################################################################################
toc::[]
include::content.adoc[]


@ -0,0 +1,786 @@
<<<
:sectnums:
== On-Chip Debugger (OCD)
The NEORV32 Processor features an _on-chip debugger_ (OCD) implementing the **execution-based debugging** scheme,
which is compatible with the **Minimal RISC-V Debug Specification**. A copy of the specification is
available in `docs/references`.
**Key Features**
* standard JTAG access port
* full control of the CPU: halting, single-stepping and resuming
* indirect access to all core registers (via program buffer)
* indirect access to the whole processor address space (via program buffer)
* trigger module for hardware breakpoints
* compatible with upstream OpenOCD and GDB
**Section Structure**
* <<_debug_transport_module_dtm>>
* <<_debug_module_dm>>
* <<_cpu_debug_mode>>
* <<_trigger_module>>
.GDB + SVD
[TIP]
Together with a third-party plugin the processor's SVD file can be imported right into GDB to allow comfortable
debugging of peripheral/IO devices (see https://github.com/stnolting/neorv32/discussions/656).
.Hands-On Tutorial
[TIP]
A simple example of how to use the NEORV32 on-chip debugger in combination with OpenOCD and the GNU debugger is shown in
section https://stnolting.github.io/neorv32/ug/#_debugging_using_the_on_chip_debugger[Debugging using the On-Chip Debugger]
of the User Guide.
.OCD Security Note
[NOTE]
JTAG access via the OCD is **always authenticated** (`dmstatus.authenticated = 1`). Hence, the entire system can always
be accessed via the on-chip debugger.
The NEORV32 on-chip debugger complex is based on four hardware modules:
.NEORV32 on-chip debugger complex
image::neorv32_ocd_complex.png[align=center]
[start=1]
. <<_debug_transport_module_dtm>> (`rtl/core/neorv32_debug_dtm.vhd`): JTAG access tap to allow an external
adapter to interface with the _debug module (DM)_ using the _debug module interface (dmi)_.
. <<_debug_module_dm>> (`rtl/core/neorv32_debug_dm.vhd`): RISC-V debug module that is configured by the DTM via the _dmi_.
From the CPU's "point of view" this module behaves as another memory-mapped "peripheral" that can be accessed via the
processor-internal bus. The memory-mapped registers provide an internal _data buffer_ for data transfer from/to the DM, a
_code ROM_ containing the "park loop" code, a _program buffer_ to allow the debugger to execute small programs defined by the
DM and a _status register_ that is used to communicate _exception_, _halt_, _resume_ and _execute_ requests/acknowledges from/to the DM.
. CPU <<_cpu_debug_mode>> extension (part of `rtl/core/neorv32_cpu_control.vhd`): This extension provides the "debug execution mode",
which executes the park loop code from the DM. The mode also provides additional CSRs and instructions.
. CPU <<_trigger_module>> (also part of `rtl/core/neorv32_cpu_control.vhd`): This module provides a single _hardware_ breakpoint,
which makes it possible to debug code executed from ROM.
**Theory of Operation**
When debugging the system using the OCD, the debugger issues a halt request to the CPU (via the CPU's
`db_halt_req_i` signal) to make the CPU enter _debug mode_. In this state, the application-defined architectural
state of the system/CPU is "frozen" so the debugger can monitor it without interfering with the actual application.
However, the OCD can also modify the entire architectural state at any time. While in debug mode, the debugger has
full control over the entire CPU and processor.
While in debug mode, the CPU executes the "park loop" code from the code ROM of the DM.
This park loop implements an endless loop, where the CPU polls the memory-mapped _status register_ that is
controlled by the debug module (DM). The flags in this register are used to communicate requests from
the DM and to acknowledge them by the CPU: trigger execution of the program buffer or resume the halted
application. Furthermore, the CPU uses this register to signal that the CPU has halted after a halt request
and to signal that an exception has been triggered while being in debug mode.
<<<
// ####################################################################################################################
:sectnums:
=== Debug Transport Module (DTM)
The debug transport module (VHDL module: `rtl/core/neorv32_debug_dtm.vhd`) provides a JTAG test access port (TAP).
External JTAG access is provided by the following top-level ports.
.JTAG top level signals
[cols="^2,^2,^2,<8"]
[options="header",grid="rows"]
|=======================
| Name | Width | Direction | Description
| `jtag_trst_i` | 1 | in | TAP reset (low-active); this signal is optional, make sure to pull it **high** if not used
| `jtag_tck_i` | 1 | in | serial clock
| `jtag_tdi_i` | 1 | in | serial data input
| `jtag_tdo_o` | 1 | out | serial data output
| `jtag_tms_i` | 1 | in | mode select
|=======================
.Maximum JTAG Clock
[IMPORTANT]
All JTAG signals are synchronized to the processor's clock domain. Hence, no additional clock domain is required for the DTM.
However, this constrains the maximum JTAG clock frequency (`jtag_tck_i`) to be less than or equal to **1/5** of the processor
clock frequency (`clk_i`).
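When configuring a debug adapter, the 1/5 constraint can be checked with a small calculation. The function names below are hypothetical and the code is host-side Python for illustration only.

```python
def max_jtag_clock_hz(cpu_clock_hz):
    """Highest permitted `jtag_tck_i` frequency (integer Hz) for a given `clk_i`."""
    return cpu_clock_hz // 5


def jtag_clock_ok(tck_hz, cpu_clock_hz):
    """True if the JTAG clock is at most 1/5 of the processor clock."""
    # Compare by multiplication to avoid rounding effects of the division.
    return tck_hz * 5 <= cpu_clock_hz
```

For a 100 MHz processor clock this yields an upper bound of 20 MHz for the adapter speed setting.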
.Maintaining JTAG Chain
[NOTE]
If the on-chip debugger is disabled the JTAG serial input `jtag_tdi_i` is directly
connected to the JTAG serial output `jtag_tdo_o` to maintain the JTAG chain.
JTAG accesses are based on a single _instruction register_ `IR`, which is 5 bit wide, and several _data registers_ `DR`
with different sizes. The individual data registers are accessed by writing the according address to the instruction
register. The following table shows the available data registers and their addresses:
.JTAG TAP registers
[cols="^2,^2,^2,<8"]
[options="header",grid="rows"]
|=======================
| Address (via `IR`) | Name | Size (bits) | Description
| `00001` | `IDCODE` | 32 | identifier, hardwired to `0x00000001`
| `10000` | `DTMCS` | 32 | debug transport module control and status register
| `10001` | `DMI` | 41 | debug module interface (_dmi_); 7-bit address, 32-bit read/write data, 2-bit operation (`00` = NOP; `10` = write; `01` = read)
| others | `BYPASS` | 1 | default JTAG bypass register
|=======================
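The 41-bit `DMI` data register can be packed and unpacked as sketched below. The table above only lists the field widths; the field order used here (address in the upper bits, then data, then the operation in bits 1:0) is an assumption taken from the RISC-V debug specification. The function names are hypothetical and the code is host-side Python for illustration only.

```python
# Operation encodings as listed in the JTAG TAP registers table.
DMI_OP_NOP = 0b00
DMI_OP_READ = 0b01
DMI_OP_WRITE = 0b10


def pack_dmi(address, data, op):
    """Build the 41-bit value to shift into the `DMI` register."""
    if not 0 <= address < (1 << 7):
        raise ValueError("DMI address is 7 bits wide")
    if not 0 <= data < (1 << 32):
        raise ValueError("DMI data is 32 bits wide")
    if op not in (DMI_OP_NOP, DMI_OP_READ, DMI_OP_WRITE):
        raise ValueError("unsupported DMI operation")
    return (address << 34) | (data << 2) | op


def unpack_dmi(word):
    """Split a 41-bit `DMI` value into (address, data, op)."""
    return (word >> 34) & 0x7F, (word >> 2) & 0xFFFFFFFF, word & 0b11
```

A debugger front end such as OpenOCD performs this packing internally; the sketch is mainly useful for interpreting raw JTAG traces.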
.`DTMCS` - DTM Control and Status Register
[cols="^2,^3,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit(s) | Name | R/W | Description
| 31:18 | - | r/- | _reserved_, hardwired to zero
| 17 | `dmihardreset` | r/w | setting this bit will reset the debug module interface; this bit auto-clears
| 16 | `dmireset` | r/w | setting this bit will clear the sticky error state; this bit auto-clears
| 15 | - | r/- | _reserved_, hardwired to zero
| 14:12 | `idle` | r/- | recommended idle states (= 0, no idle states required)
| 11:10 | `dmistat` | r/- | DMI status: `00` = no error, `01` = reserved, `10` = operation failed, `11` = failed operation during pending DMI operation
| 9:4   | `abits`        | r/- | number of address bits in `DMI` register (= 7)
| 3:0   | `version`      | r/- | `0001` = DTM is compatible with spec. versions v0.13 and v1.0
|=======================
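The read-only fields of `DTMCS` can be extracted with shifts and masks following the bit ranges in the table above. The function and constant names are hypothetical, and the code is host-side Python for illustration only.

```python
# Write-only control bits (both auto-clear according to the table).
DTMCS_DMIRESET = 1 << 16
DTMCS_DMIHARDRESET = 1 << 17

DMISTAT_TEXT = {
    0b00: "no error",
    0b01: "reserved",
    0b10: "operation failed",
    0b11: "failed operation during pending DMI operation",
}


def decode_dtmcs(value):
    """Decode the status fields of a raw 32-bit `DTMCS` value."""
    return {
        "version": value & 0xF,            # bits 3:0
        "abits": (value >> 4) & 0x3F,      # bits 9:4
        "dmistat": (value >> 10) & 0b11,   # bits 11:10
        "idle": (value >> 12) & 0b111,     # bits 14:12
    }
```

If `dmistat` reports an error, writing `DTMCS_DMIRESET` clears the sticky error state before further DMI accesses are attempted.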
<<<
// ####################################################################################################################
:sectnums:
=== Debug Module (DM)
The debug module "DM" (VHDL module: `rtl/core/neorv32_debug_dm.vhd`) acts as a translation interface between abstract
operations issued by the debugger (application) and the platform-specific debugger (hardware) implementation.
It supports the following features:
* Gives the debugger necessary information about the implementation.
* Allows the hart to be halted/resumed and provides status of the current state.
* Provides abstract read and write access to the halted hart's GPRs.
* Provides access to a reset signal that allows debugging from the very first instruction after reset.
* Provides a Program Buffer to force the hart to execute arbitrary instructions.
* Allows memory access from a hart's point of view.
The NEORV32 DM follows the "Minimal RISC-V External Debug Specification" to provide full debugging capabilities while
keeping resource/area requirements at a minimum. It implements the **execution based debugging scheme** for a
single hart and provides the following hardware features:
* program buffer with 2 entries and implicit `ebreak` instruction afterwards
* no _direct_ bus access; indirect bus access via the CPU using the program buffer
* abstract commands: "access register" plus auto-execution
* no dedicated halt-on-reset capabilities yet (but can be emulated)
.DM Spec. Version
[TIP]
By default, the OCD's debug module supports version 1.0 of the RISC-V debug spec. For backwards compatibility, the DM
can be "downgraded" back to version 0.13 via the `DM_LEGACY_MODE` generic (see <<_processor_top_entity_generics>>).
The DM provides two access "points of view": accesses from the DTM via the _debug module interface (dmi)_ and
accesses from the CPU via the processor-internal bus. From the DTM's point of view, the DM implements a set of
<<_dm_registers>> that are used to control and monitor the actual debugging. From the CPU's point of view, the
DM implements several memory-mapped registers (within the _normal_ address space) that are used for communicating
debugging control and status (<<_dm_cpu_access>>).
:sectnums:
==== DM Registers
The DM is controlled via a set of registers that are accessed via the DTM's _debug module interface_.
The following registers are implemented:
[NOTE]
Write accesses to registers that are not implemented are simply ignored and read accesses will always return zero.
.Available DM registers
[cols="^2,^3,<7"]
[options="header",grid="rows"]
|=======================
| Address | Name | Description
| `0x04` | <<_data0>> | Abstract data 0, used for data transfer between debugger and processor
| `0x10` | <<_dmcontrol>> | Debug module control
| `0x11` | <<_dmstatus>> | Debug module status
| `0x12` | <<_hartinfo>> | Hart information
| `0x16` | <<_abstracts>> | Abstract control and status
| `0x17` | <<_command>> | Abstract command
| `0x18` | <<_abstractauto>> | Abstract command auto-execution
| `0x1d` | `nextdm` | Base address of next DM; reads as zero to indicate there is only one DM
| `0x20` | <<_progbuf, `progbuf0`>> | Program buffer 0
| `0x21` | <<_progbuf, `progbuf1`>> | Program buffer 1
| `0x38` | `sbcs` | System bus access control and status; reads as zero to indicate there is **no** direct system bus access
| `0x40` | <<_haltsum0>> | Halted harts
|=======================
:sectnums!:
===== **`data0`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x04 | **Abstract data 0** | `data0`
3+| Reset value: `0x00000000`
3+| Basic read/write data exchange register to be used with abstract commands (for example to read/write data from/to CPU GPRs).
|======
:sectnums!:
===== **`dmcontrol`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x10 | **Debug module control register** | `dmcontrol`
3+| Reset value: `0x00000000`
3+| Control of the overall debug module and the hart. The following table shows all implemented bits. All remaining bits/bit-fields
are configured as "zero" and are read-only. Writing '1' to these bits/fields will be ignored.
|======
.`dmcontrol` Register Bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description
| 31 | `haltreq` | -/w | set/clear hart halt request
| 30 | `resumereq` | -/w | request hart to resume
| 28 | `ackhavereset` | -/w | write `1` to clear `*havereset` flags
| 1 | `ndmreset` | r/w | put whole system (except OCD) into reset state when `1`
| 0 | `dmactive` | r/w | DM enable; writing `0`-`1` will reset the DM
|=======================
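The bit positions in the table above can be combined into 32-bit write values. The following is a minimal host-side sketch, not part of the NEORV32 sources; the helper name `dmcontrol_word` and its keyword arguments are hypothetical and only illustrate the bit layout.

```python
# Bit positions taken from the `dmcontrol` register table.
HALTREQ = 1 << 31
RESUMEREQ = 1 << 30
ACKHAVERESET = 1 << 28
NDMRESET = 1 << 1
DMACTIVE = 1 << 0


def dmcontrol_word(*, halt=False, resume=False, ack_reset=False,
                   ndmreset=False, active=True):
    """Compose a dmcontrol write value (hypothetical helper)."""
    word = 0
    if halt:
        word |= HALTREQ
    if resume:
        word |= RESUMEREQ
    if ack_reset:
        word |= ACKHAVERESET
    if ndmreset:
        word |= NDMRESET
    if active:
        word |= DMACTIVE
    return word


print(hex(dmcontrol_word(halt=True)))    # 0x80000001
print(hex(dmcontrol_word(resume=True)))  # 0x40000001
```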
:sectnums!:
===== **`dmstatus`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x11 | **Debug module status register** | `dmstatus`
3+| Reset value: `0x00400083`
3+| Current status of the overall debug module and the hart. The entire register is read-only.
|======
.`dmstatus` Register Bits
[cols="^1,^2,<10"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | Description
| 31:23 | _reserved_ | reserved; always zero
| 22 | `impebreak` | always `1`; indicates an implicit `ebreak` instruction after the last program buffer entry
| 21:20 | _reserved_ | reserved; always zero
| 19 | `allhavereset` .2+| `1` when the hart is in reset
| 18 | `anyhavereset`
| 17 | `allresumeack` .2+| `1` when the hart has acknowledged a resume request
| 16 | `anyresumeack`
| 15 | `allnonexistent` .2+| always zero to indicate the hart is always existent
| 14 | `anynonexistent`
| 13 | `allunavail` .2+| `1` when the DM is disabled to indicate the hart is unavailable
| 12 | `anyunavail`
| 11 | `allrunning` .2+| `1` when the hart is running
| 10 | `anyrunning`
| 9 | `allhalted` .2+| `1` when the hart is halted
| 8 | `anyhalted`
| 7 | `authenticated` | always `1`; there is no authentication
| 6 | `authbusy` | always `0`; there is no authentication
| 5 | `hasresethaltreq` | always `0`; halt-on-reset is not supported (directly)
| 4 | `confstrptrvalid` | always `0`; no configuration string available
| 3:0 | `version` | debug spec. version; `0011` (v1.0) or `0010` (v0.13); configured via the `DM_LEGACY_MODE` <<_processor_top_entity_generics>>
|=======================
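As a cross-check of the table above, the documented reset value `0x00400083` can be decoded field by field. This is an illustrative sketch only; `decode_dmstatus` is a hypothetical helper name, and the bit positions are the ones listed in the table.

```python
# Single-bit flags of dmstatus, keyed by bit position (from the table).
DMSTATUS_FLAGS = {
    22: "impebreak",
    19: "allhavereset", 18: "anyhavereset",
    17: "allresumeack", 16: "anyresumeack",
    15: "allnonexistent", 14: "anynonexistent",
    13: "allunavail", 12: "anyunavail",
    11: "allrunning", 10: "anyrunning",
    9: "allhalted", 8: "anyhalted",
    7: "authenticated", 6: "authbusy",
    5: "hasresethaltreq", 4: "confstrptrvalid",
}


def decode_dmstatus(value):
    """Return the version field and the names of all set flags."""
    flags = [name
             for bit, name in sorted(DMSTATUS_FLAGS.items(), reverse=True)
             if (value >> bit) & 1]
    return {"version": value & 0xF, "flags": flags}


print(decode_dmstatus(0x00400083))
# {'version': 3, 'flags': ['impebreak', 'authenticated']}
```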
:sectnums!:
===== **`hartinfo`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x12 | **Hart information** | `hartinfo`
3+| Reset value: _see below_
3+| This register gives information about the hart. The entire register is read-only.
|======
.`hartinfo` Register Bits
[cols="^1,^2,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | Description
| 31:24 | _reserved_ | reserved; always zero
| 23:20 | `nscratch` | `0001`, number of `dscratch*` CPU registers = 1
| 19:17 | _reserved_ | reserved; always zero
| 16 | `dataaccess` | `0`, the `data` registers are shadowed in the hart's address space
| 15:12 | `datasize` | `0001`, number of 32-bit words in the address space dedicated to shadowing the `data` registers (1 register)
| 11:0 | `dataaddr` | = `dm_data_base_c(11:0)`, signed base address of `data` words (see address map in <<_dm_cpu_access>>)
|=======================
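The `dataaddr` field is a signed 12-bit value, so a debugger has to sign-extend it to obtain the full address of the `data` words. The sketch below assumes that `dm_data_base_c` equals `0xffffff80` (the data buffer base address listed in the DM CPU access address map); the helper names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the lowest `bits` bits of `value` as two's complement."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def decode_hartinfo(value):
    """Split a hartinfo value into the fields listed in the table."""
    dataaddr = sign_extend(value & 0xFFF, 12)
    return {
        "nscratch": (value >> 20) & 0xF,
        "dataaccess": (value >> 16) & 1,
        "datasize": (value >> 12) & 0xF,
        "dataaddr": dataaddr,
        "data_base": dataaddr & 0xFFFFFFFF,  # 32-bit address of data0
    }


DM_DATA_BASE = 0xFFFFFF80  # assumption: value of dm_data_base_c
example = (1 << 20) | (1 << 12) | (DM_DATA_BASE & 0xFFF)

info = decode_hartinfo(example)
print(hex(example), info["dataaddr"], hex(info["data_base"]))
# 0x101f80 -128 0xffffff80
```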
:sectnums!:
===== **`abstracts`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x16 | **Abstract control and status** | `abstracts`
3+| Reset value: `0x02000801`
3+| Command execution info and status.
|======
.`abstracts` Register Bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description
| 31:29 | _reserved_ | r/- | reserved; always zero
| 28:24 | `progbufsize` | r/- | always `0010`: size of the program buffer (`progbuf`) = 2 entries
| 23:13 | _reserved_ | r/- | reserved; always zero
| 12 | `busy` | r/- | `1` when a command is being executed
| 11 | `relaxedpriv` | r/- | always `1`: PMP rules are ignored when in debug mode
| 10:8 | `cmderr` | r/w | error during command execution (see below); has to be cleared by writing `111`
| 7:4 | _reserved_ | r/- | reserved; always zero
| 3:0 | `datacount` | r/- | always `0001`: number of implemented `data` registers for abstract commands = 1
|=======================
Error codes in `cmderr` (highest priority first):
* `000` - no error
* `100` - command cannot be executed since hart is not in expected state
* `011` - exception during command execution
* `010` - unsupported command
* `001` - invalid DM register read/write while command is/was executing
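A debugger typically polls `busy` and then checks `cmderr` after each abstract command. The following sketch decodes the documented reset value `0x02000801` and maps the error codes listed above to text; the function and constant names are hypothetical.

```python
# Error codes as listed for the `cmderr` field.
CMDERR_NAMES = {
    0b000: "no error",
    0b001: "invalid DM register read/write while command is/was executing",
    0b010: "unsupported command",
    0b011: "exception during command execution",
    0b100: "hart is not in expected state",
}

# Writing ones to all three cmderr bits clears a pending error.
CMDERR_CLEAR = 0b111 << 8


def decode_abstracts(value):
    """Split an abstract control and status value into its fields."""
    return {
        "progbufsize": (value >> 24) & 0x1F,
        "busy": bool((value >> 12) & 1),
        "relaxedpriv": bool((value >> 11) & 1),
        "cmderr": CMDERR_NAMES.get((value >> 8) & 0x7, "unknown"),
        "datacount": value & 0xF,
    }


print(decode_abstracts(0x02000801))
```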
:sectnums!:
===== **`command`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x17 | **Abstract command** | `command`
3+| Reset value: `0x00000000`
3+| Writing this register will trigger the execution of an abstract command. A new command can only be executed if
`cmderr` is zero. The entire register is write-only (reads will return zero).
|======
[NOTE]
The NEORV32 DM only supports **Access Register** abstract commands. These commands can only access the
hart's GPRs (abstract command register index `0x1000` - `0x101f`).
.`command` Register Bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description / required value
| 31:24 | `cmdtype` | -/w | `00000000` to indicate "access register" command
| 23 | _reserved_ | -/w | reserved, has to be `0` when writing
| 22:20 | `aarsize` | -/w | `010` to indicate 32-bit accesses
| 19 | `aarpostincrement` | -/w | `0`, post-increment is not supported
| 18 | `postexec` | -/w | if set the program buffer is executed _after_ the command
| 17 | `transfer` | -/w | if set the operation in `write` is conducted
| 16 | `write` | -/w | `1`: copy `data0` to `[regno]`, `0`: copy `[regno]` to `data0`
| 15:0 | `regno` | -/w | GPR-access only; has to be `0x1000` - `0x101f`
|=======================
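The required field values from the table above fully determine the command word for a GPR access. The sketch below builds such a word; `access_register_command` is a hypothetical helper and only covers the "access register" command type that the DM supports.

```python
def access_register_command(gpr, *, write, transfer=True, postexec=False):
    """Build an "access register" command word for GPR x0..x31."""
    if not 0 <= gpr <= 31:
        raise ValueError("only GPRs x0..x31 can be accessed")
    word = 0 << 24                # cmdtype = 0: access register
    word |= 0b010 << 20           # aarsize: 32-bit access
    word |= int(postexec) << 18   # run program buffer afterwards
    word |= int(transfer) << 17   # perform the data0 <-> GPR copy
    word |= int(write) << 16      # 1: data0 -> GPR, 0: GPR -> data0
    word |= 0x1000 + gpr          # regno: GPRs start at 0x1000
    return word


print(hex(access_register_command(10, write=False)))  # 0x22100a
print(hex(access_register_command(5, write=True)))    # 0x231005
```

Reading `x10` therefore means writing `0x0022100a` to `command` and then reading the result from `data0`.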
:sectnums!:
===== **`abstractauto`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x18 | **Abstract command auto-execution** | `abstractauto`
3+| Reset value: `0x00000000`
3+| Register to configure when a read/write access to a DM register repeats execution of the last abstract command.
|======
.`abstractauto` Register Bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description
| 17 | `autoexecprogbuf[1]` | r/w | when set reading/writing from/to `progbuf1` will execute `command` again
| 16 | `autoexecprogbuf[0]` | r/w | when set reading/writing from/to `progbuf0` will execute `command` again
| 0 | `autoexecdata[0]` | r/w | when set reading/writing from/to `data0` will execute `command` again
|=======================
:sectnums!:
===== **`progbuf`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x20 | **Program buffer 0** | `progbuf0`
| 0x21 | **Program buffer 1** | `progbuf1`
3+| Reset value: `0x00000013` ("NOP")
3+| Program buffer (two entries) for the DM.
|======
:sectnums!:
===== **`haltsum0`**
[cols="4,27,>7"]
[frame="topbot",grid="none"]
|======
| 0x40 | **Halted harts status** | `haltsum0`
3+| Reset value: `0x00000000`
3+| Hart has halted when according bit is set.
|======
.`haltsum0` Register Bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description
| 0 | `haltsum0[0]` | r/- | Hart is halted when set.
|=======================
:sectnums:
==== DM CPU Access
From the CPU's perspective, the DM behaves as a memory-mapped peripheral. It occupies 256 bytes of the CPU's address
space starting at address `dm_base_c` (see table below). This address space is divided into four sections of 64 bytes
each to provide access to the _park loop code ROM_, the _program buffer_, the _data buffer_ and the _status register_.
The program buffer, the data buffer and the status register do not fully occupy the 64-byte-wide sections and are
mirrored to fill the entire section.
.DM CPU Access - Address Map
[cols="^2,^2,<5"]
[options="header",grid="rows"]
|=======================
| Base address | Actual size | Description
| `0xffffff00` | 64 bytes | ROM for the "park loop" code
| `0xffffff40` | 16 bytes | Program buffer (<<_progbuf>>)
| `0xffffff80` | 4 bytes | Data buffer (<<_data0>>)
| `0xffffffc0` | 4 bytes | Control and <<_status_register>>
|=======================
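The address map can be modeled as four 64-byte sections with the smaller resources repeating inside their section. The sketch below assumes that `dm_base_c` is `0xffffff00` (the first base address in the table) and that mirroring is a plain wrap-around at the actual size; both points are assumptions for illustration, and `resolve` is a hypothetical helper.

```python
DM_BASE = 0xFFFFFF00   # assumption: value of dm_base_c
SECTION_SIZE = 64      # each of the four sections spans 64 bytes

# (name, actual size in bytes) in address order, from the table.
SECTIONS = [("code_rom", 64), ("progbuf", 16), ("data", 4), ("sreg", 4)]


def resolve(addr):
    """Map a CPU address to (section name, offset inside the resource)."""
    offset = addr - DM_BASE
    if not 0 <= offset < SECTION_SIZE * len(SECTIONS):
        raise ValueError("address is outside of the DM address space")
    name, size = SECTIONS[offset // SECTION_SIZE]
    return name, (offset % SECTION_SIZE) % size


print(resolve(0xFFFFFF80))  # ('data', 0)
print(resolve(0xFFFFFF84))  # ('data', 0), mirrored copy of data0
print(resolve(0xFFFFFF54))  # ('progbuf', 4)
```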
.DM Register Access
[IMPORTANT]
All memory-mapped registers of the DM can only be accessed by the CPU if it is actually in debug mode.
Hence, the DM registers are not "visible" for normal CPU operations.
Any CPU access outside of debug mode will raise a bus access fault exception.
.Park Loop Code Sources ("OCD Firmware")
[NOTE]
The assembly sources of the **park loop code** are available in `sw/ocd-firmware/park_loop.S`. Please note that
these sources are not intended to be changed by the user.
:sectnums:
===== Code ROM Entry Points
The park loop code provides two entry points, where the actual code execution can start. These are used to enter
the park loop either when an explicit request has been issued (for example a halt request) or when an exception
has occurred _while executing_ the park loop code itself.
.Park Loop Entry Points
[cols="^6,<4"]
[options="header",grid="rows"]
|=======================
| Address | Description
| `dm_exc_entry_c` (`dm_base_c` + 0) | Exception entry address
| `dm_park_entry_c` (`dm_base_c` + 8) | Normal entry address
|=======================
When the CPU enters or re-enters debug mode (for example via an `ebreak` in the DM's program buffer), it jumps to
the _normal entry point_ that is configured via the `CPU_DEBUG_PARK_ADDR` generic
(<<_cpu_top_entity_generics>>). By default, this generic is set to `dm_park_entry_c`, which is defined in the main
package file. If an exception is encountered during debug mode, the CPU jumps to the address of the _exception
entry point_ configured via the `CPU_DEBUG_EXC_ADDR` generic (<<_cpu_top_entity_generics>>). By default, this generic
is set to `dm_exc_entry_c`, which is also defined in the main package file.
:sectnums:
===== Status Register
The status register provides a direct communication channel between the CPU's debug mode executing the park loop
and the debugger-controlled debug module. This register is used to communicate _requests_, which are issued by the
DM and the according _acknowledges_, which are generated by the CPU.
There are only 4 bits in this register that are used to implement the requests/acknowledges. Each bit occupies the
least-significant bit position of one of the four bytes of the 32-bit register. Thus, the CPU can access each bit individually using _store-byte_ and
_load-byte_ instructions. This eliminates the need for bit-masking in the park loop code, which reduces code size
and speeds up execution.
.DM Status Register - CPU Access
[cols="^1,^3,^2,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name | CPU access <| Description
.2+| 0 | `sreg_halt_ack` | read <| -
| - | write <| Set by the CPU while it is halted (and executing the park loop).
.2+| 8 | `sreg_resume_req` | read <| Set by the DM to request the CPU to resume normal operation.
| `sreg_resume_ack` | write <| Set by the CPU before it starts resuming.
.2+| 16 | `sreg_execute_req` | read <| Set by the DM to request execution of the program buffer.
| `sreg_execute_ack` | write <| Set by the CPU before it starts executing the program buffer.
.2+| 24 | - | read <| -
| `sreg_exception_ack` | write <| Set by the CPU if an exception occurs while being in debug mode.
|=======================
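Because the flags sit at bits 0, 8, 16 and 24, each one maps to its own byte address on a little-endian system. The sketch below computes these addresses, assuming the status register base `0xffffffc0` from the address map and little-endian byte order (as used by RISC-V); the short flag names are illustrative labels, not identifiers from the sources.

```python
SREG_BASE = 0xFFFFFFC0  # status register base from the address map

# Bit position of each request/acknowledge flag (from the table).
SREG_BITS = {
    "halt_ack": 0,
    "resume": 8,
    "execute": 16,
    "exception_ack": 24,
}


def sreg_byte_address(flag):
    """Byte address that a load-byte/store-byte has to use for a flag."""
    return SREG_BASE + SREG_BITS[flag] // 8


for flag in SREG_BITS:
    print(flag, hex(sreg_byte_address(flag)))
```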
<<<
// ####################################################################################################################
:sectnums:
=== CPU Debug Mode
The NEORV32 CPU Debug Mode is compatible with the **Minimal RISC-V Debug Specification 1.0**
`Sdext` (external debug) ISA extension. When enabled via the <<_sdext_isa_extension>> generic (CPU) and/or
the `ON_CHIP_DEBUGGER_EN` generic (Processor), it adds a new CPU operation mode ("debug mode"), three additional CSRs
(section <<_cpu_debug_mode_csrs>>) and one additional instruction (`dret`) to the core.
.ISA Requirements
[IMPORTANT]
The CPU debug mode requires the `Zicsr` and `Zifencei` CPU extensions to be implemented.
The CPU debug-mode is entered on any of the following events:
[start=1]
. The CPU executes an `ebreak` instruction (when in machine-mode and `ebreakm` in <<_dcsr>> is set OR when in user-mode and `ebreaku` in <<_dcsr>> is set).
. A debug halt request is issued by the DM (via CPU signal `db_halt_req_i`, high-active, triggering on rising-edge).
. The CPU completes execution of a single instruction while in single-step debugging mode (enabled if `step` in <<_dcsr>> is set).
. A hardware trigger from the <<_trigger_module>> fires (if `action` in <<_tdata1>> / `mcontrol6` is set).
[NOTE]
From a hardware point of view these entry conditions are special traps that are handled transparently by
the control logic.
**Whenever the CPU enters debug-mode it performs the following operations:**
* wake up the CPU if it was sent to sleep mode by the `wfi` instruction
* move the current program counter to <<_dpc>>
* copy the hart's current privilege level to the `prv` flags in <<_dcsr>>
* set `cause` in <<_dcsr>> according to the cause why debug mode is entered
* **no update** of `mepc`, `mcause`, `mtval` and `mstatus` CSRs
* load the address configured via the CPU's `CPU_DEBUG_PARK_ADDR` (<<_cpu_top_entity_generics>>) generic to the program counter jumping to the
"debugger park loop" code stored in the debug module (DM)
**When the CPU is in debug-mode the following things are important:**
* while in debug mode, the CPU executes the parking loop and - if requested by the DM - the program buffer
* effective CPU privilege level is `machine` mode; any active physical memory protection (PMP) configuration is bypassed
* the `wfi` instruction acts as a `nop` (also during single-stepping)
* if an exception occurs while being in debug mode:
** if the exception was caused by any debug-mode entry action the CPU jumps to the normal entry point (defined by `CPU_DEBUG_PARK_ADDR` generic of the <<_cpu_top_entity_generics>>) of the park loop again (for example when executing `ebreak` while in debug-mode)
** for all other exception sources the CPU jumps to the exception entry point (defined by `CPU_DEBUG_EXC_ADDR` generic of the <<_cpu_top_entity_generics>>) to signal an exception to the DM; the CPU restarts the park loop again afterwards
* interrupts are disabled; however, they will remain pending and will get executed after the CPU has left debug mode
* if the DM makes a resume request, the park loop exits and the CPU leaves debug mode (executing `dret`)
* the standard counters <<_machine_counter_and_timer_csrs>> `[m]cycle[h]` and `[m]instret[h]` are stopped
* all <<_hardware_performance_monitors_hpm_csrs>> are stopped
Debug mode is left either by executing the `dret` instruction or by performing a hardware reset of the CPU.
Executing `dret` outside of debug mode will raise an illegal instruction exception.
**Whenever the CPU leaves debug mode it performs the following operations:**
* set the hart's current privilege level according to the `prv` flags of <<_dcsr>>
* restore the original program counter from <<_dpc>> resuming normal operation
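The entry and exit operations listed above can be summarized as a small behavioral model. This is a simplified sketch and not a description of the actual hardware: the class `Hart` is hypothetical, the park address assumes the default `dm_base_c + 8` with `dm_base_c = 0xffffff00`, and sleep mode, counters and interrupts are left out.

```python
from dataclasses import dataclass

PARK_ADDR = 0xFFFFFF08  # assumption: default CPU_DEBUG_PARK_ADDR
MACHINE = 3             # RISC-V machine privilege level


@dataclass
class Hart:
    pc: int
    priv: int = MACHINE
    debug_mode: bool = False
    dpc: int = 0
    dcsr_prv: int = MACHINE
    dcsr_cause: int = 0

    def enter_debug(self, cause):
        """Model the operations performed on debug-mode entry."""
        self.dpc = self.pc          # save the program counter
        self.dcsr_prv = self.priv   # save the privilege level
        self.dcsr_cause = cause     # record why debug mode was entered
        self.priv = MACHINE         # debug mode runs with machine privileges
        self.pc = PARK_ADDR         # jump to the park loop
        self.debug_mode = True

    def dret(self):
        """Model the operations performed when leaving debug mode."""
        if not self.debug_mode:
            raise RuntimeError("illegal instruction: dret outside debug mode")
        self.priv = self.dcsr_prv   # restore the privilege level
        self.pc = self.dpc          # resume at the saved program counter
        self.debug_mode = False


hart = Hart(pc=0x100, priv=0)
hart.enter_debug(cause=0b011)  # external halt request
hart.dret()
print(hex(hart.pc), hart.priv)  # 0x100 0
```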
:sectnums:
==== CPU Debug Mode CSRs
Two additional CSRs are required by the _Minimal RISC-V Debug Specification_: the debug mode control and status register
`dcsr` and the debug program counter `dpc`. An additional general purpose scratch register for debug mode only
(`dscratch0`) allows faster execution by having a fast-accessible backup register.
[NOTE]
The debug-mode CSRs are only accessible when the CPU is _in_ debug mode. If these CSRs are accessed outside of debug mode
an illegal instruction exception is raised.
:sectnums!:
===== **`dcsr`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Debug control and status register
| Address | `0x7b0`
| Reset value | `0x40000413`
| ISA | `Zicsr` & `Sdext`
| Description | This register is used to configure the debug mode environment and provides additional status information.
|=======================
.Debug control and status register `dcsr` bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description
| 31:28 | `xdebugver` | r/- | `0100` - CPU debug mode is compatible to spec. version 1.0
| 27:16 | - | r/- | `000000000000` - _reserved_
| 15 | `ebreakm` | r/w | `ebreak` instructions in `machine` mode will _enter_ debug mode when set
| 14 | `ebreakh` | r/- | `0` - hypervisor mode not supported
| 13 | `ebreaks` | r/- | `0` - supervisor mode not supported
| 12 | `ebreaku` | r/w | `ebreak` instructions in `user` mode will _enter_ debug mode when set
| 11 | `stepie` | r/- | `0` - IRQs are disabled during single-stepping
| 10 | `stopcount` | r/- | `1` - standard counters and HPMs are stopped when in debug mode
| 9 | `stoptime` | r/- | `0` - timers increment as usual
| 8:6 | `cause` | r/- | cause identifier - why debug mode was entered (see below)
| 5 | - | r/- | `0` - _reserved_
| 4 | `mprven` | r/- | `1` - <<_mstatus>>.mprv is also evaluated when in debug mode
| 3 | `nmip` | r/- | `0` - no non-maskable interrupt is pending
| 2 | `step` | r/w | enable single-stepping when set
| 1:0 | `prv` | r/w | CPU privilege level before/after debug mode
|=======================
Cause codes in `dcsr.cause` (highest priority first):
* `010` - triggered by hardware <<_trigger_module>>
* `001` - executed `EBREAK` instruction
* `011` - external halt request (from DM)
* `100` - return from single-stepping
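The documented reset value `0x40000413` and the cause codes above can be checked with a short decoder. This is an illustrative sketch; `decode_dcsr` is a hypothetical helper that only extracts the fields named in the table.

```python
# Cause codes as listed for dcsr.cause.
DCSR_CAUSES = {
    0b001: "ebreak instruction",
    0b010: "hardware trigger module",
    0b011: "external halt request",
    0b100: "return from single-stepping",
}


def decode_dcsr(value):
    """Split a dcsr value into the fields from the register table."""
    return {
        "xdebugver": (value >> 28) & 0xF,
        "ebreakm": (value >> 15) & 1,
        "ebreaku": (value >> 12) & 1,
        "stopcount": (value >> 10) & 1,
        "stoptime": (value >> 9) & 1,
        "cause": DCSR_CAUSES.get((value >> 6) & 0x7),  # None if no cause
        "mprven": (value >> 4) & 1,
        "step": (value >> 2) & 1,
        "prv": value & 0x3,
    }


reset = decode_dcsr(0x40000413)
print(reset["xdebugver"], reset["stopcount"], reset["mprven"], reset["prv"])
# 4 1 1 3
```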
:sectnums!:
===== **`dpc`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Debug program counter
| Address | `0x7b1`
| Reset value | `CPU_BOOT_ADDR`, CPU boot address, 4-byte aligned (see <<_cpu_top_entity_generics>> and <<_address_space>>)
| ISA | `Zicsr` & `Sdext`
| Description | The register is used to store the current program counter when debug mode is entered. The `dret` instruction will
return to the address stored in `dpc` by automatically moving `dpc` to the program counter.
|=======================
[NOTE]
`dpc[0]` is hardwired to zero. If IALIGN = 32 (i.e. <<_c_isa_extension>> is disabled) then `dpc[1]` is also hardwired to zero.
:sectnums!:
===== **`dscratch0`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Debug scratch register 0
| Address | `0x7b2`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Sdext`
| Description | The register provides a general purpose debug mode-only scratch register.
|=======================
<<<
// ####################################################################################################################
:sectnums:
=== Trigger Module
"Normal" _software_ breakpoints (using GDB's `b`/`break` command) are implemented by temporarily replacing the according
instruction word by an `[c.]ebreak` instruction. However, this is not possible when debugging code that is executed from
read-only memory (for example when debugging programs that are executed via the <<_execute_in_place_module_xip>>).
To circumvent this limitation a hardware trigger logic allows the CPU to (re-)enter debug-mode when instruction execution
reaches a programmable address. These "hardware-assisted breakpoints" are used by GDB's `hb`/`hbreak` commands.
The RISC-V `Sdtrig` ISA extension adds a programmable _trigger module_ to the CPU core that is enabled via the
<<_sdtrig_isa_extension>> generic. The trigger module implements a subset of the features described in the
"RISC-V Debug Specification / Trigger Module" and complies with version v1.0 of the `Sdtrig` spec.
The NEORV32 trigger module features only a _single_ trigger implementing a "type 6 - instruction address match" trigger.
This limitation is permitted by the RISC-V debug spec and is sufficient to **debug code executed from read-only memory (ROM)**.
The trigger module can also be used independently of the CPU debug-mode / `Sdext` ISA extension.
Machine-mode software can use the trigger module to raise a breakpoint exception when instruction execution
reaches a programmed address.
.Trigger Timing
[NOTE]
When enabled the address match trigger will fire **BEFORE** the instruction at the programmed address gets executed.
.MEPC & DPC CSRs
[WARNING]
The breakpoint exception when raised by the trigger module behaves differently than "normal" trapping (see
<<_neorv32_trap_listing>>): <<_mepc>> / <<_dpc>> is set to the address of the next instruction that needs to be
executed to preserve the program flow. A "normal" breakpoint exception would set <<_mepc>> / <<_dpc>> to the address
of the actual `ebreak` instruction itself.
:sectnums:
==== Trigger Module CSRs
The `Sdtrig` ISA extension adds 4 additional CSRs that are accessible from debug-mode and also from machine-mode.
Machine-mode write accesses can be ignored by setting `dmode` in <<_tdata1>>. This is automatically done by the debugger
if it uses the trigger module for implementing a "hardware breakpoint".
:sectnums!:
===== **`tselect`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Trigger select register
| Address | `0x7a0`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Sdtrig`
| Description | This CSR is hardwired to zero indicating there is only one trigger available. Any write access is ignored.
|=======================
:sectnums!:
===== **`tdata1`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Trigger data register 1, visible as trigger "type 6 match control" (`mcontrol6`)
| Address | `0x7a1`
| Reset value | `0x60000048`
| ISA | `Zicsr` & `Sdtrig`
| Description | This CSR is used to configure the address match trigger using the "type 6" format.
|=======================
.Match Control CSR (`tdata1`) Bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description
| 31:28 | `type` | r/- | `0110` - address match trigger type 6
| 27 | `dmode` | r/w | set to ignore write accesses to <<_tdata1>> and <<_tdata2>> from machine-mode; writable from debug-mode only
| 26 | `uncertain` | r/- | `0` - trigger satisfies the configured conditions
| 25 | `hit1` | r/- | `0` - hardwired to zero, only `hit0` is used
| 24 | `vs` | r/- | `0` - VS-mode not supported
| 23 | `vu` | r/- | `0` - VU-mode not supported
| 22 | `hit0` | r/c | set when trigger has fired (**BEFORE** executing the triggering address); must be explicitly cleared by writing zero
| 21 | `select` | r/- | `0` - only address matching is supported
| 20:19 | reserved | r/- | `00` - hardwired to zero
| 18:16 | `size` | r/- | `000` - match accesses of any size
| 15:12 | `action` | r/w | `0000` = breakpoint exception on trigger match, `0001` = enter debug-mode on trigger match
| 11 | `chain` | r/- | `0` - chaining is not supported as there is only one trigger
| 10:7 | `match` | r/- | `0000` - equal-match only
| 6 | `m` | r/- | `1` - trigger enabled when in machine-mode
| 5 | `uncertainen` | r/- | `0` - feature not supported, hardwired to zero
| 4 | `s` | r/- | `0` - supervisor-mode not supported
| 3 | `u` | r/- | `0`/`1` - trigger enabled when in user-mode, set if `U` ISA extension is enabled
| 2 | `execute` | r/w | set to enable trigger matching on instruction address
| 1 | `store` | r/- | `0` - store address/data matching not supported
| 0 | `load` | r/- | `0` - load address/data matching not supported
|=======================
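To illustrate the bit layout, the sketch below decodes the documented reset value `0x60000048` and composes the writable bits for a hardware breakpoint. The helper names are hypothetical, and only the writable fields (`dmode`, `action`, `execute`) are set; the read-only fields are left to the hardware.

```python
def decode_tdata1(value):
    """Extract the fields of tdata1 that are relevant for NEORV32."""
    return {
        "type": (value >> 28) & 0xF,
        "dmode": (value >> 27) & 1,
        "hit0": (value >> 22) & 1,
        "action": (value >> 12) & 0xF,
        "m": (value >> 6) & 1,
        "u": (value >> 3) & 1,
        "execute": (value >> 2) & 1,
    }


def hw_breakpoint_tdata1(*, enter_debug=True, dmode=True):
    """Compose the writable tdata1 bits for an instruction address match."""
    word = 1 << 2                           # execute: match on instruction address
    word |= (1 if enter_debug else 0) << 12  # action: 1 = debug mode, 0 = exception
    word |= int(dmode) << 27                # lock against machine-mode writes
    return word


print(decode_tdata1(0x60000048))
print(hex(hw_breakpoint_tdata1()))  # 0x8001004
```

The breakpoint address itself goes into `tdata2`; with `enter_debug=False` the same trigger raises a breakpoint exception for machine-mode software instead.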
:sectnums!:
===== **`tdata2`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Trigger data register 2
| Address | `0x7a2`
| Reset value | `0x00000000`
| ISA | `Zicsr` & `Sdtrig`
| Description | Since only the "address match trigger" type is supported, this r/w CSR is used to configure the address of the triggering instruction.
Note that the trigger module will fire **before** the instruction at the programmed address gets executed.
|=======================
:sectnums!:
===== **`tinfo`**
[cols="<1,<8"]
[frame="topbot",grid="none"]
|=======================
| Name | Trigger information register
| Address | `0x7a4`
| Reset value | `0x01000006`
| ISA | `Zicsr` & `Sdtrig`
| Description | The CSR shows global trigger information (see below). Any write access is ignored.
|=======================
.Trigger Info CSR (`tinfo`) Bits
[cols="^1,^2,^1,<8"]
[options="header",grid="rows"]
|=======================
| Bit | Name [RISC-V] | R/W | Description
| 31:24 | `version` | r/- | `0x01` - compatible to spec. version v1.0
| 23:16 | reserved | r/- | `0x00` - hardwired to zero
| 15:0 | `info` | r/- | `0x0006` - only type 6 trigger is supported
|=======================

@ -0,0 +1,382 @@
:sectnums:
== Overview
The NEORV32 RISC-V Processor is an open-source RISC-V compatible processor system that is intended as a
*ready-to-go* auxiliary processor within larger SoC designs or as a stand-alone custom / customizable
microcontroller.
The system is highly configurable and provides optional common peripherals like embedded memories,
timers, serial interfaces, general purpose IO ports and an external bus interface to connect custom IP like
memories, NoCs and other peripherals. On-line and in-system debugging is supported by an OpenOCD/gdb
compatible on-chip debugger accessible via JTAG.
Special focus is placed on **execution safety** to provide defined and predictable behavior at any time.
Therefore, the CPU ensures that all memory accesses are acknowledged and no invalid/malformed instructions
are executed. Whenever an unexpected situation occurs, the application code is informed via hardware exceptions.
The software framework of the processor comes with application makefiles, software libraries for all CPU
and processor features, a bootloader, a runtime environment and several example programs - including a port
of the CoreMark MCU benchmark and the official RISC-V architecture test suite. RISC-V GCC is used as
default toolchain (https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains are also provided]).
Check out the processor's **https://stnolting.github.io/neorv32/ug[online User Guide]**
that provides hands-on tutorials to get you started.
**Structure**
[start=2]
. <<_neorv32_processor_soc>>
. <<_neorv32_central_processing_unit_cpu>>
. <<_software_framework>>
. <<_on_chip_debugger_ocd>>
. <<_legal>>
**Annotations**
[WARNING]
Warning
[IMPORTANT]
Important
[NOTE]
Note
[TIP]
Tip
<<<
// ####################################################################################################################
include::rationale.adoc[]
// ####################################################################################################################
:sectnums:
=== Project Key Features
**Project**
* all-in-one package: **CPU** + **SoC** + **Software Framework & Tooling**
* completely described in behavioral, platform-independent VHDL - no vendor- or technology-specific primitives, attributes, macros, libraries, etc. are used at all
* all-Verilog "version" available (auto-generated netlist)
* extensive configuration options for adapting the processor to the requirements of the application
* highly extensible hardware - on CPU, SoC and system level
* aims to be as small as possible while being as RISC-V-compliant as possible - with a reasonable area-vs-performance trade-off
* FPGA friendly (e.g. all internal memories can be mapped to block RAM - including the register file)
* optimized for high clock frequencies to ease timing closure and integration
* from zero to _"hello world!"_ - completely open source and documented
* easy to use even for FPGA/RISC-V starters - intended to _work out of the box_
**NEORV32 CPU (the core)**
* 32-bit RISC-V CPU
* fully compatible to the RISC-V ISA specs. - checked by the https://github.com/stnolting/neorv32-riscof[official RISCOF architecture tests]
* base ISA + privileged ISA + several optional standard and custom ISA extensions
* option to add user-defined RISC-V instructions as custom ISA extension
* rich set of customization options (ISA extensions, design goal: performance / area / energy, tuning options, ...)
* <<_full_virtualization>> capabilities to increase execution safety
* official RISC-V open source architecture ID
**NEORV32 Processor (the SoC)**
* highly-configurable full-scale microcontroller-like processor system
* based on the NEORV32 CPU
* optional standard serial interfaces (UART, TWI, SPI (host and device), 1-Wire)
* optional timers and counters (watchdog, system timer)
* optional general purpose IO and PWM; a native NeoPixel(c)-compatible smart LED interface
* optional embedded memories and caches for data, instructions and bootloader
* optional external memory interface for custom connectivity
* optional execute in-place (XIP) module to execute code directly from an external SPI flash
* optional DMA controller for CPU-independent data transfers
* optional CRC module to check data integrity
* on-chip debugger compatible with OpenOCD and gdb including hardware trigger module
**Software framework**
* GCC-based toolchain - https://github.com/stnolting/riscv-gcc-prebuilt[prebuilt toolchains available]; application compilation based on GNU makefiles
* internal bootloader with serial user interface (via UART)
* core libraries and HAL for high-level usage of the provided functions and peripherals
* processor-specific runtime environment and several example programs
* doxygen-based documentation of the software framework; a deployed version is available at https://stnolting.github.io/neorv32/sw/files.html
* FreeRTOS port + demos available
**Extensibility and Customization**
The NEORV32 processor is designed to ease customization and extensibility and provides several options for adding
application-specific custom hardware modules and accelerators. The three most common options for adding custom
on-chip modules are listed below.
* <<_processor_external_memory_interface_wishbone>> to attach processor-external IP modules
* <<_custom_functions_subsystem_cfs>> for tightly-coupled processor-internal co-processors
* <<_custom_functions_unit_cfu>> for custom RISC-V instructions
[TIP]
A more detailed comparison of the extension/customization options can be found in section
https://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules]
of the user guide.
<<<
// ####################################################################################################################
:sectnums:
=== Project Folder Structure
...................................
neorv32 - Project home folder
├docs - Project documentation
│├datasheet - AsciiDoc sources for the NEORV32 data sheet
│├figures - Figures and logos
│├references - Data sheets and RISC-V specs
│├sources - Sources for the images in 'figures/'
│└userguide - AsciiDoc sources for the NEORV32 user guide
├rtl - VHDL sources
│├core - Core sources of the CPU & SoC
││└mem - SoC-internal memories (default architectures)
│├legacy - Deprecated/legacy HDL modules
│├processor_templates - Pre-configured SoC wrappers
│├system_integration - System wrappers for advanced connectivity
│└test_setups - Minimal test setup "SoCs" used in the User Guide
├sim - Simulation files (see User Guide)
└-sw - Software framework
├bootloader - Sources of the processor-internal bootloader
├common - Linker script, crt0.S start-up code and central makefile
├example - Example programs for the core and the SoC modules
├lib - Processor core library
│├include - Header files (*.h)
│└source - Source files (*.c)
├image_gen - Helper program to generate NEORV32 executables
├ocd_firmware - Firmware for the on-chip debugger's "park loop"
├openocd - OpenOCD configuration files
└svd - Processor system view description file (CMSIS-SVD)
...................................
<<<
// ####################################################################################################################
:sectnums:
=== VHDL File Hierarchy
All necessary VHDL hardware description files are located in the project's `rtl/core` folder. The top entity
of the entire processor including all the required configuration generics is `neorv32_top.vhd`.
.Compile Order
[IMPORTANT]
Most of the RTL sources use **entity instantiation**. Hence, the RTL compile order might be relevant.
The list below shows the hierarchical compile order starting at the top.
.VHDL Library
[IMPORTANT]
All core VHDL files from the list below have to be assigned to a **new library** named `neorv32`.
...................................
┌neorv32_package.vhd - Processor/CPU main VHDL package file
├neorv32_clockgate.vhd - Generic clock gating switch
├neorv32_fifo.vhd - Generic FIFO component
│ ┌neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor (B ext.)
│ ├neorv32_cpu_cp_cfu.vhd - Custom instructions co-processor (Zxcfu ext.)
│ ├neorv32_cpu_cp_cond.vhd - Integer conditional operations (Zicond ext.)
│ ├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx ext.)
│ ├neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor (base ISA)
│ ├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M ext.)
│ │
│┌neorv32_cpu_alu.vhd - Arithmetic/logic unit
│├neorv32_cpu_pmp.vhd - Physical memory protection unit (Smpmp ext.)
│├neorv32_cpu_lsu.vhd - Load/store unit
││ ┌neorv32_cpu_decompressor.vhd - Compressed instructions decoder (C ext.)
│├neorv32_cpu_control.vhd - CPU control, exception system and CSRs
│├neorv32_cpu_regfile.vhd - Data register file
││
├neorv32_cpu.vhd - NEORV32 CPU TOP ENTITY
├mem/neorv32_dmem.default.vhd - *Default* data memory (architecture-only)
├mem/neorv32_imem.default.vhd - *Default* instruction memory (architecture-only)
│┌neorv32_bootloader_image.vhd - Bootloader ROM memory image
├neorv32_boot_rom.vhd - Bootloader ROM
│┌neorv32_application_image.vhd - IMEM application initialization image
├neorv32_imem.entity.vhd - Processor-internal instruction memory (entity-only!)
├neorv32_cfs.vhd - Custom functions subsystem
├neorv32_crc.vhd - Cyclic redundancy check unit
├neorv32_dcache.vhd - Processor-internal data cache
├neorv32_debug_dm.vhd - on-chip debugger: debug module
├neorv32_debug_dtm.vhd - on-chip debugger: debug transfer module
├neorv32_dma.vhd - Direct memory access controller
├neorv32_dmem.entity.vhd - Processor-internal data memory (entity-only!)
├neorv32_gpio.vhd - General purpose input/output port unit
├neorv32_gptmr.vhd - General purpose 32-bit timer
├neorv32_icache.vhd - Processor-internal instruction cache
├neorv32_intercon.vhd - SoC bus infrastructure
├neorv32_mtime.vhd - Machine system timer
├neorv32_neoled.vhd - NeoPixel (TM) compatible smart LED interface
├neorv32_onewire.vhd - One-Wire serial interface controller
├neorv32_pwm.vhd - Pulse-width modulation controller
├neorv32_sdi.vhd - Serial data interface controller (SPI device)
├neorv32_slink.vhd - Stream link interface
├neorv32_spi.vhd - Serial peripheral interface controller (SPI host)
├neorv32_sysinfo.vhd - System configuration information memory
├neorv32_trng.vhd - True random number generator
├neorv32_twi.vhd - Two wire serial interface controller
├neorv32_uart.vhd - Universal async. receiver/transmitter
├neorv32_wdt.vhd - Watchdog timer
├neorv32_wishbone.vhd - External (Wishbone) bus interface
├neorv32_xip.vhd - Execute in place module
├neorv32_xirq.vhd - External interrupt controller
neorv32_top.vhd - NEORV32 PROCESSOR TOP ENTITY
...................................
[NOTE]
The processor-internal instruction and data memories (IMEM and DMEM) are split into two design files each:
a plain entity definition (`neorv32_*mem.entity.vhd`) and the actual architecture definition
(`mem/neorv32_*mem.default.vhd`). The `*.default.vhd` architecture definitions from `rtl/core/mem` provide a _generic_ and
_platform independent_ memory design (inferring embedded memory blocks). You can replace/modify the architecture
source file in order to use platform-specific features (like advanced memory resources) or to improve technology mapping
and/or timing.
<<<
// ####################################################################################################################
:sectnums:
=== FPGA Implementation Results
This section shows **exemplary** FPGA implementation results for the NEORV32 CPU and NEORV32 Processor modules.
[IMPORTANT]
The results are generated by manual synthesis runs. Hence, they might not represent the latest version of the processor.
[discrete]
==== CPU
[cols="<2,<8"]
[grid="topbot"]
|=======================
| HW version: | `1.7.8.5`
| Top entity: | `rtl/core/neorv32_cpu.vhd`
| FPGA: | Intel Cyclone IV E `EP4CE22F17C6`
| Toolchain: | Quartus Prime Lite 21.1
| Constraints: | **no timing constraints**, "balanced optimization", f~max~ from "_Slow 1200mV 0C Model_"
|=======================
[cols="<6,>1,>1,>1,>1,>1"]
[options="header",grid="rows"]
|=======================
| CPU ISA Configuration | LEs | FFs | MEM bits | DSPs | _f~max~_
| `rv32i_Zicsr` | 1223 | 607 | 1024 | 0 | 130 MHz
| `rv32i_Zicsr_Zicntr` | 1578 | 773 | 1024 | 0 | 130 MHz
| `rv32im_Zicsr_Zicntr` | 2087 | 983 | 1024 | 0 | 130 MHz
| `rv32imc_Zicsr_Zicntr` | 2338 | 992 | 1024 | 0 | 130 MHz
| `rv32imcb_Zicsr_Zicntr` | 3175 | 1247 | 1024 | 0 | 130 MHz
| `rv32imcbu_Zicsr_Zicntr` | 3186 | 1254 | 1024 | 0 | 130 MHz
| `rv32imcbu_Zicsr_Zicntr_Zifencei` | 3187 | 1254 | 1024 | 0 | 130 MHz
| `rv32imcbu_Zicsr_Zicntr_Zifencei_Zfinx` | 4450 | 1906 | 1024 | 7 | 123 MHz
| `rv32imcbu_Zicsr_Zicntr_Zifencei_Zfinx_DebugMode` | 4825 | 2018 | 1024 | 7 | 123 MHz
|=======================
.Goal-Driven Optimization
[TIP]
The CPU provides further options to reduce the area footprint or to increase performance.
See section <<_processor_top_entity_generics>> for more information. Also, take a look at the User Guide section
https://stnolting.github.io/neorv32/ug/#_application_specific_processor_configuration[Application-Specific Processor Configuration].
[discrete]
==== Processor - Modules
[cols="<2,<8"]
[grid="topbot"]
|=======================
| HW version: | `1.8.6.7`
| Top entity: | `rtl/core/neorv32_top.vhd`
| FPGA: | Intel Cyclone IV E `EP4CE22F17C6`
| Toolchain: | Quartus Prime Lite 21.1
| Constraints: | **no timing constraints**, "balanced optimization"
|=======================
.Hardware utilization by processor module
[cols="<2,<8,>1,>1,>2,>1"]
[options="header",grid="rows"]
|=======================
| Module | Description | LEs | FFs | MEM bits | DSPs
| BOOT ROM | Bootloader ROM (4kB) | 2 | 2 | 32768 | 0
| Bus switch (core) | _SoC bus infrastructure_ | 28 | 15 | 0 | 0
| Bus switch (DMA) | _SoC bus infrastructure_ | 159 | 9 | 0 | 0
| CFS | Custom functions subsystem footnote:[Resource utilization depends on custom design logic.] | - | - | - | -
| CRC | Cyclic redundancy check unit | 130 | 117 | 0 | 0
| dCACHE | Data cache (4 blocks, 64 bytes per block) | 300 | 167 | 2112 | 0
| DM | On-chip debugger - debug module | 377 | 241 | 0 | 0
| DTM | On-chip debugger - debug transfer module (JTAG) | 262 | 220 | 0 | 0
| DMA | Direct memory access controller | 365 | 291 | 0 | 0
| DMEM | Processor-internal data memory (8kB) | 6 | 2 | 65536 | 0
| Gateway | _SoC bus infrastructure_ | 215 | 91 | 0 | 0
| GPIO | General purpose input/output ports | 102 | 98 | 0 | 0
| GPTMR | General Purpose Timer | 150 | 105 | 0 | 0
| IO Switch | _SoC bus infrastructure_ | 217 | 0 | 0 | 0
| iCACHE | Instruction cache (2x4 blocks, 64 bytes per block) | 458 | 296 | 4096 | 0
| IMEM | Processor-internal instruction memory (16kB) | 7 | 2 | 131072 | 0
| MTIME | Machine system timer | 307 | 166 | 0 | 0
| NEOLED | Smart LED Interface (NeoPixel/WS2812) (FIFO_depth=1) | 171 | 129 | 0 | 0
| ONEWIRE | 1-wire interface | 105 | 77 | 0 | 0
| PWM | Pulse-width modulation controller (4 channels) | 91 | 81 | 0 | 0
| Reservation Set | Reservation set controller for LR/SC instructions | 52 | 33 | 0 | 0
| SDI | Serial data interface | 103 | 77 | 512 | 0
| SLINK | Stream link interface (RX/TX FIFO depth=32) | 96 | 73 | 2048 | 0
| SPI | Serial peripheral interface | 137 | 97 | 1024 | 0
| SYSINFO | System configuration information memory | 11 | 11 | 0 | 0
| TRNG | True random number generator | 140 | 108 | 512 | 0
| TWI | Two-wire interface | 93 | 64 | 0 | 0
| UART0, UART1 | Universal asynchronous receiver/transmitter 0/1 (FIFO_depth=1) | 222 | 142 | 1024 | 0
| WDT | Watchdog timer | 107 | 89 | 0 | 0
| WISHBONE | External memory interface | 122 | 112 | 0 | 0
| XIP | Execute in place module | 369 | 276 | 0 | 0
| XIRQ | External interrupt controller (4 channels) | 35 | 29 | 0 | 0
|=======================
<<<
// ####################################################################################################################
:sectnums:
=== CPU Performance
The performance of the NEORV32 was tested and evaluated using the https://www.eembc.org/coremark/[CoreMark CPU benchmark].
The corresponding sources can be found in the `sw/example/coremark` folder.
The CoreMark score is the number of CoreMark iterations per second; the CoreMarks/MHz value is this score divided by the clock frequency in MHz.
.Configuration
[cols="<2,<8"]
[grid="topbot"]
|=======================
| HW version: | `1.5.7.10`
| Hardware: | 32kB int. IMEM, 16kB int. DMEM, no caches, 100MHz clock
| CoreMark: | 2000 iterations, MEM_METHOD is MEM_STACK
| Compiler: | RISCV32-GCC 10.2.0 (compiled with `march=rv32i mabi=ilp32`)
| Compiler flags: | default but with `-O3`, see makefile
|=======================
.CoreMark results
[cols="<5,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| CPU | CoreMark Score | CoreMarks/MHz | Average CPI
| _small_ (`rv32i_Zicsr_Zifencei`) | 33.89 | **0.3389** | **4.04**
| _medium_ (`rv32imc_Zicsr_Zifencei`) | 62.50 | **0.6250** | **5.34**
| _performance_ (`rv32imc_Zicsr_Zifencei` + perf. options) | 95.23 | **0.9523** | **3.54**
|=======================
The NEORV32 CPU is based on a multi-cycle architecture. Each instruction is executed in a sequence of
several consecutive micro operations. The average CPI (cycles per instruction) depends on the instruction
mix of a specific application and also on the available CPU extensions. More information regarding the execution
time of each implemented instruction can be found in section <<_instruction_sets_and_extensions>>.
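As a rough illustration of how the derived columns of the results table are obtained, the following C sketch computes CoreMarks/MHz from a CoreMark score and the clock frequency, and the average CPI from a cycle count and an instruction count. The function names are made up for this example and are not part of the NEORV32 software framework; it is assumed that the two counts are measured over the same benchmark run.

```c
#include <stdint.h>

/* Hypothetical helper: CoreMarks/MHz is the CoreMark score
 * (iterations per second) divided by the clock frequency in MHz. */
static double coremarks_per_mhz(double iterations_per_second, double clock_hz) {
    return iterations_per_second / (clock_hz / 1.0e6);
}

/* Hypothetical helper: average CPI is the number of elapsed clock cycles
 * divided by the number of executed instructions. */
static double average_cpi(uint64_t cycles, uint64_t instructions) {
    if (instructions == 0u) {
        return 0.0; /* nothing executed, avoid a division by zero */
    }
    return (double)cycles / (double)instructions;
}
```

With the values of the _small_ configuration (score 33.89 at a 100MHz clock), `coremarks_per_mhz(33.89, 100.0e6)` evaluates to approximately 0.3389, which matches the table.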


@ -0,0 +1,108 @@
:sectnums:
=== Rationale
[discrete]
==== Why did you make this?
Processor and CPU architecture designs are fascinating things: they are the magic frontier where software meets hardware.
This project started as something like a _journey_ into this magic realm to understand how things actually work
down on this very low level and evolved over time into a capable system on chip.
But there is more: when I started to dive into the emerging RISC-V ecosystem I felt overwhelmed by the complexity.
As a beginner it is hard to get an overview - especially when you want to set up a minimal platform to tinker with...
Which core to use? How to get the right toolchain? What features do I need? How does booting work? How do I
create an actual executable? How to get that into the hardware? How to customize things? **_Where to start???_**
This project aims to provide a _simple to understand_ and _easy to use_ yet _powerful_ and _flexible_ platform
that targets FPGA and RISC-V beginners as well as advanced users.
[discrete]
==== Why a _soft-core_ processor?
As a matter of fact, soft-core processors _cannot_ compete with discrete (like FPGA hard-macro) processors in terms
of performance, energy efficiency and size. But they do fill a niche in the FPGA design space: for example, soft-core
processors make it possible to implement the _control flow part_ of certain applications (e.g. communication protocol handling)
in software such as plain C. This provides high flexibility as the software can easily be changed, re-compiled and
re-uploaded.
Furthermore, the concept of flexibility applies to all aspects of a soft-core processor. The user can add
_exactly_ the features that are required by the application: additional memories, custom interfaces, specialized
co-processors and even user-defined instructions.
[discrete]
==== Why RISC-V?
image::riscv_logo.png[width=250,align=left]
[quote, RISC-V International, https://riscv.org/about/]
____
RISC-V is a free and open ISA enabling a new era of processor innovation through open standard collaboration.
____
Open-source is a great thing!
While open-source has already become quite popular in _software_, hardware-focused projects still need to catch up.
Admittedly, there has been quite a development, but mainly in terms of _platforms_ and _applications_ (so
schematics, PCBs, etc.). Although processors and CPUs are the heart of almost every digital system, truly
open-source silicon is still a rarity. RISC-V aims to change that - and even though it is _just one approach_, it helps pave
the road for future development.
Furthermore, I highly appreciate the community aspect of RISC-V. The ISA and everything beyond is developed in direct
contact with the community: this includes businesses and professionals but also hobbyists, amateurs and people
that are just curious. Everyone can join discussions and contribute to RISC-V in their very own way.
Finally, I really like the RISC-V ISA itself. It aims to be a clean, orthogonal and "intuitive" ISA that
reflects the basic concepts of _RISC_: simple yet effective.
[discrete]
==== Yet another RISC-V core? What makes it special?
The NEORV32 is not based on another RISC-V core. It was built entirely from the ground up (just following the official
ISA specs). The project does not intend to replace certain RISC-V cores or
just beat existing ones like https://github.com/SpinalHDL/VexRiscv[VexRiscv] in terms of performance or
https://github.com/olofk/serv[SERV] in terms of size. It was built with a different design goal in mind.
The project aims to provide _another option_ in the RISC-V / soft-core design space with a different performance
vs. size trade-off and a different focus: _embrace_ concepts like documentation, platform-independence / portability,
RISC-V compatibility, _extensibility & customization_ and _ease of use_ (see the <<_project_key_features>> below).
Furthermore, the NEORV32 places a special focus on _execution safety_ using <<_full_virtualization>>. The CPU aims to
provide fall-backs for _everything that could go wrong_. This includes malformed instruction words, privilege escalations
and even memory accesses that are checked for address space holes and deterministic response times of memory-mapped
devices. Precise exceptions allow a defined and fully-synchronized state of the CPU at any time and in every situation.
[discrete]
==== A multi-cycle architecture?!?!
Most mainstream CPUs out there are pipelined architectures to increase throughput. In contrast, most CPUs used for
teaching are single-cycle designs since they are probably the easiest to understand. But what about the
multi-cycle architectures?
In terms of energy, throughput, area and maximal clock frequency multi-cycle architectures are somewhere in between
single-cycle and fully-pipelined designs: they provide higher throughput and clock speed when compared to their
single-cycle counterparts while having less hardware complexity (= area) than fully-pipelined designs. I decided to
use the multi-cycle approach for the following reasons:
* Multi-cycle architectures are quite small! There is no need for pipeline hazard detection and resolution logic
(e.g. forwarding). Furthermore, you can "re-use" parts of the core to do several tasks (e.g. the ALU is used for the
actual data processing, but also for address generation, branch condition check and branch target computation).
* Single-cycle architectures require memories that can be read asynchronously - something that is not feasible to implement
in real-world applications (e.g. FPGA block RAM is entirely synchronous). Furthermore, such designs usually have a very
long critical path, which tremendously reduces the maximal operating frequency.
* Pipelined designs increase performance by having several instructions "in flight" at the same time. But this also means
there is some kind of "out-of-order" behavior: if an instruction at the end of the pipeline causes an exception,
all the instructions in earlier stages have to be invalidated. Potential architecture state changes have to be _undone_,
requiring additional (exception-handling) logic. In a multi-cycle architecture this situation cannot occur because only a
single instruction is "in flight" at a time.
* Having only a single instruction in flight not only reduces hardware costs, it also simplifies
simulation/verification/debugging, state preservation/restoring during exceptions and extensibility (no need to care
about pipeline hazards) - but of course at the cost of reduced throughput.
To counteract the loss of performance implied by a _pure_ multi-cycle architecture, the NEORV32 CPU uses a _mixed_
approach: instruction fetch (front-end) and instruction execution (back-end) are de-coupled to operate independently
of each other. Data is interchanged via a queue building a simple 2-stage pipeline. Each "pipeline" stage in turn is
implemented as multi-cycle architecture to simplify the hardware and to provide _precise_ state control (e.g. during
exceptions).


@ -0,0 +1,777 @@
// ####################################################################################################################
:sectnums:
== NEORV32 Processor (SoC)
The NEORV32 Processor is based on the NEORV32 CPU. Together with common peripheral
interfaces and embedded memories it provides a RISC-V-based full-scale microcontroller-like SoC platform.
.The NEORV32 Processor (Block Diagram)
image::neorv32_processor.png[align=center]
**Section Structure**
* <<_processor_top_entity_signals>> and <<_processor_top_entity_generics>>
* <<_processor_clocking>> and <<_processor_reset>>
* <<_processor_interrupts>>
* <<_address_space>> and <<_boot_configuration>>
* <<_processor_internal_modules>>
**Key Features**
* _optional_ processor-internal data and instruction memories (<<_data_memory_dmem,**DMEM**>>/<<_instruction_memory_imem,**IMEM**>>)
* _optional_ caches (<<_processor_internal_instruction_cache_icache,**iCACHE**>>/<<_processor_internal_data_cache_dcache,**dCACHE**>>)
* _optional_ internal bootloader (<<_bootloader_rom_bootrom,**BOOTROM**>>) with UART console & SPI flash boot option
* _optional_ machine system timer (<<_machine_system_timer_mtime,**MTIME**>>), RISC-V-compatible
* _optional_ two independent universal asynchronous receivers and transmitters (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0,**UART0**>>,
<<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,**UART1**>>) with optional hardware flow control (RTS/CTS)
* _optional_ serial peripheral interface host controller (<<_serial_peripheral_interface_controller_spi,**SPI**>>) with 8 dedicated CS lines
* _optional_ 8-bit serial data device interface (<<_serial_data_interface_controller_sdi,**SDI**>>)
* _optional_ two wire serial interface controller (<<_two_wire_serial_interface_controller_twi,**TWI**>>), compatible to the I²C standard
* _optional_ general purpose parallel IO port (<<_general_purpose_input_and_output_port_gpio,**GPIO**>>), 64xOut, 64xIn
* _optional_ 32-bit external bus interface, Wishbone b4 / AXI4-Lite compatible (<<_processor_external_memory_interface_wishbone,**WISHBONE**>>)
* _optional_ watchdog timer (<<_watchdog_timer_wdt,**WDT**>>)
* _optional_ PWM controller with up to 12 channels & 8-bit duty cycle resolution (<<_pulse_width_modulation_controller_pwm,**PWM**>>)
* _optional_ ring-oscillator-based true random number generator (<<_true_random_number_generator_trng,**TRNG**>>)
* _optional_ custom functions subsystem for custom co-processor extensions (<<_custom_functions_subsystem_cfs,**CFS**>>)
* _optional_ NeoPixel(TM)/WS2812-compatible smart LED interface (<<_smart_led_interface_neoled,**NEOLED**>>)
* _optional_ external interrupt controller with up to 32 channels (<<_external_interrupt_controller_xirq,**XIRQ**>>)
* _optional_ general purpose 32-bit timer (<<_general_purpose_timer_gptmr,**GPTMR**>>) with capture input
* _optional_ execute in-place module (<<_execute_in_place_module_xip,**XIP**>>)
* _optional_ 1-wire serial interface controller (<<_one_wire_serial_interface_controller_onewire,**ONEWIRE**>>), compatible to the 1-wire standard
* _optional_ autonomous direct memory access controller (<<_direct_memory_access_controller_dma,**DMA**>>)
* _optional_ stream link interface (<<_stream_link_interface_slink,**SLINK**>>), AXI4-Stream compatible
* _optional_ cyclic redundancy check unit (<<_cyclic_redundancy_check_crc,**CRC**>>)
* _optional_ on-chip debugger with JTAG TAP (<<_on_chip_debugger_ocd,**OCD**>>)
* system configuration information memory to check HW configuration via software (<<_system_configuration_information_memory_sysinfo,**SYSINFO**>>)
<<<
// ####################################################################################################################
:sectnums:
=== Processor Top Entity - Signals
The following table shows all interface signals of the processor top entity (`rtl/core/neorv32_top.vhd`).
All signals are of type `std_ulogic` or `std_ulogic_vector`, respectively.
.Default Values of Inputs
[NOTE]
All _optional_ input signals provide default values in case they are not explicitly assigned during instantiation.
The weak driver strengths of VHDL (`'L'` and `'H'`) are used to model a pull-down or pull-up resistor.
.Configurable Amount of Channels
[NOTE]
Some peripherals allow the number of implemented channels to be configured by a generic (for example the number
of PWM channels). The corresponding input/output signals have a fixed size regardless of the actually configured
number of channels. If fewer than the maximum number of channels are configured, only the LSB-aligned channels are used:
in case of an _input port_ the remaining bits/channels are left unconnected; in case of an _output port_ the remaining
bits/channels are hardwired to zero.
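The LSB-aligned channel mapping can be pictured with a small software model. The C sketch below is only an illustration of the rule described in this note and does not reproduce the actual VHDL implementation; the function names and the `num_channels` parameter (standing in for the channel-count generic) are hypothetical.

```c
#include <stdint.h>

/* Bit mask of the LSB-aligned channels that are actually implemented.
 * port_width is the fixed width of the top entity port (e.g. 12 for pwm_o),
 * num_channels is a hypothetical stand-in for the channel-count generic. */
static uint32_t implemented_channel_mask(uint32_t num_channels, uint32_t port_width) {
    if (num_channels > port_width) {
        num_channels = port_width; /* cannot implement more channels than port bits */
    }
    if (num_channels >= 32u) {
        return 0xFFFFFFFFu; /* avoid an undefined shift by the full word width */
    }
    return (1u << num_channels) - 1u;
}

/* Model of an output port: the bits of unimplemented channels read as zero. */
static uint32_t model_output_port(uint32_t raw, uint32_t num_channels, uint32_t port_width) {
    return raw & implemented_channel_mask(num_channels, port_width);
}
```

For example, with 4 of the 12 PWM channels configured, `implemented_channel_mask(4u, 12u)` selects bits 3 down to 0, so only `pwm_o(3 downto 0)` would carry PWM signals in this model while the upper 8 bits stay at zero.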
.Tri-State Interfaces
[NOTE]
Some interfaces (like the TWI and the 1-Wire bus) require tri-state drivers in the design's top module.
.NEORV32 Processor Signal List
[cols="<3,^1,^1,^1,<8"]
[options="header",grid="rows"]
|=======================
| Name | Width | Direction | Default | Description
5+^| **Global Control (<<_processor_clocking>> and <<_processor_reset>>)**
| `clk_i` | 1 | in | none | global clock line, all registers triggering on rising edge
| `rstn_i` | 1 | in | none | global reset, asynchronous, **low-active**
5+^| **JTAG Access Port for <<_on_chip_debugger_ocd>>**
| `jtag_trst_i` | 1 | in | `'H'` | TAP reset, low-active (optional)
| `jtag_tck_i` | 1 | in | `'L'` | serial clock
| `jtag_tdi_i` | 1 | in | `'L'` | serial data input
| `jtag_tdo_o` | 1 | out | - | serial data output
| `jtag_tms_i` | 1 | in | `'L'` | mode select
5+^| **<<_processor_external_memory_interface_wishbone>>**
| `wb_tag_o` | 3 | out | - | tag (access type identifier)
| `wb_adr_o` | 32 | out | - | destination address
| `wb_dat_i` | 32 | in | `'L'` | read data
| `wb_dat_o` | 32 | out | - | write data
| `wb_we_o` | 1 | out | - | write enable ('0' = read transfer)
| `wb_sel_o` | 4 | out | - | byte enable
| `wb_stb_o` | 1 | out | - | strobe
| `wb_cyc_o` | 1 | out | - | valid cycle
| `wb_lock_o` | 1 | out | - | exclusive access request
| `wb_ack_i` | 1 | in | `'L'` | transfer acknowledge
| `wb_err_i` | 1 | in | `'L'` | transfer error
5+^| **<<_stream_link_interface_slink>>**
| `slink_rx_dat_i` | 32 | in | `'L'` | RX data
| `slink_rx_val_i` | 1 | in | `'L'` | RX data valid
| `slink_rx_lst_i` | 1 | in | `'L'` | RX last element of stream
| `slink_rx_rdy_o` | 1 | out | - | RX ready to receive
| `slink_tx_dat_o` | 32 | out | - | TX data
| `slink_tx_val_o` | 1 | out | - | TX data valid
| `slink_tx_lst_o` | 1 | out | - | TX last element of stream
| `slink_tx_rdy_i` | 1 | in | `'L'` | TX allowed to send
5+^| **<<_execute_in_place_module_xip>>**
| `xip_csn_o` | 1 | out | - | chip select, low-active
| `xip_clk_o` | 1 | out | - | serial clock
| `xip_dat_i` | 1 | in | `'L'` | serial data input
| `xip_dat_o` | 1 | out | - | serial data output
5+^| **<<_general_purpose_input_and_output_port_gpio>>**
| `gpio_o` | 64 | out | - | general purpose parallel output
| `gpio_i` | 64 | in | `'L'` | general purpose parallel input
5+^| **<<_primary_universal_asynchronous_receiver_and_transmitter_uart0>>**
| `uart0_txd_o` | 1 | out | - | serial transmitter
| `uart0_rxd_i` | 1 | in | `'L'` | serial receiver
| `uart0_rts_o` | 1 | out | - | RX ready to receive new char
| `uart0_cts_i` | 1 | in | `'L'` | TX allowed to start sending, low-active
5+^| **<<_secondary_universal_asynchronous_receiver_and_transmitter_uart1>>**
| `uart1_txd_o` | 1 | out | - | serial transmitter
| `uart1_rxd_i` | 1 | in | `'L'` | serial receiver
| `uart1_rts_o` | 1 | out | - | RX ready to receive new char
| `uart1_cts_i` | 1 | in | `'L'` | TX allowed to start sending, low-active
5+^| **<<_serial_peripheral_interface_controller_spi>>**
| `spi_clk_o` | 1 | out | - | controller clock line
| `spi_dat_o` | 1 | out | - | serial data output
| `spi_dat_i` | 1 | in | `'L'` | serial data input
| `spi_csn_o` | 8 | out | - | select (low-active)
5+^| **<<_serial_data_interface_controller_sdi>>**
| `sdi_clk_i` | 1 | in | `'L'` | controller clock line
| `sdi_dat_o` | 1 | out | - | serial data output
| `sdi_dat_i` | 1 | in | `'L'` | serial data input
| `sdi_csn_i` | 1 | in | `'H'` | chip select, low-active
5+^| **<<_two_wire_serial_interface_controller_twi>>**
| `twi_sda_i` | 1 | in | `'H'` | serial data line sense input
| `twi_sda_o` | 1 | out | - | serial data line output (pull low only)
| `twi_scl_i` | 1 | in | `'H'` | serial clock line sense input
| `twi_scl_o` | 1 | out | - | serial clock line output (pull low only)
5+^| **<<_one_wire_serial_interface_controller_onewire>>**
| `onewire_i` | 1 | in | `'H'` | 1-wire bus sense input
| `onewire_o` | 1 | out | - | 1-wire bus output (pull low only)
5+^| **<<_pulse_width_modulation_controller_pwm>>**
| `pwm_o` | 12 | out | - | pulse-width modulated channels
5+^| **<<_custom_functions_subsystem_cfs>>**
| `cfs_in_i` | 32 | in | `'L'` | custom CFS input signal conduit
| `cfs_out_o` | 32 | out | - | custom CFS output signal conduit
5+^| **<<_smart_led_interface_neoled>>**
| `neoled_o` | 1 | out | - | asynchronous serial data output
5+^| **<<_machine_system_timer_mtime>>**
| `mtime_time_o` | 64 | out | - | MTIME system time output
5+^| **<<_general_purpose_timer_gptmr>>**
| `gptmr_trig_i` | 1 | in | `'L'` | timer capture input
5+^| **<<_external_interrupt_controller_xirq>>**
| `xirq_i` | 32 | in | `'L'` | external interrupt requests
5+^| **RISC-V Machine-Mode <<_processor_interrupts>>**
| `mtime_irq_i` | 1 | in | `'L'` | machine timer interrupt (RISC-V), high-level-active
| `msw_irq_i` | 1 | in | `'L'` | machine software interrupt (RISC-V), high-level-active
| `mext_irq_i` | 1 | in | `'L'` | machine external interrupt (RISC-V), high-level-active
|=======================
<<<
// ####################################################################################################################
:sectnums:
=== Processor Top Entity - Generics
This section lists all configuration generics of the NEORV32 processor top entity (`rtl/core/neorv32_top.vhd`).
.Customization
[TIP]
The NEORV32 generics allow you to configure the system according to your needs. The generics are
used to control the implementation of certain CPU extensions and peripheral modules and even allow you to
optimize the system for certain design goals like minimal area or maximum performance.
.Default Values
[NOTE]
All _optional_ configuration generics provide default values in case they are not explicitly assigned during instantiation.
.Software Discovery of Configuration
[TIP]
Software can determine the actual CPU configuration via the <<_misa>> and <<_mxisa>> CSRs. The SoC/processor configuration
can be determined via the <<_system_configuration_information_memory_sysinfo, SYSINFO>> memory-mapped registers.
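As a minimal sketch of such a software check, the following C function tests whether a single-letter ISA extension is flagged in a `misa` value. It relies only on the standard RISC-V encoding, where bit 0 of `misa` corresponds to `A`, bit 1 to `B`, and so on up to bit 25 for `Z`. The function name is hypothetical and not part of the NEORV32 software framework, and the `misa` value is passed in as a parameter here; on the target it would first have to be read from the CSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the single-letter extension 'ext' ('A'..'Z') is flagged in
 * the given misa value (standard RISC-V layout: bit 0 = 'A', bit 1 = 'B', ...). */
static bool misa_has_extension(uint32_t misa, char ext) {
    if (ext < 'A' || ext > 'Z') {
        return false; /* only single upper-case letters are encoded in misa */
    }
    return ((misa >> (uint32_t)(ext - 'A')) & 1u) != 0u;
}
```

For instance, a value of `0x40001104` has the bits for `C` (bit 2), `I` (bit 8) and `M` (bit 12) set, so `misa_has_extension(0x40001104u, 'M')` returns `true` while `misa_has_extension(0x40001104u, 'A')` returns `false`. Multi-letter extensions such as `Zicntr` are not covered by this check.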
.Excluded Modules and Extensions
[NOTE]
If optional modules (like CPU extensions or peripheral devices) are not enabled the according hardware
will not be synthesized at all. Hence, the disabled modules do not increase area and power requirements
and do not impact timing.
.Table Abbreviations
[NOTE]
The generic type "`suv(x:y)`" is an abbreviation for "`std_ulogic_vector(x downto y)`".
.NEORV32 Processor Generic List
[cols="<3,^2,^2,<8"]
[options="header",grid="rows"]
|=======================
| Name | Type | Default | Description
4+^| **General**
| `CLOCK_FREQUENCY` | natural | - | The clock frequency of the processor's `clk_i` input port in Hertz (Hz).
| `CLOCK_GATING_EN` | boolean | false | Enable clock gating when CPU is in sleep mode (see sections <<_sleep_mode>> and <<_processor_clocking>>).
| `INT_BOOTLOADER_EN` | boolean | false | Implement the processor-internal <<_bootloader_rom_bootrom>>, pre-initialized with the default <<_bootloader>> image.
| `HART_ID` | suv(31:0) | 0x00000000 | The hardware thread ID (hart ID) of the CPU (passed to <<_mhartid>> CSR).
| `VENDOR_ID` | suv(31:0) | 0x00000000 | JEDEC ID (passed to <<_mvendorid>> CSR).
4+^| **<<_on_chip_debugger_ocd>>**
| `ON_CHIP_DEBUGGER_EN` | boolean | false | Implement the on-chip debugger and the CPU debug mode.
| `DM_LEGACY_MODE` | boolean | false | Debug module spec. version: `false` = v1.0, `true` = v0.13 (legacy mode).
4+^| **CPU <<_instruction_sets_and_extensions>>**
| `CPU_EXTENSION_RISCV_A` | boolean | false | Enable <<_a_isa_extension>> (atomic memory accesses).
| `CPU_EXTENSION_RISCV_B` | boolean | false | Enable <<_b_isa_extension>> (bit-manipulation).
| `CPU_EXTENSION_RISCV_C` | boolean | false | Enable <<_c_isa_extension>> (compressed instructions).
| `CPU_EXTENSION_RISCV_E` | boolean | false | Enable <<_e_isa_extension>> (reduced register file size).
| `CPU_EXTENSION_RISCV_M` | boolean | false | Enable <<_m_isa_extension>> (hardware-based integer multiplication and division).
| `CPU_EXTENSION_RISCV_U` | boolean | false | Enable <<_u_isa_extension>> (less-privileged user mode).
| `CPU_EXTENSION_RISCV_Zfinx` | boolean | false | Enable <<_zfinx_isa_extension>> (single-precision floating-point unit).
| `CPU_EXTENSION_RISCV_Zicntr` | boolean | true | Enable <<_zicntr_isa_extension>> (CPU base counters).
| `CPU_EXTENSION_RISCV_Zicond` | boolean | false | Enable <<_zicond_isa_extension>> (integer conditional operations).
| `CPU_EXTENSION_RISCV_Zihpm` | boolean | false | Enable <<_zihpm_isa_extension>> (hardware performance monitors).
| `CPU_EXTENSION_RISCV_Zmmul` | boolean | false | Enable <<_zmmul_isa_extension>> (hardware-based integer multiplication).
| `CPU_EXTENSION_RISCV_Zxcfu` | boolean | false | Enable NEORV32-specific <<_zxcfu_isa_extension>> (custom RISC-V instructions).
4+^| **CPU <<_architecture>> Tuning Options**
| `FAST_MUL_EN` | boolean | false | Implement fast but large full-parallel multipliers (trying to infer DSP blocks); see section <<_cpu_arithmetic_logic_unit>>.
| `FAST_SHIFT_EN` | boolean | false | Implement fast but large full-parallel barrel shifters; see section <<_cpu_arithmetic_logic_unit>>.
| `REGFILE_HW_RST` | boolean | false | Implement full hardware reset for register file (prevent inferring of BRAM); see section <<_cpu_register_file>>.
4+^| **Physical Memory Protection (<<_pmp_isa_extension>>)**
| `PMP_NUM_REGIONS` | natural | 0 | Number of implemented PMP regions (0..16).
| `PMP_MIN_GRANULARITY` | natural | 4 | Minimal region granularity in bytes. Has to be a power of two, min 4.
| `PMP_TOR_MODE_EN` | boolean | true | Implement support for top-of-region (TOR) mode.
| `PMP_NAP_MODE_EN` | boolean | true | Implement support for naturally-aligned power-of-two (NAPOT & NA4) modes.
4+^| **Hardware Performance Monitors (<<_zihpm_isa_extension>>)**
| `HPM_NUM_CNTS` | natural | 0 | Number of implemented hardware performance monitor counters (0..13).
| `HPM_CNT_WIDTH` | natural | 40 | Total LSB-aligned size of each HPM counter. Min 0, max 64.
4+^| **Atomic Memory Access Reservation Set Granularity (<<_a_isa_extension>>)**
| `AMO_RVS_GRANULARITY` | natural | 4 | Size in bytes, has to be a power of 2, min 4.
4+^| **Internal <<_instruction_memory_imem>>**
| `MEM_INT_IMEM_EN` | boolean | false | Implement the processor-internal instruction memory.
| `MEM_INT_IMEM_SIZE` | natural | 16*1024 | Size in bytes of the processor internal instruction memory (use a power of 2).
4+^| **Internal <<_data_memory_dmem>>**
| `MEM_INT_DMEM_EN` | boolean | false | Implement the processor-internal data memory.
| `MEM_INT_DMEM_SIZE` | natural | 8*1024 | Size in bytes of the processor-internal data memory (use a power of 2).
4+^| **<<_processor_internal_instruction_cache_icache>>**
| `ICACHE_EN` | boolean | false | Implement the instruction cache.
| `ICACHE_NUM_BLOCKS` | natural | 4 | Number of blocks ("pages" or "lines"). Has to be a power of two.
| `ICACHE_BLOCK_SIZE` | natural | 64 | Size in bytes of each block. Has to be a power of two.
| `ICACHE_ASSOCIATIVITY` | natural | 1 | Associativity (number of sets). Allowed configurations: `1` = 1 set, direct mapped; `2` = 2-way set-associative.
4+^| **<<_processor_internal_data_cache_dcache>>**
| `DCACHE_EN` | boolean | false | Implement the data cache.
| `DCACHE_NUM_BLOCKS` | natural | 4 | Number of blocks ("pages" or "lines"). Has to be a power of two.
| `DCACHE_BLOCK_SIZE` | natural | 64 | Size in bytes of each block. Has to be a power of two.
4+^| **<<_processor_external_memory_interface_wishbone>>**
| `MEM_EXT_EN` | boolean | false | Implement the external bus interface.
| `MEM_EXT_TIMEOUT` | natural | 255 | Clock cycles after which a pending external bus access will auto-terminate and raise a bus fault exception.
| `MEM_EXT_PIPE_MODE` | boolean | false | Use _standard_ ("classic") Wishbone protocol when false. Use _pipelined_ Wishbone protocol when true.
| `MEM_EXT_BIG_ENDIAN` | boolean | false | Use BIG endian data order interface for external bus.
| `MEM_EXT_ASYNC_RX` | boolean | false | Disable input registers when true.
| `MEM_EXT_ASYNC_TX` | boolean | false | Disable output registers when true.
4+^| **<<_execute_in_place_module_xip>>**
| `XIP_EN` | boolean | false | Implement the execute in-place module.
| `XIP_CACHE_EN` | boolean | false | Implement XIP cache.
| `XIP_CACHE_NUM_BLOCKS` | natural | 8 | Number of blocks in XIP cache. Has to be a power of two.
| `XIP_CACHE_BLOCK_SIZE` | natural | 256 | Number of bytes per XIP cache block. Has to be a power of two, min 4.
4+^| **<<_external_interrupt_controller_xirq>>**
| `XIRQ_NUM_CH` | natural | 0 | Number of channels of the external interrupt controller. Valid values are 0..32.
| `XIRQ_TRIGGER_TYPE` | suv(31:0) | 0xFFFFFFFF | Trigger type (one bit per channel): `0` = level-triggered, `1` = edge-triggered.
| `XIRQ_TRIGGER_POLARITY` | suv(31:0) | 0xFFFFFFFF | Trigger polarity (one bit per channel): `0` = low-level/falling-edge, `1` = high-level/rising-edge.
4+^| **Peripheral/IO Modules**
| `IO_GPIO_NUM` | natural | 0 | Number of general purpose input/output pairs of the <<_general_purpose_input_and_output_port_gpio>>.
| `IO_MTIME_EN` | boolean | false | Implement the <<_machine_system_timer_mtime>>.
| `IO_UART0_EN` | boolean | false | Implement the <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>>.
| `IO_UART0_RX_FIFO` | natural | 1 | UART0 RX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_UART0_TX_FIFO` | natural | 1 | UART0 TX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_UART1_EN` | boolean | false | Implement the <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1>>.
| `IO_UART1_RX_FIFO` | natural | 1 | UART1 RX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_UART1_TX_FIFO` | natural | 1 | UART1 TX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_SPI_EN` | boolean | false | Implement the <<_serial_peripheral_interface_controller_spi>>.
| `IO_SPI_FIFO` | natural | 1 | Depth of the <<_serial_peripheral_interface_controller_spi>> FIFO. Has to be a power of two, min 1, max 32768.
| `IO_SDI_EN` | boolean | false | Implement the <<_serial_data_interface_controller_sdi>>.
| `IO_SDI_FIFO` | natural | 1 | Depth of the <<_serial_data_interface_controller_sdi>> FIFO. Has to be a power of two, min 1, max 32768.
| `IO_TWI_EN` | boolean | false | Implement the <<_two_wire_serial_interface_controller_twi>>.
| `IO_PWM_NUM_CH` | natural | 0 | Number of channels of the <<_pulse_width_modulation_controller_pwm>> to implement (0..12).
| `IO_WDT_EN` | boolean | false | Implement the <<_watchdog_timer_wdt>>.
| `IO_TRNG_EN` | boolean | false | Implement the <<_true_random_number_generator_trng>>.
| `IO_TRNG_FIFO` | natural | 1 | Depth of the TRNG data FIFO. Has to be a power of two, min 1, max 32768.
| `IO_CFS_EN` | boolean | false | Implement the <<_custom_functions_subsystem_cfs>>.
| `IO_CFS_CONFIG` | suv(31:0) | 0x00000000 | "Conduit" generic to pass user-defined flags to the <<_custom_functions_subsystem_cfs>>.
| `IO_CFS_IN_SIZE` | natural | 32 | Size of the <<_custom_functions_subsystem_cfs>> input signal conduit (`cfs_in_i`).
| `IO_CFS_OUT_SIZE` | natural | 32 | Size of the <<_custom_functions_subsystem_cfs>> output signal conduit (`cfs_out_o`).
| `IO_NEOLED_EN` | boolean | false | Implement the <<_smart_led_interface_neoled>>.
| `IO_NEOLED_TX_FIFO` | natural | 1 | TX FIFO depth of the <<_smart_led_interface_neoled>>. Has to be a power of two, min 1, max 32768.
| `IO_GPTMR_EN` | boolean | false | Implement the <<_general_purpose_timer_gptmr>>.
| `IO_ONEWIRE_EN` | boolean | false | Implement the <<_one_wire_serial_interface_controller_onewire>>.
| `IO_DMA_EN` | boolean | false | Implement the <<_direct_memory_access_controller_dma>>.
| `IO_SLINK_EN` | boolean | false | Implement the <<_stream_link_interface_slink>>.
| `IO_SLINK_RX_FIFO` | natural | 1 | SLINK RX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_SLINK_TX_FIFO` | natural | 1 | SLINK TX FIFO depth, has to be a power of two, minimum value is 1, max 32768.
| `IO_CRC_EN` | boolean | false | Implement the <<_cyclic_redundancy_check_crc>> unit.
|=======================
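Several generics in the table above share the same kind of constraint (power of two, bounded range). The following is a minimal host-side sketch of such checks, for example in a build script or test bench helper. The function names are hypothetical and not part of the NEORV32 sources; the limits are taken from the table.

[source,c]
----
#include <stdbool.h>
#include <stdint.h>

/* True if v is a power of two (1, 2, 4, ...). */
bool is_pow2(uint32_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

/* FIFO depth generics (e.g. IO_UART0_RX_FIFO): power of two, 1..32768. */
bool fifo_depth_ok(uint32_t depth) {
    return is_pow2(depth) && depth <= 32768u;
}

/* PMP_MIN_GRANULARITY: power of two, at least 4 bytes. */
bool pmp_granularity_ok(uint32_t bytes) {
    return is_pow2(bytes) && bytes >= 4u;
}
----

A depth of 3 or a granularity of 2 bytes would be rejected by these checks, while 32768 and 4 pass.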
<<<
// ####################################################################################################################
:sectnums:
=== Processor Clocking
The processor is implemented as a fully synchronous logic design using a single clock domain that is driven entirely by the
top's `clk_i` signal. This clock signal is used by all internal registers and memories, which trigger on the rising edge of
this clock signal - except for the <<_processor_reset>> and the clock switching gate that trigger on a falling edge.
External "clocks" like the OCD's JTAG clock or the SDI's serial clock are synchronized into the processor's clock domain
before being further processed.
==== Clock Gating
The single clock domain of the processor can be split into an always-on clock domain and a switchable clock domain.
The switchable clock domain is used to clock the CPU core, the CPU's bus switch and - if implemented - the caches.
This domain can be deactivated to reduce power consumption. The always-on clock domain is used to clock all other
processor modules like peripherals, memories and IO devices. Hence, these modules can continue operation (e.g. a
timer keeps running) even if the CPU is shut down.
The splitting into two clock domains is enabled by the `CLOCK_GATING_EN` generic (<<_processor_top_entity_generics>>).
When enabled, a generic clock switching gate is added to decouple the switchable clock from the always-on clock domain
(VHDL file `neorv32_clockgate.vhd`). Whenever the CPU enters <<_sleep_mode>> the CPU clock domain is shut down.
.Clock Switch Hardware
[NOTE]
By default, a generic clock gate is used (`rtl/core/neorv32_clockgate.vhd`) to shut down the CPU clock.
Especially for FPGA setups it is highly recommended to replace this default version by a technology-specific primitive
or macro wrapper to improve efficiency (clock skew, global clock tree usage, etc.).
==== Peripheral Clocks
Many processor modules like the UARTs or the timers provide a programmable time base for operations. In order to simplify
the hardware, the processor implements a global "clock generator" that provides _clock enables_ for certain frequencies that
are derived from the main clock. Hence, these clock enable signals are synchronous to the system's main clock and will be high
for only a single cycle. The processor modules can use these enables for sub-main-clock operations while still providing a single
clock domain only.
In total, 8 sub-main-clock signals are available. All processor modules that feature a time-based configuration provide a
programmable three-bit prescaler select in their control register to select one of the 8 available clocks. The
mapping of the prescaler select bits to the according clock source is shown in the table below. Here, _f_ represents the
processor main clock from the top entity's `clk_i` signal.
[cols="<3,^1,^1,^1,^1,^1,^1,^1,^1"]
[grid="rows"]
|=======================
| Prescaler bits: | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting clock: | _f/2_ | _f/4_ | _f/8_ | _f/64_ | _f/128_ | _f/1024_| _f/2048_| _f/4096_
|=======================
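The prescaler table can be expressed as a small lookup. The sketch below is a host-side illustration only; `prsc_div` and `prsc_freq_hz` are hypothetical names, and the divider values are copied from the table.

[source,c]
----
#include <stdint.h>

/* Clock divider for each 3-bit prescaler select value (0b000..0b111). */
static const uint32_t prsc_div[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Rate of the selected clock enable in Hz (integer division);
   f_clk is the frequency of the clk_i input in Hz. */
uint32_t prsc_freq_hz(uint32_t f_clk, unsigned sel) {
    return f_clk / prsc_div[sel & 7u];
}
----

With a 100 MHz main clock, prescaler select `0b011` yields 100000000 / 64 = 1562500 enable pulses per second.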
.Power Saving
[TIP]
If no peripheral module requires a clock signal from the internal clock generator (all according modules are disabled by
clearing the enable bit in the according module's control register) the generator is automatically deactivated to reduce
dynamic power consumption.
<<<
// ####################################################################################################################
:sectnums:
=== Processor Reset
.Processor Reset Signal
[IMPORTANT]
Always make sure to connect the processor's reset signal `rstn_i` to a valid reset source (a button, the "locked"
signal of a PLL, a dedicated reset controller, etc.).
The processor-wide reset can be triggered by any of the following sources:
* the asynchronous low-active `rstn_i` top entity input signal
* the <<_on_chip_debugger_ocd>>
* the <<_watchdog_timer_wdt>>
.Reset Cause
[TIP]
The actual reset cause can be determined via the <<_watchdog_timer_wdt>>.
If any of these sources trigger a reset, the internal reset will be triggered for at least 4 clock cycles ensuring
a valid reset of the entire processor. The internal global reset is asserted _asynchronously_ if triggered by the external
`rstn_i` signal. For internal reset sources, the global reset is asserted _synchronously_. Once the reset cause becomes inactive,
the internal reset is de-asserted _synchronously_ at a falling clock edge.
Internally, **all registers** that are not meant for mapping to blockRAM (like the register file) do provide a dedicated and
low-active **asynchronous hardware reset**. This asynchronous reset ensures that the entire processor logic is reset to a
defined state even if the main clock is not operational yet.
[NOTE]
The system reset will only reset the control registers of each implemented IO/peripheral module. This control register
reset will also reset the according "module enable flag" to zero, which - in turn - will cause a _synchronous_
module-internal reset of the remaining logic.
<<<
// ####################################################################################################################
:sectnums:
=== Processor Interrupts
The NEORV32 Processor provides several interrupt request signals (IRQs) for custom platform use.
:sectnums:
==== RISC-V Standard Interrupts
The processor setup features the standard machine-level RISC-V interrupt lines for "machine timer interrupt", "machine
software interrupt" and "machine external interrupt". Their usage is defined by the RISC-V privileged architecture
specifications. However, bare-metal systems can also repurpose these interrupts. See CPU section
<<_traps_exceptions_and_interrupts>> for more information.
[cols="<4,<10"]
[options="header",grid="rows"]
|=======================
| Top signal | Description
| `mtime_irq_i` | Machine timer interrupt from _processor-external_ MTIME unit (`MTI`). This IRQ is only available if the processor-internal <<_machine_system_timer_mtime>> unit is not implemented.
| `msw_irq_i` | Machine software interrupt (`MSI`). This interrupt is used for inter-processor interrupts in multi-core systems. However, it can also be used for any custom purpose.
| `mext_irq_i` | Machine external interrupt (`MEI`). This interrupt is used for any processor-external interrupt source (like a platform interrupt controller).
|=======================
.Trigger Type
[IMPORTANT]
The RISC-V standard interrupts are **level-triggered and high-active**. Once set, the signal has to remain high until
the interrupt request is explicitly acknowledged (e.g. writing to a memory-mapped register). The RISC-V standard interrupts
**CANNOT** be acknowledged/cleared by writing zero to the according <<_mip>> CSR bit.
:sectnums:
==== NEORV32-Specific Fast Interrupt Requests
As part of the NEORV32-specific CPU extensions, the processor core features 16 fast interrupt request signals
(`FIRQ0` - `FIRQ15`) providing dedicated bits in the <<_mip>> and <<_mie>> CSRs and custom <<_mcause>> trap codes.
The FIRQ signals are reserved for _processor-internal_ modules only (for example for the communication
interfaces to signal "available incoming data" or "ready to send new data").
The mapping of the 16 FIRQ channels to the according processor-internal modules is shown in the following
table (the channel number also corresponds to the according FIRQ priority: 0 = highest, 15 = lowest):
.NEORV32 Fast Interrupt Request (FIRQ) Mapping
[cols="^2,<2,<6"]
[options="header",grid="rows"]
|=======================
| Channel | Source | Description
| 0 | <<_watchdog_timer_wdt,WDT>> | watchdog timeout interrupt
| 1 | <<_custom_functions_subsystem_cfs,CFS>> | custom functions subsystem (CFS) interrupt (user-defined)
| 2 | <<_primary_universal_asynchronous_receiver_and_transmitter_uart0,UART0>> | UART0 RX interrupt
| 3 | <<_primary_universal_asynchronous_receiver_and_transmitter_uart0,UART0>> | UART0 TX interrupt
| 4 | <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,UART1>> | UART1 RX interrupt
| 5 | <<_secondary_universal_asynchronous_receiver_and_transmitter_uart1,UART1>> | UART1 TX interrupt
| 6 | <<_serial_peripheral_interface_controller_spi,SPI>> | SPI interrupt
| 7 | <<_two_wire_serial_interface_controller_twi,TWI>> | TWI transmission done interrupt
| 8 | <<_external_interrupt_controller_xirq,XIRQ>> | External interrupt controller interrupt
| 9 | <<_smart_led_interface_neoled,NEOLED>> | NEOLED TX buffer interrupt
| 10 | <<_direct_memory_access_controller_dma,DMA>> | DMA transfer done interrupt
| 11 | <<_serial_data_interface_controller_sdi,SDI>> | SDI interrupt
| 12 | <<_general_purpose_timer_gptmr,GPTMR>> | General purpose timer interrupt
| 13 | <<_one_wire_serial_interface_controller_onewire,ONEWIRE>> | 1-wire operation done interrupt
| 14 | <<_stream_link_interface_slink,SLINK>> | SLINK FIFO level interrupt
| 15 | <<_true_random_number_generator_trng,TRNG>> | TRNG FIFO level interrupt
|=======================
.Trigger Type
[IMPORTANT]
The fast interrupt request channels become pending after being triggered by a one-cycle-high signal.
A pending FIRQ has to be explicitly cleared by writing zero to the according <<_mip>> CSR bit.
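The FIRQ priority rule and the "clear by writing zero" behavior can be modeled on the host as follows. This is a sketch, not driver code: the helper names are hypothetical, and the bit position of FIRQ0 (bit 16 of `mip`/`mie`) is an assumption that is not stated in this section and should be checked against the <<_mip>> CSR description.

[source,c]
----
#include <stdint.h>

#define FIRQ_BASE_BIT 16u /* assumption: FIRQ0 is located at mip/mie bit 16 */

/* Bit mask of FIRQ channel ch (0..15) within mip/mie. */
uint32_t firq_mask(unsigned ch) {
    return 1u << (FIRQ_BASE_BIT + (ch & 15u));
}

/* Value of mip after software has cleared the pending bit of channel ch. */
uint32_t firq_clear(uint32_t mip, unsigned ch) {
    return mip & ~firq_mask(ch);
}

/* Highest-priority FIRQ that is both pending and enabled, or -1 if none.
   Channel 0 has the highest priority, channel 15 the lowest. */
int firq_next(uint32_t mip, uint32_t mie) {
    uint32_t active = (mip & mie) >> FIRQ_BASE_BIT;
    for (int ch = 0; ch < 16; ch++) {
        if (active & (1u << ch)) {
            return ch;
        }
    }
    return -1;
}
----

If UART0 RX (channel 2) and UART0 TX (channel 3) are pending at the same time and both are enabled, `firq_next` selects channel 2.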
<<<
// ####################################################################################################################
:sectnums:
=== Address Space
As a 32-bit architecture the NEORV32 can access a 4GB physical address space. By default, this address space is
split into six main regions. Each region provides specific _physical memory attributes_ ("PMAs") that define
the access capabilities (`rwxac`; `r` = read permission, `w` = write permission, `x` = execute permission,
`a` = atomic access support, `c` = cached CPU access).
.NEORV32 Processor Address Space (Default Configuration)
image::address_space.png[900]
.Main Address Regions
[cols="<1,^4,^2,<7"]
[options="header",grid="rows"]
|=======================
| # | Region | PMAs | Description
| 1 | Internal IMEM address space | `rwxac` | For instructions (=code) and constants; mapped to the internal <<_instruction_memory_imem>>.
| 2 | Internal DMEM address space | `rwxac` | For application runtime data (heap, stack, etc.); mapped to the internal <<_data_memory_dmem>>.
| 3 | Memory-mapped XIP flash | `r-xac` | Memory-mapped access to the <<_execute_in_place_module_xip>> SPI flash.
| 4 | Bootloader address space | `r-xa-` | Read-only memory for the internal <<_bootloader_rom_bootrom>> containing the default <<_bootloader>>.
| 5 | IO/peripheral address space | `rwxa-` | Processor-internal peripherals / IO devices.
| 6 | The "**void**" | `rwxac` | Unmapped address space. All accesses to this region are redirected to the <<_processor_external_memory_interface_wishbone>> (if implemented).
|=======================
.Custom PMAs
[NOTE]
Physical memory attributes can be customized (constrained) using the CPU's <<_pmp_isa_extension>>.
The CPU can access all of the 32-bit address space from the instruction fetch interface and also from the data access
interface. Both interfaces can be equipped with optional caches (<<_processor_internal_data_cache_dcache>> and
<<_processor_internal_instruction_cache_icache>>). The two CPU interfaces are multiplexed by a simple bus switch into
a single processor-internal bus. Optionally, this bus is further switched by another instance of the bus switch so the
<<_direct_memory_access_controller_dma>> controller can also access the entire address space. Accesses via the
resulting SoC bus are split by the <<_bus_gateway>> that redirects accesses to the according main address regions.
Accesses to the processor-internal IO/peripheral devices are further redirected via a dedicated <<_io_switch>>.
.Processor-Internal Bus Architecture
image::neorv32_bus.png[1300]
.Bus Interface
[TIP]
See sections CPU <<_architecture>> and <<_bus_interface>> for more information regarding the CPU bus accesses.
:sectnums:
==== Bus Gateway
The central bus gateway serves two purposes: **redirect** core accesses to the according modules (e.g. memory accesses
vs. memory-mapped IO accesses) and **monitor** all bus transactions. The redirection of access requests is based on a
customizable memory map implemented via VHDL constants in the main package file (`rtl/core/neorv32_package.vhd`):
.Main Address Regions Configuration in the VHDL Package File
[source,vhdl]
----
-- Main Address Regions ---
constant mem_imem_base_c : std_ulogic_vector(31 downto 0) := x"00000000"; -- IMEM size via generic
constant mem_dmem_base_c : std_ulogic_vector(31 downto 0) := x"80000000"; -- DMEM size via generic
constant mem_xip_base_c : std_ulogic_vector(31 downto 0) := x"e0000000";
constant mem_xip_size_c : natural := 256*1024*1024;
constant mem_boot_base_c : std_ulogic_vector(31 downto 0) := x"ffffc000";
constant mem_boot_size_c : natural := 8*1024;
constant mem_io_base_c : std_ulogic_vector(31 downto 0) := x"ffffe000";
constant mem_io_size_c : natural := 8*1024;
----
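The address decoding performed by the gateway can be sketched in C as a simple range check. This is an illustrative host-side model, not the VHDL implementation: the type and function names are hypothetical, the base addresses and fixed sizes are copied from the constants above, and the IMEM/DMEM sizes are passed as parameters because they are set via generics. The sketch assumes that all regions are actually implemented.

[source,c]
----
#include <stdint.h>

typedef enum {
    REGION_IMEM, REGION_DMEM, REGION_XIP, REGION_BOOT, REGION_IO, REGION_VOID
} region_t;

#define IMEM_BASE 0x00000000u
#define DMEM_BASE 0x80000000u
#define XIP_BASE  0xE0000000u
#define XIP_SIZE  (256u * 1024u * 1024u)
#define BOOT_BASE 0xFFFFC000u
#define BOOT_SIZE (8u * 1024u)
#define IO_BASE   0xFFFFE000u
#define IO_SIZE   (8u * 1024u)

/* True if addr lies in [base, base + size); safe against 32-bit wrap-around. */
static int in_range(uint32_t addr, uint32_t base, uint32_t size) {
    return (uint32_t)(addr - base) < size;
}

/* Map an address to its main region; anything unmapped is "the void". */
region_t classify(uint32_t addr, uint32_t imem_size, uint32_t dmem_size) {
    if (in_range(addr, IMEM_BASE, imem_size)) return REGION_IMEM;
    if (in_range(addr, DMEM_BASE, dmem_size)) return REGION_DMEM;
    if (in_range(addr, XIP_BASE, XIP_SIZE))   return REGION_XIP;
    if (in_range(addr, BOOT_BASE, BOOT_SIZE)) return REGION_BOOT;
    if (in_range(addr, IO_BASE, IO_SIZE))     return REGION_IO;
    return REGION_VOID;
}
----

With the default memory sizes (16 kB IMEM, 8 kB DMEM), address `0x00004000` already falls into the void and would be forwarded to the external bus interface, if implemented.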
Besides the delegation of bus requests the gateway also implements a bus monitor (aka "the bus keeper") that tracks all
active bus transactions to ensure _safe_ and _deterministic_ operations.
Whenever a memory-mapped device is accessed (a real memory, a memory-mapped IO or some processor-external module) the bus
monitor starts an internal timer. The accessed module has to respond ("ACK") to the bus request within a specific
**time window**. This time window is defined by a global constant in the processor's VHDL package file
(`rtl/core/neorv32_package.vhd`).
.Internal Bus Timeout Configuration
[source,vhdl]
----
constant bus_timeout_c : natural := 15;
----
This constant defines the _maximum_ number of cycles after which a non-responding bus request (i.e. no `ack`
and no `err` signal) will time out raising a bus access fault exception. For example this can happen when accessing
"address space holes" - addresses that are not mapped to any physical module. The resulting exception type corresponds
to the according access type, i.e. instruction fetch access exception, load access exception or store access exception.
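The behavior of the bus monitor can be approximated by the following host-side model. It is a sketch under stated assumptions: all names are hypothetical, and the exact boundary (a response in cycles `0` to `timeout - 1` counts as in time) is an assumption, since the text only gives the maximum number of cycles. The exception cause codes are the standard RISC-V `mcause` values for access faults.

[source,c]
----
#include <stdint.h>

typedef enum { BUS_OK, BUS_ERR, BUS_TIMEOUT } bus_result_t;
typedef enum { ACCESS_FETCH, ACCESS_LOAD, ACCESS_STORE } access_t;

/* ack_cycle / err_cycle: cycle in which the accessed module responds,
   or -1 if it never does. timeout: e.g. bus_timeout_c = 15.
   Assumption: responses in cycles 0 .. timeout-1 are accepted. */
bus_result_t bus_monitor(int ack_cycle, int err_cycle, int timeout) {
    for (int cycle = 0; cycle < timeout; cycle++) {
        if (cycle == err_cycle) return BUS_ERR;
        if (cycle == ack_cycle) return BUS_OK;
    }
    return BUS_TIMEOUT; /* no ack and no err within the time window */
}

/* Standard RISC-V mcause code of the access fault raised for an access type. */
uint32_t access_fault_cause(access_t type) {
    switch (type) {
        case ACCESS_FETCH: return 1u; /* instruction access fault */
        case ACCESS_LOAD:  return 5u; /* load access fault */
        default:           return 7u; /* store/AMO access fault */
    }
}
----

In this model both `BUS_ERR` and `BUS_TIMEOUT` lead to the access fault exception that matches the access type.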
.XIP Timeout
[NOTE]
Accesses to the memory-mapped XIP flash (via the <<_execute_in_place_module_xip>>) will _never_ time out.
.External Bus Interface Timeout
[NOTE]
Accesses that are delegated to the external bus interface have a different maximum timeout value that is defined by a
dedicated processor generic (`MEM_EXT_TIMEOUT`). See section <<_processor_external_memory_interface_wishbone>> for more information.
:sectnums:
==== Reservation Set Controller
The reservation set controller is responsible for handling the load-reserved and store-conditional bus transactions that
are triggered by the `lr.w` (LR) and `sc.w` (SC) instructions from the CPU's <<_a_isa_extension>>.
A "reservation" defines an address or address range that provides a guarding mechanism to support atomic accesses. A new
reservation is registered by the LR instruction. The address provided by this instruction defines the memory location
that is now monitored for atomic accesses. The according SC instruction evaluates the state of this reservation. If
the reservation is still valid the write access triggered by the SC instruction is finally executed and the instruction
returns a "success" state (`rd` = 0). If the reservation has been invalidated the SC instruction will not write to memory
and will return a "failed" state (`rd` = 1).
The reservation is invalidated if...
* an SC instruction is executed that accesses an address **outside** of the reservation set of the previous LR instruction.
This SC instruction will **fail** (not writing to memory).
* an SC instruction is executed that accesses an address **inside** of the reservation set of the previous LR instruction.
This SC instruction will **succeed** (finally writing to memory).
* a normal store operation accesses an address **inside** of the current reservation set (by the CPU or by the DMA).
* a hardware reset is triggered.
.Consecutive LR Instructions
[NOTE]
If an LR instruction is followed by another LR instruction the reservation set of the former one is overridden
by the reservation set of the latter one.
.Bus Access Errors
[IMPORTANT]
If the LR operation causes a bus access error (raising a load access exception) the reservation **is registered anyway**.
If the SC operation causes a bus access error (raising a store access exception) an already registered reservation set
**is invalidated anyway**.
.Strong Semantic
[IMPORTANT]
The LR/SC mechanism follows the _strong semantic_ approach: the LR/SC instruction pair fails only if there is a write
access to the referenced memory location between the LR and SC instructions (by the CPU itself or by the DMA).
Context changes, interrupts, traps, etc. do not affect nor invalidate the reservation state at all.
The controller supports only a single global reservation set. By default this reservation set "monitors" a word-aligned
4-byte granule. However, the granularity can be customized via the `AMO_RVS_GRANULARITY` top entity generic (see
<<_processor_top_entity_generics>>) to cover an arbitrarily large naturally aligned address region. The only constraint is
that the size of the address region has to be a power of two. The configured granularity can be determined by software via
the <<_system_configuration_information_memory_sysinfo>> module.
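The reservation rules described above can be condensed into a small behavioral model. The following C sketch is an illustration only (it is not the hardware implementation and all names are hypothetical); it models a single global reservation set with a configurable power-of-two granularity and ignores the bus access error cases.

[source,c]
----
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid;       /* a reservation is currently registered */
    uint32_t granule;     /* reserved address with the in-granule bits cleared */
    uint32_t granularity; /* bytes, power of two, min 4 (AMO_RVS_GRANULARITY) */
} rsvset_t;

/* Start address of the naturally aligned granule that contains addr. */
static uint32_t granule_of(const rsvset_t *r, uint32_t addr) {
    return addr & ~(r->granularity - 1u);
}

/* LR: register a new reservation (overrides any previous one). */
void rsv_lr(rsvset_t *r, uint32_t addr) {
    r->valid = true;
    r->granule = granule_of(r, addr);
}

/* SC: returns 0 on success (write is performed) or 1 on failure (no write).
   The reservation is removed in both cases. */
int rsv_sc(rsvset_t *r, uint32_t addr) {
    bool ok = r->valid && granule_of(r, addr) == r->granule;
    r->valid = false;
    return ok ? 0 : 1;
}

/* Normal store (CPU or DMA): invalidates the reservation if it hits the set. */
void rsv_store(rsvset_t *r, uint32_t addr) {
    if (r->valid && granule_of(r, addr) == r->granule) {
        r->valid = false;
    }
}
----

For example, with a 4-byte granularity a store to `0x102` between `lr.w` and `sc.w` on `0x100` makes the SC fail, whereas a store to `0x104` does not touch the reservation.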
.Physical Memory Attributes
[NOTE]
The reservation set can be set for _any_ address (only constrained by the configured granularity). This also
includes cached memory, memory-mapped IO devices and processor-external address spaces.
Bus transactions triggered by the LR instruction register a new reservation set and are delegated to the addressed
memory/device. Bus transactions triggered by the SC remove a reservation set and are forwarded to the addressed
memory/device only if the SC operation succeeds. Otherwise, the access request is not forwarded and a local ACK is
generated to terminate the bus transaction.
.LR/SC Bus Protocol
[NOTE]
More information regarding the LR/SC bus transactions and the according protocol can be found in section
<<_bus_interface>> / <<_atomic_accesses>>.
.Cache Coherency
[IMPORTANT]
Atomic operations **always bypass** the cache using direct/uncached accesses. Care must be taken
to maintain data cache coherency (e.g. by using the `fence` instruction).
:sectnums:
==== IO Switch
The IO switch further decodes the address when accessing the processor-internal IO/peripheral devices and forwards
the access request to the according module. Note that a total address space size of 256 bytes is assigned to each
IO module in order to simplify address decoding. The IO-specific address map is also defined in the main VHDL
package file (`rtl/core/neorv32_package.vhd`).
.Exemplary Cut-Out from the IO Address Map
[source,vhdl]
----
-- IO Address Map --
constant iodev_size_c : natural := 256; -- size of a single IO device (bytes)
constant base_io_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffeb00";
constant base_io_slink_c : std_ulogic_vector(31 downto 0) := x"ffffec00";
constant base_io_dma_c : std_ulogic_vector(31 downto 0) := x"ffffed00";
----
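Because every IO device occupies a fixed 256-byte slot, the decoding done by the IO switch reduces to simple arithmetic. The following host-side sketch uses hypothetical helper names; `IO_BASE` corresponds to `mem_io_base_c` and `IODEV_SIZE` to `iodev_size_c` from the package file.

[source,c]
----
#include <stdint.h>

#define IO_BASE    0xFFFFE000u /* mem_io_base_c */
#define IODEV_SIZE 256u        /* iodev_size_c (bytes per IO device) */

/* Index of the 256-byte device slot an IO address falls into. */
uint32_t io_slot(uint32_t addr) {
    return (addr - IO_BASE) / IODEV_SIZE;
}

/* Byte offset of the address within its device slot. */
uint32_t io_offset(uint32_t addr) {
    return (addr - IO_BASE) % IODEV_SIZE;
}

/* Base address of a given device slot. */
uint32_t io_slot_base(uint32_t slot) {
    return IO_BASE + slot * IODEV_SIZE;
}
----

With the cut-out above, the CFS base address `0xffffeb00` corresponds to slot 11, SLINK to slot 12 and DMA to slot 13. The 8 kB IO region provides 32 such slots in total.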
:sectnums:
==== Boot Configuration
Due to the flexible memory configuration, the NEORV32 Processor provides several different boot scenarios.
The following section illustrates the two most common boot scenarios.
.NEORV32 Boot Configurations
image::neorv32_boot_configurations.png[800]
There are two general boot scenarios: _Indirect Boot_ (1a and 1b) and _Direct Boot_ (2a and 2b) configured via the
`INT_BOOTLOADER_EN` generic. If this generic is `true` the _indirect boot scenario_ is used. This is also the
default boot configuration of the processor. If `INT_BOOTLOADER_EN` is `false` the _direct boot scenario_ is used.
:sectnums!:
===== Indirect Boot
The indirect boot scenarios **1a** and **1b** are based on the processor-internal <<_bootloader>>. This boot setup is enabled
by setting the `INT_BOOTLOADER_EN` generic to `true`, which will implement the processor-internal <<_bootloader_rom_bootrom>>.
This read-only memory is pre-initialized during synthesis with the default bootloader firmware. The bootloader provides several
options to upload an executable, copying it to the beginning of the _instruction address space_ so the CPU can execute it.
Boot scenario **1a** uses the processor-internal IMEM. This scenario implements the internal <<_instruction_memory_imem>>
as non-initialized RAM so the bootloader can copy the actual executable to it.
Boot scenario **1b** uses a processor-external IMEM that is connected via the processor's bus interface. In this scenario
the internal <<_instruction_memory_imem>> is not implemented at all and the bootloader will copy the executable to the
processor-external memory. Hence, the external memory has to be implemented as RAM.
:sectnums!:
===== Direct Boot
The direct boot scenarios **2a** and **2b** do not use the processor-internal bootloader since the `INT_BOOTLOADER_EN`
generic is set `false`. In this configuration the <<_bootloader_rom_bootrom>> is not implemented at all and the CPU will
directly begin executing code from the beginning of the instruction address space after reset. An application-specific
"pre-initialization" mechanism is required in order to provide an executable inside the memory.
Boot scenario **2a** uses the processor-internal IMEM, which is implemented as _read-only memory_ in this scenario.
It is pre-initialized (by the bitstream) with the actual application executable during synthesis.
In contrast, boot scenario **2b** uses a processor-external IMEM. In this scenario the system designer is responsible for
providing an initialized external memory that contains the actual application to be executed.
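The four boot scenarios follow directly from two generics. As a compact summary, the selection can be written as the following sketch; `boot_scenario` is a hypothetical helper, and the returned labels are the scenario names used in the figure above.

[source,c]
----
#include <stdbool.h>

/* Returns the boot scenario label: "1a", "1b", "2a" or "2b".
   int_bootloader_en: value of the INT_BOOTLOADER_EN generic
   mem_int_imem_en:   value of the MEM_INT_IMEM_EN generic */
const char *boot_scenario(bool int_bootloader_en, bool mem_int_imem_en) {
    if (int_bootloader_en) {
        /* indirect boot: bootloader copies the executable to (RAM) IMEM */
        return mem_int_imem_en ? "1a" : "1b";
    }
    /* direct boot: CPU starts from pre-initialized instruction memory */
    return mem_int_imem_en ? "2a" : "2b";
}
----

For instance, a setup without the bootloader and with an internal IMEM (`false`, `true`) corresponds to scenario 2a, where the IMEM acts as pre-initialized ROM.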
<<<
// ####################################################################################################################
:sectnums:
=== Processor-Internal Modules
.Module Address Space Mapping
[IMPORTANT]
The base address of each component/module has to be aligned to the total size of the module's occupied address space.
The occupied address space has to be a power of two (minimum 4 bytes). Addresses of peripheral modules must not overlap.
.Full-Word Write Accesses Only
[IMPORTANT]
All peripheral/IO devices should only be written in full-word mode (i.e. 32-bit). Byte or half-word (8/16-bit) write accesses
might cause undefined behavior.
.IO Module's Address Space
[IMPORTANT]
Each peripheral/IO module occupies an address space of 256 bytes (64 words). Most devices do not fully utilize this address
space and will simply _mirror_ the available interface registers across the entire 256 bytes of address space.
.Unimplemented Modules / Address Holes
[NOTE]
When accessing an IO device that has not been implemented (disabled via the according generic)
or when accessing an address that is actually unused, a load/store access fault exception is raised.
.Module Interrupts
[NOTE]
Most peripheral/IO devices provide some kind of interrupt (for example to signal available incoming data). These
interrupts are entirely mapped to the CPU's <<_custom_fast_interrupt_request_lines>>.
See section <<_processor_interrupts>> for more information.
.CMSIS System View Description (SVD)
[TIP]
A CMSIS-SVD-compatible **System View Description (SVD)** file including all peripherals is available in `sw/svd`.
include::soc_imem.adoc[]
include::soc_dmem.adoc[]
include::soc_bootrom.adoc[]
include::soc_icache.adoc[]
include::soc_dcache.adoc[]
include::soc_dma.adoc[]
include::soc_wishbone.adoc[]
include::soc_slink.adoc[]
include::soc_gpio.adoc[]
include::soc_crc.adoc[]
include::soc_wdt.adoc[]
include::soc_mtime.adoc[]
include::soc_uart.adoc[]
include::soc_spi.adoc[]
include::soc_sdi.adoc[]
include::soc_twi.adoc[]
include::soc_onewire.adoc[]
include::soc_pwm.adoc[]
include::soc_trng.adoc[]
include::soc_cfs.adoc[]
include::soc_neoled.adoc[]
include::soc_xirq.adoc[]
include::soc_gptmr.adoc[]
include::soc_xip.adoc[]
include::soc_sysinfo.adoc[]

View File

@ -0,0 +1,28 @@
<<<
:sectnums:
==== Bootloader ROM (BOOTROM)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_boot_rom.vhd |
| Software driver file(s): | none |
| Top entity port: | none |
| Configuration generics: | `INT_BOOTLOADER_EN` | implement processor-internal bootloader when `true`
| CPU interrupts: | none |
|=======================
This boot ROM module provides a read-only memory that contains the executable image of the default NEORV32
<<_bootloader>>. If the internal bootloader is enabled via the `INT_BOOTLOADER_EN` generic the CPU's boot address
is automatically set to the beginning of the bootloader ROM. See sections <<_address_space>> and
<<_boot_configuration>> for more information regarding the processor's different boot scenarios.
.Memory Size
[IMPORTANT]
If the configured boot ROM size is **not** a power of two the actual memory size will be auto-adjusted to
the next power of two (e.g. configuring a memory size of 6kB will result in a physical memory size of 8kB).
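The size adjustment can be reproduced with a few lines of host-side C. This is a minimal sketch of the rounding rule only; the function name is hypothetical and the code does not claim to mirror the actual VHDL implementation.

```c
#include <stdint.h>

/* Round a configured memory size (in bytes) up to the next power of two.
 * Sizes that already are a power of two are returned unchanged.
 * The result saturates at 2^31 for larger inputs. */
uint32_t next_pow2(uint32_t size) {
    uint32_t p = 1u;
    while (p < size && p != 0x80000000u) {
        p <<= 1;
    }
    return p;
}
```

For instance, `next_pow2(6 * 1024)` yields 8192, matching the 6kB to 8kB example above.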
.Bootloader Image
[IMPORTANT]
The boot ROM is initialized during synthesis with the default bootloader image
(`rtl/core/neorv32_bootloader_image.vhd`).


@ -0,0 +1,101 @@
<<<
:sectnums:
==== Custom Functions Subsystem (CFS)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_cfs.vhd |
| Software driver file(s): | neorv32_cfs.c |
| | neorv32_cfs.h |
| Top entity port: | `cfs_in_i` | custom input conduit
| | `cfs_out_o` | custom output conduit
| Configuration generics: | `IO_CFS_EN` | implement CFS when `true`
| | `IO_CFS_CONFIG` | custom generic conduit
| | `IO_CFS_IN_SIZE` | size of `cfs_in_i`
| | `IO_CFS_OUT_SIZE` | size of `cfs_out_o`
| CPU interrupts: | fast IRQ channel 1 | CFS interrupt (see <<_processor_interrupts>>)
|=======================
**Theory of Operation**
The custom functions subsystem is meant for implementing custom tightly-coupled co-processors or interfaces.
It provides up to 64 32-bit memory-mapped read/write registers (`REG`, see register map below) that can be
accessed by the CPU via normal load/store operations. The actual functionality of these registers has to be
defined by the hardware designer. Furthermore, the CFS provides two IO conduits to implement custom on-chip
or off-chip interfaces.
Just like any other externally-connected IP, logic implemented within the custom functions subsystem can operate
_independently_ of the CPU providing true parallel processing capabilities. Potential use cases might include
dedicated hardware accelerators for en-/decryption (AES), signal processing (FFT) or AI applications
(CNNs) as well as custom IO systems like fast memory interfaces (DDR) and mass storage (SDIO), networking (CAN)
or real-time data transport (I2S).
[TIP]
If you would like to implement _custom instructions_ that are executed right within the CPU's ALU
see the <<_zxcfu_isa_extension>> and the according <<_custom_functions_unit_cfu>>.
[TIP]
Take a look at the template CFS VHDL source file (`rtl/core/neorv32_cfs.vhd`). The file is highly
commented to illustrate all aspects that are relevant for implementing custom CFS-based co-processor designs.
[TIP]
The CFS can also be used to _replicate_ existing NEORV32 modules - for example to implement several TWI controllers.
**CFS Software Access**
The CFS memory-mapped registers can be accessed by software using the provided C-language aliases (see
register map table below). Note that all interface registers are defined as 32-bit words of type `uint32_t`.
.CFS Software Access Example
[source,c]
----
// C-code CFS usage example
NEORV32_CFS->REG[0] = (uint32_t)some_data_array[i]; // write to CFS register 0
int temp = (int)NEORV32_CFS->REG[20]; // read from CFS register 20
----
**CFS Interrupt**
The CFS provides a single high-level-triggered interrupt request signal mapped to the CPU's fast interrupt channel 1.
Once triggered, the interrupt becomes pending (if enabled in the <<_mie>> CSR) and has to be explicitly cleared again by
writing zero to the according <<_mip>> CSR bit. See section <<_processor_interrupts>> for more information.
**CFS Configuration Generic**
By default, the CFS provides a single 32-bit `std_ulogic_vector` configuration generic `IO_CFS_CONFIG`
that is available in the processor's top entity. This generic can be used to pass custom configuration options
from the top entity directly down to the CFS. The actual definition of the generic and its usage inside the
CFS is left to the hardware designer.
**CFS Custom IOs**
By default, the CFS also provides two unidirectional input and output conduits `cfs_in_i` and `cfs_out_o`.
These signals are directly propagated to the processor's top entity. These conduits can be used to implement
application-specific interfaces like memory or peripheral connections. The actual use case of these signals
has to be defined by the hardware designer.
The size of the input signal conduit `cfs_in_i` is defined via the top's `IO_CFS_IN_SIZE` configuration
generic (default = 32-bit). The size of the output signal conduit `cfs_out_o` is defined via the top's
`IO_CFS_OUT_SIZE` configuration generic (default = 32-bit). If the custom functions subsystem is not implemented
(`IO_CFS_EN` = false) the `cfs_out_o` signal is tied to all-zero.
**Register Map**
.CFS register map (`struct NEORV32_CFS`)
[cols="^4,<2,^2,^2,<6"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s) | R/W | Function
| `0xffffeb00` | `REG[0]` |`31:0` | (r)/(w) | custom CFS register 0
| `0xffffeb04` | `REG[1]` |`31:0` | (r)/(w) | custom CFS register 1
| ... | ... |`31:0` | (r)/(w) | ...
| `0xffffebf8` | `REG[62]` |`31:0` | (r)/(w) | custom CFS register 62
| `0xffffebfc` | `REG[63]` |`31:0` | (r)/(w) | custom CFS register 63
|=======================
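The addresses in the register map follow a simple pattern: register `n` sits at the CFS base address plus `4*n`. The following sketch makes that explicit; the macro and function names are hypothetical and only the base address and register count are taken from the table above.

```c
#include <stdint.h>

#define CFS_BASE     0xffffeb00u /* address of REG[0] (from the register map) */
#define CFS_NUM_REGS 64u         /* REG[0] .. REG[63] */

/* Byte address of CFS register `index`; returns 0 for an out-of-range index. */
uint32_t cfs_reg_addr(uint32_t index) {
    if (index >= CFS_NUM_REGS) {
        return 0u;
    }
    return CFS_BASE + 4u * index;
}
```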


@ -0,0 +1,71 @@
<<<
:sectnums:
==== Cyclic Redundancy Check (CRC)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_crc.vhd |
| Software driver file(s): | neorv32_crc.c |
| | neorv32_crc.h |
| Top entity port: | none |
| Configuration generics: | `IO_CRC_EN` | implement CRC module when `true`
| CPU interrupts: | none |
|=======================
**Overview**
The cyclic redundancy check unit provides a programmable checksum computation module. The unit operates on
single bytes and can either compute CRC8, CRC16 or CRC32 checksums based on an arbitrary polynomial and
start value.
.DMA Demo Program
[TIP]
A CRC example program (also using CPU-independent DMA transfers) can be found in `sw/example/crc_dma`.
.CPU-Independent Operation
[TIP]
The CRC unit can compute a checksum for an arbitrary memory array without any CPU overhead
by using the processor's <<_direct_memory_access_controller_dma>>.
**Theory of Operation**
The module provides four interface registers:
* `CTRL`: selects either CRC8-, CRC16- or CRC32-mode
* `POLY`: programmable polynomial
* `DATA`: data input register (single bytes only)
* `SREG`: the CRC shift register; this register is used to define the start value and to obtain
the final processing result
The `CTRL`, `POLY` and `SREG` registers need to be programmed before the actual processing can be started.
Writing a byte to `DATA` will update the current checksum in `SREG`.
.Access Latency
[NOTE]
Write accesses to the CRC module have an increased latency of 8 clock cycles. This additional latency
ensures that the internal bit-serial processing of the current data byte has also been completed when the
transfer is completed.
.Data Size
[NOTE]
For CRC8-mode only bits `7:0` of `POLY` and `SREG` are relevant; for CRC16-mode only bits `15:0` are used
and for CRC32-mode the entire 32-bit of `POLY` and `SREG` are used.
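A host-side reference model can be handy for checking results obtained from the hardware. The sketch below processes one data byte in eight bit-serial steps, in line with the 8-cycle latency noted above. It assumes MSB-first, non-reflected processing without a final XOR; the text does not specify the bit order, so treat this as an assumption to verify against the VHDL source. All function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Update a CRC shift register with one data byte.
 * width is 8, 16 or 32 (CRC8/CRC16/CRC32 mode); only the low `width` bits
 * of poly and sreg are used.
 * ASSUMPTION: MSB-first processing, no input/output reflection. */
uint32_t crc_model_update(uint32_t sreg, uint32_t poly, unsigned width, uint8_t data) {
    const uint32_t msb  = 1u << (width - 1u);
    const uint32_t mask = (width == 32u) ? 0xffffffffu : ((1u << width) - 1u);

    sreg = (sreg & mask) ^ ((uint32_t)data << (width - 8u));
    for (int i = 0; i < 8; i++) { /* one step per bit of the data byte */
        sreg = (sreg & msb) ? ((sreg << 1) ^ poly) : (sreg << 1);
        sreg &= mask;
    }
    return sreg;
}

/* Checksum over a byte buffer, starting from `start` (the initial SREG value). */
uint32_t crc_model_block(uint32_t start, uint32_t poly, unsigned width,
                         const uint8_t *buf, size_t len) {
    uint32_t sreg = start;
    for (size_t i = 0; i < len; i++) {
        sreg = crc_model_update(sreg, poly, width, buf[i]);
    }
    return sreg;
}
```

With polynomial `0x1021` and start value `0xffff` in 16-bit mode, this model reproduces the well-known CRC-16/CCITT-FALSE check value for the ASCII string `123456789`.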
**Register Map**
.CRC Register Map (`struct NEORV32_CRC`)
[cols="<2,<1,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.2+<| `0xffffee00` .2+<| `CTRL` <|`1:0` ^| r/w <| CRC mode select (`00`: CRC8, `01`: CRC16, `10`: CRC32)
<|`31:2` ^| r/- <| _reserved_, read as zero
| `0xffffee04` | `POLY` |`31:0` | r/w | CRC polynomial
.2+<| `0xffffee08` .2+<| `DATA` <|`7:0` ^| r/w <| data input (single byte)
<|`31:8` ^| r/- <| _reserved_, read as zero, writes are ignored
| `0xffffee0c` | `SREG` |`31:0` | r/w | current CRC shift register value (set start value on write)
|=======================


@ -0,0 +1,53 @@
<<<
:sectnums:
==== Processor-Internal Data Cache (dCACHE)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_dcache.vhd |
| Software driver file(s): | none | _implicitly used_
| Top entity port: | none |
| Configuration generics: | `DCACHE_EN` | implement processor-internal data cache when `true`
| | `DCACHE_NUM_BLOCKS` | number of cache blocks (pages/lines)
| | `DCACHE_BLOCK_SIZE` | size of a cache block in bytes
| CPU interrupts: | none |
|=======================
The processor features an optional data cache to improve performance when using memories with high
access latencies. The cache is directly connected to the CPU's data access interface and provides
fully transparent buffering.
The cache is implemented if the `DCACHE_EN` generic is `true`. The size of the cache memory is defined via the
`DCACHE_BLOCK_SIZE` (the size of a single cache block/page/line in bytes; has to be a power of two and greater than or
equal to 4 bytes) and `DCACHE_NUM_BLOCKS` (the total amount of cache blocks; has to be a power of two and greater than or
equal to 1) generics. The data cache provides only a single set, hence it is direct-mapped.
**Cached/Uncached Accesses**
The data cache provides direct accesses (= uncached) to memory in order to access memory-mapped IO (like the
processor-internal IO/peripheral modules). All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
will not be cached at all (see section <<_address_space>>).
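How a direct-mapped cache with power-of-two geometry interprets an address can be sketched as follows. The struct and function names are hypothetical, and the split into tag, index and offset is the textbook scheme for a direct-mapped cache rather than a description taken from `neorv32_dcache.vhd`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an address as seen by a direct-mapped cache. */
typedef struct {
    uint32_t tag;    /* identifies which memory block occupies the cache block */
    uint32_t index;  /* selects the cache block */
    uint32_t offset; /* byte offset inside the cache block */
} cache_addr_t;

/* block_size and num_blocks correspond to DCACHE_BLOCK_SIZE and
 * DCACHE_NUM_BLOCKS; both have to be powers of two. */
cache_addr_t dcache_split(uint32_t addr, uint32_t block_size, uint32_t num_blocks) {
    cache_addr_t r;
    uint32_t block_number = addr / block_size;
    r.offset = addr & (block_size - 1u);
    r.index  = block_number & (num_blocks - 1u);
    r.tag    = block_number / num_blocks;
    return r;
}

/* Accesses to 0xF0000000..0xFFFFFFFF bypass the cache (see text above). */
bool dcache_is_uncached(uint32_t addr) {
    return addr >= 0xF0000000u;
}
```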
.Caching Internal Memories
[NOTE]
The data cache is intended to accelerate data access to **processor-external** memories
(via the external bus interface or via the XIP module). The cache(s) should not be implemented
when using only processor-internal data and instruction memories.
.Manual Cache Clear/Reload
[NOTE]
By executing the `fence(.i)` instruction the cache is cleared and a reload from main memory is triggered.
.Retrieve Cache Configuration from Software
[TIP]
Software can retrieve the cache configuration/layout from the <<_sysinfo_cache_configuration>> register.
.Bus Access Fault Handling
[NOTE]
The cache always loads a complete cache block (aligned to the block size) every time a
cache miss is detected. Each cached word from this block provides a single status bit that indicates if the
according bus access was successful or caused a bus error. Hence, the whole cache block remains valid even
if certain addresses inside caused a bus error. If the CPU accesses any of the faulty cache words, a
data bus error exception is raised.


@ -0,0 +1,155 @@
<<<
:sectnums:
==== Direct Memory Access Controller (DMA)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_dma.vhd |
| Software driver file(s): | neorv32_dma.c |
| | neorv32_dma.h |
| Top entity port: | none |
| Configuration generics: | `IO_DMA_EN` | implement DMA when `true`
| CPU interrupts: | fast IRQ channel 10 | DMA transfer done (see <<_processor_interrupts>>)
|=======================
**Overview**
The NEORV32 DMA provides a small-scale scatter/gather direct memory access controller that can transfer and
modify data independently of the CPU. A single read/write transfer channel is implemented that is configured via
memory-mapped registers. A configured transfer can either be triggered manually or by a programmable CPU FIRQ interrupt
(see <<_neorv32_specific_fast_interrupt_requests>>).
The DMA is connected to the central processor-internal bus system (see section <<_address_space>>) and can access the same
address space as the CPU core. It uses _interleaving mode_ accessing the central processor bus only if the CPU does not
currently request any bus access.
The controller can handle different data quantities (e.g. read bytes and write them back as sign-extended words) and can
also change the Endianness of data while transferring.
.DMA Demo Program
[TIP]
A DMA example program can be found in `sw/example/demo_dma`.
**Theory of Operation**
The DMA provides four memory-mapped interface registers: A status and control register `CTRL` and three registers for
configuring the actual DMA transfer. The base address of the source data is programmed via the `SRC_BASE` register.
Vice versa, the base address of the destination data is programmed via the `DST_BASE`. The third configuration register
`TTYPE` is used to configure the actual transfer type and the number of elements to transfer.
The DMA is enabled by setting the `DMA_CTRL_EN` bit of the control register. Manual trigger mode (i.e. the DMA transfer is
triggered by writing to the `TTYPE` register) is selected if `DMA_CTRL_AUTO` is cleared. Alternatively, the DMA transfer can
be triggered by a processor internal FIRQ signal if `DMA_CTRL_AUTO` is set (see section below).
The DMA uses a load-modify-write data transfer process. Data is read from the bus system, internally modified and then written
back to the bus system. This combination is implemented as an atomic process, so canceling the current transfer by clearing the
`DMA_CTRL_EN` bit will stop the DMA right after the current load-modify-write operation.
If the DMA controller detects a bus error during operation, it will set either the `DMA_CTRL_ERROR_RD` (error during
last read access) or `DMA_CTRL_ERROR_WR` (error during last write access) and will terminate the current transfer.
Software can read the `SRC_BASE` or `DST_BASE` register to retrieve the address that caused the according error.
Alternatively, software can read back the `DMA_TTYPE_NUM` bits of the `TTYPE` register to determine the index of the element
that caused the error. The error bits are automatically cleared when starting a new transfer.
When the `DMA_CTRL_DONE` flag is set the DMA has actually executed a transfer. However, the `DMA_CTRL_ERROR_*` flags
should also be checked to verify that the executed transfer completed without errors. The `DMA_CTRL_DONE` flag is
automatically cleared when writing the `CTRL` register.
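The status checks described above boil down to testing four bits of a `CTRL` read-back value. The following sketch decodes such a value on the host; the bit positions are copied from the register map of this section, while the `dma_state_t` type and the function name are hypothetical.

```c
#include <stdint.h>

/* CTRL status bit positions (from the DMA register map). */
enum {
    DMA_CTRL_ERROR_RD = 8,
    DMA_CTRL_ERROR_WR = 9,
    DMA_CTRL_BUSY     = 10,
    DMA_CTRL_DONE     = 11
};

/* Hypothetical summary of a CTRL read-back value. */
typedef enum {
    DMA_STATE_IDLE,     /* nothing executed yet */
    DMA_STATE_BUSY,     /* transfer in progress */
    DMA_STATE_DONE_OK,  /* transfer executed without errors */
    DMA_STATE_ERROR_RD, /* bus error during last read access */
    DMA_STATE_ERROR_WR  /* bus error during last write access */
} dma_state_t;

/* Error flags are checked first because DONE alone does not imply success. */
dma_state_t dma_decode_status(uint32_t ctrl) {
    if (ctrl & (1u << DMA_CTRL_ERROR_RD)) return DMA_STATE_ERROR_RD;
    if (ctrl & (1u << DMA_CTRL_ERROR_WR)) return DMA_STATE_ERROR_WR;
    if (ctrl & (1u << DMA_CTRL_BUSY))     return DMA_STATE_BUSY;
    if (ctrl & (1u << DMA_CTRL_DONE))     return DMA_STATE_DONE_OK;
    return DMA_STATE_IDLE;
}
```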
.DMA Access Privilege Level
[WARNING]
Transactions performed by the DMA are executed as bus transactions with elevated **machine-mode** privilege level.
Additionally, all physical memory protection rules (<<_pmp_isa_extension>>) defined by the CPU are **bypassed**.
**Transfer Configuration**
If the DMA is set to **manual trigger mode** (`DMA_CTRL_AUTO` = 0) writing the `TTYPE` register will start the
programmed DMA transfer. Once started, the DMA will read one data quantity from the source address, process it internally
and then will write it back to the destination address. The `DMA_TTYPE_NUM` bits of the `TTYPE` register define how many
times this process is repeated by specifying the number of elements to transfer.
Optionally, the source and/or destination addresses can be incremented automatically according to the data
quantities by setting the according `DMA_TTYPE_SRC_INC` and/or `DMA_TTYPE_DST_INC` bit.
Four different transfer quantities are available, which are configured via the `DMA_TTYPE_QSEL` bits:
* `00`: Read source data as byte, write destination data as byte
* `01`: Read source data as byte, write destination data as zero-extended word
* `10`: Read source data as byte, write destination data as sign-extended word
* `11`: Read source data as word, write destination data as word
Optionally, the DMA controller can automatically convert Endianness of the transferred data if the `DMA_TTYPE_ENDIAN`
bit is set.
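Putting the transfer options together, a `TTYPE` value can be assembled from its fields as sketched below. The bit positions follow the register map of this section; the function name and its parameter list are hypothetical and not part of the NEORV32 software framework.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assemble a TTYPE register value.
 * num     : number of elements to transfer (bits 23:0)
 * qsel    : transfer quantity, 0..3 as listed above (bits 28:27)
 * src_inc : incrementing source address (bit 29)
 * dst_inc : incrementing destination address (bit 30)
 * endian  : swap Endianness (bit 31) */
uint32_t dma_ttype_word(uint32_t num, unsigned qsel, bool src_inc, bool dst_inc, bool endian) {
    uint32_t ttype = num & 0x00ffffffu;
    ttype |= (uint32_t)(qsel & 3u) << 27;
    ttype |= (src_inc ? 1u : 0u) << 29;
    ttype |= (dst_inc ? 1u : 0u) << 30;
    ttype |= (endian ? 1u : 0u) << 31;
    return ttype;
}
```

A word-to-word copy of 16 elements with both addresses incrementing would thus use `dma_ttype_word(16, 3, true, true, false)`. In manual trigger mode, writing that value to `TTYPE` also starts the transfer.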
.Address Alignment
[IMPORTANT]
Make sure to align the source and destination base addresses to the according transfer data quantities. For instance,
word-to-word transfers require that the two LSB of `SRC_BASE` and `DST_BASE` are cleared.
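A quick plausibility check before starting a transfer could look like the sketch below. It assumes that byte-wide accesses need no alignment and that word-wide accesses need 4-byte alignment; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Check base address alignment for a given transfer quantity (qsel 0..3).
 * qsel 0     : byte -> byte, no alignment required
 * qsel 1, 2  : byte -> word, destination has to be word-aligned
 * qsel 3     : word -> word, source and destination have to be word-aligned */
bool dma_addresses_aligned(unsigned qsel, uint32_t src_base, uint32_t dst_base) {
    bool src_is_word = (qsel == 3u);
    bool dst_is_word = (qsel != 0u);
    if (src_is_word && (src_base & 3u) != 0u) return false;
    if (dst_is_word && (dst_base & 3u) != 0u) return false;
    return true;
}
```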
.Writing to IO Device
[IMPORTANT]
When writing data to IO / peripheral devices (for example to the <<_cyclic_redundancy_check_crc>>) the destination
data quantity has to be set to **word** (32-bit) since all IO registers can only be written in full 32-bit word mode.
**Automatic Trigger**
As an alternative to the manual trigger mode, the DMA can be configured to **automatic trigger mode** starting a pre-configured
transfer if a specific processor-internal peripheral issues an interrupt request. The automatic trigger mode is enabled by
setting the `CTRL` register's `DMA_CTRL_AUTO` bit. In this configuration _no_ transfer is started when writing to the DMA's
`TTYPE` register.
The actual trigger is configured via the control register `DMA_CTRL_FIRQ_MASK`. These bits reflect the state of the CPU's
<<_mip>> CSR showing any pending fast interrupt requests (for a full list see <<_neorv32_specific_fast_interrupt_requests>>).
The same bit definitions/locations as for the <<_mip>> and <<_mie>> CPU CSRs are used.
If any of the enabled sources issues an interrupt the DMA will start the pre-configured transfer (note that all enabled
sources are logically OR-ed).
.FIRQ Trigger
[NOTE]
The DMA transfer will start if a **rising edge** is detected on _any_ of the enabled FIRQ source channels.
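The control word for automatic trigger mode can be composed as in the following sketch. The bit positions of `DMA_CTRL_EN`, `DMA_CTRL_AUTO` and the FIRQ mask are taken from the register map of this section. The mapping of FIRQ channel `n` to bit `16+n` is an assumption derived from the statement that the mask uses the same bit locations as <<_mip>>. Function names are hypothetical.

```c
#include <stdint.h>

#define DMA_CTRL_EN_BIT   0u /* DMA module enable */
#define DMA_CTRL_AUTO_BIT 1u /* automatic (FIRQ-triggered) mode */

/* Trigger mask bit for one FIRQ channel (0..15).
 * ASSUMPTION: channel 0 is located at bit 16, as in the mip CSR. */
uint32_t dma_firq_mask_bit(unsigned channel) {
    return 1u << (16u + (channel & 15u));
}

/* CTRL value that enables the DMA in automatic trigger mode for all
 * channels set in firq_mask (only bits 31:16 are used). */
uint32_t dma_ctrl_auto_word(uint32_t firq_mask) {
    return (1u << DMA_CTRL_EN_BIT) | (1u << DMA_CTRL_AUTO_BIT) | (firq_mask & 0xffff0000u);
}
```

Several trigger sources can be combined by OR-ing their mask bits before calling `dma_ctrl_auto_word`.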
**Memory Barrier / Fence Operation**
Optionally, the DMA can issue a FENCE request to the downstream memory system when a transfer has been completed
without errors. This can be used to re-sync caches (flush and reload) and buffers to maintain data coherency.
This automatic fencing is enabled by setting the control register's `DMA_CTRL_FENCE` bit.
**DMA Interrupt**
The DMA features a single CPU interrupt that is triggered when the programmed transfer has completed. This
interrupt is also triggered if the DMA encounters a bus error during operation. An active DMA interrupt has to be
explicitly cleared again by writing zero to the according <<_mip>> CSR bit.
**Register Map**
.DMA Register Map (`struct NEORV32_DMA`)
[cols="<2,<1,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.10+<| `0xffffed00` .10+<| `CTRL` <|`0` `DMA_CTRL_EN` ^| r/w <| DMA module enable
<|`1` `DMA_CTRL_AUTO` ^| r/w <| Enable automatic mode (FIRQ-triggered)
<|`2` `DMA_CTRL_FENCE` ^| r/w <| Issue a downstream FENCE operation when DMA transfer completes (without errors)
<|`7:3` _reserved_ ^| r/- <| reserved, read as zero
<|`8` `DMA_CTRL_ERROR_RD` ^| r/- <| Error during read access, clears when starting a new transfer
<|`9` `DMA_CTRL_ERROR_WR` ^| r/- <| Error during write access, clears when starting a new transfer
<|`10` `DMA_CTRL_BUSY` ^| r/- <| DMA transfer in progress
<|`11` `DMA_CTRL_DONE` ^| r/c <| Set if a transfer was executed; auto-clears on write-access
<|`15:12` _reserved_ ^| r/- <| reserved, read as zero
<|`31:16` `DMA_CTRL_FIRQ_MASK_MSB : DMA_CTRL_FIRQ_MASK_LSB` ^| r/w <| FIRQ trigger mask (same bits as in <<_mip>>)
| `0xffffed04` | `SRC_BASE` |`31:0` | r/w | Source base address (shows the last-accessed source address when read)
| `0xffffed08` | `DST_BASE` |`31:0` | r/w | Destination base address (shows the last-accessed destination address when read)
.6+<| `0xffffed0c` .6+<| `TTYPE` <|`23:0` `DMA_TTYPE_NUM_MSB : DMA_TTYPE_NUM_LSB` ^| r/w <| Number of elements to transfer (shows the last-transferred element index when read)
<|`26:24` _reserved_ ^| r/- <| reserved, read as zero
<|`28:27` `DMA_TTYPE_QSEL_MSB : DMA_TTYPE_QSEL_LSB` ^| r/w <| Data quantity select (`00` = byte to byte, `01` = byte to zero-extended word, `10` = byte to sign-extended word, `11` = word to word)
<|`29` `DMA_TTYPE_SRC_INC` ^| r/w <| Constant (`0`) or incrementing (`1`) source address
<|`30` `DMA_TTYPE_DST_INC` ^| r/w <| Constant (`0`) or incrementing (`1`) destination address
<|`31` `DMA_TTYPE_ENDIAN` ^| r/w <| Swap Endianness when set
|=======================


@ -0,0 +1,41 @@
<<<
:sectnums:
==== Data Memory (DMEM)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_dmem.entity.vhd | entity-only definition
| | mem/neorv32_dmem.default.vhd | default _platform-agnostic_ memory architecture
| | mem/neorv32_dmem.legacy.vhd | alternative legacy-style memory architecture
| Software driver file(s): | none | _implicitly used_
| Top entity port: | none |
| Configuration generics: | `MEM_INT_DMEM_EN` | implement processor-internal DMEM when `true`
| | `MEM_INT_DMEM_SIZE` | DMEM size in bytes (use a power of 2)
| CPU interrupts: | none |
|=======================
Implementation of the processor-internal data memory is enabled by the processor's `MEM_INT_DMEM_EN`
generic. The total memory size in bytes is defined via the `MEM_INT_DMEM_SIZE` generic. Note that this
size should be a power of two to optimize physical implementation. If the DMEM is implemented,
it is mapped to base address `0x80000000` by default (see section <<_address_space>>).
The DMEM is always implemented as true RAM.
.Memory Size
[IMPORTANT]
If the configured memory size (via the `MEM_INT_DMEM_SIZE` generic) is **not** a power of two the actual memory
size will be auto-adjusted to the next power of two (e.g. configuring a memory size of 60kB will result in a
physical memory size of 64kB).
.VHDL Source File
[NOTE]
The actual DMEM is split into two design files: a plain entity definition `neorv32_dmem.entity.vhd` and the actual
architecture definition `mem/neorv32_dmem.default.vhd`. This **default architecture** provides a _generic_ and
_platform independent_ memory design that infers embedded memory blocks (blockRAM). The default architecture can
be replaced by platform-specific modules in order to use platform-specific features or to improve technology mapping
and/or timing. A "legacy-style" memory architecture is provided in `rtl/mem` that can be used if the synthesis does
not correctly infer blockRAMs.
.Execute from RAM
[TIP]
The CPU is capable of executing code also from arbitrary data memory.


@ -0,0 +1,44 @@
<<<
:sectnums:
==== General Purpose Input and Output Port (GPIO)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_gpio.vhd |
| Software driver file(s): | neorv32_gpio.c |
| | neorv32_gpio.h |
| Top entity port: | `gpio_o` | 64-bit parallel output port
| | `gpio_i` | 64-bit parallel input port
| Configuration generics: | `IO_GPIO_NUM` | number of input/output pairs to implement (0..64)
| CPU interrupts: | none |
|=======================
The general purpose parallel IO unit provides a simple parallel input and output port. These ports can be used
chip-externally (for example to drive status LEDs, connect buttons, etc.) or chip-internally to provide control
signals for other IP modules.
The actual number of input/output pairs is defined by the `IO_GPIO_NUM` generic. When set to zero, the GPIO module
is excluded from synthesis and the output port `gpio_o` is tied to all-zero. If `IO_GPIO_NUM` is less than the
maximum value of 64, only the LSB-aligned bits in `gpio_o` and `gpio_i` are actually connected while the remaining
bits are tied to zero or are left unconnected, respectively.
.Access Atomicity
[NOTE]
The GPIO module uses two memory-mapped 32-bit registers each for accessing the input and
output signals. Since the CPU can only process 32 bits "at once", updating the entire output cannot
be performed within a single clock cycle.
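Because a 64-bit output value has to be written as two separate 32-bit words, software typically splits it first. The sketch below shows the split together with a mask for the pins that are actually implemented according to `IO_GPIO_NUM`. Type and function names are hypothetical; on the target, the two halves would be written to `OUTPUT_LO` and `OUTPUT_HI` one after the other.

```c
#include <stdint.h>

/* The two 32-bit halves of a 64-bit port value. */
typedef struct {
    uint32_t lo; /* pins 31:0  -> OUTPUT_LO / INPUT_LO */
    uint32_t hi; /* pins 63:32 -> OUTPUT_HI / INPUT_HI */
} gpio_words_t;

/* Split a 64-bit port value into its low and high register words. */
gpio_words_t gpio_split(uint64_t value) {
    gpio_words_t w;
    w.lo = (uint32_t)(value & 0xffffffffu);
    w.hi = (uint32_t)(value >> 32);
    return w;
}

/* Mask of the LSB-aligned pins that exist for a given IO_GPIO_NUM (0..64). */
uint64_t gpio_pin_mask(unsigned num_pins) {
    if (num_pins >= 64u) {
        return ~0ull;
    }
    return (1ull << num_pins) - 1ull;
}
```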
**Register Map**
.GPIO unit register map (`struct NEORV32_GPIO`)
[cols="<2,<2,^1,^1,<6"]
[options="header",grid="rows"]
|=======================
| Address | Name [C] | Bit(s) | R/W | Function
| `0xfffffc00` | `INPUT_LO` | 31:0 | r/- | parallel input port pins 31:0
| `0xfffffc04` | `INPUT_HI` | 31:0 | r/- | parallel input port pins 63:32
| `0xfffffc08` | `OUTPUT_LO` | 31:0 | r/w | parallel output port pins 31:0
| `0xfffffc0c` | `OUTPUT_HI` | 31:0 | r/w | parallel output port pins 63:32
|=======================


@ -0,0 +1,93 @@
<<<
:sectnums:
==== General Purpose Timer (GPTMR)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_gptmr.vhd |
| Software driver file(s): | neorv32_gptmr.c |
| | neorv32_gptmr.h |
| Top entity port: | `gptmr_trig_i` | timer capture input
| Configuration generics: | `IO_GPTMR_EN` | implement general purpose timer when `true`
| CPU interrupts: | fast IRQ channel 12 | timer interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The general purpose timer module implements a simple yet universal 32-bit timer. It is implemented if the processor's
`IO_GPTMR_EN` top generic is set `true`. The timer provides a pre-scaled counter register that can trigger an interrupt
when reaching a programmable threshold value. Additionally, a timer-capture feature is implemented that copies the current
counter value to a dedicated register if a programmable edge occurs at the `gptmr_trig_i` input signal.
Four interface registers are available: a control register (`CTRL`), a 32-bit counter register (`COUNT`), a 32-bit
threshold register (`THRES`) and a 32-bit read-only capture register (`CAPTURE`). The timer is globally enabled by setting the
`GPTMR_CTRL_EN` bit in the device's control register `CTRL`. When the timer is enabled the `COUNT` register will start
incrementing at a programmable rate, which scales the main processor clock. The pre-scaler value is configured via the
three `GPTMR_CTRL_PRSCx` control register bits:
.GPTMR prescaler configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`GPTMR_CTRL_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
[NOTE]
Disabling the timer will not clear the `COUNT` register. However, it can be manually reset at any time by
writing zero to it.
**Interval Timer**
Whenever the counter register `COUNT` reaches the programmable threshold value `THRES` the counter register
is reset to zero and the _timer-match_ flag `GPTMR_CTRL_TRIGM` gets set. This flag has to be cleared manually
by writing zero to it. Optionally, an interrupt can be triggered if the `GPTMR_CTRL_IRQM` bit is set.
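To pick a threshold for a given interval, the number of `COUNT` increments per interval can be derived from the processor clock and the selected prescaler. The sketch below uses the prescaler table of this section; the function names are hypothetical. Whether `THRES` has to be set to exactly this number or to one less depends on the exact match behavior, which is not specified here.

```c
#include <stdint.h>

/* Clock divider for each GPTMR_CTRL_PRSCx setting (from the table above). */
static const uint32_t gptmr_prescaler_table[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Resulting clock prescaler for a 3-bit prescaler select value. */
uint32_t gptmr_prescaler(unsigned prsc) {
    return gptmr_prescaler_table[prsc & 7u];
}

/* Number of COUNT increments within interval_us microseconds (rounded down,
 * saturated to 32 bits). */
uint32_t gptmr_increments(uint32_t clk_hz, unsigned prsc, uint32_t interval_us) {
    uint64_t divider = (uint64_t)gptmr_prescaler(prsc) * 1000000u;
    uint64_t n = ((uint64_t)clk_hz * interval_us) / divider;
    return (n > 0xffffffffull) ? 0xffffffffu : (uint32_t)n;
}
```

With a 100 MHz clock and prescaler 64 (`0b011`), one millisecond corresponds to 1562 counter increments.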
**Timer Capture**
In addition to the internal timer, the GPTMR provides a timer-capture feature. Whenever an edge is detected
at the `gptmr_trig_i` input signal the current `COUNT` value is copied to the read-only `CAPTURE` register and the
_capture-trigger_ flag `GPTMR_CTRL_TRIGC` gets set. This flag has to be cleared manually by writing zero to it.
Optionally, an interrupt can be triggered if the `GPTMR_CTRL_IRQC` bit is set.
The triggering edge can be a rising-edge (if `GPTMR_CTRL_RISE` is set), a falling-edge (if `GPTMR_CTRL_FALL` is
set) or even both. By default, the `gptmr_trig_i` is sampled two times at the processor clock for checking for
edges. This simple edge detection is sufficient for trigger signals that are generated by (on-chip) digital logic.
For sampling chip-external signals an optional filtering mode is available that can be enabled by the
`GPTMR_CTRL_FILTER` bit. If this bit is set, the `gptmr_trig_i` is sampled at a reduced clock speed (1/4 of the
processor clock) and the signal has to be stable for at least 4 sample clocks in order to be considered high or low.
This stabilized signal is then fed to the edge detection logic.
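The effect of the filtering mode can be modeled in software. The sketch below keeps the last four samples and only changes its output level when all four agree. This is an illustrative model with hypothetical names; the actual filter logic in `neorv32_gptmr.vhd` may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* State of the (hypothetical) input filter model. */
typedef struct {
    uint8_t history; /* last four samples, newest in bit 0 */
    bool    level;   /* current stabilized signal level */
} trig_filter_t;

/* Feed one sample (taken at the reduced sample clock) into the filter and
 * return the stabilized level. The level only changes after four identical
 * samples in a row. */
bool trig_filter_step(trig_filter_t *f, bool sample) {
    f->history = (uint8_t)(((f->history << 1) | (sample ? 1u : 0u)) & 0x0fu);
    if (f->history == 0x0fu) {
        f->level = true;
    } else if (f->history == 0x00u) {
        f->level = false;
    }
    return f->level;
}
```

A single-sample glitch therefore never reaches the edge detection, since it cannot produce four identical samples.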
.Timer Interrupt
[NOTE]
Once triggered, the timer interrupt remains pending within the CPU until it is explicitly cleared by writing zero
to the according <<_mip>> CSR bit.
**Register Map**
.GPTMR register map (`struct NEORV32_GPTMR`)
[cols="<4,<2,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.10+<| `0xfffff100` .10+<| `CTRL` <|`0` `GPTMR_CTRL_EN` ^| r/w <| Timer enable flag
<|`3:1` `GPTMR_CTRL_PRSC2 : GPTMR_CTRL_PRSC0` ^| r/w <| 3-bit clock prescaler select
<|`4` `GPTMR_CTRL_IRQM` ^| r/w <| Enable interrupt on timer-match
<|`5` `GPTMR_CTRL_IRQC` ^| r/w <| Enable interrupt on capture-trigger
<|`6` `GPTMR_CTRL_RISE` ^| r/w <| Capture on rising edge
<|`7` `GPTMR_CTRL_FALL` ^| r/w <| Capture on falling edge
<|`8` `GPTMR_CTRL_FILTER` ^| r/w <| Filter capture input
<|`29:9` - ^| r/- <| _reserved_, read as zero
<|`30` `GPTMR_CTRL_TRIGM` ^| r/c <| Timer-match has fired, cleared by writing 0
<|`31` `GPTMR_CTRL_TRIGC` ^| r/c <| Capture-trigger has fired, cleared by writing 0
| `0xfffff104` | `THRES` |`31:0` | r/w | Threshold value register
| `0xfffff108` | `COUNT` |`31:0` | r/w | Counter register
| `0xfffff10c` | `CAPTURE` |`31:0` | r/- | Capture register
|=======================


@ -0,0 +1,56 @@
<<<
:sectnums:
==== Processor-Internal Instruction Cache (iCACHE)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_icache.vhd |
| Software driver file(s): | none | _implicitly used_
| Top entity port: | none |
| Configuration generics: | `ICACHE_EN` | implement processor-internal instruction cache when `true`
| | `ICACHE_NUM_BLOCKS` | number of cache blocks (pages/lines)
| | `ICACHE_BLOCK_SIZE` | size of a cache block in bytes
| | `ICACHE_ASSOCIATIVITY` | associativity / number of sets
| CPU interrupts: | none |
|=======================
The processor features an optional instruction cache to improve performance when using memories with high
access latencies. The cache is directly connected to the CPU's instruction fetch interface and provides
fully transparent buffering of instruction fetch accesses to the **entire address space**.
The cache is implemented if the `ICACHE_EN` generic is `true`. The size of the cache memory is defined via
`ICACHE_BLOCK_SIZE` (the size of a single cache block/page/line in bytes; has to be a power of two and greater than or
equal to 4 bytes), `ICACHE_NUM_BLOCKS` (the total amount of cache blocks; has to be a power of two and greater than or
equal to 1) and the actual cache associativity `ICACHE_ASSOCIATIVITY` (number of sets; 1 = direct-mapped, 2 = 2-way
set-associative) generics. If the cache associativity is greater than one the LRU replacement policy (least recently
used) is used.
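For the 2-way configuration, the LRU policy needs only a single bit per pair of candidate blocks. The following host-side model illustrates the replacement decision for one such pair. It is a generic LRU sketch with hypothetical names and is not derived from `neorv32_icache.vhd`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two candidate blocks that share the same cache index. */
typedef struct {
    uint32_t tag[2];   /* tag stored in each of the two blocks */
    bool     valid[2]; /* block holds valid data */
    unsigned lru;      /* block (0 or 1) that was used least recently */
} cache_pair_t;

/* Access the pair with a given tag. Returns true on a hit. On a miss, an
 * invalid block is filled first; otherwise the least recently used block is
 * replaced. */
bool cache_pair_access(cache_pair_t *p, uint32_t tag) {
    for (unsigned i = 0; i < 2; i++) {
        if (p->valid[i] && p->tag[i] == tag) {
            p->lru = 1u - i; /* the other block is now the older one */
            return true;
        }
    }
    unsigned victim = !p->valid[0] ? 0u : (!p->valid[1] ? 1u : p->lru);
    p->tag[victim]   = tag;
    p->valid[victim] = true;
    p->lru           = 1u - victim;
    return false;
}
```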
**Cached/Uncached Accesses**
The instruction cache provides direct accesses (= uncached) to memory in order to access memory-mapped IO (like the
processor-internal IO/peripheral modules). All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
will not be cached at all (see section <<_address_space>>).
.Caching Internal Memories
[NOTE]
The instruction cache is intended to accelerate instruction fetches from **processor-external** memories
(via the external bus interface or via the XIP module). The cache(s) should not be implemented
when using only processor-internal data and instruction memories.
.Manual Cache Clear/Reload
[NOTE]
By executing the `fence(.i)` instruction the cache is cleared and a reload from main memory is triggered.
.Retrieve Cache Configuration from Software
[TIP]
Software can retrieve the cache configuration/layout from the <<_sysinfo_cache_configuration>> register.
.Bus Access Fault Handling
[NOTE]
The cache always loads a complete cache block (aligned to the block size) every time a
cache miss is detected. Each cached word from this block provides a single status bit that indicates if the
according bus access was successful or caused a bus error. Hence, the whole cache block remains valid even
if certain addresses inside caused a bus error. If the CPU accesses any of the faulty cache words, an
instruction bus error exception is raised.


@ -0,0 +1,52 @@
<<<
:sectnums:
==== Instruction Memory (IMEM)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_imem.entity.vhd | entity-only definition
| | mem/neorv32_imem.default.vhd | default _platform-agnostic_ memory architecture
| | mem/neorv32_imem.legacy.vhd | alternative legacy-style memory architecture
| Software driver file(s): | none | _implicitly used_
| Top entity port: | none |
| Configuration generics: | `MEM_INT_IMEM_EN` | implement processor-internal IMEM when `true`
| | `MEM_INT_IMEM_SIZE` | IMEM size in bytes (use a power of 2)
| | `INT_BOOTLOADER_EN` | use internal bootloader when `true` (implements IMEM as _uninitialized_ RAM, otherwise the IMEM is implemented as a _pre-initialized_ ROM)
| CPU interrupts: | none |
|=======================
Implementation of the processor-internal instruction memory is enabled by the processor's
`MEM_INT_IMEM_EN` generic. The total memory size in bytes is defined via the `MEM_INT_IMEM_SIZE` generic.
Note that this size should be a power of two to optimize physical implementation. If the IMEM is implemented,
it is mapped to base address `0x00000000` by default (see section <<_address_space>>).
By default the IMEM is implemented as true RAM so the content can be modified during run time. This is
required when using the bootloader (or the on-chip debugger) so it can update the content of the IMEM at
any time. If this feature is not required the IMEM can be implemented as _pre-initialized_ ROM so that the
application code permanently resides in memory. This is automatically implemented when the
processor-internal bootloader is disabled (`INT_BOOTLOADER_EN` = `false`).
When the IMEM is implemented as ROM, it will be initialized during synthesis with the actual application program
image. The compiler toolchain provides an option to generate and override the default VHDL initialization file
`rtl/core/neorv32_application_image.vhd`, which is automatically inserted into the IMEM. If the IMEM is implemented
as RAM (default), the memory block will **not be initialized at all**.
.Memory Size
[IMPORTANT]
If the configured memory size (via the `MEM_INT_IMEM_SIZE` generic) is **not** a power of two the actual memory
size will be auto-adjusted to the next power of two (e.g. configuring a memory size of 60kB will result in a
physical memory size of 64kB).
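The auto-adjustment described above can be modelled in software as a round-up to the next power of two. This is a minimal sketch of the arithmetic only; the helper name is hypothetical and the actual adjustment happens inside the VHDL memory architecture.

```c
#include <stdint.h>

/* Round 'size' (in bytes) up to the next power of two. Values that already
 * are a power of two are returned unchanged. The result saturates at 2^31
 * because larger powers of two do not fit into 32 bits. */
static uint32_t imem_physical_size(uint32_t size)
{
    uint32_t p = 1u;
    while ((p < size) && (p < 0x80000000u)) {
        p <<= 1;
    }
    return p;
}
```

For the example from the note, a configured size of 60kB (`60 * 1024`) yields 65536 bytes (64kB).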
.VHDL Source File
[NOTE]
The actual IMEM is split into two design files: a plain entity definition (`neorv32_imem.entity.vhd`) and the actual
architecture definition `mem/neorv32_imem.default.vhd`. This **default architecture** provides a _generic_ and
_platform independent_ memory design that infers embedded memory blocks (blockRAM). The default architecture can
be replaced by platform-specific modules in order to use platform-specific features or to improve technology mapping
and/or timing. A "legacy-style" memory architecture is provided in `rtl/mem` that can be used if the synthesis does
not correctly infer blockRAMs.
.Read-Only Access
[NOTE]
If the IMEM is implemented as true ROM any write attempt to it will raise a _store access fault_ exception.

View File

@ -0,0 +1,47 @@
<<<
:sectnums:
==== Machine System Timer (MTIME)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_mtime.vhd |
| Software driver file(s): | neorv32_mtime.c |
| | neorv32_mtime.h |
| Top entity port: | `mtime_irq_i` | RISC-V machine timer IRQ if internal one is **not** implemented
| | `mtime_time_o` | Current system time (`TIME` register)
| Configuration generics: | `IO_MTIME_EN` | implement machine timer when `true`
| CPU interrupts: | `MTI` | machine timer interrupt (see <<_processor_interrupts>>)
|=======================
The MTIME module implements a memory-mapped machine system timer that is compatible to the RISC-V
privileged specifications. The 64-bit system time is accessed via individual `TIME_LO` and
`TIME_HI` registers. A 64-bit time compare register, which is accessible via individual `TIMECMP_LO`
and `TIMECMP_HI` registers, can be used to configure the CPU's machine timer interrupt (`MTI`). The interrupt
is triggered whenever `TIME` (high & low part) is greater than or equal to `TIMECMP` (high & low part).
The interrupt remains active (=pending) until `TIME` becomes less than `TIMECMP` again (either by modifying
`TIME` or `TIMECMP`). The current system time is available for other SoC modules via the top's `mtime_time_o` signal.
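The interrupt condition described above is a plain 64-bit unsigned comparison. The following sketch models it in C; the function names are hypothetical and only mirror the text, they are not taken from the MTIME driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compose a 64-bit value from the high and low 32-bit register words. */
static uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* MTI is pending as long as TIME >= TIMECMP (both treated as unsigned). */
static bool mti_pending(uint64_t time, uint64_t timecmp)
{
    return time >= timecmp;
}
```

Taken literally, the condition also holds when both values are zero, so software would presumably program `TIMECMP` before enabling the machine timer interrupt.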
.Hardware Reset
[NOTE]
After a hardware reset the `TIME` and `TIMECMP` registers are reset to all-zero.
.External MTIME Interrupt
[NOTE]
If the internal MTIME module is disabled (`IO_MTIME_EN` = `false`) the machine timer interrupt becomes available
as external signal. The `mtime_irq_i` signal is level-triggered and high-active. Once set the signal has to stay
high until the interrupt request is explicitly acknowledged (e.g. writing to a user-defined memory-mapped register).
**Register Map**
.MTIME register map (`struct NEORV32_MTIME`)
[cols="<3,<3,^1,^1,<6"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bits | R/W | Function
| `0xfffff400` | `TIME_LO` | 31:0 | r/w | system time, low word
| `0xfffff404` | `TIME_HI` | 31:0 | r/w | system time, high word
| `0xfffff408` | `TIMECMP_LO` | 31:0 | r/w | time compare, low word
| `0xfffff40c` | `TIMECMP_HI` | 31:0 | r/w | time compare, high word
|=======================

View File

@ -0,0 +1,203 @@
<<<
:sectnums:
==== Smart LED Interface (NEOLED)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_neoled.vhd |
| Software driver file(s): | neorv32_neoled.c |
| | neorv32_neoled.h |
| Top entity port: | `neoled_o` | 1-bit serial data output
| Configuration generics: | `IO_NEOLED_EN` | implement NEOLED controller when `true`
| | `IO_NEOLED_TX_FIFO` | TX FIFO depth, has to be a power of 2, min 1
| CPU interrupts: | fast IRQ channel 9 | configurable NEOLED data FIFO interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The NEOLED module provides a dedicated interface for "smart RGB LEDs" like WS2812, WS2811 or any other compatible
LEDs. These LEDs provide a single-wire interface that uses an asynchronous serial protocol for transmitting color
data. Using the NEOLED module allows CPU-independent operation of an arbitrary number of smart LEDs. A configurable data
buffer (FIFO) enables block transfers without CPU intervention.
[NOTE]
The NEOLED interface is compatible to the "Adafruit Industries NeoPixel(TM)" products, which feature
WS2812 (or older WS2811) smart LEDs. Other LEDs might be compatible as well when adjusting the controller's programmable
timing configuration.
The interface provides a single 1-bit output `neoled_o` to drive an arbitrary number of cascaded LEDs. Since the
NEOLED module provides 24-bit and 32-bit operating modes, a mixed setup with RGB LEDs (24-bit color)
and RGBW LEDs (32-bit color including a dedicated white LED chip) is possible.
**Theory of Operation**
The NEOLED module provides two accessible interface registers: the control register `CTRL` and the write-only
TX data register `DATA`. The NEOLED module is globally enabled via the control register's
`NEOLED_CTRL_EN` bit. Clearing this bit will terminate any current operation, clear the TX buffer, reset the module
and set the `neoled_o` output to zero. The precise timing (e.g. implementing the **WS2812** protocol) and transmission
mode are fully programmable via the `CTRL` register to provide maximum flexibility.
**RGB / RGBW Configuration**
NeoPixel(TM) LEDs are available in two "color" versions: LEDs with three chips providing RGB color and LEDs with
four chips providing RGB color plus a dedicated white LED chip (= RGBW). Since the intensity of every
LED chip is defined via an 8-bit value the RGB LEDs require a frame of 24-bit per module and the RGBW
LEDs require a frame of 32-bit per module.
The data transfer quantity of the NEOLED module can be programmed via the `NEOLED_CTRL_MODE` control
register bit. If this bit is cleared, the NEOLED interface operates in 24-bit mode and will transmit bits `23:0` of
the data written to `DATA` to the LEDs. If `NEOLED_CTRL_MODE` is set, the NEOLED interface operates in 32-bit
mode and will transmit bits `31:0` of the data written to `DATA` to the LEDs.
The mode bit can be reconfigured before writing a new data word to `DATA` in order to support an arbitrary setup/mixture
of RGB and RGBW LEDs.
**Protocol**
The interface of the WS2812 LEDs uses an 800kHz carrier signal. Data is transmitted in a serial manner
starting with the most significant bit (MSB-first). The intensity for each R, G & B (& W) LED chip (= color code) is defined via an 8-bit
value. The actual data bits are transferred by modifying the duty cycle of the signal (the timings for the
WS2812 are shown below). A RESET command is "sent" by pulling the data line LOW for at least 50μs.
.WS2812 bit-level protocol - taken from the "Adafruit NeoPixel(TM) Überguide"
image::neopixel.png[align=center]
.WS2812 interface timing
[cols="<2,<2,<6"]
[grid="all"]
|=======================
| T~total~ (T~carrier~) | 1.25μs +/- 300ns | period for a single bit
| T~0H~ | 0.4μs +/- 150ns | high-time for sending a `0`
| T~0L~ | 0.8μs +/- 150ns | low-time for sending a `0`
| T~1H~ | 0.85μs +/- 150ns | high-time for sending a `1`
| T~1L~ | 0.45μs +/- 150ns | low-time for sending a `1`
| RESET | Above 50μs | low-time for sending a RESET command
|=======================
**Timing Configuration**
The basic carrier frequency (800kHz for the WS2812 LEDs) is configured via a 3-bit main clock prescaler
(`NEOLED_CTRL_PRSC*`, see table below) that scales the main processor clock f~main~ and a 5-bit cycle
multiplier `NEOLED_CTRL_T_TOT_*`.
.NEOLED Prescaler Configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`NEOLED_CTRL_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
The duty-cycles (or more precisely: the high- and low-times for sending either a '1' bit or a '0' bit) are
defined via the 5-bit `NEOLED_CTRL_T_ONE_H_*` and `NEOLED_CTRL_T_ZERO_H_*` values, respectively. These programmable
timing constants allow adapting the interface to a wide variety of smart LED protocols (for example WS2812 vs.
WS2811).
**Timing Configuration - Example (WS2812)**
Generate the base clock f~TX~ for the NEOLED TX engine:
* processor clock f~main~ = 100 MHz
* `NEOLED_CTRL_PRSCx` = `0b001` = f~main~ / 4
_**f~TX~**_ = _f~main~[Hz]_ / `clock_prescaler` = 100MHz / 4 = 25MHz
_**T~TX~**_ = 1 / _**f~TX~**_ = 40ns
Generate carrier period (T~carrier~) and *high-times* (duty cycle) for sending `0` (T~0H~) and `1` (T~1H~) bits:
* `NEOLED_CTRL_T_TOT` = `0b11110` (= decimal 30)
* `NEOLED_CTRL_T_ZERO_H` = `0b01010` (= decimal 10)
* `NEOLED_CTRL_T_ONE_H` = `0b10100` (= decimal 20)
_**T~carrier~**_ = _**T~TX~**_ * `NEOLED_CTRL_T_TOT` = 40ns * 30 = 1.2µs
_**T~0H~**_ = _**T~TX~**_ * `NEOLED_CTRL_T_ZERO_H` = 40ns * 10 = 0.4µs
_**T~1H~**_ = _**T~TX~**_ * `NEOLED_CTRL_T_ONE_H` = 40ns * 20 = 0.8µs
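The calculation above can be reproduced in software. The following sketch derives the three 5-bit timing values from the processor clock, the resulting `clock_prescaler` and the desired times in nanoseconds. All names are hypothetical and rounding to the nearest tick is an assumption; the driver library's own configuration function may compute the values differently.

```c
#include <stdint.h>

typedef struct {
    uint32_t t_total;  /* value for NEOLED_CTRL_T_TOT    */
    uint32_t t_zero_h; /* value for NEOLED_CTRL_T_ZERO_H */
    uint32_t t_one_h;  /* value for NEOLED_CTRL_T_ONE_H  */
} neoled_timing_t; /* hypothetical type name */

/* Convert a duration in ns into TX clock ticks (f_TX = f_main / prescaler),
 * rounded to the nearest tick. */
static uint32_t ns_to_ticks(uint32_t ns, uint32_t f_main_hz, uint32_t clock_prescaler)
{
    uint64_t f_tx = f_main_hz / clock_prescaler;
    return (uint32_t)((f_tx * ns + 500000000ull) / 1000000000ull);
}

/* Compute all timing values; returns -1 if any value exceeds 5 bits. */
static int neoled_timing(uint32_t f_main_hz, uint32_t clock_prescaler,
                         uint32_t t_total_ns, uint32_t t0h_ns, uint32_t t1h_ns,
                         neoled_timing_t *out)
{
    out->t_total  = ns_to_ticks(t_total_ns, f_main_hz, clock_prescaler);
    out->t_zero_h = ns_to_ticks(t0h_ns, f_main_hz, clock_prescaler);
    out->t_one_h  = ns_to_ticks(t1h_ns, f_main_hz, clock_prescaler);
    if ((out->t_total > 31u) || (out->t_zero_h > 31u) || (out->t_one_h > 31u)) {
        return -1;
    }
    return 0;
}
```

With f~main~ = 100MHz and `clock_prescaler` = 4, requesting 1200ns / 400ns / 800ns yields the values 30 / 10 / 20 from the example. With `clock_prescaler` = 2 the total period would need 60 ticks, which does not fit into 5 bits, so the function reports an error.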
[TIP]
The NEOLED SW driver library (`neorv32_neoled.h`) provides a simplified configuration
function that configures all timing parameters for driving WS2812 LEDs based on the processor
clock frequency.
**TX Data FIFO**
The interface features a configurable TX data buffer (a FIFO) to allow more CPU-independent operation. The buffer
depth is configured via the `IO_NEOLED_TX_FIFO` top generic (default = 1 entry). The FIFO size configuration can be
read via the `NEOLED_CTRL_BUFS_x` control register bits, which return log2(_IO_NEOLED_TX_FIFO_).
When writing data to the `DATA` register the data is automatically written to the TX buffer. Whenever
data is available in the buffer the serial transmission engine will take and transmit it to the LEDs.
The data transfer size (`NEOLED_CTRL_MODE`) can be modified at any time since this control register bit is also buffered
in the FIFO. This allows an arbitrary mix of RGB and RGBW LEDs in the chain.
Software can check the FIFO fill level via the control register's `NEOLED_CTRL_TX_EMPTY`, `NEOLED_CTRL_TX_HALF`
and `NEOLED_CTRL_TX_FULL` flags. The `NEOLED_CTRL_TX_BUSY` flag additionally indicates whether the serial
transmit engine is still busy sending data.
[WARNING]
Please note that the timing configurations (`NEOLED_CTRL_PRSCx`, `NEOLED_CTRL_T_TOT_x`,
`NEOLED_CTRL_T_ONE_H_x` and `NEOLED_CTRL_T_ZERO_H_x`) are **NOT** stored to the buffer. Changing
these values while the buffer is not empty or the TX engine is still busy will cause data corruption.
**Strobe Command ("RESET")**
According to the WS2812 specs the data written to the LED's shift registers is strobed to the actual PWM driver
registers when the data line is low for 50μs ("RESET" command, see table above). This can be implemented
using busy-wait for at least 50μs. Obviously, this concept wastes a lot of processing power.
To circumvent this, the NEOLED module provides an option to automatically issue an idle time for creating the RESET
command. If the `NEOLED_CTRL_STROBE` control register bit is set, _all_ data written to the data FIFO (via `DATA`,
the actually written data is irrelevant) will trigger an idle phase (`neoled_o` = zero) of 127 periods (= _**T~carrier~**_).
This idle time will cause the LEDs to strobe the color data into the PWM driver registers.
Since the `NEOLED_CTRL_STROBE` flag is also buffered in the TX buffer, the RESET command is treated just as another
data word being written to the TX buffer making busy wait concepts obsolete and allowing maximum refresh rates.
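Whether the automatic idle phase is long enough for a given carrier period can be checked with a short calculation. This is a sketch using the 127-period idle length and the 50μs minimum stated above; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NEOLED_STROBE_PERIODS 127u   /* idle periods per strobe (from the text) */
#define WS2812_RESET_MIN_NS   50000u /* minimum low-time of a RESET command */

/* Duration of the idle phase generated by one strobe command, in ns. */
static uint32_t neoled_strobe_idle_ns(uint32_t t_carrier_ns)
{
    return NEOLED_STROBE_PERIODS * t_carrier_ns;
}

/* True if one strobe command is long enough to act as a RESET command. */
static bool neoled_strobe_is_valid_reset(uint32_t t_carrier_ns)
{
    return neoled_strobe_idle_ns(t_carrier_ns) >= WS2812_RESET_MIN_NS;
}
```

For a carrier period of 1.2µs the idle phase lasts 152.4µs, which comfortably exceeds the 50μs requirement.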
**NEOLED Interrupt**
The NEOLED module features a single interrupt that triggers based on the current TX buffer fill level.
The interrupt can only become pending if the NEOLED module is enabled. The specific interrupt condition
is configured via the `NEOLED_CTRL_IRQ_CONF` bit in the unit's control register.
If `NEOLED_CTRL_IRQ_CONF` is set, the module's interrupt is generated whenever the TX FIFO is less than half-full.
In this case software can write up to `IO_NEOLED_TX_FIFO`/2 new data words to `DATA` without checking the FIFO
status flags. If `NEOLED_CTRL_IRQ_CONF` is cleared, an interrupt is generated when the TX FIFO is empty.
Once the NEOLED interrupt has been triggered and has become pending, it has to be explicitly cleared again by
writing zero to the according <<_mip>> CSR bit.
**Register Map**
.NEOLED register map (`struct NEORV32_NEOLED`)
[cols="<2,<1,<5,^1,<5"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.13+<| `0xfffffd00` .13+<| `CTRL` <|`0` `NEOLED_CTRL_EN` ^| r/w <| NEOLED enable
<|`1` `NEOLED_CTRL_MODE` ^| r/w <| data transfer size; `0`=24-bit; `1`=32-bit
<|`2` `NEOLED_CTRL_STROBE` ^| r/w <| `0`=send normal color data; `1`=send RESET command on data write access
<|`5:3` `NEOLED_CTRL_PRSC2 : NEOLED_CTRL_PRSC0` ^| r/w <| 3-bit clock prescaler select
<|`9:6` `NEOLED_CTRL_BUFS3 : NEOLED_CTRL_BUFS0` ^| r/- <| 4-bit log2(_IO_NEOLED_TX_FIFO_)
<|`14:10` `NEOLED_CTRL_T_TOT_4 : NEOLED_CTRL_T_TOT_0` ^| r/w <| 5-bit pulse clock ticks per total single-bit period (T~total~)
<|`19:15` `NEOLED_CTRL_T_ZERO_H_4 : NEOLED_CTRL_T_ZERO_H_0` ^| r/w <| 5-bit pulse clock ticks per high-time for sending a zero-bit (T~0H~)
<|`24:20` `NEOLED_CTRL_T_ONE_H_4 : NEOLED_CTRL_T_ONE_H_0` ^| r/w <| 5-bit pulse clock ticks per high-time for sending a one-bit (T~1H~)
<|`27` `NEOLED_CTRL_IRQ_CONF` ^| r/w <| TX FIFO interrupt configuration: `0`=IRQ if FIFO is empty, `1`=IRQ if FIFO is less than half-full
<|`28` `NEOLED_CTRL_TX_EMPTY` ^| r/- <| TX FIFO is empty
<|`29` `NEOLED_CTRL_TX_HALF` ^| r/- <| TX FIFO is _at least_ half full
<|`30` `NEOLED_CTRL_TX_FULL` ^| r/- <| TX FIFO is full
<|`31` `NEOLED_CTRL_TX_BUSY` ^| r/- <| TX serial engine is busy when set
| `0xfffffd04` | `DATA` <|`31:0` / `23:0` ^| -/w <| TX data (32- or 24-bit, depending on _NEOLED_CTRL_MODE_ bit)
|=======================
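The bit positions from the register map can be used to assemble a `CTRL` configuration word and to decode the FIFO depth. The sketch below uses only the bit positions listed in the table; the function names and parameters are hypothetical and do not reproduce the driver API in `neorv32_neoled.h`.

```c
#include <stdint.h>

/* Build a CTRL word with the module enabled. 'prsc' is the 3-bit prescaler
 * select; the three timing values are 5 bits each. 'mode32' selects 32-bit
 * mode and 'irq_half' selects the "less than half-full" IRQ condition. */
static uint32_t neoled_ctrl_word(uint32_t prsc, uint32_t t_total,
                                 uint32_t t_zero_h, uint32_t t_one_h,
                                 int mode32, int irq_half)
{
    uint32_t ctrl = 0u;
    ctrl |= 1u << 0;                      /* NEOLED_CTRL_EN            */
    ctrl |= (mode32 ? 1u : 0u) << 1;      /* NEOLED_CTRL_MODE          */
    ctrl |= (prsc & 0x07u) << 3;          /* NEOLED_CTRL_PRSC2..0      */
    ctrl |= (t_total & 0x1Fu) << 10;      /* NEOLED_CTRL_T_TOT_4..0    */
    ctrl |= (t_zero_h & 0x1Fu) << 15;     /* NEOLED_CTRL_T_ZERO_H_4..0 */
    ctrl |= (t_one_h & 0x1Fu) << 20;      /* NEOLED_CTRL_T_ONE_H_4..0  */
    ctrl |= (irq_half ? 1u : 0u) << 27;   /* NEOLED_CTRL_IRQ_CONF      */
    return ctrl;
}

/* Decode the TX FIFO depth from the read-only BUFS field (bits 9:6),
 * which holds log2 of the configured depth. */
static uint32_t neoled_fifo_depth(uint32_t ctrl)
{
    return 1u << ((ctrl >> 6) & 0x0Fu);
}
```

Using the WS2812 example values (prescaler select `0b001`, 30 / 10 / 20 ticks, 24-bit mode, IRQ on empty FIFO) results in the control word `0x01457809`.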

View File

@ -0,0 +1,190 @@
<<<
:sectnums:
==== One-Wire Serial Interface Controller (ONEWIRE)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_onewire.vhd |
| Software driver file(s): | neorv32_onewire.c |
| | neorv32_onewire.h |
| Top entity port: | `onewire_i` | 1-bit 1-wire bus sense input
| | `onewire_o` | 1-bit 1-wire bus output (pull low only)
| Configuration generics: | `IO_ONEWIRE_EN` | implement ONEWIRE interface controller when `true`
| CPU interrupts: | fast IRQ channel 13 | operation done interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The NEORV32 ONEWIRE module implements a single-wire interface controller that is compatible to the
_Dallas/Maxim 1-Wire_ protocol, which is an asynchronous half-duplex bus requiring only a single signal wire
(plus ground), which is driven via `onewire_o` and sensed via `onewire_i`.
The bus is based on a single open-drain signal. The controller and all the devices can only pull-down the bus actively.
Hence, an external pull-up resistor is required. Recommended values are between 1kΩ and 4kΩ depending on the bus
characteristics (wire length, number of devices, etc.). Furthermore, a series resistor (~100Ω) at the controller side
is recommended to control the slew rate and to reduce signal reflections. Also, additional external ESD protection clamp diodes
should be added to the bus line.
**Tri-State Drivers**
The ONEWIRE module requires a tri-state driver (actually, open-drain) for the 1-wire bus line, which has to be implemented
in the top module of the setup. A generic VHDL example is given below (`onewire` is the actual 1-wire
bus signal, which is of type `std_logic`).
.ONEWIRE VHDL tri-state driver example
[source,VHDL]
----
onewire <= '0' when (onewire_o = '0') else 'Z'; -- drive
onewire_i <= std_ulogic(onewire); -- sense
----
**Theory of Operation**
The ONEWIRE controller provides two interface registers: `CTRL` and `DATA`. The control register (`CTRL`)
is used to configure the module, to trigger bus transactions and to monitor the current state of the module.
The `DATA` register is used to read/write data from/to the bus.
The module is enabled by setting the `ONEWIRE_CTRL_EN` bit in the control register. If this bit is cleared, the
module is automatically reset and the bus is brought to high-level (due to the external pull-up resistor).
The basic timing configuration is programmed via the clock prescaler bits `ONEWIRE_CTRL_PRSCx` and the
clock divider bits `ONEWIRE_CTRL_CLKDIVx` (see next section).
The controller can execute three basic bus operations, which are triggered by setting one out of three specific
control register bits (the bits auto-clear):
[start=1]
. generate reset pulse and check for device presence; triggered when setting `ONEWIRE_CTRL_TRIG_RST`
. transfer a single-bit (read-while-write); triggered when setting `ONEWIRE_CTRL_TRIG_BIT`
. transfer a full-byte (read-while-write); triggered when setting `ONEWIRE_CTRL_TRIG_BYTE`
[IMPORTANT]
Only one trigger bit may be set at once, otherwise undefined behavior might occur.
When a single-bit operation has been triggered, the data previously written to `DATA[0]` will be sent to the bus
and `DATA[7]` will be sampled from the bus. Accordingly, a full-byte transmission will send the byte
previously written to `DATA[7:0]` to the bus and will update `DATA[7:0]` with the data read from the bus (LSB-first).
The triggered operation has completed when the module's busy flag `ONEWIRE_CTRL_BUSY` has cleared again.
.Read from Bus
[NOTE]
In order to read a single bit from the bus `DATA[0]` has to be set to `1` before triggering the bit transmission
operation to allow the accessed device to pull-down the bus. Accordingly, `DATA` has to be set to `0xFF` before
triggering the byte transmission operation when the controller shall read a byte from the bus.
The `ONEWIRE_CTRL_PRESENCE` bit gets set if at least one device has sent a "presence" signal right after the
reset pulse.
**Bus Timing**
The control register provides a 2-bit clock prescaler select (`ONEWIRE_CTRL_PRSCx`) and an 8-bit clock divider
(`ONEWIRE_CTRL_CLKDIVx`) for timing configuration. Both are used to define the elementary **base time T~base~**.
All bus operations are timed using _multiples_ of this elementary base time.
.ONEWIRE Clock Prescaler Configurations
[cols="<4,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`ONEWIRE_CTRL_PRSCx`** | `0b00` | `0b01` | `0b10` | `0b11`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64
|=======================
Together with the clock divider value (`ONEWIRE_CTRL_CLKDIVx` bits = `clock_divider`) the base time is defined by the
following formula:
_**T~base~**_ = (1 / _f~main~[Hz]_) * `clock_prescaler` * (`clock_divider` + 1)
Example:
* _f~main~_ = 100MHz
* clock prescaler select = `0b01` -> `clock_prescaler` = 4
* clock divider `clock_divider` = 249
_**T~base~**_ = (1 / 100000000Hz) * 4 * (249 + 1) = 10000ns = **10µs**
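The formula can also be evaluated, or inverted, in software. The sketch below computes T~base~ for a given configuration and searches for the smallest prescaler that can realize a requested base time. The function names are hypothetical and integer divisions truncate, so the search result is exact only if the requested time is an integer multiple of the prescaled clock period.

```c
#include <stdint.h>

/* Resulting clock_prescaler for each ONEWIRE_CTRL_PRSCx setting. */
static const uint32_t onewire_prescaler[4] = {2u, 4u, 8u, 64u};

/* T_base in ns for a prescaler select (0..3) and an 8-bit clock divider. */
static uint64_t onewire_t_base_ns(uint32_t f_main_hz, uint32_t prsc_sel,
                                  uint32_t clock_divider)
{
    uint64_t cycles = (uint64_t)onewire_prescaler[prsc_sel & 3u] * (clock_divider + 1u);
    return (cycles * 1000000000ull) / f_main_hz;
}

/* Find the smallest prescaler whose divider fits into 8 bits.
 * Returns 0 on success and -1 if the base time cannot be configured. */
static int onewire_find_config(uint32_t f_main_hz, uint32_t t_base_ns,
                               uint32_t *prsc_sel, uint32_t *clock_divider)
{
    /* number of main clock cycles per base time (truncated) */
    uint64_t cycles = ((uint64_t)f_main_hz * t_base_ns) / 1000000000ull;
    for (uint32_t sel = 0u; sel < 4u; sel++) {
        uint64_t div = cycles / onewire_prescaler[sel]; /* = clock_divider + 1 */
        if ((div >= 1u) && (div <= 256u)) {
            *prsc_sel = sel;
            *clock_divider = (uint32_t)(div - 1u);
            return 0;
        }
    }
    return -1;
}
```

For f~main~ = 100MHz and a requested base time of 10µs the search selects prescaler `0b01` and a clock divider of 249, which matches the example above.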
The base time is used to coordinate all bus interactions. Hence, all delays, time slots and points in time are
quantized as multiples of the base time. The following images show the two basic operations of the ONEWIRE
controller: single-bit (0 or 1) transaction and reset with presence detect. The relevant points in time are
shown as _absolute_ time (in multiples of the time base) with the bus' falling edge as reference point.
[cols="^2,^2"]
[grid="none"]
|=======================
a| image::onewire_data.png[align=center]
a| image::onewire_reset.png[align=center]
| Single-bit data transmission (not to scale) | Reset pulse and presence detect (not to scale)
|=======================
.Data Transmission Timing
[cols="<2,<6,^3,^3"]
[options="header",grid="rows"]
|=======================
| Symbol | Description | Multiples of T~base~ | Time when T~base~ = 10µs
4+^| **Single-bit data transmission**
| `t0` (a->b) | Time until end of active low-phase when writing a `'1'` or when reading | 1 | 10µs
| `t1` (a->c) | Time until controller samples bus state (read operation) | 2 | 20µs
| `t2` (a->d) | Time until end of bit time slot (when writing a `'0'` or when reading) | 7 | 70µs
| `t3` (a->e) | Time until end of inter-slot pause (= total duration of one bit) | 9 | 90µs
4+^| **Reset pulse and presence detect**
| `t4` (f->g) | Time until end of active reset pulse | 48 | 480µs
| `t5` (f->h) | Time until controller samples bus presence | 55 | 550µs
| `t6` (f->i) | Time until end of presence phase | 96 | 960µs
|=======================
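Since all points in time are multiples of T~base~, the absolute durations follow directly from the table. The sketch below lists the default multiples and derives the duration of a byte transfer, assuming that a byte simply consists of eight consecutive bit slots of length `t3` (an assumption that is not stated explicitly in the text). All identifiers are hypothetical.

```c
#include <stdint.h>

/* Default multiples of T_base, copied from the timing table above. */
enum {
    ONEWIRE_T0 = 1,  /* end of active low-phase (write '1' / read) */
    ONEWIRE_T1 = 2,  /* controller samples bus state               */
    ONEWIRE_T2 = 7,  /* end of bit time slot                       */
    ONEWIRE_T3 = 9,  /* end of inter-slot pause (one full bit)     */
    ONEWIRE_T4 = 48, /* end of active reset pulse                  */
    ONEWIRE_T5 = 55, /* controller samples bus presence            */
    ONEWIRE_T6 = 96  /* end of presence phase                      */
};

/* Absolute time in microseconds for a given multiple of T_base. */
static uint32_t onewire_time_us(uint32_t multiple, uint32_t t_base_us)
{
    return multiple * t_base_us;
}

/* Estimated duration of a full-byte transfer (eight bit slots). */
static uint32_t onewire_byte_time_us(uint32_t t_base_us)
{
    return 8u * (uint32_t)ONEWIRE_T3 * t_base_us;
}
```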
[NOTE]
The default values for base time multiples were chosen for stable and reliable bus
operation (not for maximum throughput).
The absolute points in time are hardwired by the VHDL code and cannot be changed during runtime.
However, the timing parameters can be customized by editing the ONEWIRE's VHDL source file:
.Hardwired time configuration in `neorv32_onewire.vhd`
[source,VHDL]
----
-- timing configuration (absolute time in multiples of the base tick time t_base) --
constant t_write_one_c : unsigned(6 downto 0) := to_unsigned( 1, 7); -- t0
constant t_read_sample_c : unsigned(6 downto 0) := to_unsigned( 2, 7); -- t1
constant t_slot_end_c : unsigned(6 downto 0) := to_unsigned( 7, 7); -- t2
constant t_pause_end_c : unsigned(6 downto 0) := to_unsigned( 9, 7); -- t3
constant t_reset_end_c : unsigned(6 downto 0) := to_unsigned(48, 7); -- t4
constant t_presence_sample_c : unsigned(6 downto 0) := to_unsigned(55, 7); -- t5
constant t_presence_end_c : unsigned(6 downto 0) := to_unsigned(96, 7); -- t6
----
.Overdrive
[IMPORTANT]
The ONEWIRE controller does not support the _overdrive_ mode. However, it can be implemented by reducing the base
time **T~base~** (and possibly by changing the hardwired timing configuration in the VHDL source file).
**Interrupt**
A single interrupt is provided by the ONEWIRE module to signal "operation done" condition to the CPU. Whenever the
controller completes a "generate reset pulse", a "transfer single-bit" or a "transfer full-byte" operation the
interrupt is triggered. Once triggered, the interrupt has to be _explicitly_ cleared again by writing zero to the
according <<_mip>> CSR FIRQ bit.
**Register Map**
.ONEWIRE register map (`struct NEORV32_ONEWIRE`)
[cols="<4,<2,<6,^2,<6"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.10+<| `0xfffff200` .10+<| `CTRL` <|`0` `ONEWIRE_CTRL_EN` ^| r/w <| ONEWIRE enable, reset if cleared
<|`2:1` `ONEWIRE_CTRL_PRSC1 : ONEWIRE_CTRL_PRSC0` ^| r/w <| 2-bit clock prescaler select
<|`10:3` `ONEWIRE_CTRL_CLKDIV7 : ONEWIRE_CTRL_CLKDIV0` ^| r/w <| 8-bit clock divider value
<|`11` `ONEWIRE_CTRL_TRIG_RST` ^| -/w <| trigger reset pulse, auto-clears
<|`12` `ONEWIRE_CTRL_TRIG_BIT` ^| -/w <| trigger single bit transmission, auto-clears
<|`13` `ONEWIRE_CTRL_TRIG_BYTE` ^| -/w <| trigger full-byte transmission, auto-clears
<|`28:14` - ^| r/- <| _reserved_, read as zero
<|`29` `ONEWIRE_CTRL_SENSE` ^| r/- <| current state of the bus line
<|`30` `ONEWIRE_CTRL_PRESENCE` ^| r/- <| device presence detected after reset pulse
<|`31` `ONEWIRE_CTRL_BUSY` ^| r/- <| operation in progress when set
| `0xfffff204` | `DATA` |`7:0` `ONEWIRE_DATA_MSB : ONEWIRE_DATA_LSB` | r/w | receive/transmit data (8-bit)
|=======================
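The register map translates into a small helper for assembling a `CTRL` word. Modelling the trigger as an enumeration makes it impossible to request more than one operation at once, which reflects the restriction noted earlier. This is a sketch with hypothetical names that uses only the bit positions from the table.

```c
#include <stdint.h>

/* Exactly one operation can be selected per CTRL write. */
typedef enum {
    OW_TRIG_NONE = 0,
    OW_TRIG_RST  = 1 << 11, /* ONEWIRE_CTRL_TRIG_RST  */
    OW_TRIG_BIT  = 1 << 12, /* ONEWIRE_CTRL_TRIG_BIT  */
    OW_TRIG_BYTE = 1 << 13  /* ONEWIRE_CTRL_TRIG_BYTE */
} ow_trigger_t;

/* Build a CTRL word with the module enabled (bit 0), the 2-bit prescaler
 * select (bits 2:1), the 8-bit clock divider (bits 10:3) and one trigger. */
static uint32_t onewire_ctrl_word(uint32_t prsc_sel, uint32_t clock_divider,
                                  ow_trigger_t trig)
{
    return 1u | ((prsc_sel & 0x03u) << 1) | ((clock_divider & 0xFFu) << 3) | (uint32_t)trig;
}

/* Status decoding for a value read from CTRL. */
static int onewire_busy(uint32_t ctrl)     { return (int)((ctrl >> 31) & 1u); }
static int onewire_presence(uint32_t ctrl) { return (int)((ctrl >> 30) & 1u); }
static int onewire_sense(uint32_t ctrl)    { return (int)((ctrl >> 29) & 1u); }
```

With the configuration from the base time example (prescaler select `0b01`, clock divider 249) and a byte transfer trigger, the resulting word is `0x000027CB`.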

View File

@ -0,0 +1,85 @@
<<<
:sectnums:
==== Pulse-Width Modulation Controller (PWM)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_pwm.vhd |
| Software driver file(s): | neorv32_pwm.c |
| | neorv32_pwm.h |
| Top entity port: | `pwm_o` | PWM output channels (12-bit)
| Configuration generics: | `IO_PWM_NUM_CH` | number of PWM channels to implement (0..12)
| CPU interrupts: | none |
|=======================
**Overview**
The PWM module implements a pulse-width modulation controller with up to 12 independent channels providing
8-bit resolution per channel. The actual number of implemented channels is defined by the `IO_PWM_NUM_CH` generic.
Setting this generic to zero will completely remove the PWM controller from the design.
[NOTE]
The `pwm_o` port has a static size of 12-bit. If fewer than 12 PWM channels are configured, only the LSB-aligned channel
bits are used while the remaining bits are hardwired to zero.
**Theory of Operation**
The PWM controller is activated by setting the `PWM_CTRL_EN` bit in the module's control register `CTRL`. When this
bit is cleared, the unit is reset and all PWM output channels are set to zero. The module
provides three duty cycle registers `DC[0]` to `DC[2]`. Each register contains the duty cycle configuration for four
consecutive channels. For example, the duty cycle of channel 0 is defined via bits 7:0 in `DC[0]`. The duty cycle of
channel 1 is defined via bits 15:8 in `DC[0]` and so on.
[NOTE]
Regardless of the configuration of `IO_PWM_NUM_CH` all module registers can be accessed without raising an exception.
Software can discover the number of available channels by writing 0xff to all duty cycle configuration bytes and
reading those values back. The duty-cycle of channels that were not implemented always reads as zero.
Based on the configured duty cycle the according intensity of the channel can be computed by the following formula:
_**Intensity~x~**_ = `DC[y](i*8+7 downto i*8)` / (2^8^)
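The mapping between a channel number and its duty cycle byte can be sketched as follows, assuming that channel _x_ is located in register `DC[x / 4]` at byte position `x % 4` (which is how the register map lists the channels). The functions operate on a plain array that stands in for the three `DC` registers; all names are hypothetical.

```c
#include <stdint.h>

/* Set the 8-bit duty cycle of 'channel' (0..11) in the DC register array. */
static void pwm_set_duty(uint32_t dc[3], unsigned channel, uint8_t duty)
{
    if (channel >= 12u) {
        return; /* no such channel */
    }
    unsigned shift = (channel % 4u) * 8u;
    dc[channel / 4u] = (dc[channel / 4u] & ~(0xFFu << shift)) | ((uint32_t)duty << shift);
}

/* Read back the 8-bit duty cycle of 'channel' (0..11). */
static uint8_t pwm_get_duty(const uint32_t dc[3], unsigned channel)
{
    if (channel >= 12u) {
        return 0u;
    }
    return (uint8_t)(dc[channel / 4u] >> ((channel % 4u) * 8u));
}

/* Relative intensity according to the formula above (duty / 2^8). */
static double pwm_intensity(uint8_t duty)
{
    return duty / 256.0;
}
```

Note that the maximum duty cycle value of 255 corresponds to an intensity of 255/256, so the formula never reaches exactly 1.0.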
The base frequency of the generated PWM signals is defined by the PWM core clock. This clock is derived
from the main processor clock and divided by a prescaler via the 3-bit `PWM_CTRL_PRSCx` in the unit's control
register.
.PWM prescaler configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`PWM_CTRL_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
The resulting PWM carrier frequency is defined by:
_**f~PWM~**_ = _f~main~[Hz]_ / (2^8^ * `clock_prescaler`)
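As a quick plausibility check, the carrier frequency can be computed for each prescaler setting. The following sketch uses integer arithmetic (the result is truncated to whole Hz); the function and table names are hypothetical.

```c
#include <stdint.h>

/* Resulting clock_prescaler for each PWM_CTRL_PRSCx setting. */
static const uint32_t pwm_prescaler[8] = {2u, 4u, 8u, 64u, 128u, 1024u, 2048u, 4096u};

/* PWM carrier frequency in Hz: f_main / (2^8 * clock_prescaler). */
static uint32_t pwm_frequency_hz(uint32_t f_main_hz, uint32_t prsc_sel)
{
    return f_main_hz / (256u * pwm_prescaler[prsc_sel & 7u]);
}
```

For f~main~ = 100MHz the available carrier frequencies range from about 195kHz (prescaler select `0b000`) down to about 95Hz (prescaler select `0b111`).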
**Register Map**
.PWM register map (`struct neorv32_pwm_t`)
[cols="<4,<2,<6,^2,<8"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.3+<| `0xfffff000` .3+<| `CTRL` <|`0` `PWM_CTRL_EN` ^| r/w <| PWM enable
<|`3:1` `PWM_CTRL_PRSC2 : PWM_CTRL_PRSC0` ^| r/w <| 3-bit clock prescaler select
<|`31:4` - ^| r/- <| _reserved_, read as zero
.4+<| `0xfffff004` .4+<| `DC[0]` <|`7:0` ^| r/w <| 8-bit duty cycle for channel 0
<|`15:8` ^| r/w <| 8-bit duty cycle for channel 1
<|`23:16` ^| r/w <| 8-bit duty cycle for channel 2
<|`31:24` ^| r/w <| 8-bit duty cycle for channel 3
.4+<| `0xfffff008` .4+<| `DC[1]` <|`7:0` ^| r/w <| 8-bit duty cycle for channel 4
<|`15:8` ^| r/w <| 8-bit duty cycle for channel 5
<|`23:16` ^| r/w <| 8-bit duty cycle for channel 6
<|`31:24` ^| r/w <| 8-bit duty cycle for channel 7
.4+<| `0xfffff00c` .4+<| `DC[2]` <|`7:0` ^| r/w <| 8-bit duty cycle for channel 8
<|`15:8` ^| r/w <| 8-bit duty cycle for channel 9
<|`23:16` ^| r/w <| 8-bit duty cycle for channel 10
<|`31:24` ^| r/w <| 8-bit duty cycle for channel 11
|=======================

View File

@ -0,0 +1,102 @@
<<<
:sectnums:
==== Serial Data Interface Controller (SDI)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_sdi.vhd |
| Software driver file(s): | neorv32_sdi.c |
| | neorv32_sdi.h |
| Top entity port: | `sdi_clk_i` | 1-bit serial clock input
| | `sdi_dat_o` | 1-bit serial data output
| | `sdi_dat_i` | 1-bit serial data input
| | `sdi_csn_i` | 1-bit chip-select input (low-active)
| Configuration generics: | `IO_SDI_EN` | implement SDI controller when `true`
| | `IO_SDI_FIFO` | data FIFO size, has to be a power of two, min 1
| CPU interrupts: | fast IRQ channel 11 | configurable SDI interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The serial data interface module provides a **device-class** SPI interface that connects the processor
to an external SPI _host_, which is responsible for triggering (clocking) the actual transmission - the SDI is entirely
passive. An optional receive/transmit FIFO can be configured via the _IO_SDI_FIFO_ generic to support block-based
transmissions without CPU interaction.
.Device-Mode Only
[NOTE]
The NEORV32 SDI module only supports _device mode_. Transmissions are initiated by an external host and not by
the processor itself. If you are looking for a _host-mode_ serial peripheral interface (transactions
initiated by the NEORV32) check out the <<_serial_peripheral_interface_controller_spi>>.
The SDI module provides a single control register `CTRL` to configure the module and to check its status
and a single data register `DATA` for receiving/transmitting data.
**Theory of Operation**
The SDI module is enabled by setting the `SDI_CTRL_EN` bit in the `CTRL` control register. Clearing this bit
resets the entire module including the RX and TX FIFOs.
The SDI operates on byte-level only. Data written to the `DATA` register will be pushed to the TX FIFO. Received
data can be retrieved by reading the RX FIFO via the `DATA` register. The current state of these FIFOs is available
via the control register's `SDI_CTRL_RX_*` and `SDI_CTRL_TX_*` flags. The RX FIFO can be manually cleared at any time
by setting the `SDI_CTRL_CLR_RX` bit.
.MSB-first Only
[NOTE]
The NEORV32 SDI module only supports MSB-first mode.
.Transmission Abort
[NOTE]
If the external SPI controller aborts a transmission (by setting the chip-select signal high again) _before_
8 data bits have been transferred, no data is written to the RX FIFO.
**SDI Clocking**
The SDI module supports both SPI clock polarity modes ("CPOL") but regarding the clock phase only "CPHA=0" is currently
supported. All SDI operations are clocked by the external `sdi_clk_i` signal. This signal is synchronized to the processor's
clock domain to simplify timing behavior. However, the clock synchronization requires that the external SDI clock
(`sdi_clk_i`) does **not exceed 1/4 of the processor's main clock**.
**SDI Interrupt**
The SDI module provides a set of programmable interrupt conditions based on the level of the RX & TX FIFOs. The different
interrupt sources are enabled by setting the according control register's `SDI_CTRL_IRQ` bits. All enabled interrupt
conditions are logically OR-ed so any enabled interrupt source will trigger the module's interrupt signal.
Once the SDI interrupt has fired it will remain active until the actual cause of the interrupt is resolved; for
example if just the `SDI_CTRL_IRQ_RX_AVAIL` bit is set, the interrupt will keep firing until the RX FIFO is empty again.
Furthermore, an active SDI interrupt has to be explicitly cleared again by writing zero to the according
<<_mip>> CSR bit.
**Register Map**
.SDI register map (`struct NEORV32_SDI`)
[cols="<2,<1,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.16+<| `0xfffff700` .16+<| `CTRL` <|`0` `SDI_CTRL_EN` ^| r/w <| SDI module enable
<|`1` `SDI_CTRL_CLR_RX` ^| -/w <| clear RX FIFO when set, bit auto-clears
<|`3:2` _reserved_ ^| r/- <| reserved, read as zero
<|`7:4` `SDI_CTRL_FIFO_MSB : SDI_CTRL_FIFO_LSB` ^| r/- <| FIFO depth; log2(_IO_SDI_FIFO_)
<|`14:8` _reserved_ ^| r/- <| reserved, read as zero
<|`15` `SDI_CTRL_IRQ_RX_AVAIL` ^| r/w <| fire interrupt if RX FIFO is not empty
<|`16` `SDI_CTRL_IRQ_RX_HALF` ^| r/w <| fire interrupt if RX FIFO is at least half full
<|`17` `SDI_CTRL_IRQ_RX_FULL` ^| r/w <| fire interrupt if RX FIFO is full
<|`18` `SDI_CTRL_IRQ_TX_EMPTY` ^| r/w <| fire interrupt if TX FIFO is empty
<|`22:19` _reserved_ ^| r/- <| reserved, read as zero
<|`23` `SDI_CTRL_RX_AVAIL` ^| r/- <| RX FIFO data available (RX FIFO not empty)
<|`24` `SDI_CTRL_RX_HALF` ^| r/- <| RX FIFO at least half full
<|`25` `SDI_CTRL_RX_FULL` ^| r/- <| RX FIFO full
<|`26` `SDI_CTRL_TX_EMPTY` ^| r/- <| TX FIFO empty
<|`27` `SDI_CTRL_TX_FULL` ^| r/- <| TX FIFO full
<|`31:28` _reserved_ ^| r/- <| reserved, read as zero
| `0xfffff704` | `DATA` |`7:0` | r/w | receive/transmit data (FIFO)
|=======================
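As an illustration of the register layout, the following sketch decodes a few fields from a raw `CTRL` value. The bit positions are copied from the table above; the function names are hypothetical and only serve this example. Note that the FIFO depth field stores log2 of the depth.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of bit positions from the SDI register map. */
enum {
  SDI_CTRL_FIFO_LSB = 4,  /* bits 7:4: log2(FIFO depth) */
  SDI_CTRL_RX_AVAIL = 23, /* RX FIFO not empty */
  SDI_CTRL_TX_FULL  = 27  /* TX FIFO full */
};

/* Hypothetical helper: FIFO depth in entries (field holds log2 of it). */
static uint32_t sdi_fifo_depth(uint32_t ctrl) {
  return 1u << ((ctrl >> SDI_CTRL_FIFO_LSB) & 0xFu);
}

/* Hypothetical helper: true if received data can be read from DATA. */
static bool sdi_rx_available(uint32_t ctrl) {
  return ((ctrl >> SDI_CTRL_RX_AVAIL) & 1u) != 0u;
}

/* Hypothetical helper: true if no more data can be written to DATA. */
static bool sdi_tx_full(uint32_t ctrl) {
  return ((ctrl >> SDI_CTRL_TX_FULL) & 1u) != 0u;
}
```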

View File

@ -0,0 +1,116 @@
<<<
:sectnums:
==== Stream Link Interface (SLINK)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_slink.vhd |
| Software driver file(s): | neorv32_slink.c |
| | neorv32_slink.h |
| Top entity port(s): | `slink_rx_dat_i` | RX link data (32-bit)
| | `slink_rx_val_i` | RX link data valid (1-bit)
| | `slink_rx_lst_i` | RX link last element of stream (1-bit)
| | `slink_rx_rdy_o` | RX link ready to receive (1-bit)
| | `slink_tx_dat_o` | TX link data (32-bit)
| | `slink_tx_val_o` | TX link data valid (1-bit)
| | `slink_tx_lst_o` | TX link last element of stream (1-bit)
| | `slink_tx_rdy_i` | TX link allowed to send (1-bit)
| Configuration generics: | `IO_SLINK_EN` | implement SLINK when _true_
| | `IO_SLINK_RX_FIFO` | RX FIFO depth (1..32k), has to be a power of two
| | `IO_SLINK_TX_FIFO` | TX FIFO depth (1..32k), has to be a power of two
| CPU interrupt: | fast IRQ channel 14 | SLINK IRQ (see <<_processor_interrupts>>)
|=======================
**Overview**
The stream link interface provides independent RX and TX channels for sending and receiving
stream data. Each channel features an internal FIFO with configurable depth to buffer stream data
(`IO_SLINK_RX_FIFO` for the RX FIFO, `IO_SLINK_TX_FIFO` for the TX FIFO). The SLINK interface provides higher
bandwidth and less latency than the external bus interface making it ideally suited for coupling custom
stream processing units or streaming peripherals.
.Example Program
[TIP]
An example program for the SLINK module is available in `sw/example/demo_slink`.
**Interface & Protocol**
The SLINK interface consists of four signals per channel:
* `dat` contains the actual data word
* `val` marks the current transmission cycle as valid
* `lst` marks the current transmission cycle as the last element of a stream
* `rdy` indicates that the receiving part is ready to receive
.AXI4-Stream Compatibility
[NOTE]
The interface names and the underlying protocol are compatible with the AXI4-Stream standard.
**Theory of Operation**
The SLINK provides four interface registers. The control register (`CTRL`) is used to configure
the module and to check its status. Three individual data registers (`RX_DATA`, `TX_DATA`, `TX_DATA_LAST`)
are used to send and receive the link's actual data stream.
The `RX_DATA` register provides direct access to the RX link FIFO buffer. After reading data from this register
the control register's `SLINK_CTRL_RX_LAST` can be checked to determine if the according data word has been marked
as "end of stream" via the `slink_rx_lst_i` signal (this signal is also buffered by the link's FIFO).
Writing to the `TX_DATA` or `TX_DATA_LAST` register will immediately write to the TX link FIFO buffer.
When writing to the `TX_DATA_LAST` the according data word will be marked as "end of stream" via the
`slink_tx_lst_o` signal (this signal is also buffered by the link's FIFO).
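When sending a stream of known length, only the final word has to go to `TX_DATA_LAST`. A minimal sketch of that selection is shown here; the register offsets are taken from the SLINK register map, while the macro and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Register offsets relative to the SLINK base address (from the register map). */
#define SLINK_OFFS_TX_DATA      0x08u
#define SLINK_OFFS_TX_DATA_LAST 0x0cu

/* Hypothetical helper: returns the register offset that word `index`
 * (0-based) of a stream with `length` words should be written to.
 * Only the final word is marked as "end of stream". */
static uint32_t slink_tx_offset(size_t index, size_t length) {
  return (index + 1u == length) ? SLINK_OFFS_TX_DATA_LAST : SLINK_OFFS_TX_DATA;
}
```

A stream consisting of a single word is therefore written to `TX_DATA_LAST` directly.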
The configured FIFO sizes can be retrieved by software via the control register's `SLINK_CTRL_RX_FIFO_*` and
`SLINK_CTRL_TX_FIFO_*` bits.
The SLINK is globally activated by setting the control register's enable bit `SLINK_CTRL_EN`. Clearing this bit will
reset all internal logic and will also clear both FIFOs. The FIFOs can also be cleared manually at any time by
setting the `SLINK_CTRL_RX_CLR` and `SLINK_CTRL_TX_CLR` bits (these bits will auto-clear).
[NOTE]
Writing to the TX channel's FIFO while it is _full_ will have no effect. Reading from the RX channel's FIFO while it
is _empty_ will also have no effect and will return the last received data word.
The current status of the RX and TX FIFOs can be determined via the `SLINK_CTRL_RX_*` and `SLINK_CTRL_TX_*` flags.
A global interrupt can be programmed based on these FIFO status flags via the control register's `SLINK_CTRL_IRQ_*`
bits. Note that all enabled interrupt conditions are logically OR-ed. Once the SLINK's interrupt has become pending,
it has to be explicitly cleared again by writing zero to the according <<_mip>> CSR bit(s).
**Register Map**
.SLINK register map (`struct NEORV32_SLINK`)
[cols="<2,<2,<4,^1,<4"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s) | R/W | Function
.22+<| `0xffffec00` .22+<| `NEORV32_SLINK.CTRL` <| `0` `SLINK_CTRL_EN` ^| r/w <| SLINK global enable
<| `1` `SLINK_CTRL_RX_CLR` ^| -/w <| Clear RX FIFO when set (bit auto-clears)
<| `2` `SLINK_CTRL_TX_CLR` ^| -/w <| Clear TX FIFO when set (bit auto-clears)
<| `3` _reserved_ ^| r/- <| _reserved_, read as zero
<| `4` `SLINK_CTRL_RX_LAST` ^| r/- <| Last word read from `RX_DATA` is marked as "end of stream"
<| `7:5` _reserved_ ^| r/- <| _reserved_, read as zero
<| `8` `SLINK_CTRL_RX_EMPTY` ^| r/- <| RX FIFO empty
<| `9` `SLINK_CTRL_RX_HALF` ^| r/- <| RX FIFO at least half full
<| `10` `SLINK_CTRL_RX_FULL` ^| r/- <| RX FIFO full
<| `11` `SLINK_CTRL_TX_EMPTY` ^| r/- <| TX FIFO empty
<| `12` `SLINK_CTRL_TX_HALF` ^| r/- <| TX FIFO at least half full
<| `13` `SLINK_CTRL_TX_FULL` ^| r/- <| TX FIFO full
<| `15:14` _reserved_ ^| r/- <| _reserved_, read as zero
<| `16` `SLINK_CTRL_IRQ_RX_NEMPTY` ^| r/w <| IRQ if RX FIFO not empty
<| `17` `SLINK_CTRL_IRQ_RX_HALF` ^| r/w <| IRQ if RX FIFO at least half full
<| `18` `SLINK_CTRL_IRQ_RX_FULL` ^| r/w <| IRQ if RX FIFO full
<| `19` `SLINK_CTRL_IRQ_TX_EMPTY` ^| r/w <| IRQ if TX FIFO empty
<| `20` `SLINK_CTRL_IRQ_TX_NHALF` ^| r/w <| IRQ if TX FIFO not at least half full
<| `21` `SLINK_CTRL_IRQ_TX_NFULL` ^| r/w <| IRQ if TX FIFO not full
<| `23:22` _reserved_ ^| r/- <| _reserved_, read as zero
<| `27:24` `SLINK_CTRL_RX_FIFO_MSB : SLINK_CTRL_RX_FIFO_LSB` ^| r/- <| log2(RX FIFO size)
<| `31:28` `SLINK_CTRL_TX_FIFO_MSB : SLINK_CTRL_TX_FIFO_LSB` ^| r/- <| log2(TX FIFO size)
| `0xffffec04` | `NEORV32_SLINK.RX_DATA` | `31:0` | r/- | Read word from RX link FIFO
| `0xffffec08` | `NEORV32_SLINK.TX_DATA` | `31:0` | -/w | Write word to TX link FIFO
| `0xffffec0c` | `NEORV32_SLINK.TX_DATA_LAST` | `31:0` | -/w | Write word to TX link FIFO and also set "end-of-stream" delimiter
|=======================
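The following sketch shows how the FIFO size fields and a simple configuration word could be derived from the table above. Bit positions are copied from the register map; the helper names are hypothetical and not taken from the NEORV32 driver.

```c
#include <stdint.h>

/* Local copies of bit positions from the SLINK register map. */
enum {
  SLINK_CTRL_EN            = 0,
  SLINK_CTRL_IRQ_RX_NEMPTY = 16,
  SLINK_CTRL_RX_FIFO_LSB   = 24, /* bits 27:24: log2(RX FIFO size) */
  SLINK_CTRL_TX_FIFO_LSB   = 28  /* bits 31:28: log2(TX FIFO size) */
};

/* Hypothetical helper: RX FIFO depth in entries. */
static uint32_t slink_rx_fifo_depth(uint32_t ctrl) {
  return 1u << ((ctrl >> SLINK_CTRL_RX_FIFO_LSB) & 0xFu);
}

/* Hypothetical helper: TX FIFO depth in entries. */
static uint32_t slink_tx_fifo_depth(uint32_t ctrl) {
  return 1u << ((ctrl >> SLINK_CTRL_TX_FIFO_LSB) & 0xFu);
}

/* Hypothetical helper: CTRL value that enables the module and requests
 * an interrupt as soon as the RX FIFO is not empty. */
static uint32_t slink_ctrl_rx_irq_config(void) {
  return (1u << SLINK_CTRL_EN) | (1u << SLINK_CTRL_IRQ_RX_NEMPTY);
}
```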

View File

@ -0,0 +1,140 @@
<<<
:sectnums:
==== Serial Peripheral Interface Controller (SPI)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_spi.vhd |
| Software driver file(s): | neorv32_spi.c |
| | neorv32_spi.h |
| Top entity port: | `spi_clk_o` | 1-bit serial clock output
| | `spi_dat_o` | 1-bit serial data output
| | `spi_dat_i` | 1-bit serial data input
| | `spi_csn_o` | 8-bit dedicated chip select output (low-active)
| Configuration generics: | `IO_SPI_EN` | implement SPI controller when `true`
| | `IO_SPI_FIFO` | FIFO depth, has to be a power of two, min 1
| CPU interrupts: | fast IRQ channel 6 | configurable SPI interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The NEORV32 SPI transceiver module operates on a byte-wise basis, supports all 4 standard clock modes
and provides up to 8 dedicated chip select signals via the top entity's `spi_csn_o` signal.
A receive/transmit FIFO can be configured via the `IO_SPI_FIFO` generic to support block-based
transmissions without CPU interaction.
The SPI module provides a single control register `CTRL` to configure the module and to check its status
and a single data register `DATA` for receiving/transmitting data.
.Host-Mode Only
[NOTE]
The NEORV32 SPI module only supports _host mode_. Transmissions are initiated only by the processor's SPI module
and not by an external SPI module. If you are looking for a _device-mode_ serial peripheral interface (transactions
initiated by an external host) check out the <<_serial_data_interface_controller_sdi>>.
**Theory of Operation**
The SPI module is enabled by setting the `SPI_CTRL_EN` bit in the `CTRL` control register. No transfer can be initiated
and no interrupt request will be triggered if this bit is cleared. Clearing this bit will reset the module, clear
the FIFO and terminate any transfer in progress.
The data quantity to be transferred within a single data transmission is fixed to 8 bits. However, the
total transmission length is left to the user: after asserting chip-select an arbitrary number of 8-bit transmissions
can be made before de-asserting chip-select again.
A transmission is started when writing data to the transmitter FIFO via the `DATA` register. Note that data is always
transferred MSB-first. The SPI operation is completed as soon as the `SPI_CTRL_BUSY` flag clears. Received data can
be retrieved by reading the RX FIFO also via the `DATA` register. The control register's `SPI_CTRL_RX_AVAIL`,
`SPI_CTRL_TX_EMPTY`, `SPI_CTRL_TX_NHALF` and `SPI_CTRL_TX_FULL` flags provide information regarding the RX/TX FIFO levels.
The SPI controller features 8 dedicated chip-select lines. These lines are controlled via the control register's
`SPI_CTRL_CS_SELx` and `SPI_CTRL_CS_EN` bits. The 3-bit `SPI_CTRL_CS_SELx` bits are used to select one out of the eight
dedicated chip select lines. As soon as `SPI_CTRL_CS_EN` is _set_ the selected chip select line is activated (driven _low_).
Note that disabling the SPI module via the _SPI_CTRL_EN_ bit will also deactivate any currently activated chip select line.
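The chip-select handling described above boils down to updating two fields of the control register. The following is a minimal sketch operating on a plain `CTRL` value; bit positions follow the SPI register map and the function names are hypothetical.

```c
#include <stdint.h>

/* Local copies of bit positions from the SPI register map. */
enum {
  SPI_CTRL_CS_SEL0 = 3, /* bits 5:3: chip-select line 0..7 */
  SPI_CTRL_CS_EN   = 6  /* selected line is driven low when set */
};

/* Hypothetical helper: returns `ctrl` with chip-select line `cs` (0..7)
 * selected and activated; all other bits are left unchanged. */
static uint32_t spi_cs_enable(uint32_t ctrl, unsigned cs) {
  ctrl &= ~((0x7u << SPI_CTRL_CS_SEL0) | (1u << SPI_CTRL_CS_EN));
  ctrl |= ((uint32_t)(cs & 0x7u) << SPI_CTRL_CS_SEL0) | (1u << SPI_CTRL_CS_EN);
  return ctrl;
}

/* Hypothetical helper: returns `ctrl` with the chip-select deactivated. */
static uint32_t spi_cs_disable(uint32_t ctrl) {
  return ctrl & ~(1u << SPI_CTRL_CS_EN);
}
```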
**SPI Clock Configuration**
The SPI module supports all standard SPI clock modes (0, 1, 2, 3), which are configured via the two control register bits
`SPI_CTRL_CPHA` and `SPI_CTRL_CPOL`. The `SPI_CTRL_CPHA` bit defines the _clock phase_ and the `SPI_CTRL_CPOL`
bit defines the _clock polarity_.
.SPI clock modes; image from https://en.wikipedia.org/wiki/File:SPI_timing_diagram2.svg (license: (Wikimedia) https://en.wikipedia.org/wiki/Creative_Commons[Creative Commons] https://creativecommons.org/licenses/by-sa/3.0/deed.en[Attribution-Share Alike 3.0 Unported])
image::SPI_timing_diagram2.wikimedia.png[]
The SPI clock frequency (`spi_clk_o`) is programmed by the 3-bit `SPI_CTRL_PRSCx` clock prescaler for a coarse clock selection
and a 4-bit clock divider `SPI_CTRL_CDIVx` for a fine clock configuration.
The following clock prescalers (`SPI_CTRL_PRSCx`) are available:
.SPI prescaler configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`SPI_CTRL_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
Based on the programmed clock configuration, the actual SPI clock frequency f~SPI~ is derived
from the processor's main clock f~main~ according to the following equation:
_**f~SPI~**_ = _f~main~[Hz]_ / (2 * `clock_prescaler` * (1 + `SPI_CTRL_CDIVx`))
Hence, the maximum SPI clock is f~main~ / 4 and the lowest SPI clock is f~main~ / 131072. The SPI clock is always
symmetric having a duty cycle of 50%.
**High-Speed Mode**
The SPI provides a high-speed mode to further boost the maximum SPI clock frequency. When enabled via the control
register's `SPI_CTRL_HIGHSPEED` bit the clock prescaler configuration (`SPI_CTRL_PRSCx` bits) is overridden setting it
to a minimal factor of 1. However, the clock speed can still be fine-tuned using the `SPI_CTRL_CDIVx` bits.
_**f~SPI~**_ = _f~main~[Hz]_ / (2 * 1 * (1 + `SPI_CTRL_CDIVx`))
Hence, the maximum SPI clock when in high-speed mode is f~main~ / 2.
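Both clock equations above can be evaluated with a small helper, which is handy when searching for a prescaler/divider pair that matches a target frequency. This is a sketch only; `spi_clock_hz` is a hypothetical name and the result is truncated to an integer number of Hz.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: resulting SPI clock in Hz (integer division).
 *   prsc      : 3-bit SPI_CTRL_PRSCx value
 *   cdiv      : 4-bit SPI_CTRL_CDIVx value
 *   highspeed : SPI_CTRL_HIGHSPEED, forces a prescaler factor of 1 */
static uint32_t spi_clock_hz(uint32_t f_main_hz, unsigned prsc, unsigned cdiv, bool highspeed) {
  static const uint32_t prescaler[8] = {2u, 4u, 8u, 64u, 128u, 1024u, 2048u, 4096u};
  uint32_t clock_prescaler = highspeed ? 1u : prescaler[prsc & 0x7u];
  return f_main_hz / (2u * clock_prescaler * (1u + (cdiv & 0xFu)));
}
```

With a 100 MHz main clock, `spi_clock_hz(100000000u, 0, 0, false)` evaluates to 25 MHz and the same call with `highspeed` set evaluates to 50 MHz.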
**SPI Interrupt**
The SPI module provides a set of programmable interrupt conditions based on the level of the RX/TX FIFO. The different
interrupt sources are enabled by setting the according control register's `SPI_CTRL_IRQ_*` bits. All enabled interrupt
conditions are logically OR-ed so any enabled interrupt source will trigger the module's interrupt signal.
Once the SPI interrupt has fired it remains pending until the actual cause of the interrupt is resolved; for
example if just the `SPI_CTRL_IRQ_RX_AVAIL` bit is set, the interrupt will keep firing until the RX FIFO is empty again.
Furthermore, an active SPI interrupt has to be explicitly cleared again by writing zero to the according
<<_mip>> CSR bit.
**Register Map**
.SPI register map (`struct NEORV32_SPI`)
[cols="<2,<1,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.19+<| `0xfffff800` .19+<| `CTRL` <|`0` `SPI_CTRL_EN` ^| r/w <| SPI module enable
<|`1` `SPI_CTRL_CPHA` ^| r/w <| clock phase
<|`2` `SPI_CTRL_CPOL` ^| r/w <| clock polarity
<|`5:3` `SPI_CTRL_CS_SEL2 : SPI_CTRL_CS_SEL0` ^| r/w <| Direct chip-select 0..7
<|`6` `SPI_CTRL_CS_EN` ^| r/w <| Direct chip-select enable: setting `spi_csn_o(SPI_CTRL_CS_SEL)` low when set
<|`9:7` `SPI_CTRL_PRSC2 : SPI_CTRL_PRSC0` ^| r/w <| 3-bit clock prescaler select
<|`13:10` `SPI_CTRL_CDIV3 : SPI_CTRL_CDIV0` ^| r/w <| 4-bit clock divider for fine-tuning
<|`14` `SPI_CTRL_HIGHSPEED` ^| r/w <| high-speed mode enable (overriding `SPI_CTRL_PRSC`)
<|`15` _reserved_ ^| r/- <| reserved, read as zero
<|`16` `SPI_CTRL_RX_AVAIL` ^| r/- <| RX FIFO data available (RX FIFO not empty)
<|`17` `SPI_CTRL_TX_EMPTY` ^| r/- <| TX FIFO empty
<|`18` `SPI_CTRL_TX_NHALF` ^| r/- <| TX FIFO _not_ at least half full
<|`19` `SPI_CTRL_TX_FULL` ^| r/- <| TX FIFO full
<|`20` `SPI_CTRL_IRQ_RX_AVAIL` ^| r/w <| Trigger IRQ if RX FIFO not empty
<|`21` `SPI_CTRL_IRQ_TX_EMPTY` ^| r/w <| Trigger IRQ if TX FIFO empty
<|`22` `SPI_CTRL_IRQ_TX_NHALF` ^| r/w <| Trigger IRQ if TX FIFO _not_ at least half full
<|`26:23` `SPI_CTRL_FIFO_MSB : SPI_CTRL_FIFO_LSB` ^| r/- <| FIFO depth; log2(_IO_SPI_FIFO_)
<|`30:27` _reserved_ ^| r/- <| reserved, read as zero
<|`31` `SPI_CTRL_BUSY` ^| r/- <| SPI module busy when set (serial engine operation in progress and TX FIFO not empty yet)
| `0xfffff804` | `DATA` |`7:0` | r/w | receive/transmit data (FIFO)
|=======================

View File

@ -0,0 +1,113 @@
<<<
:sectnums:
==== System Configuration Information Memory (SYSINFO)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_sysinfo.vhd |
| Software driver file(s): | neorv32_sysinfo.h |
| Top entity port: | none |
| Configuration generics: | * | most of the top's configuration generics
| CPU interrupts: | none |
|=======================
**Overview**
The SYSINFO allows the application software to determine the setting of most of the <<_processor_top_entity_generics>>
that are related to processor/SoC configuration. All registers of this unit are read-only.
This device is always implemented - regardless of the actual hardware configuration. The bootloader as well
as the NEORV32 software runtime environment require information from this device (like memory layout
and default clock frequency) for correct operation.
**Register Map**
.SYSINFO register map (`struct NEORV32_SYSINFO`)
[cols="<2,<1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Function
| `0xfffffe00` | `CLK` | clock frequency in Hz (via top's `CLOCK_FREQUENCY` generic)
| `0xfffffe04` | `MEM[4]` | internal memory configuration (see <<_sysinfo_memory_configuration>>)
| `0xfffffe08` | `SOC` | specific SoC configuration (see <<_sysinfo_soc_configuration>>)
| `0xfffffe0c` | `CACHE` | cache configuration information (see <<_sysinfo_cache_configuration>>)
|=======================
===== SYSINFO - Memory Configuration
[NOTE]
Bit fields in this register are set to all-zero if the according memory is not implemented.
.SYSINFO `MEM` Bytes
[cols="^1,<2,<7"]
[options="header",grid="all"]
|=======================
| Byte | Name [C] | Function
| `0` | `SYSINFO_MEM_IMEM` | _log2_(internal IMEM size in bytes), via top's `MEM_INT_IMEM_SIZE` generic
| `1` | `SYSINFO_MEM_DMEM` | _log2_(internal DMEM size in bytes), via top's `MEM_INT_DMEM_SIZE` generic
| `2` | - | _reserved_, read as zero
| `3` | `SYSINFO_MEM_RVSG` | _log2_(reservation set size granularity in bytes), via top's `AMO_RVS_GRANULARITY` generic
|=======================
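Since each byte of `MEM` stores a base-2 logarithm, the actual size is obtained by a left shift. A minimal sketch, assuming the four bytes have already been read into an array; the helper name is hypothetical.

```c
#include <stdint.h>

/* Byte indices copied from the table above. */
enum {
  SYSINFO_MEM_IMEM = 0,
  SYSINFO_MEM_DMEM = 1,
  SYSINFO_MEM_RVSG = 3
};

/* Hypothetical helper: converts the log2 value stored in byte `field`
 * into a size in bytes. Returns 0 if the stored exponent does not fit
 * into 32 bits. A stored zero yields 1, so callers should first check
 * the corresponding SOC flag to see if the memory exists at all. */
static uint32_t sysinfo_mem_bytes(const uint8_t mem[4], unsigned field) {
  uint8_t log2_size = mem[field & 0x3u];
  return (log2_size < 32u) ? (1u << log2_size) : 0u;
}
```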
===== SYSINFO - SoC Configuration
.SYSINFO `SOC` Bits
[cols="^2,<6,<10"]
[options="header",grid="all"]
|=======================
| Bit | Name [C] | Function
| `0` | `SYSINFO_SOC_BOOTLOADER` | set if the processor-internal bootloader is implemented (via top's `INT_BOOTLOADER_EN` generic)
| `1` | `SYSINFO_SOC_MEM_EXT` | set if the external Wishbone bus interface is implemented (via top's `MEM_EXT_EN` generic)
| `2` | `SYSINFO_SOC_MEM_INT_IMEM` | set if the processor-internal IMEM is implemented (via top's `MEM_INT_IMEM_EN` generic)
| `3` | `SYSINFO_SOC_MEM_INT_DMEM` | set if the processor-internal DMEM is implemented (via top's `MEM_INT_DMEM_EN` generic)
| `4` | `SYSINFO_SOC_MEM_EXT_ENDIAN` | set if external bus interface uses BIG-endian byte-order (via top's `MEM_EXT_BIG_ENDIAN` generic)
| `5` | `SYSINFO_SOC_ICACHE` | set if processor-internal instruction cache is implemented (via top's `ICACHE_EN` generic)
| `6` | `SYSINFO_SOC_DCACHE` | set if processor-internal data cache is implemented (via top's `DCACHE_EN` generic)
| `7` | `SYSINFO_SOC_CLOCK_GATING` | set if CPU clock gating is implemented (via top's `CLOCK_GATING_EN` generic)
| `11:8` | - | _reserved_, read as zero
| `12` | `SYSINFO_SOC_IO_CRC` | set if cyclic redundancy check unit is implemented (via top's `IO_CRC_EN` generic)
| `13` | `SYSINFO_SOC_IO_SLINK` | set if stream link interface is implemented (via top's `IO_SLINK_EN` generic)
| `14` | `SYSINFO_SOC_IO_DMA` | set if direct memory access controller is implemented (via top's `IO_DMA_EN` generic)
| `15` | `SYSINFO_SOC_IO_GPIO` | set if the GPIO is implemented (via top's `IO_GPIO_EN` generic)
| `16` | `SYSINFO_SOC_IO_MTIME` | set if the MTIME is implemented (via top's `IO_MTIME_EN` generic)
| `17` | `SYSINFO_SOC_IO_UART0` | set if the primary UART0 is implemented (via top's `IO_UART0_EN` generic)
| `18` | `SYSINFO_SOC_IO_SPI` | set if the SPI is implemented (via top's `IO_SPI_EN` generic)
| `19` | `SYSINFO_SOC_IO_TWI` | set if the TWI is implemented (via top's `IO_TWI_EN` generic)
| `20` | `SYSINFO_SOC_IO_PWM` | set if the PWM is implemented (via top's `IO_PWM_NUM_CH` generic)
| `21` | `SYSINFO_SOC_IO_WDT` | set if the WDT is implemented (via top's `IO_WDT_EN` generic)
| `22` | `SYSINFO_SOC_IO_CFS` | set if the custom functions subsystem is implemented (via top's `IO_CFS_EN` generic)
| `23` | `SYSINFO_SOC_IO_TRNG` | set if the TRNG is implemented (via top's `IO_TRNG_EN` generic)
| `24` | `SYSINFO_SOC_IO_SDI` | set if the SDI is implemented (via top's `IO_SDI_EN` generic)
| `25` | `SYSINFO_SOC_IO_UART1` | set if the secondary UART1 is implemented (via top's `IO_UART1_EN` generic)
| `26` | `SYSINFO_SOC_IO_NEOLED` | set if the NEOLED is implemented (via top's `IO_NEOLED_EN` generic)
| `27` | `SYSINFO_SOC_IO_XIRQ` | set if the XIRQ is implemented (via top's `XIRQ_NUM_CH` generic)
| `28` | `SYSINFO_SOC_IO_GPTMR` | set if the GPTMR is implemented (via top's `IO_GPTMR_EN` generic)
| `29` | `SYSINFO_SOC_XIP` | set if the XIP module is implemented (via top's `XIP_EN` generic)
| `30` | `SYSINFO_SOC_IO_ONEWIRE` | set if the ONEWIRE interface is implemented (via top's `IO_ONEWIRE_EN` generic)
| `31` | `SYSINFO_SOC_OCD` | set if on-chip debugger is implemented (via top's `ON_CHIP_DEBUGGER_EN` generic)
|=======================
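Software typically tests one of these flags before touching an optional peripheral. The sketch below works on a plain `SOC` value; the bit numbers are copied from the table above and `sysinfo_soc_has` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* A few bit positions copied from the SOC table. */
enum {
  SYSINFO_SOC_IO_SLINK = 13,
  SYSINFO_SOC_IO_SPI   = 18,
  SYSINFO_SOC_IO_TWI   = 19,
  SYSINFO_SOC_IO_TRNG  = 23,
  SYSINFO_SOC_IO_SDI   = 24
};

/* Hypothetical helper: true if feature flag `bit` is set in `soc`. */
static bool sysinfo_soc_has(uint32_t soc, unsigned bit) {
  return (bit < 32u) && (((soc >> bit) & 1u) != 0u);
}
```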
===== SYSINFO - Cache Configuration
[NOTE]
Bit fields in this register are set to all-zero if the according cache is not implemented.
.SYSINFO `CACHE` Bits
[cols="^1,<10,<10"]
[options="header",grid="all"]
|=======================
| Bit | Name [C] | Function
| `3:0` | `SYSINFO_CACHE_IC_BLOCK_SIZE_3 : SYSINFO_CACHE_IC_BLOCK_SIZE_0` | _log2_(i-cache block size in bytes), via top's `ICACHE_BLOCK_SIZE` generic
| `7:4` | `SYSINFO_CACHE_IC_NUM_BLOCKS_3 : SYSINFO_CACHE_IC_NUM_BLOCKS_0` | _log2_(i-cache number of cache blocks), via top's `ICACHE_NUM_BLOCKS` generic
| `11:8` | `SYSINFO_CACHE_IC_ASSOCIATIVITY_3 : SYSINFO_CACHE_IC_ASSOCIATIVITY_0` | _log2_(i-cache associativity), via top's `ICACHE_ASSOCIATIVITY` generic
| `15:12` | `SYSINFO_CACHE_IC_REPLACEMENT_3 : SYSINFO_CACHE_IC_REPLACEMENT_0` | i-cache replacement policy (`0001` = LRU if associativity > 0)
| `19:16` | `SYSINFO_CACHE_DC_BLOCK_SIZE_3 : SYSINFO_CACHE_DC_BLOCK_SIZE_0` | _log2_(d-cache block size in bytes), via top's `DCACHE_BLOCK_SIZE` generic
| `23:20` | `SYSINFO_CACHE_DC_NUM_BLOCKS_3 : SYSINFO_CACHE_DC_NUM_BLOCKS_0` | _log2_(d-cache number of cache blocks), via top's `DCACHE_NUM_BLOCKS` generic
| `27:24` | `SYSINFO_CACHE_DC_ASSOCIATIVITY_3 : SYSINFO_CACHE_DC_ASSOCIATIVITY_0` | always zero
| `31:28` | `SYSINFO_CACHE_DC_REPLACEMENT_3 : SYSINFO_CACHE_DC_REPLACEMENT_0` | always zero
|=======================
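The block size and block count fields can be unpacked as follows. This is a sketch with hypothetical type and function names; it only decodes the two lowest fields of each cache, whose positions are given in the table above.

```c
#include <stdint.h>

/* Hypothetical container for the decoded values. */
typedef struct {
  uint32_t block_size_bytes;
  uint32_t num_blocks;
} cache_geometry_t;

/* Hypothetical helper: i-cache block size (bits 3:0) and block count
 * (bits 7:4); both fields store log2 of the actual value. */
static cache_geometry_t sysinfo_icache_geometry(uint32_t cache) {
  cache_geometry_t g = { 1u << (cache & 0xFu), 1u << ((cache >> 4) & 0xFu) };
  return g;
}

/* Hypothetical helper: d-cache block size (bits 19:16) and block count
 * (bits 23:20). */
static cache_geometry_t sysinfo_dcache_geometry(uint32_t cache) {
  cache_geometry_t g = { 1u << ((cache >> 16) & 0xFu), 1u << ((cache >> 20) & 0xFu) };
  return g;
}
```

Because the fields read as zero for a cache that is not implemented, the result is only meaningful if the matching `SYSINFO_SOC_ICACHE` or `SYSINFO_SOC_DCACHE` flag is set.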

View File

@ -0,0 +1,86 @@
<<<
:sectnums:
==== True Random-Number Generator (TRNG)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_trng.vhd |
| Software driver file(s): | neorv32_trng.c |
| | neorv32_trng.h |
| Top entity port: | none |
| Configuration generics: | `IO_TRNG_EN` | implement TRNG when `true`
| | `IO_TRNG_FIFO` | data FIFO depth, min 1, has to be a power of two
| CPU interrupts: | fast IRQ channel 15 | TRNG FIFO level interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The NEORV32 true random number generator provides _physically_ true random numbers. It is based on free-running
ring-oscillators that generate **phase noise** when being sampled by a constant clock. This phase noise is
used as physical entropy source. The TRNG features a platform independent architecture without FPGA-specific
primitives, macros or attributes so it can be synthesized for _any_ FPGA.
.In-Depth Documentation
[TIP]
For more information about the neoTRNG architecture and an analysis of its random quality check out the
neoTRNG repository: https://github.com/stnolting/neoTRNG
.Inferring Latches
[NOTE]
The synthesis tool might emit warnings regarding **inferred latches** or **combinatorial loops**. However, this
is not a design flaw as this is exactly what we want. ;)
.Simulation
[IMPORTANT]
When simulating the processor the TRNG is automatically set to "simulation mode". In this mode the physical entropy
sources (the ring oscillators) are replaced by a simple **pseudo RNG** based on an LFSR providing only
**deterministic pseudo-random** data. The `TRNG_CTRL_SIM_MODE` flag of the control register is set if simulation
mode is active.
**Theory of Operation**
The TRNG features a single control register `CTRL` for control, status check and data access. When the `TRNG_CTRL_EN`
bit is set, the TRNG is enabled and starts operation. As soon as the `TRNG_CTRL_VALID` bit is set a new random data byte
is available and can be obtained from the lowest 8 bits of the `CTRL` register. If this bit is cleared, there is no
valid data available and the lowest 8 bits of the `CTRL` register are set to all-zero.
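Because the valid flag and the data byte share one register, both should be taken from the same read. A minimal sketch of that check on an already-read `CTRL` value; `trng_extract` is a hypothetical helper, not the driver's API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position copied from the TRNG register map. */
enum { TRNG_CTRL_VALID = 31 };

/* Hypothetical helper: if `ctrl` carries valid random data, stores the
 * byte from bits 7:0 in `*data` and returns true; otherwise leaves
 * `*data` untouched and returns false. */
static bool trng_extract(uint32_t ctrl, uint8_t *data) {
  if (((ctrl >> TRNG_CTRL_VALID) & 1u) == 0u) {
    return false;
  }
  *data = (uint8_t)(ctrl & 0xFFu);
  return true;
}
```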
An internal entropy FIFO can be configured using the `IO_TRNG_FIFO` generic. This FIFO automatically samples
new random data from the TRNG to provide some kind of _random data pool_ for applications, which require a large amount
of random data in a short time. The random data FIFO can be cleared at any time either by disabling the TRNG or by
setting the `TRNG_CTRL_FIFO_CLR` flag. The FIFO depth can be retrieved by software via the `TRNG_CTRL_FIFO_*` bits.
**TRNG Interrupt**
The TRNG provides a single interrupt channel that can be programmed to trigger on certain FIFO fill-level conditions.
This feature can be used to inform the CPU that a certain amount of entropy is available for further processing. Using
the control register's `TRNG_CTRL_IRQ_*` bits the IRQ can be configured to trigger if the data FIFO is not empty
(`TRNG_CTRL_IRQ_FIFO_NEMPTY`), if the data FIFO is at least half full (`TRNG_CTRL_IRQ_FIFO_HALF`) or if the data FIFO is
entirely full (`TRNG_CTRL_IRQ_FIFO_FULL`). Note that all enabled interrupt conditions are logically OR-ed.
Once the TRNG interrupt has fired it remains pending until the actual cause of the interrupt is resolved. Furthermore,
an active TRNG interrupt has to be explicitly cleared again by writing zero to the according <<_mip>> CSR bit.
**Register Map**
.TRNG register map (`struct NEORV32_TRNG`)
[cols="<2,<1,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.11+<| `0xfffffa00` .11+<| `CTRL` <|`7:0` `TRNG_CTRL_DATA_MSB : TRNG_CTRL_DATA_LSB` ^| r/- <| 8-bit random data
<|`15:8` - ^| r/- <| reserved, read as zero
<|`19:16` `TRNG_CTRL_FIFO_MSB : TRNG_CTRL_FIFO_LSB` ^| r/- <| FIFO depth, log2(`IO_TRNG_FIFO`)
<|`24:20` - ^| r/- <| reserved, read as zero
<|`25` `TRNG_CTRL_IRQ_FIFO_NEMPTY` ^| r/w <| IRQ if data FIFO is not empty
<|`26` `TRNG_CTRL_IRQ_FIFO_HALF` ^| r/w <| IRQ if data FIFO is at least half full
<|`27` `TRNG_CTRL_IRQ_FIFO_FULL` ^| r/w <| IRQ if data FIFO is full
<|`28` `TRNG_CTRL_FIFO_CLR` ^| -/w <| flush random data FIFO when set; auto-clears
<|`29` `TRNG_CTRL_SIM_MODE` ^| r/- <| simulation mode (PRNG!)
<|`30` `TRNG_CTRL_EN` ^| r/w <| TRNG enable
<|`31` `TRNG_CTRL_VALID` ^| r/- <| random data is valid when set
|=======================

View File

@ -0,0 +1,140 @@
<<<
:sectnums:
==== Two-Wire Serial Interface Controller (TWI)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_twi.vhd |
| Software driver file(s): | neorv32_twi.c |
| | neorv32_twi.h |
| Top entity port: | `twi_sda_i` | 1-bit serial data line sense input
| | `twi_sda_o` | 1-bit serial data line output (pull low only)
| | `twi_scl_i` | 1-bit serial clock line sense input
| | `twi_scl_o` | 1-bit serial clock line output (pull low only)
| Configuration generics: | `IO_TWI_EN` | implement TWI controller when `true`
| CPU interrupts: | fast IRQ channel 7 | transmission done interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The NEORV32 TWI implements a **TWI controller**. Currently, **no multi-controller support** is available.
Furthermore, the NEORV32 TWI unit cannot operate in peripheral mode.
[IMPORTANT]
The serial clock (SCL) and the serial data (SDA) lines can only be actively driven low by the
controller. Hence, external pull-up resistors are required for these lines.
**Tri-State Drivers**
The TWI module requires two tri-state drivers (actually: open-drain) for the SDA and SCL lines, which have to be
implemented in the top module of the setup. A generic VHDL example is given below (`sda` and `scl` are the actual TWI
bus signals, which are of type `std_logic`).
.TWI VHDL tri-state driver example
[source,VHDL]
----
sda <= '0' when (twi_sda_o = '0') else 'Z'; -- drive
scl <= '0' when (twi_scl_o = '0') else 'Z'; -- drive
twi_sda_i <= std_ulogic(sda); -- sense
twi_scl_i <= std_ulogic(scl); -- sense
----
**TWI Clock Speed**
The TWI clock frequency is programmed by the 3-bit `TWI_CTRL_PRSCx` clock prescaler for a coarse selection
and a 4-bit clock divider `TWI_CTRL_CDIVx` for a fine selection.
.TWI prescaler configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`TWI_CTRL_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
Based on the clock configuration, the actual TWI clock frequency f~SCL~ is derived
from the processor's main clock f~main~ according to the following equation:
_**f~SCL~**_ = _f~main~[Hz]_ / (4 * `clock_prescaler` * (1 + `TWI_CTRL_CDIVx`))
Hence, the maximum TWI clock is f~main~ / 8 and the lowest TWI clock is f~main~ / 262144. The generated TWI clock is
always symmetric having a duty cycle of exactly 50%. However, an accessed peripheral can "slow down" the bus clock
by using **clock stretching** (= actively driving the SCL line low). The controller will pause operation in this case
if clock stretching is enabled via the `TWI_CTRL_CSEN` bit of the unit's control register `CTRL`.
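The TWI clock equation can be evaluated in the same way as for other prescaler-based modules. The following sketch uses a hypothetical helper name and truncates the result to an integer number of Hz; it describes the nominal clock only and ignores any slow-down caused by clock stretching.

```c
#include <stdint.h>

/* Hypothetical helper: nominal SCL frequency in Hz (integer division).
 *   prsc : 3-bit TWI_CTRL_PRSCx value
 *   cdiv : 4-bit TWI_CTRL_CDIVx value */
static uint32_t twi_clock_hz(uint32_t f_main_hz, unsigned prsc, unsigned cdiv) {
  static const uint32_t prescaler[8] = {2u, 4u, 8u, 64u, 128u, 1024u, 2048u, 4096u};
  return f_main_hz / (4u * prescaler[prsc & 0x7u] * (1u + (cdiv & 0xFu)));
}
```

For example, a 100 MHz main clock with `prsc = 3` and `cdiv = 3` gives 100 MHz / (4 * 64 * 4), which is roughly 97.6 kHz.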
**TWI Transfers**
The TWI is enabled via the `TWI_CTRL_EN` bit in the `CTRL` control register. The user program can start / stop a
transmission by issuing a START or STOP condition. These conditions are generated by setting the
according bits (`TWI_CTRL_START` or `TWI_CTRL_STOP`) in the control register.
Data is transferred via the TWI bus by writing a byte to the `DATA` register. The written byte is sent via the TWI bus
and the received byte from the bus is also available in this register after the transmission is completed.
The TWI operation (transmitting data or performing a START or STOP condition) is in progress as long as the
control register's `TWI_CTRL_BUSY` bit is set.
[TIP]
A transmission can be terminated at any time by disabling the TWI module
by clearing the _TWI_CTRL_EN_ control register bit. This will also reset the whole module.
[NOTE]
When reading data from a device, an all-one byte (`0xFF`) has to be written to TWI data register `NEORV32_TWI.DATA`
so the accessed device can actively pull-down SDA when required.
**TWI ACK/NACK and MACK**
An accessed TWI peripheral has to acknowledge each transferred byte. When the `TWI_CTRL_ACK` bit is set after a
completed transmission the accessed peripheral has sent an ACKNOWLEDGE (ACK). If this bit is cleared after a completed
transmission, the peripheral has sent a NOT-ACKNOWLEDGE (NACK).
The NEORV32 TWI controller can also send an ACK generated by itself ("controller acknowledge / MACK") right after
transmitting a byte by driving SDA low during the ACK time slot. Some TWI modules require this MACK to acknowledge
certain data movement operations.
The control register's `TWI_CTRL_MACK` bit has to be set to make the TWI module automatically generate a MACK after
the byte transmission has been completed. If this bit is cleared, the ACK/NACK generated by the peripheral is sampled
in this time slot instead (normal mode).
**TWI Bus Status**
The TWI controller can check if the TWI bus is currently claimed (SCL and SDA both low). The bus can be claimed by the
NEORV32 TWI itself or by any other controller. Bit `TWI_CTRL_CLAIMED` of the control register will be set if the bus
is currently claimed.
**TWI Interrupt**
The TWI module provides a single interrupt to signal "transmission done" to the CPU. Whenever the TWI
module completes the current transmission of one byte the interrupt is triggered. Note that the interrupt
is **not** triggered when completing a START or STOP condition. Once triggered, the interrupt has to be
explicitly cleared again by writing zero to the according <<_mip>> CSR bit.
**Register Map**
.TWI register map (`struct NEORV32_TWI`)
[cols="<2,<1,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.10+<| `0xfffff900` .10+<| `CTRL` <|`0` `TWI_CTRL_EN` ^| r/w <| TWI enable, reset if cleared
<|`1` `TWI_CTRL_START` ^| -/w <| generate START condition, auto-clears
<|`2` `TWI_CTRL_STOP` ^| -/w <| generate STOP condition, auto-clears
<|`3` `TWI_CTRL_MACK` ^| r/w <| generate controller-ACK for each transmission ("MACK")
<|`4` `TWI_CTRL_CSEN` ^| r/w <| allow clock stretching when set
<|`7:5` `TWI_CTRL_PRSC2 : TWI_CTRL_PRSC0` ^| r/w <| 3-bit clock prescaler select
<|`11:8` `TWI_CTRL_CDIV3 : TWI_CTRL_CDIV0` ^| r/w <| 4-bit clock divider
<|`28:12` - ^| r/- <| _reserved_, read as zero
<|`29` `TWI_CTRL_CLAIMED` ^| r/- <| set if the TWI bus is claimed by any controller
<|`30` `TWI_CTRL_ACK` ^| r/- <| ACK received when set, NACK received when cleared
<|`31` `TWI_CTRL_BUSY` ^| r/- <| transfer/START/STOP in progress when set
| `0xfffff904` | `DATA` |`7:0` | r/w | receive/transmit data
|=======================
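The read-only status flags in `CTRL` can be decoded with plain shifts and masks. The following is a minimal sketch, assuming only the bit positions from the register map above; `twi_status_t` and `twi_decode_status` are hypothetical helpers for illustration and are not part of the NEORV32 driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the TWI register map (CTRL register). */
#define TWI_BIT_CLAIMED 29u
#define TWI_BIT_ACK     30u
#define TWI_BIT_BUSY    31u

/* Hypothetical container for the decoded status flags. */
typedef struct {
  bool busy;    /* transfer/START/STOP in progress */
  bool ack;     /* last byte was acknowledged by the peripheral */
  bool claimed; /* bus is claimed by any controller */
} twi_status_t;

/* Decode a raw CTRL register value into its status flags. */
static twi_status_t twi_decode_status(uint32_t ctrl) {
  twi_status_t s;
  s.busy    = ((ctrl >> TWI_BIT_BUSY) & 1u) != 0u;
  s.ack     = ((ctrl >> TWI_BIT_ACK) & 1u) != 0u;
  s.claimed = ((ctrl >> TWI_BIT_CLAIMED) & 1u) != 0u;
  return s;
}
```

A typical polling loop would wait until `busy` is cleared and then inspect `ack` to see whether the peripheral accepted the byte.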


@ -0,0 +1,195 @@
<<<
:sectnums:
==== Primary Universal Asynchronous Receiver and Transmitter (UART0)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_uart.vhd |
| Software driver file(s): | neorv32_uart.c |
| | neorv32_uart.h |
| Top entity port: | `uart0_txd_o` | serial transmitter output
| | `uart0_rxd_i` | serial receiver input
| | `uart0_rts_o` | flow control: RX ready to receive, low-active
| | `uart0_cts_i` | flow control: TX allowed to start sending, low-active
| Configuration generics: | `IO_UART0_EN` | implement UART0 when `true`
| | `UART0_RX_FIFO` | RX FIFO depth (power of 2, min 1)
| | `UART0_TX_FIFO` | TX FIFO depth (power of 2, min 1)
| CPU interrupts: | fast IRQ channel 2 | RX interrupt
| | fast IRQ channel 3 | TX interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The NEORV32 UART provides a standard serial interface with independent transmitter and receiver channels, each
equipped with a configurable FIFO. The transmission frame is fixed to **8N1**: 8 data bits, no parity bit, 1 stop
bit. The actual transmission rate (Baud rate) is programmable via software. The module features two memory-mapped
registers: `CTRL` and `DATA`. These are used for configuration, status check and data transfer.
.Standard Console
[NOTE]
All default example programs and software libraries of the NEORV32 software framework (including the bootloader
and the runtime environment) use the primary UART (_UART0_) as default user console interface. Furthermore, UART0
is used to implement the "standard consoles" (`STDIN`, `STDOUT` and `STDERR`).
**Theory of Operation**
The module is enabled by setting the `UART_CTRL_EN` bit in the UART0 control register `CTRL`. The Baud rate
is configured via a 10-bit `UART_CTRL_BAUDx` baud divisor (`baud_div`) and a 3-bit `UART_CTRL_PRSCx`
clock prescaler (`clock_prescaler`).
.UART0 Clock Configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`UART_CTRL_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
_**Baud rate**_ = (_f~main~[Hz]_ / `clock_prescaler`) / (`baud_div` + 1)
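The Baud rate equation can be inverted to find a prescaler/divisor pair for a requested rate. The sketch below is one possible search strategy (smallest prescaler whose divisor fits into the 10-bit field, with rounding to the nearest divisor); it is an illustration under these assumptions and not necessarily what the NEORV32 driver does. `uart_clk_cfg_t` and `uart_find_clk_cfg` are hypothetical names.

```c
#include <stdint.h>

/* clock_prescaler values indexed by UART_CTRL_PRSCx (from the table above). */
static const uint32_t uart_prescaler_lut[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Hypothetical result type for the search. */
typedef struct {
  int      valid;       /* 1 if a configuration was found */
  uint32_t prsc_sel;    /* value for UART_CTRL_PRSCx (0..7) */
  uint32_t baud_div;    /* value for UART_CTRL_BAUDx (0..1023) */
  uint32_t actual_baud; /* Baud rate that results from this configuration */
} uart_clk_cfg_t;

/* Find the smallest prescaler for which baud_div fits into 10 bits. */
static uart_clk_cfg_t uart_find_clk_cfg(uint32_t f_main_hz, uint32_t baud) {
  uart_clk_cfg_t cfg = {0, 0, 0, 0};
  if (baud == 0u) {
    return cfg;
  }
  for (uint32_t sel = 0; sel < 8u; sel++) {
    uint32_t prescaled = f_main_hz / uart_prescaler_lut[sel]; /* f_main / clock_prescaler */
    uint32_t div = (prescaled + baud / 2u) / baud;            /* rounded (baud_div + 1) */
    if (div >= 1u && div <= 1024u) {
      cfg.valid = 1;
      cfg.prsc_sel = sel;
      cfg.baud_div = div - 1u;
      cfg.actual_baud = prescaled / div;
      return cfg;
    }
  }
  return cfg; /* requested rate is not reachable with this clock */
}
```

For a 100 MHz main clock and 19200 Baud this selects `clock_prescaler` = 8 and `baud_div` = 650, which gives roughly 19201 Baud.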
The control register's `UART_CTRL_RX_*` and `UART_CTRL_TX_*` flags provide information about the RX and TX FIFO fill level.
Disabling the module via the `UART_CTRL_EN` bit will also clear these FIFOs.
A new TX transmission is started by writing to the `DATA` register. The
transfer is completed when the `UART_CTRL_TX_BUSY` control register flag returns to zero. RX data is available when
the `UART_CTRL_RX_NEMPTY` flag becomes set. The `UART_CTRL_RX_OVER` flag will be set if the RX FIFO overflows. This flag
is cleared only by disabling the module via `UART_CTRL_EN`.
**UART Interrupts**
The UART module provides independent interrupt channels for RX and TX. These interrupts are triggered by certain RX and TX
FIFO levels. The actual configuration is programmed independently for the RX and TX interrupt channel via the control register's
`UART_CTRL_IRQ_RX_*` and `UART_CTRL_IRQ_TX_*` bits:
. **RX IRQ** The RX interrupt can be triggered by three different RX FIFO level states: If `UART_CTRL_IRQ_RX_NEMPTY` is set the
interrupt fires if the RX FIFO is _not_ empty (e.g. when incoming data is available). If `UART_CTRL_IRQ_RX_HALF` is set the RX IRQ
fires if the RX FIFO is at least half-full. If `UART_CTRL_IRQ_RX_FULL` is set the interrupt fires if the RX FIFO is full. Note that all
these programmable conditions are logically OR-ed (interrupt fires if any enabled condition is true).
. **TX IRQ** The TX interrupt can be triggered by two different TX FIFO level states: If `UART_CTRL_IRQ_TX_EMPTY` is set the
interrupt fires if the TX FIFO is empty. If `UART_CTRL_IRQ_TX_NHALF` is set the interrupt fires if the TX FIFO is _not_ at least
half full. Note that all these programmable conditions are logically OR-ed (interrupt fires if any enabled condition is true).
Once a UART interrupt has fired it remains pending until the actual cause of the interrupt is resolved; for
example if just the `UART_CTRL_IRQ_RX_NEMPTY` bit is set, the RX interrupt will keep firing until the RX FIFO is empty again.
Furthermore, a pending UART interrupt has to be explicitly cleared again by writing zero to the according <<_mip>> CSR bit.
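The OR-ed interrupt conditions can be modeled in software by masking the FIFO status flags with the corresponding enable bits. The sketch below assumes the `CTRL` bit positions from the UART0 register map (status flags at bits 16..20, enable bits at 22..26); the helper names are hypothetical and this models the described behavior rather than reproducing the hardware implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit offsets copied from the UART0 register map (CTRL register). */
#define UART_RX_STATUS_LSB 16u /* RX_NEMPTY, RX_HALF, RX_FULL */
#define UART_TX_STATUS_LSB 19u /* TX_EMPTY, TX_NHALF */
#define UART_IRQ_RX_LSB    22u /* IRQ_RX_NEMPTY, IRQ_RX_HALF, IRQ_RX_FULL */
#define UART_IRQ_TX_LSB    25u /* IRQ_TX_EMPTY, IRQ_TX_NHALF */

/* True if any enabled RX FIFO condition is currently met. */
static bool uart_rx_irq_condition(uint32_t ctrl) {
  uint32_t status = (ctrl >> UART_RX_STATUS_LSB) & 0x7u;
  uint32_t enable = (ctrl >> UART_IRQ_RX_LSB) & 0x7u;
  return (status & enable) != 0u;
}

/* True if any enabled TX FIFO condition is currently met. */
static bool uart_tx_irq_condition(uint32_t ctrl) {
  uint32_t status = (ctrl >> UART_TX_STATUS_LSB) & 0x3u;
  uint32_t enable = (ctrl >> UART_IRQ_TX_LSB) & 0x3u;
  return (status & enable) != 0u;
}
```

This works because the status flags and the enable bits appear in the same order within their respective bit groups.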
.RX/TX FIFO Size
[TIP]
Software can retrieve the configured sizes of the RX and TX FIFO via the according `UART_DATA_RX_FIFO_SIZE` and
`UART_DATA_TX_FIFO_SIZE` bits from the `DATA` register.
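Since the size fields hold the base-2 logarithm of the FIFO depth, the depth in entries is recovered with a shift. A minimal sketch, assuming the `DATA` register bit ranges `11:8` (RX) and `15:12` (TX) from the register map; the function names are hypothetical.

```c
#include <stdint.h>

/* RX FIFO depth in entries: DATA[11:8] holds log2(depth). */
static uint32_t uart_rx_fifo_depth(uint32_t data_reg) {
  return 1u << ((data_reg >> 8) & 0xFu);
}

/* TX FIFO depth in entries: DATA[15:12] holds log2(depth). */
static uint32_t uart_tx_fifo_depth(uint32_t data_reg) {
  return 1u << ((data_reg >> 12) & 0xFu);
}
```

A field value of zero therefore corresponds to the minimum depth of one entry.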
**RTS/CTS Hardware Flow Control**
The NEORV32 UART supports optional hardware flow control using the standard CTS `uart0_cts_i` ("clear to send") and RTS
`uart0_rts_o` ("ready to send" / "ready to receive (RTR)") signals. Both signals are low-active.
Hardware flow control is enabled by setting the `UART_CTRL_HWFC_EN` bit in the module's control register `CTRL`.
When hardware flow control is enabled:
. The UART's transmitter will not start a new transmission until the `uart0_cts_i` signal goes low.
During this time, the UART busy flag `UART_CTRL_TX_BUSY` remains set.
. The UART will set `uart0_rts_o` signal low if the RX FIFO is **less than half full** (to have a wide safety margin).
As long as this signal is low, the connected device can send new data. `uart0_rts_o` is always low if the hardware flow-control
is disabled. Disabling the UART (setting `UART_CTRL_EN` low) while having hardware flow-control enabled, will set `uart0_rts_o`
high to signal that the UART is not capable of receiving new data.
[NOTE]
Note that RTS and CTS signaling can only be activated together. If the RTS handshake is not required the signal can be left
unconnected. If the CTS handshake is not required it has to be tied to zero.
**Simulation Mode**
The UART provides a _simulation-only_ mode to dump console data as well as raw data directly to a file. When the simulation
mode is enabled (by setting the `UART_CTRL_SIM_MODE` bit) there will be **no** physical transaction on the `uart0_txd_o` signal.
Instead, all data written to the `DATA` register is immediately dumped to a file. Data written to `DATA[7:0]` will be dumped as
ASCII chars to a file named `neorv32.uart0.sim_mode.text.out`. Additionally, the ASCII data is printed to the simulator console.
The file is created in the simulation's home folder.
**Register Map**
.UART0 register map (`struct NEORV32_UART0`)
[cols="<4,<2,<5,^2,<5"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.19+<| `0xfffff500` .19+<| `CTRL` <|`0` `UART_CTRL_EN` ^| r/w <| UART enable
<|`1` `UART_CTRL_SIM_MODE` ^| r/w <| enable **simulation mode**
<|`2` `UART_CTRL_HWFC_EN` ^| r/w <| enable RTS/CTS hardware flow-control
<|`5:3` `UART_CTRL_PRSC2 : UART_CTRL_PRSC0` ^| r/w <| Baud rate clock prescaler select
<|`15:6` `UART_CTRL_BAUD9 : UART_CTRL_BAUD0` ^| r/w <| 10-bit Baud rate divisor (`baud_div`)
<|`16` `UART_CTRL_RX_NEMPTY` ^| r/- <| RX FIFO not empty
<|`17` `UART_CTRL_RX_HALF` ^| r/- <| RX FIFO at least half-full
<|`18` `UART_CTRL_RX_FULL` ^| r/- <| RX FIFO full
<|`19` `UART_CTRL_TX_EMPTY` ^| r/- <| TX FIFO empty
<|`20` `UART_CTRL_TX_NHALF` ^| r/- <| TX FIFO not at least half-full
<|`21` `UART_CTRL_TX_FULL` ^| r/- <| TX FIFO full
<|`22` `UART_CTRL_IRQ_RX_NEMPTY` ^| r/w <| fire IRQ if RX FIFO not empty
<|`23` `UART_CTRL_IRQ_RX_HALF` ^| r/w <| fire IRQ if RX FIFO at least half-full
<|`24` `UART_CTRL_IRQ_RX_FULL` ^| r/w <| fire IRQ if RX FIFO full
<|`25` `UART_CTRL_IRQ_TX_EMPTY` ^| r/w <| fire IRQ if TX FIFO empty
<|`26` `UART_CTRL_IRQ_TX_NHALF` ^| r/w <| fire IRQ if TX not at least half full
<|`29:27` - ^| r/- <| _reserved_, read as zero
<|`30` `UART_CTRL_RX_OVER` ^| r/- <| RX FIFO overflow; cleared by disabling the module
<|`31` `UART_CTRL_TX_BUSY` ^| r/- <| TX busy or TX FIFO not empty
.4+<| `0xfffff504` .4+<| `DATA` <|`7:0` `UART_DATA_RTX_MSB : UART_DATA_RTX_LSB` ^| r/w <| receive/transmit data
<|`11:8` `UART_DATA_RX_FIFO_SIZE_MSB : UART_DATA_RX_FIFO_SIZE_LSB` ^| r/- <| log2(RX FIFO size)
<|`15:12` `UART_DATA_TX_FIFO_SIZE_MSB : UART_DATA_TX_FIFO_SIZE_LSB` ^| r/- <| log2(TX FIFO size)
<|`31:16` ^| r/- <| _reserved_, read as zero
|=======================
<<<
// ####################################################################################################################
:sectnums:
==== Secondary Universal Asynchronous Receiver and Transmitter (UART1)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_uart.vhd |
| Software driver file(s): | neorv32_uart.c |
| | neorv32_uart.h |
| Top entity port: | `uart1_txd_o` | serial transmitter output
| | `uart1_rxd_i` | serial receiver input
| | `uart1_rts_o` | flow control: RX ready to receive, low-active
| | `uart1_cts_i` | flow control: TX allowed to start sending, low-active
| Configuration generics: | `IO_UART1_EN` | implement UART1 when `true`
| | `UART1_RX_FIFO` | RX FIFO depth (power of 2, min 1)
| | `UART1_TX_FIFO` | TX FIFO depth (power of 2, min 1)
| CPU interrupts: | fast IRQ channel 4 | RX interrupt
| | fast IRQ channel 5 | TX interrupt (see <<_processor_interrupts>>)
|=======================
**Overview**
The secondary UART (UART1) is functionally identical to the primary UART
(<<_primary_universal_asynchronous_receiver_and_transmitter_uart0>>). Obviously, UART1 uses different addresses for the
control register (`CTRL`) and the data register (`DATA`). The register's bits/flags use the same bit positions and naming
as for the primary UART. The RX and TX interrupts of UART1 are mapped to different CPU fast interrupt (FIRQ) channels.
**Simulation Mode**
The secondary UART (UART1) provides the same simulation options as the primary UART (UART0). However, output data is
written to UART1-specific file `neorv32.uart1.sim_mode.text.out`. This data is also printed to the simulator console.
**Register Map**
.UART1 register map (`struct NEORV32_UART1`)
[cols="<2,<1,<1,^1,<2"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
| `0xfffff600` | `CTRL` | ... | ... | Same as UART0
| `0xfffff604` | `DATA` | ... | ... | Same as UART0
|=======================


@ -0,0 +1,103 @@
<<<
:sectnums:
==== Watchdog Timer (WDT)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_wdt.vhd |
| Software driver file(s): | neorv32_wdt.c |
| | neorv32_wdt.h |
| Top entity port: | none |
| Configuration generics: | `IO_WDT_EN` | implement watchdog when `true`
| CPU interrupts: | fast IRQ channel 0 | watchdog timeout (see <<_processor_interrupts>>)
|=======================
**Theory of Operation**
The watchdog (WDT) provides a last resort for safety-critical applications. The WDT provides a "bark and bite"
concept. The timeout counter first triggers an optional CPU interrupt ("bark") when reaching half of the programmed
interval to inform the application of the imminent timeout. When the full timeout value is reached
a system-wide hardware reset is generated ("bite"). The internal counter has to be reset explicitly by the application
program every now and then to prevent a timeout.
**Configuration**
The watchdog is enabled by setting the control register's `WDT_CTRL_EN` bit. When this bit is cleared, the internal
timeout counter is reset to zero and no interrupt and no system reset can be triggered.
The internal 24-bit timeout counter is clocked at 1/4096th of the processor's main clock (f~WDT~[Hz] = f~main~[Hz] / 4096).
Whenever this counter reaches the programmed timeout value (`WDT_CTRL_TIMEOUT` bits in the control register) a
hardware reset is triggered. In order to inform the application of an imminent timeout, an optional CPU interrupt is
triggered when the timeout counter reaches _half_ of the programmed timeout value.
The watchdog's timeout counter is reset ("feeding the watchdog") by writing the reset **PASSWORD** to the `RESET` register.
The password is hardwired to hexadecimal `0x709D1AB3`.
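The relation f~WDT~ = f~main~ / 4096 can be used to convert a desired timeout into a `WDT_CTRL_TIMEOUT` value. The following sketch assumes the 24-bit width of the timeout field from the register map and saturates at the maximum value; `wdt_timeout_ticks` is a hypothetical helper and not a function of the NEORV32 driver.

```c
#include <stdint.h>

#define WDT_CLOCK_DIV   4096u       /* f_WDT = f_main / 4096 */
#define WDT_TIMEOUT_MAX 0x00FFFFFFu /* 24-bit WDT_CTRL_TIMEOUT field */
#define WDT_PASSWORD    0x709D1AB3u /* value to write to RESET to feed the watchdog */

/* Convert a timeout in milliseconds into watchdog counter ticks (saturating). */
static uint32_t wdt_timeout_ticks(uint32_t f_main_hz, uint32_t timeout_ms) {
  /* 64-bit intermediate avoids overflow of f_main * timeout_ms */
  uint64_t ticks = ((uint64_t)f_main_hz * timeout_ms) / (1000ull * WDT_CLOCK_DIV);
  if (ticks > WDT_TIMEOUT_MAX) {
    ticks = WDT_TIMEOUT_MAX;
  }
  return (uint32_t)ticks;
}
```

For a 100 MHz main clock, a one-second timeout corresponds to 24414 ticks; the optional "bark" interrupt would then be expected after roughly half a second.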
.Watchdog Interrupt
[NOTE]
A watchdog interrupt occurs when the watchdog is enabled and the internal counter reaches _exactly_ half of the programmed
timeout value. Hence, the interrupt only fires once. However, a triggered WDT interrupt has to be explicitly cleared by
writing zero to the according <<_mip>> CSR bit.
.Watchdog Operation during Debugging
[IMPORTANT]
By default, the watchdog stops operation when the CPU enters debug mode and will resume normal operation after
the CPU has left debug mode again. This will prevent an unintended watchdog timeout during a debug session. However,
the watchdog can also be configured to keep operating even when the CPU is in debug mode by setting the control
register's `WDT_CTRL_DBEN` bit.
.Watchdog Operation during CPU Sleep
[IMPORTANT]
By default, the watchdog stops operating when the CPU enters sleep mode. However, the watchdog can also be configured
to keep operating even when the CPU is in sleep mode by setting the control register's `WDT_CTRL_SEN` bit.
**Configuration Lock**
The watchdog control register can be _locked_ to protect the current configuration from being modified. The lock is
activated by setting the `WDT_CTRL_LOCK` bit. In the locked state any write access to the control register is entirely
ignored (see table below, "writable if locked"). However, read accesses to the control register as well as watchdog resets
remain possible.
The lock bit can only be set if the WDT is already enabled (`WDT_CTRL_EN` is set). Furthermore, the lock bit can
only be cleared again by a system-wide hardware reset.
**Strict Mode**
The _strict operation mode_ provides additional safety functions. If the strict mode is enabled by the `WDT_CTRL_STRICT`
control register bit an **immediate hardware reset** is enforced if
* the `RESET` register is written with an incorrect password or
* the `CTRL` register is written and the `WDT_CTRL_LOCK` bit is set.
**Cause of last Hardware Reset**
The cause of the last system hardware reset can be determined via the `WDT_CTRL_RCAUSE_*` bits:
* `0b00`: Reset caused by external reset signal/pin
* `0b01`: Reset caused by on-chip debugger
* `0b10`: Reset caused by watchdog
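The reset cause can be extracted from a `CTRL` read with a shift and a 2-bit mask. A minimal sketch, assuming the field position `6:5` from the register map; the enum and function names are hypothetical, and the value `0b11` is treated as unknown because it is not listed above.

```c
#include <stdint.h>

/* Hypothetical names for the documented reset causes. */
typedef enum {
  RESET_CAUSE_EXTERNAL = 0, /* external reset signal/pin */
  RESET_CAUSE_DEBUGGER = 1, /* on-chip debugger */
  RESET_CAUSE_WATCHDOG = 2, /* watchdog timeout */
  RESET_CAUSE_UNKNOWN  = 3  /* encoding not documented above */
} reset_cause_t;

/* Extract WDT_CTRL_RCAUSE (CTRL bits 6:5) from a raw CTRL value. */
static reset_cause_t wdt_reset_cause(uint32_t ctrl) {
  return (reset_cause_t)((ctrl >> 5) & 0x3u);
}
```

Application code could call this early during startup to detect whether the previous run ended in a watchdog reset.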
**Register Map**
.WDT register map (`struct NEORV32_WDT`)
[cols="<2,<1,<4,^1,^1,^2,<4"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Reset value | Writable if locked | Function
.8+<| `0xfffffb00` .8+<| `CTRL` <|`0` `WDT_CTRL_EN` ^| r/w ^| `0` ^| no <| watchdog enable
<|`1` `WDT_CTRL_LOCK` ^| r/w ^| `0` ^| no <| lock configuration when set, clears only on system reset, can only be set if enable bit is set already
<|`2` `WDT_CTRL_DBEN` ^| r/w ^| `0` ^| no <| set to allow WDT to continue operation even when CPU is in debug mode
<|`3` `WDT_CTRL_SEN` ^| r/w ^| `0` ^| no <| set to allow WDT to continue operation even when CPU is in sleep mode
<|`4` `WDT_CTRL_STRICT` ^| r/w ^| `0` ^| no <| set to enable strict mode (force hardware reset if reset password is incorrect or if write access to locked CTRL register)
<|`6:5` `WDT_CTRL_RCAUSE_HI : WDT_CTRL_RCAUSE_LO` ^| r/- ^| `0` ^| - <| cause of last system reset; 0=external reset, 1=ocd-reset, 2=watchdog reset
<|`7` - ^| r/- ^| - ^| - <| _reserved_, reads as zero
<|`31:8` `WDT_CTRL_TIMEOUT_MSB : WDT_CTRL_TIMEOUT_LSB` ^| r/w ^| 0 ^| no <| 24-bit watchdog timeout value
| `0xfffffb04` | `RESET` | | -/w | - | yes | Write PASSWORD to reset WDT timeout counter ("feed the watchdog")
|=======================


@ -0,0 +1,124 @@
<<<
:sectnums:
==== Processor-External Memory Interface (WISHBONE)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_wishbone.vhd |
| Software driver file(s): | none | _implicitly used_
| Top entity port: | `wb_tag_o` | request tag output (3-bit)
| | `wb_adr_o` | address output (32-bit)
| | `wb_dat_i` | data input (32-bit)
| | `wb_dat_o` | data output (32-bit)
| | `wb_we_o` | write enable (1-bit)
| | `wb_sel_o` | byte enable (4-bit)
| | `wb_stb_o` | strobe (1-bit)
| | `wb_cyc_o` | valid cycle (1-bit)
| | `wb_ack_i` | acknowledge (1-bit)
| | `wb_err_i` | bus error (1-bit)
| | `fence_o` | an executed `fence` instruction
| | `fencei_o` | an executed `fence.i` instruction
| Configuration generics: | `MEM_EXT_EN` | enable external memory interface when `true`
| | `MEM_EXT_TIMEOUT` | number of clock cycles after which an unacknowledged external bus access will auto-terminate (0 = disabled)
| | `MEM_EXT_PIPE_MODE` | when `false` (default): classic/standard Wishbone protocol; when `true`: pipelined Wishbone protocol
| | `MEM_EXT_BIG_ENDIAN` | byte-order (Endianness) of external memory interface; `true`=BIG, `false`=little (default)
| | `MEM_EXT_ASYNC_RX` | use registered RX path when `false` (default); use async/direct RX path when `true`
| | `MEM_EXT_ASYNC_TX` | use registered TX path when `false` (default); use async/direct TX path when `true`
| CPU interrupts: | none |
|=======================
The external memory interface provides a Wishbone b4-compatible on-chip bus interface. The bus interface is
implemented if the `MEM_EXT_EN` generic is `true`. This interface can be used to attach external memories,
custom hardware accelerators, additional IO devices or all other kinds of IP blocks.
The external interface is not mapped to a specific address space. Instead, all CPU memory accesses that
do not target a specific processor-internal address region (accessing the "void"; see section <<_address_space>>)
are redirected to the external memory interface.
**Wishbone Bus Protocol**
The external memory interface either uses the **standard** (also called "classic") Wishbone protocol (default) or
**pipelined** Wishbone protocol. The protocol to be used is configured via the `MEM_EXT_PIPE_MODE` generic:
* If `MEM_EXT_PIPE_MODE` is `false`, all bus control signals including `wb_stb_o` are active and remain stable until the
transfer is acknowledged/terminated.
* If `MEM_EXT_PIPE_MODE` is `true`, all bus control signals except `wb_stb_o` are active and remain stable until the transfer is
acknowledged/terminated. In this case, `wb_stb_o` is asserted only during the very first bus clock cycle.
.Exemplary Wishbone bus accesses using "classic" and "pipelined" protocol
[cols="^2,^2"]
[grid="none"]
|=======================
a| image::wishbone_classic_read.png[700,300]
a| image::wishbone_pipelined_write.png[700,300]
| **Classic** Wishbone read access | **Pipelined** Wishbone write access
|=======================
[WARNING]
If the Wishbone interface is configured to operate in classic/standard mode (`MEM_EXT_PIPE_MODE` = false) a
**sync** RX path (`MEM_EXT_ASYNC_RX` = false) is required for the inter-cycle pause. If `MEM_EXT_ASYNC_RX` is
enabled while `MEM_EXT_PIPE_MODE` is disabled the module will automatically disable the asynchronous RX option.
.Wishbone Specs.
[TIP]
A detailed description of the implemented Wishbone bus protocol and the according interface signals
can be found in the data sheet "Wishbone B4 - WISHBONE System-on-Chip (SoC) Interconnection
Architecture for Portable IP Cores". A copy of this document can be found in the `docs` folder of this
project.
**Bus Access**
The NEORV32 Wishbone gateway does not support burst transfers yet, so there is always just a single transfer "in flight".
Hence, the Wishbone `STALL` signal is not implemented. An accessed Wishbone device does not have to respond immediately to a bus
request by sending an ACK. Instead, there is a _time window_ where the device has to acknowledge the transfer. This time window
is configured by the `MEM_EXT_TIMEOUT` generic that defines the maximum time (in clock cycles) a bus access can be pending
before it is automatically terminated with an error condition. If `MEM_EXT_TIMEOUT` is set to zero, the timeout is disabled
and a bus access can take an arbitrary number of cycles to complete (this is not recommended!).
When `MEM_EXT_TIMEOUT` is greater than zero, the Wishbone gateway starts an internal countdown whenever the CPU
accesses an address via the external memory interface. If the accessed device does not acknowledge (via `wb_ack_i`)
or terminate (via `wb_err_i`) the transfer within `MEM_EXT_TIMEOUT` clock cycles, the bus access is automatically canceled
by setting `wb_cyc_o` low again and a CPU load/store/instruction fetch bus access fault exception is raised.
**Wishbone Tag**
The 3-bit wishbone `wb_tag_o` signal provides additional information regarding the access type:
* `wb_tag_o(0)`: `1` = privileged access (CPU is in machine mode); `0` = unprivileged access (CPU is not in machine mode)
* `wb_tag_o(1)`: always zero
* `wb_tag_o(2)`: `1` = instruction fetch access, `0` = data access
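A software model of a Wishbone device (for example in a co-simulation or a bus monitor) could decode the tag as sketched below. The bit assignment follows the list above; `wb_tag_info_t` and `wb_decode_tag` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the 3-bit wb_tag_o signal. */
typedef struct {
  bool privileged;  /* wb_tag_o(0): access issued in machine mode */
  bool instr_fetch; /* wb_tag_o(2): instruction fetch (otherwise data access) */
} wb_tag_info_t;

/* Decode the tag; wb_tag_o(1) is always zero and therefore ignored. */
static wb_tag_info_t wb_decode_tag(uint8_t tag) {
  wb_tag_info_t info;
  info.privileged  = (tag & 0x1u) != 0u;
  info.instr_fetch = (tag & 0x4u) != 0u;
  return info;
}
```

A device could use `instr_fetch`, for instance, to reject instruction fetches from a data-only region.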
**Endianness**
The NEORV32 CPU and the Processor setup are *little-endian* architectures. To allow direct connection
to a big-endian memory system the external bus interface provides an Endianness configuration. The
Endianness of the external memory interface can be configured via the `MEM_EXT_BIG_ENDIAN` generic.
By default, the external memory interface uses little-endian byte-order.
Application software can check the Endianness configuration of the external bus interface via the
SYSINFO module (see section <<_system_configuration_information_memory_sysinfo>> for more information).
**Access Latency**
By default, the Wishbone gateway introduces two additional latency cycles: processor-outgoing (`*_o`) and
processor-incoming (`*_i`) signals are fully registered. Thus, any access from the CPU to a processor-external device
via Wishbone requires 2 additional clock cycles. This can ease timing closure when using large (combinatorial) Wishbone
interconnection networks.
Optionally, the latency of the Wishbone gateway can be reduced by removing the input and output register stages.
Enabling the `MEM_EXT_ASYNC_RX` option will remove the input register stage; enabling `MEM_EXT_ASYNC_TX` option will
remove the output register stages. Each enabled option reduces access latency by 1 cycle.
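The latency rules can be summarized in a small calculation. The sketch below combines the one-cycle saving per asynchronous path with the restriction stated earlier that the asynchronous RX option is automatically disabled in classic mode (`MEM_EXT_PIPE_MODE` = `false`); the function name is hypothetical and the model covers only the gateway's own register stages, not the response time of the accessed device.

```c
#include <stdbool.h>

/* Additional latency cycles introduced by the Wishbone gateway registers. */
static unsigned wb_added_latency(bool pipe_mode, bool async_rx, bool async_tx) {
  /* async RX is only effective in pipelined mode */
  bool rx_async_effective = async_rx && pipe_mode;
  unsigned latency = 2u; /* default: registered TX and RX paths */
  if (rx_async_effective) {
    latency--;
  }
  if (async_tx) {
    latency--;
  }
  return latency;
}
```

With the default generics this yields two cycles; a fully asynchronous configuration in pipelined mode yields zero.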
.Output Gating
[NOTE]
All outgoing Wishbone signals use a "gating mechanism" so they only change if there is an actual Wishbone transaction in
progress. This can reduce dynamic switching activity in the external bus system and also simplifies simulation-based
inspection of the Wishbone transactions. Note that this output gating is only available if the output register buffer is not
disabled (`MEM_EXT_ASYNC_TX` = `false`).


@ -0,0 +1,211 @@
<<<
:sectnums:
==== Execute In Place Module (XIP)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_xip.vhd |
| Software driver file(s): | neorv32_xip.c |
| | neorv32_xip.h |
| Top entity port: | `xip_csn_o` | 1-bit chip select, low-active
| | `xip_clk_o` | 1-bit serial clock output
| | `xip_dat_i` | 1-bit serial data input
| | `xip_dat_o` | 1-bit serial data output
| Configuration generics: | `XIP_EN` | implement XIP module when `true`
| | `XIP_CACHE_EN` | implement XIP cache when `true`
| | `XIP_CACHE_NUM_BLOCKS` | number of blocks in XIP cache; has to be a power of two
| | `XIP_CACHE_BLOCK_SIZE` | number of bytes per XIP cache block; has to be a power of two, min 4
| CPU interrupts: | none |
|=======================
**Overview**
The execute in-place (XIP) module allows to execute code (and read constant data) directly from an external SPI flash memory.
The standard serial peripheral interface (SPI) is used as transfer protocol. All bus requests issued by the CPU
are converted transparently into SPI flash access commands. Hence, the external XIP flash behaves like a simple on-chip ROM.
From the CPU side, the module provides two independent interfaces: one for transparently accessing the XIP flash and another
one for accessing the module's control and status registers. The first interface provides the _transparent_
gateway to the SPI flash, so the CPU can directly fetch and execute instructions and/or read constant data.
Note that this interface is read-only. Any write access will raise a bus error exception. The second interface is
mapped to the processor's IO space and allows accesses to the XIP module's configuration registers as well as
conducting individual SPI transfers.
The XIP module provides an optional configurable cache to accelerate SPI flash accesses.
.XIP Address Mapping
[NOTE]
When XIP mode is enabled the flash is mapped to fixed address space region starting at address
`0xE0000000` (see section <<_address_space>>) supporting a maximum flash size of 256MB.
.XIP Example Program
[TIP]
An example program is provided in `sw/example/demo_xip` that illustrates how to program and configure
an external SPI flash to run a program from it.
**SPI Configuration**
The XIP module accesses external flash using the standard SPI protocol. The module always sends data MSB-first and
provides all of the standard four clock modes (0..3), which are configured via the `XIP_CTRL_CPOL` (clock polarity)
and `XIP_CTRL_CPHA` (clock phase) control register bits, respectively. The flash's "read command", which initiates
a read access, is defined by the `XIP_CTRL_RD_CMD` control register bits. For most SPI flash memories this is `0x03`
for _normal_ SPI mode.
The SPI clock (`xip_clk_o`) frequency is programmed by the 3-bit `XIP_CTRL_PRSCx` clock prescaler for a coarse clock
selection and a 4-bit clock divider `XIP_CTRL_CDIVx` for a fine clock selection.
The following clock prescalers (`XIP_CTRL_PRSCx`) are available:
.XIP clock prescaler configuration
[cols="<4,^1,^1,^1,^1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| **`XIP_CTRL_PRSCx`** | `0b000` | `0b001` | `0b010` | `0b011` | `0b100` | `0b101` | `0b110` | `0b111`
| Resulting `clock_prescaler` | 2 | 4 | 8 | 64 | 128 | 1024 | 2048 | 4096
|=======================
Based on the programmed clock configuration, the actual SPI clock frequency f~SPI~ is derived
from the processor's main clock f~main~ according to the following equation:
_**f~SPI~**_ = _f~main~[Hz]_ / (2 * `clock_prescaler` * (1 + `XIP_CTRL_CDIVx`))
Hence, the maximum SPI clock is f~main~ / 4 and the lowest SPI clock is f~main~ / 131072. The SPI clock is always
symmetric having a duty cycle of 50%.
**High-Speed Mode**
The XIP module provides a high-speed mode to further boost the maximum SPI clock frequency. When enabled via the control
register's `XIP_CTRL_HIGHSPEED` bit the clock prescaler configuration (`XIP_CTRL_PRSCx` bits) is overridden setting it
to a minimal factor of 1. However, the clock speed can still be fine-tuned using the `XIP_CTRL_CDIVx` bits.
_**f~SPI~**_ = _f~main~[Hz]_ / (2 * 1 * (1 + `XIP_CTRL_CDIVx`))
Hence, the maximum SPI clock when in high-speed mode is f~main~ / 2.
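Both clock equations can be evaluated with a single helper that replaces the prescaler by 1 in high-speed mode. This is a minimal sketch of the arithmetic only; `xip_spi_clock_hz` is a hypothetical name, and the prescaler table is copied from the configuration table above.

```c
#include <stdbool.h>
#include <stdint.h>

/* clock_prescaler values indexed by XIP_CTRL_PRSCx (from the table above). */
static const uint32_t xip_prescaler_lut[8] = {2, 4, 8, 64, 128, 1024, 2048, 4096};

/* Resulting SPI clock in Hz for a given main clock and clock configuration. */
static uint32_t xip_spi_clock_hz(uint32_t f_main_hz, uint32_t prsc_sel,
                                 uint32_t cdiv, bool highspeed) {
  /* high-speed mode overrides the prescaler with a factor of 1 */
  uint32_t prescaler = highspeed ? 1u : xip_prescaler_lut[prsc_sel & 0x7u];
  return f_main_hz / (2u * prescaler * (1u + (cdiv & 0xFu)));
}
```

With a 100 MHz main clock this gives 25 MHz for the fastest normal setting, 50 MHz in high-speed mode and about 762 Hz for the slowest setting (f~main~ / 131072).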
**Direct SPI Access**
The XIP module allows to initiate _direct_ SPI transactions. This feature can be used to configure the attached SPI
flash or to perform direct read and write accesses to the flash memory. Two data registers `DATA_LO` and
`DATA_HI` are provided to send up to 64 bits of SPI data. The `DATA_HI` register is write-only,
so a total of just 32 bits of receive data is provided. Note that the module handles the chip-select
line (`xip_csn_o`) by itself so it is not possible to construct larger consecutive transfers.
The actual data transmission size in bytes is defined by the control register's `XIP_CTRL_SPI_NBYTES` bits.
Any configuration from 1 byte to 8 bytes is valid. Other values will result in unpredictable behavior.
Since data is always transferred MSB-first, the data in `DATA_HI:DATA_LO` also has to be MSB-aligned. Receive data is
available in `DATA_LO` only since `DATA_HI` is write-only. Writing to `DATA_HI` triggers the actual SPI transmission.
The `XIP_CTRL_PHY_BUSY` control register flag indicates a transmission being in progress.
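MSB alignment means that an _n_-byte payload has to be shifted to the top of the 64-bit `DATA_HI:DATA_LO` pair before it is written. The sketch below only shows this alignment step; `xip_spi_words_t` and `xip_align_msb` are hypothetical helpers, and the payload is assumed to be passed right-aligned.

```c
#include <stdint.h>

/* Hypothetical pair of register values for DATA_HI and DATA_LO. */
typedef struct {
  uint32_t data_hi;
  uint32_t data_lo;
} xip_spi_words_t;

/* Align a right-aligned payload of nbytes (1..8) to the MSB of DATA_HI:DATA_LO. */
static xip_spi_words_t xip_align_msb(uint64_t payload, unsigned nbytes) {
  xip_spi_words_t w = {0u, 0u};
  if (nbytes < 1u || nbytes > 8u) {
    return w; /* invalid size: return zeros instead of guessing */
  }
  uint64_t aligned = payload << (8u * (8u - nbytes));
  w.data_hi = (uint32_t)(aligned >> 32);
  w.data_lo = (uint32_t)aligned;
  return w;
}
```

Because writing `DATA_HI` triggers the transmission, `data_lo` would have to be written to `DATA_LO` first and `data_hi` to `DATA_HI` last.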
The chip-select line of the XIP module (`xip_csn_o`) will only become asserted (enabled, pulled low) if the
`XIP_CTRL_SPI_CSEN` control register bit is set. If this bit is cleared, `xip_csn_o` is always disabled
(pulled high).
[NOTE]
Direct SPI mode is only possible when the module is enabled (setting `XIP_CTRL_EN`) but **before** the actual
XIP mode is enabled via `XIP_CTRL_XIP_EN`.
[TIP]
When the XIP mode is not enabled, the XIP module can also be used as additional general purpose SPI controller
with a transfer size of up to 64 bits per transmission.
**Using the XIP Mode**
The XIP module is globally enabled by setting the `XIP_CTRL_EN` bit in the device's `CTRL` control register.
Clearing this bit will reset the whole module and will also terminate any pending SPI transfer.
Since there is a wide variety of SPI flash components with different sizes, the XIP module allows to specify
the address width of the flash: the number of address bytes used for addressing flash memory content has to be
configured using the control register's _XIP_CTRL_XIP_ABYTES_ bits. These two bits contain the number of SPI
address bytes (**minus one**). For example, for an SPI flash with 24-bit addresses these bits have to be set to
`0b10`.
The transparent XIP accesses are transformed into SPI transmissions with the following format (starting with the MSB):
* 8-bit command: configured by the `XIP_CTRL_RD_CMD` control register bits ("SPI read command")
* 8 to 32 bits address: defined by the `XIP_CTRL_XIP_ABYTES` control register bits ("number of address bytes")
* 32-bit data: sending zeros and receiving the according flash word (32-bit)
Hence, the maximum XIP transmission size is 72-bit, which has to be configured via the `XIP_CTRL_SPI_NBYTES`
control register bits. Note that the 72-bit transmission size is only available in XIP mode. The transmission
size of the direct SPI accesses is limited to 64-bit.
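The total transfer size follows directly from the frame format: one command byte, the configured number of address bytes and four data bytes. The sketch below computes this size from the `XIP_CTRL_XIP_ABYTES` field value and also derives the flash offset from a CPU address, assuming the fixed XIP region `0xE0000000` to `0xEFFFFFFF` described in this section; both function names are hypothetical.

```c
#include <stdint.h>

/* Total SPI bytes per XIP access: command + address bytes + 32-bit data word. */
static unsigned xip_total_spi_bytes(unsigned abytes_field) {
  unsigned address_bytes = (abytes_field & 0x3u) + 1u; /* field holds "bytes minus one" */
  return 1u + address_bytes + 4u;
}

/* Flash offset of a CPU address inside the XIP region (upper 4 bits dropped). */
static uint32_t xip_flash_offset(uint32_t cpu_addr) {
  return cpu_addr & 0x0FFFFFFFu;
}
```

For a flash with 24-bit addresses (`XIP_CTRL_XIP_ABYTES` = `0b10`) the result is 8 bytes; with four address bytes it is 9 bytes, which matches the 72-bit maximum.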
[NOTE]
When using four SPI flash address bytes, the most significant 4 bits of the address are always hardwired
to zero allowing a maximum **accessible** flash size of 256MB.
[NOTE]
The XIP module always fetches a full naturally aligned 32-bit word from the SPI flash. Any sub-word data masking
or alignment will be performed by the CPU core logic.
[IMPORTANT]
The XIP mode requires the 4-byte data words in the flash to be ordered in **little-endian** byte order.
After the SPI properties (including the amount of address bytes **and** the total amount of SPI transfer bytes)
and XIP address mapping are configured, the actual XIP mode can be enabled by setting
the control register's `XIP_CTRL_XIP_EN` bit. This will enable the "transparent SPI access port" of the module and thus,
the _transparent_ conversion of access requests into proper SPI flash transmissions. Hence, any access to the processor's
memory-mapped XIP region (`0xE0000000` to `0xEFFFFFFF`) will be converted into SPI flash accesses.
Make sure `XIP_CTRL_SPI_CSEN` is also set so the module can actually select/enable the attached SPI flash.
No more direct SPI accesses via `DATA_HI:DATA_LO` are possible when the XIP mode is enabled. However, the
XIP mode can be disabled at any time.
[NOTE]
If the XIP module is disabled (_XIP_CTRL_EN_ = `0`) any accesses to the memory-mapped XIP flash address region
will raise a bus access exception. If the XIP module is enabled (_XIP_CTRL_EN_ = `1`) but XIP mode is not enabled
yet (_XIP_CTRL_XIP_EN_ = `0`) any access to the programmed XIP memory segment will also raise a bus access exception.
[TIP]
It is highly recommended to enable the <<_processor_internal_instruction_cache_icache>> to cover some
of the SPI access latency.
**XIP Cache**
Since every single instruction fetch request from the CPU is translated into serial SPI transmissions the access latency is
very high resulting in a low throughput. In order to improve performance, the XIP module provides an optional cache that
buffers recently accessed data. The cache is implemented as a simple direct-mapped read-only cache with a configurable
cache layout:
* `XIP_CACHE_EN`: when set to `true` the XIP cache is implemented
* `XIP_CACHE_NUM_BLOCKS` defines the number of cache blocks (or lines)
* `XIP_CACHE_BLOCK_SIZE` defines the size in bytes of each cache block
When the cache is implemented, the XIP module operates in **burst mode** utilizing the flash's _incremental read_ capabilities.
Thus, several bytes (= `XIP_CACHE_BLOCK_SIZE`) are read consecutively from the flash using a single read command.
The XIP cache is cleared when the XIP module is disabled (`XIP_CTRL_EN = 0`), when XIP mode is disabled
(`XIP_CTRL_XIP_EN = 0`) or when the CPU issues a `fence(.i)` instruction.
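How a direct-mapped cache assigns an address to a block can be sketched as follows. This is the textbook scheme and is only assumed to match the actual hardware; the type `xip_cache_addr_t` and the function `xip_cache_split` are hypothetical names, and the two size parameters stand in for the `XIP_CACHE_NUM_BLOCKS` and `XIP_CACHE_BLOCK_SIZE` generics.

```c
#include <stdint.h>

/* Illustration only: textbook direct-mapped address split (assumed, not taken from the RTL). */
typedef struct {
  uint32_t tag;    /* identifies which flash block currently occupies the cache block */
  uint32_t index;  /* cache block (line) the address maps to */
  uint32_t offset; /* byte offset inside the cache block */
} xip_cache_addr_t;

static xip_cache_addr_t xip_cache_split(uint32_t addr, uint32_t num_blocks, uint32_t block_size) {
  xip_cache_addr_t r;
  uint32_t block = addr / block_size; /* consecutive block number within the flash */
  r.offset = addr % block_size;
  r.index  = block % num_blocks;      /* each flash block has exactly one possible cache slot */
  r.tag    = block / num_blocks;
  return r;
}
```

With 8 blocks of 256 bytes each, the flash offset `0xA04` would map to cache block 2 with tag 1 and byte offset 4. Two addresses with the same index but different tags evict each other, which is the usual trade-off of a direct-mapped layout.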
**Register Map**
.XIP Register Map (`struct NEORV32_XIP`)
[cols="<2,<1,<4,^1,<7"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s), Name [C] | R/W | Function
.15+<| `0xffffff40` .15+<| `CTRL` <|`0` `XIP_CTRL_EN` ^| r/w <| XIP module enable
<|`3:1` `XIP_CTRL_PRSC2 : XIP_CTRL_PRSC0` ^| r/w <| 3-bit SPI clock prescaler select
<|`4` `XIP_CTRL_CPOL` ^| r/w <| SPI clock polarity
<|`5` `XIP_CTRL_CPHA` ^| r/w <| SPI clock phase
<|`9:6` `XIP_CTRL_SPI_NBYTES_MSB : XIP_CTRL_SPI_NBYTES_LSB` ^| r/w <| Number of bytes in SPI transaction (1..9)
<|`10` `XIP_CTRL_XIP_EN` ^| r/w <| XIP mode enable
<|`12:11` `XIP_CTRL_XIP_ABYTES_MSB : XIP_CTRL_XIP_ABYTES_LSB` ^| r/w <| Number of address bytes for XIP flash (minus 1)
<|`20:13` `XIP_CTRL_RD_CMD_MSB : XIP_CTRL_RD_CMD_LSB` ^| r/w <| Flash read command
<|`21` `XIP_CTRL_SPI_CSEN` ^| r/w <| Allow SPI chip-select to be actually asserted when set
<|`22` `XIP_CTRL_HIGHSPEED` ^| r/w <| enable SPI high-speed mode (ignoring `XIP_CTRL_PRSCx`)
<|`26:23` `XIP_CTRL_CDIV3 : XIP_CTRL_CDIV0` ^| r/- <| 4-bit clock divider for fine-tuning
<|`28:27` - ^| r/- <| _reserved_, read as zero
<|`29` `XIP_CTRL_BURST_EN` ^| r/- <| XIP burst mode enabled (if XIP cache is implemented)
<|`30` `XIP_CTRL_PHY_BUSY` ^| r/- <| SPI PHY busy when set
<|`31` `XIP_CTRL_XIP_BUSY` ^| r/- <| XIP access in progress when set
| `0xffffff44` | _reserved_ |`31:0` | r/- | _reserved_, read as zero
| `0xffffff48` | `DATA_LO` |`31:0` | r/w | Direct SPI access - data register low
| `0xffffff4C` | `DATA_HI` |`31:0` | -/w | Direct SPI access - data register high; write access triggers SPI transfer
|=======================
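As a rough sketch, a control word for XIP reads can be composed from the bit positions in the table above. The bit names are taken from the table (in real code they are presumably provided by `neorv32_xip.h`); the function `xip_ctrl_config` is a hypothetical helper and the read command `0x03` used in the note below is only the typical SPI NOR flash "read" opcode, not something mandated by the XIP module.

```c
#include <stdint.h>

/* Bit positions copied from the CTRL register map (illustration only). */
enum {
  XIP_CTRL_EN             = 0,
  XIP_CTRL_PRSC0          = 1,
  XIP_CTRL_CPOL           = 4,
  XIP_CTRL_CPHA           = 5,
  XIP_CTRL_SPI_NBYTES_LSB = 6,
  XIP_CTRL_XIP_EN         = 10,
  XIP_CTRL_XIP_ABYTES_LSB = 11,
  XIP_CTRL_RD_CMD_LSB     = 13,
  XIP_CTRL_SPI_CSEN       = 21
};

/* Hypothetical helper: compose a CTRL value that configures the SPI properties
 * for XIP reads. The XIP mode itself (XIP_CTRL_XIP_EN) is left disabled so it
 * can be enabled in a second step once everything is configured. */
static uint32_t xip_ctrl_config(uint32_t prsc, uint32_t cpol, uint32_t cpha,
                                uint32_t addr_bytes, uint32_t rd_cmd) {
  uint32_t nbytes = 1u + addr_bytes + 4u; /* command + address + data bytes */
  uint32_t ctrl = 0;
  ctrl |= UINT32_C(1) << XIP_CTRL_EN;                            /* module enable */
  ctrl |= (prsc & 0x7u) << XIP_CTRL_PRSC0;                       /* clock prescaler */
  ctrl |= (cpol & 0x1u) << XIP_CTRL_CPOL;                        /* clock polarity */
  ctrl |= (cpha & 0x1u) << XIP_CTRL_CPHA;                        /* clock phase */
  ctrl |= (nbytes & 0xFu) << XIP_CTRL_SPI_NBYTES_LSB;            /* transfer size */
  ctrl |= ((addr_bytes - 1u) & 0x3u) << XIP_CTRL_XIP_ABYTES_LSB; /* address bytes minus 1 */
  ctrl |= (rd_cmd & 0xFFu) << XIP_CTRL_RD_CMD_LSB;               /* flash read command */
  ctrl |= UINT32_C(1) << XIP_CTRL_SPI_CSEN;                      /* allow chip-select */
  return ctrl;
}
```

For prescaler 0, SPI mode 0, three address bytes and read command `0x03` the helper returns `0x00207201`; OR-ing in `1 << XIP_CTRL_XIP_EN` afterwards gives `0x00207601`.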


@ -0,0 +1,78 @@
<<<
:sectnums:
==== External Interrupt Controller (XIRQ)
[cols="<3,<3,<4"]
[frame="topbot",grid="none"]
|=======================
| Hardware source file(s): | neorv32_xirq.vhd |
| Software driver file(s): | neorv32_xirq.c |
| | neorv32_xirq.h |
| Top entity port: | `xirq_i` | External interrupts input (32-bit)
| Configuration generics: | `XIRQ_NUM_CH` | Number of external IRQ channels to implement (0..32)
| | `XIRQ_TRIGGER_TYPE` | IRQ trigger type configuration
| | `XIRQ_TRIGGER_POLARITY` | IRQ trigger polarity configuration
| CPU interrupts: | fast IRQ channel 8 | XIRQ (see <<_processor_interrupts>>)
|=======================
**Overview**
The external interrupt controller provides a simple mechanism to implement up to 32 processor-external interrupt
request signals. The external IRQ requests are prioritized, queued and signaled to the CPU via a
_single_ CPU fast interrupt request.
**Theory of Operation**
The XIRQ provides up to 32 external interrupt channels configured via the `XIRQ_NUM_CH` generic. Each bit in the `xirq_i`
input signal vector represents one interrupt channel. If fewer than 32 channels are configured, only the LSB-aligned channels
are used while the remaining ones are left unconnected internally. The actual interrupt trigger type is configured before
synthesis using the `XIRQ_TRIGGER_TYPE` and `XIRQ_TRIGGER_POLARITY` generics (see table below).
.XIRQ Trigger Configuration
[cols="^2,^2,<3"]
[options="header",grid="all"]
|=======================
| `XIRQ_TRIGGER_TYPE(i)` | `XIRQ_TRIGGER_POLARITY(i)` | Resulting Trigger of `xirq_i(i)`
| `0` | `0` | low-level
| `0` | `1` | high-level
| `1` | `0` | falling-edge
| `1` | `1` | rising-edge
|=======================
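The trigger table can be read as a small decision function. The following C model is only a behavioral sketch of the table, not the hardware implementation; `xirq_trigger_fires` is a hypothetical name, and `prev`/`curr` denote two consecutive samples of one `xirq_i` bit.

```c
#include <stdbool.h>

/* Behavioral model of the XIRQ trigger table (illustration only).
 * type:     false = level-triggered, true = edge-triggered
 * polarity: false = low/falling,     true = high/rising */
static bool xirq_trigger_fires(bool type, bool polarity, bool prev, bool curr) {
  if (!type) {
    return curr == polarity;                   /* level: the current value decides */
  }
  return (prev != curr) && (curr == polarity); /* edge: a change towards the polarity level */
}
```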
The interrupt controller features three interface registers: external interrupt channel enable (`EIE`), external interrupt
channel pending (`EIP`) and external interrupt source (`ESC`). From a functional point of view, these
registers behave like the RISC-V <<_mie>>, <<_mip>> and <<_mcause>> CSRs.
If the configured trigger of an interrupt channel fires (e.g. a rising edge) the according interrupt channel becomes _pending_,
which is indicated by the according channel bit being set in the `EIP` register. This pending interrupt can be cleared at any time
by writing zero to the according `EIP` bit.
A pending interrupt can only trigger a CPU interrupt if the according channel is enabled via the `EIE` register. Disabled
channels that were triggered remain pending until explicitly cleared. The channels are prioritized in a static order, i.e. channel 0
(`xirq_i(0)`) has the highest priority and channel 31 (`xirq_i(31)`) has the lowest priority. If any pending interrupt channel is
actually enabled, an interrupt request is sent to the CPU.
The CPU can determine the most prioritized external interrupt request either by checking the bits in the `EIP` register or by reading
the interrupt source register `ESC`. This register provides a 5-bit wide ID (0..31) identifying the currently firing external interrupt.
Writing _any_ value to this register will acknowledge the _current_ XIRQ interrupt (so the XIRQ controller can issue a new CPU interrupt).
In order to acknowledge an XIRQ interrupt, the interrupt handler has to...
* clear the pending CPU FIRQ by clearing the according <<_mip>> CSR bit
* clear the pending XIRQ channel by clearing the according `EIP` bit
* write _any_ value to `ESC` to acknowledge the XIRQ interrupt
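The static prioritization can be modeled in plain C. The sketch below works on copies of the `EIE` and `EIP` register values and does not access hardware; both function names are hypothetical. It additionally assumes that writing `1` to an `EIP` bit leaves that channel unchanged, which the text above does not state explicitly.

```c
#include <stdint.h>

/* Software model (illustration only): returns the source ID (0..31) of the
 * highest-priority channel that is both pending and enabled, or -1 if none. */
static int xirq_model_source(uint32_t eie, uint32_t eip) {
  uint32_t active = eie & eip;
  for (int ch = 0; ch < 32; ch++) { /* channel 0 has the highest priority */
    if (active & (UINT32_C(1) << ch)) {
      return ch;
    }
  }
  return -1;
}

/* Value to write to EIP in order to clear a single pending channel:
 * zero at the channel's bit position, ones everywhere else (assumed to be "no change"). */
static uint32_t xirq_model_clear_mask(int ch) {
  return ~(UINT32_C(1) << ch);
}
```

With channels 4 and 7 pending and all channels enabled the model reports source 4; if only channel 7 is enabled it reports source 7.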
**Register Map**
.XIRQ register map (`struct NEORV32_XIRQ`)
[cols="^4,<2,^2,^2,<14"]
[options="header",grid="all"]
|=======================
| Address | Name [C] | Bit(s) | R/W | Description
| `0xfffff300` | `EIE` | `31:0` | r/w | External interrupt enable register (one bit per channel, LSB-aligned)
| `0xfffff304` | `EIP` | `31:0` | r/w | External interrupt pending register (one bit per channel, LSB-aligned); writing 0 to a bit clears the according pending interrupt
| `0xfffff308` | `ESC` | `4:0` | r/w | Interrupt source ID (0..31) of firing IRQ (prioritized!); writing _any_ value will acknowledge the current XIRQ interrupt
| `0xfffff30c` | - | `31:0` | r/- | _reserved_, read as zero
|=======================


@ -0,0 +1,550 @@
:sectnums:
== Software Framework
The NEORV32 project comes with a complete software ecosystem called the "software framework", which
is based on the C-language RISC-V GCC port and consists of the following parts:
* <<_compiler_toolchain>>
* <<_core_libraries>>
* <<_application_makefile>>
* <<_executable_image_format>>
** <<_linker_script>>
** <<_ram_layout>>
** <<_c_standard_library>>
** <<_start_up_code_crt0>>
* <<_bootloader>>
* <<_neorv32_runtime_environment>>
A summarizing list of the most important elements of the software framework and their according
files and folders is shown below:
[cols="<5,<5"]
[grid="none"]
|=======================
| Application start-up code | `sw/common/crt0.S`
| Application linker script | `sw/common/neorv32.ld`
| Core hardware driver libraries ("HAL") | `sw/lib/include/` & `sw/lib/source/`
| Central application makefile | `sw/common/common.mk`
| Tool for generating NEORV32 executables | `sw/image_gen/`
| Default bootloader | `sw/bootloader`
| Example programs | `sw/example`
|=======================
.Software Documentation
[TIP]
All core libraries and example programs are documented "in-code" using **Doxygen**.
The documentation is automatically built and deployed to GitHub pages and is available online
at https://stnolting.github.io/neorv32/sw/files.html.
.Example Programs
[TIP]
A collection of annotated example programs, which show how to use certain CPU functions
and peripheral/IO modules, can be found in `sw/example`.
// ####################################################################################################################
:sectnums:
=== Compiler Toolchain
The toolchain for this project is based on the free and open RISC-V GCC-port. You can find the compiler sources and
build instructions on the official RISC-V GNU toolchain GitHub page: https://github.com/riscv/riscv-gnu-toolchain.
The NEORV32 implements a 32-bit RISC-V architecture and uses a 32-bit integer and soft-float ABI by default.
Make sure the toolchain / toolchain build is configured accordingly.
* `MARCH=rv32i_zicsr_zifencei`
* `MABI=ilp32`
* `RISCV_PREFIX=riscv32-unknown-elf-`
These default configurations can be overridden at any time using <<_application_makefile>> variables.
[TIP]
More information regarding the toolchain (building from scratch or downloading prebuilt ones) can be found in the
user guide section https://stnolting.github.io/neorv32/ug/#_software_toolchain_setup[Software Toolchain Setup].
<<<
// ####################################################################################################################
:sectnums:
=== Core Libraries
The NEORV32 project provides a set of pre-defined C libraries that allow an easy integration of the processor/CPU features
(also called "HAL" - hardware abstraction layer). All driver and runtime-related files are located in
`sw/lib`. These are automatically included and linked by adding the following include statement:
[source,c]
----
#include <neorv32.h> // NEORV32 HAL, core and runtime libraries
----
.NEORV32 HAL File List
[cols="<3,<3,<6"]
[options="header",grid="rows"]
|=======================
| C source file | C header file | Description
| - | `neorv32.h` | Main NEORV32 library file
| `neorv32_cfs.c` | `neorv32_cfs.h` | <<_custom_functions_subsystem_cfs>> HAL
| `neorv32_crc.c` | `neorv32_crc.h` | <<_cyclic_redundancy_check_crc>> HAL
| `neorv32_cpu.c` | `neorv32_cpu.h` | <<_neorv32_central_processing_unit_cpu>> HAL
| `neorv32_cpu_amo.c` | `neorv32_cpu_amo.h` | Emulation functions for the read-modify-write <<_a_isa_extension>> instructions
| | `neorv32_cpu_csr.h` | <<_control_and_status_registers_csrs>> definitions
| `neorv32_cpu_cfu.c` | `neorv32_cpu_cfu.h` | <<_custom_functions_unit_cfu>> HAL
| - | `neorv32_dm.h` | <<_debug_module_dm>> HAL
| `neorv32_dma.c` | `neorv32_dma.h` | <<_direct_memory_access_controller_dma>> HAL
| `neorv32_gpio.c` | `neorv32_gpio.h` | <<_general_purpose_input_and_output_port_gpio>> HAL
| `neorv32_gptmr.c` | `neorv32_gptmr.h` | <<_general_purpose_timer_gptmr>> HAL
| - | `neorv32_intrinsics.h` | Macros for intrinsics & custom instructions
| `neorv32_mtime.c` | `neorv32_mtime.h` | <<_machine_system_timer_mtime>> HAL
| `neorv32_neoled.c` | `neorv32_neoled.h` | <<_smart_led_interface_neoled>> HAL
| `neorv32_onewire.c` | `neorv32_onewire.h` | <<_one_wire_serial_interface_controller_onewire>> HAL
| `neorv32_pwm.c` | `neorv32_pwm.h` | <<_pulse_width_modulation_controller_pwm>> HAL
| `neorv32_rte.c` | `neorv32_rte.h` | <<_neorv32_runtime_environment>>
| `neorv32_sdi.c` | `neorv32_sdi.h` | <<_serial_data_interface_controller_sdi>> HAL
| `neorv32_slink.c` | `neorv32_slink.h` | <<_stream_link_interface_slink>> HAL
| `neorv32_spi.c` | `neorv32_spi.h` | <<_serial_peripheral_interface_controller_spi>> HAL
| - | `neorv32_sysinfo.h` | <<_system_configuration_information_memory_sysinfo>> HAL
| `neorv32_trng.c` | `neorv32_trng.h` | <<_true_random_number_generator_trng>> HAL
| `neorv32_twi.c` | `neorv32_twi.h` | <<_two_wire_serial_interface_controller_twi>> HAL
| `neorv32_uart.c` | `neorv32_uart.h` | <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>> and UART1 HAL
| `neorv32_wdt.c` | `neorv32_wdt.h` | <<_watchdog_timer_wdt>> HAL
| `neorv32_xip.c` | `neorv32_xip.h` | <<_execute_in_place_module_xip>> HAL
| `neorv32_xirq.c` | `neorv32_xirq.h` | <<_external_interrupt_controller_xirq>> HAL
| `syscalls.c` | - | Newlib "system calls" (stubs)
| - | `legacy.h` | Backwards compatibility wrappers and functions (do not use for new designs)
|=======================
.Core Library Documentation
[TIP]
The _doxygen_-based documentation of the software framework including all core libraries is available online at
https://stnolting.github.io/neorv32/sw/files.html.
.CMSIS System View Description File (SVD)
[TIP]
A CMSIS-SVD-compatible **System View Description (SVD)** file including all peripherals is available in `sw/svd`.
Together with a third-party plugin the processor's SVD file can be imported right into GDB to allow comfortable
debugging of peripheral/IO devices (see https://github.com/stnolting/neorv32/discussions/656).
<<<
// ####################################################################################################################
:sectnums:
=== Application Makefile
Application compilation is based on a single, centralized GNU makefile (`sw/common/common.mk`). Each project in the
`sw/example` folder provides a makefile that just _includes_ this central makefile.
[TIP]
When creating a new project, copy an existing project folder or at least the makefile to the new project folder.
It is recommended to create new projects also in `sw/example` to keep the file dependencies. However, these
dependencies can be manually configured via makefile variables if the new project is located somewhere else.
[NOTE]
Before the makefile can be used to compile applications, the RISC-V GCC toolchain needs to be installed and
the compiler's `bin` folder has to be added to the system's `PATH` environment variable. More information can be
found in https://stnolting.github.io/neorv32/ug/#_software_toolchain_setup[User Guide: Software Toolchain Setup].
:sectnums:
==== Makefile Targets
Just executing `make` (or executing `make help`) will show the help menu listing all available targets.
[source,makefile]
----
$ make
NEORV32 Software Application Makefile
Find more information at https://github.com/stnolting/neorv32
Targets:
help - show this text
check - check toolchain
info - show makefile/toolchain configuration
gdb - run GNU debugging session
asm - compile and generate <main.asm> assembly listing file for manual debugging
elf - compile and generate <main.elf> ELF file
exe - compile and generate <neorv32_exe.bin> executable image file for bootloader upload (includes a header)
bin - compile and generate <neorv32_raw_exe.bin> RAW executable file (binary file, no header)
hex - compile and generate <neorv32_raw_exe.hex> RAW executable file (hex char file, no header)
image - compile and generate VHDL IMEM boot image (for application, no header) in local folder
install - compile, generate and install VHDL IMEM boot image (for application, no header)
sim - in-console simulation using default/simple testbench and GHDL
all - exe + install + hex + bin + asm
elf_info - show ELF layout info
clean - clean up project home folder
clean_all - clean up whole project, core libraries and image generator
bl_image - compile and generate VHDL BOOTROM boot image (for bootloader only, no header) in local folder
bootloader - compile, generate and install VHDL BOOTROM boot image (for bootloader only, no header)
Variables:
USER_FLAGS - Custom toolchain flags [append only]: ""
USER_LIBS - Custom libraries [append only]: ""
EFFORT - Optimization level: "-Os"
MARCH - Machine architecture: "rv32i_zicsr_zifencei"
MABI - Machine binary interface: "ilp32"
APP_INC - C include folder(s) [append only]: "-I ."
ASM_INC - ASM include folder(s) [append only]: "-I ."
RISCV_PREFIX - Toolchain prefix: "riscv32-unknown-elf-"
NEORV32_HOME - NEORV32 home folder: "../../.."
GDB_ARGS - GDB (connection) arguments: "-ex target extended-remote localhost:3333"
GHDL_RUN_FLAGS - GHDL simulation run arguments: ""
----
:sectnums:
==== Makefile Configuration
The compilation flow is configured via variables right at the beginning of the central
makefile (`sw/common/common.mk`):
.Customizing Makefile Variables
[TIP]
The makefile configuration variables can be overridden or extended directly when invoking the makefile. For
example `$ make MARCH=rv32ic_zicsr_zifencei clean_all exe` overrides the default `MARCH` variable definition.
.Default Makefile Configuration
[source,makefile]
----
# *****************************************************************************
# USER CONFIGURATION
# *****************************************************************************
# User's application sources (*.c, *.cpp, *.s, *.S); add additional files here
APP_SRC ?= $(wildcard ./*.c) $(wildcard ./*.s) $(wildcard ./*.cpp) $(wildcard ./*.S)
# User's application include folders (don't forget the '-I' before each entry)
APP_INC ?= -I .
# User's application include folders - for assembly files only (don't forget the '-I' before each entry)
ASM_INC ?= -I .
# Optimization
EFFORT ?= -Os
# Compiler toolchain
RISCV_PREFIX ?= riscv32-unknown-elf-
# CPU architecture and ABI
MARCH ?= rv32i_zicsr_zifencei
MABI ?= ilp32
# User flags for additional configuration (will be added to compiler flags)
USER_FLAGS ?=
# User libraries (will be included by linker)
USER_LIBS ?=
# Relative or absolute path to the NEORV32 home folder
NEORV32_HOME ?= ../../..
# GDB arguments
GDB_ARGS ?= -ex "target extended-remote localhost:3333"
# *****************************************************************************
----
.Variables Description
[cols="<2,<8"]
[grid="none"]
|=======================
| `APP_SRC` | The source files of the application (`*.c`, `*.cpp`, `*.S` and `*.s` files are allowed; files of these types in the project folder are automatically added via wild cards). Additional files can be added separated by white spaces
| `APP_INC` | Include file folders; separated by white spaces; must be defined with `-I` prefix
| `ASM_INC` | Include file folders that are used only for the assembly source files (`*.S`/`*.s`).
| `EFFORT` | Optimization level, optimize for size (`-Os`) is default; legal values: `-O0`, `-O1`, `-O2`, `-O3`, `-Os`, `-Ofast`, ...
| `RISCV_PREFIX` | The toolchain prefix to be used; follows the triplet naming convention `[architecture]-[host_system]-[output]-...`
| `MARCH` | The targeted RISC-V architecture/ISA
| `MABI` | Application binary interface (default: 32-bit integer ABI `ilp32`)
| `USER_FLAGS` | Additional flags that will be forwarded to the compiler tools
| `USER_LIBS` | Additional libraries to include during linking (`*.a`)
| `NEORV32_HOME` | Relative or absolute path to the NEORV32 project home folder; adapt this if the makefile/project is not in the project's default `sw/example` folder
| `GDB_ARGS` | Default GDB arguments when running the `gdb` target
| `GHDL_RUN_FLAGS` | GHDL run arguments (e.g. `--stop-time=1ms`)
|=======================
:sectnums:
==== Default Compiler Flags
The following default compiler flags are used for compiling an application. These flags are defined via the
`CC_OPTS` variable.
[TIP]
The makefile's `CC_OPTS` is exported as **define** to be available within a C program; for example
`neorv32_uart0_printf("%s\n", CC_OPTS);`.
[cols="<3,<9"]
[grid="none"]
|=======================
| `-Wall` | Enable all compiler warnings.
| `-ffunction-sections` | Put functions and data segment in independent sections. This allows a code optimization as dead code and unused data can be easily removed.
| `-nostartfiles` | Do not use the default start code. Instead, the NEORV32-specific start-up code (`sw/common/crt0.S`) is used (pulled-in by the linker script).
| `-Wl,--gc-sections` | Make the linker perform dead code elimination.
| `-lm` | Link against the math library (`libm`), which implements the functions declared in `math.h`.
| `-lc` | Search for the standard C library when linking.
| `-lgcc` | Make sure we have no unresolved references to internal GCC library subroutines.
| `-mno-fdiv` | Use built-in software functions for floating-point divisions and square roots (since the according instructions are not supported yet).
| `-g` | Include debugging information/symbols in ELF.
| `-mstrict-align` | Unaligned memory accesses cannot be resolved by the hardware and require emulation.
| `-mbranch-cost=10` | Branching costs a lot of cycles.
|=======================
:sectnums:
==== Custom (Compiler) Flags
Custom flags can be _appended_ to the `USER_FLAGS` variable. This makes it possible to customize the entire software framework while
calling `make` without the need to change the makefile(s) or the linker script. The following example will add debug symbols
to the executable (`-g`) and will also re-define the linker script's `__neorv32_heap_size` variable setting the maximal heap
size to 4096 bytes (see sections <<_linker_script>> and <<_ram_layout>>):
.Using the `USER_FLAGS` Variable for Customization
[source,bash]
----
$ make USER_FLAGS+="-g -Wl,--defsym,__neorv32_heap_size=4096" clean_all exe
----
The configuration can also be made "permanent" by adapting the application's makefile (make sure to use the
`override` command here):
.Using the `USER_FLAGS` Variable for Permanent Customization
[source,makefile]
----
override USER_FLAGS += -g -Wl,--defsym,__neorv32_heap_size=4096
----
<<<
// ####################################################################################################################
:sectnums:
=== Executable Image Format
In order to generate an executable for the processor all source files have to be compiled, linked
and packed into a final executable.
:sectnums:
==== Linker Script
After all the application sources have been compiled, they need to be _linked_.
For this purpose the makefile uses the NEORV32-specific linker script `sw/common/neorv32.ld` for
linking all object files that were generated during compilation. In general, the linker script defines
two final memory sections: `rom` and `ram`.
.Linker script - memory sections
[cols="<2,<8"]
[options="header",grid="rows"]
|=======================
| Memory section | Description
| `ram` | Data memory address space (processor-internal/external DMEM)
| `rom` | Instruction memory address space (processor-internal/external IMEM) _or_ internal bootloader ROM
|=======================
[NOTE]
The `rom` section is automatically re-mapped to the processor-internal <<_bootloader_rom_bootrom>> when (re-)compiling the
bootloader.
Each section has two main attributes: `ORIGIN` and `LENGTH`. `ORIGIN` defines the base address of the according section
while `LENGTH` defines its size in bytes. The attributes are configured indirectly via variables that provide default values.
.Linker script - section configuration
[source]
----
/* Default rom/ram (IMEM/DMEM) sizes */
__neorv32_rom_size = DEFINED(__neorv32_rom_size) ? __neorv32_rom_size : 2048M;
__neorv32_ram_size = DEFINED(__neorv32_ram_size) ? __neorv32_ram_size : 8K;
/* Default section base addresses */
__neorv32_rom_base = DEFINED(__neorv32_rom_base) ? __neorv32_rom_base : 0x00000000;
__neorv32_ram_base = DEFINED(__neorv32_ram_base) ? __neorv32_ram_base : 0x80000000;
----
The region size and base address configuration can be edited by the user - either by explicitly
changing the default values in the linker script or by overriding them when invoking `make`:
.Overriding default `rom` size configuration (configuring 4096 bytes)
[source, bash]
----
$ make USER_FLAGS+="-Wl,--defsym,__neorv32_rom_size=4096" clean_all exe
----
[IMPORTANT]
`__neorv32_rom_base` (= `ORIGIN` of the `rom` section) and `__neorv32_ram_base` (= `ORIGIN` of the `ram` section) have to
be in sync with the actual memory layout configuration of the processor (see section <<_address_space>>).
[NOTE]
The default configuration for the `rom` section assumes a maximum of 2GB _logical_ memory address space. This size does not
have to reflect the _actual_ physical size of the entire instruction memory. It just provides a maximum limit. When uploading
a new executable via the bootloader, the bootloader itself checks if sufficient _physical_ instruction memory is available.
If a new executable is embedded right into the internal IMEM, the synthesis tool will check if the configured instruction memory
size is sufficient.
The linker maps all the regions from the compiled object files into five final sections: `.text`,
`.rodata`, `.data`, `.bss` and `.heap`:
.Linker script - memory regions
[cols="<1,<9"]
[options="header",grid="rows"]
|=======================
| Region | Description
| `.text` | Executable instructions generated from the start-up code and all application sources.
| `.rodata` | Constants (like strings) from the application; also the initial data for initialized variables.
| `.data` | This section is required for the address generation of fixed (= global) variables only.
| `.bss` | Uninitialized data (like global variables without explicit initialization); this section is cleared by the start-up code.
| `.heap` | This section is required for the address generation of dynamic memory constructs only.
|=======================
The `.text` and `.rodata` sections are mapped to the processor's instruction memory space and the `.data`,
`.bss` and `.heap` sections are mapped to the processor's data memory space. Finally, the `.text`, `.rodata` and `.data`
sections are extracted and concatenated into a single file `main.bin`.
.Section Alignment
[NOTE]
The default NEORV32 linker script aligns _all_ regions so they start and end on 32-bit (word) boundaries. The default
NEORV32 start-up code (crt0) makes use of this alignment by using word-level memory instructions to initialize the `.data`
section and to clear the `.bss` section (faster!).
:sectnums:
==== RAM Layout
The default NEORV32 linker script divides all of the defined RAM (linker script memory section `ram`) into several sections.
Note that depending on the application some sections might have zero size.
.Default RAM Layout
image::ram_layout.png[400]
[start=1]
. **Constant data (`.data`)**: The constant data section is placed right at the beginning of the RAM. For example, this section
contains _explicitly initialized_ global variables. This section is initialized by the executable.
. **Dynamic data (`.bss`)**: The constant data section is followed by the dynamic data section, which contains _uninitialized_ data
like global variables without explicit initialization. This section is cleared by the start-up code `crt0.S`.
. **Heap (`.heap`)**: The heap is used for dynamic memory that is managed by functions like `malloc()` and `free()`. The heap
grows upwards. This section is not initialized at all.
. **Stack**: The stack starts at the very end of the RAM at address `ORIGIN(ram) + LENGTH(ram) - 4`. The stack grows downwards.
There is _no explicit limit_ for the maximum stack size as this is hard to check. However, a physical memory protection rule could
be used to configure a maximum size by adding a "protection area" between stack and heap (a PMP region without any access rights).
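The layout described in the list above can be reproduced with a little address arithmetic. The following host-side C sketch is illustrative only: the type `ram_layout_t` and the functions `ram_layout` and `ram_stack_headroom` are hypothetical, and the section sizes are made-up inputs that would normally come from the linker. It assumes every section is padded to a 32-bit boundary.

```c
#include <stdint.h>

/* Illustration only: start/end addresses of the RAM sections in the default layout. */
typedef struct {
  uint32_t data_base; /* .data starts right at the beginning of the RAM */
  uint32_t bss_base;  /* .bss follows .data */
  uint32_t heap_base; /* .heap follows .bss and grows upwards */
  uint32_t heap_end;  /* first address behind the heap */
  uint32_t stack_top; /* initial stack pointer; the stack grows downwards */
} ram_layout_t;

/* Round up to the next 32-bit (word) boundary. */
static uint32_t align4(uint32_t x) {
  return (x + 3u) & ~UINT32_C(3);
}

static ram_layout_t ram_layout(uint32_t ram_base, uint32_t ram_len,
                               uint32_t data_size, uint32_t bss_size, uint32_t heap_size) {
  ram_layout_t l;
  l.data_base = ram_base;
  l.bss_base  = l.data_base + align4(data_size);
  l.heap_base = l.bss_base + align4(bss_size);
  l.heap_end  = l.heap_base + align4(heap_size);
  l.stack_top = ram_base + ram_len - 4u; /* ORIGIN(ram) + LENGTH(ram) - 4 */
  return l;
}

/* Bytes left for the stack before it would run into the (fully used) heap. */
static uint32_t ram_stack_headroom(const ram_layout_t *l) {
  return l->stack_top - l->heap_end;
}
```

For an 8kB RAM at `0x80000000` with 10 bytes of `.data`, 100 bytes of `.bss` and a 4096-byte heap, the stack starts at `0x80001FFC` and has `0xF8C` bytes of headroom above the heap.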
.Heap Size
[IMPORTANT]
The maximum size of the heap is defined by the linker script's `__neorv32_heap_size` variable. This variable has to be
**explicitly defined** in order to define a heap size (and to use dynamic memory allocation at all) other than zero. The user
can define the heap size while invoking the application makefile: `$ USER_FLAGS+="-Wl,--defsym,__neorv32_heap_size=4k" make clean_all exe`
(defines a heap size of 4*1024 bytes).
.Heap-Stack Collisions
[WARNING]
Take care when using dynamic memory to avoid collision of the heap and stack memory areas. There is no compile-time protection
mechanism available as the actual heap and stack size are defined by _runtime_ data. Also beware of fragmentation when
using dynamic memory allocation.
:sectnums:
==== C Standard Library
The default software framework relies on **newlib** as default C standard library.
.RTOS Support
[NOTE]
The NEORV32 CPU and processor **do support** embedded RTOS like FreeRTOS and Zephyr. See the User Guide sections
https://stnolting.github.io/neorv32/ug/#_zephyr_rtos_support[Zephyr RTOS Support] and
https://stnolting.github.io/neorv32/ug/#_freertos_support[FreeRTOS Support]
for more information. +
+
The FreeRTOS port and demo is available in a separate repository: https://github.com/stnolting/neorv32-freertos
Newlib provides stubs for common "system calls" (like file handling and standard input/output) that are used by other
C libraries like `stdio`. These stubs are available in `sw/lib/source/syscalls.c` and were adapted for the NEORV32 processor.
.Standard Consoles
[NOTE]
The <<_primary_universal_asynchronous_receiver_and_transmitter_uart0, UART0>>
is used to implement all the standard input, output and error consoles (`STDIN`, `STDOUT` and `STDERR`).
.Constructors and Destructors
[NOTE]
Constructors and destructors for plain C code or for C++ applications are supported by the software framework.
See `sw/example/hello_cpp` for a minimal example.
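In plain C, constructors and destructors are typically declared using the GCC function attributes shown below. This is a generic GCC sketch and is not taken from the NEORV32 examples; the names `hw_ready`, `early_init` and `late_cleanup` are hypothetical.

```c
/* Illustration only: GCC-style constructor/destructor functions in plain C. */
static int hw_ready = 0;

/* Executed before main() is entered. */
__attribute__((constructor)) static void early_init(void) {
  hw_ready = 1;
}

/* Executed after main() has returned. */
__attribute__((destructor)) static void late_cleanup(void) {
  hw_ready = 0;
}
```

By the time `main` is entered, `hw_ready` has already been set to 1 by the constructor.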
.Newlib Test/Demo Program
[TIP]
A simple test and demo program, which uses some of newlib's core functions (like `malloc`/`free` and `read`/`write`)
is available in `sw/example/demo_newlib`.
:sectnums:
==== Executable Image Generator
The `main.bin` file is packed by the NEORV32 image generator (`sw/image_gen`) to generate the final executable file.
The image generator can generate several types of executables selected by a flag when calling the generator:
[cols="<2,<8"]
[grid="none"]
|=======================
| `-app_bin` | Generates an executable binary file `neorv32_exe.bin` (including header) for UART uploading via the bootloader.
| `-app_img` | Generates an executable VHDL memory initialization image (no header) for the processor-internal IMEM. This option generates the `rtl/core/neorv32_application_image.vhd` file.
| `-raw_hex` | Generates a plain ASCII hex-char file `neorv32_raw_exe.hex` (no header) for custom purpose.
| `-raw_bin` | Generates a plain binary file `neorv32_raw_exe.bin` (no header) for custom purpose.
| `-bld_img` | Generates an executable VHDL memory initialization image (no header) for the processor-internal BOOT ROM. This option generates the `rtl/core/neorv32_bootloader_image.vhd` file.
|=======================
All these options are managed by the makefile. The normal application compilation flow will generate the `neorv32_exe.bin`
executable designated for uploading via the default NEORV32 bootloader.
.Image Generator Compilation
[NOTE]
The sources of the image generator are automatically compiled when invoking the makefile (requiring a native GCC installation).
.Executable Header
[NOTE]
The image generator adds a small header to the `neorv32_exe.bin` executable, which consists of three 32-bit words located right
at the beginning of the file. The first word of the executable is the signature word and is always `0x4788cafe`. Based on this
word the bootloader can identify a valid image file. The next word represents the size of the actual program image in
bytes. A simple "complement" checksum of the actual program image is given by the third word. This provides a simple protection
against data transmission or storage errors. **Note that this executable format cannot be used for _direct_ execution (e.g. via
XIP or direct memory access).**
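The header check can be sketched in C as follows. The exact checksum algorithm is not spelled out above; this sketch _assumes_ a two's-complement checksum, i.e. the sum of all image words plus the checksum word wraps to zero. The names `exe_header_t`, `exe_checksum` and `exe_header_valid` are hypothetical, and byte ordering inside the file is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

#define EXE_SIGNATURE UINT32_C(0x4788CAFE) /* signature word from the text above */

/* Illustration only: the three header words preceding the program image. */
typedef struct {
  uint32_t signature; /* identifies a valid image */
  uint32_t size;      /* size of the program image in bytes */
  uint32_t checksum;  /* assumed: two's complement of the sum of all image words */
} exe_header_t;

/* Compute the (assumed) complement checksum over the image words. */
static uint32_t exe_checksum(const uint32_t *words, size_t num_words) {
  uint32_t sum = 0;
  for (size_t i = 0; i < num_words; i++) {
    sum += words[i]; /* unsigned addition wraps around modulo 2^32 */
  }
  return ~sum + 1u;
}

/* Returns 1 if signature, size and checksum match the image, 0 otherwise. */
static int exe_header_valid(const exe_header_t *hdr, const uint32_t *words, size_t num_words) {
  uint32_t sum = hdr->checksum;
  for (size_t i = 0; i < num_words; i++) {
    sum += words[i];
  }
  return (hdr->signature == EXE_SIGNATURE) &&
         (hdr->size == (uint32_t)(num_words * 4u)) &&
         (sum == 0u);
}
```

A receiver only has to add up all image words together with the checksum word and compare the result against zero, which keeps the verification loop very small.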
:sectnums:
==== Start-Up Code (crt0)
The CPU and also the processor require a minimal start-up and initialization code to bring the CPU (and the SoC)
into a stable and initialized state and to initialize the C runtime environment before the actual application can be executed.
This start-up code is located in `sw/common/crt0.S` and is automatically linked with _every_ application program
and placed right before the actual application code so it gets executed right after reset.
The `crt0.S` start-up performs the following operations:
[start=1]
. Clear <<_mstatus>>.
. Clear <<_mie>> disabling all interrupt sources.
. Install an <<_early_trap_handler>> to <<_mtvec>>.
. Initialize the global pointer `gp` and the stack pointer `sp` according to the <<_ram_layout>> provided by the linker script.
. Initialize all integer registers `x1` - `x31` (only `x1` - `x15` if the `E` CPU extension is enabled).
. Setup `.data` section to configure initialized variables.
. Clear the `.bss` section.
. Call all _constructors_ (if there are any).
. Call the application's `main` function (with no arguments: `argc` = `argv` = 0).
. If `main` returns:
** All interrupt sources are disabled by clearing <<_mie>>.
** The return value of `main` is copied to the <<_mscratch>> CSR to allow inspection by the debugger.
** Call all _destructors_ (if there are any).
** The CPU enters sleep mode executing the `wfi` instruction in an endless loop.
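The `.data` setup and `.bss` clearing steps from the list above can be illustrated with a small C model. The real start-up code is written in assembly and takes the section boundaries from linker script symbols. The function names and the pointer-based interface used here are assumptions made for illustration only.
.Sketch: `.data` Copy and `.bss` Clear (host-side model, hypothetical names)
[source,c]
----
#include <stdint.h>

// Copy the initial values of the .data section from their load location
// (e.g. non-volatile memory) to their run-time location in RAM.
void crt0_copy_data(uint32_t *dst_begin, const uint32_t *dst_end, const uint32_t *src) {
  while (dst_begin < dst_end) {
    *dst_begin++ = *src++;
  }
}

// Clear the .bss section so all zero-initialized variables start at 0.
void crt0_clear_bss(uint32_t *begin, const uint32_t *end) {
  while (begin < end) {
    *begin++ = 0;
  }
}
----
Both loops operate on whole 32-bit words, which assumes word-aligned section boundaries.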
.Bootloader Start-Up Code
[NOTE]
The bootloader uses the same start-up code as any "usual" application. However, certain parts are omitted when compiling
`crt0` for the bootloader (like calling constructors and destructors). See the `crt0` source code for more information.
:sectnums:
===== Early Trap Handler
The start-up code provides a very basic trap handler for the early boot stage. This handler does nothing but try to move
on to the next linear instruction whenever an interrupt or synchronous exception is encountered.
This simple trap handler does not interact with the stack at all as it just uses a single register that is backed up
using the <<_mscratch>> CSR. Furthermore, the information if the trap-causing instruction is compressed or uncompressed
is **not** determined by loading the instruction from memory. Instead, the transformed instruction word is read from the
<<_mtinst>> CSR. These two features allow the trap handler to execute with minimal latency and high robustness.
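The decision made by the early trap handler can be modeled in plain C. The sketch below is not the actual `crt0.S` code (which is assembly). The function name is hypothetical and it relies on two assumptions: the MSB of `mcause` marks an interrupt, and the two lowest bits of the `mtinst` value are `11` only for uncompressed (32-bit) instructions.
.Sketch: Computing the Resume Address (host-side model, hypothetical name)
[source,c]
----
#include <stdint.h>

// Return the address at which execution should resume after the trap.
uint32_t early_trap_resume_pc(uint32_t mcause, uint32_t mepc, uint32_t mtinst) {
  if (mcause & 0x80000000u) {
    // Interrupt: the instruction at mepc has not been executed yet
    return mepc;
  }
  // Synchronous exception: skip the trap-causing instruction.
  // ASSUMPTION: mtinst[1:0] == 0b11 identifies an uncompressed instruction.
  if ((mtinst & 0x3u) == 0x3u) {
    return mepc + 4u; // 32-bit instruction
  }
  return mepc + 2u;   // 16-bit (compressed) instruction
}
----
No memory access is needed to size the instruction, which matches the motivation given above for reading `mtinst`.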
[NOTE]
The early-trap handler should be replaced by a more capable / informative one as soon as the application software is started
(for example by using the <<_neorv32_runtime_environment>>).
<<<
// ####################################################################################################################
include::software_bootloader.adoc[]
<<<
// ####################################################################################################################
include::software_rte.adoc[]

View File

@ -0,0 +1,223 @@
:sectnums:
=== Bootloader
.Pre-Built Bootloader Image
[IMPORTANT]
This section refers to the **default** NEORV32 bootloader. A pre-compiled memory image for the processor-internal
<<_bootloader_rom_bootrom>> is available in the project's +rtl+ folder: `rtl/core/neorv32_bootloader_image.vhd`.
This image is automatically inserted into the boot ROM when synthesizing the processor with the bootloader being
enabled. **Note that the default bootloader image was compiled for a minimal `rv32i` + priv. ISA!**
The NEORV32 bootloader (`sw/bootloader/bootloader.c`) provides an optional built-in firmware that
allows uploading new application executables at _any time_ without the need to re-synthesize the FPGA's bitstream.
A UART connection is used to provide a simple text-based user interface for uploading executables.
Furthermore, the bootloader provides options to store an executable to a processor-external SPI flash.
An "auto boot" feature can optionally fetch this executable right after reset if there is no user interaction
via UART. This makes it possible to build processor setups with _non-volatile application storage_ while maintaining the option
to update the application software at any time.
:sectnums:
==== Bootloader SoC/CPU Requirements
The bootloader requires certain CPU and SoC extensions and modules to be enabled in order to operate correctly.
[cols="^2,<8"]
[grid="none"]
|=======================
| **REQUIRED** | The bootloader is implemented only if the `INT_BOOTLOADER_EN` top generic is `true`. This will automatically select the CPU's <<_indirect_boot>> boot configuration.
| **REQUIRED** | The bootloader requires the privileged architecture CPU extension (<<_zicsr_isa_extension>>) to be enabled.
| **REQUIRED** | At least 512 bytes of data memory (processor-internal DMEM or processor-external DMEM) are required for the bootloader's stack and global variables.
| _RECOMMENDED_ | For user interaction via the <<_bootloader_console>> (like uploading executables) the primary UART (<<_primary_universal_asynchronous_receiver_and_transmitter_uart0>>) is required.
| _RECOMMENDED_ | The default bootloader uses bit 0 of the <<_general_purpose_input_and_output_port_gpio>> output port to drive a high-active "heart beat" status LED.
| _RECOMMENDED_ | The <<_machine_system_timer_mtime>> is used to control blinking of the status LED and also to automatically trigger the <<_auto_boot_sequence>>.
| OPTIONAL | The SPI controller (<<_serial_peripheral_interface_controller_spi>>) is needed to store/load executables to/from external flash using the <<_auto_boot_sequence>>.
| OPTIONAL | The XIP controller (<<_execute_in_place_module_xip>>) is needed to boot/execute code directly from a pre-programmed SPI flash.
|=======================
:sectnums:
==== Bootloader Flash Requirements
The bootloader can access an SPI-compatible flash via the processor's top entity SPI port. By default, the flash
chip-select line is driven by `spi_csn_o(0)` and the SPI clock uses 1/8 of the processor's main clock as clock frequency.
The SPI flash has to support single-byte read and write operations, 24-bit addresses and at least the following standard commands:
* `0x02`: Program page (write byte)
* `0x03`: Read data (byte)
* `0x04`: Write disable (for volatile status register)
* `0x05`: Read (first) status register
* `0x06`: Write enable (for volatile status register)
* `0xAB`: Wake-up from sleep mode (optional)
* `0xD8`: Block erase (64kB)
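The command set listed above can be captured in a small C sketch. The enumeration names and the helper function are hypothetical and not taken from the bootloader sources. The sketch assumes the common SPI flash framing of one command byte followed by the 24-bit address with the most significant byte first.
.Sketch: SPI Flash Commands and Address Framing (hypothetical names)
[source,c]
----
#include <stddef.h>
#include <stdint.h>

// Standard commands required by the bootloader (names are illustrative)
enum flash_cmd {
  FLASH_CMD_PAGE_PROGRAM  = 0x02, // program page (write byte)
  FLASH_CMD_READ          = 0x03, // read data (byte)
  FLASH_CMD_WRITE_DISABLE = 0x04, // write disable
  FLASH_CMD_READ_STATUS   = 0x05, // read (first) status register
  FLASH_CMD_WRITE_ENABLE  = 0x06, // write enable
  FLASH_CMD_WAKE          = 0xAB, // wake-up from sleep mode (optional)
  FLASH_CMD_BLOCK_ERASE   = 0xD8  // block erase (64kB)
};

// Build a "command + 24-bit address" frame; returns the frame length in bytes.
// ASSUMPTION: the address is transmitted most significant byte first.
size_t flash_build_frame(uint8_t cmd, uint32_t addr, uint8_t frame[4]) {
  frame[0] = cmd;
  frame[1] = (uint8_t)((addr >> 16) & 0xFFu);
  frame[2] = (uint8_t)((addr >> 8) & 0xFFu);
  frame[3] = (uint8_t)(addr & 0xFFu);
  return 4;
}
----
Address bits above bit 23 are ignored by this helper because only 24-bit addressing is modeled.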
.Custom Configuration
[TIP]
Most properties (like chip select line, flash address width, SPI clock frequency, ...) of the default bootloader can be reconfigured
without the need to change the source code. Custom configuration can be made using command line switches (defines) when recompiling
the bootloader. See the User Guide https://stnolting.github.io/neorv32/ug/#_customizing_the_internal_bootloader for more information.
:sectnums:
==== Bootloader Console
To interact with the bootloader, connect the primary UART (UART0) signals (`uart0_txd_o` and `uart0_rxd_i`) of the processor's top
entity via a serial port (-adapter) to your computer (hardware flow control is not used so the corresponding interface signals can be
ignored), configure your terminal program using the following settings and perform a reset of the processor.
Terminal console settings (`19200-8-N-1`):
* 19200 Baud
* 8 data bits
* no parity bit
* 1 stop bit
* newline on `\r\n` (carriage return, newline)
* no transfer protocol / control flow protocol - just raw bytes
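On a POSIX host, the settings above could be applied to a serial port with the standard `termios` interface, for example when writing a custom upload tool. This is only a sketch under that assumption: the function name is hypothetical, and opening the device as well as calling `tcgetattr`/`tcsetattr` is left to the caller.
.Sketch: `19200-8-N-1` Raw Mode via POSIX termios (hypothetical helper)
[source,c]
----
#include <termios.h>

// Configure a termios structure for 19200 Baud, 8 data bits, no parity,
// 1 stop bit and raw byte transfer (no software flow control, no translation).
void console_apply_settings(struct termios *tio) {
  tio->c_iflag &= ~(tcflag_t)(IXON | IXOFF | ICRNL | INLCR | IGNCR | ISTRIP);
  tio->c_oflag &= ~(tcflag_t)OPOST;                        // no output processing
  tio->c_lflag &= ~(tcflag_t)(ICANON | ECHO | ISIG | IEXTEN); // raw input
  tio->c_cflag &= ~(tcflag_t)(CSIZE | PARENB | CSTOPB);    // no parity, 1 stop bit
  tio->c_cflag |= (tcflag_t)(CS8 | CREAD | CLOCAL);        // 8 data bits
  cfsetispeed(tio, B19200);
  cfsetospeed(tio, B19200);
}
----
The `\r\n` newline handling is a display setting of the terminal program and is therefore not part of this port configuration.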
.Terminal Program
[IMPORTANT]
Any terminal program that can connect to a serial port should work. However, make sure the program
can transfer data in _raw_ byte mode without any additional transfer protocol (such as XMODEM). Some terminal programs struggle with
transmitting files larger than 4kB (see https://github.com/stnolting/neorv32/pull/215). Try a different terminal program
if uploading a binary does not work.
The bootloader uses the LSB of the top entity's `gpio_o` output port as high-active status LED. All other
output pins are set to low level and won't be altered. After reset, the status LED will start blinking at 2Hz and the
following intro screen shows up:
[source]
----
<< NEORV32 Bootloader >>
BLDV: Mar 7 2023
HWV: 0x01080107
CLK: 0x05f5e100
MISA: 0x40901106
XISA: 0xc0000fab
SOC: 0xffff402f
IMEM: 0x00008000
DMEM: 0x00002000
Autoboot in 8s. Press any key to abort.
----
The start-up screen gives some brief information about the bootloader and several system configuration parameters:
[cols="<2,<15"]
[grid="none"]
|=======================
| `BLDV` | Bootloader version (built date).
| `HWV` | Processor hardware version (the <<_mimpid>> CSR) in BCD format; example: `0x01040606` = v1.4.6.6.
| `CLK` | Processor clock speed in Hz (via the `CLK` register from the <<_system_configuration_information_memory_sysinfo>>).
| `MISA` | RISC-V CPU extensions (<<_misa>> CSR).
| `XISA` | NEORV32-specific CPU extensions (<<_mxisa>> CSR).
| `SOC` | Processor configuration (via the `SOC` register from the <<_system_configuration_information_memory_sysinfo>>).
| `IMEM` | Internal IMEM size in bytes (via the `MEM` register from the <<_system_configuration_information_memory_sysinfo>>).
| `DMEM` | Internal DMEM size in bytes (via the `MEM` register from the <<_system_configuration_information_memory_sysinfo>>).
|=======================
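The BCD-encoded `HWV` value can be decoded with a few shifts and masks. The following sketch shows one way to do this on the host side. The type and function names are hypothetical.
.Sketch: Decoding the BCD Hardware Version (hypothetical names)
[source,c]
----
#include <stdint.h>

// Decoded version fields (MAJOR.MINOR.PATCH.custom)
typedef struct {
  unsigned major;
  unsigned minor;
  unsigned patch;
  unsigned custom;
} hw_version_t;

// Convert one byte holding two BCD digits into its decimal value
static unsigned bcd_to_dec(uint32_t byte) {
  return (unsigned)(((byte >> 4) & 0xFu) * 10u + (byte & 0xFu));
}

// Split the 8x4-bit BCD word into its four version fields
hw_version_t decode_hwv(uint32_t hwv) {
  hw_version_t v;
  v.major  = bcd_to_dec((hwv >> 24) & 0xFFu);
  v.minor  = bcd_to_dec((hwv >> 16) & 0xFFu);
  v.patch  = bcd_to_dec((hwv >> 8) & 0xFFu);
  v.custom = bcd_to_dec(hwv & 0xFFu);
  return v;
}
----
For the intro screen shown above, `HWV: 0x01080107` decodes to v1.8.1.7. The remaining values are plain hexadecimal numbers: `CLK: 0x05f5e100` is 100000000 Hz, `IMEM: 0x00008000` is 32768 bytes and `DMEM: 0x00002000` is 8192 bytes.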
Now you have 8 seconds to press _any_ key. Otherwise, the bootloader starts the <<_auto_boot_sequence>>. When
you press any key within the 8 seconds, the actual bootloader user console starts:
[source]
----
<< NEORV32 Bootloader >>
BLDV: Mar 7 2023
HWV: 0x01080107
CLK: 0x05f5e100
MISA: 0x40901106
XISA: 0xc0000fab
SOC: 0xffff402f
IMEM: 0x00008000
DMEM: 0x00002000
Autoboot in 8s. Press any key to abort. <1>
Aborted.
Available CMDs:
h: Help
r: Restart
u: Upload
s: Store to flash
l: Load from flash
x: Boot from flash (XIP)
e: Execute
CMD:>
----
<1> Auto boot sequence aborted due to user console input.
The auto boot countdown is stopped and the bootloader's user console is ready to receive one of the following commands:
* `h`: Show the help text (again)
* `r`: Restart the bootloader and the auto-boot sequence
* `u`: Upload new program executable (`neorv32_exe.bin`) via UART into the instruction memory
* `s`: Store executable to SPI flash at `spi_csn_o(0)` (little-endian byte order)
* `l`: Load executable from SPI flash at `spi_csn_o(0)` (little-endian byte order)
* `x`: Boot program directly from flash via XIP (requires a pre-programmed image)
* `e`: Start the application, which is currently stored in the instruction memory (IMEM)
A new executable can be uploaded via UART by executing the `u` command. After that, the executable can be directly
executed via the `e` command. To store the recently uploaded executable to an attached SPI flash press `s`. To
directly load an executable from the SPI flash press `l`. The bootloader and the auto-boot sequence can be
manually restarted via the `r` command.
.Executable Upload
[IMPORTANT]
Make sure to upload the NEORV32 executable `neorv32_exe.bin`. Uploading any other file (like `main.bin`)
will cause an `ERR_EXE` bootloader error (see <<_bootloader_error_codes>>).
.Booting via XIP
[NOTE]
The bootloader can execute an application directly from flash using the <<_execute_in_place_module_xip>> module.
This requires a pre-programmed flash. The bootloader's "store" option can **not** be used to program an XIP image.
.SPI Flash Power Down Mode
[NOTE]
The bootloader will issue a "wake-up" command prior to using the SPI flash to ensure it is not
in sleep mode / power-down mode (see https://github.com/stnolting/neorv32/pull/552).
.Default Configuration
[TIP]
More information regarding the default SPI, GPIO, XIP, etc. configuration can be found in the User Guide
section https://stnolting.github.io/neorv32/ug/#_customizing_the_internal_bootloader.
.SPI Flash Programming
[TIP]
For detailed information on using an SPI flash for application storage see User Guide section
https://stnolting.github.io/neorv32/ug/#_programming_an_external_spi_flash_via_the_bootloader[Programming an External SPI Flash via the Bootloader].
:sectnums:
==== Auto Boot Sequence
When you reset the NEORV32 processor, the bootloader waits 8 seconds for a UART console input before it
starts the automatic boot sequence. This sequence tries to fetch a valid boot image from the external SPI
flash, connected to SPI chip select `spi_csn_o(0)`. If a valid boot image is found that can be successfully
transferred into the instruction memory, it is automatically started. If no SPI flash is detected or if there
is no valid boot image found, an error code will be shown.
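The decision flow of the auto boot sequence can be summarized as a small C function. This is a simplified model of the behavior described above, not an excerpt from `bootloader.c`. All names are hypothetical.
.Sketch: Auto Boot Decision Flow (simplified model, hypothetical names)
[source,c]
----
#include <stdbool.h>

typedef enum {
  BOOT_ENTER_CONSOLE,     // user pressed a key: start the user console
  BOOT_START_APPLICATION, // valid image loaded from flash: start it
  BOOT_SHOW_ERROR         // no flash or no valid image: show error code
} boot_action_t;

// Decide what to do after reset based on the three conditions from the text
boot_action_t auto_boot_decide(bool key_within_timeout, bool flash_detected, bool image_valid) {
  if (key_within_timeout) {
    return BOOT_ENTER_CONSOLE; // auto boot is aborted by any console input
  }
  if (!flash_detected || !image_valid) {
    return BOOT_SHOW_ERROR;
  }
  return BOOT_START_APPLICATION;
}
----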
:sectnums:
==== Bootloader Error Codes
If something goes wrong during bootloader operation an error code and a short message are shown. In this case the processor
is halted, the bootloader status LED is permanently activated and the processor has to be reset manually.
[TIP]
In many cases the error source is just _temporary_ (like some HF spike during a UART upload). Just try again.
[cols="<2,<8"]
[grid="rows"]
|=======================
| **`ERR_EXE`** | If you try to transfer an invalid executable (via UART or from the external SPI flash), this error message shows up. There might be a transfer protocol configuration error in the terminal program or maybe just the wrong file was selected. Also, if no SPI flash was found during an auto-boot attempt, this message will be displayed.
| **`ERR_SIZE`** | Your program is too big for the processor's internal instruction memory. Increase the memory size or reduce your application code.
| **`ERR_CHKS`** | This indicates a checksum error. Something went wrong during the transfer of the program image (upload via UART or loading from the external SPI flash). If the error was caused by a UART upload, just try it again. When the error was generated during a flash access, the stored image might be corrupted.
| **`ERR_FLSH`** | This error occurs if the attached SPI flash cannot be accessed. Make sure you have the right type of flash and that it is properly connected to the NEORV32 SPI port using chip select #0.
| **`ERR_EXC`** | The bootloader encountered an unexpected exception during operation. This might be caused when it tries to access peripherals that were not implemented during synthesis. Example: executing commands `l` or `s` (SPI flash operations) without the SPI module being implemented.
|=======================
[TIP]
If an unexpected exception has been raised the bootloader prints hexadecimal debug information showing
the <<_mcause>>, <<_mepc>> and <<_mtval>> CSR values.
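The first three error codes relate to checks on the executable itself. The following C sketch models such checks for an executable given as an array of 32-bit words (three header words followed by the program image). It is not the bootloader's actual implementation: the function name, the numeric values of the enumeration, the order of the checks and the checksum rule (sum of all image words plus checksum word equals zero) are assumptions.
.Sketch: Classifying Executable Errors (illustrative model, assumptions noted)
[source,c]
----
#include <stddef.h>
#include <stdint.h>

// Status codes (numeric values are illustrative only)
typedef enum { EXE_OK, ERR_EXE, ERR_SIZE, ERR_CHKS } exe_status_t;

// file[0] = signature, file[1] = image size in bytes, file[2] = checksum,
// file[3..] = program image. imem_bytes is the instruction memory size.
exe_status_t exe_check(const uint32_t *file, size_t file_words, uint32_t imem_bytes) {
  if (file_words < 3 || file[0] != 0x4788cafeu) {
    return ERR_EXE; // not a valid NEORV32 executable
  }
  uint32_t size = file[1];
  if (size > imem_bytes) {
    return ERR_SIZE; // image does not fit into the instruction memory
  }
  if ((size % 4u) != 0u || (size / 4u) > (file_words - 3)) {
    return ERR_EXE; // malformed or truncated file (treated as invalid here)
  }
  uint32_t sum = 0;
  for (size_t i = 0; i < size / 4u; i++) {
    sum += file[3 + i]; // wraps modulo 2^32
  }
  // ASSUMPTION: image sum plus checksum word must wrap around to zero
  if ((uint32_t)(sum + file[2]) != 0u) {
    return ERR_CHKS;
  }
  return EXE_OK;
}
----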

View File

@ -0,0 +1,256 @@
:sectnums:
=== NEORV32 Runtime Environment
The NEORV32 software framework provides a minimal **runtime environment** (abbreviated "RTE") that takes care of a stable
and _safe_ execution environment by handling _all_ traps (exceptions & interrupts). The RTE simplifies trap handling
by wrapping the CPU's privileged architecture (i.e. trap-related CSRs) into a unified software API.
Once initialized, the RTE provides <<_default_rte_trap_handlers>> that catch all possible traps. These
default handlers just output a message via UART to inform the user when a certain trap has been triggered. The
default handlers can be overridden by the application code to install application-specific handler functions for each trap.
[IMPORTANT]
Using the RTE is **optional but highly recommended**. The RTE provides a simple and comfortable way of delegating
traps to application-specific handlers while making sure that all traps (even though they are not explicitly used
by the application) are handled correctly. Performance-optimized applications or embedded operating systems may
not use the RTE at all in order to reduce trap response time.
==== RTE Operation
The RTE manages the trap-related CSRs of the CPU's privileged architecture (<<_machine_trap_handling_csrs>>).
It initializes the <<_mtvec>> CSR in DIRECT mode, which then provides the base entry point for _all_ traps. The address
stored to this register defines the address of the **first-level trap handler**, which is provided by the
NEORV32 RTE. Whenever an exception or interrupt is triggered this first-level trap handler is executed.
The first-level handler performs a complete context save, analyzes the source of the trap and
calls the according **second-level trap handler**, which takes care of the actual exception/interrupt
handling. The RTE manages a private look-up table to store the addresses of the according second-level trap handlers.
After the initial RTE setup, each entry in the RTE's trap handler look-up table is initialized with a
<<_default_rte_trap_handlers>>. These default handlers do not execute any trap-related operations - they
just output a message via the *primary UART (UART0)* to inform the user that a trap has occurred, which is not (yet)
handled by the actual application. After sending this message, the RTE tries to continue executing the actual program
by resolving the trap cause.
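The two-level scheme with a private look-up table can be modeled on a host machine in a few lines of C. This is a conceptual sketch only and not the RTE source code: all `rte_model_*` names are hypothetical, the table size of 29 entries corresponds to the trap IDs 0 to 28 defined in `sw/lib/include/neorv32_rte.h`, and context saving as well as the UART message are omitted.
.Sketch: Second-Level Handler Look-Up Table (host-side model, hypothetical names)
[source,c]
----
#include <stdint.h>

#define RTE_MODEL_NUM_TRAPS 29 // trap IDs 0..28

typedef void (*rte_model_handler_t)(void);

rte_model_handler_t rte_model_table[RTE_MODEL_NUM_TRAPS]; // look-up table
int rte_model_default_hits = 0; // number of default handler calls
int rte_model_custom_hits  = 0; // number of custom handler calls

// Default second-level handler: the real one prints a message via UART0
void rte_model_default_handler(void) { rte_model_default_hits++; }

// Example application-specific second-level handler
void rte_model_custom_handler(void) { rte_model_custom_hits++; }

// Initial setup: every entry points to the default handler
void rte_model_setup(void) {
  for (int i = 0; i < RTE_MODEL_NUM_TRAPS; i++) {
    rte_model_table[i] = rte_model_default_handler;
  }
}

// Install a handler; returns 0 on success, 1 on invalid arguments
int rte_model_install(uint8_t id, rte_model_handler_t handler) {
  if (id >= RTE_MODEL_NUM_TRAPS || handler == 0) {
    return 1;
  }
  rte_model_table[id] = handler;
  return 0;
}

// Un-install a handler by restoring the default handler
int rte_model_uninstall(uint8_t id) {
  return rte_model_install(id, rte_model_default_handler);
}

// First-level handler (simplified): call the second-level handler for 'id'
void rte_model_dispatch(uint8_t id) {
  if (id < RTE_MODEL_NUM_TRAPS && rte_model_table[id] != 0) {
    rte_model_table[id]();
  }
}
----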
==== Using the RTE
[IMPORTANT]
All provided RTE functions can be called only from machine-mode code.
The NEORV32 RTE is part of the default NEORV32 software framework. However, it has to be explicitly enabled by calling
the RTE's setup function:
.RTE Setup (Function Prototype)
[source,c]
----
void neorv32_rte_setup(void);
----
[NOTE]
The RTE should be enabled right at the beginning of the application's `main` function.
[IMPORTANT]
It is recommended to not use the <<_mscratch>> CSR when using the RTE as this register is used to provide services
for <<_application_context_handling>> (i.e. modifying the registers of application code that caused a trap).
As mentioned above, all traps will just trigger execution of the RTE's <<_default_rte_trap_handlers>> at first.
To use application-specific handlers, which actually "handle" a trap, the default handlers can be overridden
by installing user-defined ones:
.Installing an Application-Specific Trap Handler (Function Prototype)
[source,c]
----
int neorv32_rte_handler_install(uint8_t id, void (*handler)(void));
----
The first argument `id` defines the "trap ID" (for example a certain interrupt request) that shall be handled
by the user-defined handler. These IDs are defined in `sw/lib/include/neorv32_rte.h`:
.RTE Trap Identifiers (cut-out)
[source,c]
----
enum NEORV32_RTE_TRAP_enum {
RTE_TRAP_I_MISALIGNED = 0, /**< Instruction address misaligned */
RTE_TRAP_I_ACCESS = 1, /**< Instruction (bus) access fault */
RTE_TRAP_I_ILLEGAL = 2, /**< Illegal instruction */
RTE_TRAP_BREAKPOINT = 3, /**< Breakpoint (EBREAK instruction) */
RTE_TRAP_L_MISALIGNED = 4, /**< Load address misaligned */
RTE_TRAP_L_ACCESS = 5, /**< Load (bus) access fault */
RTE_TRAP_S_MISALIGNED = 6, /**< Store address misaligned */
RTE_TRAP_S_ACCESS = 7, /**< Store (bus) access fault */
RTE_TRAP_UENV_CALL = 8, /**< Environment call from user mode (ECALL instruction) */
RTE_TRAP_MENV_CALL = 9, /**< Environment call from machine mode (ECALL instruction) */
RTE_TRAP_MSI = 10, /**< Machine software interrupt */
RTE_TRAP_MTI = 11, /**< Machine timer interrupt */
RTE_TRAP_MEI = 12, /**< Machine external interrupt */
RTE_TRAP_FIRQ_0 = 13, /**< Fast interrupt channel 0 */
RTE_TRAP_FIRQ_1 = 14, /**< Fast interrupt channel 1 */
RTE_TRAP_FIRQ_2 = 15, /**< Fast interrupt channel 2 */
RTE_TRAP_FIRQ_3 = 16, /**< Fast interrupt channel 3 */
RTE_TRAP_FIRQ_4 = 17, /**< Fast interrupt channel 4 */
RTE_TRAP_FIRQ_5 = 18, /**< Fast interrupt channel 5 */
RTE_TRAP_FIRQ_6 = 19, /**< Fast interrupt channel 6 */
RTE_TRAP_FIRQ_7 = 20, /**< Fast interrupt channel 7 */
RTE_TRAP_FIRQ_8 = 21, /**< Fast interrupt channel 8 */
RTE_TRAP_FIRQ_9 = 22, /**< Fast interrupt channel 9 */
RTE_TRAP_FIRQ_10 = 23, /**< Fast interrupt channel 10 */
RTE_TRAP_FIRQ_11 = 24, /**< Fast interrupt channel 11 */
RTE_TRAP_FIRQ_12 = 25, /**< Fast interrupt channel 12 */
RTE_TRAP_FIRQ_13 = 26, /**< Fast interrupt channel 13 */
RTE_TRAP_FIRQ_14 = 27, /**< Fast interrupt channel 14 */
RTE_TRAP_FIRQ_15 = 28 /**< Fast interrupt channel 15 */
};
----
The second argument `*handler` is the actual function that implements the user-defined trap handler.
The custom handler functions need to have a specific format without any arguments and with no return value:
.Custom Trap Handler (Function Prototype)
[source,c]
----
void custom_trap_handler_xyz(void) {
// handle trap...
}
----
.Custom Trap Handler Attributes
[WARNING]
Do **NOT** use the `((interrupt))` attribute for the application trap handler functions! This
will place an `mret` instruction at the end of the function, making it impossible to return to the first-level
trap handler of the RTE core, which will cause stack corruption.
The following example shows how to install a custom handler (`custom_mtime_irq_handler`) for handling
the RISC-V machine timer (MTIME) interrupt:
.Installing a MTIME IRQ Handler
[source,c]
----
neorv32_rte_handler_install(RTE_TRAP_MTI, custom_mtime_irq_handler);
----
User-defined trap handlers can also be un-installed. This will remove the user's trap handler from the RTE core
and will re-install the <<_default_rte_trap_handlers>> for the specific trap.
.Function Prototype: Un-Installing an Application-Specific Trap Handler
[source,c]
----
int neorv32_rte_handler_uninstall(uint8_t id);
----
The argument `id` defines the identifier of the according trap that shall be un-installed.
The following example shows how to un-install the custom handler `custom_mtime_irq_handler` from the
RISC-V machine timer (MTIME) interrupt:
.Example: Removing the Custom MTIME IRQ Handler
[source,c]
----
neorv32_rte_handler_uninstall(RTE_TRAP_MTI);
----
[TIP]
The current RTE configuration can be printed to UART0 using the `neorv32_rte_info` function.
==== Default RTE Trap Handlers
The default RTE trap handlers are executed when a certain trap is triggered that is not (yet) handled by an
application-defined trap handler. The default handler will output a message giving additional debug information
via the <<_primary_universal_asynchronous_receiver_and_transmitter_uart0>> to inform the user and it will also
try to resume normal program execution. Some exemplary RTE outputs are shown below.
.Continuing Execution
[WARNING]
In most cases the RTE can successfully continue operation - for example if it catches an **interrupt** request
that is not handled by the actual application program. However, if the RTE catches an un-handled **exception** like
a bus access fault, continuing execution will most likely fail and crash the CPU. Some exceptions
cannot be resolved by the default debug trap handlers and will halt the CPU (see example below).
.RTE Default Trap Handler Output Examples
[source]
----
<NEORV32-RTE> [M] Illegal instruction @ PC=0x000002d6, MTINST=0x000000FF, MTVAL=0x00000000 </NEORV32-RTE> <1>
<NEORV32-RTE> [U] Illegal instruction @ PC=0x00000302, MTINST=0x00000000, MTVAL=0x00000000 </NEORV32-RTE> <2>
<NEORV32-RTE> [U] Load address misaligned @ PC=0x00000440, MTINST=0x01052603, MTVAL=0x80000101 </NEORV32-RTE> <3>
<NEORV32-RTE> [M] Fast IRQ 0x00000003 @ PC=0x00000820, MTINST=0x00000000, MTVAL=0x00000000 </NEORV32-RTE> <4>
<NEORV32-RTE> [M] Instruction access fault @ PC=0x90000000, MTINST=0x42078b63, MTVAL=0x00000000 !!FATAL EXCEPTION!! Halting CPU. </NEORV32-RTE>\n <5>
----
<1> Illegal 32-bit instruction `MTINST=0x000000FF` at address `PC=0x000002d6` while the CPU was in machine-mode (`[M]`).
<2> Illegal 16-bit instruction `MTINST=0x00000000` at address `PC=0x00000302` while the CPU was in user-mode (`[U]`).
<3> Misaligned load access at address `PC=0x00000440` caused by instruction `MTINST=0x01052603` (trying to load a full 32-bit word from address `MTVAL=0x80000101`) while the CPU was in user-mode (`[U]`).
<4> Fast interrupt request from channel 3 before executing instruction at address `PC=0x00000820` while the CPU was in machine-mode (`[M]`).
<5> Instruction bus access fault at address `PC=0x90000000` while executing instruction `MTINST=0x42078b63` while the CPU was in machine-mode (`[M]`); this is fatal for the default debug trap handler.
The specific message right at the beginning of the debug trap handler message corresponds to the trap code
obtained from the <<_mcause>> CSR (see <<_neorv32_trap_listing>>). A full list of all messages and the according
`mcause` trap codes is shown below.
.RTE Default Trap Handler Messages and According `mcause` Values
[cols="<5,^5"]
[options="header",grid="rows"]
|=======================
| Trap identifier | According `mcause` CSR value
| "Instruction address misaligned" | `0x00000000`
| "Instruction access fault" | `0x00000001`
| "Illegal instruction" | `0x00000002`
| "Breakpoint" | `0x00000003`
| "Load address misaligned" | `0x00000004`
| "Load access fault" | `0x00000005`
| "Store address misaligned" | `0x00000006`
| "Store access fault" | `0x00000007`
| "Environment call from U-mode" | `0x00000008`
| "Environment call from M-mode" | `0x0000000b`
| "Machine software IRQ" | `0x80000003`
| "Machine timer IRQ" | `0x80000007`
| "Machine external IRQ" | `0x8000000b`
| "Fast IRQ 0x00000000" | `0x80000010`
| "Fast IRQ 0x00000001" | `0x80000011`
| "Fast IRQ 0x00000002" | `0x80000012`
| "Fast IRQ 0x00000003" | `0x80000013`
| "Fast IRQ 0x00000004" | `0x80000014`
| "Fast IRQ 0x00000005" | `0x80000015`
| "Fast IRQ 0x00000006" | `0x80000016`
| "Fast IRQ 0x00000007" | `0x80000017`
| "Fast IRQ 0x00000008" | `0x80000018`
| "Fast IRQ 0x00000009" | `0x80000019`
| "Fast IRQ 0x0000000a" | `0x8000001a`
| "Fast IRQ 0x0000000b" | `0x8000001b`
| "Fast IRQ 0x0000000c" | `0x8000001c`
| "Fast IRQ 0x0000000d" | `0x8000001d`
| "Fast IRQ 0x0000000e" | `0x8000001e`
| "Fast IRQ 0x0000000f" | `0x8000001f`
| "Unknown trap cause" | undefined
|=======================
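Combining the table above with the trap identifiers from `sw/lib/include/neorv32_rte.h` gives a direct mapping from `mcause` values to RTE trap IDs. The following function is an illustrative sketch of that mapping and is not claimed to be how the RTE implements it internally. The function name is hypothetical.
.Sketch: Mapping `mcause` to the RTE Trap ID (hypothetical helper)
[source,c]
----
#include <stdint.h>

// Return the RTE trap ID (0..28) for an mcause value, or -1 if unknown.
int rte_id_from_mcause(uint32_t mcause) {
  uint32_t code = mcause & 0x7FFFFFFFu; // trap code without the interrupt bit
  if ((mcause >> 31) == 0u) {
    // Synchronous exceptions
    if (code <= 8u) {
      return (int)code; // codes 0..8 map 1:1 to IDs 0..8
    }
    if (code == 11u) {
      return 9; // environment call from M-mode -> RTE_TRAP_MENV_CALL
    }
    return -1;
  }
  // Interrupts
  switch (code) {
    case 3u:  return 10; // machine software interrupt -> RTE_TRAP_MSI
    case 7u:  return 11; // machine timer interrupt    -> RTE_TRAP_MTI
    case 11u: return 12; // machine external interrupt -> RTE_TRAP_MEI
    default:  break;
  }
  if (code >= 16u && code <= 31u) {
    return 13 + (int)(code - 16u); // fast IRQ channels 0..15 -> IDs 13..28
  }
  return -1; // "Unknown trap cause"
}
----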
==== Application Context Handling
Upon trap entry the RTE backs up the _entire_ application context (i.e. all `x` general purpose registers)
to the stack. The context is restored automatically after trap completion. The base address of the according
stack frame is copied to the <<_mscratch>> CSR. By having this information available, the RTE provides dedicated
functions for accessing and _altering_ the application context:
.Context Access Functions
[source,c]
----
// Prototypes
uint32_t neorv32_rte_context_get(int x); // read register x
void neorv32_rte_context_put(int x, uint32_t data); // write data to register x
// Examples
uint32_t tmp = neorv32_rte_context_get(9); // read register 'x9'
neorv32_rte_context_put(28, tmp); // write 'tmp' to register 'x28'
----
.RISC-V `E` Extension
[NOTE]
Registers `x16..x31` are not available if the RISC-V <<_e_isa_extension>> is enabled.
The context access functions can be used by application-specific trap handlers to emulate unsupported
CPU / SoC features like unimplemented IO modules, unsupported instructions and even unaligned memory accesses.
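As an illustration of this idea, the following host-side sketch emulates a misaligned `lw` instruction. It is a simplified model and not the code of the demo program: the function names are hypothetical, the application context is modeled as a plain array (on the target the register accesses would go through `neorv32_rte_context_get` and `neorv32_rte_context_put`), and memory is modeled as a byte array. The instruction word `0x01052603` is taken from the RTE output example for a misaligned load.
.Sketch: Emulating a Misaligned Word Load (host-side model, hypothetical names)
[source,c]
----
#include <stdint.h>

// Fields of a RISC-V I-type load instruction
typedef struct {
  uint32_t opcode;
  uint32_t rd;
  uint32_t funct3;
  uint32_t rs1;
  int32_t  imm;
} load_fields_t;

// Extract the instruction fields; the 12-bit immediate is sign-extended
load_fields_t decode_load(uint32_t insn) {
  load_fields_t f;
  uint32_t raw = insn >> 20;
  f.opcode = insn & 0x7Fu;
  f.rd     = (insn >> 7) & 0x1Fu;
  f.funct3 = (insn >> 12) & 0x7u;
  f.rs1    = (insn >> 15) & 0x1Fu;
  f.imm    = (raw & 0x800u) ? (int32_t)raw - 4096 : (int32_t)raw;
  return f;
}

// Emulate "lw rd, imm(rs1)" using byte accesses only.
// ctx[] models registers x0..x31, mem[] models memory starting at mem_base.
// Returns 0 on success, 1 if the instruction or address is not supported.
int emulate_unaligned_lw(uint32_t insn, uint32_t ctx[32],
                         const uint8_t *mem, uint32_t mem_base, uint32_t mem_size) {
  load_fields_t f = decode_load(insn);
  if (f.opcode != 0x03u || f.funct3 != 0x2u) {
    return 1; // not a load-word instruction
  }
  uint32_t addr = ctx[f.rs1] + (uint32_t)f.imm;
  uint32_t off  = addr - mem_base;
  if (addr < mem_base || mem_size < 4u || off > mem_size - 4u) {
    return 1; // address outside of the modeled memory
  }
  // Assemble the word from four bytes (RISC-V is little-endian)
  uint32_t word = (uint32_t)mem[off]
                | ((uint32_t)mem[off + 1] << 8)
                | ((uint32_t)mem[off + 2] << 16)
                | ((uint32_t)mem[off + 3] << 24);
  if (f.rd != 0u) {
    ctx[f.rd] = word; // x0 is hardwired to zero and never written
  }
  return 0;
}
----
The instruction `0x01052603` decodes to `lw x12, 16(x10)`. With `x10 = 0x800000f1` the effective address is `0x80000101`, which matches the `MTVAL` value of that example.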
.Demo Program: Emulate Unaligned Memory Access
[TIP]
A demo program that showcases how to emulate unaligned memory accesses using the NEORV32 runtime environment
can be found in `sw/example/demo_emulate_unaligned`.

View File

@ -0,0 +1,42 @@
# :copyright:
Figures are own work if not otherwise stated. License: https://github.com/stnolting/neorv32/blob/main/LICENSE
No copyright infringement intended.
## `neopixel.png`
source: Adafruit NeoPixel Überguide
License:
* ?
## `SPI_timing_diagram2.wikimedia.png`
source: https://en.wikipedia.org/wiki/File:SPI_timing_diagram2.svg
License:
* Creative Commons: https://en.wikipedia.org/wiki/Creative_Commons
* Attribution-Share Alike 3.0 Unported: https://creativecommons.org/licenses/by-sa/3.0/deed.en
## `riscv_logo.png` and `riscv_logo_small.png`
source: https://riscv.org/risc-v-logo/
License:
* https://riscv.org/about/risc-v-branding-guidelines/
## `oshw_logo.png`
source: https://www.oshwa.org/open-source-hardware-logo/
License:
* Creative Commons Attribution-ShareAlike 4.0 International License
## `neorv32_logo_smcard.jpg`
source: background image by https://pixabay.com
License:
* Pixabay license

103
docs/legal.adoc 100644
View File

@ -0,0 +1,103 @@
<<<
:sectnums:
== Legal
// ####################################################################################################################
:sectnums!:
=== License
**BSD 3-Clause License**
Copyright (c) 2024, Stephan Nolting. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that
the following conditions are met:
. Redistributions of source code must retain the above copyright notice, this list of conditions and the
following disclaimer.
. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and
the following disclaimer in the documentation and/or other materials provided with the distribution.
. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or
promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
OF THE POSSIBILITY OF SUCH DAMAGE.
==========================
**The NEORV32 RISC-V Processor** +
https://github.com/stnolting/neorv32 +
Dipl.-Ing. (M.Sc.) Stephan Nolting +
🇪🇺 European Union, Germany +
stnolting@gmail.com
==========================
<<<
// ####################################################################################################################
:sectnums!:
=== Proprietary Notice
* "GitHub" is a subsidiary of Microsoft Corporation.
* "Vivado" and "Artix" are trademarks of Xilinx Inc.
* "AXI", "AXI4-Lite" and "AXI4-Stream" are trademarks of Arm Holdings plc.
* "ModelSim" is a trademark of Mentor Graphics, a Siemens Business.
* "Quartus Prime" and "Cyclone" are trademarks of Intel Corporation.
* "iCE40", "UltraPlus" and "Radiant" are trademarks of Lattice Semiconductor Corporation.
* "Windows" is a trademark of Microsoft Corporation.
* "Tera Term" copyright by T. Teranishi.
* "NeoPixel" is a trademark of Adafruit Industries.
* Images/figures made with _Microsoft PowerPoint_.
* Timing diagrams made with _WaveDrom Editor_.
* Documentation made with `asciidoctor`.
* "Segger Embedded Studio" and "J-Link" are trademarks of Segger Microcontroller Systems GmbH.
* All further/unreferenced projects/products/brands belong to their respective copyright holders.
:sectnums!:
=== Disclaimer
This project is released under the BSD 3-Clause license. No copyright infringement intended.
Other implied or used projects/sources might have different licensing; see their documentation for more information.
:sectnums!:
=== Limitation of Liability for External Links
This document contains links to the websites of third parties ("external links"). As the content of these websites
is not under our control, we cannot assume any liability for such external content. In all cases, the provider of
information of the linked websites is liable for the content and accuracy of the information provided. At the
point in time when the links were placed, no infringements of the law were recognizable to us. As soon as an
infringement of the law becomes known to us, we will immediately remove the link in question.
:sectnums!:
=== Citing
[NOTE]
This is an open-source project that is free of charge. Use this project in any way you like
(as long as it complies with the permissive license). Please cite it appropriately. 👍
.Contributors ❤️
[NOTE]
Please add as many https://github.com/stnolting/neorv32/graphs/contributors[contributors] as possible to the `author` field. +
This project would not be where it is without them.
.DOI
[TIP]
This project also has a _digital object identifier_ (DOI) provided by https://zenodo.org[zenodo]:
https://doi.org/10.5281/zenodo.5018888[image:https://zenodo.org/badge/DOI/10.5281/zenodo.5018888.svg[title='zenodo']]
:sectnums!:
=== Acknowledgments
**A big shout-out to the community and all https://github.com/stnolting/neorv32/graphs/contributors[contributors],
who helped improve this project! ❤️**
https://riscv.org[RISC-V] - instruction sets want to be free!
Continuous integration provided by https://github.com/features/actions[GitHub Actions] and powered by https://github.com/ghdl/ghdl[GHDL].

View File

@ -0,0 +1,48 @@
extends: default
page:
margin: [0.8in, 0.67in, 0.75in, 0.67in]
link:
font-color: #edac00
image:
align: center
caption:
align: center
running-content:
start-at: toc
header:
height: 0.65in
vertical-align: bottom
image-vertical-align: bottom
font-size: 11
border-color: #000000
border-width: 1
recto:
left:
content: '*The https://github.com/stnolting/neorv32[NEORV32] RISC-V Processor*'
right:
content: '*Visit on https://github.com/stnolting/neorv32[GitHub]*'
verso:
left:
content: '*The https://github.com/stnolting/neorv32[NEORV32] RISC-V Processor*'
right:
content: '*Visit on https://github.com/stnolting/neorv32[GitHub]*'
footer:
start-at: toc
height: 0.75in
font-size: 10
border-color: #000000
border-width: 1
recto:
left:
content: '{page-number} / {page-count}'
center:
content: 'Copyright by Stephan Nolting. All rights reserved.'
right:
content: '{docdate}'
verso:
left:
content: '{page-number} / {page-count}'
center:
content: 'Version {revnumber}'
right:
content: '{docdate}'

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

File diff suppressed because it is too large Load Diff

Binary file not shown.

2
docs/sources/.gitignore vendored 100644
View File

@ -0,0 +1,2 @@
*.pptx
*.png

View File

@ -0,0 +1,17 @@
{signal: [
{name: 'clk', wave: 'p..|.......|..'},
{name: 'req.addr', wave: 'x3.|.x4.x5.|.x', data: ['A_addr', 'B_addr', 'C_addr']},
{name: 'req.data', wave: 'x..|..4.x..|..', data: ['wdata']},
{name: 'req.ben', wave: 'x..|..4.x..|..', data: ['ben']},
{name: 'req.stb', wave: '010|..10.10|..', node: '.a....d..f....'},
{name: 'req.rw', wave: '0..|..1..0.|..', node: '..............'},
{name: 'req.src', wave: 'x0.|.x0.x..|..'},
{name: 'req.priv', wave: 'x0.|.x0.x..|..'},
{name: 'req.rvso', wave: 'x0.|.x0.x..|..'},
{},
{name: 'rsp.data', wave: '2..|32.....|x2', data: ['0', 'rdata', '0'], node: '....b.......g.'},
{name: 'rsp.ack', wave: '0..|10.10..|..', node: '....c..e......'},
{name: 'rsp.err', wave: '0..|.......|10', node: '............h.'}
],
edge: ['a~>b', 'a~>c', 'd~>e', 'f~>g', 'f~>h']
}

View File

@ -0,0 +1,19 @@
{signal: [
{name: 'clk', wave: 'p....|.....|.....'},
{name: 'req.addr', wave: 'x3..x|.4..x|.5..x', data: ['addr', 'addr', 'addr']},
{name: 'req.data', wave: 'x....|.4..x|.5..x', data: ['wdata1', 'wdata2']},
{name: 'req.ben', wave: 'xx..x|.4..x|.5..x', data: ['0b1111', '0b1111']},
{name: 'req.stb', wave: '010..|.10..|.10..', node: '.a.....d......'},
{name: 'req.rw', wave: '0....|.1...|.....', node: '..............'},
{name: 'req.src', wave: '0....|.....|.....'},
{name: 'req.priv', wave: '0....|.....|.....'},
{name: 'req.rvso', wave: '01..0|.1..0|.1..0', node: '.b.......e....'},
{},
{name: 'rsp.data', wave: '2..32|...42|...52', data: ['0', 'rdata', '0', '0', '0', '1']},
{name: 'rsp.ack', wave: '0..10|...10|...10', node: '.........f....'},
{name: 'rsp.err', wave: '0....|.....|.....'},
{},
{name: 'rvs_valid', wave: '0.1..|....0|.....', node: '..c.......g...'}
],
edge: ['a~>c', 'b~>c', 'd~>g', 'e~>g', 'f~>g']
}

View File

@ -0,0 +1,8 @@
{reg: [
{bits: 7, name: 11, attr: 'Opcode: Custom-0'},
{bits: 5, name: 'rd', attr: 'Destination'},
{bits: 3, name: 'funct3', type: 1},
{bits: 5, name: 'rs1', attr: 'Source 1'},
{bits: 5, name: 'rs2', attr: 'Source 2'},
{bits: 7, name: 'funct7', type: 1}
]}

View File

@ -0,0 +1,9 @@
{reg: [
{bits: 7, name: 43, attr: 'Opcode: Custom-1'},
{bits: 5, name: 'rd', attr: 'Destination'},
{bits: 3, name: 'funct3', type: 1},
{bits: 5, name: 'rs1', attr: 'Source 1'},
{bits: 5, name: 'rs2', attr: 'Source 2'},
{bits: 2, name: 4, attr: '-'},
{bits: 5, name: 'rs3', attr: 'Source 3'}
]}

View File

@ -0,0 +1,9 @@
{reg: [
{bits: 7, name: 91, attr: 'Opcode: Custom-2'},
{bits: 5, name: 'rd', attr: 'Destination'},
{bits: 3, name: 'rs4.lo', attr: 'Source 4'},
{bits: 5, name: 'rs1', attr: 'Source 1'},
{bits: 5, name: 'rs2', attr: 'Source 2'},
{bits: 2, name: 'rs4.hi', attr: 'Source 4'},
{bits: 5, name: 'rs3', attr: 'Source 3'}
]}

View File

@ -0,0 +1,9 @@
{reg: [
{bits: 7, name: 123, attr: 'Opcode: Custom-3'},
{bits: 5, name: 'rd', attr: 'Destination'},
{bits: 3, name: 'rs4.lo', attr: 'Source 4'},
{bits: 5, name: 'rs1', attr: 'Source 1'},
{bits: 5, name: 'rs2', attr: 'Source 2'},
{bits: 2, name: 'rs4.hi', attr: 'Source 4'},
{bits: 5, name: 'rs3', attr: 'Source 3'}
]}

View File

@ -0,0 +1,14 @@
{ signal: [
{ wave: '10x....10', node: '.........'},
{ node: '.ab......'},
{ node: '.X..c....'},
{ node: '.Y.....d.'},
{ node: '.Z......e'}
],
edge: [
'a-b t0',
'X-c t1',
'Y-d t2',
'Z-e t3'
]
}

View File

@ -0,0 +1,12 @@
{ signal: [
{ wave: '10..x...1', node: '.........'},
{ node: '.f..g....'},
{ node: '.X.. h...'},
{ node: '.Y......i'}
],
edge: [
'f-g t4',
'X-h t5',
'Y-i t6',
]
}

View File

@ -0,0 +1,7 @@
{signal: [
{name: 'clk', wave: 'p.........'},
{name: 'dat', wave: 'x345..6x7x', data: ["A", "B", "C", "D", "E"]},
{name: 'val', wave: '01.....010'},
{name: 'rdy', wave: '01.0.1...0'},
{name: 'lst', wave: '0.......10'},
]}

View File

@ -0,0 +1,14 @@
{signal: [
{name: 'clk', wave: 'p....|...'},
{name: 'wb_tag_o', wave: 'x3...|.x.', data: ['Tag']},
{name: 'wb_adr_o', wave: 'x3...|.x.', data: ['Address']},
{name: 'wb_dat_i', wave: 'x....|3x.', data: ['rdata']},
{name: 'wb_dat_o', wave: 'x....|.x.', data: ['Wdata']},
{name: 'wb_we_o', wave: 'x0...|.x.'},
{name: 'wb_sel_o', wave: 'x....|.x.', data: ['Byte_enable']},
{name: 'wb_stb_o', wave: '01...|.0.'},
{name: 'wb_cyc_o', wave: '01...|.0.'},
{name: 'wb_lock_o', wave: '0....|...'},
{name: 'wb_ack_i', wave: '0....|10.'},
{name: 'wb_err_i', wave: '0....|...'},
]}

View File

@ -0,0 +1,14 @@
{signal: [
{name: 'clk', wave: 'p....|...'},
{name: 'wb_tag_o', wave: 'x3...|.x.', data: ['Tag']},
{name: 'wb_adr_o', wave: 'x3...|.x.', data: ['Address']},
{name: 'wb_dat_i', wave: 'x....|.x.'},
{name: 'wb_dat_o', wave: 'x3...|.x.', data: ['Wdata']},
{name: 'wb_we_o', wave: 'x1...|.x.'},
{name: 'wb_sel_o', wave: 'x3...|.x.', data: ['Byte_enable']},
{name: 'wb_stb_o', wave: '010..|...'},
{name: 'wb_cyc_o', wave: '01...|.0.'},
{name: 'wb_lock_o', wave: '0....|...'},
{name: 'wb_ack_i', wave: '0....|10.'},
{name: 'wb_err_i', wave: '0....|...'},
]}
