Development of a Runtime Measurement System
Chapter Overview
Introduction
I am developing a system designed to accurately measure and analyze the runtime of various software or hardware processes. The idea for this system arose from the need for precise timing and measurement solutions in embedded systems development.
Let's examine a practical use case.
Our Use Case - The Development Process
Our developed system (System A) handles two distinct events: Event A (entering) and Event B (exiting).
These events are represented by rising and falling edges on a digital signal line. The time interval between these two events (tB-tA) represents the runtime of the process we want to measure.
Since our system must fulfill specific timing requirements, tB-tA must stay within the maximum allowed time constraint. We therefore add the green line Δtmax_allowed to the diagram.
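At its core, the timing requirement boils down to a single comparison. A minimal Python sketch (the timestamp values are made up for illustration):

```python
# Hypothetical event timestamps in microseconds (illustrative values only).
T_A = 120.0             # Event A: rising edge, process entry
T_B = 470.0             # Event B: falling edge, process exit
DT_MAX_ALLOWED = 500.0  # maximum allowed runtime (the "green line")

runtime = T_B - T_A                     # tB - tA
within_budget = runtime <= DT_MAX_ALLOWED

print(runtime, within_budget)  # 350.0 True
```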
Looking inside System A, we see that it consists of two modules: Module 1 is responsible for processing input data, while Module 2 handles output data. System A fulfills the timing requirement, so we release it as version 1.0.
Let's assume a fictitious customer requests a new feature for our system, leading to a third module being added. After a few months, Module 3 is almost complete and integrated into the system, and integration tests can finally be performed to ensure that the new module works correctly with the existing modules.
After integrating Module 3, we observe that the runtime (tB-tA) has increased and now exceeds the maximum allowed time constraint (Δtmax_allowed). This indicates that the addition of Module 3 has negatively impacted the system's performance, leading to a violation of the timing requirements.
Conclusion - The Development Process
Now we have a better understanding of the engineering perspective. But to really understand the full problem, we have to look at the development workflow we used above.
Our Use Case - The Development Workflow
Summary
This workflow clearly demonstrates the need for a more structured approach to runtime monitoring. With the current development process and workflow, we are not able to effectively monitor and manage runtime performance during our processes and workflows. One-shot measurements are not sufficient to capture the dynamic behavior of the system over time.
The summary was easy, I guess. Before we dive into the solution space, let's explore the problem more deeply.
The Problem Statement
With the use case described above, we've identified several distinct problems that need to be addressed.
Workflow related challenges
With our current development process to develop our System A, we face several challenges:
- Lack of early detection of performance issues, since we rely on pre-release runtime analysis.
- Inability to track performance trends over time during the development phase.
- One-shot manual measurements that are time-consuming and error-prone.
What we need is to:
- Detect timing violations immediately during development, not just before release.
- Prevent costly late-stage fixes.
- Track performance trends throughout development.
- Perform systematic, automated measurements during each development phase, so that we maintain control over the system's timing constraints instead of discovering issues at release time.
- Replace the one-shot measurements taken directly before release: they provide only a single data point and offer no insight into performance trends over time or under varying conditions.
Measurement accuracy related challenges
Software-based measurements are popular and easy to realize. However, they change the system under test, because measurement code has to be added to it. Their accuracy is limited and depends on the clock speed of the embedded system and on the overhead introduced by the measurement code, which can significantly alter the timing behavior of the system. To reduce this intrusiveness, it is useful to change the measurement methodology and use external hardware.
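To make the intrusiveness tangible, here is a host-side Python sketch (not embedded code; the absolute numbers only stand in for the effect): even timing an empty function reports a nonzero duration, and that residue is overhead the measurement itself adds to the system under test.

```python
import time

def measure(fn):
    # Software-based measurement: the timing code runs on the same
    # system as the workload, so it adds overhead to every call.
    t0 = time.perf_counter()
    fn()
    t1 = time.perf_counter()
    return t1 - t0

# Measuring an empty function still reports a duration:
# that residue is pure measurement overhead.
overhead = measure(lambda: None)
print(f"measurement overhead: {overhead * 1e9:.0f} ns")
```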
The Idea / The Improvement Statement
Based on the problems identified above, the solution is clear: we need an automated, non-intrusive, and easy-to-integrate runtime measurement system that can be embedded directly into the development workflow.
Instead of relying on software-based measurements that alter system behavior, we propose using an FPGA-based external measurement system that:
- Monitors the system's input and output signals with minimal modification of the system under test
- Provides high-resolution timing measurements using an internal or external oscillator
- Requires minimal setup and wiring
- Can be integrated into existing development workflows
- Generates performance reports and trend analysis
This approach eliminates the overhead of software-based measurements while providing developers with continuous visibility into system performance. By catching timing violations early and tracking performance trends throughout development, we can prevent costly late-stage fixes and maintain strict control over timing constraints.
The FPGA acts as an independent observer, passively measuring the time between Event A (entering) and Event B (exiting) without interfering with the system's actual operation. This provides accurate, reliable measurements that reflect the true system behavior.
Let's enter the solution space.
Proposed Development Workflow with Runtime Analysis (During Development)
This proposal seems obvious. You might think: "I have thought of that just by reading this in 5 minutes." However, the challenge lies in implementing and maintaining such a systematic approach throughout development, because:
- We want to avoid imposing additional effort and guidance on developers for timing analysis. It must be optional instead of obligatory; we trust developers to decide for themselves when to perform timing analysis.
- We want to avoid buying resource-intensive tracing tools and infrastructure, since there is no need to look into the internal behaviour of each module. At this validation stage we just want to measure the overall system performance tB-tA; everything else is overkill.
- We want to avoid time-consuming setup and wiring. If the system is hard to integrate, it will not be used.
High-Level System Design Diagram
```mermaid
graph TD
    I(Measurement Input Signal) --- F(FPGA)
    O(Oscillator) --- F(FPGA)
    F(FPGA) ---|UART| S(Smaller CPU)
    S ---|USB/ ETH| L(Larger CPU)
    style F fill:#FFFFED,stroke:#FCE992
    style I fill:none,stroke:#228B22
    linkStyle 0 stroke:#228B22
```
FPGA Focused System Design Diagram
```mermaid
graph TD
    I(Measurement Input Signal) --- F
    Hz(1Hz Debug Pin) --- F
    D(Debug Pin) --- F
    O(optional external Oscillator) --- F
    subgraph F[FPGA]
        direction TB
        subgraph Left[" "]
            direction TB
            RE[Rising & Falling Edge Detector] --- CC
            CC[Clock Counter]
            CC --- DSM[Calculation & Measurement Data Storage Module]
            DSM --- U
            OO(Onboard Oscillator) --- CC
            RE --- DSM
        end
        subgraph U[UART]
            direction TB
            URX[UART RX]
            UTX[UART TX]
        end
        Left ~~~ U
    end
    F ---|UART| S(Smaller CPU - intermediate storage)
    S ---|USB/ ETH| L(Larger CPU - visualization)
    style I fill:none,stroke:#228B22
    style Hz fill:none,stroke:#000000
    style D fill:none,stroke:#000000
    style Left fill:none,stroke:none
```
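A rough software model of the data path above may help: the edge detector arms the clock counter on the rising edge (Event A) and stops it on the falling edge (Event B), and the count is converted to a runtime using the oscillator period. This is an illustrative Python sketch, not the actual Verilog; the sample stream and the 12 MHz clock are assumptions.

```python
CLOCK_HZ = 12_000_000  # assumed onboard oscillator, 12 MHz

def count_cycles(samples):
    """Count clock cycles between the rising edge (Event A) and the
    falling edge (Event B) of the sampled input signal."""
    counting = False
    cycles = 0
    prev = 0
    for s in samples:
        if prev == 0 and s == 1:    # rising edge: start counting
            counting = True
        elif prev == 1 and s == 0:  # falling edge: stop counting
            counting = False
        if counting:
            cycles += 1
        prev = s
    return cycles

# Signal is high for 6 clock samples -> 6 cycles at 12 MHz = 500 ns.
samples = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0]
cycles = count_cycles(samples)
runtime_s = cycles / CLOCK_HZ
print(cycles, runtime_s)  # 6 5e-07
```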
Toolchain for Simulation and Verification of Verilog Design
VVP
vvp is the runtime engine that executes the default compiled form generated by Icarus Verilog.
GTKWave
Program to view waveforms, used after simulation.
Configuration file: .gtkw
```mermaid
graph LR
    d(Design Implementation in Verilog) -->|.verilog / .v| IV(iverilog compiler)
    IV --->|.vvp| VVP(vvp)
    VVP --->|.vcd| G(GTKWave)
    G ---> d
```
Toolchain for Flashing and Verification on Real Hardware
```mermaid
graph LR
    .verilog --> YS(Yosys Script)
    YS -->|.ys| Y(Yosys)
    .pcf --> N(Nextpnr)
    Y -->|.json, .blif, synth.v| N(Nextpnr)
    N -->|.asc| I(icepack)
    I -->|.bin| BF2(bin2uf2 Conversion)
    BF2 -->|.uf2| FPGA(Flash to FPGA)
    style .verilog fill:none,stroke:none
    style .pcf fill:none,stroke:none
```
Development Roadmap & Future Enhancements
- Add / verify the option to integrate an external oscillator.
- Evaluate memory storage size on the FPGA side; develop an option to use shared RAM?
- Add an option to configure the UART data rate (currently fixed at 9600 bits/second).
- Add more available measurement points (pins).
- Stabilize TX and RX of the UART communication from and to the FPGA.
- Implement firmware for the Raspberry Pi (smaller CPU) to request and receive the data and publish it via ETH.
- Implement data visualization on the larger CPU.
- Define a stable protocol.
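To see why the fixed 9600 bits/second UART rate is worth revisiting, a back-of-the-envelope calculation helps. Assuming common 8N1 framing (1 start bit, 8 data bits, 1 stop bit, i.e. 10 bits per byte; the framing is an assumption, not confirmed above), each byte occupies the link for:

```python
BITS_PER_FRAME = 10  # assumed 8N1 framing: start + 8 data + stop

def byte_time_ms(baud):
    """Time one framed byte occupies the UART link, in milliseconds."""
    return BITS_PER_FRAME / baud * 1e3

print(byte_time_ms(9600))    # current fixed rate: ~1.04 ms per byte
print(byte_time_ms(115200))  # a common faster rate: ~0.087 ms per byte
```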
System Resolution
Resolution and accuracy depend on the clock speed of the system. What resolution do we get if we use an oscillator with frequency f? The resolution equals the clock period T = 1/f:
| Frequency (f) | Period (T) |
|---|---|
| 12 MHz | 83.33 ns |
| 50 MHz | 20 ns |
| 100 MHz | 10 ns |
| 500 MHz | 2 ns |
| 1 GHz | 1 ns |
This means with a clocking oscillator of 100 MHz we can measure time events with a resolution of 10 ns.
- Resolution: the smallest time increment you can distinguish (determined by clock period)
- Accuracy: how close your measurement is to the true value (affected by clock stability, jitter, calibration, etc.)
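The table above follows directly from T = 1/f. A quick sketch to reproduce it:

```python
def resolution_ns(freq_hz):
    """Smallest distinguishable time increment: the clock period T = 1/f."""
    return 1e9 / freq_hz

for f in (12e6, 50e6, 100e6, 500e6, 1e9):
    print(f"{f / 1e6:>6.0f} MHz -> {resolution_ns(f):.2f} ns")
```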