DRAM Bench

DRAMBench: Autoformalizing DRAM Specifications with Timed Petri Nets

Chip design verification is dominated by manual interpretation of complex DRAM standards. With DRAMPyML and DRAMBench, Fraunhofer IESE and Normal Computing introduce timed Petri net models and an open benchmark to autoformalize memory specifications. This post shows how AI can bridge the gap between natural language specs and formal verification artifacts.

Jan Ernst, Normal Computing
Co-Author
Dr. Jan Ole Ernst

Send email

 

Dr. Jan Ole Ernst is a Research Scientist at Normal Computing with a background in physics and machine learning. Currently, he works on AI for chip design & verification and novel computing architectures.

Chip design verification is one of the most critical and labor-intensive steps in the semiconductor industry, estimated to take up to 70% of the verification effort. Before a chip can be manufactured, engineers must ensure that the design implementation exactly matches its specification and is free of functional errors – because a bug found after fabrication can mean months of delay and millions for costly respins. Today, this verification process demands extensive manual effort: engineers read through hundreds of pages of natural language specifications and painstakingly translate them into formal, testable representations, a process that is slow, error-prone, and does not scale well as specifications grow in complexity. At Fraunhofer IESE, together with Normal Computing, we believe that AI and formal methods can fundamentally change this situation. To help make this vision concrete, we are releasing DRAMBench – an open benchmark and formal modelling framework for memory chips – as well as DRAMPyML, our Python-based framework for timed Petri net models of DRAM protocols.

The Problem: Specifications Are Written for Humans, Not Machines

Industry standards bodies like JEDEC publish detailed specifications for memory chips such as DDR, LPDDR, GDDR, and HBM. These documents define how such chips work: which commands a memory chip must support, the legal orderings of those commands, and the precise timing constraints between them. A single DRAM standard can contain hundreds of timing parameters and complex state-dependent rules governing how commands interact across banks, ranks, and channels.
Today, translating these specifications into something a verification tool can consume is almost entirely manual. Verification engineers read the spec, interpret its intent, and encode it into SystemVerilog assertions, stimulus sequences and other artifacts. This translation step is where spec misinterpretations enter the design and where schedules tend to slip. While AI has made progress in isolated design verification tasks, current approaches typically focus on narrow problems rather than full end-to-end specification compliance of modern chip designs, failing to capture the complexity of real-world verification workflows.

Our Approach: Timed Petri Nets as a Formal Target

Rather than jumping directly from natural language to low-level verification artifacts, we introduce an intermediate formal representation based on timed Petri nets. Petri nets are a well-established mathematical formalism for modeling concurrent systems – they capture states, transitions, and the constraints governing when transitions may fire. By extending them with timing, we can faithfully represent the temporal requirements that are central to DRAM protocols.

Our framework, DRAMPyML, models each DRAM standard as a timed Petri net where:

  • Places represent device states (idle, active, precharging, refreshing)
  • Transitions represent commands (ACT, RD, WR, PRE, REF, and many more)
  • Arcs encode the structural and timing relationships between them – including inhibitor arcs for mutual exclusion, reset arcs for state resets, and timed arcs that enforce minimum delays like tRCD, tRAS, and tRP.

The result is a compact, executable, and mathematically precise model of a DRAM protocol. From this single representation, we can derive SystemVerilog assertions, generate valid command sequences for stimulus and perform formal analysis. The Petri net serves as a single source of truth from which downstream verification artifacts flow.

DRAMBench: A Benchmark for Hardware Autoformalization

To evaluate how well AI models can perform this translation – from natural language specification to formal Petri net – we are releasing DRAMBench. It includes ground truth Petri net models for a broad range of JEDEC DRAM standards: DDR2 through DDR4, LPDDR2 through LPDDR4, GDDR5 and GDDR6, and HBM2.
DRAMBench is designed to be a living benchmark. As new DRAM generations emerge, previously withheld standards are released into the benchmark, and newer ones take their place. This keeps DRAMBench rigorous over time by preventing training data contamination while ensuring the dataset grows with the industry.
To evaluate how closely an AI-generated model matches the ground truth, the benchmark compares legal command sequences. Given two Petri nets, we enumerate their legal k-step command sequences and compute the Jaccard index over the overlap. This gives a single, interpretable similarity score that captures structural and behavioral correctness. Timing correctness is evaluated separately through a dedicated timing constraint recall metric, measuring what fraction of the specification’s timing requirements the generated model recovers.

A Call to Two Communities

DRAMBench sits at the intersection of two fields that do not talk to each other enough: machine learning and hardware engineering.
To the ML community: Hardware specification documents are a rich, underexplored domain for autoformalization research. Unlike mathematical theorem proving, where significant benchmark infrastructure already exists, hardware specs present unique challenges – ambiguous natural language, implicit domain knowledge, complex timing semantics, and hierarchical structure. DRAMBench provides a concrete, well-defined task with quantitative evaluation metrics. We hope it serves as a stepping stone toward broader hardware autoformalization benchmarks. We would welcome an alternative autoformalization approach that does not rely on Petri Nets as intermediate artifacts.
To the hardware community: The cost of manual specification interpretation grows with each generation. The jump from DDR4 to DDR5 alone nearly quadrupled the page count of the JEDEC spec. Each new DRAM generation adds commands, timing parameters, and architectural features. The jump from DDR4 to DDR5 alone introduced significant new complexity. Formal models like timed Petri nets offer a more compact and interpretable way to capture protocol behavior than verbose verification collateral. And if AI can help generate these models, the verification bottleneck loosens considerably [3].

Beyond Memory: A Vision for All Chip Specifications

While we demonstrate our approach on DRAM protocols, the underlying idea is not limited to memory. Any hardware specification that defines states, commands, and timing constraints is a candidate for the same treatment. Communication protocols, bus interfaces, and controllers all fit this pattern. Timed Petri nets are general enough to capture a wide range of protocol behaviors, and the autoformalization pipeline we describe is architecture-agnostic.
Our hope is that this work contributes to a future where chip releases are accompanied not just by PDF datasheets, but by formal models: compact, executable, and machine-readable representations that verification teams can use directly. A formal model shipping alongside a specification would dramatically reduce the time from spec release to verified implementation. It would make verification more accessible to smaller teams and create a foundation for AI-assisted verification tools that can reason about specs end-to-end.

Get Involved

DRAMBench and DRAMPyML are open source under the Apache 2.0 license. The framework is Python-based, built on rustworkx for high-performance graph operations, and designed to be easy to extend with new standards, new formal targets or new evaluation metrics. We welcome contributions from both communities – whether that means adding new DRAM standard models, building autoformalization agents that target DRAMBench, extending the framework to non-memory protocols, or developing new evaluation metrics for formal model comparison. The gap between natural language specifications and formal verification is one of the biggest bottlenecks in chip design. We think closing it is a problem worth solving together.

If you are interested in research collaborations, joint projects, or evaluating AI-supported formal verification in your development processes, Fraunhofer IESE is happy to collaborate. Together with Normal Computing, we are working to close the gap between natural language specifications and formal verification – a bottleneck that is central to the future of chip design. Contact us!

References

DRAMPyML and DRAMBench are developed by Fraunhofer IESE and Normal Computing. For more details, see our papers: „DRAMPyML: A Formal Description of DRAM Protocols with Timed Petri Nets“ (DV-Con Europe 2026), „A Formal Description of Communication Protocols Using Petri-Nets“ (MBMV 2026), and „Autoformalizing Memory Device Specifications using Agents“ (ICLR VerifAI Workshop 2026).