Idea → Prototype.

For all the stages of crafting a computer chip.

IDEA
CREATE HW + SW
SIMULATE
GENERATE RTL
PROTOTYPE

At the speed of light. In one platform.

The creativity tool for computer architects.

The semiconductor industry needs 1 million more engineers by 2030. They're not coming. The answer isn't more headcount. It's more creative power per computer architect.

Stages of ChipCraftX.

X is what you're solving for. Every step has a system behind it.

X = IDEA

Connoisseur

Every project starts differently. A quick RTL module. A full SoC with a PyTorch model. A cache exploration.

Just describe what you want to build. The agent understands your intent and invokes the right tools, workflows, and skills. From a single RTL generation to a full system-level design. You don't configure the pipeline. It configures itself.

X = CREATE HW + SW

ChipCraftSim

Hardware and software are designed in silos.

Co-design both in one environment. Write the hardware description and the software that runs on it, side by side.

X = SIMULATE

ChipCraftSim

You can't test 10 architectures when each sim takes days.

Full simulation environment. Test your PyTorch model on different HW designs. Explore the tradeoff space before committing to silicon.

X = GENERATE RTL

ChipCraftBrain

Writing RTL by hand is the bottleneck.

Spec in, verified RTL out. 98.7% VerilogEval. 94.7% CVDP. The SOTA benchmark leader. And it gets better every run.

X = PROTOTYPE

ChipCraftX

The gap between RTL and real silicon is months.

Deploy to FPGA. Validate on real hardware.

HOW IT WORKS

One loop. Not one shot.

Most tools generate and hope, or dispatch 20–100 samples per problem. ChipCraftX generates, validates with real EDA tools, and repairs. In a closed loop. The system picks its own repair strategy.

YOUR SPEC → ANALYZE → GENERATE RTL → COMPILE → SIMULATE → SYNTHESIZE
RL BRAIN · PICKS REPAIR STRATEGY
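
The generate, validate, repair loop described above can be sketched in a few lines. This is a minimal illustration, not ChipCraftX's implementation: the stage names come from the pipeline on this page, while `closed_loop`, `first_failure`, the `Failure` record, and the toy generator, validator, strategy picker, and repairer are all hypothetical stand-ins for the model and the real EDA tools.

```python
from dataclasses import dataclass

# Stage names come from the pipeline on this page; every function
# name and signature below is a hypothetical stand-in.
STAGES = ("compile", "simulate", "synthesize")


@dataclass
class Failure:
    stage: str    # stage that rejected the RTL
    message: str  # tool output explaining the rejection


def first_failure(rtl, validate):
    """Run the stages in order and return the first failure, or None."""
    for stage in STAGES:
        error = validate(stage, rtl)
        if error is not None:
            return Failure(stage, error)
    return None


def closed_loop(spec, generate, validate, choose_strategy, repair,
                max_iterations=5):
    """Generate once, then validate and repair until every stage passes.

    Returns the passing RTL, or None if the iteration budget runs out.
    """
    rtl = generate(spec)
    for attempt in range(max_iterations):
        failure = first_failure(rtl, validate)
        if failure is None:
            return rtl
        # Only repair if there is budget left to validate the result.
        if attempt + 1 < max_iterations:
            rtl = repair(rtl, failure, choose_strategy(failure))
    return None


# Toy stand-ins: "bugs" are marker words that the fake tools detect.
def toy_generate(spec):
    return f"// {spec}\nmodule top; SYNTAX_BUG LOGIC_BUG endmodule"


def toy_validate(stage, rtl):
    marker = {"compile": "SYNTAX_BUG", "simulate": "LOGIC_BUG"}.get(stage)
    if marker is not None and marker in rtl:
        return f"{stage} failed: found {marker}"
    return None


def toy_choose_strategy(failure):
    # A real system might learn this mapping; here it is a fixed table.
    table = {"compile": "fix_syntax", "simulate": "fix_logic"}
    return table.get(failure.stage, "regenerate")


def toy_repair(rtl, failure, strategy):
    marker = {"fix_syntax": "SYNTAX_BUG ", "fix_logic": "LOGIC_BUG "}.get(strategy)
    return rtl.replace(marker, "") if marker else rtl


result = closed_loop("8-bit adder", toy_generate, toy_validate,
                     toy_choose_strategy, toy_repair)
print(result)
```

In this toy run the first candidate fails at compile, the second fails at simulate, and the third passes all three stages. The point of the structure is that each repair is driven by the stage that failed and the tool's message, rather than by drawing a fresh sample.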
PIPELINE

The Right Agent At Every Stage.

YOUR SPEC · Kemal · HW ARCHITECT
ANALYZE · Kemal · HW ARCHITECT
GENERATE RTL · Ayumi · RTL ENGINEER
COMPILE · Mei · VERIF ENGINEER
SIMULATE · Mei · VERIF ENGINEER
SYNTHESIZE · Halo · LAYOUT ENGINEER

AGENT LEGEND
Kemal

HW Architect

Turns natural language into full SoC architecture specs.

Ayumi

RTL Engineer

Writes production-grade Verilog from architecture specs.

Mei

Verif Engineer

Reads your RTL, writes testbenches that match it.

Halo

Layout Engineer

Physical design oracle. Interprets P&R results and PPA.

Ravi

Bench Evaluator

Manages evaluation suites and feeds results back to the loop.

Dani

Training Engineer

Trains and fine-tunes the hardware LLMs.

BENCHMARK RESULTS

State of the Art.

Evaluated on public benchmarks used across the hardware LLM research community. Scored against frontier models and published multi-agent systems. With full citations.

VERILOGEVAL V2 · 98.7% · #1 on leaderboard [2]
CVDP NON-AGENTIC · 94.7% · +51pp vs Claude 3.7 [3]
REALBENCH · 2.4× prior state of the art [5]

LIVE LEADERBOARD · 15 ENTRIES
METRIC: PASS@1
Liu et al., NVIDIA & UC San Diego · ICCAD 2023

ChipCraftX (us) · Multi-agent + RL loop · 98.7%
ChipAgents · Agentic · Alpha Design AI [10] · 97.4%
MAGE · Multi-agent [1] · 95.7%
DeepSeek-R1-0528 · ChipCraftX Pipeline · 90.4%
Claude 3.7 Sonnet · Single-shot · 89.7%
Qwen3.6 Plus · ChipCraftX Pipeline · 89.1%
Gemma 4 31B AWQ · ChipCraftX Pipeline · 84.0%
Mercury 2 (Inception) · ChipCraftX Pipeline · 82.7%
Qwen2.5-Coder-32B + QLoRA · Q8_0 · 78.8%
Qwen2.5-7B MoE (fine-tuned) · ChipCraftX Pipeline · fp16 · 75.0%
Qwen2.5-Coder-32B + QLoRA (Q4) · Q4_K_M quantized · 71.8%
Gemma 4 31B AWQ · Single-shot · 66.7%
Qwen2.5-7B MoE (fine-tuned) · Single-shot · fp16 · 59.6%
Human Experts · VerilogEval baseline [2] · 53.2%
GPT-4 · Zero-shot [2] · 52.8%

REFERENCES
[1] MAGE: Multi-Agent Engine for RTL Code Generation
[2] VerilogEval: Liu et al., NVIDIA & UCSD, ICCAD 2023
[3] CVDP: Pinckney et al., NVIDIA, 2025
[4] ACE-RTL: Agentic RTL generation system, NVIDIA, Feb 2026
[5] RealBench: Jin et al., Chinese Academy of Sciences, 2025
[6] ChipBench: Zhong et al., UCSD & Columbia, ICML 2026
[7] OpenLLM-RTL / RTLLM v2: Ahsan et al., HKUST, ICCAD 2024
[8] Pluto: efficiency benchmark for LLM hardware code, ICLR 2026
[9] ArchXBench: Purini et al., MLCAD 2025
[10] ChipAgents: Alpha Design AI, Jan 2025 (press release)

All scores are pass@1 unless noted. ChipCraftX uses Iterative@5 with real EDA tool validation (compile → simulate → synthesize). Competitor scores sourced directly from cited papers. Additional benchmarks tracked: ResBench (Guo et al., HEART 2025, arXiv:2503.08823), RTLBench (IEEE ICCD 2025).
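
To make the difference between the two metrics in the note above concrete, here is a small sketch. The page does not define Iterative@5, so the reading used here is an assumption: a problem counts as solved if any of the first 5 sequential generate-and-repair iterations passes, whereas pass@1 counts only the first attempt. The function names and the sample data are hypothetical.

```python
from typing import Optional, Sequence


def pass_at_1(first_attempt_passed: Sequence[bool]) -> float:
    """Fraction of problems whose first attempt passed."""
    if not first_attempt_passed:
        raise ValueError("need at least one problem")
    return sum(first_attempt_passed) / len(first_attempt_passed)


def iterative_at_k(iterations_to_pass: Sequence[Optional[int]],
                   k: int = 5) -> float:
    """Fraction of problems solved within k sequential iterations.

    Assumed definition, not taken from the page. Each entry is the
    1-based iteration at which the problem first passed, or None if
    it never passed.
    """
    if not iterations_to_pass:
        raise ValueError("need at least one problem")
    solved = sum(1 for n in iterations_to_pass if n is not None and n <= k)
    return solved / len(iterations_to_pass)


# Made-up data for eight problems, for illustration only.
iterations = [1, 3, None, 2, 6, 1, 5, 1]

print(iterative_at_k(iterations, k=5))             # 6 of 8 solved within 5 iterations
print(pass_at_1([n == 1 for n in iterations]))     # 3 of 8 solved on the first attempt
```

Under this reading, Iterative@1 and pass@1 coincide, and the gap between the two numbers shows how much the repair iterations contribute.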