NEOX™ GRAPHICS
The ultimate ultra-low-power RISC-V based GPU Processor
NEOX™ is a parallel multicore and multithreaded GPU architecture based on the RISC-V RV64IMFC instruction set with an adaptive NoC. The number of cores ranges from 4 to 64, organized in 1 to 16 cluster elements, each configurable for cache size and thread count. Depending on the cluster/core configuration, NEOX™ compute power ranges from 12.8 to 409.6 GFLOPS at 800 MHz, with support for FP16, FP32, and SIMD instructions.
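As a back-of-the-envelope check on the quoted range, peak throughput can be modeled as cores × FLOPs per cycle per core × clock frequency. The sketch below assumes that 12.8 GFLOPS corresponds to the 4-core configuration and 409.6 GFLOPS to the 64-core configuration; the brief does not state this mapping, and the function names are illustrative only.

```python
def peak_gflops(cores: int, flops_per_cycle_per_core: float, freq_mhz: float) -> float:
    """Theoretical peak in GFLOPS: cores x FLOPs/cycle/core x clock (MHz -> GHz)."""
    return cores * flops_per_cycle_per_core * freq_mhz / 1e3


def implied_flops_per_cycle_per_core(gflops: float, cores: int, freq_mhz: float) -> float:
    """Back out the per-core FLOPs/cycle implied by a quoted peak figure."""
    return gflops * 1e3 / (cores * freq_mhz)


if __name__ == "__main__":
    # Assumption: the low figure is the 4-core part, the high figure the 64-core part.
    for gflops, cores in ((12.8, 4), (409.6, 64)):
        rate = implied_flops_per_cycle_per_core(gflops, cores, 800)
        print(f"{cores:2d} cores, {gflops} GFLOPS -> {rate:.1f} FLOPs/cycle/core")
```

Under that assumption the two endpoints imply different per-core rates (about 4 and 8 FLOPs per cycle), so the range is not a pure core-count scaling. The difference could come from precision (FP32 versus FP16) or SIMD width, but the brief does not say which.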
Scalable to match multiple applications & performance levels
NEOX™ is a highly configurable IP supporting 1 to 4 cores per cluster. The cache sizes of each cluster and the thread count of the cores are parameterized and can be tailored to different applications and use cases. Multithreading hides the long latencies of the external memory controller, maintaining high computation throughput across the entire array. NEOX™ represents a new era of smart GPU architectures with programmable compute shaders, running on a real-time operating system (RTOS) and supported by lightweight graphics and machine learning programming frameworks. The heavily multithreaded system and its configurable programming libraries use the same hardware blocks and offer customized extensions by combining graphics, machine learning, vision/video processing and general-purpose compute workloads.
ARCHITECTURE
The NEOX™ architecture includes AI-specific ISA extensions and SIMD vector support for variable-length datatypes, including 8-bit. Optional graphics ISA extensions/coprocessors add a Unified Shader Architecture, Tile Based Rendering, and Color/Vertex and Vector Support, backed by dedicated hardware modules such as a rasterizer, texture unit, tile management unit and texture caches. A dedicated interface allows SoC architects to augment the instruction set with user-defined instructions, enabling product differentiation and unique custom designs.

DELIVERABLES, SOFTWARE & INTEGRATION*
NEOX™ SDK, System Verilog RTL, Integration Tests, LLVM C/C++ compiler, GCC C/C++ compiler. Custom instructions for Computer Graphics, Compute and AI, and user-defined extensions. Evaluation on a Xilinx SoC FPGA platform and a software cycle-accurate simulator. Supported OS: Linux, RTOS, Wear OS.
* Listed items represent a superset and are subject to change without further notice.
ARCHITECTURE
- RISC-V RV64GC ISA
- Multicore Array
- Multithreaded
- Adaptive NoC
- Configurable 4-64 Cores
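
The configuration limits quoted in this brief (1 to 16 clusters, 1 to 4 cores per cluster, 4 to 64 cores in total) can be captured in a small validity check. This is a minimal sketch, not the format used by the NEOX™ Configuration Tool: the class, field names and the assumption that every cluster has the same number of cores are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Limits as quoted in the brief; all identifiers below are hypothetical.
MIN_CLUSTERS, MAX_CLUSTERS = 1, 16
MIN_CORES_PER_CLUSTER, MAX_CORES_PER_CLUSTER = 1, 4
MIN_TOTAL_CORES, MAX_TOTAL_CORES = 4, 64


@dataclass(frozen=True)
class NeoxConfig:
    clusters: int
    cores_per_cluster: int   # assumed uniform across clusters
    threads_per_core: int    # parameterized per the brief; no range is given

    @property
    def total_cores(self) -> int:
        return self.clusters * self.cores_per_cluster

    def problems(self) -> List[str]:
        """Return a list of violated limits (empty if the configuration is in range)."""
        issues = []
        if not MIN_CLUSTERS <= self.clusters <= MAX_CLUSTERS:
            issues.append(f"clusters={self.clusters} outside {MIN_CLUSTERS}-{MAX_CLUSTERS}")
        if not MIN_CORES_PER_CLUSTER <= self.cores_per_cluster <= MAX_CORES_PER_CLUSTER:
            issues.append(
                f"cores_per_cluster={self.cores_per_cluster} outside "
                f"{MIN_CORES_PER_CLUSTER}-{MAX_CORES_PER_CLUSTER}"
            )
        if not MIN_TOTAL_CORES <= self.total_cores <= MAX_TOTAL_CORES:
            issues.append(f"total_cores={self.total_cores} outside {MIN_TOTAL_CORES}-{MAX_TOTAL_CORES}")
        if self.threads_per_core < 1:
            issues.append("threads_per_core must be at least 1")
        return issues


if __name__ == "__main__":
    for cfg in (NeoxConfig(1, 4, 2), NeoxConfig(16, 4, 8), NeoxConfig(2, 1, 1)):
        print(cfg, "total cores:", cfg.total_cores, "problems:", cfg.problems())
```

Note that the per-cluster and total ranges interact: two single-core clusters satisfy each per-cluster limit but fall below the quoted 4-core minimum, so the sketch checks the total separately.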
SOFTWARE
- C/C++ LLVM Compiler
- C/C++ GCC Compiler
- POSIX Threads
- Open Graphics Frameworks
EVALUATION
- Xilinx Zynq FPGA
- Cycle Accurate Simulator
DELIVERABLES
- System Verilog RTL
- Configuration Tool
- Verification Suite
- Synthesis Scripts
- Software Emulator
- FPGA Prototype
- SDK