New-Tech Europe Magazine | Q2 2021

Figure 5: Exhaustive CSD verification with emulation. (Image source: Siemens EDA)

applications, such as tensor processing units (TPUs), neural network processors (NNPs), and neural processing engines (NPEs), as well as types of implementation, such as 2D, 3D stacking, chiplets, FPGA fabric, and custom AI logic. From the verification perspective, design capacity, design fabric, power analysis, and software stack validation are the four areas that must be addressed.

Hardware Emulation for AI/ML

Based on these design characteristics, an emulation platform must accommodate up to 15 billion gates and compile designs at rates of several hundred million gates per hour, so that the cycle of finding and fixing a bug, recompiling, and rerunning emulation has a fast turnaround time (TAT). It must support a wide communication bandwidth between the host computer and the emulator to manage the intense traffic between the virtual test environment and the DUT. It should also perform accurate power analysis and be able to execute application-specific customer software stacks.

Storage (SSD versus CSD)

Three bottlenecks undermine solid-state drive (SSD) adoption. First, the storage medium, a NAND flash fabric, suffers from finite life expectancy, the need for wear leveling and garbage collection, performance degradation over time, inconsistent reliability, and random latency. Second, the bandwidth and latency of the host computer interface do not allow the SSD to deliver its full potential. Third, the physics of data movement works against performance and power consumption targets, though the computational storage device (CSD) eliminates some of these bottlenecks.

In the SSD, a host computer issues a request for data to the storage drive. The drive sends the data to the host, which processes it and writes the processed data back to storage. In the CSD, the host computer sends a request to a lightweight computer installed locally within the CSD. The local computer, instead of sending the data back to the host, processes the data “in situ” and sends only the results back to the host. In effect, CSD designers can disaggregate computing and move it from the host into the drive to improve performance, lower power usage, and free up PCIe bandwidth for the rest of the system. Applications that benefit from the CSD include hyperscale data centers, image recognition, edge computing, AI/ML, real-time analytics, and database queries.

Hardware Emulation for Storage

Traditional approaches to SSD and CSD verification have been defeated by the non-deterministic nature of storage. Emulation-based virtual verification offers new verification methods. Through virtualization, complete system verification, including full firmware validation, can be carried out at high speed, accelerating time to market and enabling architectural exploration to find the optimal solution for a specific task. SSD virtualization allows pre-silicon performance and latency testing to within 5% of actual silicon.

Power Trends and Power Analysis

Designing silicon at process nodes below 28 nm magnifies the problem of dynamic power consumption in several market segments, including mobile and CPU/GPU, data centers, automotive, and AI/ML. Accurate identification of peaks,

