AN 831: Intel® FPGA SDK for OpenCL™: Host Pipelined Multithread

ID 683013
Date 11/20/2017
Public

1.4.1. Host Channel Streaming Feature

Communication between a host and a device is usually time-consuming, especially for large data sets.

Input data is transferred from host memory into global memory on the device. To use the data, kernels access global memory and copy the required data into their local memory. This transfer adds significant latency because all of the input data must be copied from the host to global memory before any processing can start on the device. The output path incurs the same latency: the entire output must first be generated and saved in global memory, and only then transferred to host memory.
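The cost of this memory-based flow can be approximated as three stages that run strictly one after another. The following is a minimal sketch of that arithmetic, not a measurement: the function name `memory_based_latency`, the assumption that the output is the same size as the input, and the bandwidth and throughput figures are all hypothetical values chosen for illustration.

```python
def memory_based_latency(data_bytes, link_bytes_per_s, kernel_bytes_per_s):
    """Estimate end-to-end time when all data is staged through global memory.

    Illustrative assumptions: output size equals input size, and the three
    stages never overlap.
    """
    host_to_global = data_bytes / link_bytes_per_s  # whole input must arrive first
    compute = data_bytes / kernel_bytes_per_s       # kernel starts only afterwards
    global_to_host = data_bytes / link_bytes_per_s  # whole output is copied back last
    return host_to_global + compute + global_to_host


# Assumed figures: 1 GB of data, 4 GB/s host link, 2 GB/s kernel throughput.
total = memory_based_latency(1e9, 4e9, 2e9)
print(f"memory-based total: {total:.2f} s")  # prints "memory-based total: 1.00 s"
```

Under these assumed figures, half of the total time is spent moving data while the kernel sits idle, which is the overhead that streaming targets.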

The following figure illustrates the host channel streaming feature and how it lowers overhead by eliminating the need to access global memory. When you implement this feature, the kernel is launched only once on the device and is reused for multiple input files, which are streamed directly into the device from the host.

Figure 6. Host Channel vs Memory-Based Data Communication

Intel® FPGA SDK for OpenCL™ supports streaming data between a host and a device through host channels. Streaming eliminates global memory access in both directions: the host passes input data directly to local memory on the device and reads results back from the device in the same way. This reduces latency and enables the platform (CPU or FPGA) to start processing data as it is streamed, instead of waiting for a complete transfer.
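The streaming pattern can be sketched as a software analogy using threads and bounded queues. This is not the SDK's host channel API: the two `queue.Queue` objects only stand in for the host-to-device and device-to-host channels, and `stream_through_channel`, `kernel_fn`, and `depth` are hypothetical names introduced for illustration. The sketch shows a "kernel" that is launched once and processes each chunk as soon as it arrives.

```python
import queue
import threading


def stream_through_channel(chunks, kernel_fn, depth=4):
    """Stream chunks through a long-running worker and collect the results.

    The bounded queues are stand-ins for host channels (an assumption of this
    sketch); `depth` limits how many chunks may be in flight per direction.
    """
    to_device = queue.Queue(maxsize=depth)  # models the host-to-device channel
    to_host = queue.Queue(maxsize=depth)    # models the device-to-host channel
    done = object()                         # sentinel marking the end of the stream

    def device():
        # "Kernel" launched once; it handles every chunk that arrives.
        while True:
            item = to_device.get()
            if item is done:
                to_host.put(done)
                return
            to_host.put(kernel_fn(item))

    def writer():
        # Host side: push chunks without waiting for earlier results.
        for chunk in chunks:
            to_device.put(chunk)
        to_device.put(done)

    threads = [threading.Thread(target=device), threading.Thread(target=writer)]
    for t in threads:
        t.start()

    results = []
    while True:
        out = to_host.get()  # read results back as they are produced
        if out is done:
            break
        results.append(out)

    for t in threads:
        t.join()
    return results


results = stream_through_channel([[1, 2], [3, 4], [5, 6]],
                                 lambda chunk: [x * 2 for x in chunk])
print(results)  # prints [[2, 4], [6, 8], [10, 12]]
```

Because a single worker drains a first-in, first-out queue, results come back in the order the chunks were written. The first result can be read while later chunks are still being sent, which is the overlap that the memory-based flow cannot provide.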