Server Acceleration with CIM Arrays
|SigC641x CIM array card|
|1U server with CIM array card|
|Data center operators concerned with saving energy are building near rivers in remote areas of the northwestern US (picture Copyright MIT Technology Review, Jul/Aug 2009)|
- Signalogic Solution
- Using CIM to Protect Your IP
- Custom Solutions
- Development Status
- OpenMP Accelerator - OpenMP Compatibility
- CIM and Linux Development
- Texas Instruments Multicore CPUs
- Server Accelerators
- Parallel Processing
- Accelerating Micro Servers
- Sandy Bridge/CIM Comparison
Overview
CIM™ (Compute Intensive Multicore) array technology increases performance and reduces energy usage of servers in data centers and cloud computing. For compute-intensive applications, efficiency gains of up to 500:1 versus standard servers are achievable. Up to eight (8) CIM array cards (and in some cases more) plug into half-size, single-width PCIe slots in x86 or ARM servers running Linux, making the product fully compatible with existing server investment. CIM allows sections of C source code to be marked, or annotated, for acceleration using OpenMP-style syntax.
Applications
Application examples include data center and cloud processing such as video content delivery (to PCs, smartphones, and TVs), optimizing telecom carrier network bandwidth, national security applications (such as automated search of surveillance and UAV/drone video data), computer vision, bioscience research, and High Performance Computing (HPC) tasks such as climate modeling, particle physics simulations, and oil and gas exploration.
Opportunity
Today about 10 million servers are sold annually, with 10% going into data centers, server farms, and cloud computing (see Note 2). A typical server farm consists of thousands of Intel x86 CPUs. Since its beginnings in the early 1980s, the x86 CPU has evolved to be energy inefficient for Compute Intensive (CI) tasks due to two root causes: (i) every new multicore x86 CPU released by Intel contains millions of transistors required to support legacy software (e.g. DOS and Windows code), and (ii) the high power requirements of Intel CPUs necessitate large heatsinks and cooling fans that limit the number of CPUs per server. Each x86 CPU typically requires 30-100 watts of power, a sizable heatsink, and one or more cooling fans (which take additional energy). The cost of energy to cool the server and maintain nominal operating temperature can equal or exceed the cost of energy used by the server to perform its processing function. Overall, today's situation is an inefficient use of energy and space.
Signalogic Solution
Signalogic's proprietary, patent-pending technology enables creation of dense, thin multicore CIM arrays that can be inserted into a wide variety of servers. The technology leverages Texas Instruments' low-energy, compute-intensive C6x series multicore CPUs. No restrictions or special modifications to the server are required, and full compatibility is maintained with the Linux operating system and programming tools. One CIM array can increase server CI performance by up to 500 times, depending on the nature of the software application, while consuming only 25 to 40 W of power. Operating cost may be reduced in several ways: fewer servers, less energy usage, and less expensive servers (fewer x86 cores, less memory).
Using CIM to Protect Your IP
Another advantage of CIM technology is that it can help protect your IP. Since the build process operates at the source code level, the CIM stream is not built with GCC tools, is not linked with Linux libraries or drivers, and is not subject to GPL or other open source/free software licenses. When CIM code runs in executable form on the CIM array, it is running on hardware completely separate from the host motherboard and its x86 or ARM CPU. Whether third-party licenses apply depends solely on your source code content.
Custom Solutions
Since TI multicore CPUs on a CIM array card can process arbitrary C code, it's possible to move an entire function or program to the CIM array. For example, in an application performing face recognition from surveillance video, optimized performance and/or optimal IP protection might be obtained by moving the entire recognition algorithm to the CIM array, rather than limiting acceleration to compute-bound sections. Signalogic supports such customized projects, and may accept NRE work for them, on a case-by-case basis.
Development Status
First-generation CIM arrays are being tested in generic x86 servers to accelerate example applications such as matrix operations, coordinate transformations, and signal processing functions such as FFT, convolution, and DCT (used in video analytics). In some cases improvements of 10 to 1 or more vs. a quad-core x86 server alone have been obtained. Next-stage development will (i) increase the level of open software support needed for a wide range of applications, (ii) build second-generation CIM array hardware based on TI's faster, higher core-count C66xx devices, (iii) implement measurement and dynamic adaptation to the host server's thermal profile, and (iv) engage in trials with key customers.
NOTES
1 HPC = High Performance Computing
2 Gartner, Inc., Nov 29, 2010 http://www.gartner.com/it/page.jsp?id=1479923