FFmpeg screen captures with c66x acceleration enabled

Virtual ffmpeg hardware with c66x accelerator (VMM dialog / VM configuration screen cap)
Side-by-side desktop capture and tablet player, with c66x accelerator low-latency RTP streaming

FFmpeg Accelerator -- FFmpeg Hardware Acceleration


Overview

Product Information

The information below covers the software and the 32-core and 64-core accelerators, and includes specifications, documentation links, and technical support.

CIM32 PCIe Card

CIM64 PCIe Card

Signalogic Part #: DHSIG-CIM32-PCIe / DHSIG-CIM64-PCIe
System
Manufacturer: Signalogic
Description: FFmpeg Accelerator
Product Category: HPC (High Performance Computing)
Product Status: Preliminary


Availability
Stock: 0
On Order: 0
Factory Lead-Time: 8 weeks
Pricing (USD)
Qty 1: 4500 MOQ: 1
                      
Includes a test/demo software license.
Promotions
Current Promotions: None

Why Use Hardware Acceleration

Vendors often promote GPU, DSP, and FPGA solutions, but these are all "really hard to use". Such vendors miss the point when they try to prove that servers are traditionally not good at streaming and image processing. Servers are plenty good at running ffmpeg and image processing.

What is fair to say is that trends in server architecture, including virtualization, DPDK, and software-defined networking, have increased performance expectations of individual servers. In the past HPC users were willing to stack boxes; now due to virtualization, they expect one box to be a stack of VMs -- without penalties in performance, latency, and network I/O throughput.

Signalogic's software and hardware acceleration products are designed to take full advantage of server architecture trends. They augment modern servers, providing well-integrated functionality and user interfaces, as opposed to alternative, non-mainstream methods. User interfaces include the ffmpeg command line, OpenCV APIs, OpenMP programming, and VM configuration. Within VM configuration, our products add cores, DDR3 memory, and NIC resources that can be used with or without virtualization. The result is improved server performance density, reduced latency, and increased stream concurrency, without having to use proprietary programming languages, study a 300-page chip vendor data sheet, work with low-level "SDKs", or take on other time-consuming development efforts.

FFmpeg Command Line Examples

Below are some example ffmpeg commands. Adding the option "-hwaccel c66x" enables c66x hardware acceleration.

./ffmpeg -y -hwaccel c66x -s 352x288 -i ../video_files/airshow_352x288p_30fps_420fmt.yuv -c:v libx264 -b 1500000 -r 30 ../video_files/test.h264

./ffmpeg -y -hwaccel c66x -s 352x288 -i ../video_files/airshow_352x288p_30fps_420fmt.yuv -c:v libx264 -b 1500000 -r 30 -f rtp rtp://10.0.1.65:45056:60-af-6d-75-75-f1

./ffmpeg -y -hwaccel c66x -s 352x288 -i ../video_files/airshow_352x288p_30fps_420fmt.yuv -c:v libx264 -b 1500000 -r 30 ../video_files/test.h264 -f rtp rtp://10.0.1.65:45056:60-af-6d-75-75-f1
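
The three example commands differ only in their outputs: a file, an RTP stream, or both. Below is a minimal shell sketch, not a Signalogic tool, that assembles such a command from its arguments and prints it for review instead of running it. The function name c66x_cmd, the FFMPEG variable, and its default of ./ffmpeg are hypothetical; the "-hwaccel c66x" option is assumed to exist only in the accelerator-enabled ffmpeg build described on this page.

```shell
#!/bin/sh
# Hypothetical helper: print an ffmpeg command line shaped like the examples above.
# Usage: c66x_cmd INPUT_YUV OUTPUT [OUTPUT...]
# Outputs starting with rtp:// get "-f rtp" in front of them.
# Sketch only: paths must not contain spaces.
c66x_cmd() {
    input=$1
    shift
    # FFMPEG is an assumed override for the path of the accelerator-enabled binary.
    cmd="${FFMPEG:-./ffmpeg} -y -hwaccel c66x -s 352x288 -i $input -c:v libx264 -b 1500000 -r 30"
    for out in "$@"; do
        case $out in
            rtp://*) cmd="$cmd -f rtp $out" ;;
            *)       cmd="$cmd $out" ;;
        esac
    done
    printf '%s\n' "$cmd"
}

# Print (do not run) the file-plus-RTP variant.
c66x_cmd ../video_files/airshow_352x288p_30fps_420fmt.yuv \
    ../video_files/test.h264 \
    rtp://10.0.1.65:45056:60-af-6d-75-75-f1
```

Printing the command first makes it easy to check the option order before anything is encoded or streamed. On a system with the accelerator-enabled build, the printed line can then be pasted into a shell.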

FFmpeg in VMs

c66x accelerator resources can be allocated to VMs, allowing multiple instances of ffmpeg to run concurrently, each with its own accelerator cores, memory, and streaming network I/O, independent of the server motherboard. Currently the KVM hypervisor is supported, along with QEMU, libvirt, and virt-manager. Below is an image showing installation of an Ubuntu VM, running on a CentOS host:

VM configuration screen cap, showing c66x accelerator resources

HPC Software Model

Below is a software model diagram that shows how c66x CPUs are combined with x86 CPUs inside high performance computing (HPC) servers. The user interface varies in complexity from left to right, from simple (command line) to complex (OpenMP pragmas). Pragmas are compiled by the CIM® Hyperpiler, which separates C/C++ source into "source code streams" and augments the streams with additional, auto-generated C/C++ source code. Multiple applications can run at the same time, as concurrent host processes, concurrent VMs, or a combination. Adding DPDK capability to application user space is optional.

Software model for combination of c66x and x86 CPUs in HPC servers

Notes

1. CIM® = Compute Intensive Multicore
2. RTAF = Real-time Algorithm Framework
3. HPMN = High Performance Multicore Network stack

CPU vs. GPU Overview

CPU and GPU devices are built on fundamentally different chip architectures. Each is very good at certain things and not so good at others -- and these strengths and weaknesses are mostly opposite, or complementary. In general:

Neither of these devices is a "do-all" or panacea for complex computing, or somehow fundamentally superior to the other, as the marketing hype would have you believe. People tend to forget that top global semiconductor manufacturers are all on the same technology curve -- which of course makes sense, as they all use the state-of-the-art semiconductor manufacturing technology of our time. If you look closely at the two key factors that form the basis for Moore's Law, performance and power consumption (the chip metric is GFlops / Watt), you won't see much difference between Intel, Nvidia, Texas Instruments, Xilinx, etc. What you will find are differences in corporate practice and marketing culture, ingrained over very long periods of time -- 30 years or more -- that make one manufacturer or another more adept at serving certain market segments, with advantages (or disadvantages!) in package size, memory bandwidth, on-chip integrated peripherals, programmability, etc.

Below are simplified diagrams of CPU and GPU accelerators.

c66x multicore CPU accelerator diagram (shown for a CIM-64 card, with 8 cores per CPU and 2 GB mem per CPU). All CPU cores have NIC access.
GPU accelerator diagram (shown for a Kepler 80 card, with 13 Streaming Multiprocessors (SMs) per GPU, 192 CUDA cores per SM, and 12 GB mem per GPU). A GPU can have literally 1000s of "CUDA cores".
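
The per-device figures in the captions above can be turned into per-card and per-GPU totals with a little arithmetic. The sketch below uses only the numbers quoted in the captions, plus the assumption that a CIM-64 card carries 64 c66x cores (as the "64-core" product description suggests). The variable names are illustrative only.

```shell
#!/bin/sh
# Figures taken from the diagram captions above.
card_cores=64          # assumed total c66x cores on a CIM-64 card
cores_per_cpu=8        # c66x cores per CPU
gb_per_cpu=2           # GB of memory per c66x CPU
sms_per_gpu=13         # Streaming Multiprocessors per GPU
cuda_cores_per_sm=192  # CUDA cores per SM

cpus_per_card=$((card_cores / cores_per_cpu))
card_mem_gb=$((cpus_per_card * gb_per_cpu))
cuda_cores_per_gpu=$((sms_per_gpu * cuda_cores_per_sm))

echo "c66x CPUs per CIM-64 card: $cpus_per_card"
echo "c66x memory per CIM-64 card: $card_mem_gb GB"
echo "CUDA cores per GPU: $cuda_cores_per_gpu"
```

The totals show the contrast the diagrams make: a few dozen general-purpose cores with their own memory and NIC access on one side, thousands of much simpler CUDA cores on the other.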

In comparing the diagrams above, some obvious differences and similarities stand out:

How to Purchase

Enter the fields shown above, including:
  • Required Number of Streams
  • Required Number of VMs
  • Max Frame rate
  • Max Resolution
  • Qty -- number of cards required
  • Your discount category
Then click on "Get Quote / Info" to receive pricing via e-mail.
      

Related Items

Video Algorithms
Video Analytics

Tech Support

Signalogic's engineering staff designs, develops, maintains, operates, and tests software and hardware in the company's in-house lab, using servers from HP, Dell, Supermicro, Artesyn, Advantech, and others, and Linux installations including CentOS, Ubuntu, Red Hat, Wind River, Gentoo, and more. Streaming output is tested using Android and Microsoft Surface tablets with players including VLC, MPlayer, and others. Customers can submit technical questions via e-mail, phone, or Skype chat.

Signalogic engineers are experts in server and embedded system development. Unlike retailers and distributors who offer analytics and video-related products, Signalogic can also perform contract development. Signalogic is a member of third-party programs for HP, Dell, Intel, and Texas Instruments. A high level of expert tech support is a distinct advantage when purchasing products from Signalogic.