Computational Finance Using CIM Accelerators


[Figure: Black-Scholes formula. Picture credit: derivstrategies.com]
[Figure: Yield surface (yield vs. time). Picture credit: Cornell Computer Science, 2005]
[Figure: Value-at-risk surface (value-at-risk optimization curve). Picture credit: McCormick School of Engineering and Applied Science, Northwestern Univ, 2009]

Overview

Computational finance problems and applications often require numerical methods to reach a confident solution in a short amount of time. The numerical methods used are computationally complex: examples include Monte Carlo simulation, partial differential equations (PDEs), and Fourier transforms. In cases where market behavior (and thus human behavior) is being modeled or approximated, the confidence level is affected by thousands, if not hundreds of thousands, of factors. To account for these factors and the high degree of complexity, and still produce a solution in a timely manner (on the order of minutes or hours as opposed to days or weeks), high performance computing (HPC) is used.

Applying HPC to computational finance problems typically requires a supercomputing system, usually in the form of clusters of servers [1] or "server farms" (sometimes referred to in general as a data center). To achieve cost-effective supercomputing, it is crucial to make efficient use of servers, which consume space and energy and require manpower to program and maintain.

Supercomputing servers that contain CIM accelerators [2] provide a solution. For example, a single 1U server with four (4) CIM accelerator cards and CIM software installed can provide up to 5 Teraflops, with additional power consumption (i.e. the increase in energy consumption over the server alone) of less than 300 W [3]. Twenty such servers, installed in one 40" x 19" rack, can provide 100 Teraflops of processing performance at under 10 kW.
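The rack-level figures follow from the per-server numbers. As a rough check, taking the quoted per-server maxima at face value:

```latex
\begin{align*}
20 \times 5\ \text{Teraflops} &= 100\ \text{Teraflops} \\
20 \times 300\ \text{W} &= 6\ \text{kW} \quad \text{(upper bound on added accelerator power)}
\end{align*}
```

The gap between 6 kW and the quoted 10 kW ceiling, roughly 200 W per server, presumably covers the host servers themselves. That split is an inference from the numbers given here, not a stated specification.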

Applications

Computational finance encompasses many applications that can benefit from HPC, and in particular from CIM acceleration in constrained situations, such as limited space to build and maintain server clusters and data centers, or a need to economize on energy and cooling costs. Financial data privacy may also be a concern, imposing requirements on the location and size of data centers. Some application [4] examples include:

Source Code Examples

Below is an option pricing source code example, based on the Black-Scholes method, marked with CIM pragmas to accelerate compute-intensive sections of code. The listing values a European call by backward induction on a binomial tree. Note that CIM pragmas and API calls follow the OpenMP convention.
/*
   Black-Scholes based option pricing using OpenMP

   Copyright Univ of Manitoba Computational Finance Lab, 2011

   Authors:
     Dr. Ruppa K. Thulasiram, Computer Science, Univ of Manitoba, CFD Lab
     Dr. Parimala Thulasiraman, Computer Science, Univ of Manitoba
     Bhanu Sharma, Computer Science, Univ of Manitoba
     
   OpenMP pragmas modified to use 'cim' keyword, Signalogic, Oct 2011
*/

#include <cim.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
#include <stdlib.h>

#define max(a,b) ((a) > (b) ? (a) : (b))

int main(int argc,char** argv) {     

   double s=10, t=1, r=0.12, u=2, d=0.5, k=5, e=2.71828183;
   int j=0,i=0, n=7, level=0, tid;
   #ifdef _PRINT_DEBUG_
   double start = cim_get_wtime();
   #endif
   double dt=(double)(t/n);
   double p1=exp(r*dt);//=pow(e,r*dt);
   double p= (p1-d)/(u-d);

   int length= (int)pow(2.0,(double)n);
   double disc=exp(-r*dt);   // one-step discount factor; exp() avoids the truncated constant e

   double *st;
   st = (double*) malloc(sizeof(double)*length);
   memset(st, 0, length*sizeof(double));
   double *c;
   c = (double*) malloc(sizeof(double)*length);

   double result=0.0;
   st[0]=s*(double)pow(d,n);
   st[1]=st[0]*(u/d);

   #ifdef _PRINT_DEBUG_
   int num= cim_get_num_procs();
   printf("number of CIM cores-->%d\n",num);
   #endif

// Calculating the stock price at the leaf nodes

   cim_set_num_threads(length-2);
   #pragma cim parallel private(j) shared(st)
   {
      for (j=2; j<length; j++) {

         if (((j%2==0) && (st[j/2]!=0)) || ((j%2!=0) && (st[j-1]!=0))) {

            #pragma cim critical
            {
               if (j%2==0)
                  st[j]=st[j/2];         // even index: same price as entry j/2
               else
                  st[j]=st[j-1]*(u/d);   // odd index: one more up move than st[j-1]
            }
         }
      }
   }

   // Calculating the option values at the leaf nodes

   cim_set_num_threads(length);
   #pragma cim parallel for private(j)
   for (j=0; j<length; j++)
      c[j]=max(0.0,st[j]-k);

   // Going back through tree to the root

   cim_set_num_threads( (int)pow(2.0,(double)(n-1)) );
   #pragma cim parallel private(tid, level)
   {
      tid = cim_get_thread_num();

      for (level=0 ; level<n ; level++) {

         if (tid<pow(2.0,(double)(n-level-1))) {

            int index1 = tid*(int)pow(2.0, level+1);            // down-move child
            int index2 = index1 + (int)pow(2.0,(double)level);  // up-move child (one more up move)
            // p is the up-move probability, so it weights the up-move child c[index2]
            c[index1]= disc*(p*c[index2] + (1-p)*c[index1]);
         }

         // all threads finish this level before any thread starts the next
         #pragma cim barrier
      }
   }

   #ifdef _PRINT_DEBUG_
   printf("the result is %f\n",c[0]);
   printf("total execution time=%f\n",cim_get_wtime()-start);
   #endif

   free(st);
   free(c);
   return 0;
}

NOTES

1 The term "server", as used here, means a typical x86 based 1U or 2U rack-mount server, with one x86 CPU of no more than 95W TDP (thermal design power), running a version of Linux OS.
2 CIM = compute intensive multicore.
3 W = Watts, the standard measure of power consumption.
4 Ethics note. The company takes a position on two (2) specific application areas: (i) high-frequency trading and (ii) credit default swap (CDS) bets where participants may insert themselves in the consequences. If we know that the product is intended for these applications, we reserve the right not to sell it.