Ref: http://ati.amd.com/technology/streamcomputing/Stream_Computing_User_Guide.pdf
System - These notes were compiled on the following system:
Intel CPU
ATI Radeon GPU (check that the card supports GPU computing)
Microsoft Visual Studio .Net with C/C++ compilers - for C/C++ code
Intel Visual Fortran Compilers - for Fortran code
Brook+ SDK by AMD - to compile Brook code
*.br source file
================
The Brook+ source file contains code that follows C/C++ syntax and is pre-processed by the Brook+ compiler into C/C++ files. Both Brook+ functions and C/C++ functions can exist within the same *.br file. The Brook+ functions are the ones that run on the GPU hardware.
An example of a Brook+ function is given below:
kernel void sumaa(float a<>, float b<>, out float c<>) {
    c = a + b;
}
1. Special Brook+ keywords (i.e. not C/C++ keywords): kernel, out
2. Note the template-like constructs "float a<>", which are recognized by the Brook+ compiler. They indicate a stream (GPU) data type and are not the same as C++ templates.
3. Multiple functions like the above can exist in the same *.br file. Other normal C/C++ functions can also exist inside the *.br file.
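To make the kernel's meaning concrete, here is a plain C++ sketch of what sumaa computes, written as an ordinary CPU loop. It is not Brook+ code and does not touch the GPU. The name sumaa_cpu is hypothetical, and std::vector merely stands in for a stream.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU-only stand-in for the Brook+ kernel sumaa (hypothetical name).
// The kernel body "c = a + b" is applied once per stream element;
// here an explicit loop plays that role.
std::vector<float> sumaa_cpu(const std::vector<float>& a,
                             const std::vector<float>& b) {
    assert(a.size() == b.size());  // both inputs must have the same length
    std::vector<float> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}
```

In the Brook+ version there is no explicit loop or index: the "<>" stream arguments tell the compiler to run the body for every element.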
Compiling Brook+ Code (*.br)
=============================
0. Open up a Command Console and go to the directory where the *.br file is located.
1. To compile code called sum.br:
where
2. The Brook+ compiler / preprocessor then creates the following files in the same directory:
sum.cpp
sum.h
sum_gpu.h
3. A few notes to consider:
i) There are two compilers: brcc and brcc_d. They correspond to brook.lib/dll and brook_d.lib/dll respectively.
Using the wrong combination may crash the program during execution.
ii) The -k option generates intermediate code that may be useful with AMD's Stream Kernel Analyzer.
iii) The generated C/C++ code needs to be compiled with a standard C/C++ compiler and linked against the proper libraries and DLLs; see the next section.
iv) Before v1.3, the C/C++ wrapper functions, also known as host-side code, had to exist within the *.br source file. As of v1.3, the host-side code can be written in C++ and kept in a separate, normal C++ file, provided the project is configured with the proper include and lib directories.
Compiling the C/C++ code
==========================
This step produces a Win32 DLL from the C/C++ code generated by Brook+. The resulting DLL should be
usable by other Win32 applications (e.g. C++ or Fortran).
1. In Visual Studio .Net, create a new solution / project via:
Add Project -> Visual C++ -> Win32 -> Win32 project.
In the Application Settings dialog, select DLL and Export Symbols.
2. Add the *.br file and the files generated by the Brook+ compiler to the current project using
"Add existing file".
3. Under the Project Property configuration pages, add the following settings:
C++ -> Additional Include Directories:
C++ -> Code Generation -> Runtime Library: Multi-threaded Debug DLL (/MDd)
C++ -> Advanced -> Calling Convention: __cdecl (/Gd)
Linker -> Additional Library Directories:
Linker -> Input -> Additional Dependencies:
4. Whenever the *.br file is modified, recompile it in the Command Console, then rebuild the generated C/C++ code from within Visual Studio .Net.
Some Notes:
i) One can configure Visual Studio .Net to accept *.br files and compile them with the Brook+ compiler. However, I find
that the user still has to start compilation manually, first for the Brook+ files and then for the C/C++ files. Hence,
I don't find it any more efficient than compiling from the command line.
ii) The *.br source files can be added to the project and can be edited using the VS.Net environment.
The C/C++ driver or library wrapper
====================================
The Brook+ functions need to be wrapped or called directly from C/C++ functions. For the purpose of creating DLL functions, we will put C/C++ wrappers over the Brook+ functions.
Using the Brook+ functions involves three steps, preceded by the variable declarations. Each step is described with an example below:
Declaring and sizing variables - the meaning and reason for the declarations will become clear in the following sections.
// Normal C/C++ variables
float input_a[10][10];
float input_b[10][10];
float input_c[10][10];
float input_a1[10];
float input_b1[10];
float input_c1[10];
// For dimensioning Brook+ variables
unsigned int ileng = 10;
unsigned int dims[2] = {10,10};
unsigned int dim1[1] = {10};
// Equivalent Brook+ variables
brook::Stream
brook::Stream
brook::Stream
brook::Stream
brook::Stream
brook::Stream
brook::Stream
// Assign values to normal C/C++ vectors and matrices for:
// input_a1, input_b1, input_a, input_b
..................
1. Reading normal C/C++ variables into Brook+ variables
a.read(input_a);
b.read(input_b);
a1.read(input_a1);
b1.read(input_b1);
This step copies a normal C/C++ variable into a Brook+ variable that the GPU can understand. No other manipulation of the Brook+ variable is needed.
2. Performing the computation by calling the Brook+ function
sumaa(a,b,*d); // operating on a matrix
sumaa(a1,b1,d1); // operating on a vector
3. Writing the output from Brook+ into normal C/C++ variables
// old method
streamWrite(*d, input_c);
streamWrite(d1, input_c1);
// new method
d->write(input_c);
d1.write(input_c1);
Once the Brook+ variable has been copied back to a normal C/C++ variable, one can perform other standard operations on the normal C/C++ variable as desired.
Note that the pointer d (dereferenced as *d) and the non-pointer d1 are used only to show that both forms are possible.
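The read / compute / write flow for the 1-D case can be sketched on the CPU as follows. MockStream and sumaa_mock are hypothetical stand-ins written for this illustration. They are not the real brook::Stream class or the generated kernel, and nothing here runs on the GPU; the sketch only mirrors the order of the three steps.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// MockStream is a hypothetical CPU-only stand-in, NOT the real brook::Stream.
class MockStream {
public:
    explicit MockStream(std::size_t n) : data_(n, 0.0f) {}

    // Copy host data into the stream (analogous to a1.read(input_a1)).
    void read(const float* host) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = host[i];
    }

    // Copy stream data back to the host (analogous to d1.write(input_c1)).
    void write(float* host) const {
        for (std::size_t i = 0; i < data_.size(); ++i) host[i] = data_[i];
    }

    std::size_t size() const { return data_.size(); }
    float& at(std::size_t i) { return data_[i]; }
    float at(std::size_t i) const { return data_[i]; }

private:
    std::vector<float> data_;
};

// Stand-in for the kernel call sumaa(a1, b1, d1).
void sumaa_mock(const MockStream& a, const MockStream& b, MockStream& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    for (std::size_t i = 0; i < c.size(); ++i) c.at(i) = a.at(i) + b.at(i);
}

// The three steps in order: read inputs, run the kernel, write the output.
void add_vectors(const float* in_a, const float* in_b, float* out_c,
                 std::size_t n) {
    MockStream a1(n), b1(n), d1(n);
    a1.read(in_a);           // step 1: host -> stream
    b1.read(in_b);
    sumaa_mock(a1, b1, d1);  // step 2: compute
    d1.write(out_c);         // step 3: stream -> host
}
```

The host arrays are only touched in steps 1 and 3; between them, all work happens on the stream objects.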
Using with Fortran
====================
Brook+ is essentially an extension to C/C++, so it is best called from C/C++ functions. However, once C/C++ wrappers are built for the Brook+ functions and packaged into a DLL, any other language (e.g. Fortran) can use the GPU by calling the C/C++ wrappers in the DLL.
Brook+ functions <--- C/C++ wrappers <--- Windows DLL / Unix shared objects <--- Fortran
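The wrapper layer in this chain might look like the sketch below. It assumes the wrapper is exported with C linkage and takes every argument by pointer, since Fortran passes arguments by reference by default. The names gpu_sum_vec and WRAPPER_API are hypothetical, and the loop is a CPU placeholder marking where the Brook+ read, kernel and write calls would go. The calling convention and symbol-name decoration expected by a particular Fortran compiler are not covered in these notes and should be checked against that compiler's documentation.

```cpp
// Export macro (hypothetical name): __declspec(dllexport) when building on
// Windows, nothing on other platforms.
#if defined(_WIN32)
#define WRAPPER_API __declspec(dllexport)
#else
#define WRAPPER_API
#endif

// extern "C" disables C++ name mangling so non-C++ callers can find the symbol.
// All arguments are pointers because Fortran passes by reference by default.
extern "C" WRAPPER_API void gpu_sum_vec(const float* a, const float* b,
                                        float* c, const int* n) {
    if (a == nullptr || b == nullptr || c == nullptr || n == nullptr) return;
    // Placeholder for the Brook+ steps: read a and b into streams,
    // call the kernel, write the result stream into c.
    for (int i = 0; i < *n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

Keeping the interface to plain pointers and an element count avoids exposing any C++ or Brook+ types across the DLL boundary.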