
Profile
Nickname: n_raj
I have over two decades of experience in the software industry. I have worked extensively in product engineering, focusing on system software, embedded and networking technologies. I am currently working as a product management consultant in a startup. Until recently, I was a VP in the R&D Services business unit of MindTree. Prior to MindTree, I was with Wipro’s R&D. My other interests include photography and traveling.

Posted: 03:58:40 PM, 12/03/2010

OpenMP and multicore

   

Dear Readers,

In the last article, we looked at the so-called “embarrassingly parallel” operations, which can easily take advantage of multicore systems. In this article, let us look at one more way of getting performance improvements on multicore systems: OpenMP.

 

The OpenMP specification was originally defined in 1997 by industry vendors such as Sun and Intel. It became popular on Symmetric Multiprocessing (SMP) systems. A typical SMP system is multiprocessor hardware in which two or more identical processors are connected to a single shared memory and all processors run the same OS instance.

 

Interestingly, today's multicore systems are similar to the SMP architecture. Instead of multiple processors, we have multiple cores. All cores access a common shared memory and run the same OS instance. That is why a solution like OpenMP, which comes from the SMP era, is finding renewed interest on the multicore systems of today.

 

The OpenMP specification is defined for the C, C++ and Fortran languages. It consists of three parts: compiler directives, a runtime library and environment variables. The code is annotated with directives and compiled with a compiler that supports OpenMP. It is then linked with the runtime library to generate the executable. At run time, environment variables control aspects of the execution, such as the number of threads.
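To make the three parts concrete, here is a minimal sketch in C. The function names team_size and print_settings are made up for this illustration; the directive, the omp_get_num_threads() call and the OMP_NUM_THREADS variable are standard OpenMP. The #ifdef _OPENMP guard is an assumption added so that the sketch also builds as plain serial code.

```c
#include <stdio.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>                 /* part 2: runtime library header */
#endif

/* Hypothetical helper: returns how many threads ran the parallel region. */
int team_size(void) {
    int size = 1;                /* serial build: only one thread */
#ifdef _OPENMP
    #pragma omp parallel         /* part 1: compiler directive */
    {
        #pragma omp single       /* only one thread writes the shared variable */
        size = omp_get_num_threads();   /* part 2: runtime library call */
    }
#endif
    return size;
}

/* Hypothetical helper: shows the setting next to the actual team size. */
void print_settings(void) {
    const char *env = getenv("OMP_NUM_THREADS");   /* part 3: environment variable */
    printf("OMP_NUM_THREADS=%s, team size=%d\n",
           env ? env : "(unset)", team_size());
}
```

Calling print_settings() from main with OMP_NUM_THREADS set to different values should show the team size following the variable, although the runtime is allowed to grant fewer threads than requested.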

 

An OpenMP program works like this:

 

  1. The program starts as a single thread, called the master thread. The master thread executes sequentially like any other normal program, until the first parallel region construct is encountered.
  2. The master thread then creates a team of parallel threads.
  3. The statements in the parallel region construct are then executed in parallel by the threads of the team.
  4. When the individual threads complete the statements in the parallel region construct, they synchronize and terminate, leaving only the master thread.
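The four steps above can be sketched as follows. This is only an illustration: fork_join_demo is a made-up name, and the fallback definition of omp_get_thread_num() is an assumption added so that the code also builds without OpenMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Hypothetical demo: returns how many threads passed through the region. */
int fork_join_demo(void) {
    int visits = 0;

    printf("step 1: master thread running alone\n");

    #pragma omp parallel          /* step 2: the team of threads is created here */
    {
        /* step 3: every thread of the team executes this block */
        printf("step 3: hello from thread %d\n", omp_get_thread_num());

        #pragma omp atomic        /* safe update of the shared counter */
        visits++;
    }                             /* step 4: threads synchronize and join here */

    printf("step 4: only the master thread continues\n");
    return visits;
}
```

The order of the "step 3" lines is not fixed and usually changes from run to run, since the threads execute independently.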

 

Since the creation, starting and joining of threads are done automatically, programmers are relieved of these complexities. The model also allows variables to be shared between the threads or protected by locks, and supports fairly advanced features.
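As an illustration of shared and protected variables, the sketch below adds up the numbers 1 to n in two ways. The function names are made up for this example. The first version protects a shared variable with a critical section, and the second uses a reduction clause, which gives each thread a private copy and combines the copies at the end.

```c
/* Hypothetical example: 'total' is shared, so each update is protected. */
long sum_with_critical(int n) {
    long total = 0;                  /* shared by all threads */

    #pragma omp parallel for
    for (int i = 1; i <= n; i++) {
        #pragma omp critical         /* one thread at a time enters here */
        total += i;
    }
    return total;
}

/* Hypothetical example: same result, without locking on every iteration. */
long sum_with_reduction(int n) {
    long total = 0;

    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += i;                  /* each thread adds into its private copy */

    return total;
}
```

Both functions return the same value. The reduction version is usually the better choice for a simple sum, since the threads do not wait on each other inside the loop.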

Here is an example from Wikipedia:

 

int main(int argc, char **argv) {
    const int N = 100000;
    int i, a[N];
 
    /* Compiler directive: share the loop iterations among the threads */
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        a[i] = 2 * i;
 
    return 0;
}

 

As a first step, the code is compiled with an OpenMP-enabled compiler. The environment variable OMP_NUM_THREADS is then set to the desired number of threads and the program is executed. Suppose OMP_NUM_THREADS is set to 4. The code starts normally, but when it reaches the for loop, it runs the loop with a team of 4 threads, and each thread typically fills in 100000/4 = 25000 different array entries. This speeds up the processing, as the four threads work in parallel on different cores.
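The split of the loop among the threads can be observed with a small sketch like the one below. The name fill_and_count is made up, and the example makes two assumptions for the sake of being self-contained: it requests 4 threads with the num_threads clause instead of OMP_NUM_THREADS, and it asks for the static schedule explicitly.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
#endif

enum { TEAM = 4 };   /* assumed team size, mirroring OMP_NUM_THREADS=4 */

/* Hypothetical helper: fills a[i] = 2*i and counts the entries per thread.
   Returns the total number of entries written. */
long fill_and_count(int *a, int n, long per_thread[TEAM]) {
    for (int t = 0; t < TEAM; t++)
        per_thread[t] = 0;

    #pragma omp parallel for schedule(static) num_threads(TEAM)
    for (int i = 0; i < n; i++) {
        a[i] = 2 * i;
        per_thread[omp_get_thread_num()]++;   /* each thread uses its own slot */
    }

    long total = 0;
    for (int t = 0; t < TEAM; t++) {
        printf("thread %d handled %ld entries\n", t, per_thread[t]);
        total += per_thread[t];
    }
    return total;
}
```

With n = 100000, and if the runtime actually grants all four threads, each thread is expected to report 25000 entries. In a serial build, thread 0 handles everything.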

As we keep increasing the value of OMP_NUM_THREADS, we could see the run time decrease and the performance improve, until the number of threads exceeds the number of cores or system bus bottlenecks start showing up.
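One way to watch this effect is to time the same loop with different thread counts. The sketch below is only an illustration: time_fill and sweep are made-up names, the thread count is set with omp_set_num_threads() instead of the environment variable, and the fallback definitions are assumptions added so that the code also builds without OpenMP.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP support. */
static void omp_set_num_threads(int n) { (void)n; }
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

/* Hypothetical helper: runs the loop with the given thread count
   and returns the elapsed time in seconds. */
double time_fill(int *a, int n, int threads) {
    omp_set_num_threads(threads);
    double start = omp_get_wtime();

    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        a[i] = 2 * i;

    return omp_get_wtime() - start;
}

/* Hypothetical helper: prints the timings for 1, 2, 4 and 8 threads. */
void sweep(int *a, int n) {
    for (int t = 1; t <= 8; t *= 2)
        printf("%d thread(s): %.6f s\n", t, time_fill(a, n, t));
}
```

For a loop as small as this one, the cost of starting the threads may well outweigh the gain, so the numbers become meaningful only with much larger arrays or heavier work per iteration.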

The advantages of OpenMP include:

  1. The learning curve is low, as it builds on existing languages through #pragma directives.
  2. It hides thread semantics.
  3. One can parallelize the code incrementally and see the effects. This “change and see” approach gives confidence to programmers.
  4. It supports a good set of platforms (C/C++/Fortran on Linux/Windows).
  5. It supports both coarse-grained and fine-grained parallelism.
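The incremental approach works because a compiler without OpenMP support simply ignores the #pragma lines, so the same source still builds and runs serially. Here is a minimal sketch; the names openmp_enabled and scale are made up, while _OPENMP is the macro that OpenMP compilers define when OpenMP is switched on.

```c
/* Hypothetical helper: reports whether this build has OpenMP enabled. */
int openmp_enabled(void) {
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical example: an ordinary loop, parallelized by adding one line.
   Removing the pragma (or building without OpenMP) gives back the serial code. */
void scale(double *x, int n, double k) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        x[i] *= k;
}
```

One can add a directive to one loop at a time, rebuild, and compare the results and timings against the serial build.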

 

The main disadvantage of OpenMP is that it needs specific tool chains (compilers, runtime). Not all compiler tool chains support OpenMP. Popular ones that do include the Sun Studio tool chain and the GNU tool chain (GCC 4.3.1).

 

OpenMP can deliver a big performance improvement on multicore systems, as each thread can run on a separate core.

 

Then why is OpenMP not so well known in the mainstream? It is because OpenMP gives big performance gains mainly for mathematical and scientific computing needs, such as large matrix multiplications. For a desktop or server application, OpenMP may not be of great help unless the application logic contains such code.
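For reference, the kind of code that benefits looks like the sketch below: a plain matrix multiplication where each row of the result can be computed independently. The name matmul and the row-major layout are assumptions made for this illustration.

```c
/* Hypothetical helper: C = A * B for n x n matrices stored row by row. */
void matmul(const double *A, const double *B, double *C, int n) {
    /* Rows of C are independent, so they are shared among the threads. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;            /* declared inside the loop, so private */
            for (int k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    }
}
```

Only the outer loop is parallelized here. Every thread writes to different rows of C, so no locking is needed.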

 

 

 

 

 

[Last update: 04:08:00 PM, 12/03/2010]