Programming Languages for High-performance Computing

This is a module of the Capita Selecta of Software Engineering course.

In this module, we will look at various models (and languages) for parallel computing, in the broad domain of high-performance computing (HPC).

Module contents

Partitioned Global Address Space

Partitioned Global Address Space (PGAS) is a parallel programming model where large data objects are partitioned across different processors (each processor has a “local” part of the data), yet made globally available (each processor can access all of the data). As such, the PGAS model preserves most of the convenience of shared memory programming, but explicitly considers data locality, which is crucial for achieving good performance.
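
To make the "partitioned yet global" idea concrete, here is a minimal C++ sketch of a block-distributed array. It is only an illustration of the model, not the API of any PGAS language or library: the names `BlockDistribution` and `GlobalArray` are hypothetical, the "processes" are simulated by separate vectors inside one program, and the sketch assumes at least one element and at least one process.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps a global index onto (owner, local offset) for a
// block distribution of n elements over p processes (assumes n > 0, p > 0).
struct BlockDistribution {
    std::size_t n;  // global number of elements
    std::size_t p;  // number of processes

    // Block size, rounded up so that p blocks cover all n elements.
    std::size_t block() const { return (n + p - 1) / p; }

    // Process whose local memory holds global index i.
    std::size_t owner(std::size_t i) const { return i / block(); }

    // Position of global index i inside its owner's local part.
    std::size_t local_index(std::size_t i) const { return i % block(); }
};

// A "global" array stored as one local part per (simulated) process.
struct GlobalArray {
    BlockDistribution dist;
    std::vector<std::vector<double>> parts;  // parts[r] is local to process r

    GlobalArray(std::size_t n, std::size_t p) : dist{n, p}, parts(p) {
        for (std::size_t r = 0; r < p; ++r) {
            std::size_t begin = r * dist.block();
            std::size_t end = std::min(n, begin + dist.block());
            parts[r].assign(end > begin ? end - begin : 0, 0.0);
        }
    }

    // Global view: any process may access any element by its global index.
    double& at(std::size_t i) {
        return parts[dist.owner(i)][dist.local_index(i)];
    }

    // Locality: only accesses with owner(i) == rank stay in local memory.
    bool is_local(std::size_t i, std::size_t rank) const {
        return dist.owner(i) == rank;
    }
};
```

In a real PGAS system a non-local access would translate into communication, which is why a program typically tries to let each process work mostly on indices for which `is_local` holds.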

In the domain of high-performance computing (HPC), the PGAS model is a promising approach for programming computer systems with distributed memory (e.g. a cluster of computers on a local network), as well as “multi-core” and “many-core” architectures where memory access is no longer uniform across computing cores, i.e. Non-Uniform Memory Access (NUMA) architectures.

Programming languages and libraries have integrated the PGAS model in a number of different ways. Unified Parallel C, Co-Array Fortran and Global Arrays are PGAS programming approaches that follow the SPMD (single program, multiple data) model of parallel programming. X10 and Chapel, on the other hand, implement an asynchronous PGAS model, which provides a richer execution framework with concepts such as task pools and asynchronous task creation.
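
The SPMD style can be sketched in plain C++ as follows. This is an assumption-laden illustration, not code in any of the languages named above: the function names `spmd_partial_sum` and `spmd_sum` are hypothetical, the ranks are simulated by loop iterations that OpenMP may run concurrently, and `nranks` is assumed to be positive.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// The "single program": every rank runs this same function, but only on
// the block of the data that it owns ("multiple data").
double spmd_partial_sum(const std::vector<double>& data, int rank, int nranks) {
    std::size_t n = data.size();
    std::size_t block = (n + nranks - 1) / nranks;  // block size, rounded up
    std::size_t begin = std::min(n, static_cast<std::size_t>(rank) * block);
    std::size_t end = std::min(n, begin + block);
    double sum = 0.0;
    for (std::size_t i = begin; i < end; ++i) sum += data[i];
    return sum;
}

// Driver: one loop iteration plays the role of one rank.
double spmd_sum(const std::vector<double>& data, int nranks) {
    std::vector<double> partial(nranks, 0.0);
    // With OpenMP enabled the ranks may run concurrently; without it the
    // pragma is skipped and the ranks simply run one after another.
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int rank = 0; rank < nranks; ++rank) {
        partial[rank] = spmd_partial_sum(data, rank, nranks);
    }
    // Combine the per-rank results, as a reduction would do.
    double total = 0.0;
    for (double s : partial) total += s;
    return total;
}
```

An asynchronous model would instead spawn tasks dynamically rather than fixing one identical activity per rank up front.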

Schedule

  • Monday, 14 October 2013, 15:00-17:00, room E2.04
  • Monday, 21 October 2013, 15:00-17:00, room E2.04
  • Monday, 28 October 2013, 15:00-17:00, room E2.04
  • Monday, 04 November 2013, 15:00-17:00, room E2.04

The classes will take place on VUB campus Etterbeek. The room is in building E (map).

Assignments

There are two assignments for this module:

    • The report is an assignment for all students (i.e. KUL, UA and VUB students), with deadline 13 January 2014, 9:00 AM. (For the second session, the report has to be handed in by 17 August 2014, 9:00 AM.)
    • The project is an assignment only for the VUB students, also with deadline 13 January 2014, 9:00 AM

For details regarding the assignments, see the end of the session 2 slides below.

Report

As inspiration, here are some examples of problems for which you could consider a parallel implementation:

    1. Box or Gaussian blur (link)
    2. Discrete Fourier transform (link)
    3. Ray tracing with line-sphere intersection (link)
    4. Conway's game of life (link)
    5. Power method to estimate an eigenvalue of an infinite matrix (link #3, link)
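
As an illustration of what a parallel implementation of one of these problems might look like, here is a minimal sketch of a single Conway's game of life step in C++ with OpenMP. The names `Grid`, `live_neighbours` and `life_step` are hypothetical, cells outside the grid are assumed to be dead, and the grid is assumed to be rectangular.

```cpp
#include <vector>

using Grid = std::vector<std::vector<int>>;  // 1 = alive, 0 = dead

// Count the live neighbours of cell (r, c); cells outside the grid are dead.
int live_neighbours(const Grid& g, int r, int c) {
    int rows = static_cast<int>(g.size());
    int cols = static_cast<int>(g[0].size());
    int count = 0;
    for (int dr = -1; dr <= 1; ++dr) {
        for (int dc = -1; dc <= 1; ++dc) {
            if (dr == 0 && dc == 0) continue;  // skip the cell itself
            int rr = r + dr;
            int cc = c + dc;
            if (rr >= 0 && rr < rows && cc >= 0 && cc < cols) count += g[rr][cc];
        }
    }
    return count;
}

// Compute the next generation. Each row of the new grid depends only on the
// old grid, so rows can be processed independently.
Grid life_step(const Grid& cur) {
    Grid next = cur;
    int rows = static_cast<int>(cur.size());
    // Without OpenMP the pragma is skipped and the loop runs sequentially.
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int r = 0; r < rows; ++r) {
        int cols = static_cast<int>(cur[r].size());
        for (int c = 0; c < cols; ++c) {
            int n = live_neighbours(cur, r, c);
            // A cell lives with exactly 3 neighbours, or survives with 2.
            next[r][c] = (n == 3 || (n == 2 && cur[r][c] == 1)) ? 1 : 0;
        }
    }
    return next;
}
```

Reading from `cur` and writing only to `next` keeps the iterations free of data races, which is what makes the row loop safe to parallelize.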

Project

Project assignment document

You are tasked to parallelize the Approximate Boyer-Moore-Horspool string matching algorithm. A short overview and context of the project assignment was presented during the session 4 lecture. You are given working source code implementing the algorithm in a sequential fashion. This source code is bundled with a realistic test case (+ data files) for testing and debugging purposes.

You are free to choose your parallelization toolkit from the following list:
 OpenMP, Intel TBB, Cilk++ or Cilk.
 Alternatives are allowed only by permission of the teaching staff. The provided source code is in C++; alternative programming languages are again allowed only by permission. The code you hand in should not depend on commercial applications or libraries (e.g. Visual Studio or Xcode project files).
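
To give an idea of one possible parallelization strategy, the sketch below splits the text into chunks and searches them independently with OpenMP. It is not the provided project code and it deliberately uses the simpler exact Horspool variant, not the approximate one from the assignment. The function names `horspool_range` and `horspool_parallel` are hypothetical, and `nchunks` is assumed to be positive.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Exact Horspool search, reporting matches that START in [begin, end).
// The comparison may read up to pat.size() - 1 characters past `end`.
std::vector<std::size_t> horspool_range(const std::string& text,
                                        const std::string& pat,
                                        std::size_t begin, std::size_t end) {
    std::vector<std::size_t> hits;
    std::size_t n = text.size();
    std::size_t m = pat.size();
    if (m == 0 || m > n) return hits;

    // Bad-character table: how far the pattern may shift, based on the text
    // character aligned with the last pattern position.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    for (std::size_t k = 0; k + 1 < m; ++k) {
        shift[static_cast<unsigned char>(pat[k])] = m - 1 - k;
    }

    std::size_t last_start = std::min(end, n - m + 1);  // exclusive bound
    std::size_t pos = begin;
    while (pos < last_start) {
        if (text.compare(pos, m, pat) == 0) hits.push_back(pos);
        pos += shift[static_cast<unsigned char>(text[pos + m - 1])];
    }
    return hits;
}

// Split the start positions into nchunks blocks and search them
// independently; each match is owned by the chunk containing its start.
std::vector<std::size_t> horspool_parallel(const std::string& text,
                                           const std::string& pat,
                                           int nchunks) {
    std::size_t n = text.size();
    std::size_t chunk = (n + nchunks - 1) / nchunks;
    std::vector<std::vector<std::size_t>> per_chunk(nchunks);
    // Without OpenMP the pragma is skipped and the chunks run sequentially.
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int c = 0; c < nchunks; ++c) {
        std::size_t begin = std::min(n, static_cast<std::size_t>(c) * chunk);
        std::size_t end = std::min(n, begin + chunk);
        per_chunk[c] = horspool_range(text, pat, begin, end);
    }
    // Concatenating in chunk order keeps the result sorted by position.
    std::vector<std::size_t> hits;
    for (const auto& part : per_chunk) {
        hits.insert(hits.end(), part.begin(), part.end());
    }
    return hits;
}
```

Letting each chunk own the matches that start inside it, while allowing reads past the chunk boundary, avoids both missed and duplicated matches at the borders. Whether a similar decomposition suits the approximate algorithm is something to evaluate against the provided code.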

Turn in the following via Pointcarré before the deadline:

    1. A tarball (.tar.gz) of your source code (do not include the big data files).
    2. A short (max. 5 pages) report containing:
      1. instructions on how to compile and run your code (hint: a basic makefile is already provided),
      2. an explanation of the parallelization approach you have taken,
      3. a report on your experiments and obtained results (scaling, parallel speedup), and
      4. your conclusions and insights.
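
For the experiments part of the report, parallel speedup is commonly defined as S(p) = T(1) / T(p) and parallel efficiency as E(p) = S(p) / p, where T(p) is the wall-clock time on p cores. The helpers below are a minimal sketch of how these could be measured and computed in C++; the function names are hypothetical and not part of the provided code.

```cpp
#include <chrono>
#include <functional>

// Wall-clock time of one call to f, in seconds.
double time_seconds(const std::function<void()>& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Speedup S(p) = T(1) / T(p): how much faster the run on p cores is.
double speedup(double t1, double tp) {
    return t1 / tp;
}

// Efficiency E(p) = S(p) / p: the fraction of ideal linear speedup achieved.
double efficiency(double t1, double tp, int p) {
    return speedup(t1, tp) / p;
}
```

For example, a run that takes 8 seconds on one core and 2 seconds on eight cores has a speedup of 4 and an efficiency of 0.5. Timings tend to fluctuate, so repeating each measurement several times is advisable.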

The source code and data files for the programming assignment can be found on http://pointcarre.vub.ac.be/.

Material

    • Session 1 slides [pdf], programs [zip]
    • Session 2 slides [pdf], programs [zip]
    • Session 3 slides [pdf]
    • Session 4 slides [pdf]

Pointers for UPC and GA:

Pointers for X10 and Chapel:

Pointers for OpenMP, Cilk and Intel TBB:

Pointers for Approximate Boyer-Moore-Horspool: