Parallelize one of your favorite applications and show speedup on a CMP (chip multiprocessor)


  1. We may choose either “em3d” or “treeadd” from the Olden benchmark suite to parallelize.
    • em3d : models the propagation of electromagnetic waves through objects in three dimensions.
    • treeadd : recursively sums the values stored in the nodes of a tree.
  2. In em3d, the value of each E (electric field) node is updated by a weighted sum of its neighboring H (magnetic field) nodes, and vice versa.

Thus, the dependencies between E and H nodes form a bipartite graph: no E node depends on another E node, so all E nodes can be updated in parallel, and likewise for the H nodes.
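
For comparison, the treeadd kernel mentioned above can be sketched in a few lines of C. This is a minimal sketch, not the Olden source: the names `tree_node`, `build_tree`, `tree_add` and `free_tree` are hypothetical, and the tree is assumed to be a full binary tree whose nodes all hold the value 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node layout; the Olden source may differ. */
typedef struct tree_t {
    int            value;
    struct tree_t *left;
    struct tree_t *right;
} tree_node;

/* Build a full binary tree with the given number of levels.
 * Every node stores 1 (an assumption made for this sketch). */
tree_node *build_tree(int levels)
{
    assert(levels >= 0);
    if (levels == 0)
        return NULL;
    tree_node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();                      /* out of memory: give up */
    n->value = 1;
    n->left  = build_tree(levels - 1);
    n->right = build_tree(levels - 1);
    return n;
}

/* Recursively sum the values of all nodes in the tree rooted at t.
 * The two recursive calls work on disjoint subtrees, so they do not
 * depend on each other and could run on different cores. */
int tree_add(const tree_node *t)
{
    if (t == NULL)
        return 0;
    int left_sum  = tree_add(t->left);
    int right_sum = tree_add(t->right);
    return left_sum + right_sum + t->value;
}

/* Release every node of the tree. */
void free_tree(tree_node *t)
{
    if (t == NULL)
        return;
    free_tree(t->left);
    free_tree(t->right);
    free(t);
}
```

A tree built with `build_tree(4)` has 15 nodes, so `tree_add` returns 15 for it. In a parallel version, the left and right subtrees would be the natural units of work to hand to separate cores.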


  • node structure (em3d)
  typedef struct node_t {
    double        value;      /* Field value */
    int           edge_count; /* Number of neighbors this node depends on */
    double        *coeffs;    /* Edge weights */
    double        **values;   /* Dependency list: pointers to neighbor values */
    struct node_t *next;      /* Next node in the list */
  } graph_node;
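
The update rule for one kind of node could be sketched as follows, using the node structure above. This is a sketch under stated assumptions rather than the benchmark's actual code: `update_node` and `update_nodes` are hypothetical names, the nodes are assumed to be reachable through an array of pointers (not only through the `next` list), the weighted sum is assumed to be subtracted from the old value, and the OpenMP pragma is one possible way to spread the loop over the cores of a CMP.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node_t {
    double         value;      /* Field value */
    int            edge_count; /* Number of neighbors this node depends on */
    double        *coeffs;     /* Edge weights */
    double       **values;     /* Pointers to the neighbors' values */
    struct node_t *next;
} graph_node;

/* Update a single node from its neighbors.
 * Assumption: the weighted sum is subtracted from the current value. */
static void update_node(graph_node *n)
{
    assert(n != NULL);
    double sum = 0.0;
    for (int i = 0; i < n->edge_count; i++)
        sum += n->coeffs[i] * *n->values[i];
    n->value -= sum;
}

/* Update all nodes of one kind (all E nodes or all H nodes).
 * Because the graph is bipartite, each iteration only reads values
 * of the other kind, so the iterations are independent.
 * The pragma is ignored when the compiler is run without OpenMP. */
void update_nodes(graph_node **nodes, int count)
{
    #pragma omp parallel for
    for (int i = 0; i < count; i++)
        update_node(nodes[i]);
}
```

One time step would call `update_nodes` on the E nodes and then on the H nodes. The second call must not start before the first has finished, since the H update reads the new E values.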



  • Advantages (compared with a multi-chip multiprocessor)

1. Faster cache coherency:

    Cache coherency circuitry can operate at a much higher clock rate than is possible when the signals have to travel off chip.

2. Less PCB space:

    Multi-core CPU designs require much less Printed Circuit Board (PCB) space than multi-chip SMP designs. 

3. Lower power:

    A multi-core chip draws less power than an equivalent multi-chip design, because driving 
    signals off chip requires more power and because the smaller silicon process geometry 
    allows the cores to operate at lower voltages.
  • Disadvantages

1. Software adjustment:

   Beyond operating system (OS) support, existing software must be adjusted 
   (for example, split into multiple threads) to make full use of the computing 
   resources provided by multi-core processors.


2. Thermal management:

   Multi-core chips are more difficult to manage thermally than lower-density single-core designs.

3. Use of silicon area:

   From an architectural point of view, a single-CPU design may ultimately make better 
   use of the silicon surface area than multiple processing cores.

4. Memory bandwidth and system bus:

   Two processing cores that share the same system bus and memory bandwidth see a limited 
   real-world performance advantage.
   If a single core is already close to being memory-bandwidth limited, going to dual-core 
   might give only a 30% to 70% improvement. If memory bandwidth is not a problem, a 90% improvement can be expected.
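
To relate these percentages to measured run times, the helper functions below may be useful. They are a sketch with hypothetical names, and they assume that "improvement" means the gain in speed over the single-core run, so that a 90% improvement corresponds to a speedup of 1.9.

```c
#include <assert.h>

/* Speedup: serial run time divided by parallel run time. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_serial > 0.0 && t_parallel > 0.0);
    return t_serial / t_parallel;
}

/* Parallel efficiency: speedup divided by the number of cores.
 * A value of 1.0 means every core is fully used. */
double efficiency(double t_serial, double t_parallel, int cores)
{
    assert(cores > 0);
    return speedup(t_serial, t_parallel) / cores;
}

/* Improvement in percent, assuming it is defined as (speedup - 1) * 100. */
double improvement_percent(double t_serial, double t_parallel)
{
    return (speedup(t_serial, t_parallel) - 1.0) * 100.0;
}
```

For example, a program that takes 19 seconds on one core and 10 seconds on two cores has a speedup of 1.9, which is a 90% improvement and an efficiency of 0.95. Under the same assumption, the 30% to 70% range corresponds to speedups of 1.3 to 1.7.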
parallelize_application/parallelize_application/this_week.txt · Last modified: 2010/05/22 09:20 (external edit)