Towards Peak Performance with IBM POWER3

Contents

  1. Towards Peak Performance with IBM POWER3
  2. Topics
  3. The POWER3 Residency at Austin, Texas
  4. RED BOOK: SG24-5155-00 RS/6000 Scientific and Technical Computing: POWER3 Intro
  5. RS/6000 Processor Roadmap
  6. POWER3 Architecture
  7. Stride and Data Cache Structure
  8. Stride in Fortran loops
  9. Data Cache structure
  10. Effects of stride on performance
  11. The 4-Way Set Associative POWER2 Data Cache
  12. The 128-Way Set Associative POWER3 Data Cache
  13. Translation Lookaside Buffer (TLB)
  14. Superscalar Floating Point Units
  15. POWER3 Floating Point Unit - Superscalar Pipeline
  16. IBM XL Fortran - new with V5
  17. XLF V5 64-bit Support
  18. IBM XLF V5 Compiler Optimisation Improvements
  19. XLF V5 Compiler - SMP support
  20. Compiling and linking for SMP
  21. XLF V5 SMP Directives
  22. Tuning for Maximum Megaflops
  23. Hand tuning techniques
  24. "Common sense" example
  25. Sequential processing
  26. M x N unrolling (matrix multiply)
  27. PERFORMANCE Recent Customer Benchmark
  28. Benchmark Performance POWER3 vs POWER2
  29. Benchmark Performance XLF V5 vs XLF V3
  30. SMP Performance: Throughput
  31. XLF V5 SMP Performance 2-way parallel
  32. MPI
  33. MPI vs XLF V5 parallelisation within an SMP
  34. HPF "High Performance Fortran"
  35. MPI within an SMP
  36. SMP nodes in RS/6000 SP
  37. SMP/SP programming paradigms
  38. SMP/SP paradigms