Challenges and Opportunities in Academic HPC Systems Research in 2030

Eurolab4HPC supporting HPC innovation.

Expert view: Martin Schulz

  • Power-bound computing – under a fixed power budget, use more nodes, etc. and find the best configuration (makes more sense with the slides).
  • Network contention: makes execution times harder to predict, but tolerating it is cheaper in terms of hardware.
  • Manage complex workflows
  • Monitor the application and hardware stack at all points. [Seems more comprehensive than what many sites do, but very interesting.]
  • Grafana
  • Malleability
  • Dynamic process management in MPI is not really used. It needs to be more adaptive. Cf. PVM (too dynamic). Bubbles in MPI 4. These could enable on-the-fly adjustability, workflow support and other things, such as the ability to remove a group without global impact, or to duplicate groups (cf. Apache Spark?). Applications need to understand how to work with this unless they use very high-level abstractions.
  • Questions:
    • Heterogeneous systems? Code tightly to the hardware, or at a more abstract level for more flexibility? (cf. Chapel, X-25, which is relatively high abstraction). This exists a bit but needs to be done more, e.g. to leverage accelerators where they exist.
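
The power-bound point above (more nodes at a lower per-node power cap versus fewer nodes at full power) can be illustrated with a small search. This is only a sketch under stated assumptions: the cube-root power/speed model, the Amdahl-style serial fraction, and all names and numbers (node_speed, runtime, best_configuration, the wattages) are hypothetical and not taken from the talk or slides.

```python
from itertools import product


def node_speed(cap_watts, idle_watts=60.0, max_watts=200.0):
    """Relative node speed under a power cap (hypothetical model).

    Assumes speed grows with the cube root of the power available
    above idle, reaching 1.0 at max_watts.
    """
    cap = min(cap_watts, max_watts)
    if cap <= idle_watts:
        return 0.0
    return ((cap - idle_watts) / (max_watts - idle_watts)) ** (1.0 / 3.0)


def runtime(nodes, cap_watts, serial_fraction=0.02, base_time=1000.0):
    """Estimated runtime: Amdahl-style scaling divided by per-node speed."""
    speed = node_speed(cap_watts)
    if speed == 0.0:
        return float("inf")
    amdahl = serial_fraction + (1.0 - serial_fraction) / nodes
    return base_time * amdahl / speed


def best_configuration(power_budget, max_nodes, caps):
    """Exhaustively search (nodes, cap) pairs that fit within the power budget."""
    feasible = [
        (n, c)
        for n, c in product(range(1, max_nodes + 1), caps)
        if n * c <= power_budget
    ]
    # min() raises ValueError if no configuration fits the budget.
    return min(feasible, key=lambda cfg: runtime(*cfg))


if __name__ == "__main__":
    nodes, cap = best_configuration(2000, 32, [80, 100, 120, 160, 200])
    print(nodes, cap, round(runtime(nodes, cap), 1))
```

Under this toy model, running many nodes at a reduced cap beats running fewer nodes at full power; a real system would need measured power/performance curves in place of the assumed ones.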

Expert view: John D. Davis

  • LOCA: HW/SW co-design for IoT to HPC
  • Using ARM a decade ago
  • European Processor Initiative (RISC-V)
  • Questions:
    • Co-design of what?
      • Reconfigurable hardware?
      • BLAS is still the base of very many calculations, so perhaps there is less variability in workloads than might initially seem to be the case.
    • Patent risk?
      • Open source: some things are hard to open source.

Discussion

  • See slides
  • New HPC applications?
    • Libraries, frameworks.
    • ML for simulation?
    • Scaling – pursued where speed is needed. But some things scale weakly; ensembles can be used instead.
    • New algorithms: multi-model, multi-scale, coupled simulations.
  • Climate modelling – how much processing power is required? 100 exaFLOPS?
  • Programming complexity?
    • Neuromorphic?
    • MPI, OpenMP likely to continue. But these change over time.
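
The scaling remark above can be made concrete with a small comparison. This sketch reads "scale weakly" as "scale poorly with node count", which is an assumption, and uses Amdahl's law with a made-up serial fraction; the function names are hypothetical. It compares one run spread over all nodes with an ensemble of smaller independent runs.

```python
def amdahl_speedup(nodes, serial_fraction):
    """Speedup of a single run on `nodes` nodes according to Amdahl's law."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


def ensemble_throughput(total_nodes, nodes_per_member, serial_fraction):
    """Runs completed per unit of single-node runtime.

    Assumes ensemble members are independent and run side by side,
    each on its own partition of nodes_per_member nodes.
    """
    members = total_nodes // nodes_per_member
    return members * amdahl_speedup(nodes_per_member, serial_fraction)


if __name__ == "__main__":
    serial = 0.1  # assumed serial fraction, for illustration only
    print("single run on 64 nodes:", amdahl_speedup(64, serial))
    print("8 members x 8 nodes:   ", ensemble_throughput(64, 8, serial))
```

With a non-trivial serial fraction, the ensemble completes more work per unit time than one large run, though each individual member finishes later than the single large run would.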

Conclusions