Democratisation of HPC with Portals

Panel Introductions

Open OnDemand

  • Vision: web-based access that is easy to use. In use since 2016. The Ohio machines see good use through it. Can also create GUIs.

SGCI

  • Associated with XSEDE. Science gateways project member. ‘Full Stack’ approach. Apache Airavata. Federate multiple clusters, clients. 

ActiveEon

  • HPC + AI solutions. Portals. SLURM, PBS, LSF; on premise, cloud, containers. Easy access to RDMA (not sure what they really mean there).
  • Different portals for different use cases.
  • TensorBoard GUI, Gantt charts of job dependencies, Jupyter, etc.

NIMBIX

  • Headquartered in Dallas.
  • Kubernetes – new, but experience with containers.
  • HyperHub
  • JARVICE – end user portal.

Altair

  • PBS Works – includes portals (Altair Access) with 3D remote visualisation.
  • Config management and HPC monitoring.
  • Analytics for job data
  • Scheduling simulation

Rescale

  • HPC Cloud Agility. 
  • Offers portals, but also IT department control.

Atos

  • XCS – Extreme Computing Studio portal.
    • Supports all major schedulers
    • Role-based, customisable.
    • Mobile available.

Historical Comment

One comment I would make here is that I saw very similar things 5 years ago and, if you look at the old NGS portal, some of this 10 years ago. The current tools seem easier to use and more flexible, but it has taken a while to get here. Hopefully the products shown will stick around. Missing from the panel are things like UNICORE, UberCloud, etc.

Discussion

Top features

  • Open OnDemand: HTML5-based, so the installation is easier to set up and keep secure.
  • Know the user community that will use the portal. User experience consultations encouraged (cf. CARMEN project, which formally did this for the neuroscience community).
  • Capture metadata (provenance) and share it. Again cf. CARMEN (2007-13), which did both. So this is not a new idea, and the permission groupings suggested are exactly the same.
  • Science Gateway versus portal?
  • Openness and extendability important.
  • Needs to be a seamless extension to users’ current workflows – they should not have to worry too much about the details.
  • Self-service.
  • Security
  • Ease-of-use – HPC as another tool for those who aren’t HPC experts.
  • Remote visualisation is critical due to data locality.
  • Universal accessibility – can use wherever they are.
  • ‘Make HPC tolerable’
  • ‘Not a black box’ – tell me what’s going on.
  • Integrate with existing infrastructure (e.g. the AD or LDAP you are already using).
  • Some automation is easier on command line. Open OnDemand supports this, but could also hook into APIs to create other automations. This is useful.
  • Some people want to use the MATLAB GUI.
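As an aside on the metadata/provenance point above, here is a minimal sketch of what a portal might record at submission time. None of this comes from the panel or from CARMEN: the record fields, `make_record` and `to_json` are hypothetical names chosen purely for illustration, and a real portal would likely capture far more (software versions, scheduler job ID, access groups).

```python
import hashlib
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance record for one portal job submission."""
    user: str
    application: str
    parameters: dict
    input_digests: dict  # input name -> SHA-256 hex digest of its contents
    submitted_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def digest(data: bytes) -> str:
    """Fingerprint input data so a later rerun can be checked against it."""
    return hashlib.sha256(data).hexdigest()


def make_record(user, application, parameters, inputs):
    """Build a record from the submission details and the raw input bytes."""
    return ProvenanceRecord(
        user=user,
        application=application,
        parameters=dict(parameters),
        input_digests={name: digest(blob) for name, blob in inputs.items()},
    )


def to_json(record):
    """Serialise to JSON so the record can be stored or shared with a group."""
    return json.dumps(asdict(record), sort_keys=True, indent=2)


if __name__ == "__main__":
    rec = make_record(
        "alice", "spike-sorter", {"threshold": 4.5}, {"recording.dat": b"abc"}
    )
    print(to_json(rec))
```

Keeping the record as plain JSON is a deliberate choice here: it stays readable outside the portal that produced it.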

Other Discussions

  • Handling of OpenGL, etc for remote visualisation.
    • Atos uses its own technology – better than VNC.
    • Open OnDemand – not sure
    • SciGaP – work out on an individual use case basis
    • Rescale: uses GPU on the remote side, then HTML5 to the browser – very flexible as long as the application supports OpenGL. VirtualGL. For a very heavy visualisation user, local is still more responsive.
    • Remote visualisation is now actually good.
    • Need to ensure that the visualisation integrates into the workflow, e.g. seeing visualisation when the job is running (already see this at Loughborough).
  • Will a web GUI compromise performance? Does ‘dumbing down’ the options potentially reduce performance, given that the best performance lies in those small tweaks?
    • ‘It depends’. If the platform is good, no? Portals as an entry point. [It pushes the need for intelligence back to the platform?]
    • SciGaP – work with cluster owners. Also a ‘power user’ option.
    • Input validation useful.
  • [A question I would ask is whether any of these tools, where a workflow is composed (and they didn’t seem to emphasise workflows) can dump out a workflow description that can be used elsewhere? The YouShare workflow tool spat out an abstract workflow that could then be loaded onto the portal and given the appropriate inputs and output locations to turn it into a concrete one. I would hope that these tools would support that, but export would be good as that means that if a tool can no longer be afforded, etc., the workflow could be saved, ideally in an open format, or convertible to one. Also it’s important if someone wanted to take a workflow used locally and move it to a national resource and scale it up, for example].
  • Can these be used on premise? Yes.
  • Standardised interface to schedulers. [Surely this would be DRMAA?]. 
    • For the components of the web elements – use REST
    • Don’t expect to standardise REST APIs for web interfaces.
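To make the bracketed workflow-export question above concrete, here is a minimal sketch of the abstract-to-concrete idea: an abstract workflow names its inputs and outputs only as placeholders, is exported in an open format (JSON here), and becomes concrete once every placeholder is bound to a real location. This is not how YouShare or any of the panel's products work; `Step`, `export_abstract`, `load_abstract`, `make_concrete` and all paths are hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Step:
    """One step of an abstract workflow; inputs/output are placeholder names."""
    name: str
    tool: str
    inputs: tuple
    output: str


def export_abstract(steps):
    """Serialise the abstract workflow to JSON so it can be moved elsewhere."""
    return json.dumps({"steps": [asdict(s) for s in steps]}, indent=2)


def load_abstract(text):
    """Rebuild the abstract workflow from its exported JSON form."""
    return [
        Step(d["name"], d["tool"], tuple(d["inputs"]), d["output"])
        for d in json.loads(text)["steps"]
    ]


def make_concrete(steps, bindings):
    """Bind every placeholder to a real location; fail if any is unbound."""
    concrete = []
    for step in steps:
        names = list(step.inputs) + [step.output]
        missing = [n for n in names if n not in bindings]
        if missing:
            raise KeyError(f"step {step.name!r} has unbound placeholders: {missing}")
        concrete.append({
            "name": step.name,
            "tool": step.tool,
            "inputs": [bindings[n] for n in step.inputs],
            "output": bindings[step.output],
        })
    return concrete


if __name__ == "__main__":
    workflow = [
        Step("filter", "bandpass", ("raw",), "filtered"),
        Step("detect", "spike_detect", ("filtered",), "spikes"),
    ]
    exported = export_abstract(workflow)
    # The same exported text could be bound to local or national-resource paths.
    print(make_concrete(load_abstract(exported), {
        "raw": "/data/run1/raw.dat",
        "filtered": "/scratch/run1/filtered.dat",
        "spikes": "/results/run1/spikes.csv",
    }))
```

Because the exported file carries no site-specific paths, the same workflow could in principle be rebound to a larger resource, or kept if the original tool is no longer available.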

Future

  • Standardisation at highest level
  • App-centric
  • Less need for DevOps/admins, RSEs, etc.
  • More domain-specific
  • AI and HPC?
  • HPC ubiquity