HPC-SIG Meeting 2018-09-24


The Agenda for this meeting was:

10:00 – 10:30 Arrival & Refreshments
10:30 – 10:35 Welcome from Chair
10:35 – 10:55 Leeds’ HPC for Secure Data (Martin Callaghan)
10:55 – 11:15 Managing Secure Data at Swansea (Simon Thompson)
11:15 – 11:35 Panel Discussion
11:35 – 12:30 Breakout Sessions on the challenges of managing secure data with HPCs
12:30 – 13:30 Lunch
13:30 – 14:30 Feedback from Groups
14:30 – 15:00 ‘Tell me what you want, what you really, really want!’ – Potential SIG Priorities.
15:00 – 15:30 Coffee
15:30 – 16:00 ‘Here are the results of the Norwegian jury!’ – Voting on the priority levels of identified SIG activities
16:00 – 16:15 SIG Business and Close


The notes from the meeting follow:

10:30 – 10:35 Welcome from Chair

The initial address by Christine Kitchen outlined the format for the day (breakouts, feedback). There was a reminder about subscriptions, and updates on the Terms of Reference and the website.

10:35 – 10:55 Leeds’ HPC for Secure Data (Martin Callaghan)

See the presentation which will be attached here.

Other notes include:

  • 11,000 cores, with more coming
  • ISO 27001 VM farm for Windows
  • Four-type data classification (w.r.t. security)
  • Problems and solutions:
    • ISO 27001 is complex for users, and for admins to modify/update.
    • Not Linux for secure data. 
    • Alces Flight – disposable isolated HPC clusters 
    • Kubernetes on Azure cloud 
    • Azure batch on Azure cloud 
    • Cloud needed to support security affordably, and a big departure for a ‘tin hugger’? 
    • Cloud: funding model challenges/changes (no longer free at the point of use?) 
  • Questions 
    • Cliff Addison asked about Cyber Essentials. Leeds is working through this on user education issues. 
    • Cost effective for data that has to remain in the UK? Is cost the best metric? Well, it still needs to be taken into account. Requires working with researchers to examine the new landscape. Ongoing. Chris K: storage is an issue – keeping data in the UK and encrypting it becomes very expensive.
    • Export-controlled data? Not handled at Leeds, but is a secondary concern. Aaron gave a brief overview.
    • Martin noted secure safe rooms.
    • Why not just do it on the cloud themselves? They can, and it’s not an issue. RC want to make it easy for people who don’t have the technical or governance skills. Cliff: there can be compliance issues and a central source is helpful for demonstrating compliance. It also protects the reputation of the university and researchers.
    • Governance models: retained 
    • Why will it need to be chargeable? The existing budget is fixed and spent for the next two years. Real cloud usage would be outside this. Costs need to be controlled. Jacky: is there a consultancy service to allow researchers to build costs into grant proposals? Martin: this will be coming. There is a need to be realistic about costs, value:cost issues, and the costs imposed on grant proposals.
    • Andy Turner asked: beginning of the end of local hosting? Martin: probably not, but who knows. Chris: support staff are the important element, not the data centre. 

Other questions: on sli.do, to be answered.

10:55 – 11:15 Managing Secure Data at Swansea (Simon Thompson)

The presentation will be available here.

Other notes include:

  • 5 PB medical imaging Wales project
  • CLIMB: related 
  • NLP and NHS free text data 
  • Federation: processing near data, as can’t securely move the data. 
  • Cloud and USA ownership? 
  • Questions: 
    • Data egress: human intensive, and not very scalable. 
    • Intrusion detection and data transport: SCP, SFTP, etc. Intrusion detection creates a lot of false positives. Everything is logged, and algorithms are being built to detect it.
    • Data provenance: incoming data is trusted to be set up correctly for ingestion.
    • Audits cover automation, etc. 
    • SQL instances: more explanation of this answer is needed.
    • NHS regulations: actually helps as NHS is used to sharing data securely. It -is- risk averse, though. 
    • Use of blockchain: not yet; cannot see much use for it.
    • Patriot Act concerns: all main vendors are US-owned and could take the data. This is an issue, but NHS Digital is OK with it, provided you do the diligence. UK Cloud might be used (govt. owned). Also UK data centres. Pricing? Cost of cloud storage for 15 years' compliance (10x the cost of on-premise).
    • Will the federated model continue? This is being built now as a framework for late 2018, as a pilot. So a federated model is assumed.

11:15 – 11:35 Panel Discussion

Some topics discussed included:

  • Patriot Act + Brexit: Current framework relies on EU membership – will this break the use of US-based cloud providers? (Eduroam too). SSH + federated security for HPC. Different RCs have different requirements.
  • What are the user requirements – system should prevent this? System admins: need security clearances. System admins being ‘owned’ by the team, as there is a reputational risk to the team doing the project.
  • Is it trust or a ‘blame framework’? Simon: not blame.  
  • Different security levels? The tagging comes from the data provider.  
  • How is data that shouldn’t be there determined? This is hard to police sometimes. Some automation of this possible. 

11:35 – 12:30 Breakout Sessions on the challenges of managing secure data with HPCs

SIG-Breakout Groups-v1

There were lively discussions that continued over lunch.

13:30 – 14:30 Feedback from Groups

This was presented to the SIG, but not captured.

14:30 – 15:00 ‘Tell me what you want, what you really, really want!’ – Potential SIG Priorities.

Some topics discussed included:

  • Supplier representation at HPC-SIG. Don’t need to be members – but could a subscription by vendors be used to do something useful (bursaries, etc.)?
  • How useful was the SIG for Hull? Personal contact, 2010 report very, very useful.  
  • Authority of advice and expertise is useful. 
  • Introductions – the lack of them was intentional – does that work or not?
  • Bios on the web pages (links to LinkedIn or similar)? Some don’t like LinkedIn. Need to check with GDPR – or share some other profile? Can we link to LinkedIn according to the LinkedIn T&Cs? 
  • What is the purpose of the website? Promote research, careers, resources. Drive traffic to it. But it has to be a communal responsibility.

15:30 – 16:00 ‘Here are the results of the Norwegian jury!’ – Voting on the priority levels of identified SIG activities

This was conducted electronically.

The following questions were asked and responses received:

Question | Response
Is there interest in a mentorship / buddy scheme? | 86% Yes of 29 responding.
Is the slack channel useful? | 70% Yes of 30 responding.
Should we be looking at other social media channels? (If yes, please provide suggestions to the SIG committee.) | 29% Yes of 28 responding.
Is there interest in promoting the diverse members with links back to webpages as well as a service synopsis (couple of paragraphs) to help new sites learn about the core services/mission of the centres? | 100% Yes of 32 responding.
Is there interest in producing some impact case studies or a SIG Annual Report? | 97% Yes of 31 responding.
Should the SIG have a dedicated session in future to promote achievements of early stage careers impact in service delivery? (Potential sponsorship via suppliers to provide prizes, or using some of the subscription costs to provide this.) | 90% Yes of 29 responding.
Should we open the membership to technical representatives from the Supplier Community (technology / software) under strict conditions (NO MARKETING! and no more than two reps per meeting) with two open sessions (supplier allowed) and two closed sessions (HEIs and current affiliated members) per annum? | 44% Yes of 32 responding.
Membership fees: Do the current costs need revising? | 67% Yes of 27 responding.
Is the current scheduling of meetings optimal (Start and End Times / Locations)? | 100% of 27 responding.
Knowledge Exchange: Is there interest in creating an F.A.Q / Knowledge base to support the development of best practice across centres? | This was based on strongly disagree to strongly agree. Overall result 1.03 (agree) over 33 respondents. 3 disagreed, and 3 were neutral.
Is it possible to establish best practices and standardisation across the member sites? | 71% Yes of 28 responding.
Can we identify the top 3 issues for the HPC-SIG to focus efforts on? (If so, we can provide post-it notes / paper at the back of the room for sites to produce the top issues and the three most frequently mentioned will be presented back to the SIG as the top 3 issues for us to focus on!) | Comments received included:

  • Mentorship
  • Training
  • Website
  • Skills development
  • Sharing best practices
  • Community building
  • Encouraging junior members of staff to attend meetings
  • Cross-site mentoring
  • Identifying common training for new sysadmin staff and developing open source materials to be shared
  • Skills Development
  • Case Studies / Annual Report
  • Repository of best practice materials (to support justifications and provide common language across institutions)
  • Finding others who use the same technology
  • Big process things like how people have done ISO 27001
  • Promoting HPC to the wider community
  • Support for tier 3 institutions with modest HPC resources
  • Skills/competencies for HPC staff, along with recommended recognition/accreditation routes
  • Profile of the SIG and the work it contributes to (already mentioned, but case studies etc)
  • Careers visibility, training, specific technical sessions
  • Dealing with sensitive data
  • Career development and professionalisation of HPC
  • Promoting the use of HPC by our institutions to major stakeholders and decision makers
  • New technologies, efficient use of resources, science driven technology selection
  • Building communities with other HPC interest groups (users/RSE)
  • Collaborating on training materials/workshops regionally/nationally
  • I think networking is by far the most valuable aspect of the group.
  • I think the previous report was useful but I’m not sure the effort of producing one annually is justified.
  • Recruitment, Cloud engineering approach (with cloud-wg?)
  • Identity – who are we and what we do, and how it is changing over time
  • Attracting new talent
  • Re-connect with funders/government for proper representation of the community’s concerns, interests and whom we seek to support
  • Procurement, training resources, impact reporting
  • Cloud solutions and how to make them easily usable by an average HPC user without learning all about the cloud
  • Recruitment/retention
  • Federated authentication
  • Moving beyond bare metal HPC
  • Value of HPC / support for computationally intensive computing to Universities and why its people need care and feeding.
  • Case studies of things that work / did not work [never see these] partly for the group’s benefit but also to make the value arguments more concrete.
  • How can HPC and related people best cope with a tsunami of different challenges like security, diverse workflows, specialised hardware (e.g. GPU) needs?


16:00 – 16:15 SIG Business and Close

There was no additional SIG business.