
Wednesday 15th January Q&A

Leadership and Career Development

These are questions from the audience during the sessions; not all were addressed. Please contact us if you feel that any of them would make good discussion topics for future DRI retreats.


The impact of AI and machine learning on DRI workflows

  • As we are not yet sure how this lands re: research integrity, how much do we promote vs encourage reflection re: RIO guidelines?
  • How will AI use in learning affect the quality of candidates coming into the field? Do we need to change our entry routes/onboarding to account for this?
  • Will AI models replace traditional numerical models?
  • What role should DRI professionals take in framing responsible use of AI in research (and education)?
  • Data Protection Impact Assessments are a GDPR requirement. How meaningfully are these done when looking at AI projects?
  • What are the ethics of AI-assisted code? Is there a difference between code used internally and code published alongside research?
  • If we can't tell the difference between human and AI-generated content, are any restrictions around its use moot?
  • Where do you think we are in between the initial explorative gold rush and established methods that get applied routinely?
  • Watermeyer, Lanclos & Phipps have already published on unethical behaviours of our colleagues re: Gen AI use. Can we help get that genie back in the bottle?
  • What do you do when someone believes that strawberry reasons?
  • How culpable are funding bodies in the pressure to use AI for every problem?
  • Do we need to buy into that vision, or do we have the appetite to push back? We did it with the Turnitin detection tool; will research do the same?
  • AI training is GPU intensive. Are we at risk of institutions falling behind as they cannot keep up with investment in local data centres, GPUs and energy?
  • Are we risking all AI/ML tools being seen by the public and governments as just ChatGPT vs more sophisticated preconditioners etc.?
  • Is ethical advice/guidance _really_ a DRI activity? Does the deep embedding of AI/ML in research practice (because AI/ML) make it so?
  • What do the panel see as the greatest threat from AI? Infra security, human obsolescence, mass media manipulation?
  • Providing tools/support for reproducibility is a fundamental part of DRI/RSE teams. What (new) tools do we need to support to enable that for AI/ML workloads?
  • Could you give an example where as a DRI professional you gave input into the design, adaptation or use of AI in a research process? Are there any learning points?
  • To what extent will AI replace us DRI professionals?
  • If you replace junior devs with AI now then how will you get senior devs in the future?
  • Any thoughts on AI for code generation and code transformation, e.g. for optimising code or porting legacy code to new platforms?

The environmental impacts of infrastructure

  • Can we meaningfully use data centres as district heating?
  • How does your institution balance the benefits of computing power with the environmental costs - what wins and why?
  • Given Unis have small impact compared to industry, is reducing our impact useful in itself or more a learning experience for future industrial employees?
  • Is immersion cooling an option at your institution? If yes, why, if no, why not?
  • What are your thoughts on large tech companies investing heavily in nuclear fission reactors to power their data centres, e.g. Three Mile Island?
  • If we struggle to understand/communicate electricity use, how can we make progress on production and minerals which are more complex issues?
  • Can we afford to wait for good measurement before making improvements?
  • Should we restrict the size of LLM models used by researchers/students to reduce environmental impact?
  • Comment not question, but this website gives a great overview of the UK electricity grid and the carbon impact over the last decade https://grid.iamkate.com/
  • Why doesn't all data centre kit have a mandatory logging stream for real time energy use?
  • What do you do (or can you do) with a life-expired HPC cluster? The hardware is still very performant, but not nearly as efficient as newer hardware.
  • What do suppliers need to tell us about environmental cost of manufacture that will enable better procurement decisions?
  • What is the environmental impact of data archiving and cold data storage? What policies have your institutions got in place to manage this?
  • What role should governments have in establishing/enforcing sustainable environmental practices?
  • Would charging researchers for larger data use be a feasible way to raise awareness of the environmental impacts (incorporating power costs, for example)?
  • ML research often aims at incremental improvement of efficiency. How do you hold users accountable to the research value VS environmental cost?
  • What are the low hanging fruit in environmental impact and what can we do today to improve? While also gathering data...
  • What is the energy impact of the shift from Fortran/C to Python/R?
  • If every job submission on every HPC reported estimated CO2 impact at the top of the output log, would users read it/use it/try to improve their code?
  • How much of our data do we actually need to keep?
  • When we optimise codes used by researchers, they often just scale problems up so jobs take the same duration. How can we prevent this? Should we?
  • Thoughts on N8 and Universities collaborating to build a larger datacentre for collocating tier 3 HPCs to share (financial / ecological) benefits?
  • Is anything related to sustainability like energy usage, efficiency, etc. built into grant proposals or business plans generally? Does this tend to cost more?
  • Are researchers aware of these challenges? Would they make a more conscious use of the HPC if they were?
  • So we should put HPCs in the Outer Hebrides near all the wind turbines, and DRI professionals can live in beautiful rural Scotland?
  • I'd love to have a shared Data Centre/HPC facility for all Universities - how do we get this going?🙂
  • Should we prioritize HPC usage to projects that can prove societal impact?

Security and compliance

  • How do you balance data restrictions to comply with GDPR (for example) and other demands surrounding open data and open source?
  • Confidentiality is taken very seriously because of penalties. Is there anything that we can do so integrity and availability are not left behind?
  • To what extent do you feel that security is forced to be reactionary based on attackers at the moment?
  • If/When a software supply chain attack occurs and we are no longer able to install untrusted packages from pip etc, how will digital research be possible?
  • Fear of supply chain attacks risks shutting down easy sharing of research software. What technical/social things can we do to mitigate the fear & attacks?
  • What advice should N8CIR be giving to our institutions' IT services to better cater for the security and compliance aspects of research data?
  • How important are human resources (i.e. people with relevant expertise) to the security of systems? What roles and skills are going to be most crucial?
  • Care to comment on the link between under-resourcing, burn out and security?
  • Security is important, as are environmental concerns, code quality, FAIR principles etc. How do we identify (and help users identify) what is most important without overwhelming them?
  • Is it common for others to hear that red teaming would never finish due to so many holes, as I experienced when asking the infosec team at my University?
  • What would world IT security look like if zero-day exploits were made public as soon as they are discovered?
  • How can we train a broader range of new DRI professionals in minimum good security practices, or encourage them to take up security as a main focus?

Preparing for the next generation of research computing infrastructure

  • Is the next generation of research computing infrastructure going to be affordable for HEIs, or do we need to think of different models of compute funding?
  • Do funders provide sufficient support for software/data/people required to take advantage of next generation physical infra? If not, how can we change this?
  • What key skills are our DRI teams or users missing that would help them effectively use next generation computing?
  • Does the investment in Tier 1 (Dawn and Isambard) and the lack of a plan for Tier 2 imply a shift in the UK compute ecosystem?
  • What's the most exciting (non-AI) piece of tech or DRI on the horizon?
  • In a resource-constrained environment, is there an argument for doing fewer things well rather than trying to do everything? Will the next generation of DRI use more or less energy?

