The impact of AI and machine learning on DRI workflows
- As we are not yet sure how this lands re: research integrity, how much do we promote vs encourage reflection re: RIO guidelines?
- How will AI use in learning affect the quality of candidates coming into the field? Do we need to change our entry routes/onboarding to account for this?
- Will AI models replace traditional numerical models?
- What role should DRI professionals take in framing responsible use of AI in research (and education)?
- Data Protection Impact Assessments are a GDPR requirement. How meaningfully are these done for AI projects?
- What are the ethics of AI-assisted code? Is there a difference between code used internally and code published alongside research?
- If we can't tell the difference between human and AI-generated content, are any restrictions around its use moot?
- Where do you think we are between the initial exploratory gold rush and established methods that get applied routinely?
- Watermeyer, Lanclos & Phipps have already published on unethical behaviours of our colleagues re: Gen AI use. Can we help get that genie back in the bottle?
- What do you do when someone believes that Strawberry reasons?
- How culpable are funding bodies in the pressure to use AI for every problem?
- Do we need to buy into that vision, or do we have the appetite to push back? We did it with the Turnitin detection tool; will research do the same?
- AI training is GPU intensive. Are we at risk of institutions falling behind as they cannot keep up with investment in local data centres, GPUs and energy?
- Are we risking all AI/ML tools being seen by the public and governments as just ChatGPT, rather than more sophisticated tools such as preconditioners?
- Is ethical advice/guidance _really_ a DRI activity? Does the deep embedding of AI/ML in research practice (because AI/ML) make it so?
- What do the panel see as the greatest threat from AI? Infra security, human obsolescence, mass media manipulation?
- Providing tools/support for reproducibility is a fundamental part of DRI/RSE teams. What (new) tools do we need to support to enable that for AI/ML workloads?
- Could you give an example where, as a DRI professional, you gave input into the design, adaptation or use of AI in a research process? Are there any learning points?
- To what extent will AI replace us DRI professionals?
- If you replace junior devs with AI now then how will you get senior devs in the future?
- Any thoughts on AI for code generation and code transformation, e.g. for optimising code or porting legacy code to new platforms?
The environmental impacts of infrastructure
- Can we meaningfully use data centres as district heating?
- How does your institution balance the benefits of computing power with the environmental costs - what wins and why?
- Given universities have a small impact compared to industry, is reducing our impact useful in itself or more a learning experience for future industrial employees?
- Is immersion cooling an option at your institution? If yes, why, if no, why not?
- What are your thoughts on large tech companies investing heavily in nuclear fission reactors to power their data centres, e.g. Three Mile Island?
- If we struggle to understand/communicate electricity use, how can we make progress on production and minerals which are more complex issues?
- Can we afford to wait for good measurement before making improvements?
- Should we restrict the size of LLM models used by researchers/students to reduce environmental impact?
- Comment not question, but this website gives a great overview of the UK electricity grid and the carbon impact over the last decade https://grid.iamkate.com/
- Why doesn't all data centre kit have a mandatory logging stream for real time energy use?
- What do you do (or can you do) with a life-expired HPC cluster? The hardware is still very performant, but not nearly as efficient as newer hardware.
- What do suppliers need to tell us about the environmental cost of manufacture to enable better procurement decisions?
- What is the environmental impact of data archiving and cold data storage? What policies have your institutions got in place to manage this?
- What role should governments have in establishing/enforcing sustainable environmental practices?
- Would charging researchers for larger data use be a feasible way to raise awareness of the environmental impacts (incorporating power costs, for example)?
- ML research often aims at incremental improvement of efficiency. How do you hold users accountable for the research value vs environmental cost?
- What are the low-hanging fruit in environmental impact and what can we do today to improve? While also gathering data...
- What is the energy impact of the shift from Fortran/C to Python/R?
- If every job submission on every HPC reported estimated CO2 impact at the top of the output log, would users read it/use it/try to improve their code?
- How much of our data do we actually need to keep?
- When we optimise codes used by researchers, they often just scale problems up so jobs take the same duration. How can we prevent this? Should we?
- Thoughts on N8 and universities collaborating to build a larger data centre for co-locating Tier 3 HPCs to share (financial / ecological) benefits?
- Is anything related to sustainability like energy usage, efficiency, etc. built into grant proposals or business plans generally? Does this tend to cost more?
- Are researchers aware of these challenges? Would they make more conscious use of the HPC if they were?
- So we should put HPCs in the Outer Hebrides near all the wind turbines, and DRI professionals can live in beautiful rural Scotland?
- I'd love to have a shared Data Centre/HPC facility for all Universities - how do we get this going?🙂
- Should we prioritise HPC usage for projects that can prove societal impact?
Security and compliance
- How do you balance data restrictions to comply with GDPR (for example) and other demands surrounding open data and open source?
- Confidentiality is taken very seriously because of penalties. Is there anything that we can do so integrity and availability are not left behind?
- To what extent do you feel that security is forced to be reactive, driven by attackers, at the moment?
- If/When a software supply chain attack occurs and we are no longer able to install untrusted packages from pip etc, how will digital research be possible?
- Fear of supply chain attacks risks shutting down easy sharing of research software. What technical/social things can we do to mitigate the fear & attacks?
- What advice should N8CIR be giving to our institutions' IT services to better cater for the security and compliance aspects of research data?
- How important are human resources (i.e. people with relevant expertise) to the security of systems? What roles and skills are going to be most crucial?
- Care to comment on the link between under-resourcing, burnout and security?
- Security is important, as are environmental concerns, code quality, FAIR principles etc. How do we identify (and help users identify) what is most important without overwhelming them?
- Is it common for others to hear that red teaming would never finish due to so many holes, as I experienced when asking the infosec team at my University?
- What would world IT security look like if zero-day exploits were made public as soon as they are discovered?
- How can we train a broader range of new DRI professionals in minimum good security practices, or encourage them to take up security as a main focus?
Preparing for the next generation of research computing infrastructure
- Is the next generation of research computing infrastructure going to be affordable for HEIs, or do we need to think of different models of compute funding?
- Do funders provide sufficient support for software/data/people required to take advantage of next generation physical infra? If not, how can we change this?
- What key skills are our DRI teams or users missing that would help them effectively use next generation computing?
- Does the investment in Tier 1 (Dawn and Isambard) and the lack of a plan for Tier 2 imply a shift in the UK compute ecosystem?
- What's the most exciting (non-AI) piece of tech or DRI on the horizon?
- In a resource-constrained environment, is there an argument for doing fewer things well rather than trying to do everything? Will the next generation of DRI use more or less energy?