There are numerous Machine Learning techniques, but they all allow the programmer to do the same remarkable thing. Usually, a human programmer needs to tell the computer explicitly how to complete a task before it is able to do so. Using Machine Learning, the programmer instead defines a learning objective, and allows the computer to search for its own solution. Until recently, Machine Learning was generally inapplicable to Humanities problems. This has changed. The Humanities are becoming more data-rich, computer scientists are becoming more interested in creative applications of AI, and today, Machine Learning is an increasingly important weapon in the armoury of the digital humanist.
In this series of workshops, we will cover some basic concepts of Machine Learning, and see how several Machine Learning algorithms can be applied to Humanities data. Basic familiarity with Python is assumed; you may find it useful to look over some of the training resources here: https://n8cir.org.uk/events/online-training/
Participants are not required to install any software for the course, but they will need a Google Account in order to access Google Colab.
Week 1: Basic Concepts of Machine Learning
In the first session, we will cover basic concepts of Machine Learning, including: supervised vs. unsupervised methods; learning vs. inference; generative vs. discriminative models; metrics (accuracy, precision, recall, and others); the rise of ‘deep learning’. We will then take a look at some of the basic Python packages for Machine Learning: numpy, scikit-learn, gensim and TensorFlow.
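As a taste of the metrics covered in this session, the sketch below uses scikit-learn (one of the packages listed above) to compute accuracy, precision and recall for a hypothetical classifier. The labels are invented toy data for illustration only.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # gold-standard labels (invented)
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]  # a hypothetical classifier's predictions

print("accuracy:", accuracy_score(y_true, y_pred))    # fraction of predictions that are correct
print("precision:", precision_score(y_true, y_pred))  # of the predicted positives, how many were right
print("recall:", recall_score(y_true, y_pred))        # of the true positives, how many were found
```

Note that the three metrics can disagree: here the classifier never predicts a false positive (perfect precision), but it misses one true positive (imperfect recall).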
Week 2: Topic Modelling
In the second session, we will look at one of the most popular ‘big data’ methods in the Humanities: Topic Modelling. Roughly speaking, Topic Modelling allows scholars to determine what texts are about. We will discuss the structure of the most popular Topic Modelling technique (latent Dirichlet allocation), see how to apply it to a corpus of documents with only a few lines of simple code, and then consider the right way to interpret the output of such a model.
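To illustrate the "few lines of simple code" in advance of the session, here is a minimal sketch using scikit-learn's LatentDirichletAllocation on a tiny invented corpus; the session itself may use a different package, and real corpora are of course far larger.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# A tiny invented corpus with two rough themes (seafaring and farming).
docs = [
    "the ship sailed across the stormy sea",
    "sailors watched the sea from the deck of the ship",
    "the farmer ploughed the field at dawn",
    "harvest time brought the farmers to the fields",
]

# Turn the documents into a word-count matrix (a bag-of-words representation).
counts = CountVectorizer(stop_words="english").fit_transform(docs)

# Fit a two-topic LDA model; fixing random_state makes the run repeatable.
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Each document is modelled as a mixture of the two topics:
# doc_topic has one row per document, one column per topic, rows summing to 1.
doc_topic = lda.transform(counts)
print(doc_topic.shape)  # (4, 2): four documents, two topics
```

Interpreting the output — deciding what each inferred "topic" actually means — is exactly the question the session will take up.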
Week 3: Deep Learning (1): Text Generation
In the third session, we will begin our discussion of the most popular of all Machine Learning techniques: Deep Learning, also known as Artificial Neural Networks. In this session, we will consider one of the first ways that Deep Learning was applied to Humanities data: to create generative models of language. As an example, we will build a poetry generator from a corpus of poems, and discuss how language modelling might inform debates about meaning and interpretation.
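The session itself will use neural language models, but the underlying idea — predict the next word from the words so far, then sample — can be previewed with a tiny bigram (Markov-chain) generator in plain Python. This is not a neural network, and the miniature "corpus" below is invented for illustration.

```python
import random
from collections import defaultdict

# Invented miniature corpus; the session will use a real corpus of poems.
corpus = "the rose is red the violet is blue the rose is sweet"
words = corpus.split()

# Count which word follows which (a bigram model of the corpus).
follows = defaultdict(list)
for a, b in zip(words, words[1:]):
    follows[a].append(b)

def generate(start, length, seed=0):
    """Sample a word sequence by repeatedly choosing an observed next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:  # dead end: no observed continuation
            break
        out.append(rng.choice(options))
    return " ".join(out)

print(generate("the", 8))  # e.g. a plausible-sounding recombination of the corpus
```

A neural language model does the same job with a learned probability distribution rather than raw bigram counts, which lets it generalise to word sequences it has never seen.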
Week 4: Deep Learning (2): Word Vectors
In the final session, we will discuss another Deep Learning technique, Word Vectors. Like Topic Modelling, Word Vectors allow the scholar to model the meaning of words directly. But in this case, instead of sorting related words into ‘topics’, the computer tries to represent the meaning of each word individually as a set of numbers. We will apply the popular word2vec algorithm to a corpus of texts, and see what aspects of meaning the model enables us to explore.
When booking onto this course, you will be automatically registered on all sessions. If you are unable to attend all sessions, please do not apply.
As part of the application process, you will be asked to provide a brief explanation of how attending this workshop will benefit your research. You may find it useful to write this piece before attempting to register for the event.
After the application deadline has passed, submissions will be considered, and successful applicants will be offered a place by e-mail. This process will help to ensure that each of the N8 universities is represented at, and benefits from, the course.
Accepting a place and subsequently failing to attend without notifying N8 CIR may affect your eligibility for future N8 CIR events.
This event is only open to those working or studying at one of the N8 Research Partnership universities. Please register using your academic (.ac.uk) e-mail address to help verify your eligibility for this course.