Inter-faculty series of lectures: Computers, people, language
Winter term 2023/2024
We cordially invite you to attend our highly topical series of lectures, organised for the second time across faculties by colleagues from the Institute for German Studies/Dutch Studies and the Department of Computing Science. It is situated at the interface between linguistics/corpus linguistics and Computing Science.
On the one hand, UOL experts working at this interface from Faculties II (Department of Computing Science), III (Linguistics) and VI (Medicine and Health Sciences) will participate.
On the other hand, experts (from universities and the Fraunhofer IDMT) from the fields of computational linguistics/digital linguistics, corpus linguistics, digital language processing (including speech therapy), language models and language interfaces will participate. There will also be lectures with a school-related/didactic perspective on these topics.
The lecture is intended to strengthen students' key competences in the field of digital research methods for language data collection, processing and analysis as well as in the field of application perspectives (in science, but also in education). Students should thus gain an interdisciplinary insight into research and application perspectives on the interface between linguistics/language technology/computing science and be able to reflect on its opportunities and limitations. It is also intended to contribute to stronger networking between the various disciplines and perspectives involved.
The course is located in the area of specialisation and can be attended by students from all Schools. In the Bachelor's programme, it is assigned to modules pb331 and pb332 (Key Competences in Linguistics and Literary Studies and their Professional Fields), in the Master's programme to module ipb611 (Free Module). Students can find the course under the course number 10.31.501 in Stud.IP.
Interested colleagues from the university are also welcome to attend the series of lectures, as are external guests.
The lecture Computer - Human - Language is offered digitally and takes place on Mondays from 14:15 to 15:45. A 60-minute lecture is followed by a 30-minute discussion of the respective topic.
16 October 2023
Prof Dr Wolfram Wingerath
(Institute for Computing Science, University of Oldenburg)
What You Say is What You Get - Handsfree Coding in 2023
Software for interpreting and synthesising natural language is used by millions of people every day who use smart home assistants or simply prefer dictating over typing on their mobile phones. But while hands-free interfaces have found widespread adoption among consumers, IT professionals still mostly consider them gimmicks or do not consider them at all for the purpose of software development - unjustly so!
In this presentation, I will describe a zero-cost setup for hands-free coding that is extremely powerful and convenient to use. First, I will cover the basics of how to control your computer and navigate applications using just your voice (and eyes!), sharing best practices and common pitfalls. To jumpstart the learning process, I will then demonstrate how to use your own voice command library as dog food by extending the default functionality in a brief voice-coding session. Throughout the talk, I will share my personal experiences and explain how I tweaked my own setup to accommodate customised workflows. I will close the talk with how I use hands-free coding at my job and a list of pointers for getting started.
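The idea of a voice command library that can extend itself can be illustrated with a toy sketch. The dispatcher below is purely illustrative (all names are hypothetical, not the speaker's actual tooling): it stands in for what dedicated voice-control frameworks do, namely map recognised phrases to actions and pass any remaining words on as arguments.

```python
# Purely illustrative voice-command dispatcher: spoken phrases are
# matched against registered commands, and any remaining words are
# passed to the handler as arguments.
commands = {}

def command(phrase):
    """Register a handler function for a spoken phrase."""
    def register(fn):
        commands[phrase] = fn
        return fn
    return register

@command("new line")
def new_line():
    return "\n"

@command("say")
def say(*words):
    # Dictation-style command: insert the remaining words verbatim.
    return " ".join(words)

def dispatch(utterance):
    """Match the longest registered prefix of the utterance and call it."""
    words = utterance.split()
    for length in range(len(words), 0, -1):
        phrase = " ".join(words[:length])
        if phrase in commands:
            return commands[phrase](*words[length:])
    raise KeyError(f"no command matches {utterance!r}")
```

A real setup would connect this to a speech recogniser and an editor; the point is only that commands are ordinary data, so new ones can be registered at runtime, including by voice.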
Literature tip: Wingerath, Wolfram/Gebauer, Michaela (2021): Talking is the new clicking. Software development without mouse and keyboard. In: iX (9). pp. 70-73.
23 October 2023
M.Sc. Cennet Oguz
(German Research Centre for Artificial Intelligence, DFKI)
Word embeddings and language models with NLP tasks
In this lecture, we will learn how word and sentence vectors are extracted for downstream NLP tasks. We will start with one-hot word encoding, move on to averaged word embeddings such as Word2Vec, and end with contextualised word embeddings. We will also use the NLP tasks of named-entity recognition and sentiment classification to analyse the word embeddings.
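To make the progression concrete, here is a minimal NumPy sketch contrasting sparse one-hot encodings with dense, averaged word embeddings. The dense vectors are random stand-ins for illustration, not trained Word2Vec embeddings.

```python
import numpy as np

# Toy vocabulary for a minimal contrast between one-hot encoding and
# dense word embeddings (the dense vectors are random stand-ins;
# Word2Vec would learn them from co-occurrence statistics).
vocab = ["the", "cat", "sat", "mat"]
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Sparse encoding: a single 1 at the word's vocabulary index."""
    v = np.zeros(len(vocab))
    v[index[word]] = 1.0
    return v

# Dense embeddings with illustrative dimension 3.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 3))

def sentence_vector(words):
    """Average the word embeddings: a simple sentence representation."""
    return embeddings[[index[w] for w in words]].mean(axis=0)

oh = one_hot("cat")                   # sparse, vocabulary-sized
sv = sentence_vector(["the", "cat"])  # dense, low-dimensional
```

The contrast in dimensionality (vocabulary size versus a small fixed dimension) is exactly what makes dense embeddings useful for downstream tasks.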
Literature tip: Sahlgren, Magnus (2008): The distributional hypothesis. In: Rivista di Linguistica (20)1. pp. 33-53.
30 October 2023
Dr Kyoko Sugisaki
(Digital Linguistics, University of Oldenburg)
Computers, people and language
This lecture introduces the topic of the series of lectures "Computers, Humans, Language". What does interdisciplinary research at the interface between computers, humans and language look like? I will approach this question using examples from my research in linguistics (corpus linguistics), computational linguistics and human-machine interaction (HMI). There will also be an outlook and an initial placement of the upcoming lectures in this series. The series of lectures and our own research projects make clear how different the individual scientific fields can be in their objectives, methods, ways of thinking and discourses. This tension poses some challenges for research. At the same time, it opens up interesting insights and explanations in linguistics, new approaches in language technology and new objectives in HMI.
Literature tip: Sugisaki, Kyoko (2022): How users solve problems with presence and affordance in interactions with conversational agents: A wizard of Oz study.
06 November 2023
Dipl.-Ing. Hannes Kath
(Institute for Computing Science, University of Oldenburg)
Lost in Dialogue: A Review and Categorisation of Current Dialogue System Approaches and Technical Solutions
Dialogue systems are an important and very active research area with many practical applications. However, researchers and practitioners new to the field may have difficulty with the categorisation, number and terminology of existing free and commercial systems. Our paper aims to achieve two main objectives. Firstly, based on our structured literature review, we provide a categorisation of dialogue systems according to the objective, modality, domain, architecture, and model, and provide information on the correlations among these categories. Secondly, we summarise and compare frameworks and applications of intelligent virtual assistants, commercial frameworks, research dialogue systems, and large language models according to these categories and provide system recommendations for researchers new to the field.
Literature tip: Kath, Hannes/Lüers, Bengt/Gouvêa, Thiago S./Sonntag, Daniel (2023): Lost in Dialogue: A Review and Categorisation of Current Dialogue System Approaches and Technical Solutions. In: Seipel, Dietmar/Steen, Alexander (eds.): KI 2023: Advances in Artificial Intelligence. 46th German Conference on AI, Berlin, Germany, September 26-29, 2023, Proceedings. Cham: Springer. pp. 98-113.
The document can be found at https://link.springer.com/chapter/10.1007/978-3-031-42608-7_9 and can be downloaded free of charge by UOL students with "Access via your institution".
13 November 2023
M.Sc. Stefan Gerd Fritsch
(German Research Centre for Artificial Intelligence, DFKI)
How Do Computers Learn to "Understand" Natural Language?
Today, artificial neural networks are among the central tools for the machine processing of natural language (NLP). Their areas of application range from machine translation, spam filtering, voice assistants and search engines to automatic spell checking. But how exactly do these artificial neural networks work? Are they really "intelligent"? How do they learn from data? And how can a computer develop a deep "understanding" of language?
This lecture aims to get to the bottom of these questions. In addition to an introduction to the basic functioning of neural networks, we will look at their many possible applications in the field of language processing. Furthermore, we will take a look at the latest advances in the field of so-called Large Language Models (LLMs) with billions of parameters, such as ChatGPT.
The talk will conclude with a discussion of the concept of "intelligence" in the context of machine learning and an in-depth examination of the ethical issues and challenges that have arisen, particularly since the advent of increasingly powerful LLMs. Particular attention is paid to the responsibility and careful handling of technologies such as ChatGPT, which not only harbour enormous potential, but also potential risks and dangers.
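The basic functioning of such a network can be sketched in a few lines of NumPy: a feedforward pass with one hidden layer. The weights here are random and untrained, and the dimensions are purely illustrative.

```python
import numpy as np

# Minimal feedforward network: one hidden layer with a nonlinearity.
# Weights are random stand-ins; training would adjust them by gradient
# descent on a loss computed over example data.
rng = np.random.default_rng(42)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Illustrative dimensions: a 4-dimensional input (e.g. a tiny word
# vector), 8 hidden units, 2 output classes (e.g. spam / not spam).
W1 = rng.normal(size=(8, 4)); b1 = np.zeros(8)
W2 = rng.normal(size=(2, 8)); b2 = np.zeros(2)

def forward(x):
    """One forward pass: linear map, nonlinearity, linear map, softmax."""
    h = relu(W1 @ x + b1)
    return softmax(W2 @ h + b2)

probs = forward(np.array([0.5, -0.1, 0.3, 0.9]))  # class probabilities
```

Large language models differ from this sketch mainly in scale and architecture (billions of parameters, attention layers), not in the basic building block of learned weighted sums followed by nonlinearities.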
Literature tip: Lanz, Markus/ Precht, Richard David (2023): Podcast - ChatGPT and AI - does everyone really benefit?
20 November 2023
Prof Dr Sara Rezat & Dr Sebastian Kilsbach
(Institute for German Studies and Comparative Literature, Paderborn University)
Annotation and rating of argumentative learner texts for the generation of automated feedback
Arguing is an important communicative cultural competence. In terms of acquisition, written argumentation, and in particular the ability to take up and refute counter-arguments, is often a critical skill across all age groups. One way of promoting argumentation skills in a school context is computer-based feedback.
This is where an ongoing DFG project entitled "Computational Support for Learning Argumentative Writing in Digital School Education" (lead: Sara Rezat/Henning Wachsmuth) comes in. The project is developing algorithmic methods for the automatic analysis of argumentative learner texts with the aim of automatically generating learner-sensitive feedback on the structure of argumentative texts.
The first step is to manually annotate a learner corpus with regard to the argumentative macro- and microstructure. Such coding involves an orientation towards typical patterns at both the macro-structural level (e.g. the organisation of the text into arguments and counter-arguments) and the micro-structural level (e.g. argumentative text procedures); ultimately, these patterns are to be captured algorithmically and processed further. From a computational-linguistic point of view, assessing the content quality of the texts and rating them accordingly is indispensable for mining the texts in a further step.
The presentation will begin with an overview of the project and its objectives. This will be followed by a presentation of the results of the first two phases of the project, namely the annotation and rating of the learner texts. How learner-sensitive automated feedback can be designed will be discussed in the outlook and in the subsequent discussion. The suggested reading below is intended to provide a basis for discussion.
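What a manual annotation of argumentative macrostructure might look like as data can be sketched as follows. The labels and the example text below are illustrative, not the project's actual annotation scheme.

```python
from dataclasses import dataclass

# Hypothetical span-annotation scheme for argumentative macrostructure;
# the labels and example text are illustrative, not the project's tagset.
@dataclass
class Span:
    start: int   # character offset where the unit begins
    end: int     # character offset where the unit ends (exclusive)
    label: str   # macro-structural category of the span

text = ("School uniforms should be mandatory. They reduce bullying. "
        "Some say they limit expression, but rules can allow accessories.")

units = {
    "School uniforms should be mandatory.": "claim",
    "They reduce bullying.": "argument",
    "Some say they limit expression,": "counterargument",
    "but rules can allow accessories.": "rebuttal",
}

# Locate each unit in the text to derive stand-off span annotations.
annotation = [Span(text.index(s), text.index(s) + len(s), lab)
              for s, lab in units.items()]

def label_sequence(spans):
    """Macro-structural pattern of the text as an ordered label sequence."""
    return [s.label for s in sorted(spans, key=lambda s: s.start)]
```

Label sequences of this kind are the sort of pattern that can then be compared against typical argumentative structures or fed to a learning algorithm.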
Literature tip: Wagner, Salome/ Lachner, Andreas (2021): Feedback - Yes, of course? Digital media to promote writing skills. In: leseforum.ch 3/2021.
27 November 2023
Prof Dr Katrin Lehnen
(Institute for German Studies, University of Giessen)
Ghostwriting revisited. On the automation of writing
ChatGPT and similar AI programs are associated with a 'leap' in the automation and hybridisation of multimodal text production, which in many places makes writing look like ghostwriting. This raises a number of questions for the theory and methodology of writing (and reading) (Robinson 2023), which are particularly relevant in educational contexts. Established models in German didactics - process models as well as competence and acquisition models - are potentially no longer adequate for description here, because they focus on the cognitive and linguistic processes of individuals and reduce digitality to tools for the technical realisation of these processes. Starting from some theoretical considerations on the "culture of digitality" (Stalder 2016), the lecture discusses changes in writing brought about by digitalisation and the consequences for research and writing didactics.
Literature tip: Lehnen, Katrin/Steinhoff, Torsten (2023): Digital reading and writing. Appears in: Androutsopoulos, J./Vogel, F. (eds.): Handbuch Sprache und digitale Kommunikation. Berlin/Boston: de Gruyter. Preprint.
04 December 2023
Prof Dr Rico Sennrich
(Institute for Computational Linguistics, University of Zurich)
A look behind the scenes of ChatGPT. Insights from machine translation
Language models such as ChatGPT have established themselves as very flexible tools in language processing, but their popularity also poses challenges for users and developers: Biases or even hallucinated facts in the output cannot be ruled out and require careful handling of language models.
In this lecture, I will shed some light on how language models work, which are technically closely related to machine translation models, and explain how different types of errors occur. In the second part of the lecture, I will report on current research in machine translation that aims to better understand and reduce types of errors such as biases in output or translation hallucination. The talk will end by relating the findings from machine translation back to language models such as ChatGPT.
Literature tip: Vamvas, Jannis/Sennrich, Rico (2021): Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias. In: Association for Computational Linguistics (ed.): Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, pp. 10246-10265.
11 December 2023 (16:15 - 17:45)
Prof. Dr Susanne Boll
(Department of Computing Science, UOL)
Language and AI in the interaction between people and digital technologies
Language plays an important role in the use of digital systems in several respects. On the one hand, it serves as written or spoken input to digital systems and as dialogue between the user and the system. On the other hand, it serves as output for communicating information and accompanies interaction and communication in human-technology interaction.
In this lecture I will present aspects of designing the interaction between computer, human and language. What role does language play in the use of interactive systems, and what forms of access does it enable? Language, fluency, linguistic competence, expressiveness, but also industry-specific vocabulary are of great importance in this design of language in interaction. What role does AI play in this interaction? Results from our research projects in the fields of education, health and public administration illustrate the role that language and dialogue play in shaping human-technology interaction.
Literature tip: Lunte, Tobias/Boll, Susanne (2020): Towards a gaze-contingent reading assistance for children with difficulties in reading. In: Proceedings of the 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '20). Association for Computing Machinery, New York, NY, USA, Article 83. pp. 1-4. doi.org/10.1145/3373625.3418014
18 December 2023
Prof Dr Lisa Schüler
(School of Linguistics and Literary Studies, Bielefeld University)
Keyboard writing: Utilising the potential of digital writing for written language learning
The lecture presents the concept and initial results from the TasDi project (Didactics of keyboard writing and word processing). The project is an international cooperation between Germany (Prof. Lisa Schüler, University of Bielefeld), Switzerland (Dr Nadja Lindauer, University of Teacher Education FHNW) and Austria (Thomas Schroffenegger, University of Teacher Education Vorarlberg). This transnational perspective is interesting because keyboarding is implemented very differently in German, Austrian and Swiss schools.

The TasDi project aims to develop and evaluate learning modules for keyboarding that are based on decidedly linguistic and didactic writing criteria (e.g. word/letter frequencies, writing-system regularities, writing tasks that promote learning). To this end, it brings together findings from writing process research, spelling didactics and writing promotion that have not yet been specifically applied to the teaching of ten-finger typing. The talk demonstrates that both motor and linguistic (in the sense of writing structure) principles are important in developing learning units for keyboard writing. However, analyses of learning materials indicate that there has so far been an imbalance in favour of motor principles.

First drafts of learning units are presented in which motor and linguistic principles are balanced and which are intended to unlock the potential of keyboarding for written language learning. The discussion of the drafts will then also address the question of how the creation of these (linguistically and motorically) optimised learning units can be automated according to various criteria (e.g. the selection of typical German bigrams that address both hands equally, or specific fingers in training).
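The question of automated bigram selection can be illustrated with a small sketch: count letter bigrams in a sample text and keep the frequent ones whose two letters are typed by different hands on a QWERTZ keyboard. The hand assignment follows the standard touch-typing convention; the sample text and cut-off are illustrative.

```python
from collections import Counter

# Sketch: pick frequent letter bigrams whose two letters are typed by
# different hands on a QWERTZ keyboard (standard touch-typing hand
# assignment; sample text and cut-off are illustrative).
LEFT = set("qwertasdfgyxcvb")   # left-hand letters on QWERTZ;
                                # everything else counts as right-hand here

def alternating_bigrams(text, top=5):
    """Most frequent within-word bigrams that alternate hands."""
    counts = Counter()
    for word in text.lower().split():
        counts.update(zip(word, word[1:]))
    return ["".join(bg) for bg, _ in counts.most_common()
            if (bg[0] in LEFT) != (bg[1] in LEFT)][:top]

sample = "die kinder schreiben und lernen das schreiben an der tastatur"
training_bigrams = alternating_bigrams(sample)
```

A real pipeline would replace the sample with frequency data from a large German corpus and could filter by individual fingers instead of whole hands in the same way.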
Literature tip: Schüler, Lisa/Lindauer, Nadja/Schroffenegger, Thomas (2023): Keyboard writing courses - a void in writing didactics? In: MiDU - Media in the German Classroom (5)2. pp. 1-23.
08 January 2024
Philipp Gur
(legiety GmbH, Oldenburg)
Practical application: An AI-based search engine for answering legal questions in the field of tenancy law.
The presentation introduces the software of the legal-tech startup legiety, which uses artificial intelligence and natural language processing to automatically answer lawyers' questions in the field of tenancy law. The aim: the AI-based search engine should reduce lawyers' research time and thus ease and optimise their daily work. For questions such as "Am I allowed to keep a cat in my home?", the search engine very quickly finds relevant legal provisions and court judgements. It is able to analyse huge data sets in a very short time and return correspondingly well-founded results. The presentation will be rounded off with a practical live demo of the software. We will explain how the AI models work and how they are trained, and show how the probability of hallucinations can be minimised and the quality of the answers increased.
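The core retrieval idea behind such a search engine can be sketched with a bag-of-words stand-in: represent the query and each document as vectors and rank documents by cosine similarity. A production system like the one presented would use trained language models as encoders rather than word counts, and the mini "corpus" below is invented for illustration.

```python
import math
from collections import Counter

# Bag-of-words sketch of embedding-based retrieval: the query and each
# document become vectors, and documents are ranked by cosine
# similarity. The documents are invented; a real system would use a
# trained transformer encoder instead of word counts.
documents = {
    "maintenance": "landlord duties: maintain the rented flat in usable condition",
    "breach":      "injunction against use of the flat in breach of contract",
    "pets":        "keeping a cat or small pets in the flat is generally allowed",
}

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def search(query):
    """Return document ids ranked by similarity to the query."""
    q = embed(query)
    return sorted(documents,
                  key=lambda d: cosine(q, embed(documents[d])),
                  reverse=True)

best = search("am i allowed to keep a cat in my flat")[0]
```

Swapping the word-count vectors for dense sentence embeddings keeps the ranking logic identical while letting the system match paraphrases rather than exact words.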
Literature tip: Tunstall, Lewis/von Werra, Leandro/Wolf, Thomas (2022): Natural Language Processing with Transformers. Building Language Applications with Hugging Face. Revised Edition. Sebastopol: O'Reilly (Chapter 7: Question Answering).
15 January 2024
Prof Dr Jörg Peters
(Institute for German Studies, UOL)
Acoustic measures of voice quality and their relevance for multilingualism research
Interest in voice quality spans disciplines as diverse as phonetics, otorhinolaryngology, phoniatrics, musicology and psychology. Voice quality is also becoming increasingly important for speech synthesis, automatic speech recognition and hearing research. Due to the complex relationships between perception, acoustics and physiology of the human voice, research into voice quality is considered an exciting but also challenging field of research.
The lecture provides an overview of acoustic measures for recording vocal variation and illustrates the relevance of voice quality research for linguistics using the example of multilingualism research. To this end, data from a research project will be presented in which acoustic characteristics of the voice were used as indicators of cognitive load in the bilingual use of High and Low German. The lecture will show that this research opens up new ways of assessing the degree of endangerment of regional and minority languages.
Literature tip: d'Alessandro, Christophe (2006): Voice source parameters and prosodic analysis. In: Sudhoff, Stefan et al. (eds.): Methods in Empirical Prosody Research. Berlin/New York: de Gruyter. pp. 63-87.
22 January 2024
Dr rer. nat. Jens E. Appell, M.Sc. Laura Tuschen & Dipl. Ing. Jan Wellmann
(Fraunhofer Institute for Digital Media Technology IDMT)
Speech Processing Technologies for Speech and Language Therapy
Systems based on artificial intelligence (AI) are ubiquitous and are being researched and utilised in all areas of life. The research and development of AI systems in the healthcare sector is accelerating and intensifying. The technological approaches are based on machine learning, neural networks or natural language processing and are integrated into mobile applications (apps), robots and other software environments. Much of the current research is aimed at the development of diagnostic and screening procedures as well as medical imaging.
In contrast, there are few studies on AI-based complementary or supportive speech therapy. Challenges for the research and development of algorithm-based procedures lie, on the one hand, in the standardisation of medical data and the protection of its privacy. It also initially appears contradictory that such procedures require large amounts of data, while the inter- and intra-individual variability of speech-pathological data is high. Finally, AI-based medical systems perform worse in clinical validation in real-life situations than experts (doctors, therapists). The current body of studies also shows that, in this context, more research is being conducted on adults than on children, whose speech additionally shows higher variance owing to physiological development.
Nevertheless, the potential and added value are high: on the one hand, therapy frequency can be increased via app-based training and individualised and direct feedback can be implemented via these systems. On the other hand, screenings as well as therapy progression can be objectified and therapy success can be increased and costs reduced. Both children and adults are addressed.
Literature tips:
Wang, Anran/Xiu, Xiaolei/Liu, Shengyu/Qian, Qing/Wu, Sizhu (2022): Characteristics of Artificial Intelligence Clinical Trials in the Field of Healthcare: A Cross-Sectional Study on ClinicalTrials.gov. In: International Journal of Environmental Research and Public Health (19)20. Article 13691. Available online at: www.mdpi.com/1660-4601/19/20/13691
Askin, Scott/Burkhalter, Denis/Calado, Gilda/El Dakrouni, Samar (2023): Artificial Intelligence Applied to clinical trials: opportunities and challenges. In: Health and Technology (13)2. pp. 203-213. Available online at: link.springer.com/article/10.1007/s12553-023-00738-2
Tuschen, Laura (2022): Use of speech processing technologies in speech and language therapy. In: Speech - Voice - Hearing (46)1. pp. 33-39.
29 January 2024
Prof. Dr rer. nat. Jochem Rieger (Applied Neurocognitive Psychology, University of Oldenburg)
& Prof. Dr Esther Ruigendijk (Dutch Linguistics, University of Oldenburg)
Cortical representations of function versus content words while listening to speech in natural soundscapes at different levels of simulated hearing loss
Listening to naturalistic auditory stimuli elicits changes in brain activity that reflect the processing of speech, from simple acoustic features to complex linguistic processes. We used linguistic features, namely the distinction between function and content words, in a naturalistic listening paradigm to examine differences in stimulus processing between clear and degraded conditions.
We recorded fMRI data from 30 healthy, normal-hearing participants listening to the audio description of the movie "Forrest Gump" in German (Hanke et al., 2014, doi.org/10.1038/sdata.2014.3). The audio movie was presented three times, in three different recording sessions, in eight segments with different levels of simulated hearing loss (CS - clear stimulus, S2 - mild degradation, N4 - heavy degradation). The degraded stimuli were produced according to Bisgaard et al., 2010 (doi.org/10.1177/1084713810379609). In each session, participants listened to the whole movie but with a randomised sequence of stimulus degradation levels.
We used speech annotations for the audio movie provided by Häusler and Hanke (2021, doi.org/10.12688/f1000research.27621.1) to categorise each spoken word into word classes - function words, content words, and rest (hard to classify, like interjections). These categories were then used as regressors to predict BOLD time courses. We added a word duration regressor, and orthogonalised the other regressors. This allowed us to obtain word class specific activation estimates unbiased by average word length differences between classes. In addition, we did another analysis where pronouns were split from function words into a separate regressor.
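The orthogonalisation step can be sketched in NumPy: each word-class regressor is regressed on the word-duration regressor (plus an intercept) and replaced by its residuals, so that variance shared with duration is credited to the duration regressor. The data below are random stand-ins, not the study's regressors.

```python
import numpy as np

# Sketch of regressor orthogonalisation: regress the word-class
# regressor on [intercept, duration] and keep the residuals, so that
# variance shared with word duration is assigned to duration.
rng = np.random.default_rng(1)
n = 200
duration = rng.random(n)                          # word-duration regressor
function_words = 0.5 * duration + rng.random(n)   # correlated stand-in

def orthogonalise(target, reference):
    """Residual of `target` after projecting onto [1, reference]."""
    X = np.column_stack([np.ones_like(reference), reference])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ beta

fw_orth = orthogonalise(function_words, duration)
```

After this step, the orthogonalised regressor carries no linear component of word duration, which is what makes the word-class activation estimates unbiased by average word-length differences.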
Our results indicate distinctive spatial patterns of BOLD activity in response to function versus content words. Function words elicited significant frontal and temporal activations, whereas content words elicited significant parietal activations. Pronouns displayed the same patterns as the other function words, but with larger effect sizes. Word duration explained much of the activity in the temporal lobe, presumably related to early auditory processing. Most activations remained across stimulus degradation levels, but with smaller effect sizes. The highest degradation level N4 showed some additional effects, with a slight increase in motor cortex activity.
Our study reveals distinct but overlapping cortical representations of content and function words when listening to continuous speech in natural soundscapes with different simulated hearing capabilities. It also points out the importance of accounting for differences in average word length between word classes.
Literature tip: Häusler, Christian Olaf/Hanke, Michael (2021): A studyforrest extension, an annotation of spoken language in the German dubbed movie "Forrest Gump" and its audio-description [version 1; peer review: 1 approved, 2 approved with reservations]. In: F1000Research 2021, 10:54. (https://doi.org/10.12688/f1000research.27621.1)
Winter semester 2022/2023 - 1st run of the series of lectures
We cordially invite you to attend a highly topical series of lectures, organised across faculties by colleagues from the Institute for German Studies/Dutch Studies and the Department of Computing Science. It is situated at the interface between linguistics/corpus linguistics and Computing Science.
On the one hand, UOL experts working at this interface from Faculties II (Department of Computing Science), III (Linguistics) and VI (Medicine and Health Sciences) will be invited.
On the other hand, experts (from universities, the Fraunhofer IDMT, from industry) from the fields of computational linguistics, digital linguistics, corpus linguistics, automatic speech recognition, digital speech processing, human-robot/computer interaction will participate.
The lecture aims to strengthen students' key competences in the field of digital research methods for language data collection, processing and analysis as well as in the field of application perspectives (in science, but also in education). Students should thus gain an interdisciplinary insight into research and application perspectives on the interface between linguistics/language technology/computing science and be able to reflect on its opportunities and limitations. It is also intended to contribute to stronger networking between the various disciplines and perspectives involved.
The course is located in the area of specialisation and can be attended by students from all Schools. In the Bachelor's programme, it is assigned to modules pb331 and pb332 (Key Competences in Linguistics and Literary Studies and their Professional Fields), in the Master's programme to module ipb611 (Free Module). Students can find the course under the course number 10.31.501 in Stud.IP.
Interested colleagues from the university are also welcome to attend the series of lectures, as are external guests.
The lecture Computer - Human - Language is offered digitally and takes place on Mondays from 14:15 to 15:45. A 60-minute lecture is followed by a 30-minute discussion of the respective topic.
17 October 2022
Prof Dr Wolfram Wingerath
(Institute for Computing Science, University of Oldenburg)
What You Say is What You Get - Handsfree Coding in 2022
Software for interpreting and synthesising natural language is used daily by millions of people who use smart home assistants or simply prefer dictating to typing on their mobile phones. But while hands-free interfaces have long since become established among consumers, IT professionals still mostly regard them as a gimmick or do not consider them at all for professional use - unjustly so!
In this presentation, I will describe a zero-cost setup for hands-free programming that is extremely powerful and convenient to use. I'll start by explaining the basics of controlling the computer and navigating applications using only your voice (and eyes and facial expressions and more!) and cover best practices and common pitfalls. To speed up the learning process, I will demonstrate how to extend your own voice command library using voice commands in a short coding session. During the talk, I will share my personal experiences and explain how I have adapted my own setup to enable and optimise individual workflows. At the end of the talk, I will discuss how I use hands-free coding in my work and how you can get started with little effort.
Literature tip: Wingerath, Wolfram/Gebauer, Michaela (2021): Talking is the new clicking. Software development without mouse and keyboard. In: iX (9). pp. 70-73.
24 October 2022
Prof Dr Kerstin Fischer
(Department of Design and Communication, Social Design & Interaction Research, University of Southern Denmark)
Doing Linguistics Using Robots
Robots make excellent confederates in the experimental investigation of the meanings and functions of linguistic features, especially interactional and interpersonal effects, which are otherwise hard to identify empirically. This is because robots are controllable in ways humans are not and present exactly the same stimuli for all participants in identical ways, while at the same time being embodied and somewhat unknown interaction partners, which makes methodological constraints natural and acceptable.
In this talk, I present examples from several years of work in which I have used robots to study aspects of language and speech.
Literature tip: Studying Language Attitudes Using Robots
07 November 2022
Prof. Dr Noah Bubenhofer
(German Department, University of Zurich)
"The language" does not exist. The computer shows it.
In linguistics, extensive collections of texts have been analysed quantitatively and qualitatively as so-called "corpora" for some time in order to be able to model language use. This is linked to the premise that the focus of modern linguistics is less on the idea of language as a system and more on language as language use in society. The statistical analysis of large text corpora lends itself particularly well to such a perspective.
What methods are used to analyse such data? And which research questions can they answer? In the lecture, I will present research from various areas, e.g. on language use during the coronavirus pandemic. It will become clear that it makes little sense to assume that "the language" exists and to try to model it; it is more plausible to model and analyse different usages of language. Modern methods of distributional semantics (so-called word embeddings based on neural learning) lend themselves to this - but only if they are accompanied by a corresponding theoretical understanding.
Literature tips: Exploration of semantic spaces in the corona discourse
14 November 2022
M.Sc. Stefan Gerd Fritsch
(Embedded Intelligence Research Department, DFKI Kaiserslautern)
Natural Language Processing: How Do Computers Learn to "Understand" Natural Language?
Today, artificial neural networks are among the most important systems used for machine processing of natural language. Whether language translators, spam filters, language assistants, search engines or automatic spell checkers - we use neural networks every day, often without realising it. But what are artificial neural networks anyway? How does learning from data work? How can computers develop an "understanding" of language? This lecture aims to get to the bottom of these questions (and a few more). In addition to so-called feedforward neural networks, we will also familiarise ourselves with recurrent architectures and their possible applications for language processing. In addition, we will look at the latest developments in the field of transformer-based language models, up to the largest models with several billion parameters. The talk will conclude with a critical discussion of the ethical issues and problems associated with the use of machine learning for language processing.
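A feedforward network is, at its core, just weighted sums passed through nonlinear functions. The tiny sketch below hand-sets the weights of a two-layer network for a made-up spam-scoring task (vocabulary, weights and task are all invented for illustration; real networks learn their weights from data via backpropagation).

```python
import math

# Toy feedforward network: bag-of-words input -> 2 hidden units -> spam score.
VOCAB = ["free", "winner", "meeting", "report"]

W1 = [[1.5, 1.5, -1.0, -1.0],   # hidden unit 1: reacts to "spammy" words
      [-1.0, -1.0, 1.2, 1.2]]   # hidden unit 2: reacts to "work" words
W2 = [2.0, -2.0]                # output: spam evidence minus work evidence

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(text):
    """One forward pass: text -> word counts -> hidden layer -> score in (0, 1)."""
    x = [text.lower().split().count(w) for w in VOCAB]  # bag of words
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)))

print(forward("free winner free"))   # high score -> likely spam
print(forward("meeting report"))     # low score  -> likely legitimate
```

Recurrent and transformer architectures differ in how they consume the input (word by word, or all positions at once via attention), but each layer still reduces to this pattern of weighted sums and nonlinearities.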
Literature tip: Five sources of bias in natural language processing.
21 November 2022
Prof. Dr rer. nat. Jochem Rieger
(Applied Neurocognitive Psychology, University of Oldenburg)
Artificial and biological intelligence: How machine learning can advance our understanding of speech processing in the human brain
Literature tips:
- Generalisable dimensions of human cortical auditory processing of speech in natural soundscapes: A data-driven ultra high field fMRI approach
- Encoding and Decoding Models in Cognitive Electrophysiology (Paper & Tutorial)
- Rapid tuning shifts in human auditory cortex enhance speech intelligibility
- Categorical representation of phonemes in the human superior temporal gyrus
28 November 2022 (16:15 - 17:45)
Prof. Dr Peter Birkholz
(Institute for Acoustics and Speech Communication, TU Dresden)
Articulatory speech synthesis with VocalTractLab
This lecture gives an introduction to articulatory speech synthesis with a focus on the VocalTractLab system. In contrast to the currently widespread methods of concatenative synthesis or speech synthesis based on neural vocoders, articulatory synthesis uses models of the vocal tract, vocal folds, aerodynamics and acoustics, as well as of the control of the model articulators. It is therefore a simulation of the speech production process, with a high degree of flexibility with regard to the speech produced. In the lecture, the individual sub-models of the system will be presented and the new research questions arising from the simulation will be explained. In addition, various applications of articulatory synthesis will be presented.
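The vocal tract and vocal fold models in an articulatory synthesiser are far more detailed than anything sketchable here, but the underlying acoustic idea can be illustrated with the much simpler source-filter picture: a glottal pulse train (the "vocal folds") excites a resonator (one "formant" of the vocal tract). All parameter values below are round numbers chosen for illustration.

```python
import math

RATE = 8000          # samples per second
F0 = 100             # glottal pulse rate (pitch) in Hz
FORMANT = 700        # resonance frequency in Hz
BANDWIDTH = 100      # resonance bandwidth in Hz

# Coefficients of a standard second-order digital resonator.
r = math.exp(-math.pi * BANDWIDTH / RATE)
b1 = 2 * r * math.cos(2 * math.pi * FORMANT / RATE)
b2 = -r * r

def synthesise(n_samples=800):
    """Filter an impulse train through the resonator: y[n] = x[n] + b1*y[n-1] + b2*y[n-2]."""
    out, y1, y2 = [], 0.0, 0.0
    for n in range(n_samples):
        source = 1.0 if n % (RATE // F0) == 0 else 0.0  # glottal impulse train
        y = source + b1 * y1 + b2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out

signal = synthesise()
print(max(abs(s) for s in signal))
```

An articulatory synthesiser replaces this single fixed resonator with a physical model of the whole moving vocal tract, which is what gives it the flexibility described above.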
Literature tips: Modelling Consonant-Vowel Coarticulation for Articulatory Speech Synthesis
05 December 2022
Dipl.-Ing. Hannes Kath
(Institute for Computing Science, University of Oldenburg)
Creation of a machine-annotated language corpus (CARInA): from thesis to academia
This lecture presents the results and the process of a Diplom thesis, from finding a topic to publication. The subject of the thesis was the creation of a German language corpus which can be used for speech synthesis, among other things, and thus requires annotations at different levels. The language material of the corpus was taken from Wikipedia, and the annotations were added automatically using various tools. Typical applications of spoken corpora are the training of neural models for speech recognition, speaker recognition, speech synthesis and speech analysis. The creation of high-quality hand-annotated corpora is both time-consuming and cost-intensive, which is why machine-generated corpora are preferred for many applications.
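Machine annotation means running text through a pipeline of automatic tools instead of labelling it by hand. The sketch below illustrates the idea at its crudest: tokenise a German sentence and attach rough part-of-speech tags via capitalisation and suffix heuristics. The rules are invented for illustration; the corpus described in the talk uses dedicated, far more accurate tools.

```python
import re

def tokenize(sentence):
    """Split into words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def tag(token):
    """Crude heuristic POS tagger (illustration only, not a real tool)."""
    if not token[0].isalnum():
        return "PUNCT"
    if token[0].isupper():
        return "NOUN"        # German nouns are capitalised
    if token.endswith(("en", "st")):
        return "VERB"        # very rough verb-suffix heuristic
    return "OTHER"

def annotate(sentence):
    """One annotation layer: (token, tag) pairs added fully automatically."""
    return [(tok, tag(tok)) for tok in tokenize(sentence)]

print(annotate("die Kinder spielen gern."))
```

A real pipeline chains several such layers (sentence segmentation, tokenisation, POS tagging, phonetic alignment), each produced by a trained tool rather than hand rules - which is exactly what makes large machine-annotated corpora cheap compared to hand annotation.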
Literature tip: CARINA - A Corpus of Aligned German Read Speech Including Annotations
12 December 2022
Dr Thomas Schmidt
(MusicalBits GmbH)
Oral corpora - Manual and automated approaches to conversations and spoken language
Oral corpora - i.e. audio or video recordings of natural, spontaneous interaction that are made accessible for analysis through transcription and annotation - have been collected by linguists since at least the 1950s, for example to analyse variation (dialects) or pragmatic aspects of verbal interaction. In pre-digital times, working with oral corpora meant handling tapes, typewriters and card indexes. Today, there are sophisticated workflows that gradually add information to "digitally born" recordings in manual processes that are partly automated and partly supported by digital tools, so that the recordings can then be used for analyses or passed on via the web - again supported by digital tools. Nevertheless, collecting and indexing spoken language, and corpus-linguistic work with it, remain a time-consuming and complex endeavour.
In my contribution to the series of lectures, I will present the workflow used to build the Research and Teaching Corpus of Spoken German (FOLK) (Schmidt 2016). I will then discuss the role that machine methods such as automatic speech recognition and other methods from computational linguistics, natural language processing or "artificial intelligence" currently play in working with oral corpora, and how they can be used in the future.
Literature tips: Construction and Dissemination of a Corpus of Spoken Interaction - Tools and Workflows in the FOLK project
09 January 2023
M.Sc. Jan Manemann
Dr Daniel Schlitt
(worldiety GmbH Oldenburg)
AudioCAT - Competence testing via voice-controlled computer dialogue systems
Effective communication is the basis for successful teamwork, and it becomes crucial in challenging or stressful moments on the job. The current situation in medical care is emblematic of this problem: the demanding basic requirements are often difficult to meet for teams whose members have different native languages and diverse cultural backgrounds. This is not just about specialist knowledge, but also about personal decision-making processes, which manifest within a team as so-called "soft skills".
This is where the AudioCAT project comes in by developing an automated process for the evaluation of expertise and soft skills. AudioCAT is a joint project between worldiety GmbH, the Fraunhofer IDMT and Flensburg University of Applied Sciences. The project is funded by the Federal Ministry of Education and Research.
The core of the AudioCAT project is human-machine interaction. Test subjects go through the test scenarios as an interactive questionnaire. The respondents' answers are recorded directly via a microphone and transcribed using a speech recognition algorithm. The system can also react to answers and - if possible - ask a meaningful follow-up question.
The transcribed answers from the test scenarios are used to train an artificial intelligence. To train the AI, the answers must first be categorised by a human as good, average or bad. Once trained, the AI should be able to classify new answers itself. It is also important to define which metrics and characteristics of an answer indicate good soft skills and which do not, in order to develop an AI algorithm that is as robust as possible and delivers good results in the long term even without human intervention.
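The supervised-learning loop described here - humans label example answers, a model then generalises to new ones - can be sketched with a minimal bag-of-words classifier. The training examples and the word-overlap scoring below are invented for illustration; AudioCAT's actual data, features and model are not described in this abstract.

```python
from collections import Counter

# Hypothetical hand-labelled training answers (invented examples).
training = [
    ("we should inform the team and check the protocol together", "good"),
    ("ask a colleague and document the decision clearly", "good"),
    ("i would just wait and see what happens", "bad"),
    ("not my problem someone else can handle it", "bad"),
]

def bow(text):
    """Bag of words: word -> count."""
    return Counter(text.lower().split())

# Build one "centroid" (summed word counts) per label from the labelled data.
centroids = {}
for text, label in training:
    centroids.setdefault(label, Counter()).update(bow(text))

def overlap(u, v):
    """How many words two bags share (counting multiplicity)."""
    return sum(min(u[w], v[w]) for w in u)

def classify(answer):
    """Assign the label whose centroid shares the most words with the answer."""
    words = bow(answer)
    return max(centroids, key=lambda label: overlap(words, centroids[label]))

print(classify("we should check with the team first"))  # -> "good"
```

A production system would use learned features rather than raw word overlap, but the division of labour is the same: humans supply the good/average/bad labels once, the model applies them to new transcripts.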
16 January 2023
Dr Kyoko Sugisaki
(Digital Linguistics, University of Oldenburg)
Computer, Man, Language - Research at the interface
What does interdisciplinary research at the interface between computers, humans and language look like? Using examples from my research in the fields of linguistics, computational linguistics and human-machine interaction (HMI), I will provide insights and refer back to previous lectures in the series. It will become clear how different the individual scientific fields can be in terms of their objectives, methods, ways of thinking and discourses. This area of tension poses some challenges for research. At the same time, it opens up exciting insights and explanations in linguistics, new approaches in language technology and new objectives in HMI.
Literature tips:
- Language technology: Building a Corpus from Handwritten Picture Postcards: Transcription, Annotation and Part-of-Speech Tagging
- Linguistics: Tracing Changes in Thematic Structure of Holiday Picture Postcards from 1950s to 2010s
- Human-machine interaction: How users solve problems with presence and affordance in interactions with conversational agents: A wizard of Oz study
23 January 2023
Dr rer. nat. Jens E. Appell & Laura Tuschen
(Fraunhofer Institute for Digital Media Technology IDMT)
Speech technologies for pathological speech
Founded in 2008 by Prof. Dr Dr Birger Kollmeier and Dr Jens Appell, the Fraunhofer IDMT's Site for Hearing, Speech and Audio Technology (HSA) has been working on speech technologies ranging from automatic speech recognition and the prediction of speech intelligibility under various conditions to the development of computer-aided methods for voice and speech analysis for digital assessment and training applications in the fields of speech therapy, language acquisition and language learning.
The lecture introduces the work of the HSA in the field of speech and language and places a special focus on the work of the Assistive Speech and Language Analysis group headed by Laura Tuschen. Basic concepts and approaches to speech assessment from a speech therapy perspective will be presented and their potential for computer-aided assessment will be discussed. The technological developments will be explained, placed in their respective application contexts, and discussed on the basis of the projects and research work currently being carried out in this group. Key technical elements are the use of technologies for automatic speech recognition, classical acoustic measures for the evaluation of voice and, in general, the use of AI methods for analysis. Application scenarios currently being researched include digital training systems with individualised feedback to promote speech intelligibility after a stroke, and automatic assessments of reading skills in primary school pupils to identify individual support needs. Acoustic measures are implemented in automated analysis approaches to assess voice quality and make such systems accessible in healthcare for prevention and diagnostics. In addition, spoken language is being investigated as a biomarker in order to provide indications of pervasive developmental disorders such as autism spectrum disorders via automated analyses and to promote early diagnosis in those affected.
30 January 2023
Prof. Dr Detmar Meurers
(Department of Linguistics, University of Tübingen)
Computational linguistic analysis of the linguistic complexity of reading and learner texts
The complexity of language is relevant from different perspectives: for analysing the readability of texts or for characterising the linguistic competence of individuals on the basis of their language production. In this talk, we present an empirically broad, computational linguistic approach for automatically analysing linguistic complexity, which is also generally usable via a web application (http://ctapweb.com). In addition to introductory analyses, we then discuss some factors that have so far received less attention: How do the properties of a task affect the complexity of the written text? What happens when linguistic complexity meets mathematical complexity in text tasks? Does the linguistic complexity of a reading text influence the complexity of a subsequently written text?
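Two of the simplest complexity measures such systems build on are mean sentence length and the type-token ratio (lexical diversity). The sketch below computes both for two invented example texts; the web application mentioned in the abstract computes hundreds of far more sophisticated measures.

```python
import re

def complexity_measures(text):
    """Return (mean sentence length in words, type-token ratio)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    mean_len = len(tokens) / len(sentences)
    ttr = len(set(tokens)) / len(tokens)
    return mean_len, ttr

simple = "The dog runs. The dog barks. The dog sleeps."
elaborate = "Although the weather deteriorated rapidly, the expedition continued undeterred."

print(complexity_measures(simple))     # short sentences, repeated words
print(complexity_measures(elaborate))  # longer sentence, more distinct words
```

Measures like these feed the research questions above: computing them separately for a task prompt, a reading text and the learner's response makes it possible to ask how complexity in one influences complexity in the other.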
Literature tips:
- Broad linguistic modelling is beneficial for German L2 proficiency assessment
- Secondary school textbook texts on the linguistic test bench: Analysing the complexity of educational language as a function of school type and grade level
- Further publications on this topic: http://www.sfs.uni-tuebingen.de/~dm/complexity.html