Samia Touileb
Position
Associate Professor, Natural Language Processing
Research
Samia Touileb is an Associate Professor in Natural Language Processing (NLP). Before this, she was a researcher in MediaFutures WP5 on Norwegian Language Technologies and a postdoc at the Language Technology Group (LTG), Department of Informatics, University of Oslo. She holds a PhD in NLP from the University of Bergen and has worked on NLP research and applications for almost a decade.
Her main research interests are bias and fairness in NLP, information extraction, summarization, and the application of NLP and machine learning methods to social science research. She works mainly on under- and mid-resourced languages such as Norwegian.
Publications
Academic anthology/Conference proceedings
- Habash, Nizar; Bouamor, Houda; Eskander, Ramy et al. (2024). Proceedings of The Second Arabic Natural Language Processing Conference.
- Galimullin, Rustam; Touileb, Samia (2023). Proceedings of the 5th Symposium of the Norwegian AI Society (NAIS 2023).
- Habash, Nizar; Bouamor, Houda; Hajj, Hazem et al. (2021). Proceedings of the Sixth Arabic Natural Language Processing Workshop.
Poster
- Mahmood, Bilal; Elahi, Mehdi; Vadiee, Farhad et al. (2024). A Supervised Machine Learning Approach for Supporting Editorial Article Selection.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2021). Using Gender- and Polarity-informed Models to Investigate Bias.
- Touileb, Samia; Pedersen, Truls Andre; Sjøvaag, Helle (2018). Automatically identifying names of unrecognized politicians.
- Touileb, Samia; Steskal, Lubos (2015). A computational approach to organize and analyze online communication data.
- Salway, Andrew; Hofland, Knut; Touileb, Samia (2013). Applying Corpus Techniques to Climate Change Blogs.
Academic chapter/article/Conference paper
- Fares, Murhaf; Touileb, Samia (2024). BabelBot at AraFinNLP2024: Fine-tuning T5 for Multi-dialect Intent Detection with Synthetic Data and Model Ensembling.
- Skulstad, Aud Solbjørg; Touileb, Samia (2024). Large Language Models and their usage in EAL education.
- Touileb, Samia; Murstad, Jeanett; Mæhlum, Petter et al. (2024). EDEN: A Dataset for Event Detection in Norwegian News.
- Simon, Étienne; Olsen, Helene Bøsei; You, Huiling et al. (2024). Generative Approaches to Event Extraction: Survey and Outlook.
- Olsen, Helene Bøsei; Touileb, Samia; Velldal, Erik (2023). Arabic dialect identification: An in-depth error analysis on the MADAR parallel corpus.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2023). Measuring normative and descriptive biases in language models using census data.
- Samuel, David; Kutuzov, Andrei; Touileb, Samia et al. (2023). NorBench – A Benchmark for Norwegian Language Models.
- Barnes, Jeremy Claude; Touileb, Samia; Mæhlum, Petter et al. (2023). Identifying Token-Level Dialectal Features in Social Media.
- You, Huiling; Touileb, Samia; Øvrelid, Lilja (2023). JSEEGraph: Joint Structured Event Extraction as Graph Parsing.
- Sheikhi, Ghazaal; Opdahl, Andreas Lothe; Touileb, Samia et al. (2023). Making sense of nonsense: Integrated gradient-based input reduction to improve recall for check-worthy claim detection.
- Sheikhi, Ghazaal; Touileb, Samia; Khan, Sohail Ahmed (2023). Automated Claim Detection for Fact-checking: A Case Study using Norwegian Pre-trained Language Models.
- You, Huiling; Samuel, David; Touileb, Samia et al. (2022). EventGraph: Event Extraction as Semantic Graph Parsing.
- You, Huiling; Samuel, David; Touileb, Samia et al. (2022). EventGraph at CASE 2021 Task 1: A General Graph-based Approach to Protest Event Extraction.
- Touileb, Samia; Nozza, Debora (2022). Measuring Harmful Representations in Scandinavian Language Models.
- Touileb, Samia (2022). Exploring the Effects of Negation and Grammatical Tense on Bias Probes.
- Touileb, Samia (2022). NERDz: A Preliminary Dataset of Named Entities for Algerian.
- Mæhlum, Petter; Kåsen, Andre; Touileb, Samia et al. (2022). Annotating Norwegian language varieties on Twitter for Part-of-speech.
- Kutuzov, Andrei; Touileb, Samia; Mæhlum, Petter et al. (2022). NorDiaChange: Diachronic Semantic Change Dataset for Norwegian.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2022). Occupational Biases in Norwegian and Multilingual Language Models.
- Touileb, Samia; Barnes, Jeremy (2021). The interplay between language similarity and script on a novel multi-layer Algerian dialect corpus.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2021). Using Gender- and Polarity-Informed Models to Investigate Bias.
- Barnes, Jeremy; Mæhlum, Petter; Touileb, Samia (2021). NorDial: A Preliminary Corpus of Written Norwegian Dialect Use.
- Touileb, Samia (2020). LTG-ST at NADI Shared Task 1: Arabic Dialect Identification using a Stacking Classifier.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2020). Gender and sentiment, critics and authors: a dataset of Norwegian book reviews.
- Lison, Pierre; Barnes, Jeremy; Hubin, Aliaksandr et al. (2020). Named Entity Recognition without Labelled Data: A Weak Supervision Approach.
- Adouane, Wafia; Touileb, Samia; Bernardy, Jean-Philippe (2020). Identifying Sentiments in Algerian Code-switched User-generated Comments.
- Rodina, Julia; Bakshandaeva, Daria; Fomin, Vadim et al. (2019). Measuring Diachronic Evolution of Evaluative Adjectives with Word Embeddings: the Case for English, Norwegian, and Russian.
- Barnes, Jeremy Claude; Touileb, Samia; Øvrelid, Lilja et al. (2019). Lexicon information in neural sentiment analysis: a multi-task learning approach.
- Touileb, Samia; Pedersen, Truls Andre; Sjøvaag, Helle (2018). Automatic identification of unknown names with specific roles.
- Velldal, Erik; Øvrelid, Lilja; Bergem, Eivind Alexander et al. (2018). NoReC: The Norwegian Review Corpus.
- Touileb, Samia; Salway, Andrew (2014). Constructions: a new unit of analysis for corpus-based discourse analysis.
Academic article
- Mahmood, Bilal; Elahi, Mehdi; Steskal, Lubos et al. (2024). Can Large Language Models Support Editors Pick Related News Articles?.
- Mahmood, Bilal; Elahi, Mehdi; Vadiee, Farhad et al. (2024). A Supervised Machine Learning Approach for Supporting Editorial Article Selection.
- Blum, Sophie; Koudijs, Raoul; Ozaki, Ana et al. (2023). Learning Horn envelopes via queries from language models.
- Touileb, Samia; Steskal, Lubos (2016). ADIOS LDA: When Grammar Induction Meets Topic Modeling.
- Salway, Andrew; Touileb, Samia (2014). Applying grammar induction to text mining.
- Salway, Andrew; Touileb, Samia; Tvinnereim, Endre (2014). Inducing Information Structures for Data-driven Text Analysis.
Lecture
- Touileb, Samia (2023). Benchmarking the societal and ethical implications of large language model.
- Touileb, Samia (2023). Sosiale og etiske utfordringer med språkmodeller som ChatGPT.
- Touileb, Samia (2023). The Societal and Ethical Implications of Language Models.
- Touileb, Samia (2023). ChatGPT: teknologien, datasettet, og det vi (ikke) vet.
- Touileb, Samia; Fahlvik, Morten; Berg, John Arthur (2023). ChatGPT & AI in education.
- Touileb, Samia (2023). ChatGPT: teknologien, datasettet, og det vi (ikke) vet.
- Touileb, Samia; Schjøll, Anita; Throndsen, Eivind et al. (2023). The Ethics of Large Language Models.
- Touileb, Samia; Åkernes, Hanne Louise (2023). Når kunstig intelligens inntar redaksjonen.
- Touileb, Samia (2023). Demystifying ChatGPT and language models.
- Touileb, Samia; Lemaire, Pauline Marguerite (2023). Big Science Gullgruve eller fallgruve?.
- Touileb, Samia; Duarte, Katherine (2016). Getting to know large newsflows: Automatically induced information structures as keyphrases for news content analysis.
- Touileb, Samia; Elgesem, Dag; Steskal, Lubos (2012). Networks of texts and people.
Popular scientific lecture
- Touileb, Samia (2023). Store språkmodeller: muligheter og utfordringer.
- Goodwin, Morten; Touileb, Samia; Bøhn, Einar Duenger (2023). Blir vi overflødige? En samtale om kunstig intelligens og utdanning.
- Touileb, Samia (2023). Hva er ChatGPT og hvordan fungerer det og lignende verktøy?.
- Touileb, Samia (2023). Sosiale og etiske utfordringer med språkmodeller.
Academic lecture
- Touileb, Samia (2023). Large Language models: What are they, and what are their ethical implications?.
- Sjøvaag, Helle; Pedersen, Truls Andre; Touileb, Samia (2018). Operationalising Diversity for Big Data Policy Research.
- Pedersen, Truls Andre; Touileb, Samia; Sjøvaag, Helle (2017). Finding Voices in the Margins: Computer-Assisted Discovery of Naturally Belonging Names.
- Iversen, Magnus Hoem; Pedersen, Truls Andre; Stavelin, Eirik et al. (2015). Computer supported deliberation and argumentation online. Proposing a system for online argumentation.
- Touileb, Samia (2013). Inducing local grammars from n-grams.
Doctoral dissertation
Projects
OPINION COST action: https://www.cost.eu/actions/CA21129/
MediaFutures: https://mediafutures.no/2021/01/20/postdoc-samia-touileb/
NorDial: https://github.com/jerbarnes/nordial