Samia Touileb
Position
Associate Professor, Language Technology
Affiliation
Research groups
Research
Samia Touileb is an associate professor in language technology (Natural Language Processing). Before this she was a researcher at MediaFutures (WP5: Norwegian language technology) and a postdoctoral fellow in the Language Technology Group (LTG), Department of Informatics, University of Oslo. She holds a PhD in language technology from the University of Bergen.
Her main research interests include bias and fairness in language technology models, information extraction, automatic summarization, and applications of language technology and machine learning methods in social science research.
Publications
Professional chapter
Academic chapter/article/conference paper
- Touileb, Samia; Murstad, Jeanett; Mæhlum, Petter et al. (2024). EDEN: A Dataset for Event Detection in Norwegian News.
- Barnes, Jeremy Claude; Touileb, Samia; Mæhlum, Petter et al. (2023). Identifying Token-Level Dialectal Features in Social Media.
- Samuel, David; Kutuzov, Andrei; Touileb, Samia et al. (2023). NorBench – A Benchmark for Norwegian Language Models.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2023). Measuring normative and descriptive biases in language models using census data.
- Sheikhi, Ghazaal; Touileb, Samia; Khan, Sohail Ahmed (2023). Automated Claim Detection for Fact-checking: A Case Study using Norwegian Pre-trained Language Models.
- Sheikhi, Ghazaal; Opdahl, Andreas Lothe; Touileb, Samia et al. (2023). Making sense of nonsense: Integrated gradient-based input reduction to improve recall for check-worthy claim detection.
- Olsen, Helene Bøsei; Touileb, Samia; Velldal, Erik (2023). Arabic dialect identification: An in-depth error analysis on the MADAR parallel corpus.
- You, Huiling; Touileb, Samia; Øvrelid, Lilja (2023). JSEEGraph: Joint Structured Event Extraction as Graph Parsing.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2022). Occupational Biases in Norwegian and Multilingual Language Models.
- Touileb, Samia; Nozza, Debora (2022). Measuring Harmful Representations in Scandinavian Language Models.
- You, Huiling; Samuel, David; Touileb, Samia et al. (2022). EventGraph: Event Extraction as Semantic Graph Parsing.
- You, Huiling; Samuel, David; Touileb, Samia et al. (2022). EventGraph at CASE 2021 Task 1: A General Graph-based Approach to Protest Event Extraction.
- Touileb, Samia (2022). Exploring the Effects of Negation and Grammatical Tense on Bias Probes.
- Mæhlum, Petter; Kåsen, Andre; Touileb, Samia et al. (2022). Annotating Norwegian language varieties on Twitter for Part-of-speech.
- Touileb, Samia (2022). NERDz: A Preliminary Dataset of Named Entities for Algerian.
- Kutuzov, Andrei; Touileb, Samia; Mæhlum, Petter et al. (2022). NorDiaChange: Diachronic Semantic Change Dataset for Norwegian.
- Barnes, Jeremy; Mæhlum, Petter; Touileb, Samia (2021). NorDial: A Preliminary Corpus of Written Norwegian Dialect Use.
- Touileb, Samia; Barnes, Jeremy (2021). The interplay between language similarity and script on a novel multi-layer Algerian dialect corpus.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2021). Using Gender- and Polarity-Informed Models to Investigate Bias.
- Touileb, Samia (2020). LTG-ST at NADI Shared Task 1: Arabic Dialect Identification using a Stacking Classifier.
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2020). Gender and sentiment, critics and authors: a dataset of Norwegian book reviews.
- Lison, Pierre; Barnes, Jeremy; Hubin, Aliaksandr et al. (2020). Named Entity Recognition without Labelled Data: A Weak Supervision Approach.
- Adouane, Wafia; Touileb, Samia; Bernardy, Jean-Philippe (2020). Identifying Sentiments in Algerian Code-switched User-generated Comments.
- Rodina, Julia; Bakshandaeva, Daria; Fomin, Vadim et al. (2019). Measuring Diachronic Evolution of Evaluative Adjectives with Word Embeddings: the Case for English, Norwegian, and Russian.
- Barnes, Jeremy Claude; Touileb, Samia; Øvrelid, Lilja et al. (2019). Lexicon information in neural sentiment analysis: a multi-task learning approach.
- Velldal, Erik; Øvrelid, Lilja; Bergem, Eivind Alexander et al. (2018). NoReC: The Norwegian Review Corpus.
- Touileb, Samia; Pedersen, Truls Andre; Sjøvaag, Helle (2018). Automatic identification of unknown names with specific roles.
- Touileb, Samia; Salway, Andrew (2014). Constructions: a new unit of analysis for corpus-based discourse analysis.
Op-ed
Popular science lecture
- Goodwin, Morten; Touileb, Samia; Bøhn, Einar Duenger (2023). Blir vi overflødige? En samtale om kunstig intelligens og utdanning.
- Touileb, Samia (2023). Hva er ChatGPT og hvordan fungerer det og lignende verktøy?
- Touileb, Samia (2023). Store språkmodeller: muligheter og utfordringer.
- Touileb, Samia (2023). Sosiale og etiske utfordringer med språkmodeller.
Academic anthology/Conference series
Academic article
- Blum, Sophie; Koudijs, Raoul; Ozaki, Ana et al. (2023). Learning Horn envelopes via queries from language models.
- Touileb, Samia; Steskal, Lubos (2016). ADIOS LDA: When Grammar Induction Meets Topic Modeling.
- Salway, Andrew; Touileb, Samia; Tvinnereim, Endre (2014). Inducing Information Structures for Data-driven Text Analysis.
- Salway, Andrew; Touileb, Samia (2014). Applying grammar induction to text mining.
Professional lecture
- Touileb, Samia (2023). ChatGPT: teknologien, datasettet, og det vi (ikke) vet.
- Touileb, Samia; Schjøll, Anita; Throndsen, Eivind et al. (2023). The Ethics of Large Language Models.
- Touileb, Samia (2023). The Societal and Ethical Implications of Language Models.
- Touileb, Samia; Fahlvik, Morten; Berg, John Arthur (2023). ChatGPT & AI in education.
- Touileb, Samia (2023). ChatGPT: teknologien, datasettet, og det vi (ikke) vet.
- Touileb, Samia (2023). Sosiale og etiske utfordringer med språkmodeller som ChatGPT.
- Touileb, Samia; Åkernes, Hanne Louise (2023). Når kunstig intelligens inntar redaksjonen.
- Touileb, Samia; Lemaire, Pauline Marguerite (2023). Big Science Gullgruve eller fallgruve?
- Touileb, Samia (2023). Benchmarking the societal and ethical implications of large language model.
- Touileb, Samia (2023). Demystifying ChatGPT and language models.
- Touileb, Samia; Duarte, Katherine (2016). Getting to know large newsflows: Automatically induced information structures as keyphrases for news content analysis.
- Touileb, Samia; Elgesem, Dag; Steskal, Lubos (2012). Networks of texts and people.
Academic lecture
- Touileb, Samia (2023). Large Language models: What are they, and what are their ethical implications?
- Sjøvaag, Helle; Pedersen, Truls Andre; Touileb, Samia (2018). Operationalising Diversity for Big Data Policy Research.
- Pedersen, Truls Andre; Touileb, Samia; Sjøvaag, Helle (2017). Finding Voices in the Margins: Computer-Assisted Discovery of Naturally Belonging Names.
- Iversen, Magnus Hoem; Pedersen, Truls Andre; Stavelin, Eirik et al. (2015). Computer supported deliberation and argumentation online. Proposing a system for online argumentation.
- Touileb, Samia (2013). Inducing local grammars from n-grams.
Poster
- Touileb, Samia; Øvrelid, Lilja; Velldal, Erik (2021). Using Gender- and Polarity-informed Models to Investigate Bias.
- Touileb, Samia; Pedersen, Truls Andre; Sjøvaag, Helle (2018). Automatically identifying names of unrecognized politicians.
- Touileb, Samia; Steskal, Lubos (2015). A computational approach to organize and analyze online communication data.
- Salway, Andrew; Hofland, Knut; Touileb, Samia (2013). Applying Corpus Techniques to Climate Change Blogs.
Doctoral dissertation
Projects
OPINION COST action: https://www.cost.eu/actions/CA21129/
MediaFutures: https://mediafutures.no/2021/01/20/postdoc-samia-touileb/
NorDial: https://github.com/jerbarnes/nordial