
Tallinn, Estonia, courtesy of Maximilian Schich
Overview
I worked at CUDAN Open Lab from September 2019 to June 2023 as a Junior Researcher. CUDAN (Cultural Data Analytics) is a €2.5 million Horizon 2020 project based at Tallinn University, conducting research at the intersection of data science, cultural studies, and computational methods.
My focus was on understanding digital culture practices through quantitative and qualitative analysis, with published results in peer-reviewed journals.
Research Areas
Data Extraction & Analysis
- Conducted large-scale data mining from social media platforms (Instagram, TikTok, Tinder)
- Developed data processing pipelines for cultural data analysis
- Applied statistical methods to identify patterns in digital behaviour
Machine Learning
- Applied machine learning techniques to classify and analyse cultural content
- Built models for understanding social media engagement and self-representation patterns
- Implemented natural language processing methods for text analysis
- Co-authored research on gendered self-representation using ML on Tinder profile data (published in Springer Nature, 2024)
Data Visualisation
- Created publication-quality data visualisations to communicate research findings
- Developed interactive visualisations for exploring cultural datasets
- Used Python visualisation libraries (Matplotlib, Seaborn) and UMAP for dimensionality reduction
Content Analysis
- Investigated digital culture practices on TikTok, Instagram, and Tinder
- Analysed user behaviour patterns and content consumption trends
- Published research on migrant communities on TikTok (International Journal of Communication, 2022)
Methodologies
- Data Mining and Web Scraping
- Machine Learning and Natural Language Processing
- Statistical analysis
- Network Analysis
- Data Visualisation
- Ethnographic digital methods