Research
My research interests sit at the intersection of applied machine learning, collective intelligence, and robust decision-making under real-world constraints.
See my Master’s Thesis
Effect of Type of Source on Response Change
This experimental study investigates how the type of information source (expert vs anonymous) influences participants’ response changes, confidence levels, and trust. The study focuses on belief updating mechanisms when individuals are exposed to explanations attributed to sources with differing credibility.
- Research question: How does source type affect response change, confidence, and trust?
- Design: Within-subject experimental design using Qualtrics
- Methods: Randomization, counterbalancing, confidence and trust scales
- Key variables: Source type (expert vs anonymous), response accuracy, confidence, trust
- Validity controls: Block randomization, counterbalancing to reduce order effects
Results indicate higher confidence and greater response change when explanations are attributed to expert sources. The study discusses limitations related to fatigue, familiarity with questions, and the need for formal statistical testing.
Relationship Between Personality Traits and Academic Motivation
This study examines the relationship between the Big Five personality traits and academic motivation (intrinsic and extrinsic) among university students at Mohammed VI Polytechnic University. The objective was to identify which personality dimensions are statistically associated with higher levels of student motivation.
- Research question: What is the relationship between Big Five personality traits and academic motivation?
- Sample: 39 university students (after data cleaning)
- Design: Within-group quantitative study using survey instruments
- Measures: Big Five personality traits (matrix scale), intrinsic/extrinsic motivation (7-point Likert)
- Analysis: Normality testing (Shapiro–Wilk), variance testing (Bartlett), ANOVA/Welch tests, logistic regression
Results indicate that conscientiousness is the only personality trait significantly associated with academic motivation. Logistic regression analysis shows that students with conscientious traits are more than twice as likely to be motivated as those without these traits.
The study discusses limitations related to sample size, convenience sampling, and generalisability, and highlights the need for replication on a larger and more representative population.
CoLabCareers: Crowd-Based Evaluation of Professional Content
This project investigates how collective intelligence mechanisms can be used to improve the quality and fairness of professional content evaluation in early-career job markets. The study explores whether peer-based crowd review can reduce bias, encourage constructive feedback, and iteratively improve candidate profiles.
- Research question: How can crowd wisdom be leveraged to evaluate and improve professional content fairly and effectively?
- Approach: Design and implementation of a crowd-based review platform
- Collective intelligence mechanisms: Multi-voting, Bag of Stars (BoS), anonymised peer review
- Key design principles: Anonymity, coopetition, iterative improvement
- Evaluation criteria: Reliability, user satisfaction, and cost-effectiveness
Results indicate that anonymised crowd review reduces perceived bias related to personal attributes (name, gender, nationality) and fosters constructive feedback. Peer-generated ratings and comments enabled iterative improvement of submitted professional content.
Identified limitations include reliance on sustained user engagement, administrative moderation requirements, and scalability challenges. Future work proposes integrating company feedback, incentive mechanisms, and large-scale data analysis to improve robustness and long-term sustainability.
Water Connectivity Issues in Rural Morocco
This study designs and evaluates a policy intervention aimed at improving water access in rural Morocco, focusing on 28 rural communes across the Provinces of Safi, Youssoufia, and Sidi Bennour. The research addresses persistent water scarcity driven by lack of household connections, transportation constraints, and poverty.
- Research questions:
  - Does a motor-tricycle–based water delivery system improve water access?
  - How does improved water access affect subjective well-being and happiness?
  - Does water delivery reduce health risks such as waterborne diseases?
- Intervention: Credit-financed motor tricycles for decentralized water delivery
- Design: Cluster randomized controlled trial (28 communes)
- Methodology: Encouragement design, ITT and LATE estimation
- Power analysis: ICC = 0.1, effect size = 0.4, power = 0.8, α = 0.05
- Outcomes: Water consumption, time savings, well-being, school attendance
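The power-analysis parameters listed above can be turned into a rough sample-size check. The following is a minimal sketch, not the study's actual calculation: it assumes a two-sided comparison of means with a normal approximation, an equal split of the 28 communes (14 per arm), and the standard design effect 1 + (m - 1) × ICC for equal cluster sizes. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_individual(effect_size, alpha=0.05, power=0.8):
    """Per-arm sample size under individual randomization (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2 / effect_size ** 2


def households_per_cluster(effect_size, icc, clusters_per_arm, alpha=0.05, power=0.8):
    """Smallest cluster size m such that k*m / (1 + (m - 1)*icc) reaches the
    individually randomized sample size. Returns None when no m is large enough."""
    n_ind = n_per_arm_individual(effect_size, alpha, power)
    # Rearranged from k*m >= n_ind * (1 + (m - 1) * icc).
    denom = clusters_per_arm - n_ind * icc
    if denom <= 0:
        return None  # too few clusters for this ICC and effect size
    return ceil(n_ind * (1 - icc) / denom)


# Assumed equal allocation: 14 communes per arm (not stated in the study summary).
m = households_per_cluster(effect_size=0.4, icc=0.1, clusters_per_arm=14)
print(m)  # 22 households per commune under these assumptions
```

Under these assumptions, roughly 98 observations per arm would suffice with individual randomization, and clustering raises the requirement to about 22 households in each of the 14 communes per arm. Unequal cluster sizes or attrition would increase this figure.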
The evaluation design explicitly accounts for spillovers by randomizing at the commune level. The analysis plan includes baseline balance checks, descriptive statistics, ITT estimation, covariate adjustment, and sensitivity analyses for attrition and compliance.
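To make the ITT and LATE quantities concrete, here is a minimal sketch of the usual difference-in-means ITT and the Wald (ratio) estimator of the LATE in an encouragement design. The variable names and the toy values are purely illustrative assumptions, not study data, and the sketch ignores covariate adjustment and commune-level clustering.

```python
from statistics import mean


def itt_and_late(assigned, took_up, outcome):
    """assigned, took_up: 0/1 lists per unit; outcome: numeric list per unit.

    Returns (ITT, first stage, LATE), where LATE = ITT / first stage.
    """
    treat = [i for i, z in enumerate(assigned) if z == 1]
    ctrl = [i for i, z in enumerate(assigned) if z == 0]

    # ITT: effect of being assigned (encouraged), regardless of take-up.
    itt = mean([outcome[i] for i in treat]) - mean([outcome[i] for i in ctrl])

    # First stage: effect of assignment on actual take-up.
    first_stage = mean([took_up[i] for i in treat]) - mean([took_up[i] for i in ctrl])
    if first_stage == 0:
        raise ValueError("Assignment does not shift take-up; LATE is undefined.")

    return itt, first_stage, itt / first_stage


# Illustrative toy values only (hypothetical water use per household).
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
took_up = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [30, 30, 20, 20, 20, 20, 20, 20]

itt, first_stage, late = itt_and_late(assigned, took_up, outcome)
print(itt, first_stage, late)  # ITT = 5, first stage = 0.5, LATE = 10.0
```

In an actual analysis of a cluster randomized trial, standard errors would need to account for clustering at the commune level, for example through cluster-robust variance estimation.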
Expected results suggest increased water consumption, reduced time burden, improved subjective well-being, and lower school dropout rates, while not affecting drinking water quality.
Crowd Forecasting for S&P 500 Trading Decisions
This study evaluates whether collective intelligence–based forecasting strategies can outperform conventional technical indicators in predicting short-term movements of the S&P 500 index. The research compares crowd-based aggregation methods with traditional trading indicators using accuracy and probabilistic performance metrics.
- Research question: Can aggregated crowd forecasts outperform traditional technical indicators in market direction prediction?
- Baseline strategies: RSI, Moving Average, MACD
- Crowd-based strategies: Aggregation by share price, trade quantity
- Aggregation models: Mean and median aggregation of crowd signals
- Evaluation metrics: Accuracy and Brier score
- Data: S&P 500 daily trading data (2022)
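The aggregation models and evaluation metrics listed above can be sketched as follows. This is an illustrative example under stated assumptions: each crowd signal is taken to be a probability that the index closes up on a given day, outcomes are coded 1 (up) or 0 (down), and the function names and toy values are hypothetical rather than taken from the study.

```python
from statistics import mean, median


def aggregate(forecasts, method="median"):
    """Combine individual probabilities for one day into a single crowd probability."""
    return median(forecasts) if method == "median" else mean(forecasts)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    return mean([(p - o) ** 2 for p, o in zip(probs, outcomes)])


def accuracy(probs, outcomes, threshold=0.5):
    """Share of days on which the thresholded forecast matches the realized direction."""
    return mean([int((p > threshold) == bool(o)) for p, o in zip(probs, outcomes)])


# Toy values for three days and three forecasters (illustrative only).
daily_forecasts = [[0.9, 0.8, 0.2], [0.1, 0.3, 0.2], [0.6, 0.7, 0.9]]
outcomes = [1, 0, 1]

crowd = [aggregate(day, "median") for day in daily_forecasts]
print(crowd)
print(brier_score(crowd, outcomes), accuracy(crowd, outcomes))
```

The median is less sensitive than the mean to a single extreme forecaster, which is one common reason to prefer it when the crowd is small.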
Results show that conventional technical indicators achieved accuracies below 50%, while crowd-based aggregation methods significantly improved predictive performance. Median aggregation of crowd forecasts achieved the highest performance, with accuracy above 80% and a Brier score of 0.07 (lower is better), indicating accurate probabilistic forecasts.
The study highlights the robustness of aggregated human forecasts compared to single-indicator strategies and discusses limitations related to crowd size, diversity, and temporal stability. Future work proposes integrating larger and more diverse forecaster populations and combining crowd signals with machine learning models.
Error Analysis of the Vision 2030 Deliberation Map
This research internship focused on the systematic analysis of deliberation quality within the Vision 2030 question map. The objective was to identify, classify, and quantify structural and semantic errors in large-scale participatory deliberation data, with the aim of improving the quality, usability, and interpretability of collective intelligence outputs.
- Research objective: How can error taxonomy and frequency analysis improve the quality of large-scale deliberation maps?
- Dataset: 746 user-generated contributions across three themes (Research, Teaching & Student Life, Entrepreneurship & Social Impact)
- Method: Manual qualitative coding using a predefined error taxonomy
- Error taxonomy: 9 major error categories with tagged subtypes (e.g. unclear, misplaced, redundant, insubstantive)
- Analysis: Frequency analysis and cross-category comparison
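A frequency analysis of this kind can be sketched in a few lines. The example below assumes each contribution carries a set of error tags (possibly empty) from the manual coding; the data structure, function name, and toy tags are hypothetical illustrations, not the internship's actual pipeline.

```python
from collections import Counter


def tag_frequencies(contributions):
    """contributions: list of tag collections, one per contribution.

    Returns {tag: (count, share of contributions carrying that tag)},
    ordered from most to least frequent.
    """
    # set(tags) ensures a tag is counted once per contribution.
    counts = Counter(tag for tags in contributions for tag in set(tags))
    total = len(contributions)
    return {tag: (count, count / total) for tag, count in counts.most_common()}


# Toy coded contributions (illustrative only).
coded = [{"unclear"}, {"unclear", "misplaced"}, set(), {"redundant"}]
for tag, (count, share) in tag_frequencies(coded).items():
    print(f"{tag}: {count} ({share:.0%})")
```

Because a contribution can carry several tags, the shares need not sum to 100%.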
Results show that nearly half of the contributions were tagged as unclear, primarily due to missing or incomplete titles and descriptions. Other frequent issues included misplaced content, redundancy, and insubstantive responses. These findings highlight structural weaknesses in unconstrained deliberation processes and the need for improved guidance and validation mechanisms.
Based on the analysis, the study proposes several improvements, including stricter hierarchy enforcement, multi-stage cleaning pipelines, respondent randomization to mitigate fatigue, and the potential use of automated or AI-assisted detectors to identify low-quality or malformed contributions.
Participatory Budgeting in South Dublin County Council (€300K Have Your Say)
This case study analyses Ireland’s first local participatory budgeting (PB) initiative, “€300K Have Your Say”, implemented by the South Dublin County Council in 2017. The study examines how participatory mechanisms enable citizens to directly influence public spending decisions and evaluates the democratic quality of the process.
- Research objective: To evaluate the effectiveness and democratic quality of a local participatory budgeting process
- Context: Allocation of €300,000 in public funds across local community projects
- Methods: Qualitative case study based on Participedia framework and official evaluation reports
- Evaluation framework: Democratic Institutional Goods (Graham Smith, 2009)
- Criteria analysed: Inclusiveness, popular control, considered judgment, transparency, efficiency, transferability
The analysis highlights strong citizen engagement, high participation in workshops and voting, and positive perceptions of transparency and ownership. However, limitations were identified regarding outreach to marginalized groups, clarity of project eligibility criteria, voting integrity, and feedback mechanisms for unsuccessful proposals.
The study concludes that while the pilot was largely successful and widely supported by participants, future PB initiatives would benefit from enhanced scoping, clearer communication, improved data integrity safeguards, and stronger mechanisms to support considered judgment and inclusivity.
Transmission of Argument Forms in Scientific and Fictional Narratives
This experimental study investigates how different logical argument forms are transmitted, transformed, and preserved across transmission chains in both scientific and fictional narratives. The research examines whether specific argument structures are more robust to information loss during repeated retelling.
- Research question: How do different argument forms influence information loss and transformation during transmission?
- Design: Transmission chain experiment with repeated retelling
- Argument forms tested: Modus Ponens, Modus Tollens, Hypothetical Syllogism, Disjunctive Syllogism
- Conditions: Scientific vs fictional narratives; positive vs negative valence
- Pre-registration: Hypotheses and analysis plan registered on OSF
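For reference, the four argument forms tested can be written in standard propositional notation. These are the textbook definitions, with $p$, $q$, $r$ as generic propositions; the wording of the actual experimental stimuli is not reproduced here.

```latex
\begin{align*}
\text{Modus Ponens:}           &\quad p \rightarrow q,\; p \;\vdash\; q \\
\text{Modus Tollens:}          &\quad p \rightarrow q,\; \neg q \;\vdash\; \neg p \\
\text{Hypothetical Syllogism:} &\quad p \rightarrow q,\; q \rightarrow r \;\vdash\; p \rightarrow r \\
\text{Disjunctive Syllogism:}  &\quad p \lor q,\; \neg p \;\vdash\; q
\end{align*}
```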
The study hypothesizes that argument forms differ in their susceptibility to information loss, with Modus Ponens expected to exhibit greater stability and lower rates of invalid transformation. Additional hypotheses examine whether arguments tend to converge toward simpler logical forms during transmission.
By analysing distortions and transformations across transmission chains, the research contributes to understanding how reasoning structures persist or degrade in cultural communication. The findings have implications for science communication, moral storytelling, and the design of robust explanatory narratives.
Sentiment Analysis of Video Game Reviews Using LSTM Models
This study investigates the use of deep learning–based sentiment analysis to extract structured insights from large-scale, user-generated product reviews. Focusing on video game reviews from the Steam platform, the research evaluates whether Long Short-Term Memory (LSTM) models can effectively capture sentiment polarity in noisy, informal textual data.
- Research objective: To evaluate the effectiveness of LSTM architectures for sentiment classification in informal, user-generated text
- Dataset: Steam video game reviews covering 64 game titles
- Preprocessing: Tokenization, lemmatization, normalization, stop-word removal, noise reduction
- Feature representation: Word embeddings
- Model: Long Short-Term Memory (LSTM) neural network
- Evaluation metrics: Accuracy, precision, recall
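The three evaluation metrics listed above can be computed directly from predicted and true labels. The following is a minimal sketch assuming binary sentiment labels (1 = positive, 0 = negative); the function name and toy labels are hypothetical and do not come from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels (illustrative only).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(classification_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters when positive and negative reviews are imbalanced, since accuracy alone can look high for a model that favours the majority class.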
Results indicate that LSTM-based models outperform simpler baselines in capturing sentiment polarity in long and context-dependent reviews, demonstrating robustness to long-range dependencies in sequential text data. The findings support the use of recurrent neural architectures for sentiment analysis in domains characterized by informal language and heterogeneous writing styles.
Identified limitations include sensitivity to slang, emojis, spelling errors, and mixed-language content. Future research directions include multilingual modeling, improved normalization strategies, and integration of sentiment outputs into higher-level collective intelligence and decision-support systems.
Effect of Type of Source on Response Change
This experimental study investigates how the perceived source of an explanation (expert vs anonymous) influences response change, confidence updating, and trust. The research examines belief revision processes when participants are exposed to identical explanations attributed to sources with differing credibility.
- Research questions:
  - How does source type influence response change after receiving an explanation?
  - How does confidence updating differ between expert and anonymous sources?
  - How does perceived trust vary across source conditions?
- Design: Between-subject experimental design
- Sample: 50 participants (pilot study)
- Platform: Qualtrics; recruitment via Prolific
- Stimuli: Identical explanations attributed to expert vs anonymous sources
- Controls: Randomization, counterbalancing, confidence and trust scales (1–7)
- Measures: Response change, confidence before/after, trust in source
Results show that explanations attributed to expert sources lead to higher confidence updates and greater trust compared to anonymous sources, while response change occurs frequently in both conditions. The findings support models of belief updating in which source credibility plays a central role.
Limitations include the small sample size, question difficulty effects, and potential participant fatigue due to repeated exposure. The study outlines future work involving larger samples, statistical significance testing, and refined stimulus difficulty calibration.
Effects of Smoking, BMI, Age, and Their Interactions on Insurance Charges
This study investigates how individual characteristics and their interactions influence health insurance charges. Using a publicly available U.S. insurance dataset, the research focuses on the joint effects of smoking behaviour, body mass index (BMI), and age, with particular attention to interaction effects that are not captured by additive models.
- Research questions:
  - Which variables have the strongest association with insurance charges?
  - Do interactions between smoking and BMI significantly affect insurance costs?
- Dataset: Kaggle health insurance dataset (1,338 observations, 7 variables)
- Preprocessing: Variable recoding, centering (age, BMI), categorical encoding
- Methods: Multiple linear regression, polynomial regression, interaction terms
- Model selection: AIC-based comparison across competing models
Exploratory analysis revealed strong correlations between insurance charges and smoking status, as well as notable interaction effects between smoking and BMI. Regression results show that while BMI has a modest effect for non-smokers, its impact increases substantially among smokers, indicating a multiplicative rather than additive relationship.
Models incorporating interaction terms achieved lower AIC values than additive models, supporting the inclusion of interaction effects. The final specification highlights smoking status as the dominant predictor, with age and BMI contributing additional explanatory power through interaction with smoking behaviour.
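The role of the interaction term can be illustrated with a simplified specification. This is an assumed form for exposition only: the final model may include polynomial terms or further covariates, and the symbols $\beta_0,\dots,\beta_4$ are introduced here rather than taken from the study. Here $\text{smoker}_i \in \{0, 1\}$, and $\text{bmi}^{c}_i$ and $\text{age}^{c}_i$ denote the centered variables.

```latex
\begin{align*}
\text{charges}_i &= \beta_0 + \beta_1\,\text{smoker}_i + \beta_2\,\text{bmi}^{c}_i
                  + \beta_3\,\text{age}^{c}_i
                  + \beta_4\,(\text{smoker}_i \times \text{bmi}^{c}_i) + \varepsilon_i \\[4pt]
\frac{\partial\,\text{charges}_i}{\partial\,\text{bmi}^{c}_i}
                 &= \beta_2 + \beta_4\,\text{smoker}_i \\[4pt]
\text{AIC}       &= 2k - 2\ln\hat{L}
\end{align*}
```

Under this form, the BMI slope is $\beta_2$ for non-smokers and $\beta_2 + \beta_4$ for smokers, so a large positive $\beta_4$ corresponds to the pattern described above. In the AIC formula, $k$ is the number of estimated parameters and $\hat{L}$ the maximized likelihood; the model with the lower AIC is preferred.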
Limitations include the absence of key confounding variables such as income, occupation risk, lifestyle factors, and chronic health conditions, preventing causal attribution. The analysis therefore focuses on association rather than causal inference.