Data analysis

Multiple regression for data missing not at random – R

Analysis of human decision-making with multiple interdependent predictors. Using a panel regression to analyse time-series data in R, with participant entered as a fixed effect.

Unlike in tightly controlled experiments, the conditions that participants experience (e.g. choosing a high-risk option then winning) depend on the participants' own choices. This raises two related statistical issues:

  1. Participants do not contribute equally to all conditions of interest, and many participants have no data at all in some conditions. This problem is exacerbated as we include multiple predictors and interactions.
  2. There can be a dependency between these empty cells and the behaviour of interest, so it can be challenging to separate the (within-participant) effect of interest from irrelevant (between-participant) factors.

A solution that is new to this area, but already established in other fields (such as economics), is to model participant as a fixed effect instead of a random effect. This way each participant acts as their own control, and only contributes to the conditions they experienced.
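As a rough illustration of why this works (not the analysis code itself, which is in R), the sketch below computes a fixed-effects slope for a single predictor by subtracting each participant's own mean from their data, which gives the same slope as entering one dummy variable per participant. It is written in plain Python, and the function name `within_slope` and the toy data are hypothetical, chosen only so that baselines and condition counts differ between participants.

```python
from collections import defaultdict


def within_slope(rows):
    """Fixed-effects (within-participant) slope for one predictor.

    rows: iterable of (participant_id, x, y) tuples.
    """
    by_participant = defaultdict(list)
    for pid, x, y in rows:
        by_participant[pid].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in by_participant.values():
        # Demean within participant: this removes the participant's baseline.
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("no within-participant variation in x")
    return sxy / sxx


# Hypothetical data: every participant has a true effect of +2 for x = 1,
# but different baselines and different numbers of trials per condition.
rows = [
    ("p1", 0, 10), ("p1", 0, 10), ("p1", 0, 10), ("p1", 1, 12),
    ("p2", 1, 4), ("p2", 1, 4), ("p2", 1, 4), ("p2", 0, 2),
    ("p3", 1, 7), ("p3", 1, 7),  # p3 never experienced x = 0
]

fixed_effects_slope = within_slope(rows)

# Ignoring participant (one pooled group) mixes baselines into the estimate.
pooled_slope = within_slope([("all", x, y) for _, x, y in rows])

print(fixed_effects_slope, pooled_slope)
```

In this toy data the within-participant estimate recovers the effect of 2, while the pooled estimate comes out negative because participants with low baselines happen to contribute more trials to the x = 1 condition. Participant p3, who has data in only one condition, adds nothing to the slope rather than distorting it.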

This analysis method has been widely adopted within the Centre for Gambling Research. Code examples (in R) available online:

Extraction of events from a genuine slot machine – Python

Allowed the analysis of gambling behaviour using slot machines instead of laboratory simulations.

Using Python 2.7 and OpenCV, I led the development of a program to extract machine events from a video capture of slot machine gambling. The program processes every frame of video, using OpenCV template matching to look for a ‘Good Luck’ symbol that indicates the start of a bet. Once triggered, it reads the machine balance and bet amount, and records the time at which the bet started, when the spin ended, and what the outcome was. This program has been widely adopted within the Centre for Gambling Research, and has allowed the analysis of fine-grained behavioural data observed on slot machines. Code to be published on GitHub shortly.