Using affordable eye tracking methods to study reading: the role of sampling rate

Talk presented at the 22nd European Conference on Eye Movements, Maynooth University

Bernhard Angele¹, Zeynep Gunes Ozkan², Marina Serrano-Carot¹, and Jon Andoni Duñabeitia¹

1. Universidad Nebrija; 2. University of Valencia

Introduction

Presentation available at: https://bangele.quarto.pub/ecem-2024-sampling-rate

Eye movements are a window into the reading process

  • Recording eye movements has led to a wealth of findings about the cognitive processes involved in reading
  • For example, we know that skilled readers pre-process upcoming words extensively (e.g. McConkie & Rayner, 1975) and that readers can extract orthographic, phonological, and semantic information from upcoming words (e.g. Schotter et al., 2012)

Illustrating the eye-movement gap

  • Data from Scopus: Searching for “eye” and “track(er/ing)” or “movement(s)” and “reading” in the title, abstract, and keywords
  • Publications from 1974 to 2024
  • Counting country affiliations by author (publications can have multiple authors and author affiliations)
  • Excluding affiliations with missing country names
  • Details: Angele & Duñabeitia (2024)

Publications 1974-2024

Publications 1974-2024 (Cartogram)

Evolution over time: 1974 – 2000

Cartogram: 1974 – 2000

Evolution over time: 2001 – 2010

Cartogram: 2001 – 2010

Evolution over time: 2011 – 2020

Cartogram: 2011 – 2020

Evolution over time: 2021 – 2024

Cartogram: 2021 – 2024

Summary: The eye-movement gap

  • For the last 50 years, the literature on eye movements in reading has been dominated by a handful of countries
  • This is still true in the last three years, with one major change: China has become a major center of reading research
  • There are also a few more European countries with a significant number of publications
  • For the vast majority of other countries, the situation now is the same as it was in the 1970s, 80s, and 90s

What are we missing?

  • General issue of WEIRD research (Henrich et al., 2010): Western participants may not be representative of all readers or even the majority of readers

  • English, German, French, Spanish, Italian, etc. are similar in many respects, and all are written in the same script (the Latin alphabet)

  • Studying just one more language (Chinese) has forced us to think about issues such as word segmentation, processing of character components, the relationship between semantic and orthographic processing, and many more.

What are we missing?

  • We can also appreciate many similarities between reading in English (and other alphabetic languages) and Chinese

  • What other aspects of reading do we take for granted?

  • How can we identify universals if we only study a handful of languages?

Why is there so little eye-movement research except in a few countries?

  • Eye-trackers are expensive
  • Modern infra-red based eye-trackers are very complex devices
  • Most expensive component: High-speed, high-resolution cameras
  • Sampling rate is a key bottleneck
  • Can we study reading at lower sampling rates?

Our approach

  • We take a practical approach: What is the lowest sampling rate that still allows us to find evidence of cognitive processing?
  • We need a benchmark effect – a phenomenon that is well-studied and whose existence (and effect size) is clear
  • The word frequency effect on fixation duration is ideal for this

Method

  • 32 participants read 400 sentences in Spanish

  • Eye movements were recorded with an SR Research EyeLink Portable Duo

  • Four sampling rates: 250 Hz, 500 Hz, 1000 Hz, and 2000 Hz (100 sentences each)

  • Frequency manipulation: each sentence has a target word that was manipulated to be either

    • high frequency (mean frequency 47/million)

    • low frequency (mean frequency 2/million)

  • The context up to the target word was identical for both versions of the sentence.

    • Context after the target word was allowed to vary.

Results: Example trial 1000 Hz

  • At 1000 Hz, this trial has 6393 samples.

Results: Example trial 1000 Hz

Crop out end of trial

  • At 1000 Hz, this trial has 6393 samples.

Example trial 1000 Hz

With EyeLink-detected fixations

  • At 1000 Hz, this trial has 6393 samples.

Example trial 500 Hz

With Eyelink fixations

  • At 500 Hz, this trial has 2657 samples.

Example trial 250 Hz

  • With Eyelink fixations
  • At 250 Hz, this trial has 1029 samples.

Results: Example trial 2000 Hz

  • With Eyelink fixations
  • At 2000 Hz, this trial has 11416 samples.

Results: Means

250 and 500 Hz (durations in ms)

                        First fixation duration     Gaze duration
                        Mean    SD                  Mean    SD
250 Hz
  high frequency        242     85                  332     169
  low frequency         245     85                  373     209
  Effect (low − high)   3                           40
500 Hz
  high frequency        238     84                  329     182
  low frequency         246     92                  366     229
  Effect (low − high)   8                           36

1000 and 2000 Hz (durations in ms)

                        First fixation duration     Gaze duration
                        Mean    SD                  Mean    SD
1000 Hz
  high frequency        234     83                  319     171
  low frequency         244     85                  365     222
  Effect (low − high)   10                          45
2000 Hz
  high frequency        230     80                  313     165
  low frequency         236     85                  344     197
  Effect (low − high)   7                           31
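
The two measures in these tables follow the standard definitions in reading research: first fixation duration (FFD) is the duration of the first first-pass fixation on the target word, and gaze duration (GD) is the sum of all first-pass fixations on it before the eyes leave the word. A minimal Python sketch of those definitions, assuming each fixation has already been assigned to a word (the `Fixation` type and `first_pass_measures` are illustrative names, not the analysis code used for this talk):

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Fixation:
    word: int        # index of the fixated word in the sentence (hypothetical coding)
    duration: float  # fixation duration in ms


def first_pass_measures(fixations: List[Fixation], target: int) -> Optional[Tuple[float, float]]:
    """Return (FFD, GD) for the target word, or None without a first-pass fixation."""
    for i, fix in enumerate(fixations):
        if fix.word > target:
            # The eyes moved past the target before fixating it: skipped in first pass.
            return None
        if fix.word == target:
            ffd = fix.duration
            gd = 0.0
            # Sum consecutive fixations on the target until the eyes leave it.
            for later in fixations[i:]:
                if later.word != target:
                    break
                gd += later.duration
            return ffd, gd
    return None  # target never fixated


trial = [
    Fixation(0, 210.0),
    Fixation(2, 240.0),
    Fixation(2, 130.0),
    Fixation(3, 200.0),
    Fixation(2, 180.0),  # regression back to the target: not part of GD
]
print(first_pass_measures(trial, target=2))  # (240.0, 370.0)
```

Because GD sums several fixations, a timing error of a few milliseconds on any single fixation boundary is small relative to the total.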

Evidence for frequency effect in FFD on the target word

Evidence for frequency effect in GD on the target word

Simulating even lower sampling rates

  • We can drop samples to simulate a lower sampling rate

  • For example, if we drop 15 out of every 16 samples from a 2000 Hz trial, we get a simulated 125 Hz trial

  • Recall that this 2000 Hz trial has 11416 samples:

Removing samples

  • If we drop 15 out of every 16 samples, we get the equivalent of a 125 Hz trial (714 out of 11416 samples):

Removing samples

  • Detecting fixations in simulated 125 Hz data:

Removing even more samples

  • We can remove 39 out of every 40 samples to get the equivalent of a 50 Hz trial (286 out of 11416 samples):

Removing even more samples

  • We can remove 63 out of every 64 samples to get the equivalent of a 31.25 Hz trial (179 out of 11416 samples):
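
The sample counts quoted on these slides can be reproduced with a few lines of Python. This is a minimal sketch, assuming that the first sample of every block is the one that is kept (the slides do not say which one was retained); `drop_samples` is an illustrative name and the trial is a placeholder list rather than real gaze data:

```python
from typing import List, Sequence


def drop_samples(samples: Sequence, keep_every: int) -> List:
    """Keep one sample out of every `keep_every` and discard the rest."""
    return list(samples[::keep_every])


original_hz = 2000
n_samples = 11416                # length of the example 2000 Hz trial
trial = list(range(n_samples))   # placeholder for real gaze samples

for keep_every in (16, 40, 64):
    reduced = drop_samples(trial, keep_every)
    print(f"{original_hz / keep_every} Hz: {len(reduced)} of {n_samples} samples")
# 125.0 Hz: 714 of 11416 samples
# 50.0 Hz: 286 of 11416 samples
# 31.25 Hz: 179 of 11416 samples
```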

Simulations: 100 trials per subject

Simulated low sampling rates (durations in ms)

                        First fixation duration     Gaze duration
                        Mean    SD                  Mean    SD
31.25 Hz
  high frequency        214     101                 269     143
  low frequency         222     107                 307     189
  Effect (low − high)   8                           39
50 Hz
  high frequency        221     88                  291     151
  low frequency         228     92                  333     200
  Effect (low − high)   7                           42
125 Hz
  high frequency        233     81                  314     158
  low frequency         237     83                  356     213
  Effect (low − high)   4                           42

Simulated low sampling rates (FFD)

100 trials per subject

Simulated low sampling rates (GD)

100 trials per subject

Simulations: 400 trials per subject

Simulated low sampling rates (durations in ms)

                        First fixation duration     Gaze duration
                        Mean    SD                  Mean    SD
31.25 Hz
  high frequency        214     99                  274     154
  low frequency         225     108                 313     193
  Effect (low − high)   12                          39
50 Hz
  high frequency        223     89                  299     162
  low frequency         233     96                  337     205
  Effect (low − high)   9                           38
125 Hz
  high frequency        234     84                  322     170
  low frequency         241     88                  363     219
  Effect (low − high)   7                           41

Simulated low sampling rates (FFD)

400 trials per subject

Simulated low sampling rates (GD)

400 trials per subject

Discussion

  • The frequency effect is very robust, especially in gaze duration (GD) and total viewing time (TVT)

    • Apparently, not even 30 Hz is too low to detect word frequency effects!
  • The effect is more sensitive to sampling rate in FFD

    • Aggregate measures such as GD and TVT seem to be less affected by random noise
  • This is encouraging for researchers who cannot afford a 1000 Hz eye-tracker!
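
One way to see why small effects are harder to measure at low rates: the time between two samples is 1000 / rate ms, so the onset and offset of a fixation can each only be located to within roughly one sampling interval. The sketch below simply tabulates that interval for the recorded and simulated rates. Reading it as a bound on timing precision is a simplifying assumption, and `sampling_interval_ms` is an illustrative name:

```python
def sampling_interval_ms(rate_hz: float) -> float:
    """Time between two consecutive samples, in milliseconds."""
    return 1000.0 / rate_hz


for rate in (2000, 1000, 500, 250, 125, 50, 31.25):
    print(f"{rate:>7} Hz -> {sampling_interval_ms(rate):g} ms between samples")
```

At 31.25 Hz the interval is 32 ms, which is larger than the FFD effects reported here (under 10 ms) but comparable to the GD effect. That would be consistent with FFD needing more observations at low rates.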

Discussion

  • Simulating low sampling rates from data collected using a very accurate eye-tracker is not the same as actually using an affordable eye tracker.

  • But we have shown that sampling rate is not a hard bottleneck for studying reading.

  • If you are unsure whether your eye-tracker is good enough to study reading, consider running a pilot study that looks for the frequency effect in GD

    • Or wait for us to do it!

Recommendations

  • You can compensate for low sampling rates by increasing sample size

  • In our case, 1,600 observations per condition were enough to detect the frequency effect in GD (about 30 ms) at all sampling rates, similar to the recommendation in Brysbaert & Stevens (2018)

  • To consistently detect the effect in FFD (<10 ms) at lower sampling rates, we needed 6,400 observations per condition
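
These observation counts follow from the design (32 participants, two frequency conditions, 100 or 400 trials per participant). A short sketch of the arithmetic, assuming trials are split evenly between conditions and no trials are excluded; `observations_per_condition` is an illustrative name:

```python
PARTICIPANTS = 32
CONDITIONS = 2  # high vs. low frequency target word


def observations_per_condition(trials_per_participant: int) -> int:
    # Assumes an even split between conditions and no excluded trials.
    return PARTICIPANTS * trials_per_participant // CONDITIONS


print(observations_per_condition(100))  # 1600
print(observations_per_condition(400))  # 6400
```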

Recommendations

  • In summary, if you want to use affordable eye-trackers to study reading:
    • target effects that are large enough
    • choose aggregate measures (GD, TVT, go-past time, etc.)
    • plan to use a large sample
    • consider running a pilot study with a known effect (word frequency) and do a power analysis based on the results
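
For a first feel of how effect size and number of observations trade off before running a proper power analysis, a normal-approximation calculation can help. The sketch below is deliberately crude: it treats all observations as independent and ignores the participant and item structure of a reading experiment, so a simulation with mixed-effects models along the lines of Brysbaert & Stevens (2018) remains the more appropriate tool. The function name and the input values (an 8 ms effect with an SD of 90 ms) are illustrative assumptions, not estimates from this study:

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_ms: float, sd_ms: float, n_per_condition: int,
                 alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two independent means.

    Ignores repeated measures (participants, items): a rough planning aid only.
    """
    z = NormalDist()
    se = sd_ms * sqrt(2.0 / n_per_condition)  # standard error of the difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_ms / se                    # expected value of the test statistic
    # Probability of landing beyond the critical value in either tail.
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Illustrative inputs only: 8 ms effect, SD of 90 ms
for n in (400, 1600, 6400):
    print(n, round(approx_power(8.0, 90.0, n), 2))
```

Replacing the illustrative inputs with pilot estimates shows how quickly the required number of observations grows as the target effect shrinks.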

Thank you

Presentation available at: https://bangele.quarto.pub/ecem-2024-sampling-rate

Example trial 1000 Hz

With fixations according to the Engbert & Kliegl (2003) algorithm

Example trial 1000 Hz

With EyeLink and Engbert & Kliegl (2003) fixations plotted on top of each other

Why investigate this?

  • Most eye tracking labs have eye trackers that can track at 1000 Hz or even 2000 Hz.
    • Why think about lower sampling rates?
  • Eye trackers that can track at 1000 Hz are very expensive.
    • Sampling rate is a “hard” bottleneck
      • Cameras that can record 1000 frames per second are quite rare
      • We can perhaps improve low-quality images using techniques such as machine learning, but we cannot create extra samples
  • The cost of high sampling rate eye trackers limits the study of reading (and other processes)
    • Outside of the lab
    • With more diverse populations
    • With new paradigms (e.g. multi-person)

Removing samples

  • Comparison between the fixations detected in the simulated 125 Hz data and the fixations detected in the original 2000 Hz data.

Simulating low sampling rates

  • Is dropping samples an appropriate way to simulate low sampling rates?
    • Maybe averaging over samples instead of dropping them is more appropriate
  • Let’s compare!
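
The difference between the two approaches can be illustrated on a toy gaze trace. This is a minimal sketch, assuming non-overlapping blocks and a one-dimensional position signal; the function names and the trace are illustrative, and neither method is claimed to match what a real low-speed camera would deliver:

```python
from statistics import fmean
from typing import List


def downsample_drop(samples: List[float], factor: int) -> List[float]:
    """Keep the first sample of every block of `factor` samples."""
    return samples[::factor]


def downsample_average(samples: List[float], factor: int) -> List[float]:
    """Replace every block of `factor` samples with its mean."""
    return [fmean(samples[i:i + factor]) for i in range(0, len(samples), factor)]


# Toy horizontal gaze trace: fixation at 100 px, then an instantaneous
# jump (idealized saccade) at sample 6 to a fixation at 300 px.
gaze_x = [100.0] * 6 + [300.0] * 10

print(downsample_drop(gaze_x, 4))     # [100.0, 100.0, 300.0, 300.0]
print(downsample_average(gaze_x, 4))  # [100.0, 200.0, 300.0, 300.0]
```

Dropping keeps the jump sharp but can shift its apparent timing by up to one block, whereas averaging produces an intermediate position for the block that contains the jump, which can change how a fixation-detection algorithm classifies that sample.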

Simulating low sampling rates: dropping samples

Simulating low sampling rates: averaging samples

Average algorithm: FFD

100 trials/subject

Average algorithm: GD

100 trials/subject

Average algorithm: FFD

400 trials/subject

Average algorithm: GD

400 trials/subject

References

Angele, B., & Duñabeitia, J. A. (2024). Closing the eye-tracking gap in reading research. Frontiers in Psychology, 15. https://doi.org/10.3389/fpsyg.2024.1425219
Bicknell, K., & Levy, R. (2010). A rational model of eye movement control in reading. 1168–1178. http://dl.acm.org/citation.cfm?id=1858800
Brysbaert, M., & Stevens, M. (2018). Power Analysis and Effect Size in Mixed Effects Models: A Tutorial. Journal of Cognition, 1(1). https://doi.org/10.5334/joc.10
Engbert, R., Longtin, A., & Kliegl, R. (2002). A dynamical model of saccade generation in reading based on spatially distributed lexical processing. Vision Research, 42(5), 621–636. https://doi.org/10.1016/S0042-6989(01)00301-7
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). Most people are not WEIRD. Nature, 466(7302), 29–29. https://doi.org/10.1038/466029a
McConkie, G. W., & Rayner, K. (1975). The span of the effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578–587.
Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a model of eye movement control in reading. Psychological Review, 105(1), 125–157. https://doi.org/10.1037/0033-295X.105.1.125
Schotter, E. R., Angele, B., & Rayner, K. (2012). Parafoveal processing in reading. Attention, Perception, & Psychophysics, 74(1), 5–35. https://doi.org/10.3758/s13414-011-0219-2
Snell, J., Van Leipsig, S., Grainger, J., & Meeter, M. (2018). OB1-reader: A model of word recognition and eye movements in text reading. Psychological Review, 125(6), 969–984. https://doi.org/10.1037/rev0000119