Lower sampling rates may be the key to closing the eye-movement gap in reading research

Talk presented at the Research colloquium: Computational and experimental psycholinguistics. Uni Stuttgart, SS 2024

Bernhard Angele¹, Zeynep Gunes Ozkan², Marina Serrano-Carot¹, and Jon Andoni Duñabeitia¹

1. Universidad Nebrija; 2. University of Valencia

Introduction

Eye movements are a window into the reading process

  • Eye-tracking has revolutionized the study of reading
  • Recording eye movements allows us to study the reading process in real time
  • In the last 50 years, this has led to a wealth of findings about the cognitive processes involved in reading
  • For example, we know that skilled readers pre-process upcoming words extensively (e.g. McConkie & Rayner, 1975) and that readers can extract orthographic, phonological, and semantic information from upcoming words (e.g. Schotter et al., 2012)
  • Models of eye-movement control in reading have been developed that provide theoretical accounts for many of these findings (e.g. Bicknell & Levy, 2010; Engbert et al., 2002; Reichle et al., 1998; Snell et al., 2018)
  • However, the eye-movement literature is dominated by studies from a small number of countries, investigating reading in a limited number of languages

Illustrating the eye-movement gap

  • Data from Scopus: Searching for “eye” and “track(er/ing)” or “movement(s)” and “reading” in the title, abstract, and keywords
  • Publications from 1974 to 2024
  • Counting country affiliations by author (publications can have multiple authors and author affiliations)
  • Excluding affiliations with missing country names
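The counting rule described above can be sketched as follows. This is only an illustration under assumed inputs: the record layout (a list of publications, each with authors and their affiliation countries) and the function name `count_country_affiliations` are hypothetical and do not reflect the actual Scopus export format or the script used for the maps.

```python
from collections import Counter


def count_country_affiliations(publications):
    """Count one hit per author affiliation country, skipping missing names."""
    counts = Counter()
    for publication in publications:
        for author in publication["authors"]:
            for country in author["countries"]:
                if country:  # exclude affiliations with a missing country name
                    counts[country] += 1
    return counts


# Hypothetical records, for illustration only
example_publications = [
    {"authors": [{"countries": ["Spain"]},
                 {"countries": ["Spain", "United Kingdom"]}]},
    {"authors": [{"countries": [""]},  # missing country, not counted
                 {"countries": ["China"]}]},
]

print(count_country_affiliations(example_publications))
```

Because every author affiliation counts separately, a single publication can contribute several times to the same country.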

Publications 1974-2024

Publications 1974-2024 (Cartogram)

Evolution over time: 1974 – 2000

Evolution over time: 1974 – 2000 (Cartogram)

Evolution over time: 2001 – 2010

Evolution over time: 2001 – 2010 (Cartogram)

Evolution over time: 2011 – 2020

Evolution over time: 2011 – 2020 (Cartogram)

Evolution over time: 2021 – 2024

Evolution over time: 2021 – 2024 (Cartogram)

Summary: The eye-movement gap

  • For the last 50 years, the literature on eye movements in reading has been dominated by a handful of countries
  • This has remained true over the last three years, with one major change: China has become a major center of reading research
  • There are also a few more European countries with a significant number of publications
  • For the vast majority of other countries, the situation now is the same as it was in the 1970s, 80s, and 90s

Why care about sampling rate?

A photo of the Eyelink Portable Duo.

This is an eye tracker

Why care about sampling rate?

A photo of the HTC Vive Pro.

This is an eye tracker, too

Why care about sampling rate?

A photo of an iPad Pro 11

This is an eye tracker, as well!

The role of sampling rate in reading experiments

  • In reading, a typical fixation is around 225 ms long.

  • A typical saccade in reading takes about 30 ms (Rayner, 1998).

  • How many samples are needed to reliably detect a saccade?

    • 30 (1000 Hz)?

    • 15 (500 Hz)?

    • 7 (250 Hz)?

    • 3 (100 Hz)?
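The sample counts in the list above follow from simple arithmetic: the event duration multiplied by the sampling rate. A minimal sketch (the function name `samples_per_event` is made up for this example):

```python
def samples_per_event(duration_ms, sampling_rate_hz):
    """Number of samples recorded during an event of the given duration."""
    return duration_ms * sampling_rate_hz / 1000.0


# A typical 30 ms saccade and a typical 225 ms fixation at four sampling rates
for rate_hz in (1000, 500, 250, 100):
    print(rate_hz, "Hz:",
          samples_per_event(30, rate_hz), "samples per saccade,",
          samples_per_event(225, rate_hz), "samples per fixation")
```

At 250 Hz the result is 7.5, which corresponds to the 7 complete samples given in the list.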

Our approach

  • We take a practical approach: Which is the lowest sampling rate that allows us to find evidence of cognitive processing?
  • We need a benchmark effect – a phenomenon that is well-studied and whose existence (and effect size) is clear
  • The word frequency effect on fixation duration is ideal for this

Word frequency effect

  • First studied by Erdmann and Dodge in 1898 (as cited by Huey, 1908).

    • They found that readers make more pauses (fixations) for difficult material than for easy and familiar material

    • Children also make more pauses

  • Today, we usually measure the word frequency effect using fixation duration (Rayner, 1998)

    • We calculate specific aggregated fixation time measures such as

      • First fixation duration (FFD, the duration of the first fixation on each word)

      • Gaze duration (GD, the duration of the first fixation plus any subsequent refixations on a word)

    • In this talk, I will focus on FFD and GD.

  • In experiments with a word frequency manipulation, the size of the word frequency effect has been estimated as 16 ms in FFD and 29 ms in GD (Inhoff & Rayner, 1986).
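The two measures defined above can be computed from a time-ordered list of fixations. The sketch below assumes each fixation is a pair of word index and duration in ms, and it treats a word as skipped if a later word is fixated first. That skipping rule is a common convention, but it is an assumption here, as is the function name `first_pass_measures`.

```python
def first_pass_measures(fixations, target_word):
    """Return (FFD, GD) in ms for target_word, or (None, None) if skipped.

    fixations: list of (word_index, duration_ms) tuples in temporal order.
    """
    ffd = None
    gd = None
    for word, duration in fixations:
        if ffd is None:
            if word > target_word:
                # Assumed convention: the reader moved past the word first
                return None, None
            if word == target_word:
                ffd = duration  # first fixation on the word
                gd = duration
        elif word == target_word:
            gd += duration  # refixation before leaving the word
        else:
            break  # the eyes left the word: first pass is over
    return ffd, gd


# Hypothetical trial: word 3 receives two first-pass fixations and a later regression
trial = [(1, 210), (2, 180), (3, 240), (3, 150), (4, 200), (3, 300)]
print(first_pass_measures(trial, 3))
```

In this made-up trial, the later regression back to word 3 does not count toward gaze duration because it happens after the eyes have left the word.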

Method

  • 32 participants read 400 sentences in Spanish

  • Eye movements were recorded by an SR Research Eyelink Portable Duo

  • Four sampling rates: 250 Hz, 500 Hz, 1000 Hz, and 2000 Hz (100 sentences each)

  • Frequency manipulation: each sentence has a target word that was manipulated to be either

    • high frequency (mean frequency 47/million)

    • low frequency (mean frequency 2/million)

  • The context up to the target word was identical for both versions of the sentence.

    • Context after the target word was allowed to vary.

Data analysis

  • Details: Please ask!

  • We analyzed the data just like we would with any other eye tracking experiment, using Bayesian linear mixed models (with brms)

Results: Example trial 1000 Hz

  • At 1000 Hz, this trial has 6393 samples.

Results: Example trial 1000 Hz

Crop out end of trial

  • At 1000 Hz, this trial has 6393 samples.

Example trial 1000 Hz

With Eyelink-detected fixations

  • At 1000 Hz, this trial has 6393 samples.

Example trial 500 Hz

With Eyelink fixations

  • At 500 Hz, this trial has 2657 samples.

Example trial 250 Hz

  • With Eyelink fixations
  • At 250 Hz, this trial has 1029 samples.

Results: Example trial 2000 Hz

  • With Eyelink fixations
  • At 2000 Hz, this trial has 11416 samples.

Results: Means

250 and 500 Hz (all values in ms)

                    First fixation duration    Gaze duration
                    Mean     SD                Mean     SD
250 Hz
  high frequency    242      85                332      169
  low frequency     245      85                373      209
  Effect            3                          40
500 Hz
  high frequency    238      84                329      182
  low frequency     246      92                366      229
  Effect            8                          36

1000 and 2000 Hz (all values in ms)

                    First fixation duration    Gaze duration
                    Mean     SD                Mean     SD
1000 Hz
  high frequency    234      83                319      171
  low frequency     244      85                365      222
  Effect            10                         45
2000 Hz
  high frequency    230      80                313      165
  low frequency     236      85                344      197
  Effect            7                          31

Evidence for frequency effect in FFD on the target word

Evidence for frequency effect in GD on the target word

Evidence for frequency and length effect in FFD on all words

Evidence for frequency and length effect in GD on all words

Simulating even lower sampling rates

  • We can drop samples to simulate a lower sampling rate

  • For example, if we drop 15 out of every 16 samples from a 2000 Hz trial, we get a simulated 125 Hz trial

  • For example, remember this trial at 2000 Hz has 11416 samples:

Removing samples

  • If we drop 15 out of every 16 samples, we get the equivalent of a 125 Hz trial (714 out of 11416 samples):

Removing samples

  • Detecting fixations in simulated 125 Hz data:

Removing even more samples

  • We can remove 39 out of every 40 samples to get the equivalent of a 50 Hz trial (286 out of 11416 samples):

Removing samples

  • We can remove 63 out of every 64 samples to get the equivalent of a 31.25 Hz trial (179 out of 11416 samples):
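The sample counts quoted on these slides can be reproduced with a short sketch of the dropping procedure. It assumes that the first sample of every block is the one that is kept. The slides do not say which sample within a block is retained, so that choice and the function names are assumptions.

```python
def drop_samples(samples, keep_every):
    """Keep the first sample of every block of `keep_every` samples."""
    return samples[::keep_every]


def simulated_rate_hz(original_rate_hz, keep_every):
    """Sampling rate that results from keeping one sample in `keep_every`."""
    return original_rate_hz / keep_every


# Stand-in for the 11416 samples of the 2000 Hz example trial
trial_2000hz = list(range(11416))

for keep_every in (16, 40, 64):
    reduced = drop_samples(trial_2000hz, keep_every)
    print(simulated_rate_hz(2000, keep_every), "Hz:", len(reduced), "samples")
```

With this rule, the reduced trial has 714, 286, and 179 samples for the simulated 125 Hz, 50 Hz, and 31.25 Hz rates, which matches the counts given above.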

Simulated low sampling rates: Means

Simulated low sampling rates (all values in ms)

                    First fixation duration    Gaze duration
                    Mean     SD                Mean     SD
31.25 Hz
  high frequency    213      99                273      154
  low frequency     225      108               311      192
  Effect            12                         38
50 Hz
  high frequency    223      89                298      161
  low frequency     232      96                336      204
  Effect            9                          38
125 Hz
  high frequency    233      84                323      172
  low frequency     240      88                362      220
  Effect            7                          40

Simulated low sampling rates (FFD)

Simulated low sampling rates (GD)

Simulating low sampling rates

  • Is dropping samples an appropriate way to simulate low sampling rates?
    • Maybe averaging over samples instead of dropping them is more appropriate
  • Let’s compare!
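The difference between the two approaches can be illustrated on a toy gaze trace. The sketch below assumes that averaging means taking the mean of each consecutive block of samples and that an incomplete final block is discarded. Both details, as well as the function name `average_samples`, are assumptions made for this example.

```python
import numpy as np


def average_samples(samples, block_size):
    """Replace every full block of `block_size` samples by its mean."""
    values = np.asarray(samples, dtype=float)
    n_full = (len(values) // block_size) * block_size  # discard incomplete block
    return values[:n_full].reshape(-1, block_size).mean(axis=1)


# Toy horizontal gaze position (pixels): an instantaneous jump from 100 to 300
x = np.array([100] * 6 + [300] * 6, dtype=float)

dropped = x[::4]                  # dropping: keep one sample in four
averaged = average_samples(x, 4)  # averaging: mean of each block of four

print("dropped: ", dropped)
print("averaged:", averaged)
```

Dropping only ever returns positions that were actually recorded, whereas averaging a block that straddles a saccade yields an intermediate position that the eye never held.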

Simulating low sampling rates: dropping samples

Simulating low sampling rates: averaging samples

Evidence for the frequency effect with averaged samples: FFD

Evidence for the frequency effect with averaged samples: GD

Discussion

  • How low is too low?

    • Apparently, not even about 30 Hz (31.25 Hz, simulated) is too low to detect word frequency effects!
  • Of course, having a very accurate eye tracker and simply dropping samples is not the same as having an imprecise eye tracker with a low maximum sampling rate

    • But the frequency effect is extremely robust
      • Even averaging across every 64 samples (when going from 2000 to 31.25 Hz) does not remove it

Discussion

  • Of course, it is far from clear whether the same is true for other effects commonly observed in eye movements during reading:

    • Word length
    • Predictability
    • Preview benefit?
  • But this result is very encouraging for future research on inexpensive eye tracking technologies!

  • Sampling rate is one of the most important limitations of inexpensive eye tracking

Thank you

Word frequency effect size and distributional effects

  • Rayner & Duffy (1986) found a much stronger effect of 37 ms in FFD and 87 ms in GD, possibly because the sentences they used were more complex and contained lexically ambiguous target words.
  • Staub et al. (2010) noted that fixation time distributions, like those of reaction times, are usually right-skewed and may be better modelled as ex-gaussian distributions rather than normal distributions.
    • Staub et al. investigated the effect of word frequency on both the mean of the Gaussian component (μ) and the exponential parameter (τ).
    • They found similar size frequency effects in FFD (25 ms) and GD (27 ms), but the distributional analysis showed a stronger effect on μ in FFD (16 ms) than in GD (8 ms) and a stronger effect on τ in GD (20 ms) than in FFD (10 ms).
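The link between the distributional parameters and the mean effects can be made concrete. An ex-Gaussian variable is the sum of a normal component with mean μ and an exponential component with mean τ, so its mean is μ + τ. A frequency effect on the mean is therefore approximately the sum of the effects on μ and τ (16 + 10 = 26 ms for FFD and 8 + 20 = 28 ms for GD, close to the reported 25 ms and 27 ms). The simulation below is a sketch: the baseline values of μ, σ, and τ are hypothetical, and only the differences of 16 ms and 10 ms are taken from the FFD figures above.

```python
import numpy as np


def sample_ex_gaussian(mu, sigma, tau, size, rng):
    """Draw values as normal(mu, sigma) plus an exponential with mean tau."""
    return rng.normal(mu, sigma, size) + rng.exponential(tau, size)


rng = np.random.default_rng(1)

# Hypothetical baseline parameters (ms); low frequency adds 16 ms to mu, 10 ms to tau
high = sample_ex_gaussian(mu=190, sigma=30, tau=50, size=200_000, rng=rng)
low = sample_ex_gaussian(mu=206, sigma=30, tau=60, size=200_000, rng=rng)

print("mean high frequency:", high.mean())  # close to 190 + 50
print("mean low frequency: ", low.mean())   # close to 206 + 60
print("effect on the mean: ", low.mean() - high.mean())
```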

Problem: Detecting fixations

  • We can’t use the Eyelink algorithm on the downsampled data

  • Alternative: Use the algorithm from Engbert & Kliegl (2003), as implemented in the saccades package (von der Malsburg, 2019).

  • The results of this algorithm are normally very similar to the built-in Eyelink algorithm.
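The core idea of a velocity-based detection algorithm of this kind can be sketched as follows. This is a simplified illustration and not the implementation in the saccades package: it computes smoothed velocities, estimates their spread with a median-based estimator, and marks samples whose velocity falls outside an elliptic threshold. The function names, the default multiplier `lam=6.0`, and the omission of a minimum saccade duration are choices made for this example.

```python
import numpy as np


def smoothed_velocity(position, sampling_rate_hz):
    """Velocity from a five-sample moving window (two samples on each side).

    The first and last two samples have no full window and stay at 0.
    """
    p = np.asarray(position, dtype=float)
    v = np.zeros_like(p)
    v[2:-2] = (p[4:] + p[3:-1] - p[1:-3] - p[:-4]) * sampling_rate_hz / 6.0
    return v


def median_based_sd(velocity):
    """Spread estimate based on medians, so saccades barely inflate it."""
    v = np.asarray(velocity, dtype=float)
    return np.sqrt(max(np.median(v ** 2) - np.median(v) ** 2, 0.0))


def is_saccade_sample(x, y, sampling_rate_hz, lam=6.0):
    """Boolean array: True where the velocity exceeds the elliptic threshold."""
    vx = smoothed_velocity(x, sampling_rate_hz)
    vy = smoothed_velocity(y, sampling_rate_hz)
    threshold_x = lam * median_based_sd(vx)
    threshold_y = lam * median_based_sd(vy)
    if threshold_x == 0.0 or threshold_y == 0.0:
        raise ValueError("velocity spread is zero; cannot set a threshold")
    return (vx / threshold_x) ** 2 + (vy / threshold_y) ** 2 > 1.0


# Synthetic 1000 Hz trace: fixation, 30-sample saccade, fixation, with small jitter
k = np.arange(430)
x = np.concatenate([np.full(200, 100.0),
                    np.linspace(100.0, 300.0, 30),
                    np.full(200, 300.0)]) + 0.1 * np.sin(0.7 * k)
y = 200.0 + 0.1 * np.cos(1.3 * k)

flags = is_saccade_sample(x, y, 1000)
print("samples marked as saccade:", int(flags.sum()))
```

Fixations are then the stretches of consecutive samples that are not marked as saccades. Because the threshold is derived from the data themselves, the same procedure can be applied to a trace at any sampling rate, although the five-sample window spans a longer time interval at lower rates.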

Example trial 1000 Hz

With fixations according to the Engbert & Kliegl (2003) algorithm

Example trial 1000 Hz

With Eyelink and Engbert & Kliegl (2003) fixations plotted on top of each other

Why investigate this?

  • Most eye tracking labs have eye trackers that can track at 1000 Hz or even 2000 Hz.
    • Why think about lower sampling rates?
  • Eye trackers that can track at 1000 Hz are very expensive.
    • Sampling rate is a “hard” bottleneck
      • Cameras that can record 1000 frames per second are quite rare
      • We can perhaps improve low-quality images using techniques such as machine learning, but we cannot create extra samples
  • The cost of high sampling rate eye trackers limits the study of reading (and other processes)
    • Outside of the lab
    • With more diverse populations
    • With new paradigms (e.g. multi-person)

Removing samples

  • Comparisons between samples detected in 125 Hz and samples detected in the original 2000 Hz.

References

Bicknell, K., & Levy, R. (2010). A rational model of eye movement control in reading. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 1168–1178. http://dl.acm.org/citation.cfm?id=1858800
Engbert, R., & Kliegl, R. (2003). Microsaccades uncover the orientation of covert attention. Vision Research, 43(9), 1035–1045. https://doi.org/10.1016/S0042-6989(03)00084-1
Engbert, R., Longtin, A., & Kliegl, R. (2002). A dynamical model of saccade generation in reading based on spatially distributed lexical processing. Vision Research, 42(5), 621–636. https://doi.org/10.1016/S0042-6989(01)00301-7
Huey, E. B. (1908). The psychology and pedagogy of reading, with a review of the history of reading and writing and of methods, texts, and hygiene in reading. New York: Macmillan. http://archive.org/details/psychologypedago00hueyiala
Inhoff, A. W., & Rayner, K. (1986). Parafoveal word processing during eye fixations in reading: Effects of word frequency. Perception & Psychophysics, 40(6), 431–439.
Malsburg, T. von der. (2019). Saccades: Detection of fixations in eye-tracking data. https://github.com/tmalsburg/saccades
McConkie, G. W., & Rayner, K. (1975). The span of the effective stimulus during a fixation in reading. Perception & Psychophysics, 17, 578–587.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422. https://doi.org/10.1037/0033-2909.124.3.372
Rayner, K., & Duffy, S. A. (1986). Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory & Cognition, 14(3), 191–201.
Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a model of eye movement control in reading. Psychological Review, 105(1), 125–157. https://doi.org/10.1037/0033-295X.105.1.125
Schotter, E. R., Angele, B., & Rayner, K. (2012). Parafoveal processing in reading. Attention, Perception, & Psychophysics, 74(1), 5–35. https://doi.org/10.3758/s13414-011-0219-2
Snell, J., Van Leipsig, S., Grainger, J., & Meeter, M. (2018). OB1-reader: A model of word recognition and eye movements in text reading. Psychological Review, 125(6), 969–984. https://doi.org/10.1037/rev0000119
Staub, A., White, S. J., Drieghe, D., Hollway, E. C., & Rayner, K. (2010). Distributional effects of word frequency on eye fixation durations. Journal of Experimental Psychology. Human Perception and Performance, 36(5), 1280–1293. https://doi.org/10.1037/a0016896