Low sampling rate is not an obstacle to making reading research more accessible

Talk presented at the 1st Workshop on Replication in the Language Sciences, Frankfurt, 2025

Bernhard Angele¹, Zeynep Gunes Ozkan², Marina Serrano-Carot¹, and Jon Andoni Duñabeitia¹

1. Universidad Nebrija; 2. University of Valencia

Introduction

Presentation available at: https://bangele.quarto.pub/worela2025. Scan QR code for link.

Eye movements are a window into the reading process

Recording eye movements has led to a wealth of findings about the cognitive processes involved in reading
However, if we examine the eye-movement literature, we find that most results come from a small number of countries and involve reading in a limited number of languages

Illustrating the eye-movement gap

Data from Scopus: Searching for “eye” and “track(er/ing)” or “movement(s)” and “reading” in the title, abstract, and keywords
Publications from 1974 to 2024
Counting country affiliations by author (publications can have multiple authors and author affiliations)
Excluding affiliations with missing country names
Details: Angele & Duñabeitia (2024)
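The counting rule above can be sketched as follows. This is a minimal illustration, not the pipeline from Angele & Duñabeitia (2024): the `publications` records and their field names are hypothetical and only stand in for a Scopus export.

```python
from collections import Counter

# Hypothetical records: one entry per publication, listing the country of
# every author affiliation (None marks a missing country name).
publications = [
    {"title": "Paper A", "affiliation_countries": ["Spain", "Spain", "Germany"]},
    {"title": "Paper B", "affiliation_countries": ["China", None]},
    {"title": "Paper C", "affiliation_countries": ["United States"]},
]


def count_affiliations(pubs):
    """Count country affiliations per author, skipping missing countries."""
    counts = Counter()
    for pub in pubs:
        for country in pub["affiliation_countries"]:
            if country:  # exclude affiliations with a missing country name
                counts[country] += 1
    return counts


counts = count_affiliations(publications)
print(counts.most_common())
```

Because counting is by author affiliation, a publication with several authors contributes several counts, possibly to different countries.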

Cartogram: 1974 – 2000

Cartogram: 2021 – 2024

What are we missing?

General issue of WEIRD research (Henrich et al., 2010): Western participants may not be representative of all readers or even the majority of readers
English, German, French, Spanish, Italian, etc. are similar languages in many respects, and all are written in the Latin alphabet
Studying Chinese reading has forced us to think about new issues such as word segmentation, processing of character components and many more.

Why is there so little eye-movement research except in a few countries?

Eye-trackers are expensive
Modern infrared-based eye-trackers are very complex devices
Most expensive component: High-speed, high-resolution cameras
Sampling rate is a key bottleneck
Can we study reading at lower sampling rates?

Does eye-tracking at lower sampling rates make sense?

Obtaining meaningful reading data is dependent on accurate saccade detection, as most reading measures are based on saccades and fixations.
Nyquist-Shannon Theorem (Shannon, 1949): To accurately reconstruct a signal, the sampling rate must be at least twice the highest frequency component of that signal.
Saccade Signal Frequency: Bahill et al. (1981) found that the crucial information for saccade velocity (the signal bandwidth) lies largely below ~74 Hz.
Theoretical Minimum: Based on 74 Hz, Nyquist-Shannon implies a 148 Hz sampling rate should suffice for saccade detection using velocity-based algorithms.
1000 Hz is likely overkill for saccade detection.
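The arithmetic behind the theoretical minimum can be written out in a few lines. This sketch assumes the ~74 Hz bandwidth quoted above; the function names are illustrative.

```python
def nyquist_min_rate(bandwidth_hz):
    """Minimum sampling rate implied by Nyquist-Shannon: twice the bandwidth."""
    return 2.0 * bandwidth_hz


def meets_nyquist(sampling_rate_hz, bandwidth_hz):
    """True if the sampling rate is at least the Nyquist minimum."""
    return sampling_rate_hz >= nyquist_min_rate(bandwidth_hz)


SACCADE_BANDWIDTH_HZ = 74.0  # approximate value attributed to Bahill et al. (1981)

min_rate = nyquist_min_rate(SACCADE_BANDWIDTH_HZ)
print(f"Theoretical minimum: {min_rate:.0f} Hz")

# Recorded rates used in this study, followed by the simulated lower rates.
for rate in (2000, 1000, 500, 250, 125, 50, 31.25):
    status = "above" if meets_nyquist(rate, SACCADE_BANDWIDTH_HZ) else "below"
    print(f"{rate:>7} Hz is {status} the theoretical minimum")
```

All four recorded rates clear the 148 Hz threshold, while the simulated 125, 50, and 31.25 Hz rates fall below it. That is why the practical question of whether effects survive at those rates is worth testing empirically.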

Detecting fixations at low sampling rates

Reading Research Focus: Primarily interested in fixation duration.
Error Dynamics (cf. Andersson et al. (2010), Fig. 3):
- Fixation duration is set by sampled saccade timings, each prone to error (e.g., ±16 ms, corresponding to a ~60 Hz sampling rate).
- These individual random errors (at fixation start and end) are centered around zero.
- Due to “partial cancellation”, the average net error in measured fixation duration also tends towards zero, though its variability reflects both error sources.
- Averaging multiple fixation estimates further reduces the impact of this random net error.
Key Takeaway: Larger sample sizes can compensate for lower sampling rates by improving mean estimate stability.
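The error argument above can be illustrated with a small simulation. This is a sketch under a simplifying assumption that is not taken from the talk: each boundary error is drawn independently from a uniform distribution spanning plus or minus one sample interval at 60 Hz.

```python
import random
import statistics


def simulate_boundary_errors(n_fixations, sampling_rate_hz, rng):
    """Draw start, end, and net timing errors (ms) for simulated fixations."""
    interval_ms = 1000.0 / sampling_rate_hz  # about 16.7 ms at 60 Hz
    # Assumption: each boundary error is uniform and centered on zero.
    start = [rng.uniform(-interval_ms, interval_ms) for _ in range(n_fixations)]
    end = [rng.uniform(-interval_ms, interval_ms) for _ in range(n_fixations)]
    # Measured duration = (end + end error) - (start + start error),
    # so the net duration error is the difference of the two errors.
    net = [e - s for s, e in zip(start, end)]
    return start, end, net


def standard_error_of_mean(errors, n_averaged):
    """Standard error of the mean error when averaging n_averaged fixations."""
    return statistics.stdev(errors) / n_averaged ** 0.5


rng = random.Random(1)  # fixed seed so the sketch is reproducible
start_err, end_err, net_err = simulate_boundary_errors(20000, 60.0, rng)

print("mean net error (ms):", round(statistics.mean(net_err), 2))
print("SD start error (ms):", round(statistics.stdev(start_err), 2))
print("SD net error (ms):  ", round(statistics.stdev(net_err), 2))
for n in (100, 400):
    print(f"SE of mean over {n} fixations (ms):",
          round(standard_error_of_mean(net_err, n), 2))
```

Under these assumptions the net error stays centered on zero but is more variable than either boundary error alone, and its influence on a mean shrinks with the square root of the number of fixations averaged.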

Detecting fixations at low sampling rates (2)

Figure 1: Three density plots showing the error at fixation start, the error at fixation end, and the net error in fixation duration. All are centered at 0, with the net error distribution being wider.

Our approach

We take a practical approach: What is the lowest sampling rate that still allows us to find evidence of cognitive processing?
We need a benchmark effect – a phenomenon that is well-studied and whose existence (and effect size) is clear
The word frequency effect on fixation duration is ideal for this

Method

32 participants read 400 sentences in Spanish
Eye movements are recorded by an SR Research Eyelink Portable Duo
Four sampling rates: 250 Hz, 500 Hz, 1000 Hz, and 2000 Hz (100 sentences each)
Frequency manipulation: each sentence has a target word that was manipulated to be either
- high frequency (mean frequency 47/million)
- low frequency (mean frequency 2/million)
The context up to the target word was identical for both versions of the sentence.
- Context after the target word was allowed to vary.

Results: Example trial 1000 Hz

At 1000 Hz, this trial has 6393 samples. The trial is shown three times: in full, with the end of the trial cropped out, and with the fixations detected by the Eyelink overlaid.

Example trial 250 Hz

With Eyelink fixations

At 250 Hz, this trial has 1029 samples.

Results: Example trial 2000 Hz

With Eyelink fixations

At 2000 Hz, this trial has 11416 samples.

Data analysis

For each participant and sampling rate, we extract the fixations detected by the Eyelink saccade detection algorithm and aggregate them into word-based fixation time measures for the target word
- First fixation duration (FFD)
- Gaze duration (GD)
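The two measures can be computed from a sequence of fixations roughly as follows. This sketch uses the standard first-pass definitions and a hypothetical input format (word index plus duration); it is not the authors' analysis code.

```python
def first_pass_measures(fixations, target_word):
    """Return FFD and GD (ms) for the target word, or None if it was skipped.

    fixations: list of (word_index, duration_ms) tuples in temporal order.
    """
    for i, (word, _) in enumerate(fixations):
        if word > target_word:
            return None  # eyes passed the target first: skipped in first pass
        if word == target_word:
            break  # i now indexes the first fixation on the target
    else:
        return None  # target (and everything after it) was never fixated

    ffd = fixations[i][1]  # first fixation duration
    gd = 0
    for word, duration in fixations[i:]:
        if word != target_word:
            break  # first pass ends when the eyes leave the target
        gd += duration  # gaze duration sums all first-pass fixations
    return {"ffd": ffd, "gd": gd}


# Hypothetical trial: two first-pass fixations on word 4, then a later refixation.
trial = [(1, 210), (2, 190), (4, 230), (4, 120), (5, 200), (4, 180)]
print(first_pass_measures(trial, target_word=4))
```

The later return to the target word (180 ms) is excluded from both measures because it happens after the eyes have already left the word.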
In order to evaluate the strength of evidence for the frequency effect, we then fitted Bayesian linear and generalized linear mixed models using the brms package (Bürkner, 2017).
- Fixed effect: frequency condition (contrast-coded as -.5 vs. .5)
- Random effects: all possible (intercepts and frequency condition by participant and item)
- As a rule of thumb, we considered an effect credible if more than 95% of the posterior distribution lies on one side of 0
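The rule of thumb can be applied to a set of posterior draws as sketched below. The draws here are simulated placeholders rather than output from the fitted brms models, and the function names are illustrative.

```python
import random


def proportion_on_one_side(posterior_draws):
    """Largest share of posterior draws that falls on one side of zero."""
    n = len(posterior_draws)
    above = sum(1 for d in posterior_draws if d > 0) / n
    below = sum(1 for d in posterior_draws if d < 0) / n
    return max(above, below)


def is_credible(posterior_draws, threshold=0.95):
    """Rule of thumb: more than `threshold` of the draws on one side of zero."""
    return proportion_on_one_side(posterior_draws) > threshold


rng = random.Random(7)  # fixed seed so the sketch is reproducible
# Hypothetical posterior draws for an effect (ms); not values from the study.
clear_effect = [rng.gauss(40, 15) for _ in range(4000)]
weak_effect = [rng.gauss(3, 10) for _ in range(4000)]

for name, draws in (("clear", clear_effect), ("weak", weak_effect)):
    print(name, round(proportion_on_one_side(draws), 3), is_credible(draws))
```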

Results: Means

250 and 500 Hz
	First fixation duration (ms)		Gaze duration (ms)
	Mean	SD	Mean	SD
250 Hz
high frequency	242	85	332	169
low frequency	245	85	373	209
Effect	3		40
500 Hz
high frequency	238	84	329	182
low frequency	246	92	366	229
Effect	8		36

1000 and 2000 Hz
	First fixation duration (ms)		Gaze duration (ms)
	Mean	SD	Mean	SD
1000 Hz
high frequency	234	83	319	171
low frequency	244	85	365	222
Effect	10		45
2000 Hz
high frequency	230	80	313	165
low frequency	236	85	344	197
Effect	7		31

Evidence for frequency effect in FFD on the target word

Evidence for frequency effect in GD on the target word

Simulating even lower sampling rates

We can drop samples to simulate a lower sampling rate
If we drop out 15 out of every 16 samples in a 2000 Hz trial, we get the equivalent of a 125 Hz trial (714 out of 11416 samples):

Removing samples

Detecting fixations in simulated 125 Hz data:

Removing even more samples

We can remove 39 out of every 40 samples to get the equivalent of a 50 Hz trial (286 out of 11416 samples):

Removing even more samples

We can remove 63 out of every 64 samples to get the equivalent of a 31.25 Hz trial (179 out of 11416 samples):
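Dropping samples amounts to keeping every k-th sample of the original recording. The sketch below applies this to a placeholder trial with the same number of samples as the 2000 Hz example trial and reproduces the sample counts quoted on these slides; the function name is illustrative.

```python
def drop_samples(samples, keep_every):
    """Keep one sample out of every `keep_every`, starting with the first."""
    return samples[::keep_every]


ORIGINAL_RATE_HZ = 2000
N_SAMPLES = 11416  # length of the 2000 Hz example trial

# Placeholder trial: sample indices stand in for the recorded gaze samples.
trial = list(range(N_SAMPLES))

for keep_every in (16, 40, 64):
    reduced = drop_samples(trial, keep_every)
    simulated_rate = ORIGINAL_RATE_HZ / keep_every
    print(f"keep 1 of {keep_every}: {simulated_rate} Hz, "
          f"{len(reduced)} of {N_SAMPLES} samples")
```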

Downsampled data: Evidence for frequency effect in FFD

400 trials per subject

Downsampled data: Evidence for frequency effect in GD

400 trials per subject

Discussion

The frequency effect is very robust, especially in gaze duration (GD) and total viewing time (TVT)
- Apparently, not even a sampling rate of roughly 30 Hz is too low to detect word frequency effects!
The effect is more sensitive to sampling rate in first fixation duration (FFD)
- Aggregate measures such as GD and TVT seem to be less affected by random noise
This is encouraging for researchers who cannot afford a 1000 Hz eye-tracker!

Discussion

Simulating low sampling rates from data collected using a very accurate eye-tracker is not the same as actually using an affordable eye tracker.
But we have shown that sampling rate is not a hard bottleneck for studying reading.
You can compensate for low sampling rates by increasing sample size
If you are not sure whether your eye-tracker is good enough to study reading, consider running a pilot study that looks for the frequency effect in GD
- Or wait for us to do it!

Recommendations

If you want to use affordable eye-trackers to study reading:
- investigate effects that are large enough
- choose aggregate measures (GD, TVT, go-past time, etc.)
- plan to use a large sample
- consider running a pilot study with a known effect (word frequency) and do a power analysis based on the results
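As a rough illustration of the last recommendation, the sketch below estimates how many participants a within-participant comparison might need. It relies on a normal approximation for a standardized paired effect size (d_z), which is a simplification and not the Bayesian mixed-model analysis used in this study. The effect sizes are hypothetical placeholders for values you would estimate from a pilot.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size_dz, n_participants, alpha=0.05):
    """Approximate two-sided power for a paired design (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size_dz * sqrt(n_participants)
    # Probability that the test statistic lands beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def participants_needed(effect_size_dz, target_power=0.8, alpha=0.05, max_n=10000):
    """Smallest n whose approximate power reaches target_power (None if not found)."""
    for n in range(2, max_n + 1):
        if approx_power(effect_size_dz, n, alpha) >= target_power:
            return n
    return None


# Hypothetical standardized effects, e.g. pilot estimates at two sampling rates.
for dz in (0.4, 0.2):
    print(f"d_z = {dz}: about {participants_needed(dz)} participants for 80% power")
```

Halving the standardized effect roughly quadruples the required sample, which is one way to see why noisier measurements call for larger samples.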

Thank you

Now in press at Behavior Research Methods!

Presentation available at: https://bangele.quarto.pub/using-affordable-eye-tracking-methods-to-study-reading-the-role-of-sampling-rate or scan QR code

Example trial 1000 Hz

With fixations according to the Engbert & Kliegl (2003) algorithm

Example trial 1000 Hz

With Eyelink and Engbert & Kliegl (2003) fixations plotted on top of each other

Why investigate this?

Most eye tracking labs have eye trackers that can track at 1000 Hz or even 2000 Hz.
- Why think about lower sampling rates?
Eye trackers that can track at 1000 Hz are very expensive.
- Sampling rate is a “hard” bottleneck
  - Cameras that can record 1000 frames per second are quite rare
  - We can perhaps improve low-quality images using techniques such as machine learning, but we cannot create extra samples
The cost of high sampling rate eye trackers limits the study of reading (and other processes)
- Outside of the lab
- With more diverse populations
- With new paradigms (e.g. multi-person)

Removing samples

Comparison between the fixations detected in the simulated 125 Hz data and the fixations detected in the original 2000 Hz data.

Simulating low sampling rates

Is dropping samples an appropriate way to simulate low sampling rates?
- Maybe averaging over samples instead of dropping them is more appropriate
Let’s compare!
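The two downsampling strategies differ in how they treat samples that fall within a saccade. The sketch below contrasts them on a hypothetical one-dimensional gaze trace with a single instantaneous jump; the function names and data are illustrative only.

```python
def drop_samples(samples, block_size):
    """Keep the first sample of every block and discard the rest."""
    return samples[::block_size]


def average_samples(samples, block_size):
    """Replace every block of consecutive samples with its mean."""
    averaged = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        averaged.append(sum(block) / len(block))
    return averaged


# Hypothetical horizontal gaze position (pixels): fixation, jump, fixation.
gaze_x = [100] * 6 + [200] * 6

dropped = drop_samples(gaze_x, 4)
averaged = average_samples(gaze_x, 4)
print("dropped: ", dropped)
print("averaged:", averaged)
```

In this toy trace, dropping keeps only positions that were actually recorded, whereas averaging produces an intermediate position for the block that straddles the jump. Which behavior is closer to a real low-rate camera is an empirical question.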

Simulating low sampling rates: dropping samples

Simulating low sampling rates: averaging samples

Average algorithm: FFD

100 trials/subject

Average algorithm: GD

100 trials/subject

Average algorithm: FFD

400 trials/subject

Average algorithm: GD

400 trials/subject

Simulated low sampling rates
	First fixation duration (ms)		Gaze duration (ms)
	Mean	SD	Mean	SD
31.25 Hz
high frequency	214	101	269	143
low frequency	222	107	307	189
Effect	8		39
50 Hz
high frequency	221	88	291	151
low frequency	228	92	333	200
Effect	7		42
125 Hz
high frequency	233	81	314	158
low frequency	237	83	356	213
Effect	4		42

Downsampled data: Evidence for frequency effect in FFD

100 trials per subject

Downsampled data: Evidence for frequency effect in GD

100 trials per subject

Downsampled data: 400 trials per subject

Simulated low sampling rates
	First fixation duration (ms)		Gaze duration (ms)
	Mean	SD	Mean	SD
31.25 Hz
high frequency	214	99	274	154
low frequency	225	108	313	193
Effect	12		39
50 Hz
high frequency	223	89	299	162
low frequency	233	96	337	205
Effect	9		38
125 Hz
high frequency	234	84	322	170
low frequency	241	88	363	219
Effect	7		41

References

Andersson, R., Nyström, M., & Holmqvist, K. (2010). Sampling frequency and eye-tracking measures: How speed affects durations, latencies, and more. Journal of Eye Movement Research, 3(3). https://doi.org/10.16910/jemr.3.3.6

Angele, B., & Duñabeitia, J. A. (2024). Closing the eye-tracking gap in reading research. Frontiers in Psychology, 15. https://doi.org/10.3389/fpsyg.2024.1425219

Bahill, A. T., Brockenbrough, A., & Troost, B. T. (1981). Variability and development of a normative data base for saccadic eye movements. Investigative Ophthalmology & Visual Science, 21(1), 116–125.

Bürkner, P.-C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80(1), 1–28. https://doi.org/10.18637/jss.v080.i01

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3), 61–83. https://doi.org/10.1017/s0140525x0999152x

Shannon, C. E. (1949). Communication in the presence of noise. Proceedings of the IRE, 37(1), 10–21. https://doi.org/10.1109/JRPROC.1949.232969