OneStop RC

Reading Comprehension (OneStop)

Test

| Model | Unseen Reader Balanced Accuracy | Unseen Text Balanced Accuracy | Unseen Text and Reader Balanced Accuracy | Average Balanced Accuracy | Unseen Reader AUROC | Unseen Text AUROC | Unseen Text and Reader AUROC | Average AUROC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Majority Class / Chance | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 |
| Reading Speed | 50.6 ± 1.2 | 49.4 ± 0.8 | 48.0 ± 1.5 | 49.9 ± 0.6 | 49.8 ± 1.7 | 49.3 ± 0.9 | 47.7 ± 2.2 | 49.6 ± 0.8 |
| Text-Only Roberta | 58.6 ± 1.7 | 51.4 ± 0.5 | 53.7 ± 1.6 | 55.0 ± 1.0 | 66.3 ± 1.1 | 55.2 ± 1.5 | 55.0 ± 2.8 | 61.1 ± 1.0 |
| Logistic Regression [meziere2023using] | 51.1 ± 1.2 | 52.2 ± 0.4 | 52.4 ± 2.5 | 51.7 ± 0.7 | 52.2 ± 1.5 | 53.4 ± 0.9 | 54.3 ± 3.2 | 53.0 ± 0.8 |
| SVM [hollenstein2023zuco] | 50.0 ± 0.9 | 51.6 ± 0.6 | 50.8 ± 1.5 | 50.7 ± 0.7 | 50.0 ± 0.9 | 51.6 ± 0.6 | 50.8 ± 1.5 | 50.7 ± 0.7 |
| Random Forest [makowski2024detection] | 56.2 ± 0.8 | 53.5 ± 1.0 | 54.7 ± 1.5 | 55.1 ± 0.5 | 59.4 ± 0.9 | 56.0 ± 1.3 | 57.2 ± 1.9 | 58.0 ± 0.6 |
| AhnRNN [ahn2020towards] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 |
| AhnCNN [ahn2020towards] | 50.1 ± 0.1 | 50.1 ± 0.1 | 49.7 ± 0.9 | 50.0 ± 0.0 | 48.3 ± 1.5 | 51.7 ± 0.5 | 48.1 ± 3.1 | 49.7 ± 0.7 |
| BEyeLSTM [reich_inferring_2022] | 53.0 ± 0.6 | 50.0 ± 0.7 | 51.6 ± 1.3 | 51.5 ± 0.5 | 54.8 ± 1.1 | 50.3 ± 1.2 | 51.0 ± 2.5 | 52.5 ± 0.8 |
| PLM-AS [Yang2023PLMASPL] | 56.0 ± 0.9 | 50.9 ± 0.8 | 53.8 ± 2.2 | 53.5 ± 0.8 | 59.6 ± 0.9 | 52.1 ± 1.0 | 55.9 ± 1.9 | 56.1 ± 0.9 |
| PLM-AS-RM [haller2022eye] | 58.0 ± 0.7 | 52.5 ± 0.8 | 56.2 ± 2.3 | 55.2 ± 0.4 | 62.0 ± 0.8 | 54.1 ± 1.3 | 58.6 ± 2.1 | 58.4 ± 0.5 |
| RoBERTEye-W [Shubi2024Finegrained] | 58.4 ± 1.8 | 51.1 ± 0.8 | 54.1 ± 2.2 | 54.7 ± 1.0 | 66.5 ± 1.5 | 54.7 ± 1.5 | 57.2 ± 2.7 | 61.4 ± 0.9 |
| RoBERTEye-F [Shubi2024Finegrained] | 56.6 ± 1.3 | 50.7 ± 0.4 | 52.0 ± 1.5 | 53.6 ± 0.9 | 67.3 ± 1.2 | 55.7 ± 1.1 | 54.8 ± 3.4 | 61.9 ± 0.8 |
| MAG-Eye [Shubi2024Finegrained] | 58.3 ± 1.7 | 50.9 ± 0.4 | 50.1 ± 0.5 | 54.3 ± 0.9 | 67.7 ± 1.0 | 57.7 ± 0.5 | 57.8 ± 2.2 | 62.9 ± 0.5 |
| PostFusion-Eye [Shubi2024Finegrained] | 57.0 ± 1.5 | 52.2 ± 0.7 | 52.2 ± 1.1 | 54.7 ± 0.9 | 64.5 ± 0.9 | 57.1 ± 1.1 | 54.7 ± 3.2 | 61.1 ± 0.6 |

Validation

| Model | Unseen Reader Balanced Accuracy | Unseen Text Balanced Accuracy | Unseen Text and Reader Balanced Accuracy | Average Balanced Accuracy | Unseen Reader AUROC | Unseen Text AUROC | Unseen Text and Reader AUROC | Average AUROC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Majority Class / Chance | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 |
| Reading Speed | 48.6 ± 1.3 | 50.0 ± 0.9 | 49.4 ± 1.5 | 49.4 ± 0.7 | 48.4 ± 1.8 | 49.8 ± 1.0 | 50.6 ± 2.3 | 49.2 ± 0.8 |
| Text-Only Roberta | 59.2 ± 1.8 | 52.2 ± 0.8 | 53.0 ± 1.9 | 55.6 ± 1.3 | 67.4 ± 1.7 | 56.8 ± 1.2 | 58.9 ± 2.4 | 62.5 ± 1.2 |
| Logistic Regression [meziere2023using] | 52.1 ± 1.2 | 52.9 ± 0.5 | 54.0 ± 2.1 | 52.7 ± 0.6 | 53.0 ± 1.5 | 53.8 ± 0.8 | 53.1 ± 3.0 | 53.6 ± 0.7 |
| SVM [hollenstein2023zuco] | 50.9 ± 0.5 | 52.8 ± 0.8 | 52.8 ± 1.5 | 51.9 ± 0.5 | 50.9 ± 0.5 | 52.8 ± 0.8 | 52.8 ± 1.5 | 51.9 ± 0.5 |
| Random Forest [makowski2024detection] | 59.2 ± 0.8 | 54.6 ± 1.1 | 53.8 ± 1.6 | 56.9 ± 0.4 | 61.4 ± 0.8 | 56.8 ± 1.3 | 56.6 ± 1.7 | 59.3 ± 0.5 |
| AhnRNN [ahn2020towards] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 |
| AhnCNN [ahn2020towards] | 50.5 ± 0.5 | 50.2 ± 0.1 | 50.2 ± 0.2 | 50.3 ± 0.3 | 52.3 ± 0.7 | 51.3 ± 0.9 | 50.5 ± 2.0 | 51.8 ± 0.5 |
| BEyeLSTM [reich_inferring_2022] | 53.7 ± 0.6 | 52.0 ± 0.9 | 52.3 ± 1.7 | 52.8 ± 0.5 | 56.8 ± 1.0 | 54.9 ± 1.5 | 54.9 ± 2.3 | 55.5 ± 1.0 |
| PLM-AS [Yang2023PLMASPL] | 57.5 ± 1.2 | 52.0 ± 0.9 | 53.3 ± 1.2 | 54.8 ± 1.0 | 62.0 ± 1.2 | 54.0 ± 1.3 | 54.6 ± 1.9 | 58.1 ± 1.1 |
| PLM-AS-RM [haller2022eye] | 59.7 ± 0.6 | 52.3 ± 0.6 | 60.5 ± 2.0 | 56.4 ± 0.6 | 63.7 ± 0.8 | 55.1 ± 0.8 | 61.3 ± 1.8 | 59.9 ± 0.8 |
| RoBERTEye-W [Shubi2024Finegrained] | 59.3 ± 1.9 | 50.2 ± 0.7 | 52.5 ± 1.5 | 54.7 ± 1.1 | 68.1 ± 1.5 | 56.9 ± 1.0 | 57.8 ± 1.4 | 63.1 ± 1.0 |
| RoBERTEye-F [Shubi2024Finegrained] | 57.5 ± 1.2 | 51.0 ± 0.7 | 50.8 ± 0.7 | 54.0 ± 0.8 | 67.9 ± 1.4 | 57.4 ± 0.9 | 58.9 ± 2.8 | 63.2 ± 0.8 |
| MAG-Eye [Shubi2024Finegrained] | 58.4 ± 1.5 | 50.9 ± 0.5 | 51.4 ± 1.4 | 54.4 ± 0.8 | 68.5 ± 0.9 | 58.3 ± 0.9 | 60.4 ± 2.0 | 63.7 ± 0.8 |
| PostFusion-Eye [Shubi2024Finegrained] | 59.1 ± 2.0 | 53.2 ± 0.8 | 55.3 ± 2.1 | 56.1 ± 1.3 | 66.2 ± 0.9 | 58.5 ± 1.2 | 59.8 ± 2.8 | 62.5 ± 0.7 |