SBSAT STD
Subjective Text Difficulty (SBSAT)
Test
| Model | Unseen Reader RMSE | Unseen Text RMSE | Unseen Text and Reader RMSE | Average RMSE | Unseen Reader MAE | Unseen Text MAE | Unseen Text and Reader MAE | Average MAE | Unseen Reader R² | Unseen Text R² | Unseen Text and Reader R² | Average R² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Majority Class / Chance | 0.71 ± 0.0 | 0.75 ± 0.1 | 0.71 ± 0.1 | 0.73 ± 0.0 | 0.52 ± 0.1 | 0.57 ± 0.0 | 0.55 ± 0.1 | 0.55 ± 0.0 | -0.08 ± 0.0 | -0.39 ± 0.1 | -0.78 ± 0.4 | -0.12 ± 0.1 |
| Reading Speed | 0.71 ± 0.0 | 0.82 ± 0.0 | 0.8 ± 0.1 | 0.77 ± 0.0 | 0.58 ± 0.1 | 0.7 ± 0.0 | 0.68 ± 0.0 | 0.65 ± 0.0 | -0.09 ± 0.1 | -0.73 ± 0.3 | -1.29 ± 0.5 | -0.25 ± 0.1 |
| Text-Only RoBERTa | 0.67 ± 0.0 | 0.75 ± 0.0 | 0.74 ± 0.1 | 0.72 ± 0.0 | 0.56 ± 0.0 | 0.64 ± 0.0 | 0.64 ± 0.1 | 0.61 ± 0.0 | 0.03 ± 0.0 | -0.46 ± 0.3 | -1.03 ± 0.6 | -0.07 ± 0.1 |
| Logistic Regression [meziere2023using] | 0.74 ± 0.0 | 0.89 ± 0.0 | 0.83 ± 0.1 | 0.82 ± 0.0 | 0.6 ± 0.0 | 0.76 ± 0.1 | 0.71 ± 0.1 | 0.68 ± 0.0 | -0.19 ± 0.1 | -1.1 ± 0.5 | -1.54 ± 0.7 | -0.42 ± 0.1 |
| SVM [hollenstein2023zuco] | 0.72 ± 0.0 | 0.73 ± 0.0 | 0.7 ± 0.1 | 0.73 ± 0.0 | 0.52 ± 0.1 | 0.54 ± 0.0 | 0.53 ± 0.1 | 0.53 ± 0.0 | -0.12 ± 0.1 | -0.31 ± 0.1 | -0.72 ± 0.3 | -0.1 ± 0.0 |
| Random Forest [makowski2024detection] | 0.74 ± 0.0 | 0.78 ± 0.1 | 0.75 ± 0.1 | 0.77 ± 0.0 | 0.59 ± 0.0 | 0.6 ± 0.1 | 0.6 ± 0.1 | 0.6 ± 0.0 | -0.18 ± 0.1 | -0.48 ± 0.2 | -0.92 ± 0.4 | -0.24 ± 0.1 |
| AhnRNN [ahn2020towards] | 0.69 ± 0.0 | 0.76 ± 0.0 | 0.72 ± 0.1 | 0.72 ± 0.0 | 0.55 ± 0.0 | 0.62 ± 0.0 | 0.6 ± 0.1 | 0.58 ± 0.0 | -0.02 ± 0.0 | -0.48 ± 0.3 | -0.98 ± 0.6 | -0.09 ± 0.0 |
| AhnCNN [ahn2020towards] | 0.69 ± 0.0 | 0.76 ± 0.1 | 0.72 ± 0.1 | 0.72 ± 0.0 | 0.55 ± 0.0 | 0.62 ± 0.1 | 0.6 ± 0.1 | 0.59 ± 0.0 | -0.02 ± 0.0 | -0.49 ± 0.3 | -1.0 ± 0.7 | -0.09 ± 0.1 |
| BEyeLSTM [reich_inferring_2022] | 0.67 ± 0.0 | 1.74 ± 1.0 | 1.61 ± 0.9 | 1.43 ± 0.7 | 0.51 ± 0.0 | 1.39 ± 0.8 | 1.4 ± 0.8 | 1.01 ± 0.5 | 0.03 ± 0.0 | -11.43 ± 9.8 | -17.16 ± 14.6 | -6.91 ± 6.1 |
| PLM-AS [Yang2023PLMASPL] | 0.7 ± 0.0 | 0.73 ± 0.0 | 0.7 ± 0.0 | 0.71 ± 0.0 | 0.56 ± 0.0 | 0.58 ± 0.0 | 0.58 ± 0.0 | 0.57 ± 0.0 | -0.06 ± 0.0 | -0.31 ± 0.1 | -0.74 ± 0.4 | -0.06 ± 0.0 |
| PLM-AS-RM [haller2022eye] | 1.2 ± 0.1 | 1.18 ± 0.1 | 1.11 ± 0.2 | 1.21 ± 0.0 | 1.04 ± 0.1 | 1.04 ± 0.1 | 1.02 ± 0.2 | 1.04 ± 0.0 | -2.13 ± 0.3 | -2.4 ± 0.5 | -3.16 ± 1.0 | -2.08 ± 0.2 |
| RoBERTEye-W [Shubi2024Finegrained] | 0.67 ± 0.0 | 0.74 ± 0.0 | 0.72 ± 0.1 | 0.71 ± 0.0 | 0.54 ± 0.0 | 0.6 ± 0.0 | 0.6 ± 0.1 | 0.58 ± 0.0 | 0.04 ± 0.0 | -0.4 ± 0.2 | -0.9 ± 0.5 | -0.05 ± 0.0 |
| RoBERTEye-F [Shubi2024Finegrained] | 0.67 ± 0.0 | 0.77 ± 0.0 | 0.75 ± 0.1 | 0.73 ± 0.0 | 0.56 ± 0.0 | 0.65 ± 0.0 | 0.64 ± 0.1 | 0.61 ± 0.0 | 0.02 ± 0.0 | -0.49 ± 0.2 | -1.07 ± 0.5 | -0.1 ± 0.0 |
| MAG-Eye [Shubi2024Finegrained] | 0.67 ± 0.0 | 0.76 ± 0.1 | 0.74 ± 0.1 | 0.72 ± 0.0 | 0.54 ± 0.0 | 0.62 ± 0.0 | 0.62 ± 0.1 | 0.59 ± 0.0 | 0.03 ± 0.0 | -0.44 ± 0.2 | -1.01 ± 0.5 | -0.09 ± 0.1 |
| PostFusion-Eye [Shubi2024Finegrained] | 0.71 ± 0.0 | 0.85 ± 0.1 | 0.85 ± 0.1 | 0.8 ± 0.1 | 0.57 ± 0.0 | 0.66 ± 0.1 | 0.69 ± 0.1 | 0.63 ± 0.1 | -0.08 ± 0.0 | -0.81 ± 0.3 | -1.8 ± 0.9 | -0.34 ± 0.2 |
Validation
| Model | Unseen Reader RMSE | Unseen Text RMSE | Unseen Text and Reader RMSE | Average RMSE | Unseen Reader MAE | Unseen Text MAE | Unseen Text and Reader MAE | Average MAE | Unseen Reader R² | Unseen Text R² | Unseen Text and Reader R² | Average R² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Majority Class / Chance | 0.73 ± 0.1 | 0.68 ± 0.0 | 0.64 ± 0.0 | 0.7 ± 0.0 | 0.55 ± 0.1 | 0.5 ± 0.1 | 0.48 ± 0.1 | 0.51 ± 0.0 | -0.21 ± 0.2 | -0.09 ± 0.0 | -0.36 ± 0.1 | -0.05 ± 0.0 |
| Reading Speed | 0.68 ± 0.0 | 0.77 ± 0.1 | 0.79 ± 0.1 | 0.74 ± 0.0 | 0.52 ± 0.0 | 0.66 ± 0.1 | 0.67 ± 0.1 | 0.61 ± 0.0 | -0.06 ± 0.0 | -0.48 ± 0.3 | -1.37 ± 0.7 | -0.17 ± 0.1 |
| Text-Only RoBERTa | 0.67 ± 0.0 | 0.67 ± 0.0 | 0.62 ± 0.0 | 0.66 ± 0.0 | 0.57 ± 0.0 | 0.55 ± 0.0 | 0.52 ± 0.0 | 0.55 ± 0.0 | -0.03 ± 0.1 | -0.06 ± 0.0 | -0.32 ± 0.2 | 0.06 ± 0.0 |
| Logistic Regression [meziere2023using] | 0.71 ± 0.0 | 0.77 ± 0.1 | 0.78 ± 0.1 | 0.75 ± 0.0 | 0.55 ± 0.0 | 0.66 ± 0.1 | 0.63 ± 0.1 | 0.61 ± 0.0 | -0.16 ± 0.0 | -0.55 ± 0.4 | -1.28 ± 0.7 | -0.22 ± 0.1 |
| SVM [hollenstein2023zuco] | 0.7 ± 0.0 | 0.72 ± 0.0 | 0.65 ± 0.1 | 0.7 ± 0.0 | 0.5 ± 0.1 | 0.52 ± 0.0 | 0.49 ± 0.1 | 0.51 ± 0.0 | -0.11 ± 0.1 | -0.2 ± 0.1 | -0.47 ± 0.2 | -0.06 ± 0.0 |
| Random Forest [makowski2024detection] | 0.64 ± 0.0 | 0.77 ± 0.0 | 0.81 ± 0.1 | 0.73 ± 0.0 | 0.49 ± 0.0 | 0.61 ± 0.0 | 0.66 ± 0.1 | 0.57 ± 0.0 | 0.05 ± 0.1 | -0.4 ± 0.2 | -1.37 ± 0.5 | -0.15 ± 0.1 |
| AhnRNN [ahn2020towards] | 0.72 ± 0.0 | 0.67 ± 0.0 | 0.61 ± 0.0 | 0.68 ± 0.0 | 0.59 ± 0.0 | 0.53 ± 0.0 | 0.49 ± 0.0 | 0.55 ± 0.0 | -0.2 ± 0.1 | -0.05 ± 0.0 | -0.26 ± 0.2 | -0.0 ± 0.0 |
| AhnCNN [ahn2020towards] | 0.72 ± 0.0 | 0.67 ± 0.0 | 0.6 ± 0.0 | 0.68 ± 0.0 | 0.59 ± 0.0 | 0.53 ± 0.0 | 0.49 ± 0.0 | 0.54 ± 0.0 | -0.18 ± 0.1 | -0.03 ± 0.0 | -0.22 ± 0.1 | 0.02 ± 0.0 |
| BEyeLSTM [reich_inferring_2022] | 0.67 ± 0.0 | 0.68 ± 0.0 | 0.59 ± 0.0 | 0.66 ± 0.0 | 0.51 ± 0.0 | 0.54 ± 0.1 | 0.49 ± 0.0 | 0.52 ± 0.0 | -0.03 ± 0.0 | -0.09 ± 0.0 | -0.17 ± 0.1 | 0.05 ± 0.1 |
| PLM-AS [Yang2023PLMASPL] | 0.69 ± 0.0 | 0.71 ± 0.0 | 0.66 ± 0.0 | 0.69 ± 0.0 | 0.56 ± 0.0 | 0.57 ± 0.0 | 0.53 ± 0.0 | 0.56 ± 0.0 | -0.09 ± 0.1 | -0.19 ± 0.1 | -0.53 ± 0.3 | -0.04 ± 0.0 |
| PLM-AS-RM [haller2022eye] | 1.2 ± 0.1 | 1.18 ± 0.1 | 1.12 ± 0.2 | 1.2 ± 0.0 | 1.05 ± 0.1 | 1.02 ± 0.1 | 1.02 ± 0.2 | 1.03 ± 0.0 | -2.33 ± 0.4 | -2.24 ± 0.4 | -3.18 ± 1.0 | -2.14 ± 0.3 |
| RoBERTEye-W [Shubi2024Finegrained] | 0.64 ± 0.0 | 0.67 ± 0.0 | 0.61 ± 0.0 | 0.65 ± 0.0 | 0.53 ± 0.0 | 0.54 ± 0.0 | 0.5 ± 0.0 | 0.53 ± 0.0 | 0.05 ± 0.1 | -0.07 ± 0.0 | -0.3 ± 0.2 | 0.09 ± 0.0 |
| RoBERTEye-F [Shubi2024Finegrained] | 0.62 ± 0.0 | 0.68 ± 0.0 | 0.65 ± 0.0 | 0.65 ± 0.0 | 0.52 ± 0.0 | 0.56 ± 0.0 | 0.55 ± 0.0 | 0.54 ± 0.0 | 0.1 ± 0.1 | -0.1 ± 0.1 | -0.48 ± 0.3 | 0.09 ± 0.0 |
| MAG-Eye [Shubi2024Finegrained] | 0.64 ± 0.0 | 0.67 ± 0.0 | 0.61 ± 0.0 | 0.65 ± 0.0 | 0.53 ± 0.0 | 0.52 ± 0.0 | 0.49 ± 0.0 | 0.52 ± 0.0 | 0.05 ± 0.1 | -0.04 ± 0.0 | -0.26 ± 0.1 | 0.09 ± 0.0 |
| PostFusion-Eye [Shubi2024Finegrained] | 0.7 ± 0.1 | 0.68 ± 0.0 | 0.63 ± 0.0 | 0.68 ± 0.0 | 0.57 ± 0.0 | 0.55 ± 0.0 | 0.52 ± 0.0 | 0.55 ± 0.0 | -0.13 ± 0.2 | -0.08 ± 0.0 | -0.34 ± 0.2 | 0.0 ± 0.0 |
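
The tables above report RMSE, MAE, and R², and many of the R² entries are negative. As a reminder of how such values can arise, the following is a minimal sketch that assumes the standard textbook definitions of the three metrics; the benchmark's own evaluation code is not shown here and may differ. The function name `regression_metrics` and the example ratings are hypothetical and chosen only for illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 under their standard definitions (an assumption)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    # R^2 compares the model's squared error with that of predicting the
    # mean of the evaluated targets; it is undefined if all targets are equal.
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}

# Hypothetical difficulty ratings, not taken from SBSAT.
y_test = [1.0, 2.0, 2.0, 3.0]
# A constant prediction that differs from the mean of the evaluated targets.
constant_pred = [3.0, 3.0, 3.0, 3.0]
metrics = regression_metrics(y_test, constant_pred)
print(metrics)
```

Under these definitions, R² falls below zero whenever a model's squared error exceeds that of predicting the mean of the evaluated targets, which is one way even a constant baseline can score a negative R² on a held-out split.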