Domain Expertise (PoTeC)
Test
| Model | Unseen Reader Balanced Accuracy | Unseen Text Balanced Accuracy | Unseen Text and Reader Balanced Accuracy | Average Balanced Accuracy | Unseen Reader AUROC | Unseen Text AUROC | Unseen Text and Reader AUROC | Average AUROC |
|---|---|---|---|---|---|---|---|---|
| Majority Class / Chance | 52.5 ± 2.3 | 49.9 ± 0.6 | 49.9 ± 1.3 | 51.4 ± 1.3 | 52.5 ± 2.3 | 49.9 ± 0.6 | 49.9 ± 1.3 | 51.4 ± 1.3 |
| Reading Speed | 59.2 ± 2.2 | 59.1 ± 4.0 | 57.7 ± 4.7 | 59.0 ± 1.0 | 60.2 ± 1.4 | 61.0 ± 5.5 | 56.8 ± 6.8 | 60.4 ± 1.7 |
| Text-Only RoBERTa | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 65.7 ± 4.4 | 57.6 ± 5.6 | 55.2 ± 2.7 | 62.0 ± 4.0 |
| Logistic Regression [meziere2023using] | 55.3 ± 1.6 | 50.6 ± 2.8 | 42.5 ± 5.0 | 51.6 ± 1.5 | 58.8 ± 2.0 | 53.1 ± 2.8 | 41.6 ± 7.8 | 54.0 ± 1.7 |
| SVM [hollenstein2023zuco] | 53.5 ± 2.2 | 57.0 ± 1.6 | 49.3 ± 4.4 | 54.5 ± 1.5 | 53.5 ± 2.2 | 57.0 ± 1.6 | 49.3 ± 4.4 | 54.5 ± 1.5 |
| Random Forest [makowski2024detection] | 56.9 ± 3.6 | 50.7 ± 0.6 | 51.7 ± 1.7 | 53.6 ± 1.8 | 69.2 ± 3.6 | 52.7 ± 7.4 | 60.2 ± 4.1 | 62.3 ± 3.4 |
| AhnRNN [ahn2020towards] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.1 | 49.9 ± 0.1 | 50.0 ± 0.0 | 50.0 ± 0.1 |
| AhnCNN [ahn2020towards] | 50.6 ± 0.5 | 49.9 ± 0.1 | 49.8 ± 0.2 | 50.2 ± 0.2 | 60.7 ± 2.4 | 59.8 ± 7.3 | 60.8 ± 6.6 | 60.6 ± 3.4 |
| BEyeLSTM [reich_inferring_2022] | 64.1 ± 4.1 | 47.1 ± 3.3 | 50.5 ± 5.2 | 53.0 ± 2.7 | 65.7 ± 3.8 | 47.2 ± 6.8 | 46.8 ± 12.1 | 51.8 ± 3.5 |
| PLM-AS [Yang2023PLMASPL] | 53.0 ± 2.0 | 47.7 ± 1.1 | 50.2 ± 1.1 | 50.0 ± 0.6 | 52.6 ± 2.8 | 52.6 ± 2.5 | 49.0 ± 9.5 | 51.3 ± 2.4 |
| PLM-AS-RM [haller2022eye] | 55.5 ± 3.9 | 49.9 ± 0.1 | 50.0 ± 0.0 | 52.4 ± 1.7 | 64.7 ± 5.8 | 65.4 ± 3.6 | 60.7 ± 4.1 | 64.2 ± 4.0 |
| RoBERTEye-W [Shubi2024Finegrained] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 65.3 ± 5.3 | 62.6 ± 9.8 | 61.3 ± 9.4 | 62.5 ± 7.3 |
| RoBERTEye-F [Shubi2024Finegrained] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 71.3 ± 2.1 | 52.0 ± 6.5 | 66.7 ± 1.8 | 64.5 ± 3.1 |
| MAG-Eye [Shubi2024Finegrained] | 50.8 ± 0.7 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.4 ± 0.4 | 65.2 ± 7.6 | 47.4 ± 9.3 | 48.9 ± 13.4 | 57.6 ± 7.1 |
| PostFusion-Eye [Shubi2024Finegrained] | 49.9 ± 0.1 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 55.0 ± 4.0 | 50.7 ± 4.2 | 52.3 ± 4.7 | 53.6 ± 0.9 |
Validation
| Model | Unseen Reader Balanced Accuracy | Unseen Text Balanced Accuracy | Unseen Text and Reader Balanced Accuracy | Average Balanced Accuracy | Unseen Reader AUROC | Unseen Text AUROC | Unseen Text and Reader AUROC | Average AUROC |
|---|---|---|---|---|---|---|---|---|
| Majority Class / Chance | 52.9 ± 1.6 | 49.8 ± 0.6 | 49.8 ± 1.0 | 51.0 ± 0.5 | 52.9 ± 1.6 | 49.8 ± 0.6 | 49.8 ± 1.0 | 51.0 ± 0.5 |
| Reading Speed | 60.2 ± 4.7 | 56.8 ± 6.5 | 57.1 ± 3.9 | 58.9 ± 3.7 | 62.3 ± 3.3 | 56.6 ± 7.6 | 56.8 ± 6.8 | 60.1 ± 3.0 |
| Text-Only RoBERTa | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 69.8 ± 3.0 | 63.3 ± 9.3 | 62.8 ± 5.3 | 66.7 ± 3.8 |
| Logistic Regression [meziere2023using] | 55.9 ± 3.2 | 50.2 ± 2.7 | 47.0 ± 7.0 | 53.2 ± 1.4 | 54.6 ± 4.0 | 51.1 ± 2.9 | 46.0 ± 7.0 | 53.3 ± 1.5 |
| SVM [hollenstein2023zuco] | 56.4 ± 1.5 | 59.1 ± 2.5 | 59.0 ± 2.3 | 57.8 ± 1.8 | 56.4 ± 1.5 | 59.1 ± 2.5 | 59.0 ± 2.3 | 57.8 ± 1.8 |
| Random Forest [makowski2024detection] | 62.8 ± 6.0 | 53.3 ± 4.0 | 49.2 ± 2.0 | 58.4 ± 3.2 | 63.6 ± 5.1 | 59.4 ± 7.3 | 49.2 ± 5.5 | 60.9 ± 3.6 |
| AhnRNN [ahn2020towards] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.2 | 50.1 ± 0.1 | 50.0 ± 0.0 | 50.0 ± 0.1 |
| AhnCNN [ahn2020towards] | 49.9 ± 0.1 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 59.9 ± 5.5 | 60.0 ± 9.4 | 59.0 ± 5.3 | 61.1 ± 2.6 |
| BEyeLSTM [reich_inferring_2022] | 67.5 ± 1.9 | 60.0 ± 5.0 | 59.2 ± 4.9 | 63.6 ± 1.8 | 71.1 ± 1.9 | 71.5 ± 5.6 | 67.3 ± 3.3 | 71.7 ± 2.3 |
| PLM-AS [Yang2023PLMASPL] | 56.2 ± 3.4 | 52.5 ± 1.8 | 47.1 ± 0.7 | 54.4 ± 2.5 | 61.4 ± 6.5 | 55.5 ± 2.5 | 34.4 ± 6.3 | 56.2 ± 4.7 |
| PLM-AS-RM [haller2022eye] | 61.4 ± 5.9 | 50.0 ± 0.0 | 50.0 ± 0.0 | 57.0 ± 3.8 | 70.8 ± 3.8 | 48.5 ± 11.2 | 42.0 ± 10.8 | 57.9 ± 7.7 |
| RoBERTEye-W [Shubi2024Finegrained] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 70.6 ± 6.3 | 67.0 ± 9.4 | 73.7 ± 7.3 | 69.9 ± 4.3 |
| RoBERTEye-F [Shubi2024Finegrained] | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.0 ± 0.0 | 73.3 ± 4.8 | 70.9 ± 5.9 | 57.8 ± 7.5 | 71.4 ± 1.6 |
| MAG-Eye [Shubi2024Finegrained] | 54.7 ± 4.1 | 50.0 ± 0.0 | 50.0 ± 0.0 | 52.5 ± 2.2 | 67.8 ± 3.5 | 74.4 ± 5.5 | 70.9 ± 8.2 | 71.6 ± 3.5 |
| PostFusion-Eye [Shubi2024Finegrained] | 50.6 ± 0.6 | 50.0 ± 0.0 | 50.0 ± 0.0 | 50.4 ± 0.4 | 59.5 ± 2.1 | 64.5 ± 6.7 | 54.8 ± 7.2 | 61.1 ± 4.2 |
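
Several rows in the tables above pair a balanced accuracy of exactly 50.0 ± 0.0 with an AUROC well above 50. One reading consistent with that pattern, though not stated in the tables, is that the model assigns every sample to the same class at the decision threshold while its scores still rank positives above negatives. The sketch below illustrates this with plain-Python versions of both metrics. The toy labels, scores, and the 0.5 threshold are hypothetical and are not taken from the PoTeC experiments.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls for binary labels 0/1."""
    recalls = []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


def auroc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: every score is below 0.5, so thresholding at 0.5
# predicts class 0 for all samples, yet positives mostly score higher.
y_true = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.20, 0.35, 0.30, 0.40, 0.45]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

bal_acc = balanced_accuracy(y_true, y_pred)
auc = auroc(y_true, scores)
print(f"balanced accuracy = {bal_acc:.3f}, AUROC = {auc:.3f}")
```

Under this reading, AUROC is the more informative column for such rows, because it does not depend on the choice of threshold.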