A taxonomy and review of generalization research in NLP
Hupkes, D., Giulianelli, M., Dankers, V., Artetxe, M., Elazar, Y., Pimentel, T., Christodoulopoulos, C., Lasri, K., Saphra, N., Sinclair, A., Ulmer, D., Schottmann, F., Batsuren, K., Sun, K., Sinha, K., Khalatbari, L., Ryskina, M., Frieske, R., Cotterell, R., Jin, Z.
Nature Machine Intelligence, vol. 5, no. 10, pp. 1161-1174
Non-Repeatable Experiments and Non-Reproducible Results: The Reproducibility Crisis in Human Evaluation in NLP
Belz, A., Thomson, C., Reiter, E., Mille, S.
Findings of the Association for Computational Linguistics: ACL 2023. Rogers, A., Boyd-Graber, J., Okazaki, N. (eds.). Association for Computational Linguistics, pp. 3676-3687, 12 pages
Chapters in Books, Reports and Conference Proceedings: Chapters
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP
Belz, A., Thomson, C., Reiter, E., Abercrombie, G., Alonso Moral, J. M., Arvan, M., Cheung, J., Cieliebak, M., Clark, E., van Deemter, K., Dinkar, T., Dušek, O., Eger, S., Fang, Q., Gatt, A., Gkatzia, D., González Corbelle, J., Hovy, D., Hürlimann, M., Ito, T., Kelleher, J. D., Klubicka, F., Lai, H., van der Lee, C., van Miltenburg, E., Li, Y., Mahamood, S., Mieskes, M., Nissim, M., Parde, N., Plátek, O., Rieser, V., Mosteiro Romero, P., Tetreault, J., Toral, A., Wang, X., Wanner, L., Watson, L., Yang, D.
Chapters in Books, Reports and Conference Proceedings: Conference Proceedings
State-of-the-art generalisation research in NLP: a taxonomy and review
Hupkes, D., Giulianelli, M., Dankers, V., Artetxe, M., Elazar, Y., Pimentel, T., Christodoulopoulos, C., Lasri, K., Saphra, N., Sinclair, A., Ulmer, D., Schottmann, F., Batsuren, K., Sun, K., Sinha, K., Khalatbari, L., Ryskina, M., Frieske, R., Cotterell, R., Jin, Z.