This commentary tests a methodology proposed by Munk et al. (2022) for using failed predictions in machine learning as a method to identify ambiguous and rich cases for qualitative analysis. Using a dataset describing actions performed by fictional characters interacting with machine vision technologies in 500 artworks, movies, novels and videogames, I trained a simple machine learning algorithm (using the kNN algorithm in R) to predict whether or not an action was active or passive using only information about the fictional characters. Predictable actions were generally unemotional and unambiguous activities where machine vision technologies were treated as simple tools. Unpredictable actions, that is, actions that the algorithm could not correctly predict, were more ambivalent and emotionally loaded, with more complex power relationships between characters and technologies. The results thus support Munk et al.'s theory that failed predictions can be productively used to identify rich cases for qualitative analysis. This test goes beyond simply replicating Munk et al.'s results by demonstrating that the method can be applied to a broader humanities domain, and that it does not require complex neural networks but can also work with a simpler machine learning algorithm. Further research is needed to develop an understanding of what kinds of data the method is useful for and which kinds of machine learning are most generative. To support this, the R code required to produce the results is included so the test can be replicated. The code can also be reused or adapted to test the method on other datasets.
翻译:本文验证了Munk等人(2022)提出的方法论——利用机器学习中的失败预测来识别定性分析中的模糊而丰富的案例。基于一个描述500件艺术品、电影、小说和电子游戏中虚构角色与机器视觉技术互动行为的数据集,我训练了一个简单的机器学习算法(在R中使用kNN算法),仅通过虚构角色的信息来预测某个行为是主动还是被动。可预测的行为通常是缺乏情感且意义明确的动作,其中机器视觉技术被视作简单工具。而算法无法正确预测的不可预测行为则更具矛盾性和情感负载,展现角色与技术之间更复杂的权力关系。该结果支持了Munk等人的理论:失败预测可有效用于识别定性分析中的丰富案例。本研究不仅复现了Munk等人的结论,还进一步证明该方法可应用于更广泛的人文学科领域,且无需复杂的神经网络,简单的机器学习算法即可实现。未来研究需进一步探索该方法适用于何种数据类型以及何种机器学习模型最具生成性。为支持后续研究,附录中提供了生成结果所需的R代码以供复现,该代码也可被复用或调整以测试其他数据集。