Examining the arc of 100,000 stories: a tidy analysis

David Robinson · April 26, 2017, 2 p.m.
I recently came across a great natural language dataset from Mark Riedl: 112,000 plots of stories downloaded from English-language Wikipedia. This includes books, movies, TV episodes, and video games: anything that has a Plot section on a Wikipedia page. This offers a great opportunity to analyze story structure quantitatively. In this post I’ll do a simple analysis, examining what words tend to occur at particular points within a story, including words that characterize the beginning, middle, or e...