Shit and the Need for Data-Driven Standards

Feces, faeces, ordure, dung, manure, excreta, stool, stool-NOT-faeces, and stool-NOT-feces are the prime examples in a newly published study that examines the need for data-driven standards. The study is: “Laying a Community-Based Foundation for Data-Driven Semantic Standards in Environmental Health Sciences,” Carolyn J. Mattingly, Rebecca Boyles, Cindy P. Lawler, Astrid C. Haugen, Allen Dearry, and Melissa Haendel, Environmental […]

Algorithmic Distinguishing of Novelists from their Punctuation Patterns

Adam J. Calhoun has written a wonderful blog entry that illustrates, with some great data visualization, that it is possible to algorithmically distinguish different novelists based only on their punctuation habits. The idea is simple: just remove all words from a corpus of text and look at the patterns of the punctuation. Here is an illustration. […]
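The "remove all words" idea can be sketched in a few lines of Python. This is only a minimal illustration, not Calhoun's own code: the function names `punctuation_only` and `punctuation_profile` are hypothetical, and it assumes "punctuation" means the ASCII marks in Python's `string.punctuation`, so typographic quotes and dashes would be dropped unless added to the set.

```python
import string
from collections import Counter

# Assumption: only ASCII punctuation counts as a "mark".
PUNCTUATION = set(string.punctuation)


def punctuation_only(text):
    """Return the punctuation marks of `text` in order, dropping all else."""
    return "".join(ch for ch in text if ch in PUNCTUATION)


def punctuation_profile(text):
    """Return each mark's share of all punctuation marks in `text`."""
    marks = punctuation_only(text)
    if not marks:
        return {}
    counts = Counter(marks)
    return {mark: n / len(marks) for mark, n in counts.items()}
```

Comparing the profiles from `punctuation_profile` across two novels would be one simple way to look at the kind of difference the post describes; the choice of relative frequencies as the feature is an assumption made here for illustration.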

Improbable Research