Last night I gave a presentation on Project Bow, an ape language research project, and what issues concerning rigorous proof were presented by spontaneous and non-replicable results.
Modern scientific research, especially in the social sciences, relies on aggregated data that shows replicable, statistically significant results. Many researchers in the social sciences don't actually understand statistics; they use programs that do all the calculations for them. Yet they are convinced that without statistics, nothing can be proven. They forget the rules of simple logic and finite mathematics. They deny that a single example can falsify a rule, or that an outcome far more likely than chance might be enough to prove a point.
When it comes to linguistic data, once we can agree on a phonemic inventory for a language, the data is finite. There is only a finite number of possible combinations of those phonemes. If the data is written, calculating the probability of any given form being chosen is easy, and it requires no higher-order statistics. If the likelihood of the choice is far greater than chance, replication may not be necessary. This is how we judge that someone speaks our language after a limited exchange, rather than by requiring constant repetition.
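The calculation in question really is elementary. As a minimal sketch (the 20-phoneme inventory, the utterance length, and the function name here are illustrative assumptions, not data from any actual study): if phonemes were drawn uniformly at random, the chance of hitting one specific well-formed string of length k is simply 1 divided by the inventory size raised to the k-th power.

```python
def chance_probability(inventory_size: int, length: int) -> float:
    """Probability of producing one specific phoneme string of the
    given length by drawing phonemes uniformly at random."""
    return 1 / inventory_size ** length

# With an assumed 20-phoneme inventory, even a short five-phoneme
# form is vanishingly unlikely to occur by chance:
p = chance_probability(20, 5)
print(p)  # 1 / 20**5 = 3.125e-07
```

A subject who repeatedly produces appropriate forms at anywhere near this improbability is plainly not guessing, which is the whole argument: no regression model is needed, only the arithmetic of finite combinations.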
The findings of the Neogrammarians on the relatedness of the Indo-European languages rested on the idea that the similarities among the roots of these languages could not be due to chance. No statistics were necessary to establish that proof.
This raises the question: why use statistics when probability will do?