Segmentation of Indus Texts

A statistical analysis by the Tata Institute of Fundamental Research Team.


We adopt a comprehensive approach to segmenting the Indus texts, using statistically significant signs and sign combinations together with all texts of length 2, 3 and 4 signs. With this method we can segment 88% of the Indus texts of length 5 and above, which suggests that texts of 5 or more signs can be seen as permutations of other frequent sign combinations or of smaller texts (of length 2, 3 or 4 signs). The segmentation results agree with our earlier results (Yadav et al., 2008, henceforth referred to as Paper 1), where we showed that 2-, 3- and 4-sign combinations are important units of information. We assume nothing about the content of the script; the work is based purely on structural analysis of the Indus texts.
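To make the idea of segmentation concrete, the following is a minimal sketch of how a text could be tested for a complete split into known units. It is an illustration under stated assumptions, not the procedure used in this work: signs are represented by hypothetical integer identifiers, `units` stands in for the collection of significant signs, frequent sign combinations and short texts, and the dynamic-programming search (with its preference for longer units) is one simple way to look for a split. The function names `segment` and `coverage` are likewise invented for the example.

```python
from typing import List, Optional, Sequence, Set, Tuple

# A unit is a short run of signs; sign identifiers here are hypothetical integers.
Unit = Tuple[int, ...]


def segment(text: Sequence[int], units: Set[Unit],
            max_len: int = 4) -> Optional[List[Unit]]:
    """Return one split of `text` into known units, or None if none exists."""
    n = len(text)
    # best[i] holds a segmentation of text[:i], or None if text[:i] cannot be covered.
    best: List[Optional[List[Unit]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        # Try the longest candidate for the final unit first (an assumed preference).
        for size in range(min(max_len, end), 0, -1):
            start = end - size
            piece = tuple(text[start:end])
            if best[start] is not None and piece in units:
                best[end] = best[start] + [piece]
                break
    return best[n]


def coverage(texts: Sequence[Sequence[int]], units: Set[Unit],
             min_len: int = 5) -> float:
    """Fraction of texts with at least `min_len` signs that can be fully segmented."""
    long_texts = [t for t in texts if len(t) >= min_len]
    if not long_texts:
        return 0.0
    done = sum(segment(t, units) is not None for t in long_texts)
    return done / len(long_texts)


# Toy data, invented for illustration only.
toy_units = {(1, 2), (3, 4, 5), (6,)}
toy_texts = [[1, 2, 3, 4, 5, 6], [1, 2, 9, 9, 9], [1, 2]]
print(segment(toy_texts[0], toy_units))
print(coverage(toy_texts, toy_units))
```

On the toy data, the first text splits into a 2-sign unit, a 3-sign unit and a single sign, the second cannot be covered, and the third is too short to count, so the coverage is one half. A figure such as the 88% reported above would correspond to this kind of fraction computed over the real corpus and the real unit inventory.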