Deep Learning in Archiving Indus Script and Motif Information

One day Artificial Intelligence (AI) could be of great help in reading ancient Indus inscriptions. There are likely to be many steps before we get there, however. These will include further discoveries of Indus-specific materials, new excavations, and proper databases covering both those and previous discoveries. There will also be advances in areas outside Indus studies proper, like computer vision and image recognition, that will serve an AI effort to decipher the ancient Indus script. Being able to clearly recognize signs is one such component. So far this has largely been done by hand, by scholars like Iravatham Mahadevan, Asko Parpola and others.

This paper, just published in the Journal of Computer Applications in Archaeology (CAA), was written over a number of years by students under the guidance of Prof. Debasis Mitra at the Florida Institute of Technology. It presents a novel, fully implemented computational pipeline for the automated digitization and archival of image-based data from seals of the ancient Indus Valley Civilization. The system is designed to extract and store key information from seal images, specifically the script (graphemes) and motifs, using deep learning models integrated with a custom database. It is a big step towards using automated tools to help tackle the Indus script problem.

"Convolutional Neural Networks (CNNs) form the foundation of our tool components. They have transformed computer vision by enabling classification, reconstruction, and even generation of visual data," (p. 158) write the authors. There are three major components, using tools developed and tailored to Indus-script reading exercises. What they call Ancient Script Recognition Network (ASR-net) combines two separate applications, YOLOv3 (for bounding box detection of individual graphemes) and MobileNet (for grapheme identification). One can call this Optical Character Recognition (OCR) for ancient scripts. The third element, Motif Identification Network (MI-net) is a custom CNN that identifies recurring iconographic motifs on seals, e.g. whether it is a unicorn, tiger-looking man on tree, elephant, horned ram, etc. that is on the seal being analyzed, bringing into play the important if still not understood context adjacent to the seal inscription. The system is trained and validated on a manually annotated dataset of 963 seal images, with 40 grapheme classes and 11 motif classes. Extracted information is archived in a relational database, enabling efficient querying and analysis. The system supports automated, large-scale statistical analysis of Indus script and motifs, aiming to aid ongoing decipherment efforts. The methodology is transparent, with clear descriptions of data collection, annotation, model training, and validation.

The results demonstrate strong performance, particularly for grapheme identification. MobileNet achieved a validation accuracy of 94.5% and an F1 score of 95% for grapheme identification; these can roughly be described as measures of confidence that an automated reading is correct. MI-net attained a mean accuracy of 85% across five-fold cross-validation for motif identification.

There are still challenges, of course, as the authors readily point out: class imbalance in the motif data, with some motifs extremely rare, and stylistic variation and ambiguity in motif classes, both of which complicate human and machine classification alike. Indeed, humans struggle to classify some of the signs, which are often eroded or broken, so such high scores from an automated approach are excellent, and these results were achieved with less than half of all known seal inscriptions being put to use in the applications. There is also the exclusion of damaged or broken seals from training, which limits robustness in real-world archaeological contexts. At the same time, there are new avenues to enhance the analysis in the future, besides expanding the dataset, like using even more advanced machine learning techniques (e.g., zero-shot learning or YOLOv5). Finally, how might the integration of richer archaeological metadata (e.g., provenance, context, material) enhance the interpretability and utility of the archived data for both computational and traditional researchers?
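For readers who want the reported numbers made concrete, here is a short sketch, assuming scikit-learn, of how a validation accuracy, an F1 score, and a five-fold cross-validated mean accuracy of this kind are typically computed. The F1 averaging choice, the fold setup, and the names (`model_factory`, `X`, `y`) are assumptions for illustration, not details from the paper.

```python
# Hypothetical metrics sketch using scikit-learn. y_true/y_pred would
# come from a held-out validation split of the grapheme classifier;
# X, y, and model_factory stand in for the motif data and an
# untrained MI-net-style model.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import StratifiedKFold


def grapheme_scores(y_true, y_pred):
    """Validation accuracy and F1 (weighted averaging assumed here)."""
    return (accuracy_score(y_true, y_pred),
            f1_score(y_true, y_pred, average="weighted"))


def five_fold_mean_accuracy(model_factory, X, y, seed=0):
    """Mean accuracy over five stratified folds, as reported for MI-net."""
    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in folds.split(X, y):
        model = model_factory()  # fresh, untrained model for each fold
        model.fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[val_idx], model.predict(X[val_idx])))
    return float(np.mean(scores))
```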

This article is a significant contribution to computational archaeology and the study of undeciphered scripts. The use of YOLOv3 for object detection and MobileNet for efficient grapheme classification is well justified, balancing accuracy with computational efficiency, which is especially important for potential field deployment. Indeed, field deployment is where one can hope to one day see this set of applications used: someone discovers or catalogues a seal on site with a mobile camera, and the data is sent to a central database (we still need a robust public database of ancient Indus signs!), where a preliminary reading is made. With proper annotations and access, such a database and pipeline would be a major step towards using more automated methods to help us understand the Indus script, its meaning and function (today it can take years before a newly discovered seal is published or brought to the attention of other scholars).
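To picture that imagined workflow, here is a purely speculative client-side sketch in Python: a field worker's photograph is uploaded to a central service that runs the pipeline and returns a preliminary reading. The endpoint URL, payload fields, and response format are invented for illustration; no such service exists yet.

```python
# Entirely hypothetical field-client sketch: the endpoint, payload
# fields, and response format are invented for this example.
import requests


def submit_seal_photo(image_path, site_code):
    """Upload a seal photograph; return the service's preliminary reading."""
    with open(image_path, "rb") as f:
        resp = requests.post(
            "https://example.org/api/seals",  # invented endpoint
            files={"image": f},
            data={"site": site_code},         # minimal provenance tag
            timeout=30,
        )
    resp.raise_for_status()
    return resp.json()  # e.g. {"graphemes": [...], "motif": "unicorn"}
```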

Image: A few samples from the image dataset of IVC seals.