Patterns in the Script of Indus Valley Civilization

Nisha Yadav and M. N. Vahia, Tata Institute of Fundamental Research


The people who wrote in the Indus script were highly skilled and specialised. They had a mutually agreed coding that was valid over the entire stretch of the civilisation and beyond. The Indus script was an intellectual and abstract creation of the highest standards.


Mohenjo-daro is an archeological site in the province of Sindh, Pakistan. Built around 2600 BCE, it was one of the largest settlements of the ancient Indus Valley Civilization.

The Indus Valley civilization, also called the Harappan civilization, was one of the largest Bronze Age civilizations in the world. At its peak urban phase (ca. 2600 to 1900 BC) it was spread over an area of more than a million square kilometers in the northwestern parts of India, present-day Pakistan and some parts of Afghanistan and Iran. The Indus Valley civilization has been justifiably acclaimed for its well-planned cities, excellent water management systems, highly standardized architecture and rich lifestyle over such a vast area.

The roots of the civilization can be traced back to the site of Mehrgarh in Pakistan, dated to about 7000 BC. The civilization reached its peak around 2600 BC and went into decline around 1900 BC. At its peak, the civilization included thousands of settlements: large cities (such as Mohenjo-daro, Harappa and Dholavira), modest towns and small villages. The sites are classified as Harappan if they show some of the major characteristics of the civilization.

The civilization seems to have adopted a standardized writing style over its entire area of influence around 2600 BC, and this style remained in use until about 1900 BC. Harappan writing can also be found in West Asia, as far away as Sumer in present-day Iraq. About five thousand samples of inscribed objects have been discovered from several sites of the civilization. These include steatite or terracotta seals and sealings (impressions of seals), copper tablets, pottery and other material. The inscribed objects, in general, have a wide variety of designs and contents. Apart from the yet undeciphered script of the Indus Valley civilization, these objects often have images of animals, mythical figures, composite and multi-headed animals, scenes with people (perhaps mythical), and other types of geometric and abstract motifs. The most common animal motif, depicted on a majority of these objects, is the Unicorn. Harappan seals provide the earliest known depiction of the Unicorn, which was often visualized as a legendary animal in later cultures. The purpose of these inscribed objects is not clear. Some of them are thought to have been used for stamping clay tags attached to bales of goods; however, they may have had other uses as well.

An example of a square Indus seal is shown in Figure 1. The average length of the Indus texts is about five signs. Several external and internal features of the Indus texts have led to the suggestion that the direction of the Indus writing is right to left. These include cramping of signs towards the left end of objects and overflow of signs to subsequent lines at the left end of objects.


Figure 1: A square Indus seal with an Indus text consisting of eight signs on top, a field symbol (Unicorn) in the middle and a decorated object at the bottom left. The seal is about 5 cm x 5 cm in size (Copyright Harappa Archaeological Research Project/J. M. Kenoyer, Courtesy Dept. of Archaeology and Museums, Govt. of Pakistan).

A large fraction of the objects inscribed with the Indus script are seals, mostly between 2 and 5 square centimeters in size. Only two samples of Indus script on larger objects have been discovered so far, both from the site of Dholavira in Gujarat. One of them is a large wooden board (about three meters in length) with ten Indus signs. The other is a stone slab with four Indus signs, which was discovered in an underground chamber at Dholavira.

Geometric and abstract patterns are particularly interesting because of their extensive use of symmetry and a select number of divisions. Interestingly, a geometric design that is referred to as the Swastika in later literature is also found on Harappan seals. There are some very complex concentric circular patterns on objects no more than a few square centimeters in size, suggesting a strong commitment to precision. Our study of some of these complex geometric patterns reveals a remarkable understanding of geometric space in the art of their creators.

One of the most enigmatic creations of the civilization is its script. It came to light several decades ago with the discovery of the civilization in the 1920s but its content remains undeciphered. A vital aspect of the Indus Valley civilization therefore remains obscure. The main hurdles in the decipherment of the Indus script include extreme brevity of the Indus texts, absence of information on their content and usage, absence of bilingual or multilingual texts, lack of knowledge about their language(s) and apparent discontinuity in traditions at the decline of the Indus Valley civilization.

The signary of the Indus script consists of signs of various designs, and the designs vary in complexity from sign to sign. Some signs visually resemble different forms of stick-like human figures, animals such as fish, crabs and birds, or vertical strokes that may or may not have a numeric function. Other signs of various geometric designs such as crosses, diamonds, triangles, U-shapes, ovals and abstract shapes can also be seen in the signary.

The total number of signs in the signary of Indus script is generally agreed by scholars such as Iravatham Mahadevan and Asko Parpola to be around 400 though Bryan Wells identifies about 676 distinct signs. The first two concordances created by Mahadevan and Parpola are largely in agreement with each other though small differences in the identification of a few signs remain.

The number of signs in a script generally indicates the type of the script. Logographic scripts (such as Chinese) have thousands of signs, with each sign corresponding to a word. In contrast, in alphabetic scripts, which consist of single-sound signs, the number of signs often does not exceed forty. Another class of scripts lying between these two extremes has about 400 to 900 signs in the signary, and these are considered to be logo-syllabic scripts. Scripts such as ancient Egyptian Hieroglyphs and Mesopotamian Cuneiform belong to the logo-syllabic category. In logo-syllabic scripts, each sign is used both for its pictorial value and for its phonetic value. Based on the total number of signs in its signary, the Indus script is suggested to be logo-syllabic.
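The rule of thumb above can be written down as a small sketch. The function name is hypothetical, the cut-off of 1000 for "thousands of signs" is an assumption made only for illustration, and the ranges that the text does not cover are reported as indeterminate rather than guessed.

```python
def classify_script_by_sign_count(sign_count: int) -> str:
    """Rough script-type guess from the size of the signary.

    Thresholds follow the ranges quoted in the text. The lower bound of
    1000 for logographic scripts is an assumption ("thousands of signs"),
    and counts falling between the quoted ranges are left undecided.
    """
    if sign_count <= 0:
        raise ValueError("sign_count must be positive")
    if sign_count <= 40:
        return "alphabetic"
    if 400 <= sign_count <= 900:
        return "logo-syllabic"
    if sign_count >= 1000:  # assumed cut-off, not given in the text
        return "logographic"
    return "indeterminate"


# The two sign counts quoted for the Indus script fall in the same class.
for label, count in [("Mahadevan/Parpola", 400), ("Wells", 676)]:
    print(label, count, classify_script_by_sign_count(count))
```

Both published estimates of the Indus signary land in the logo-syllabic range, which is why the disagreement over the exact count does not change the suggested script type.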

The nature and content of the Indus script have been extensively debated in the literature. More than one hundred attempts have been made to assign meanings to various signs and sign combinations of the Indus script, relating it to a Proto-Dravidian language on the one hand and to an Indo-Aryan language on the other. It has even been suggested that the script is entirely numeric, or that it is merely a collection of symbols. Most of the interpretations are at variance with each other and at times even internally inconsistent. None of these interpretations is satisfactory. Hence, the problem of the Indus script remains unresolved, with no consensus on any of the interpretations.

An objective approach to address the problem of the Indus script is to extract the syntax of the Indus writing using computational and statistical methods before attempting its interpretation. Such an approach does not require any a priori assumptions about the content and the grammar of the writing but it can provide significant insights into its structure.
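A first step of this kind is simply to count how often each sign occurs and where it occurs in a text. The sketch below is illustrative only: the corpus is a hypothetical toy dataset with made-up sign labels (`S1`, `S2`, ...), not the actual Indus data, and each text is assumed to be stored in reading order so that the first element is the text beginner and the last is the text ender.

```python
from collections import Counter

# Hypothetical toy corpus: made-up sign labels, one list per text,
# stored in reading order (first element = text beginner).
texts = [
    ["S1", "S2", "S3"],
    ["S4", "S2", "S3"],
    ["S5", "S6", "S3"],
    ["S1", "S6", "S7"],
]


def sign_statistics(corpus):
    """Return overall, text-initial and text-final sign counts."""
    frequency = Counter(sign for text in corpus for sign in text)
    beginners = Counter(text[0] for text in corpus if text)
    enders = Counter(text[-1] for text in corpus if text)
    return frequency, beginners, enders


frequency, beginners, enders = sign_statistics(texts)
print("most frequent signs:", frequency.most_common(3))
print("distinct beginners:", len(beginners))
print("distinct enders:", len(enders))
```

Counts like these need no assumption about what any sign means, which is the point of the approach: structure is measured first and interpretation comes later.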

About six years ago, we at TIFR decided to take a fresh look at the problem. In our study of the Indus script, we used a series of computational and statistical methods on the dataset of the Indus script. We highlight some important conclusions on the patterns of the Indus script based on our study.

  • Indus writing is highly structured and the sequencing of signs in Indus texts follows definite rules.

  • The sign frequency distribution of the Indus script follows the Zipf-Mandelbrot law, an empirical law followed by ordered systems.

  • There is an asymmetry in the usage of text beginners and text enders: very few signs account for most of the text enders, while a relatively large number of signs occur as text beginners.

  • It is possible to identify pairs of signs that appear together in the longer Indus texts but often do not appear consecutively. Using this insight, it is possible to revisit the entire dataset and demonstrate that most of the longer Indus texts can be segmented into smaller units.

  • A machine learning model of the Indus script based on nearest neighbour associations can successfully predict signs in an Indus text with about 75% accuracy. It can also generate sample Indus texts, some of which are found in the dataset of Indus texts.

  • The Indus script seems to be versatile enough to permit writing of different content as can be seen from the texts on the Indus seals found at West Asian sites. However, in the case of West Asian seals, the grammar seems to have been tweaked.

  • Studies of the flexibility in sign usage suggest that the sequencing of signs in Indus writing is as flexible as one would expect for natural linguistic systems and is much more than that in artificial linguistic systems (computer languages). However, it is less flexible in comparison to the systems in which abstractions are conveyed (music) or the manner in which biological information is coded (DNA or Protein).

  • While there is a common thread of rules and grammatical structures in the Indus writing, subtle variation in sign usage across sites and types of objects suggests that the writing on different types of objects and at different sites does provide individualistic clues to its content.

  • Based on their design, the Indus signs can be classified into two major categories: Basic signs and Composite signs, which are composites of other signs and additives.

  • Study of the design of Indus signs suggests that the compound signs are not simply a space-saving technique, since the texts in which the compound signs appear are very different from those of their constituents in any combination.
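Two of the findings above lend themselves to a short sketch. The Zipf-Mandelbrot law is usually written as f(r) = a / (r + b)^c, where r is the frequency rank of a sign and a, b, c are fitted constants. For sign prediction, the code uses a plain bigram count model as a stand-in for "nearest neighbour associations"; this is a substitute chosen for illustration, not the model used in the study. The corpus and the sign labels are hypothetical.

```python
from collections import Counter, defaultdict


def zipf_mandelbrot(rank, a, b, c):
    """Frequency expected at a given rank: a / (rank + b) ** c."""
    return a / (rank + b) ** c


def train_bigrams(corpus):
    """Count, for every sign, which signs directly follow it."""
    followers = defaultdict(Counter)
    for text in corpus:
        for current, following in zip(text, text[1:]):
            followers[current][following] += 1
    return followers


def predict_next(followers, sign):
    """Return the most frequent follower of `sign`, or None if unseen."""
    if sign not in followers:
        return None
    return followers[sign].most_common(1)[0][0]


# Hypothetical toy corpus with made-up sign labels, in reading order.
texts = [
    ["S1", "S2", "S3"],
    ["S4", "S2", "S3"],
    ["S5", "S6", "S3"],
    ["S1", "S6", "S7"],
]

model = train_bigrams(texts)
print(predict_next(model, "S2"))
print(zipf_mandelbrot(2, a=100.0, b=0.0, c=1.0))
```

With b = 0 and c = 1 the formula reduces to the ordinary Zipf law. A real test of predictive accuracy would hold out part of the corpus, mask signs in the held-out texts, and compare the predictions against the masked signs.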

These results indicate that the people who wrote in the Indus script were highly skilled and specialised. They had a mutually agreed coding that was valid over the entire stretch of the civilisation and beyond. The Indus script was an intellectual and abstract creation of the highest standards. It was in use for about 700 years without significant changes – even though some rare signs that were used only once or twice can also be found. The rules of writing were also fairly precise and could be understood by all those who could read or use the writing. The design of the script also suggests a complex pattern. A lot of thought had gone into the design of signs and the grammar to be adopted. Any proposed interpretation of the Indus script should be able to explain these characteristics before it is accepted. A successful decipherment will help us peek into this extraordinary and ingenious creation of the Harappan people. It is important that we pursue this study and decipher the script. It is one of the great challenges for scientists.

More published papers on the subject are available at



Nisha Yadav is a Scientific Officer at the Tata Institute of Fundamental Research. She has expertise in making space-based astronomy telescopes and in related fields. For the past few years she has been pursuing a study of the Indus Valley civilization, with a special interest in its writing. She has published several papers in leading international and Indian journals and has been invited to present her work at various meetings in India and abroad. Her research aims at understanding the structure of the Indus script using various computational and statistical techniques and explores its relation to other aspects of the culture.


Prof. Mayank Vahia is a scientist at TIFR in the department of astronomy and astrophysics. He has wide-ranging interests in astronomy in particular and science in general. After spending 25 years making telescopes that were flown on American, Russian and Indian satellites, he now works on the history of astronomy and of science more broadly. He also coordinates the Astronomy Olympiad programme in India.
