Decoding 6,000-Year-Old Language Can Bury North-South Divide

Work by an Indian cryptographer on the still-undeciphered Indus Valley script suggests Sanskrit may have been the root language. If true, it calls into question the Aryan invasion theory as well as the north-south divide.

Mar 8, 2025, 13:09 IST
Given the immense interest sparked by Nirmala Sitharaman’s post, The Times of India has made this piece free to read. Yajnadevam’s research challenges the Aryan invasion theory: was Sanskrit spoken and written in 4000 BCE? Read on to explore a potential rewriting of history.
Yajnadevam, aka Bharath Rao, is a cryptographer, a rarity in a field of epigraphists, archaeologists and linguists, who claims to have cracked the code of the Indus Valley script. The earliest pottery with Indus script symbols dates to 4000 BCE. Since its discovery, the script has resisted every attempted decipherment.
Some of the most famous decipherments of other ancient scripts were made largely due to the presence of bilingual or trilingual inscriptions. For example, during Napoleon’s invasion of Egypt, his army found the Rosetta Stone while demolishing a fort. This had the same message written in ancient Greek, Egyptian hieroglyphics, and another ancient Egyptian script. While England seized the Rosetta Stone after defeating France, a French scholar, Champollion, managed to decipher hieroglyphics using the fact that the same Greek names occurred both in the Greek portion and the hieroglyphic portion of the inscription.
Another famous decipherment was due to Henry Rawlinson, a young cadet with the East India Company’s army, who taught himself Persian and was posted to Persia. He travelled to the Zagros Mountains, where there was a trilingual inscription in Old Persian cuneiform, Elamite, and Babylonian. Risking his life, Rawlinson managed to climb the sheer cliff and accurately copy the entire inscription. Later, he would use his knowledge of Persian to decipher the Old Persian portion, which was key to also deciphering the Elamite and Babylonian scripts.
Closer home, a breakthrough in the decipherment of the Brahmi script was made using a bilingual coin minted by the Indo-Bactrian king, Agathocles. The coin had his name in Greek on one side, and in Brahmi on the other. Unlike all these cases, however, no bilingual or trilingual inscription has been found for the Indus script. Indus inscriptions were never conveniently accompanied by an identical message in a known script!
Yajnadevam used cryptography, treating the unknown Indus script as a cipher encoding a known language (for which he first had to determine which language would be the best candidate). His work was built on information theory, especially on a famous 1945 paper written by Claude Shannon, the father of modern information theory. During World War II, Shannon was asked to determine how to make secret codes unbreakable. However, Shannon found that a code relying on a fixed key could, in principle, be broken once enough messages written in it had been read.
Moreover, reading enough text ensured that the solution found by the code-breakers was unique. Uniqueness and correctness mean that if I decipher your code as saying, “We will bomb the enemy on Tuesday,” it is not possible for the code to also mean, “We will have eggs for breakfast.” This was also how, centuries earlier, an English code-breaker had broken the coded messages sent by Mary, Queen of Scots, to her loyalists: simply by intercepting enough of her coded letters.
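Shannon’s threshold can be made concrete with a small calculation. The sketch below is not taken from Yajnadevam’s work. It applies Shannon’s unicity-distance formula, U = H(K) / D, to the textbook case of a simple substitution cipher over 26-letter English. The entropy rate of about 1.5 bits per letter is a commonly quoted estimate and is an assumption here, as are the function names.

```python
import math


def substitution_key_entropy(alphabet_size: int) -> float:
    """Bits needed to pick one of the alphabet_size! possible substitution keys."""
    return math.log2(math.factorial(alphabet_size))


def redundancy(alphabet_size: int, entropy_rate: float) -> float:
    """Redundancy per symbol: maximum possible entropy minus the language's entropy rate."""
    return math.log2(alphabet_size) - entropy_rate


def unicity_distance(key_entropy_bits: float, redundancy_bits: float) -> float:
    """Approximate amount of ciphertext after which only one decipherment remains."""
    if redundancy_bits <= 0:
        return math.inf  # a language with no redundancy never yields a unique reading
    return key_entropy_bits / redundancy_bits


# Illustrative figures for English, not for the Indus script.
ENGLISH_ENTROPY_RATE = 1.5  # assumed bits per letter
key_bits = substitution_key_entropy(26)
red_bits = redundancy(26, ENGLISH_ENTROPY_RATE)
english_unicity = unicity_distance(key_bits, red_bits)
print(f"key entropy: {key_bits:.1f} bits, redundancy: {red_bits:.2f} bits/letter")
print(f"unicity distance: about {english_unicity:.0f} letters")
```

Under these assumptions, roughly 28 letters of ciphertext are enough for the reading to be unique in principle. The corresponding figure for any other script depends on the number of symbols and on the redundancy of the candidate language, neither of which is given in this article.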

Most earlier attempts at decipherment involved assigning values to very short inscriptions (one or two symbols long). However, when the same symbols occurred in longer inscriptions, the assigned values would fail to result in a meaningful phrase or a grammatically correct word – resulting in the decipherers assigning an entirely different possible meaning to the same symbols once they occurred in longer inscriptions. Thus, the number of possible solutions kept going up as more messages were read, so no one could say what the correct solution actually was.
As an example, suppose a decipherer finds three one-symbol inscriptions and thinks they mean “cat”, “jar”, and “go”. But he then finds a fourth inscription with all three of these symbols. It is hard to justify “cat jar go” as a meaningful sentence, and so he must assign a completely different meaning to this combination of symbols once they occur together.
Yajnadevam’s first task was to fix on a likely candidate for the language of the Indus script. At the outset, he was able to rule out a large family of languages called “agglutinative languages”, which includes all Dravidian languages and ancient Middle Eastern or Near Eastern languages like Sumerian, Elamite, and Hittite. This was because of several mismatches between the pattern that all these languages follow and the pattern of the Indus symbols.
First, the Indus script had cases where the same symbol was repeated three times consecutively. This never occurred in Dravidian or other agglutinative languages, but it did occur in old Vedic forms of Sanskrit (for instance, jajaja, which means “I fought”).
Second, agglutinative languages never had compound words composed of more than two root words, while the Indus script had such words, as does Sanskrit. Third, agglutinative languages had prefixes or suffixes that could not exist as separate words and that were always joined to the root word in a fixed order. However, the Indus script had possible prefixes and suffixes also occur as separate words, and they occurred in different positional order in different inscriptions. This was inconsistent with agglutinative languages, but consistent with Sanskrit. Thus, Yajnadevam started his decipherment taking Sanskrit to be the language of the Indus script.
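The first of these three checks, the search for a symbol repeated three times in a row, is easy to express in code. The sketch below is only an illustration: the inscriptions are made-up lists of placeholder symbol names, and the function names are hypothetical rather than drawn from the research described here.

```python
from itertools import groupby


def longest_run(inscription):
    """Length of the longest stretch of one symbol repeated consecutively."""
    return max((len(list(group)) for _, group in groupby(inscription)), default=0)


def has_triple_repeat(inscription):
    """True if some symbol occurs at least three times in a row."""
    return longest_run(inscription) >= 3


# Made-up inscriptions; each string stands in for one script symbol.
toy_inscriptions = [
    ["fish", "jar", "jar"],
    ["man", "man", "man", "jar"],
    ["jar", "fish", "jar", "jar"],
]
triples = [ins for ins in toy_inscriptions if has_triple_repeat(ins)]
print(triples)
```

Only the second toy inscription contains a run of three. On a real corpus, the same scan would show how often such runs occur and which symbols they involve.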
By using standard code-breaking methods (first identifying the symbol with the highest frequency, then the symbol which occurs most frequently along with it, in a sequence until all symbols are identified), he assigned values to the symbols.
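The ordering described above, most frequent symbol first and then the symbol most often found alongside it, can be sketched as follows. This is a minimal illustration on invented data, not a reconstruction of Yajnadevam’s procedure. The inscriptions, the symbol labels and the function names are all assumptions, and “along with” is read here as “immediately adjacent to”.

```python
from collections import Counter


def symbol_frequencies(inscriptions):
    """Count how often each symbol appears across all inscriptions."""
    return Counter(sym for ins in inscriptions for sym in ins)


def neighbours_of(inscriptions, target):
    """Count the symbols that appear immediately before or after target."""
    counts = Counter()
    for ins in inscriptions:
        for left, right in zip(ins, ins[1:]):
            if left == target and right != target:
                counts[right] += 1
            elif right == target and left != target:
                counts[left] += 1
    return counts


def identification_order(inscriptions):
    """Most frequent symbol first, then repeatedly the unplaced symbol that most
    often sits next to the one just placed (ties broken by overall frequency)."""
    freq = symbol_frequencies(inscriptions)
    order = [freq.most_common(1)[0][0]]
    remaining = set(freq) - set(order)
    while remaining:
        near = neighbours_of(inscriptions, order[-1])
        nxt = max(sorted(remaining), key=lambda s: (near[s], freq[s]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Invented corpus; the letters are placeholders for script symbols.
toy_corpus = [["A", "B", "C"], ["A", "B"], ["B", "A", "D"], ["A", "C"]]
order = identification_order(toy_corpus)
print(order)
```

The order only says which symbols to tackle first. Assigning sound values still requires matching these statistics against those of the candidate language and checking that the resulting words are meaningful.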
He then found, to his surprise, that the assigned values resulted in meaningful words and grammatically correct Sanskrit expressions. At this point, he was able to decipher and translate messages on individual Indus inscriptions and seals, eventually reading more than enough to satisfy Shannon’s threshold for uniqueness. This showed that not only was it possible to read the inscriptions in Sanskrit, but that it would not be possible to read the entire body of Indus inscriptions using a different language. He says his work establishes Sanskrit as the language of the Indus Valley Civilisation (IVC).
The translated inscriptions mention Vedic deities (like Shiva/Rudra, Indra, and others), yajnas or havans, horses, food, and often ask the deities for blessings or protection for a sea voyage. There are messages where the writer mentions that the ocean is his home. This bears out archaeological evidence and Sumerian accounts of extensive international trade (including ocean trade) during the IVC.
Some messages were carved onto bangles and other ornaments, showing that both the craftsman and the buyer were literate. Some Indus script inscriptions found in foreign locations, like Susa or Ur, used Indus script to write words in Akkadian for specific traded goods, like “wine” or “cumin”. This was presumably done so that it would be understandable by both parties in an international transaction.
The next interesting thing that Yajnadevam did was to compare each Indus symbol with the symbol producing the same sound in Brahmi. He found an amazing physical similarity between Indus symbols and Brahmi symbols associated with the same sound. This, coupled with the existence of mixed inscriptions containing both Indus and Brahmi script, showed that Brahmi had naturally evolved from Indus script.
Such mixed inscriptions were found all over the country, even as far south as Keezhadi in Tamil Nadu, where 600 BCE mixed inscriptions could be read as meaningful Sanskrit words (for example, “powerful” was written on an axe). In fact, some mixed inscriptions persisted even into the Gupta age, while others were found in foreign countries like Vietnam (where they could also be read as Sanskrit).
How does this decipherment change our view of our history? First, the main tenet of the Aryan invasion/migration theory is that steppe invaders/migrants brought Sanskrit into our country sometime around 1500 BCE. They then also imposed their culture, religion, and this language (Sanskrit) on us. However, the decipherment shows that Sanskrit was not only being spoken, but even written, way back in 4000 BCE, negating this.
Second, a main source of the north-south divide is also the Aryan invasion theory, which says northerners are descended from steppe invaders who drove away the original IVC inhabitants, who became the ancestors of the southerners. The decipherment, by beautifully establishing the linguistic, cultural, and religious continuity of our civilisation, has destroyed both these theories.
The writer is associate professor, School of International Studies, JNU