TEDx rule: No talks with an inflammatory political or religious agenda, nor polarizing “us vs them” language. We seek to build consensus and provide outside-the-box thinking, not to revisit familiar, unresolvable disputes on these topics.


Hello everyone. This talk will be different. The success of any technology is not determined by the billions it adds to the revenues of multi-billion-dollar industries. The real test of technology lies in how it improves the life of a common person. Data Science must play a serious role in the common man’s life, beyond improving the bottom line of companies, targeting customers better, classifying your photos into happy and sad, and providing lots of fun, convenience and cool stuff. Until that happens, it will be incomplete. I work on the application of analytics to issues that impact lives beyond the corporate houses of New York and Mumbai. Two days ago, there was a terror attack in the UK. Earlier it was in Spain. Before that in France. Earlier in India, and so on…

Elections in UP happened. The whole society was divided into castes and religions. The count of humans was nowhere. It was all about how many Brahmans, Dalits, Yadavs, and Muslims.

Can this society survive like this for long? How can my expertise in Analytics help in changing things for good? I will be discussing cases of how analytics can be applied to the domains that get minimum attention from technologists despite meriting the maximum. Caste and terror, being two major issues in India and the world respectively, are part of my talk today. I believe that in solving these core issues lies the real success of Data Science or Analytics.

We have all heard of Garbage In, Garbage Out – that means if you feed junk data into a system, the output will be junk. When I applied data science to these serious challenges, I learnt a few more lessons. Let us talk about caste first.

Lesson 1 on Social Equality

Caste and gender bias are like massive bumps that turn the progress of the nation into shocks that break our backbones. The Data Output is definitely garbage. But what is the Data Input? I took a journey to find the data source. I browsed through the works of historians, social rights activists and indologists, who blamed the origin on the first texts of Indian culture – the Vedas. And Ramayan. And Manu Smriti – the Hindu code of law. These texts are supposed to be casteist and anti-woman.

I decided to take a deep dive into the data. I learnt Sanskrit – not the Sanskrit of Kalidas, but the Sanskrit of the Vedas. I read lots of translations. And I was shocked! Shocked because I found the Vedas and Manu Smriti to be the antithesis of casteism and gender discrimination. In fact, these texts can be used as Bibles of social and gender equality. Then what explains these allegations? Simply wrong translations. Some 19th-century British experts published translations in English, and others started using them for reference. Whatever they wrote became the gold standard for others. Whatever books the British published, we just accepted as scriptures. No one verified or double-checked.

Ramayan? The reference to Sita’s exile and the murder of one Sudra comes in Uttar Ramayan and not in the Ramayan of Valmiki. It is a spurious text of much more recent origin. Yes, there has been garbage data output. But not because of some ancient scriptures as garbage data input. Perhaps it is due to some intermediary step between data input and data output, arising out of geographical limitations or professions. The garbage output of gender discrimination probably emerged when invaders like Timur and Ghazni and Akbar invaded India, specifically targeting women. It is time most of us came out of a subtle guilt that the source of our culture was based on some form of discrimination.

So here is the lesson of Data Science: Garbage Data In means Garbage Data Out. But Garbage Data Out does not always mean Garbage Data In. Always cross-check the data source. Don’t assume things. And here is the icing on the cake: even the Data Output, at least today, is not that garbage. Garbage forms a small error population, not the normal range. Most Indians are not casteist.

When we started ‘Dalit yajna’ to encourage so-called Dalits to perform Vedic rituals and recite Vedic mantras, which we had been taught were prohibited for lower castes, people from all sections joined in to break caste barriers. So-called Brahmins funded and managed operations. Even women performed these rituals, contrary to the popular perception that women cannot recite Vedic mantras. And we are getting support from all corners. We are using the same data inputs that were supposed to create social barriers in the first place to break those barriers and remove the garbage data outputs!

Lesson 2 on Global Terrorism

Islam is the only religion that is supposed to mean Peace. “To you, your religion; to me, mine” is the mantra. Very helpful, very kind, very gentle. If there is one word that I would use as a synonym for Muslims, it would be “Peaceful”. The data source seems great. Yet we see that Islamic terrorism is a reality. We just witnessed what happened in London.

Before that, in Paris, Brussels, Syria, Mumbai, etc. Really garbage data coming out as output. This sharp dichotomy has always perplexed me. I decided to approach the problem from a Data Science perspective after witnessing one of the most tragic attacks. To approach the problem beyond the hype, beyond the politics, beyond the religion.

I ran analytics on the profiles of terrorists and found something interesting. Most had poor IQ. A disturbing past. Emotional disorders. And none could understand the Arabic of the Holy Quran or the Hadiths. In fact, if you observe recent trends, as more data becomes available, most perpetrators of violent acts are neo-converts. What access does a Jihadi John sitting in Europe have to an understanding of religion?

Absolutely nothing. What I found was that these weak minds are brainwashed by translations, and not by actual scriptures taught by a religious expert. All that fanatic groups had to do was flood the internet with translations that promote hatred, that promise rewards in Heaven in return for following the message of hatred, and make them search-engine optimized to appear at the top of search results.

They feed on spurious translations that any sensible mind can see make no sense. But since they are targeting weak minds and not APJ Abdul Kalam, it works perfectly for them. The scholars of Islam that I met always insisted that one cannot understand Islam unless taught by a scholar. But here we have some rogue English translations of hadiths – holy texts – that are spread like wildfire on thousands of websites. You have 8 different translations of the Holy Quran in English that spread hatred. A weak mind has only to open a few sites, read the hateful translation without bothering about the original, be attracted by divine rewards, be afraid of punishment, believe he has become an expert, and then spread violence to book his seat in Heaven. Peaceful translations like the Study Quran are expensive and not in the public domain.

The originals are in ancient Arabic that no one understands. So here is the harsh lesson: Garbage Data Out means the Last Processing Step must be transferring Garbage Data. It does not matter what the source is. Just corrupt the last step, and you can ensure garbage comes out. But here is also the solution: manage that last step and Garbage Data will be eliminated. All that needs to be done is to remove online access to all such spurious translations that are prone to misinterpretation, and to encourage the availability of translations that promote peace. We would prevent lots of terror attacks.

Q: This video is ambiguous. Did you mean that terrorists and Islamic radicals are the product of wrong interpretations of the Quran?

A: I mean ban the top 8 translations of the Quran and the English translation of the Hadiths.

Q: Who is stopping upper-caste Hindus from refuting caste-based superiority? As long as superiority is there, discrimination will prevail.
You are one of the few who stoutly speak against this social evil.

A: 10 jokers of the 21st century do not define what the ages-old source of Hinduism is.

