A Note From A Digital Anthropologist
Updated: May 21
Nearly ten years ago, while doing my PhD in Anthropology, I learnt that you can understand people if you live with them, watch them, take notes, and ask questions.
One of the areas that I studied was crime and crime reporting. That provided me with another set of learnings. When you go to crime scenes and meet either suspects or witnesses, a remarkable pattern shows up: because everyone is scared, even the innocent, everyone defaults immediately towards lying. They’re not big lies; it’s just that the environment is so unstable and unfamiliar after a crime event that the truth sometimes makes no sense.
I used these learnings to complete my PhD and got a job in market research.
The key methods here were focus groups. We would get people into rooms and have intensive discussions with them. These discussions were designed to extract deeper motivations and human impulses, so that marketers and brand strategists could connect with actual people and sell even better.
So there are three things that I learned through the 2000s: observe, don’t ask; everybody lies; and understanding people is big business.
1. Observe, don’t ask
In 2010, I set up my own business. The purpose was to use the Internet to watch people doing things.
Most of the people-research industry believed that content on the internet was either false (projections) or niche (only elite people used it), and so not a source of big insight.
However, this is because the frame of observation is limited. If we expand the framing of data, suddenly new things are possible:
What is a selfie? What can these filters tell us about the human condition?
It turns out, a lot. If you found 10,000 women in India, extracted the selfies they engaged with most and least, what would you learn? Was there something common and subtle among the most popular and least popular selfies in India? Do men “like” a certain type of selfie — in terms of posing, posture and filters? Do women like other types? And whose behavior would impact how the selfie uploader modified their next selfie?
What about video sharing? What can frequency patterns tell us about cities?
We extracted all data from small towns that contained YouTube URLs, and plotted it by time and theme. We noticed spikes and troughs through the course of the day that reveal profound cultural differences. It’s amazing how much you can learn if you look for things that have one view or one retweet, rather than things that go viral.
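To make the time-of-day idea concrete, here is a minimal sketch in Python. It is not the pipeline we used: the URL pattern, the `shares_by_hour` function and the sample posts are all made up for illustration, and it assumes each post arrives as an ISO-format timestamp plus its text.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed pattern: covers the two common YouTube link forms only.
YOUTUBE_URL = re.compile(
    r"https?://(?:www\.)?(?:youtube\.com/watch\?v=|youtu\.be/)[\w-]+"
)

def shares_by_hour(posts):
    """Count posts that contain a YouTube URL, grouped by hour of day (0-23)."""
    counts = Counter()
    for timestamp, text in posts:
        if YOUTUBE_URL.search(text):
            counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts

# Made-up sample posts, for illustration only.
sample_posts = [
    ("2016-03-01T07:15:00", "morning song https://youtu.be/abc123"),
    ("2016-03-01T07:40:00", "https://www.youtube.com/watch?v=xyz-789 must watch"),
    ("2016-03-01T13:05:00", "lunch break, no link here"),
    ("2016-03-01T22:30:00", "late night song https://youtu.be/def456"),
]

hourly = shares_by_hour(sample_posts)
for hour in sorted(hourly):
    print(f"{hour:02d}:00  {hourly[hour]} share(s)")
```

Run the same count separately for each town and the spikes and troughs can be compared side by side. Grouping by theme would need a further labelling step that this sketch leaves out.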
And this is just created data.
There is also a lot of passive data. Searches, likes, favourites… all those quiet things we do that we believe are not being observed. They are, and they say something about us. Passive data tells us what motivates small segments. What do Gurgaon bikers like (boxing and shopping)? What do people who work in the bureaucracy favor (Bollywood and Army)?
The internet is not just an efficient tool for communication: it is the most powerful cultural record we have ever created.
And for many years, I have mined it, interpreted it and tried to develop an understanding of humans that we could not have gotten if we interviewed them.
For example, we were laughed at when we called a Trump victory at the Republican Primaries and then at the Presidential election. It was the truth that people would not admit to pollsters.
2. Everybody lies
What is this AI thing? It’s something I avoided for a long time. It seemed like the next Kool-Aid. Too much noise, some alarmism. By nature I’m not a trend butterfly. You know the type: Big Data one year, Blockchain the next… not my cup of tea.
But we were encountering problems in our business that were only becoming magnified.
We were comparing ourselves to industry standards in academic or market research. They looked at 100 data points, we looked at 10,000… so our confidence levels were higher.
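As a rough illustration of why more data points raise confidence, the sketch below computes the textbook 95% margin of error for an estimated proportion. It assumes a simple random sample and the worst case of a 50/50 split; `margin_of_error` is a name chosen here, and online data rarely meets the random-sampling assumption.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion p estimated from n random samples.

    z=1.96 corresponds to a 95% confidence level (normal approximation).
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 10_000):
    print(f"n={n}: ±{margin_of_error(n) * 100:.1f} percentage points")
# n=100: ±9.8 percentage points
# n=10000: ±1.0 percentage points
```

A hundred times more data narrows the uncertainty by only a factor of ten, because the margin shrinks with the square root of the sample size.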
However, there are millions of data points online and the cultural record is growing every day. We were ignoring 99% of the data.
Let me give you a sense of how big the data was: 30 likes, 3 status updates, 8 images, 10 books, 15 newspaper articles, 2 PDFs. That’s probably the information ecosystem of 3 people just this morning as they were heading in to work.
When we multiply this over a day, and then over an entire city, and then a region, and a continent, there is no way a human brain can compute all the cultural information being generated.
And to be very honest, we tried many things over the course of the years that delivered superficial results. The truth of a culture would always get lost when we threw computers at it. Otherwise, it would get reduced to graphs, charts, pivot tables.
This is not just a research industry problem. It’s actually a deep and profound ‘world problem’. Like I said, the internet is the biggest cultural record ever created.
The problem is — it’s BIG. And this scale also helps people to stay away from each other.
Our individual data feeds are tailored to ensure that you don’t have to process more than you need to.
As a result, we have fractured into a multi-verse. Everyone living in their own bias bubbles. You are never exposed to ‘otherness’ — your neighbors, different countries, different lifestyles are always mocked. How would they deal with ‘big data otherness’?
If you scroll through your timeline, you will pretty much encounter ideas you already deem to be acceptable.
We can all appreciate people with alternative political beliefs, but we secretly know they are wrong. We can all appreciate people from other cultures, but we secretly know ours is better.
This is what volume does to us.
It blinds us and ensures that things continue the way they have been. We shut down alternative perspectives by aggressively curating our feeds to deliver what we know.
And this is exactly when things like cognitive computing shine.
3. Understanding people is big business
So, we began experimenting over the last few months. We stopped curating and limiting. Clients would ask questions like “tell us about X”, and we would not sample from a population; we would take the entire population. We took all the data (images, texts, PDFs, status updates, check-ins) and pushed it into “AI as a service” providers.
There were some huge errors, of course, but suddenly we started noticing uncomfortable things that went against how we imagined the world worked.
For example, we noticed that from small town America to most of Europe, the idea that true world leadership is deeply linked to faith and religion is dominant. From climate change to ideas on how to live, people outside our microbubbles are turning towards religion, and so much political strategy in these countries mocks these ancient institutions.
We also noticed that violent movements in the US share a deep concern for the future of their children. The far left wants this future to be secured via education, while the far right wants it to be secured via cultural preservation. It’s something we would not have picked up normally, as we are distracted by their high-volume talk about guns and genocide.
It’s then that we realized that the data we were gathering when we started, even though it was big, was still biased. Any act of limitation is based on decisions, and human decisions are fundamentally problematic because they are… human.
When we uncapped how much we gathered, we suddenly began to glimpse the truth of the world.
There is a great line in AI circles: “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else”.
Rather than this frightening us, this amorality is actually quite liberating. You need this level of neutrality to get to empathy.
The AI does not care about who is in charge, what is politically acceptable, what your clients want to hear, what prevailing wisdom is. It will look at what you feed it and tell you what you did not know about the people around you.
Before I close, I want to add one additional point about the future. We are now seeing a world where everyone has been born into the matrix and data streams are being generated from infancy. What will we learn about ourselves when we have every data point from infancy to death captured?