Data Marketplaces for Individuals (Still) Don't Make Sense

November 2, 2023 :: 3 min read

Controlling your own data is nice. Decide what you want to share, and charge for it; or don't share anything at all. But other than data that compromises your privacy, do you have anything worth selling?

Imagine an alternative universe in which two things are true: 1) you can freely choose between ad-supported business models that siphon your data and those that just charge you some dollar amount; 2) there exist data marketplaces where private individuals can sell their data. A while back, I wrote a post on why such marketplaces could be exploitative. I haven’t changed my mind on that. Long story short, people strapped for cash might be compelled to sell, e.g., their medical data, while everyone else doesn’t have to interact with the marketplace at all, or only does so for some pocket money. But do you actually have any data worth selling?

What do you have?

Let’s put aside any corporate data silos. They hold a lot of the good stuff that gives them a competitive advantage and could be sold for a reasonable profit. We’ll talk about those another time. Instead, we’ll focus exclusively on the data that an individual may have and try to sell. I want to point out that when I say "have", I don’t just mean data already on hand but also content that you could produce.

Audiovisual content is the most promising. You can sell photos with certain characteristics: 360-degree portraits of a middle-aged man with a moustache and a ponytail; smiling, frowning, neutral expression. All kinds of things. Outfits, looks, features. Acting out certain scenes, reading text in a particular manner. All annotated with descriptive metadata. This is useful for generating, e.g., hyper-customised content: imagine something akin to The Hangover with your favourite silent-era actors. The crucial bit here is that such audiovisual content can be commissioned. It isn’t just about what you have but what you could produce (semi-)professionally for the marketplace.

Your text data, by contrast, seemingly has no value, and no one is going to commission you to produce text either. The exception is your messages and emails. Although there’s some textual value in them (the conversational exchange), what a buyer is really after is the next kind of data…

Image: a food market. Much like at a farmer's market, most people don't have any produce to sell.

…your tabular data; though it’s quite tricky too. On the one hand, no one can commission you to create anything useful here, and much of what you consider your data is not really interesting either. On the other hand, data that describes you and your activities is desirable: your transaction history, your grocery lists, your health data and measurements, the movie and dinner plans that you agreed on over text with friends (your social graph). Revealing any of these is a privacy violation, and quite a serious one at that.

Annotations and metadata

One thing that stands out to me, and hopefully to you too, is that a lot of the value here comes from the metadata. For audiovisual content, for instance, the value lies in rich and descriptive annotations. Let’s say you have a 50-year-old woman reading a passage from The Lord of the Rings. Warm voice, low cadence, high pitch, Irish accent, lisp… you get the point. This is no different from modern social media, which already contains a lot of descriptions that make it easier to find content with a particular vibe.
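To make the role of annotations concrete, here is a minimal sketch in Python of how a marketplace might store descriptive tags and let a buyer filter by them. Everything in it is an assumption made for illustration: the `Listing` class, the `find_listings` function, and the tag vocabulary are hypothetical and do not describe any existing marketplace.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Listing:
    """A hypothetical marketplace item: a media file plus descriptive annotations."""
    title: str
    media_type: str  # e.g. "audio", "video", "photo"
    annotations: frozenset[str] = field(default_factory=frozenset)


def find_listings(listings, required):
    """Return the listings whose annotations include every required tag."""
    wanted = frozenset(required)
    return [item for item in listings if wanted <= item.annotations]


# Made-up example data, loosely following the reading described above.
catalogue = [
    Listing("LOTR passage, reading 1", "audio",
            frozenset({"female", "age:50", "warm voice", "irish accent", "lisp"})),
    Listing("LOTR passage, reading 2", "audio",
            frozenset({"male", "age:30", "warm voice", "scottish accent"})),
]

matches = find_listings(catalogue, {"warm voice", "irish accent"})
print([item.title for item in matches])
```

The search itself is trivial; what makes a listing findable, and therefore sellable, is that someone took the time to write the tags.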

If you’ve ever tried to curate a dataset or clean data for production use, you probably see where I’m going with this. Cleaning, sorting, and annotating your data is extremely costly. A data marketplace that does the bulk of that work for you would make things easier, particularly if it has some kind of rating system, as most modern platforms do.

Wrapping up

Unless you’re an amazing (voice) actor, it seems to me that you can’t really participate in data marketplaces, at least not without selling off your private information. Eventually, we might have robust and versatile privacy-preserving technologies that let us participate without sacrificing (much) utility. But for now, something has to give; even in this imaginary universe.

Having said all that, I’m not entirely sure that our society would want that. Recently, there has been some outrage over GenAI deepfakes of Tom Hanks, whose generated likeness was used in a commercial, as well as over Hollywood contracts with clauses that allow studios to use their actors in generative content. To end on a dystopian note, I can imagine an Instagram celebrity who effectively makes an audiovisual 4D scan of themselves, which could then be used in all kinds of content. I guess there’s a fine line between empowerment and hustle on the one hand, and exploitation on the other.
