The idea of you "owning" the data about yourself is both emotionally and intellectually appealing. This data, which ranges from the critical (your medical and financial records) to the theoretically trivial (what you buy and search for, and which Web sites you visit), defines, quantifies and describes your preferences, resources, habits and health. It is a proxy for you. It is also what every marketer in the entire commercial universe wants to get their hands on.
At present this data is smeared across thousands of different locations in hundreds of formats ranging from paper forms at your chiropractor's office to digital records captured by the supermarkets you frequent to the often erroneous credit profiles kept about you in the vast data warehouses of companies such as Experian and Equifax. It is stored by the IRS, lost by TJX and analyzed by anyone who can get their hands on it.
This data might be high grade (for example, your tax returns and medical records are in-depth, detailed and specific), or low grade (such as your Google searches and your click stream as you navigate Amazon). But whatever the source or the quality, that data has value and it is guaranteed that someone, somewhere, considers even the smallest part of it valuable and worth exploiting.
Just consider the various customer loyalty programs that supermarkets run. You enter your ID and the detailed knowledge about what and when you buy gives them an in-depth, detailed and very personal profile of you. They know what kind of plonk you drink and even your favorite brand of hemorrhoid cream. It doesn't get much more personal than that.
Now, if you truly owned any of your data then you'd be able to control who gets access to it, what parts and how much of it they could see, how long they could retain it and what exactly they could do with it. That is, of course, exactly what the commercial world doesn't want and, for a number of profound reasons, not what I suspect the majority of people want either.
Let's first consider what it would mean to own your own data. Owning anything is a responsibility and in the case of any data, this is a task that requires a lot of sophistication and knowledge if you're going to do it effectively and reliably.
In the case of your personal data it means verifying, organizing, categorizing, storing, updating, archiving and securing it, along with negotiating its release, its deployment and its use with and by interested parties. That looks a lot like work. In fact, it looks like heavy lifting and no fun at all.
Well, people are working on technologies and services that aim to make personal data management easy and effective. At Harvard University's Berkman Center for Internet and Society, for example, there is Project VRM. The goal is to develop a set of tools for Vendor Relationship Management (VRM), which has been described as the reciprocal of Customer Relationship Management (CRM) as practiced by businesses.
I recently talked to Joe Andrieu, CEO of start-up SwitchBook, which plans to help you manage your Internet searching such that your activities are organized and what you're looking for is kept private -- what the company calls "user driven search". Andrieu is passionate about the need for VRM and says SwitchBook will implement the policies and methodologies for VRM, all of which is great.
But the problem I foresee is that without real privacy laws, with user interest in managing one's own data currently almost non-existent, and with nothing even remotely approaching a public dialog on how our data is routinely used and abused, how can VRM work? The fact is our society needs VRM and needs it now. So, how can we, the IT industry, the only people who "get it", help the rest of the world to get it?






Comments
Mark,
Great talking the other night. Here are a few more thoughts.
First, most of us in the VRM conversation are pretty wary about the term "owning your data". What is often called "your data" isn't owned at all by the individual: credit scores, addresses, phone numbers, driving records, social security numbers. Even if it is "personal data" with both statutory and contractual limits on its use, these are not things "owned" by the user in any useful sense of the word. Often individuals own neither a copyright nor exclusive database rights over that information, which makes the term a bit inflammatory and misleading.
Instead, I, and others, like to think about opportunities where integrating the data around--or on behalf of or at the point of--the user is the structurally better option. In these situations, the user is the place of collection, coordination, and permissioning, allowing vendors to add and view data on an authorized basis. This shift profoundly changes a lot of hard problems into simpler ones. http://blog.joeandrieu.com/2007/06/14/vrm-the-user-as-point-of-integrati...
In these scenarios, users do in fact control the data they've collected under the laws concerning database rights. Armed with this improved dataset, users can now license restricted access to just those parties they /want/ to do business with.
In our chat, you raised the issue of how we handle the inherent danger in a collected "honey pot" of user data. Data leaks and data breaches become much more challenging when cracking the system gets you /all/ of the data instead of just one vendor's. And indeed, the "logical model" of a Personal Datastore implies that all the data is right there, where the user can control it. In practice, you don't need to physically store all the data in the same place, you can instead deploy it around the network, anywhere you like, with different datastores holding different types of data--one for your music ratings, another for your health records, another for your official address of record. You could even extend this to allow a single data type to be stored in many places and aggregated on the fly, which is something you /cannot/ do with your transactions at a silo'd data provider.
However, I think the biggest issue isn't the one you raise (I'll respond to that in just a second), it is that this data is being aggregated about everyone already. And without a credible alternative, it will be nearly impossible to rip the data surveillance model of customer management out of the toolbox of major corporations. By working with personal datastores, we can in fact offer that credible alternative--a data model and relationships that are more accurate, more intimate, and more effective for both organizations and individuals. Only once we shift the model to something like the personal datastore can we hope to liberate users from the corporate data silos and information farming that currently engage people as if they were digital serfs.
That said, I think your closing point is right on the money. We need a new regulatory and legal framework as well as a public dialogue. Both are explicit goals of the new organization that will soon emerge from Project VRM. Keep your eyes open for it. We hope to have an announcement within the next month or so.
'Ownership' of data, whatever that means, is merely a starting point. I might 'volunteer' information - to me that just means I share it on my own terms - but the more important point is the ability to establish and maintain relationships. For that I need and want the following 'functionality' to be enabled for me:
1. take charge of my data (content, relationships, transactions, knowledge),
2. manage (arrange, mash-up, analyse) it according to my needs and preferences
3. share it on my own terms
4. whilst connected and networked on the web.
This is what I mean when I talk about turning the individual into a platform and into the most authoritative source about themselves. It does not happen by creating a database or a data store, however personal. The word store implies passive and static, even with some sort of distribution layered on top. The objective needs to be to equip individuals with analytical, and other, tools to help them understand themselves better and give them an online spring board to relationships with others. In the VRM context this includes vendors.
It is the user who should define the nature of the data stored, shared, analysed - and what data is called or labelled, whether confidential or premium etc. The critical thing is the user's ability to share it and do all sorts of groovy things with it independently of third parties, and without the data being hijacked and harvested by third parties in the process.
There is a difference between those who emphasise data and those who emphasise relationships. Data can be a vehicle for relationships, but not the other way around. If relationships are seen as more important, then third parties get in the way. If data is considered the more important aspect, then intermediaries tend to abound.
Another crucial difference revolves around the meaning of 'personal data'. One kind of personal data means one's address, date of birth, phone number, social security number etc etc. And the other kind, proliferating with the advent of the social web, is the 'data pertaining to a person'.
The former is usually static data: your address or phone number can change from time to time and, although it is possible to change your name, your date of birth is unchangeable. The latter is dynamic, at any time only a snapshot of the person, and the more data that can be created and captured, the more granular and valuable it can become. On the web such flows of data often act as a proxy for a relationship. People subscribing to my blog, Friendfeed, Twitter, Facebook updates etc - such data is personal, i.e. related to my person, and yet its existence revolves around sharing it with others. Personal data stores are for 'personal data' as the name suggests. We have few means, if any, to harness the dynamic data created by persons.
from the paper - A VRM Journey: http://www.mediainfluencer.net/vrmjourney/
Great conversation. I look forward to hearing more about what Project VRM will do next. I think it could be a boon for users as well as for vendors trying to monetize their online content.
Allan Hoving
http://www.paycheckr.com
Adriana makes a great point, and I'll channel Iain Henderson here to expand on it.
The VRM opportunity requires not just a personal datastore, but systems that allow the user to do smart stuff with that data. Iain puts it simply: when dealing with data you have two types of systems: doing systems and thinking systems.
I like Adriana's point that we need to give users better thinking systems for their personal data, whether dynamic or static, owned or unowned. However, I don't agree that a datastore presumes the ascension of data over relationships, nor do I fully understand her implied desire to abolish third parties who "get in the way". Percentage wise, almost no one hosts their own services on hardware they built, on software they wrote. Instead, we use third party software, hardware, and services as tools to interact with others. Some of those third parties will host data--managing accessibility, survivability, permissioning, etc., without needing to actually view the content of the datastore. Others will provide value-added services based on the content of that data, such as music recommendations, search recommendations, and social recommendations. Or, more exciting, services we can't yet even imagine. Just as Tim Berners-Lee probably never imagined Twitter when building the world's first website.
All of which is to say that data are the inputs that enable these value-added services... even if they run on my personal computer. For many value-generating transactions, it makes the most sense for this data to be collected, collated, and re-provisioned on behalf of the user at a datastore under their control. Without that axiomatic control of one's datastore, all digitally mediated relationships inherently become subject to the whims of silo'd service providers such as Facebook, Twitter, and MySpace. When we open it up, we create opportunities for both personally hosted thinking systems /and/ service providers acting as thinking systems. And without that openness, the value propositions from these kinds of services won't reach Internet scale nor will they enjoy the impact that Internet scale offers, both to entrepreneurs and to society.
And to Mark's original point, to open that fully, we need regulatory and legal changes as well as the technical enablers.
"However, I don't agree that a datastore presumes the ascension of data over relationships,"
Neither do I, that's why I'd never say that.
Apart from thinking that datastores are a bit pointless (for more see comments on this post http://themineproject.org/index.php/2008/11/what-mine-is-not/), data is merely a proxy for a relationship, just like a conversation or hanging out together is. It's more a vehicle or an enabler of a relationship, and the reason I am associating data with relationships is that data is the main way of connecting human beings online.
What I am saying is that data under the user's direct control and ownership (for a more nuanced meaning of control see my previous comment) is the soundest basis for using data as a proxy (enabler, vehicle etc) for relationships online. Definitely at this stage of data online, especially as I am tapping into the social web, whose main contribution was turning consumers into producers and creators, audiences into distributors etc. I want that empowerment to go all the way to data ownership and management.
"...implied desire to abolish third parties who "get in the way"."
Hm, of course I want to by-pass third parties, especially those that "get in the way"! What's wrong with that?!
"Percentage wise, almost no one hosts their own services on hardware they built, on software they wrote. Instead, we use third party software, hardware, and services as tools to interact with others."
And? I don't see this as relevant to my position at all. The question is to what extent the third party enables me and to what extent it merely provides for me. Can I add value to the tool or software I am using? Am I free to move the benefits of using the tool or service etc etc?
More fundamentally, I actually think that much of our online life is unnecessarily 'outsourced', to the detriment of our autonomy. This has an impact on privacy (and also security) as individuals are 'eased out' from having a direct impact on the way their data is stored and used. Privacy is a behaviour, not a system, so how can a third party do this on my behalf?
"Some of those third parties will host data--managing accessibility, survivability, permissioning, etc., without needing to actually view the content of the datastore. Others will provide value-added services based on the content of that data, such as music recommendations, search recommendations, and social recommendations."
As far as I can tell all of this has already happened, most of it years ago - hosting providers, google mail, google search etc - these are things that already exist.
But why persist in saying that the user is not capable of doing it for themselves? Why is it always 'on behalf of'?