Alex,
I appreciate your comments on this issue. Chris and I have spent a great deal of time thinking through the privacy issues that you discuss here. One of the primary reasons that we have launched as an “Alpha” and not even a “Beta” product is that we are looking specifically for this kind of feedback. Furthermore, the solutions to many of the issues you have brought up are part of our roadmap.
Let me address our overall strategy. In order for WSRelater to work, we need to know the aggregate viewing behavior of a set of people. We do NOT need to know specifically who they are. For example, we need to know:
Some person looked at item A, then item B, then item C
We do not need to know that the person above is “[email protected]”.
In order for this to work across websites, WSRelater needs some way to know that the two people in the following example are the same person:
Some person looked at group 1 on Tribe, then group 2 on Tribe
Some person looked at discussion 3 on eCademy
The solution is to use the FOAF practice of sending a one-way hash of the person’s email address. (As an FYI, if you submit an email address to WSRelater with type PERSON, it is automatically converted; we don’t store the address.) By using this best practice, attention can be aggregated across partners using WSRelater.
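To make the hashing step concrete, here is a minimal sketch in Python of the FOAF convention (foaf:mbox_sha1sum, the SHA-1 of the "mailto:" URI). The function name hash_email and the trim-and-lowercase normalization are assumptions for illustration only, not a description of what WSRelater actually does.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return a FOAF-style one-way hash: SHA-1 of the mailto: URI."""
    # Assumption: trim and lowercase so every partner derives the same key.
    normalized = email.strip().lower()
    return hashlib.sha1(f"mailto:{normalized}".encode("utf-8")).hexdigest()


# Two partner sites hashing the same address independently get the same key,
# so behavior can be joined without either site sending the address itself.
tribe_key = hash_email("someone@example.com")
ecademy_key = hash_email(" Someone@Example.com ")
print(tribe_key == ecademy_key)  # prints True
```

The key is a 40-character hex string, and the same address always maps to the same key, which is what lets behavior be aggregated across partners.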
Let me address your specific issues and describe what our strategy is as we move from an Alpha to a Beta product.
(1) How do I know with whom you are sharing what data?
The answer is simple: we are not sharing your user data with our partner sites. A partner site queries WSRelater for recommendations of items, not for information about a person.
Right now we have an Alpha API feature that lets you query the database for information about a user. This is for debugging and testing purposes only (another reason that we are in Alpha). It’s really hard to know if you implemented the API correctly if you cannot directly query the system. Perhaps we should only enable this API function for the site that contributed the data, or only for the development instance of the database. We are looking for your feedback.
(2) What is being done with all my data?
We are using it in aggregate to make recommendations of items that a person would be interested in. It is an item-to-item filter, so a partner asks for information about an item and gets back more items.
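As an illustration of what "item-to-item" means, here is a small Python sketch based on simple co-occurrence counting. This is one common way to build such a filter and is not necessarily the algorithm WSRelater uses; the function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(histories):
    """Count how often each pair of items was viewed by the same hashed person."""
    co = defaultdict(Counter)
    for items in histories.values():
        for a, b in permutations(set(items), 2):
            co[a][b] += 1
    return co


def related_items(co, item, limit=3):
    """Return the items most often viewed alongside `item`, most frequent first."""
    ranked = sorted(co[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:limit]]


# Hypothetical viewing histories, keyed by hashed email (never the address).
histories = {
    "hash1": ["A", "B", "C"],
    "hash2": ["A", "B"],
    "hash3": ["B", "D"],
}
co = build_cooccurrence(histories)
print(related_items(co, "A"))  # prints ['B', 'C']
```

Note that the query takes an item and returns items. No person identifier appears in either the request or the response.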
(3) How can I be certain that I am in control?
Since the system is based on the one-way hash of an email address, we have a nice way to deal with this issue. The user who controls the email address associated with the behavior can remove any of the behavior data about him or herself, or can opt out of being used for recommendations completely.
(This is not yet implemented, but this is how it will work)
After verifying that a person is the owner of an email address, he or she can use a web interface to view all WSRelater behavior entries from that address. Any or all of these can be removed at any time. Since this is a real-time system, item recommendations that relied on this data will be updated the next time they are requested. After a complete opt-out, no data will be accepted by WSRelater from this one-way hashed email address.
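Since this part is not yet built, the following Python sketch only illustrates the intended behavior: view, remove, and opt out, all keyed by the hashed address. The BehaviorStore class and its method names are hypothetical, email ownership verification is left out, and the choice to delete existing entries on opt-out is an assumption.

```python
class BehaviorStore:
    """In-memory stand-in for the behavior database, keyed by hashed email."""

    def __init__(self):
        self.entries = {}        # hashed email -> list of viewed items
        self.opted_out = set()   # hashed emails that refuse all collection

    def record(self, person_hash, item):
        """Store a view unless the person opted out; return True if stored."""
        if person_hash in self.opted_out:
            return False
        self.entries.setdefault(person_hash, []).append(item)
        return True

    def view(self, person_hash):
        """Return a copy of every entry recorded for this hashed email."""
        return list(self.entries.get(person_hash, []))

    def remove(self, person_hash, item):
        """Remove one entry, if present, for this hashed email."""
        items = self.entries.get(person_hash, [])
        if item in items:
            items.remove(item)

    def opt_out(self, person_hash):
        """Assumption: drop existing entries and reject all future ones."""
        self.entries.pop(person_hash, None)
        self.opted_out.add(person_hash)
```

Because recommendations are computed from whatever is in the store at request time, a removal or opt-out takes effect the next time a recommendation is requested.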
We would love to have this “console” interoperate with existing efforts like Attention Trust, or perhaps even BE Attention Trust. Why rebuild if it is already there?
As a follow-up question: how do you feel about WSRelater keeping this opted-out data, but removing any reference to the user? When a user opts out of WSRelater, instead of deleting all rows from the database, we would replace the one-way hash of the user’s email on those rows with a unique random identifier (one that can never be tied to you).
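To make the proposal concrete, here is a minimal Python sketch of that anonymization step. The row layout (a dict with "person" and "item" keys) and the function name are hypothetical, and uuid4 is just one way to generate an identifier that is not derived from the email or its hash.

```python
import uuid


def anonymize_on_opt_out(rows, person_hash):
    """Replace the hashed email on matching rows with one fresh random ID."""
    # Random, not derived from the email or its hash, and never stored
    # alongside the original hash, so the rows cannot be re-linked later.
    random_id = uuid.uuid4().hex
    return [
        {**row, "person": random_id} if row["person"] == person_hash else row
        for row in rows
    ]


rows = [
    {"person": "hash1", "item": "A"},
    {"person": "hash1", "item": "B"},
    {"person": "hash2", "item": "A"},
]
anonymized = anonymize_on_opt_out(rows, "hash1")
```

The item data still contributes to aggregate recommendations, but the rows no longer carry anything computed from the user's address.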
As a final note: please remember this is a recommendation web service, not an ad network or a personalized search engine. We are not trying to deliver ads for wedding planners because you type “engaged” into a profile. Our goal is to provide high-quality recommendations of items that you might be interested in based on what other people previously liked (or didn’t like). Because this is an item-to-item system, it can work for an anonymous user who simply clicked on a first book, image, or group and wants to see more things like it. We think this is a compelling proposition for the end user, and it leverages the collective wisdom of crowds in doing so.