Paul Crowley ([info]ciphergoth) wrote in [info]trustmetrics,
@ 2003-07-09 10:43:00
Previous Entry  Add to memories!  Tell a Friend!  Next Entry
Simple trust metric
Here's a simple trust metric. As usual, nodes are people and directed arcs denote that A confers trust onto B.

We each have a litre bucket. When the bucket overflows, water runs equally down each outgoing arc into the buckets at the other end. To assess who I trust, I simply start pouring water into my own bucket; the order in which other buckets become full to overflowing is their trust rank.

Update with notes:

Dead Ends: Where water has nowhere to go, it runs off the edge of the graph. This happens to water that flows into a subset of nodes all of which are full, with no arcs leaving the subset; the simplest example is a full bucket with no outgoing arcs.

Determining which nodes are dead ends is like determining "unreachability" in garbage collection. Reverse all the arcs, add a "special" node with an arc to each non-full bucket, determine which nodes are reachable from this special node with a standard depth-first search. Those nodes which are not reachable are dead ends.

Monte Carlo implementation: A bucket contains at most n droplets. Each droplet starts from your own bucket. A droplet hitting a full bucket chooses an outgoing arc at random to follow. Repeat until you have enough trusted people. You can explicitly detect dead end conditions, or you can give each droplet a maximum number of transitions before it "evaporates". The higher n is, the longer it will take to run and the greater the precision of your trust ranking.

Deterministic implementation: If water is pouring into my bucket at a constant rate, the rate at which each non-full bucket is filling up is defined by a set of linear equations. Find the "dead ends", determine the set of linear equations, solve, and determine which bucket(s) is going to fill next and how much water it will take. Add exactly that much water, add the newly filled buckets to the trust list, and iterate.

Incomplete knowledge: This metric is designed to work in a situation where each node advertises its outgoing arc list on a different Web page - you don't need to know the whole graph to answer the question "which node do I trust next", you only need the outgoing arc list of nodes you already trust. This applies to both implementations.



(Post a new comment)


[info]vampwillow
2003-07-09 09:22 am UTC (link)
I need to ask a question on this as it otherwise makes no sense. As I see it there are two options; either all the buckets are empty, albeit each bucket-owner has set the distribution of their overflow, and you are the only one adding water until such time as you find that someone becomes trustable, or each person starts by trusting themselves (a 'full' bucket) and you are only adding to your own full bucket to see where the overflow goes, but this would mean your trust would be based upon other's levels of trust to an over-ridable level.

or have I misunderstood the source of new water?

(Reply to this) (Thread)


[info]ciphergoth
2003-07-09 09:44 am UTC (link)
The former, pretty much - if I want to know who I trust, I start with all the buckets empty, then start adding water to my bucket, and everyone gets a "trust rank" showing how much I trust them in order.

(Reply to this) (Parent)


[info]naturalborn
2003-07-10 12:08 pm UTC (link)
This is very similar in flavor to advogato's current trust metric. The main API difference is that it creates ordinal rankings. To weed stuff out based on it you'd presumably have to set a threshold. Advogato currently has multiple manually tweaked thresholds based on minimum distance, so a single simple one would be a big improvement.

Your system has some nice game-resistance properties. It can neither help nor hurt a peer's own status to cert other peers. However, the system can be gamed a little bit by trading certs with peers, so as to utilize water which would otherwise be wasted by being a dead end.

(Reply to this) (Thread)


[info]ciphergoth
2003-07-10 01:12 pm UTC (link)
Thanks for your comments!

Exactly as you say, you just decide how many people to trust and you get that many people or so back from the metric. The "or so" is because some people may rank equally.

There are some big differences with Advogato. For one thing, it's deterministic; Advogato gives different rankings given isomorphic graphs, because Ford-Fulkerson doesn't treat isomorphic graphs the same. But the most important difference is that it works on incomplete graphs. And of course it's considerably simpler than Advogato.

TBH it seems as if Raph must have considered and rejected this metric before inventing the Advogato metric, but I'd be curious to know why; it's not in his draft PhD thesis.

The game-resistance property you name was a key design decision. I think dead ends will be very rare in practice, but I'll be interested to find out.

(Reply to this) (Parent)(Thread)


[info]naturalborn
2003-07-10 03:13 pm UTC (link)
Don't be so sure that the technique occured to Raph before. It certainly hadn't occured to me. Trust metrics are a largely unexplored area, in which even the basic concepts are very far from obvious, despite sounding simple once they're explained.

Another problem with dead ends is that a node may be discouraged from certing a dead end because that would waste water once the node is filled up. This could be fixed by making water at a dead end back up rather than spill out. That might be a worthwile improvement.

(Reply to this) (Parent)(Thread)


[info]ciphergoth
2003-07-10 03:48 pm UTC (link)
You might be right about backing up. I had initially thought that it would make an incomplete graph implementation harder, but thinking about it it's just as easy - since you can and must do dead end discovery either way, you just mandate that water divides evenly between arcs that lead to non-dead-end nodes. I still think dead ends will be rare though.

(Reply to this) (Parent)(Thread)


[info]naturalborn
2003-07-10 04:01 pm UTC (link)
The advogato graph has enough dead ends that its confidence levels on diary ratings are extremely low.

(Reply to this) (Parent)

similar to "attack resistant distributed metadata"
(Anonymous)
2003-07-11 05:39 am UTC (link)
I don't fully grok your idea yet. I like the simple flavor of it and the physical metaphor.

At first blush, it seems somewhat like the idea that Raph, Amber and I proposed at the first O'Reilly p2p conference: that your confidence in a proposition is equal to the average of the confidences of your children, multiplied by a constant fall-off factor. To put it into your metaphor, that would mean that a fixed fraction of each bucket evaporates into the void, and then the remainder is distributed evenly among children. Except that the water that is distributed doesn't disappear from your bucket, it remains there and your children get a copy! Whoops, so much for the nice metaphor. :-)

In my proposal, a certain fraction of the water that flows into your bucket stays there (the part that doesn't stay evaporates) and the children as a group get in-flow equal to the amount that stayed in your bucket. In your idea, there is fixed amount reserved in each bucket, and the children as a group get water equal to the total input to your bucket minus that fixed amount.

And then instead of comparing the values, you ascribe rankings according to when that fixed amount was reached. I like the way that this unifies the "reduction of transitive trust" part and the "evaluation of results" part. It would be more complicated if, for example, the fixed amount that is reserved differed from the cut-off that indicates rank.

Okay, enough rambling. Your idea is interesting!

--Zooko

(Reply to this) (Parent)


[info]naturalborn
2003-07-10 08:03 pm UTC (link)
Here is another weakness in your system, which is also present to some extent in max-flow. Say that I cert A and A certs B. If I get inflow of one unit per time, then A will get filled one tick after I do and B will get filled one tick after that. However, if I cert both A and B then both of them will get filled two ticks after I do, which is quite a bit worse for A. This problem is worse in longer chains.

The following technique fixes it. Say that each node has a priority ranking of its certificates. (Advogato does implicitly, the ranking determined by implementation artifact, due to the nondeterministicness of maxflow). We will repeatedly add new club memberships to the system, by giving a new membership to the seed and having it be passed on until it stops. Each node obeys the following algorithm: The first membership it receives, it keeps for itself. The next membership it routes to its most favored cert, then the next most favored after that, and so on. It starts over at the beginning when it's completed a cycle.

This technique fixes the problem I mentioned above, keeps many of the good aspects of your technique, and is extremely efficient to implement. It seems almost too simple, like there has to be something wrong with a technique so straightforward.

(Reply to this) (Thread)


[info]ciphergoth
2003-07-11 01:10 am UTC (link)
I'm not sure the weakness you cite is really a weakness. Putting more people on your trusted list weakens the value of the trust you give them - that's pretty much a design decision.

Your alternative metric is one I've considered, and it's an interesting one - ridiculously efficient as you say. I think it might display some weird behaviour though: if the person at the top of my list is already trusted, then the person at the top of their list gets my second trust token before the second person on my list, which seems unnatural.

However, you could use this ranked cycling rule to make the "droplet" technique deterministic. Your metric then becomes a special case with droplet size 1, and it's worth considering what other droplet sizes will do.

Note that you still have to do dead end detection.

Thanks for writing this!

(Reply to this) (Parent)


[info]purplerabbits
2003-07-11 04:26 am UTC (link)
Have you considered trying out such a system on a real life group of friends and acquaintances? It might help to show whether the results look 'natural' or reflect what the people handing out trust want or expect.

I can't answer for the social consequences, though...

(Reply to this) (Thread)


[info]ciphergoth
2003-07-11 04:46 am UTC (link)
I certainly have. The metric is easy to implement, but the code to fetch Friends lists from LJ is just complex enough that it would take time away from other things I'm supposed to be doing...

(Reply to this) (Parent)(Thread)


[info]skx
2003-07-11 07:34 am UTC (link)
I have perl magic to do this already .. if you wish I could share.

(Reply to this) (Parent)(Thread)


[info]ciphergoth
2003-07-11 09:26 am UTC (link)
Yes please!

(Reply to this) (Parent)(Thread)


[info]skx
2003-07-11 09:45 am UTC (link)
http://www.steve.org.uk/ljfm/ljf.txt

Two things you want to change:

1. The cookie must be your login cookie, insecure I know, because you must be logged in as a paid user to use the directory search.
2. The search operation, as it stands the code retrieves the 'people who list you as a friend' not 'the people who are your friends'. See the comment above the 'directory_search' function.


(Nope - that isn't my cookie. I don't trust you that much ;)

(Reply to this) (Parent)

Interesting
[info]raph
2003-07-12 11:07 pm UTC (link)
It's an interesting idea, and not one I had thought about. Without having analyzed it carefully, it would seem that it has similar attack resistance properties as the current flow-based algorithm, and some advantages of its own, especially that it's deterministic.

I'm a little unsure whether the "incomplete knowledge" thing is a real win. A good heuristic to use for the network flow algorithm is to always pick the shortest augmenting path. I think you can do this with partial graph info as well, but it may well be that your result is crisper in the number of nodes that need be known.

This model has some network flow flavor and some eigenvalue flavor. As such, I'm intrigued by the possibility that it might be useful as a "bridge" between the two.

I'll have to think about this some more. Thanks for an interesting idea!

(Reply to this)

Some more thoughts
[info]naturalborn
2003-07-14 05:29 pm UTC (link)
The superficially fast looking technique I suggest actually has serious performance problems under some circumstances. Consider the case where A certs only B, B certs (A, C), C certs (A, D), D certs (A, E), etc. Y is the seed. How long will it take to add Z to the list? I'm not sure if there's a trick to efficiently compute it.

The weakening problem of certing more I mentioned before is quite common and probably serious. Nodes which only get to cert a single other node into membership will likely cert noone into membership if they cert two other people, so it creates a very strong and artificial disincentive for certifying, which is a bad thing.

(Reply to this)


[info]gaspy
2003-07-31 10:27 am UTC (link)
Ok, I feel like I live in a different world now.

(Reply to this)


[info]brian1789
2003-08-02 02:41 pm UTC (link)
Applied to people (the LJ friends-list example) one issue immediately asserts itself: you're assuming zero impedance on each arc. A better model IMO would (keeping the water metaphor) place a valve midway along each arc, reflecting the trust-receptivity of that particular arc. The greater-rank nodes in the current algorithm are interestingly those holding greater trust *or distrust*... the closer-in nodes have greater variability, perhaps linked to more exposure or knowledge between nodes. Some arcs are completely open, others are completely blocked -- it picks up enemies as readily as friends. Lesser-rank nodes are probably less-connected or well known, and hence assuming a fixed valve setting is OK.

(Reply to this)

open source code
(Anonymous)
2003-08-05 10:55 am UTC (link)
i'd like to have a look at source code too.

(Reply to this)


[info]fool_in_spirit
2003-08-06 03:24 am UTC (link)
I wounder if it wouldn't make a better algorithm if more water would flow dependig on how many friends in common do you have with a particular person. AFter all if we have many friends and communities in common chances are that we are very strongly linked, and possibly his other friends might be more of interest for me.

I radically changed my friend list, to make a search using only my closed friends.
I probably pissed off also quite some people in the process, so I'll have to go back asap.
In the meantime, could you please tell me when I can try again? For now I am always getting the same results.

Thank you,
Pietro

(Reply to this)

(Deleted post)

[info]ciphergoth
2003-08-08 04:04 am UTC (link)
Yes, they do. I'm a boy though :-)

(Reply to this) (Parent)


Create an Account
Forgot your login or password?
Login w/ OpenID
English • Español • Deutsch • Русский…