From a Google Group I joined recently:
If you plan on using Bayesian categorization, I would suggest running the raw text through a Shannon information theory-like filter to identify the most relevant words in a text. With even a mild cut on the relevancy you can reduce the index size while increasing the overall quality of the matches… and all this would be language-independent.
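To make the filter idea concrete for myself, here is a minimal sketch of one possible reading, not necessarily what the poster had in mind. It scores each word by its self-information, -log2(p), where p is the fraction of documents that contain the word, and keeps only words above a cutoff. The function names, the `min_bits` cutoff, the whitespace tokenization, and the toy corpus are all my own assumptions.

```python
import math
from collections import Counter

def word_information(docs):
    """Self-information of each word in bits: -log2(fraction of docs containing it)."""
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Count each word once per document (assumes whitespace-separated words).
        doc_freq.update(set(doc.lower().split()))
    return {word: -math.log2(count / n) for word, count in doc_freq.items()}

def filter_relevant(text, info, min_bits=1.0):
    """Keep the words of `text` that carry at least `min_bits` of information.

    Words never seen in the corpus are dropped in this sketch.
    """
    return [w for w in text.lower().split() if info.get(w, 0.0) >= min_bits]

# Toy corpus, invented for illustration.
docs = [
    "the sun is yellow",
    "the grass is green",
    "the sky is blue",
    "the sun is red",
]
info = word_information(docs)
relevant = filter_relevant("the sun is yellow", info, min_bits=1.0)
```

Words that appear in every document ("the", "is") carry zero bits and fall out at any positive cutoff, which is where the smaller index would come from. Raising `min_bits` makes the cut more aggressive.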
Regarding the tagging as trusted or not trusted: having trusted editors is always good, but then you risk not being able to scale, and being attacked for enforcing a left/right/religious/atheist/whatever point of view. What I would love to see is a system that correlates info, and then lets users understand it. For example: I have A, B, C, D, and E submitting reports. A, B and C tell me that the sun is yellow and the grass is green, D tells me that the sun is red and the grass is blue, E tells me that the sun is red and the grass is yellow. The system will cluster A, B, and C, and give me a value that determines the cluster's veracity as a function of the veracity of the 3 people submitting the reports, while it shows that D agrees mildly with them while E doesn’t on any point. As a user, I can see a computed veracity that will point me to the most likely truthful reports, but if I know for a fact that the grass in that region is yellow, as E states, then maybe I will trust E more than the others. This system would offer several advantages: besides lowering the challenge of identifying experts in the field in a short time, it would show who departs more often from the truth, and allow users to choose their “side” of the truth, while being aware of other points of view.
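The clustering example can be sketched in a few lines. This is only an illustration under my own assumptions: reporters are grouped when their reports are identical, every reporter starts with a made-up prior veracity of 0.5, and a cluster's veracity is combined with a noisy-OR (one minus the probability that every member is wrong). The post only says the cluster value is "a function of" the members' veracity, so the noisy-OR is my choice, not the poster's.

```python
from collections import defaultdict

def agreement(a, b):
    """Fraction of topics covered by both reports on which they agree."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[k] == b[k] for k in shared) / len(shared)

def cluster_reports(reports):
    """Group reporters whose reports are identical (a deliberately crude clustering)."""
    clusters = defaultdict(list)
    for name, report in reports.items():
        clusters[frozenset(report.items())].append(name)
    return list(clusters.values())

def cluster_veracity(members, veracity):
    """Noisy-OR of member veracities: 1 - P(every member is wrong). An assumption."""
    p_all_wrong = 1.0
    for member in members:
        p_all_wrong *= 1.0 - veracity[member]
    return 1.0 - p_all_wrong

# The five reports from the example in the post.
reports = {
    "A": {"sun": "yellow", "grass": "green"},
    "B": {"sun": "yellow", "grass": "green"},
    "C": {"sun": "yellow", "grass": "green"},
    "D": {"sun": "red", "grass": "blue"},
    "E": {"sun": "red", "grass": "yellow"},
}
# Hypothetical prior: every reporter is equally (un)trusted to start with.
veracity = {name: 0.5 for name in reports}

clusters = cluster_reports(reports)
scores = {tuple(c): cluster_veracity(c, veracity) for c in clusters}

# What the user knows first-hand can be compared against any report.
known_facts = {"grass": "yellow"}
match_with_e = agreement(reports["E"], known_facts)
```

The `agreement` function is what would let a user who knows the grass is yellow see that E matches their own knowledge, and weight E accordingly, even though the A, B, C cluster has the higher computed veracity.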
No specific thoughts on how this applies to the Connection Engine yet… I just wanted to record it to reference at a later point.