[Seeks-users] RFC: Active User Refinement of Search Results.
David Mackey
2011-09-08 05:19:09 UTC
*Summary:*
In this RFC I will briefly outline the case for adding active user
refinements to Seeks. I'm going to try to keep this RFC as brief as
possible, but can flesh it out as objections, questions, or suggestions are
raised.
*Basic Features:*
* Ability to remove results from search queries.
* Ability to move specific results up/down within a page.
* Must remember on a per-user basis the results of active user refinements.
*Why It Matters:*
* While a large percentage of users are only interested in providing passive
(if any) feedback to a search engine, a small but significant minority is
interested in active content curation. These individuals can provide results
categorization which is not easily or quickly acquired via algorithmic
mechanisms.
* Individuals interested in active curation currently have very limited and
unsatisfactory options for active curation. Most current options are
deficient in numerous ways and many are being or have been disbanded. The
availability of an open source alternative would provide content curators
with confidence in the longevity of their work.
* Results for an actively curated engine can quickly outpace those of a
machine-only or passive-feedback engine on popular terms as users are able
to quickly populate the best results.
*Preventing the Inevitable:*
The largest challenge to such an endeavor occurs with success. With success
comes the enticement for users to abuse the active content curation (or
passive for that matter) in an attempt to force results in which they have a
commercial interest to rise to the top. This can be controlled by using a
meritocracy-based system in which users earn influence with the
demonstration of knowledge, ability, and integrity.
A user's active curation results should always take precedence for that user
over other results (they may have valid reasons for desiring their results
to appear at the top for their own queries), but an aggregate of trusted
curation results in combination with traditional passive user behavior and
metasearch aggregation and analysis will result in the best results.
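
The precedence rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of Seeks: the function name, the vote encoding (+1 to promote, -1 to remove) and the trust weights in [0, 1] are all hypothetical.

```python
from collections import defaultdict


def rank_results(candidates, own_votes, community_votes, trust):
    """Order result URLs for one user and one query (hypothetical sketch).

    candidates:      dict url -> base score (metasearch + passive feedback)
    own_votes:       dict url -> +1 (promote) or -1 (remove), set by this user
    community_votes: list of (user_id, url, vote) from other curators
    trust:           dict user_id -> weight in [0, 1] earned by that curator
    """
    scores = defaultdict(float, candidates)

    # Other curators only count in proportion to the trust they have earned;
    # an unknown user has weight 0 and cannot move anything.
    for user_id, url, vote in community_votes:
        scores[url] += trust.get(user_id, 0.0) * vote

    # Results the user promoted are listed even if no engine returned them.
    for url, vote in own_votes.items():
        if vote > 0:
            scores.setdefault(url, 0.0)

    # The user's own removals always win over the aggregate.
    visible = [u for u in scores if own_votes.get(u, 0) >= 0]

    # Own promotions first, then the trust-weighted aggregate score.
    return sorted(visible, key=lambda u: (-own_votes.get(u, 0), -scores[u], u))
```

With this shape, a curator with no earned trust cannot push a commercial result up for anyone else, while each user still sees their own choices first.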
*Personal Note:*
I have personally worked with a number of social search engines in actively
curating content. In every instance I have been disappointed with the
short-term lifespan of my data due to commercial refocusing. If Seeks were
to add such active content curation abilities to the software I would
immediately begin curating content and providing refined results for
numerous topics.
Emmanuel Benazera
2011-09-09 14:49:28 UTC
Hi David,

thanks for raising the issue and providing a first set of questions
and answers about AURSR (Active User Refinement of Search
Results).

My contribution lies below.
Post by David Mackey
In this RFC I will briefly outline the case for adding active user
refinements to Seeks. I'm going to try to keep this RFC as brief as
possible, but can flesh it out as objections, questions, or suggestions
are raised.
* Ability to remove results from search queries.
* Ability to move specific results up/down within a page.
* Must remember on a per-user basis the results of active user refinements.
As a start, it is correct that Seeks does not currently fully
implement any of those basic features.

The first feature, 'Ability to remove results from search queries' is
partially implemented as follows:

- it is possible to 'reject' a result for a given query (with no
effect on other, even similar, queries). Rejection means that the
system no longer boosts that result. It does not mean that
the result will not appear in the list, as obtained from an external search
engine or data source feed.
The reason for not eradicating a result once and for all for a given
query is that we would need a way to undo the deletion in case it
was made in error.

- it is possible to ban a result or a set of results throughout *all*
queries. This can be achieved by writing matching regexps and adding
them to the plugins/websearch/patterns/reject file.
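
To make the difference between the two mechanisms concrete, here is a minimal sketch in Python. It is not Seeks code: the function, its arguments and the data layout are hypothetical, and plain Python regular expressions stand in for whatever syntax the patterns file actually uses.

```python
import re


def apply_curation(query, results, boosts, rejected, ban_patterns):
    """Illustrate per-query rejection versus a global ban (hypothetical).

    results:      list of URLs, in the order returned by the external engines
    boosts:       dict (query, url) -> boost learned from user feedback
    rejected:     set of (query, url) pairs the user rejected
    ban_patterns: list of regex strings, applied to every query
    """
    banned = [re.compile(p) for p in ban_patterns]

    # A ban removes matching results whatever the query is.
    kept = [u for u in results if not any(b.search(u) for b in banned)]

    def boost(url):
        # A rejection only cancels the boost for this exact query;
        # the result itself stays in the list.
        if (query, url) in rejected:
            return 0.0
        return boosts.get((query, url), 0.0)

    # sorted() is stable, so unboosted results keep the engines' order.
    return sorted(kept, key=lambda u: -boost(u))
```

A rejected result therefore falls back to wherever the external engines placed it, and the rejection can be undone by simply dropping the pair from `rejected`.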

Now, the ability to move specific results up/down is not implemented,
mainly because I am not convinced that it makes sense. Let me
explain. Wikipedia articles are built to slowly 'converge' onto a kind
of well-argued truth (if that ever truly exists). Search instead is
less about the facts, and more about the user context, at least
IMO. As such, there isn't really an 'Oracle' for search, as there is
for the kind of truth Wikipedia is after. This means that a user could
spend a fair amount of time actively ranking results, in a way that would not be
approved by others. In this case, there would still be the need for a
semi-passive re-ranking algorithm based on each user's active ranking.

Btw, Google did try this up/down active ranking, and later backed down:
http://searchengineland.com/google-likedont-like-move-results-up-hide-them-or-suggest-your-own-12797

Additionally, some studies indicate that users tend to read groups
of results at once, e.g. the first five, and that ranking within those
groups does not impact their satisfaction and/or searches.

This is to justify why it is not *yet* implemented. The project is
totally open to such changes though.

Finally, for now, per-user storage can only be achieved by running a Seeks
node for each user. This could change in the future, and proposals
on how to move in that direction are welcome.
Post by David Mackey
* While a large percentage of users are only interested in providing
passive (if any) feedback to a search engine, a small but significant
minority is interested in active content curation. These individuals can
provide results categorization which is not easily or quickly acquired via
algorithmic mechanisms.
True, though I'm concerned with the amount of data users may be
facing, which is a reflection of the number of users actively
involved. Attracting users nowadays is difficult. This requires
state-of-the-art and well-calibrated UIs. Both are out of the scope of
Seeks right now. That said, this does not prevent us from implementing
a form of AURSR.
Post by David Mackey
* Individuals interested in active curation currently have very limited
and unsatisfactory options for active curation. Most current options are
deficient in numerous ways and many are being or have been disbanded. The
availability of an open source alternative would provide content curators
with confidence in the longevity of their work.
Yes, I agree.
Post by David Mackey
* Results for an actively curated engine can quickly outpace those of a
machine-only or passive-feedback engine on popular terms as users are able
to quickly populate the best results.
The largest challenge to such an endeavor occurs with success. With
success comes the enticement for users to abuse the active content
curation (or passive for that matter) in an attempt to force results in
which they have a commercial interest to rise to the top. This can be
controlled by using a meritocracy-based system in which users earn
influence with the demonstration of knowledge, ability, and integrity.
A user's active curation results should always take precedence for that user
over other results (they may have valid reasons for desiring their results
to appear at the top for their own queries), but an aggregate of trusted
curation results in combination with traditional passive user behavior and
metasearch aggregation and analysis will result in the best results.
I agree here also.
Post by David Mackey
I have personally worked with a number of social search engines in
actively curating content. In every instance I have been disappointed with
the short-term lifespan of my data due to commercial refocusing. If Seeks
were to add such active content curation abilities to the software I
would immediately begin curating content and providing refined results for
numerous topics.
I do believe that!

So, I would love to have other users take the time to give their
opinion here, before any move in that direction.

A first step, though, would be to take advantage of the forthcoming API
http://seeks-project.info/wiki/index.php/API-0.4.0
and see what kind of API calls would be needed for even basic AURSR.

For now, the API includes a POST call,
http://seeks-project.info/wiki/index.php/API-0.4.0#recommend_a_result_to_the_P2P_collaborative_filter_for_a_given_query
that allows recommending a new result for a given query. This result
does not need to come from an external source such as a search engine.
Here, the poster is the source.
Result ranking (up/down) could probably be achieved by attaching more
parameters to this call. Deletion is on its way.
Note that this would not affect results obtained from external search
engines and feeds, but only those obtained from the P2P ring.
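
As a rough illustration of what "attaching more parameters" to the recommendation call might look like, here is a small Python sketch. Everything in it is an assumption made for discussion: the `/recommend/` path, the `url` and `title` fields and the `rank_delta` parameter are hypothetical and are not taken from the API-0.4.0 page. The function only builds the request; it does not send anything.

```python
from urllib.parse import quote, urlencode


def build_recommendation(node, query, url, title="", rank_delta=None):
    """Return (method, target, body) for a HYPOTHETICAL recommend call.

    node:       base URL of a Seeks node (assumed)
    query:      the query the result is recommended for
    url, title: the recommended result, supplied by the poster
    rank_delta: hypothetical extra parameter, e.g. +1 (up) or -1 (down)
    """
    params = {"url": url, "title": title}
    if rank_delta is not None:
        # Not an existing parameter: this is the kind of addition the
        # up/down ranking discussed above would need.
        params["rank_delta"] = rank_delta

    # Hypothetical path layout; the query is percent-encoded into it.
    target = f"{node}/recommend/{quote(query, safe='')}"
    return "POST", target, urlencode(params)
```

Keeping ranking as an optional parameter would leave the plain recommendation call unchanged for clients that do not curate.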

Finally, behind AURSR lies the emergence of a meritocracy, much like
that of Wikipedia. This would require a new thread of
discussion. Briefly, I'm convinced that there is a need for a social
layer on top of Seeks itself. Such a layer would be a great step
forward, though I believe the amount of work required is beyond what
our current workforce can handle.

Let me know your thoughts,

Em.
