Evaluation of Information Retrieval Systems

Created on March 14, 2013, 12:38 p.m. by Hevok & updated by Hevok on May 2, 2013, 4:40 p.m.

To judge a search engine, one has to consider the quality of the results it returns. Among the quality measures used to evaluate information retrieval systems, the two most important are recall and precision.

Both are defined in terms of two sets. The relevant documents (R in the formulas below) are the documents that would be the right answer to the query. The retrieved documents (P) are the documents that the retrieval system or search engine actually returns. The intersection of the two sets contains the relevant documents that were actually retrieved. In the ideal case the two sets are identical, but usually the system misses some relevant documents or returns some documents that are not relevant.

Recall measures how many of the relevant documents were found: it is the number of relevant documents retrieved divided by the total number of relevant documents. If all relevant documents have been found, the recall is one.

Precision measures how many of the retrieved documents are relevant: it is the number of relevant documents retrieved divided by the total number of retrieved documents. In the ideal case the retrieved documents contain only relevant ones and the precision is one; if some irrelevant documents have been returned, the precision lies somewhere between zero and one.

Recall thus stands for the completeness of the search result and precision for its accuracy. The two measures can be combined into a single value, the F-measure, which is the harmonic mean of recall and precision. It is again a value between zero and one, and the closer the values are to one, the better the search system is in general.

Recall = |R ∩ P| / |R|

Precision = |R ∩ P| / |P|

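F_a = (1 + a) ⋅ (Precision ⋅ Recall) / (a ⋅ Precision + Recall)

For a = 1 this is the harmonic mean of Recall and Precision: F_1 = 2 ⋅ (Precision ⋅ Recall) / (Precision + Recall).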
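
The three measures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name evaluate, the document IDs, and the convention of returning 0.0 when a denominator is zero are not part of the original entry. The weighted F-measure is assumed to have the common form (1 + a) ⋅ Precision ⋅ Recall / (a ⋅ Precision + Recall), which for a = 1 is the harmonic mean of the two values.

```python
def evaluate(relevant, retrieved, a=1.0):
    """Return (recall, precision, f_measure) for two collections of document IDs.

    relevant  -- the documents that answer the query (set R)
    retrieved -- the documents the system returned (set P)
    a         -- weight of the F-measure; a = 1 gives the harmonic mean
    """
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)  # |R ∩ P|

    # Assumed convention: an empty denominator yields 0.0 instead of an error.
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0

    denominator = a * precision + recall
    f_measure = (1 + a) * precision * recall / denominator if denominator else 0.0
    return recall, precision, f_measure


# Hypothetical example: four relevant documents, three retrieved, two in common.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d5"}
recall, precision, f1 = evaluate(relevant, retrieved)
print(f"recall={recall:.3f} precision={precision:.3f} F1={f1:.3f}")
```

In this made-up example the system finds half of the relevant documents (recall 2/4) and two of its three results are relevant (precision 2/3), so the F-measure with a = 1 is 4/7, a value lying between the two.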


Tags: measurements, recall, precision, quality, information, data
Categories: News, Concept
Parent: Information Retrieval
