Mean Average Precision (mAP) for evaluating information retrieval or recommendation systems

This blog explains mAP in layman’s terms.

Shivani Jadhav
2 min read · Jan 16, 2022

Let’s say you have two different ranking models. Given a query, each model returns a list of documents in ranked order, i.e. the documents it considers most relevant to that particular query are placed at the top. Now you want to evaluate which ranking model is better.

Consider the following example:

Query: “how to eat an apple?”

Document1: “Apple launched a new iPhone”

Document2: “Apples are very healthy and should be eaten at breakfast”

Document3: “An Apple a day keeps the doctor away”

Document4: “Banana and apple are my favourite fruits”

Ranking model1 ranks the four documents as follows:

  1. Document2
  2. Document3
  3. Document4
  4. Document1

Ranking model2 ranks the four documents as follows:

  1. Document3
  2. Document1
  3. Document4
  4. Document2

A human annotator marks each of the four documents as relevant or irrelevant for this query, as follows:

  1. Document1 – Irrelevant
  2. Document2 – Relevant
  3. Document3 – Relevant
  4. Document4 – Irrelevant

Mapping the ranked lists produced by the two models onto the human annotations gives:

Ranking model1 – [Relevant, Relevant, Irrelevant, Irrelevant]

Ranking model2 – [Relevant, Irrelevant, Irrelevant, Relevant]

Based on this we can say that ranking model1 is better than ranking model2, because it ranks the relevant documents higher than the irrelevant ones.
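To make this concrete, here is a small Python sketch (the variable names are mine, not from the original example) that encodes the human annotations and derives the two relevance lists shown above:

```python
# Human relevance judgments for the query "how to eat an apple?"
annotations = {
    "Document1": False,  # "Apple launched a new iPhone" -> irrelevant
    "Document2": True,   # "Apples are very healthy ..." -> relevant
    "Document3": True,   # "An Apple a day ..." -> relevant
    "Document4": False,  # "Banana and apple ..." -> irrelevant
}

# Ranked lists produced by the two models (top position first)
model1_ranking = ["Document2", "Document3", "Document4", "Document1"]
model2_ranking = ["Document3", "Document1", "Document4", "Document2"]

# Map each ranked list onto the human annotations
model1_relevance = [annotations[doc] for doc in model1_ranking]
model2_relevance = [annotations[doc] for doc in model2_ranking]

print(model1_relevance)  # [True, True, False, False] -> Relevant, Relevant, Irrelevant, Irrelevant
print(model2_relevance)  # [True, False, False, True] -> Relevant, Irrelevant, Irrelevant, Relevant
```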

How is mAP@2 calculated?

mAP@2 for ranking model1 = [1/1 + 2/2]/2 = 1, or 100%

(With a single query, as in this example, mAP is simply the Average Precision, AP, of that query; mAP is the mean of AP across all queries.)

Now let’s understand each term in the calculation.

The first term, 1/1, is Precision@1: the number of relevant documents seen up to position 1, divided by the position itself (1).

Similarly, 2/2 is Precision@2: up to position 2, 2 relevant documents were observed out of the 2 documents seen in total.

The final division by 2 takes the average (mean) of these precision values.

mAP@2 for ranking model2 = [1/1 + 1/2]/2 = 0.75, or 75%

The second term, 1/2, says that up to position 2, 1 relevant document is observed out of the 2 documents seen so far.
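As a quick sanity check, here is a minimal Python sketch of this calculation, following the definition used in this post (the helper names precision_at_k and map_at_k are mine):

```python
def precision_at_k(relevance, k):
    """Precision@k: number of relevant documents in the top k, divided by k."""
    return sum(relevance[:k]) / k

def map_at_k(relevance, k):
    """mAP@k as computed in this post: the mean of Precision@1 .. Precision@k."""
    return sum(precision_at_k(relevance, i) for i in range(1, k + 1)) / k

model1_relevance = [True, True, False, False]   # ranking model1 vs. human labels
model2_relevance = [True, False, False, True]   # ranking model2 vs. human labels

print(map_at_k(model1_relevance, 2))  # (1/1 + 2/2) / 2 = 1.0  -> 100%
print(map_at_k(model2_relevance, 2))  # (1/1 + 1/2) / 2 = 0.75 -> 75%
```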

How is mAP calculated?

mAP for ranking model1 = [1/1 + 2/2 + 2/3 + 2/4]/4 ≈ 0.7917, or 79.17%

mAP for ranking model2 = [1/1 + 1/2 + 1/3 + 2/4]/4 ≈ 0.5833, or 58.33%

Based on mAP scores, ranking model1 can be considered better than ranking model2.
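The same arithmetic can be checked directly in Python. One caveat worth noting: many references define Average Precision by summing precision only at the ranks where a relevant document appears (and dividing by the number of relevant documents), whereas the calculation in this post averages Precision@k over every position from 1 to 4, so the numbers here may differ from other implementations. The check below follows this post’s calculation:

```python
# Direct check of the two calculations above (mean of Precision@k over positions 1..4)
map_model1 = (1/1 + 2/2 + 2/3 + 2/4) / 4
map_model2 = (1/1 + 1/2 + 1/3 + 2/4) / 4

print(round(map_model1, 4))  # 0.7917 -> 79.17%
print(round(map_model2, 4))  # 0.5833 -> 58.33%
```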

Food for thought: Why can’t we rely on accuracy as an evaluation metric when evaluating ranking models in information retrieval or recommender systems?
