Assignment 2
UCR - CS 172 –Spring 2024
Instructions: Submit in Canvas by 5/18. This is individual assignment.
Exercise A
1. Compute the first 1 iteration of PageRank scores (d=0.7) of each node in the graph below. Show your work.
2. Write a program (e.g., in Java) to compute the final scores of the nodes and the number of iterations needed to converge, if we use convergence constant epsilon=0.001.
3. If we use personalized PageRank with nodes 1 and 2 in the base set, write the first iteration of the PageRank formulas.
Exercise B
Show how MapReduce can be used to efficiently solve the following problem:
Given a collection C of input documents, output all words with more than 8 characters that appear in more than 2 documents.
For example, if input documents are:
D1: hi there incredible
D2: hi incredible play application
D3: it is incredible
the output will be: "incredible"
Write pseudocode for map and reduce functions.
Full points will be given to efficient solutions.
Exercise C
Consider query Q that has a total of 6 relevant results in the collection, and a search engine that returns results:
r r x x r x x x r, where x is a relevant result and r is not relevant.
1. Compute Precision-at-4, Recall-at-4, F1-at-4, Average Precision, and DCG-at-4 (assuming relevant results have score 1 and non-relevant 0).
2. Mention an application where higher precision is more important than higher recall and one for the opposite.