Given an array of integers with possible duplicates, randomly output the index of a given target number. You can assume that the given target number must exist in the array.
Note:
The array size can be very large. Solution that uses too much extra space will not pass the judge.
Example:
int[] nums = new int[] {1,2,3,3,3};
Solution solution = new Solution(nums); // pick(3) should return either index 2, 3, or 4 randomly. Each index should have equal probability of returning.
solution.pick(3); // pick(1) should return 0. Since in the array only nums[0] is equal to 1.
solution.pick(1);
public class Solution { int[] nums;
Random r = new Random(); public Solution(int[] nums) {
this.nums = nums;
} public int pick(int target) {
ArrayList<Integer> idxs = new ArrayList<Integer>();
for (int i = ; i < nums.length; i++) {
if (target == nums[i]) {
idxs.add(i);
}
}
return idxs.get(r.nextInt(idxs.size()));
}
}
Simple Reservior Sampling approach
public class Solution { int[] nums;
Random rnd; public Solution(int[] nums) {
this.nums = nums;
this.rnd = new Random();
} public int pick(int target) {
int result = -;
int count = ;
for (int i = ; i < nums.length; i++) {
if (nums[i] != target)
continue;
if (rnd.nextInt(++count) == )
result = i;
} return result;
}
}
Simple Reservior Sampling
Suppose we see a sequence of items, one at a time. We want to keep a single item in memory, and we want it to be selected at random from the sequence. If we know the total number of items (n), then the solution is easy: select an index ibetween 1 and n with equal probability, and keep the i-th element. The problem is that we do not always know n in advance. A possible solution is the following:
- Keep the first item in memory.
- When the i-th item arrives (for {\displaystyle i>1}
):
- with probability {\displaystyle 1/i}
, keep the new item (discard the old one)
- with probability {\displaystyle 1-1/i}
, keep the old item (ignore the new one)
- with probability {\displaystyle 1/i}
So:
- when there is only one item, it is kept with probability 1;
- when there are 2 items, each of them is kept with probability 1/2;
- when there are 3 items, the third item is kept with probability 1/3, and each of the previous 2 items is also kept with probability (1/2)(1-1/3) = (1/2)(2/3) = 1/3;
- by induction, it is easy to prove that when there are n items, each item is kept with probability 1/n.