Part 1: 781 Fun Yanay gets a lot of spam calls. An area code is defined to be a three digit number..

Part 1: 781 Fun Yanay gets a lot of spam calls. An area code is defined to be a three digit number from 200-999 inclusive. In reality, many of these area codes are not in use, but for this question we'll simplify things and assume they all are. Throughout these questions, you should assume that Yanay's area code is 781. Question 1. Assuming each area code is just as likely as any other, what's the probability that the area code of two back to back spam calls are 781? In [2]: prob_781 = (1/800) ** 2 prob_781 Out[2]: 1.56252-06 Question 2. Rohan already knows that Yanay's area code is 781. Rohan randomly guesses the last 7 digits (0-9 inclusive) of his phone number. What's the probability that Rohan correctly guesses Yanay's number, assuming he's equally likely to choose any digit? Note: A phone number contains an area code and 7 additional digits, i.e. XXX-XXX-XXXX In [7]: prob_yanay_num = (1/9) ** 7 prob_yanay_num Out[7]: 2.0907515812876894-07 Question 4. Which of the following test statistics would be a reasonable choice to help differentiate between the two hypotheses? Hint: For a refresher on choosing test statistics, check out the textbook section on Test Statistics. 1. The proportion of area codes that are 781 in 50 random calls 2. The total variation distance (TVD) between probability distribution of randomly chosen area codes, and the observed distribution of area codes. (Remember the possible area codes are 200-999 inclusive) 3. The probability of getting an area code of 781 out of all the possible area codes. 4. The proportion of area codes that are 781 in 50 random calls divided by 2 5. The number of times you see the area code 781 in 50 spam calls Assign reasonable_test_statistics to an array of numbers corresponding to these test statistics. In [19]: reasonable_test_statistics = make_array() For the rest of this question, suppose you decide to use the number of times you see the area code 781 in 50 spam calls as your test statistic. Question 5. Write a function called simulate that generates exactly one simulated value of your test statistic under the null hypothesis. It should take no arguments and simulate 50 area codes under the assumption that the result of each area is sampled from the range 200-999 inclusive with equal probability. Your function should return the number of times you saw the 781 area code in those 50 random spam calls. In [13]: possible_area_codes = ... def simulate(): # Call your function to make sure it works simulate() Question 6. Generate 20,000 simulated values of the number of times you see the area code 781 in 50 random spam calls. Assign test_statistics_under_null to an array that stores the result of each of these trials. Hint: Use the function you defined in Question 5. In [17]: N test_statistics_under_null = ... repetitions = test statistics under null