# Regression (IB SL)

$$\def\frac{\dfrac}$$

1 (IB/sl/2019/November/Paper 2/q1) [Maximum mark: 6]

The number of messages, $M$, that six randomly selected teenagers sent during the month of October is shown in the following table. The table also shows the time, $T$, that they spent talking on their phone during the same month.

$$\begin{array}{|l|c|c|c|c|c|c|}\hline \text{Time spent talking}&&&&&&\\ \text{on their phone ($T$ minutes)} & 50 & 55 & 105 & 128 & 155 & 200 \\\hline \text{Number of messages ($M$)} & 358 & 340 & 740 & 731 & 800 & 992 \\\hline\end{array}$$

The relationship between the variables can be modelled by the regression equation $M=aT+b$.

(a) Write down the value of $a$ and of $b$.

(b) Use your regression equation to predict the number of messages sent by a teenager that spent 154 minutes talking on their phone in October.

2 (IB/sl/2019/May/paper2tz1/q5) [Maximum mark: 6]

A jigsaw puzzle consists of many differently shaped pieces that fit together to form a picture. Jill is doing a 1000-piece jigsaw puzzle. She started by sorting the edge pieces from the interior pieces. Six times she stopped and counted how many of each type she had found. The following table indicates this information.

$$\begin{array}{|l|r|r|r|r|r|r|}\hline \text{Edge pieces $(x)$} & 16 & 31 & 39 & 55 & 84 & 115 \\\hline \text{Interior pieces $(y)$} & 89 & 239 & 297 & 402 & 580 & 802 \\\hline\end{array}$$

Jill models the relationship between these variables using the regression equation $y=ax+b$.

(a) Write down the value of $a$ and of $b$.

(b) Use the model to predict how many edge pieces she had found when she had sorted a total of 750 pieces.

3 (IB/sl/2019/May/paper2tz2/q1) [Maximum mark: 6]

A group of 7 adult men wanted to see if there was a relationship between their Body Mass Index (BMI) and their waist size. Their waist sizes, in centimetres, were recorded and their BMI calculated.
The following table shows the results.

$$\begin{array}{|l|c|c|c|c|c|c|c|}\hline \text{Waist $(x \mathrm{~cm})$} & 58 & 63 & 75 & 82 & 93 & 98 & 105 \\\hline \text{BMI $(y)$} & 19 & 20 & 22 & 23 & 25 & 24 & 26 \\\hline\end{array}$$

The relationship between $x$ and $y$ can be modelled by the regression equation $y=ax+b$.

(a) (i) Write down the value of $a$ and of $b$.

(ii) Find the correlation coefficient.

(b) Use the regression equation to estimate the BMI of an adult man whose waist size

4 (IB/sl/2018/November/Paper2/q2) [Maximum mark: 6]

The following table shows the hand lengths and the heights of five athletes on a sports team.

$$\begin{array}{|c|r|r|r|r|r|}\hline \text{Hand length $(x \mathrm{~cm})$} & 21.0 & 21.9 & 21.0 & 20.3 & 20.8 \\\hline \text{Height $(y \mathrm{~cm})$} & 178.3 & 185.0 & 177.1 & 169.0 & 174.6 \\\hline\end{array}$$

The relationship between $x$ and $y$ can be modelled by the regression line with equation $y=ax+b$.

(a) (i) Find the value of $a$ and of $b$.

(ii) Write down the correlation coefficient.

(b) Another athlete on this sports team has a hand length of 21.5 cm. Use the regression equation to estimate the height of this athlete.

5 (IB/sl/2018/May/paper2tz1/q8) [Maximum mark: 13]

The following table shows values of $\ln x$ and $\ln y$.

$$\begin{array}{|l|l|l|l|l|}\hline \ln x & 1.10 & 2.08 & 4.30 & 6.03 \\\hline \ln y & 5.63 & 5.22 & 4.18 & 3.41 \\\hline\end{array}$$

The relationship between $\ln x$ and $\ln y$ can be modelled by the regression equation $\ln y=a \ln x+b$.

(a) Find the value of $a$ and of $b$.

(b) Use the regression equation to estimate the value of $y$ when $x=3.57$.

The relationship between $x$ and $y$ can be modelled using the formula $y=kx^{n}$, where $k \neq 0$, $n \neq 0$, $n \neq 1$.

(c) By expressing $\ln y$ in terms of $\ln x$, find the value of $n$ and of $k$.
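
Question 5 ties a straight-line fit on $(\ln x, \ln y)$ to a power law: taking logarithms of $y=kx^{n}$ gives $\ln y=n \ln x+\ln k$, so the slope plays the role of $n$ and the intercept the role of $\ln k$. In the exam these values are read from a GDC; the following is only a minimal sketch of the same computation in plain Python, using the table values from Question 5. The helper name `fit_line` and the variable names are illustrative and do not come from the paper.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for the line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Table values from Question 5
ln_x = [1.10, 2.08, 4.30, 6.03]
ln_y = [5.63, 5.22, 4.18, 3.41]

a, b = fit_line(ln_x, ln_y)

# ln y = a ln x + b  <=>  y = e^b * x^a, so n = a and k = e^b
n_exp = a
k = math.exp(b)

# Estimate of y at x = 3.57 from the power-law form
y_est = k * 3.57 ** n_exp

print(f"a = {a:.3f}, b = {b:.3f}, n = {n_exp:.3f}, k = {k:.0f}, y(3.57) = {y_est:.0f}")
```

Evaluating $k \cdot 3.57^{n}$ gives the same estimate as substituting $\ln 3.57$ into the regression line and exponentiating, which is a quick consistency check between parts (b) and (c).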
6 (IB/sl/2018/May/paper2tz2/q1) [Maximum mark: 6]

The following table shows the mean weight, $y \mathrm{~kg}$, of children who are $x$ years old.

$$\begin{array}{|c|c|c|c|c|c|}\hline \text{Age ($x$ years)} & 1.25 & 2.25 & 3.5 & 4.4 & 5.85 \\\hline \text{Weight $(y \mathrm{~kg})$} & 10 & 13 & 14 & 17 & 19 \\\hline\end{array}$$

The relationship between the variables is modelled by the regression line with equation $y=ax+b$.

(a) (i) Find the value of $a$ and of $b$.

(ii) Write down the correlation coefficient.

(b) Use your equation to estimate the mean weight of a child that is 1.95 years old.

7 (IB/sl/2017/November/Paper2/q8) [Maximum mark: 14]

Adam is a beekeeper who collected data about monthly honey production in his bee hives. The data for six of his hives is shown in the following table.

$$\begin{array}{|c|c|c|c|c|c|c|}\hline \text{Number of bees $(N)$} & 190 & 220 & 250 & 285 & 305 & 320 \\\hline \text{Monthly honey production in grams $(P)$} & 900 & 1100 & 1200 & 1500 & 1700 & 1800 \\\hline\end{array}$$

The relationship between the variables is modelled by the regression line with equation $P=aN+b$.

(a) Write down the value of $a$ and of $b$.

(b) Use this regression line to estimate the monthly honey production from a hive that has 270 bees.

Adam has 200 hives in total. He collects data on the monthly honey production of all the hives. This data is shown in the following cumulative frequency graph.

8 (IB/sl/2017/November/Paper2/q8b) [Question 7 continued]

Adam's hives are labelled as low, regular or high production, as defined in the following table.

$$\begin{array}{|l|c|c|l|}\hline \text{Type of hive} & \text{low} & \text{regular} & \text{high} \\\hline \text{Monthly honey production}&&&\\ \text{in grams $(P)$} & P \leq 1080 & 1080 < P \leq k & P>k \\\hline\end{array}$$

(c) Write down the number of low production hives.

Adam knows that 128 of his hives have a regular production.

(d) Find (i) the value of $k$; (ii) the number of hives that have a high production.
(e) Adam decides to increase the number of bees in each low production hive. Research suggests that there is a probability of 0.75 that a low production hive becomes a regular production hive. Calculate the probability that 30 low production hives become regular production hives.

9 (IB/sl/2017/May/paper1tz1/q4) [Maximum mark: 6]

Jim heated a liquid until it boiled. He measured the temperature of the liquid as it cooled. The following table shows its temperature, $d$ degrees Celsius, $t$ minutes after it boiled.

$$\begin{array}{|c|c|c|c|c|c|c|}\hline t\ (\min) & 0 & 4 & 8 & 12 & 16 & 20 \\\hline d\left({ }^{\circ} \mathrm{C}\right) & 105 & 98.4 & 85.4 & 74.8 & 68.7 & 62.1 \\\hline\end{array}$$

(a) (i) Write down the independent variable.

(ii) Write down the boiling temperature of the liquid.

Jim believes that the relationship between $d$ and $t$ can be modelled by a linear regression equation.

(b) Jim describes the correlation as very strong. Circle the value below which best represents the correlation coefficient.

$$\begin{array}{lllll}\hline 0.992 & 0.251 & 0 & -0.251 & -0.992 \\\hline\end{array}$$

(c) Jim's model is $d=-2.24t+105$, for $0 \leq t \leq 20$. Use his model to predict the decrease in temperature for any 2 minute interval.

10 (IB/sl/2017/May/paper2tz2/q2) [Maximum mark: 7]

The maximum temperature $T$, in degrees Celsius, in a park on six randomly selected days is shown in the following table. The table also shows the number of visitors, $N$, to the park on each of those six days.

$$\begin{array}{|l|c|c|c|c|c|c|}\hline \text{Maximum temperature $(T)$} & 4 & 5 & 17 & 31 & 29 & 11 \\\hline \text{Number of visitors $(N)$} & 24 & 26 & 36 & 38 & 46 & 28 \\\hline\end{array}$$

The relationship between the variables can be modelled by the regression equation $N=aT+b$.

(a) (i) Find the value of $a$ and of $b$.

(ii) Write down the value of $r$.

(b) Use the regression equation to estimate the number of visitors on a day when the maximum temperature is $15^{\circ}\mathrm{C}$.
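
Several questions in this set ask for $a$, $b$ and the correlation coefficient $r$ together. As a hedged illustration of what the GDC reports, the sketch below computes all three from the sums of squares $S_{xx}$, $S_{yy}$ and $S_{xy}$, using the park data from Question 10. The function name `regression_summary` is an assumption made for this example, not part of any syllabus or library.

```python
import math

def regression_summary(xs, ys):
    """Return slope a, intercept b and Pearson r for the line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                    # slope of the y-on-x regression line
    b = mean_y - a * mean_x          # line passes through (mean_x, mean_y)
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    return a, b, r

# Table values from Question 10
temps = [4, 5, 17, 31, 29, 11]         # maximum temperature T
visitors = [24, 26, 36, 38, 46, 28]    # number of visitors N

a, b, r = regression_summary(temps, visitors)
estimate_at_15 = a * 15 + b

print(f"a = {a:.3f}, b = {b:.1f}, r = {r:.3f}, N(15) = {estimate_at_15:.0f}")
```

The slope and $r$ share the numerator $S_{xy}$, so they always have the same sign; that is why a cooling curve such as the one in Question 9 must have a negative correlation coefficient.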
11 (IB/sl/2016/May/paper2tz1/q5) [Maximum mark: 6]

The mass $M$ of a decaying substance is measured at one minute intervals. The points $(t, \ln M)$ are plotted for $0 \leq t \leq 10$, where $t$ is in minutes. The line of best fit is drawn. This is shown in the following diagram.

The correlation coefficient for this linear model is $r=-0.998$.

(a) State two words that describe the linear correlation between $\ln M$ and $t$.

(b) The equation of the line of best fit is $\ln M=-0.12t+4.67$. Given that $M=a \times b^{t}$, find the value of $b$.

12 (IB/sl/2016/May/paper2tz2/q8) [Maximum mark: 15]

The price of a used car depends partly on the distance it has travelled. The following table shows the distance and the price for seven cars on 1 January 2010.

$$\begin{array}{|l|l|l|l|l|l|l|l|}\hline \text{Distance, $x \mathrm{~km}$} & 11500 & 7500 & 13600 & 10800 & 9500 & 12200 & 10400 \\\hline \text{Price, $y$ dollars} & 15000 & 21500 & 12000 & 16000 & 19000 & 14500 & 17000 \\\hline\end{array}$$

The relationship between $x$ and $y$ can be modelled by the regression equation $y=ax+b$.

(a) (i) Find the correlation coefficient.

(ii) Write down the value of $a$ and of $b$.

On 1 January 2010, Lina buys a car which has travelled 11000 km.

(b) Use the regression equation to estimate the price of Lina's car, giving your answer to the nearest 100 dollars.

The price of a car decreases by 5% each year.

(c) Calculate the price of Lina's car after 6 years.

Lina will sell her car when its price reaches 10000 dollars.

(d) Find the year when Lina sells her car.

13 (IB/sl/2015/November/Paper2/q9) [Maximum mark: 16]

An environmental group records the numbers of coyotes and foxes in a wildlife reserve after $t$ years, starting on 1 January 1995. Let $c$ be the number of coyotes in the reserve after $t$ years.
The following table shows the number of coyotes after $t$ years.

$$\begin{array}{|c|c|c|c|c|c|}\hline \text{number of years $(t)$} & 0 & 2 & 10 & 15 & 19 \\\hline \text{number of coyotes $(c)$} & 115 & 197 & 265 & 320 & 406 \\\hline\end{array}$$

The relationship between the variables can be modelled by the regression equation $c=at+b$.

(a) Find the value of $a$ and of $b$.

(b) Use the regression equation to estimate the number of coyotes in the reserve when $t=7$.

Let $f$ be the number of foxes in the reserve after $t$ years. The number of foxes can be modelled by the equation $f=\frac{2000}{1+99 \mathrm{e}^{-kt}}$, where $k$ is a constant.

(c) Find the number of foxes in the reserve on 1 January 1995.

(d) After five years, there were 64 foxes in the reserve. Find $k$.

(e) During which year was the number of coyotes the same as the number of foxes?

14 (IB/sl/2015/May/paper2tz1/q1) [Maximum mark: 7]

The following table shows the average number of hours per day spent watching television by seven mothers and each mother's youngest child.

$$\begin{array}{|l|c|c|c|c|c|c|c|}\hline \text{Hours per day that}&&&&&&&\\ \text{a mother watches television $(x)$} & 2.5 & 3.0 & 3.2 & 3.3 & 4.0 & 4.5 & 5.8 \\\hline \text{Hours per day that}&&&&&&&\\ \text{her child watches television $(y)$} & 1.8 & 2.2 & 2.6 & 2.5 & 3.0 & 3.2 & 3.5 \\\hline\end{array}$$

The relationship can be modelled by the regression line with equation $y=ax+b$.

(a) (i) Find the correlation coefficient.

(ii) Write down the value of $a$ and of $b$.

Elizabeth watches television for an average of 3.7 hours per day.

(b) Use your regression line to predict the average number of hours of television watched per day by Elizabeth's youngest child. Give your answer correct to one decimal place.
15 (IB/sl/2015/May/paper2tz2/q3) [Maximum mark: 6]

The following table shows the sales, $y$ millions of dollars, of a company, $x$ years after it opened.

$$\begin{array}{|l|c|c|c|c|c|}\hline \text{Time after opening ($x$ years)} & 2 & 4 & 6 & 8 & 10 \\\hline \text{Sales ($y$ millions of dollars)} & 12 & 20 & 30 & 36 & 52 \\\hline\end{array}$$

The relationship between the variables is modelled by the regression line with equation $y=ax+b$.

(a) (i) Find the value of $a$ and of $b$.

(ii) Write down the value of $r$.

(b) Hence estimate the sales in millions of dollars after seven years.

## Answers (Regression)

1 (a) $a=4.30$, $b=163$ (b) 826

2 (a) $a=6.93$, $b=8.81$ (b) 93

3 (a) (i) $a=0.141$, $b=11.1$ (ii) $r=0.978$ (b) $24.5$

4 (a) (i) $a=9.91$, $b=-31.3$ (ii) $r=0.986$ (b) 182

5 (a) $a=-0.454$, $b=6.14$ (b) $y=261$ (c) $n=-0.454$, $k=465$

6 (a) (i) $a=1.92$, $b=7.98$ (ii) $r=0.985$ (b) $11.7$

7, 8 (a) $a=6.96$, $b=-455$ (b) $P=1420$ (c) 40 (d) (i) $k=1640$ (ii) 32 (e) $0.144$

9 (a) (i) $t$ (ii) 105 (b) $-0.992$ (c) $4.48$

10 (a) (i) $a=0.667$, $b=22.2$ (ii) $r=0.923$ (b) 32

11 (a) Strong, negative (b) $b=e^{-0.12}$

12 (a) (i) $r=-0.994$ (ii) $a=-1.58$, $b=33500$ (b) 16100 (c) 11800 (d) 2019

13 (a) $a=13.4$, $b=137$ (b) 231 (c) 20 (d) $k=-\frac{1}{5} \ln \left(\frac{11}{36}\right)$ (e) 2007

14 (a) (i) $r=0.947$ (ii) $a=0.501$, $b=0.804$ (b) $2.7$

15 (a) (i) $a=4.8$, $b=1.2$ (ii) $r=0.988$ (b) $34.8$
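
Part (e) of the hive question (Question 7, continued as Question 8) is a binomial probability rather than a regression result. The sketch below shows one way to check it, assuming that the hives change category independently of each other and taking the number of low production hives to be 40, as listed for part (c) in the answers above. The function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n_low = 40          # low production hives, taken from the answer to part (c)
p_regular = 0.75    # probability a low production hive becomes regular

# Probability that exactly 30 of the low production hives become regular
prob_30 = binomial_pmf(30, n_low, p_regular)

print(f"P(X = 30) = {prob_30:.3f}")
```

The question's wording ("that 30 low production hives become regular") is read here as exactly 30; a cumulative reading would instead sum `binomial_pmf` over a range of values.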