Final Exam
ANA 500

Use gretl and the State Smoking dataset to answer the questions below.
Upload a Word or PDF document that contains your answers.
Upload your gretl script file in addition to your Word or PDF document.

1. Dataset:
a) How many variables are in this dataset?
b) How many observations are in this dataset?
c) What are the elements/entities in this dataset?

2. Descriptive statistics:
a) Calculate descriptive statistics for the variables "consumption" and "cig_price".
b) What types of variables are "consumption" and "cig_price"?
c) Provide a scatterplot between cigarette consumption and cigarette prices (do not include a fit line). Describe the relationship between the two variables shown in the scatterplot.
d) Provide an estimated density plot for the variable "consumption". Is the variable skewed? If yes, in which direction?
e) Calculate separate descriptive statistics for each region for the variables "consumption" and "cig_price".
f) What type of variable is "region"?
g) In which region is the average price of a pack of cigarettes the highest? In which region is per capita cigarette consumption the highest?

3. Simple linear regression:
Estimate a first-order simple linear regression model where "consumption" is the outcome of interest and "cig_price" is the predictor.
a) Write the estimated regression equation.
b) Interpret the coefficient on the "cig_price" variable.
c) Is the estimated coefficient on the "cig_price" variable statistically significant? How do you know?
d) Interpret the model's R-squared.
e) Provide a scatterplot between "consumption" and "cig_price" that includes the estimated regression equation.
f) Use your estimated model to predict per capita cigarette consumption in a state where the average price of a pack of cigarettes is $4.
g) Explain whether it is meaningful to interpret the estimated intercept in your model.

4. Multiple linear regression:
a) Now include "med_income" as a predictor in the model that you estimated for question 3. Interpret the estimated coefficients.
b) Are the estimated coefficients statistically significant? How do you know?
c) Compare the adjusted R-squared from this model and the model you estimated in question 3. What can you infer from comparing the adjusted R-squared between the two models?
d) Interpret the results from the F-test for this model.
e) Use your estimated model to predict per capita cigarette consumption in a state where the average price of a pack of cigarettes is $4 and the median household income is $42,000.
f) Now estimate a model that allows the effect of cigarette prices to depend on the state's median household income. Interpret the model's estimated coefficients.
g) Include region dummy variables as predictors in the model that you estimated in question 3. Interpret the estimated coefficients and conduct an F-test to determine if the estimated coefficients on the region dummies are jointly significant.

5. Non-linear functional forms:
a) Estimate a quadratic regression model where the outcome of interest is cigarette consumption and the predictor variable is the price of a pack of cigarettes.
b) Estimate the effect of an increase in the price of a pack of cigarettes from $4 to $5.
c) Provide a scatterplot that includes the estimated regression equation for the quadratic model.
d) Estimate a linear-log regression model where the outcome of interest is cigarette consumption and the predictor is the price of a pack of cigarettes.
e) Estimate the effect of an increase in the price of a pack of cigarettes from $4 to $5.
f) Provide a scatterplot that includes the estimated regression equation for the linear-log model.
g) Which model provides a better fit to the data: the quadratic or the linear-log model? Which model do you think is more appropriate from a theoretical perspective?

6. Binary dependent variable:
a) Estimate a linear probability model where the outcome of interest is whether a state has an above-average smoking rate and the predictor is cigarette taxes. Interpret the estimated slope coefficient.
b) Provide a scatterplot that includes the estimated regression equation.
c) In general, what is the main problem associated with the linear probability model? Is that problem encountered for the model you estimated?
d) Estimate a logit model using the same variables you used for the linear probability model. Calculate and interpret the change in the odds of a state having an above-average smoking rate from an increase in cigarette taxes from $1 to $2.
e) Interpret the marginal effect at the mean value of cigarette taxes.
f) Interpret the results from the classification table provided in the output from the logit regression. Does the model seem to do a good job predicting when a state has an above-average smoking rate?
g) Why is the logit model typically preferred over the linear probability model?
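Questions 5b and 5e both ask for the change in predicted consumption when the price rises from $4 to $5. The exam itself requires gretl, so the Python below is only a sketch of the arithmetic behind each functional form. The function names and the coefficient values are hypothetical placeholders, not estimates from the State Smoking dataset; substitute the coefficients from your own gretl output.

```python
import math

def quadratic_effect(b1, b2, x0, x1):
    """Change in predicted y for y = b0 + b1*x + b2*x^2 when x moves from x0 to x1.
    The intercept b0 cancels out of the difference."""
    return b1 * (x1 - x0) + b2 * (x1 ** 2 - x0 ** 2)

def linear_log_effect(b1, x0, x1):
    """Change in predicted y for y = b0 + b1*ln(x) when x moves from x0 to x1."""
    return b1 * math.log(x1 / x0)

# Hypothetical coefficients, for illustration only (NOT from the dataset).
quad_b1, quad_b2 = -30.0, 2.0
log_b1 = -60.0

print("quadratic model, $4 -> $5: ", quadratic_effect(quad_b1, quad_b2, 4.0, 5.0))
print("linear-log model, $4 -> $5:", linear_log_effect(log_b1, 4.0, 5.0))
```

In the quadratic model the effect depends on the starting price, so the same $1 increase from a different starting point would give a different answer. In the linear-log model the effect depends only on the ratio of the two prices.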
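For questions 6d and 6e, the quantities follow from the logit form P(y = 1) = 1 / (1 + exp(-(b0 + b1*tax))). The odds equal exp(b0 + b1*tax), so raising the tax by $1 multiplies the odds by exp(b1), and the marginal effect at a given tax level is b1 * p * (1 - p). The Python below is a minimal sketch of that arithmetic, assuming a single-predictor logit. The function names, coefficients, and mean tax value are hypothetical and should be replaced with the values from your own gretl output.

```python
import math

def logit_probability(b0, b1, x):
    """Predicted probability from a single-predictor logit model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def logit_odds(b0, b1, x):
    """Odds p / (1 - p), which simplify to exp(b0 + b1*x)."""
    return math.exp(b0 + b1 * x)

def marginal_effect(b0, b1, x):
    """Derivative of the predicted probability with respect to x, evaluated at x."""
    p = logit_probability(b0, b1, x)
    return b1 * p * (1.0 - p)

# Hypothetical values, for illustration only (NOT from the dataset).
b0, b1 = 1.0, -0.8
mean_tax = 1.5

odds_at_1 = logit_odds(b0, b1, 1.0)
odds_at_2 = logit_odds(b0, b1, 2.0)
odds_ratio = odds_at_2 / odds_at_1  # equals exp(b1) for a one-unit increase

print("odds at $1 tax:", odds_at_1)
print("odds at $2 tax:", odds_at_2)
print("odds ratio:", odds_ratio)
print("marginal effect at mean tax:", marginal_effect(b0, b1, mean_tax))
```

An odds ratio below 1 means the odds of an above-average smoking rate fall as the tax rises. The marginal effect is expressed in probability units per dollar of tax and changes with the point at which it is evaluated.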

