I need this assignment completed in an application called ‘Stata’.
The task must be completed in Stata, strictly following the instructions in the brief below.
Screenshots of the Stata output must be included in the assignment.
Econometrics assignment - ASB3317
Submission date: 12 December 2022
Module leader: Ms Sheilvy Dewi Soetanto
1. The word limit for this assignment is 1200 (+-10%), which does not include referencing, tables,
figures and equations. Where an assignment exceeds the stated word limit, the following rules apply:
a. Award a mark that reflects deficiencies in the work as submitted, that is in line with
explicit marking criteria and is proportionate to the extent to which the word limit is exceeded.
b. Disregard work submitted above the word limit.
2. Assignments are to be submitted through Turnitin in Blackboard; only one submission attempt is permitted.
3. Students may request an extension to the stipulated due date in exceptional circumstances only
and where supporting documentation can be offered. Such requests must be made via the
University’s online request centre no later than 1 week prior to the due date. Requests received
on or after the due date will not be considered.
4. Assignments should be submitted by 23.59 local time in the students’ country of residence.
Where an assignment is late without an agreed extension, the following rules will apply without exception:
a. Up to 1 week late – 10% penalty.
b. From the 8th day late to the end of the second week – Assignment will be capped at the
lowest possible passing grade.
c. Assignments received after the 2 weeks will be zero graded.
(N.B – Where 10% is stated above, in practice this means that 10 points will be removed from
any grade, as each assignment is graded out of 100. An assignment graded at 50 which
receives a 10% penalty will therefore have a final grade of 40.)
5. This assignment contributes 40% of the overall module grade.
6. Before submitting your assignment, please ensure that it is properly referenced, to guard against
accusations of malpractice. Guidance on referencing can be found by clicking on the link below
and choosing “Harvard Referencing Guide”
7. The PDF must be titled with the student's ID number. The first page of
the assignment must show the student's name and ID.
8. The data are provided on Blackboard. Students must answer the questions in their
report according to whether the last digit of their ID number is even or odd.
Additional information on avoiding allegations of malpractice can be found in the Student
If any of the above rules are not satisfied, there is a penalty of 10%.
You are expected to write a report (aim for 1200 words +-10%) that reads, looks and feels like an
academic article, such as one in the American Economic Review (20% of the marks). The Stata do-file must
be included in the appendix of the report. Do not copy the questions into your file (doing so incurs a 10% penalty).
These data concern student achievement in secondary education at two Portuguese schools. The data
attributes include student grades and demographic, social and school-related features, and were collected
using school reports and questionnaires. Two datasets are provided regarding the performance in
two distinct subjects: Mathematics (mat) and Portuguese language (por). You will find the explanations
of the variables after the questions.
Students with an even ID number will analyse the determinants of the Math grade, and students with
an odd ID number should focus on the Portuguese language grade.
1. Provide the summary of statistics of the variables that you plan to use.
2. You should use the classical linear regression model which takes the form:
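$Y = \beta_0 + \beta_1 X + u$    (1)
where Y is the final grade, X is a variable of your choice based on the empirical education literature,
$\beta_0$ is the intercept, $\beta_1$ is the slope coefficient and $u$ is the error term.
Discuss the choice of the variable and interpret the coefficient and intercept.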
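Questions 1 and 2 could be started along the following lines in a do-file. This is only a minimal sketch under stated assumptions: it assumes the Math file named in the variable list (student-mat.csv) sits in the working directory and is semicolon-delimited, and it uses studytime as X purely as a placeholder. The actual choice of X should come from the literature, and students with an odd ID would load student-por.csv instead.

```stata
* Sketch for questions 1 and 2. File location, delimiter and the choice of X are assumptions.
* case(preserve) keeps variable names such as G3 and Medu exactly as they appear in the file.
import delimited using "student-mat.csv", delimiters(";") varnames(1) case(preserve) clear

* Question 1: summary statistics for the variables used in the model
summarize G3 studytime

* Question 2: simple regression of the final grade on one regressor (Equation 1)
regress G3 studytime
```

In the regression output, the row labelled _cons is the intercept estimate and the row labelled studytime is the slope estimate.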
3. Add further variables to your previous model based on the empirical literature, such as activities,
family support, etc., and estimate Equation 1 again. Comment on the sign and significance of all the
coefficient estimates. Outline the economic intuition underlying your results. Are they consistent
with the empirical literature? Explain. Do you think that children with a parent who is a teacher
4. Include the variables activities, paid and romantic in your existing model (a
multiple regression model), in addition to the variables that you selected in question 3. Then,
for each of the following statements, formulate a null hypothesis and test it in your equation:
• paid is the only determinant of the grades.
• The relationship between the activities and grades is equal to -1.3
• activities and romantic have the same effect on grades.
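The three statements above can each be turned into a restriction on the regression coefficients and checked with Stata's test command after regress. The sketch below is illustrative only. It assumes the yes/no variables are imported as strings and builds 0/1 dummies with a hypothetical _d suffix, it uses studytime and failures as stand-ins for the variables chosen in question 3, and it reads "paid is the only determinant" as the joint null that every other slope is zero, which is one possible interpretation.

```stata
* Sketch for question 4. Regressors other than activities, paid and romantic are placeholders.
import delimited using "student-mat.csv", delimiters(";") varnames(1) case(preserve) clear

* Assumes the yes/no variables arrive as strings; the _d names are hypothetical
generate byte activities_d = (activities == "yes")
generate byte paid_d       = (paid == "yes")
generate byte romantic_d   = (romantic == "yes")

regress G3 studytime failures activities_d paid_d romantic_d

* H0: all slopes except the one on paid are jointly zero (F test)
test studytime failures activities_d romantic_d

* H0: the coefficient on activities equals -1.3
test activities_d = -1.3

* H0: activities and romantic have the same coefficient
test activities_d = romantic_d
```

Each test reports an F statistic and a p-value; a small p-value is evidence against the corresponding null hypothesis.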
5. After investigating the related empirical literature, check the determinants of the mid-term grades (G1,
G2). What are the estimates of your model (coefficients of your variables)? Are they in line with the literature?
Do you find similar determinants with the final grades? Explain.
6. Test for heteroskedasticity in the model that you think is appropriate for determining the grades (in
other words, the model that you think has the necessary variables). If it is present, correct for it.
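A possible way to approach question 6 in Stata is sketched below. The regressor list is a placeholder for whichever model is finally chosen, and robust standard errors are shown as one common remedy rather than the only one. The file name and delimiter are assumptions.

```stata
* Sketch for question 6. The regressor list stands in for the chosen model.
import delimited using "student-mat.csv", delimiters(";") varnames(1) case(preserve) clear

regress G3 studytime failures absences

* Breusch-Pagan / Cook-Weisberg test; H0: constant error variance
estat hettest

* White's general test as an alternative
estat imtest, white

* If H0 is rejected, one common remedy is heteroskedasticity-robust standard errors
regress G3 studytime failures absences, vce(robust)
```

Robust standard errors leave the coefficient estimates unchanged and only adjust the standard errors, t statistics and confidence intervals.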
7. Is there evidence of multicollinearity in the model that you chose in question 6? If yes, which
variables are collinear? Discuss. If no, explain and show why.
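For question 7, variance inflation factors and pairwise correlations are two standard diagnostics. The following is a sketch only: the regressors are placeholders, Medu and Fedu are included just because parental education levels are plausible candidates for correlated regressors, and the file name and delimiter are assumptions.

```stata
* Sketch for question 7. The regressor list stands in for the chosen model.
import delimited using "student-mat.csv", delimiters(";") varnames(1) case(preserve) clear

regress G3 studytime failures absences Medu Fedu

* Variance inflation factors; values above roughly 10 are a common rule-of-thumb warning sign
estat vif

* Pairwise correlations among the regressors
correlate studytime failures absences Medu Fedu
```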
8. Interact the romantic relationship variable with the failures variable in the model that you used in questions 6
and 7. Comment on and interpret the coefficients. Do something similar with the internet and absences variables.
Discuss and comment on the results.
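Interactions of this kind can be written with Stata's factor-variable notation. The sketch below assumes the yes/no variables are imported as strings, uses hypothetical _d dummy names, and includes studytime only as a placeholder for the other regressors in the chosen model.

```stata
* Sketch for question 8. studytime stands in for the remaining regressors.
import delimited using "student-mat.csv", delimiters(";") varnames(1) case(preserve) clear

* Assumes yes/no variables arrive as strings; the _d names are hypothetical
generate byte romantic_d = (romantic == "yes")
generate byte internet_d = (internet == "yes")

* ## includes both main effects and their product
regress G3 c.failures##i.romantic_d studytime

* Same structure for absences and internet access
regress G3 c.absences##i.internet_d studytime
```

In the first regression, the coefficient on the interaction term is the difference in the slope of failures between students who are in a romantic relationship and those who are not.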
9. Discuss whether endogeneity issues are present in the model that you chose in questions 6
and 7. If yes, identify them and explain how you can solve them. Does the empirical literature provide
potential instruments for the variables that might be endogenous? Discuss your answer.
10. Do you think that your data has the form of a panel dataset? If yes, explain the reason. If no, explain
what you would need to do to make it one. Would a panel dataset alter your results? Discuss.
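One way to explore the panel question is to stack the three period grades into a long format. This is a hedged sketch, not a recommended answer: the dataset has no student identifier in the variable list, so the row number is used as a hypothetical id, and the grades are assumed to import as numeric. The file name and delimiter are also assumptions.

```stata
* Sketch for question 10. The id variable is hypothetical (row number).
import delimited using "student-mat.csv", delimiters(";") varnames(1) case(preserve) clear

generate long id = _n

* Stack G1, G2 and G3 into one variable G indexed by period = 1, 2, 3
reshape long G, i(id) j(period)

* Declare the panel structure
xtset id period
```

After the reshape only the grade varies across periods; all other variables are constant within a student, which matters for any fixed-effects discussion.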
The definitions of the variables are below:
# Attributes for both student-mat.csv (Math course) and student-por.csv (Portuguese language course)
1 school – student’s school (binary: ‘GP’ – Gabriel Pereira or ‘MS’ – Mousinho da Silveira)
2 sex – student’s sex (binary: ‘F’ – female or ‘M’ – male)
3 age – student’s age (numeric: from 15 to 22)
4 address – student’s home address type (binary: ‘U’ – urban or ‘R’ – rural)
5 famsize – family size (binary: ‘LE3’ – less or equal to 3 or ‘GT3’ – greater than 3)
6 Pstatus – parent’s cohabitation status (binary: ‘T’ – living together or ‘A’ – apart)
7 Medu – mother’s education (numeric: 0 – none, 1 – primary education (4th grade), 2 – 5th to 9th grade,
3 – secondary education or 4 – higher education)
8 Fedu – father’s education (numeric: 0 – none, 1 – primary education (4th grade), 2 – 5th to 9th grade,
3 – secondary education or 4 – higher education)
9 Mjob – mother’s job (nominal: ‘teacher’, ‘health’ care related, civil ‘services’ (e.g. administrative or police),
‘at_home’ or ‘other’)
10 Fjob – father’s job (nominal: ‘teacher’, ‘health’ care related, civil ‘services’ (e.g. administrative or police),
‘at_home’ or ‘other’)
11 reason – reason to choose this school (nominal: close to ‘home’, school ‘reputation’, ‘course’ preference or
12 guardian – student’s guardian (nominal: ‘mother’, ‘father’ or ‘other’)
13 traveltime – home to school travel time (numeric: 1 – <15 min., 2 – 15 to 30 min., 3 – 30 min. to 1 hour,
or 4 – >1 hour)
14 studytime – weekly study time (numeric: 1 – <2 hours, 2 – 2 to 5 hours, 3 – 5 to 10 hours, or 4 – >10 hours)
15 failures – number of past class failures (numeric: n if 1<=n<3, else 4)
16 schoolsup – extra educational support (binary: yes or no)
17 famsup – family educational support (binary: yes or no)
18 paid – extra paid classes within the course subject (Math or Portuguese) (binary: yes or no)
19 activities – extra-curricular activities (binary: yes or no)
20 nursery – attended nursery school (binary: yes or no)
21 higher – wants to take higher education (binary: yes or no)
22 internet – Internet access at home (binary: yes or no)
23 romantic – with a romantic relationship (binary: yes or no)
24 famrel – quality of family relationships (numeric: from 1 – very bad to 5 – excellent)
25 freetime – free time after school (numeric: from 1 – very low to 5 – very high)
26 goout – going out with friends (numeric: from 1 – very low to 5 – very high)
27 Dalc – workday alcohol consumption (numeric: from 1 – very low to 5 – very high)
28 Walc – weekend alcohol consumption (numeric: from 1 – very low to 5 – very high)
29 health – current health status (numeric: from 1 – very bad to 5 – very good)
30 absences – number of school absences (numeric: from 0 to 93)
# these grades are related with the course subject, Math or Portuguese:
31 G1 – first period grade (numeric: from 0 to 20)
32 G2 – second period grade (numeric: from 0 to 20)
33 G3 – final grade (numeric: from 0 to 20, output target)