SOCI 252: Homework 9

Social Factors Affecting Maternal and Child Health

In this exercise, we will analyze real, historical birth data. The goal is to model the relationship between length of pregnancy and measured weight of the infant at birth. The dataset we will use is in the ncbirths.csv file. Table 1 shows the names and descriptions of the variables in this dataset, where the unit of observation is the infant.

| Variable | Description |
|---|---|
| weight | Measured weight of infant at birth (in pounds) |
| gender | Infant’s assigned gender at birth: “female” or “male” |
| weeks | Length of pregnancy (in weeks) |
| fage | Father’s age at infant’s birth (in years) |
| mage | Mother’s age at infant’s birth (in years) |
| firstvisit | When the family first sought prenatal care (in weeks) |
| marital | Whether the mother is married: 1 = yes, 0 = no |
| smoker | Whether the mother is a smoker: 1 = yes, 0 = no |
| whitemom | Whether the mother is white: 1 = yes, 0 = no |

In this problem set, we practice answering the following four questions related to causal studies: (1) What is the estimated average treatment effect? (2) Is the effect statistically significant at the 5% level? (3) Can we interpret the effect as causal? And (4) Can we generalize the results?

As always, start by loading and looking at the data.
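A minimal sketch of this first step in R, assuming ncbirths.csv sits in the current working directory. The object name births is a hypothetical choice, not one the assignment prescribes.

```r
# Assumption: ncbirths.csv is in the current working directory.
path <- "ncbirths.csv"

if (file.exists(path)) {
  # "births" is a hypothetical object name chosen for this sketch.
  births <- read.csv(path)
  print(head(births))     # first few rows
  print(dim(births))      # number of rows and columns
  print(summary(births))  # per-variable summaries
} else {
  message("ncbirths.csv not found; set the working directory first.")
}
```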

1. What is the estimated average causal effect of being a smoker on the birth weight of a baby?

a. Given our research question, what should be our outcome variable (Y)? Visualize its distribution and comment on the distribution of birth weight.

b. Given our research question, a binary variable identifying the smoker status of the mother should be our treatment variable (X). Visualize its distribution and comment on the number of births to mothers who were smokers and mothers who were non-smokers.

c. Now that we have both our Y and X variables, fit a linear model to the data in such a way that the estimated slope coefficient is equivalent to the difference-in-means estimator you are interested in, and store the fitted model in an object called fit.

d. Create a visualization of the relationship between X and Y and add the fitted line. (Hint: The functions plot() and abline() might be helpful here.)

e. What is the estimated slope coefficient, $\hat{\beta}$?

f. Now, let’s answer the question: What is the estimated average treatment effect? Provide a full substantive answer (make sure to include the assumption, why the assumption is reasonable, the treatment, the outcome, as well as the direction, size, and unit of measurement of the average treatment effect).
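The equivalence that parts (c) through (e) rely on can be sketched in R as follows. The data frame toy and its values are invented purely for illustration and are not taken from ncbirths.csv; the object names toy_fit, slope, and diff_means are likewise hypothetical. With a binary X, the slope from lm() should match the difference in group means.

```r
# Toy data, invented for illustration only (NOT from ncbirths.csv).
toy <- data.frame(
  weight = c(7.5, 8.1, 6.9, 7.8, 6.2, 6.8, 7.0, 6.4),
  smoker = c(0, 0, 0, 0, 1, 1, 1, 1)
)

# Difference-in-means estimator: mean(Y | X = 1) - mean(Y | X = 0).
diff_means <- mean(toy$weight[toy$smoker == 1]) -
  mean(toy$weight[toy$smoker == 0])

# Simple linear regression of Y on the binary X.
toy_fit <- lm(weight ~ smoker, data = toy)
slope <- unname(coef(toy_fit)["smoker"])

# The two quantities agree up to floating-point error.
print(all.equal(slope, diff_means))

# Scatter plot of Y against X with the fitted line added.
plot(toy$smoker, toy$weight, xlab = "smoker", ylab = "weight (pounds)")
abline(toy_fit)
```

Because X takes only the values 0 and 1, the fitted line passes through the two group means, which is why the slope and the difference-in-means estimator coincide.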

2. Is the effect statistically significant at the 5% level?

a. Let’s start by specifying the null and alternative hypotheses. Please provide both the mathematical notations and their meaning.

b. What is the value of the observed test statistic, z_obs? (Hint: the code summary() or summary()$coeff might be helpful here.)

c. What is the associated p-value?

d. Now, let’s answer the question: Is the effect statistically significant at the 5% level? Please provide your reasoning.
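One way to pull the quantities in Part 2 out of a fitted model is sketched below in R, again on invented toy data rather than ncbirths.csv. Note that summary() on an lm object labels the relevant columns "t value" and "Pr(>|t|)"; this sketch assumes the assignment's observed test statistic corresponds to the "t value" column, which is an assumption rather than something the assignment states.

```r
# Toy data, invented for illustration only (NOT from ncbirths.csv).
toy <- data.frame(
  weight = c(7.5, 8.1, 6.9, 7.8, 6.2, 6.8, 7.0, 6.4),
  smoker = c(0, 0, 0, 0, 1, 1, 1, 1)
)
toy_fit <- lm(weight ~ smoker, data = toy)

# Coefficient table: one row per term, columns for the estimate,
# its standard error, the test statistic, and the two-sided p-value.
coefs <- summary(toy_fit)$coefficients

# Assumption: the observed test statistic is the "t value" column.
test_stat <- coefs["smoker", "t value"]
p_value   <- coefs["smoker", "Pr(>|t|)"]

# The test statistic is the estimate divided by its standard error.
print(all.equal(test_stat,
                coefs["smoker", "Estimate"] / coefs["smoker", "Std. Error"]))

# Decision rule at the 5% level: reject the null if p-value < 0.05.
reject_null <- p_value < 0.05
print(reject_null)
```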

3. Is the effect statistically significant at the 5% level after controlling for a confounder? Repeat Part 2, but include weeks in your model.

a. Let’s start by specifying the null and alternative hypotheses for the smoker coefficient. Please provide both the mathematical notations and their meaning.

b. What is the value of smoker’s observed test statistic, z_obs? (Hint: Specify your new model, save it to the environment, and then check summary() or summary()$coeff.)

c. What is the associated p-value of smoker?

d. Now, let’s answer the question: Is the effect of smoking statistically significant at the 5% level? Please provide your reasoning.
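Adding a control variable changes only the model formula; the coefficient table is read the same way. The R sketch below uses invented toy data (not ncbirths.csv), and the object name fit_control is a hypothetical choice. As before, treating the "t value" column as the observed test statistic is an assumption.

```r
# Toy data, invented for illustration only (NOT from ncbirths.csv).
toy <- data.frame(
  weight = c(7.5, 8.1, 6.9, 7.8, 6.2, 6.8, 7.0, 6.4),
  smoker = c(0, 0, 0, 0, 1, 1, 1, 1),
  weeks  = c(39, 41, 37, 40, 36, 39, 40, 38)
)

# Multiple regression: treatment (smoker) plus the control (weeks).
fit_control <- lm(weight ~ smoker + weeks, data = toy)
coefs <- summary(fit_control)$coefficients

# The table now has one row per term: (Intercept), smoker, weeks.
print(rownames(coefs))

# Read off the row for smoker only; the weeks row is not the focus here.
smoker_stat <- coefs["smoker", "t value"]
smoker_p    <- coefs["smoker", "Pr(>|t|)"]
print(c(statistic = smoker_stat, p_value = smoker_p))
```

The smoker coefficient in this model is interpreted holding weeks constant, so its estimate, test statistic, and p-value can differ from those of the model without the control.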

4. Can we interpret the estimated effect as causal? In other words, how strong is the internal validity of this study? Have the researchers accurately measured the average causal effect on the sample of infants who were part of the study? Please explain your reasoning.

5. Can we generalize the results? In other words, how strong is the external validity of this study? Please explain your reasoning and be specific about what population you think the findings can or cannot be generalized to. 




