Biostatistics Physical Health Data Analysis


NOTES ABOUT DATA:

• 2019 BRFSS SPSS Data File.sav – These data are from the Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS collects “state data about U.S. residents regarding their health-related risk behaviors, chronic health conditions, and use of preventive services.” Be sure to read the Background section of the 2019 BRFSS Overview for more details so you get a better idea of the BRFSS and how the data are used. Investigators all over the country use these data to conduct research about many different characteristics and how they affect health outcomes. The data file for this project is not the complete data set. There are over 250 variables in the complete data set. I narrowed it down to the few variables I want you to use for this project and simplified coding for the sake of your sanity and to best demonstrate your learning of concepts.

INSTRUCTIONS: (Please read each question thoroughly)

You are a statistician who is tasked with helping a researcher who is interested in determining what characteristics influence a person to report poor health. Using the BRFSS, the researcher finds that there are a few variables that can help her answer that question. She first asks if you can conduct some analyses to determine what characteristics predict someone reporting at least one day of poor physical health in the last 30 days (PHYSHLTH_YES_NO). In addition, for those who reported at least one day of poor physical health, she is also interested in determining what influences the reported number of days of poor physical health (PHYSHLTH_DAYS). Among other characteristics, the investigator is primarily interested in determining whether veteran status (variable name: VETERAN) and adverse childhood experiences (ACES) (variable name: ACES_Score) influence these two dependent variables.
Because there is a range of confounding variables to consider, the researcher also collected data about sex, health insurance, marital status, education, home ownership, income, age, smoking, alcohol use, and exercise, among others. Your job is to help the researcher answer her research questions.

1. Using the graphing options in SPSS, choose two appropriate graphical display options to describe PHYSHLTH_DAYS. You should be able to describe whether this variable is normally distributed, and whether there are outliers in the data using the two display options you choose. Copy and paste your graphs/charts below and for each, provide an interpretation of the graph, and explain why you chose that option.

The selected graphical options are the histogram with normality curve and the dot plot.

The Histogram with Normality Curve

Figure 1.1 shows a histogram with a normal distribution curve. The histogram was selected because it provides a view of the central tendency, spread, and shape of the data set, including the presence of outliers. By showing the shape of the data set, the histogram provides an at-a-glance view of whether or not the data follow a normal distribution. The data set presents a normal distribution, as evidenced by the single-peaked, bell-shaped normality curve, with observations spread symmetrically around the mean. No outliers are evident from the distribution.

Figure 1.1

The Dot Plot

Figure 2.1 presents a dot plot. The dot plot, like the histogram, presents a view of the frequency distribution of the data points in the data set. However, the dot plot provides information on the frequency of individual values, and not of a range of values like the histogram. The dots appear as complete bars due to the large number of observations at each value. Longer bars represent higher frequencies.
Thus, since it focuses on individual data points, the dot plot provides a more effective way than the histogram of assessing whether outliers exist in the data set. Outliers are data points that are extremely high or extremely low compared to the rest of the data. The dot plot shows that there are no outliers in the data set.

Figure 2.1

2. The variable PHYSHLTH_YES_NO is a categorical, binary, nominal variable (either people report poor physical health (Yes=1), or they do not (No=0)). Based on this categorical variable, use the appropriate statistical test to determine if there is a difference in ACES_Score and ALCOHOL between the groups who do and do not report poor physical health. You will be doing two hypothesis tests: one for ACES_Score, and one for ALCOHOL. For each test, conduct a formal hypothesis test to answer this question (choose the appropriate statistical test, explain why you chose it, write out your null and alternative hypotheses, run the test, and interpret the results). Include appropriate output from SPSS to show what you did.

To test whether there is a difference in ACES score between the two groups (YES and NO), the independent samples t-test will be used. The independent samples t-test answers this question by comparing the mean ACES scores of the two independent groups to determine whether the mean for the group that reports YES (poor physical health) differs significantly from the mean for the group that reports NO (good physical health). The independent samples t-test is appropriate because the data meet the following requirements: i) the dependent variable, ACES score, is a continuous ratio variable; ii) the independent variable, PHYSHLTH_YES_NO, is a categorical variable with only two categories (Yes and No); and iii) the groups or categories are independent, so a participant cannot be in both groups.
The null and alternative hypotheses for the independent samples t-test are:

H0: ACES SCORE_YES – ACES SCORE_NO = 0 (the difference of the means is equal to 0)
H1: ACES SCORE_YES – ACES SCORE_NO ≠ 0 (the difference of the means is not equal to 0)

Before running the t-test, it is advisable to run a comparison box plot to obtain an idea of what to expect in the test. The box plot is presented below. If the variances of the two groups in regard to ACES score were equal, the boxes would have roughly equal lengths.

Figure 2.1

From the box plots in Figure 2.1, it is evident that the variances for the two categories are quite different, as the spread of observations for the YES category is greater than that of the NO category. This suggests that the two groups differ by ACES score. The next step is to run the independent samples t-test to check whether the difference between the groups is significant. Results of the t-test are presented in Tables 2.1 and 2.2 below:

Table 2.1
Group Statistics
Dependent variable: Total Adverse Childhood Experiences Score (ACES)
Grouping variable: Did you have any days in the last month when your physical health was not good?

| Group | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| No | 58968 | 1.4565 | 1.96211 | .00808 |
| Yes | 37273 | 2.1046 | 2.40083 | .01244 |

Table 2.2
Independent Samples Test
Dependent variable: Total Adverse Childhood Experiences Score (ACES)
Columns F and Sig. report Levene's Test for Equality of Variances; the remaining columns report the t-test for Equality of Means.

| | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference, Lower | 95% CI of the Difference, Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 1954.881 | .000 | -45.707 | 96239 | .000 | -.64807 | .01418 | -.67586 | -.62028 |
| Equal variances not assumed | | | -43.700 | 67753.523 | .000 | -.64807 | .01483 | -.67714 | -.61900 |

From the group statistics in Table 2.1, 58,968 participants reported good physical health, while 37,273 reported poor physical health. The mean ACES score for the YES (poor physical health) group is 2.10, while that of the NO group is 1.46. Table 2.2 presents the results of the t-test.
Levene’s test for equality of variances yields a significant result (p < 0.001). Hence, we reject the null hypothesis of Levene’s test and conclude that the variance in ACES score for the group that reports poor physical health (YES) is significantly different from that of the group that reports good physical health (NO). This implies that we need to use the Equal Variances Not Assumed row in interpreting the t-test results. The negative t-value of -43.70 indicates that the mean ACES score for the first group (NO, good physical health) is lower than that of the second group (YES, poor physical health). The associated p-value (p
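As a cross-check on the Equal Variances Not Assumed row, the Welch t statistic, its Welch-Satterthwaite degrees of freedom, and an approximate 95% confidence interval can be recomputed from the group statistics in Table 2.1 alone. The Python sketch below is an illustration and not part of the original SPSS workflow. The function name welch_from_summary is hypothetical, and the interval uses the normal critical value 1.96 as an approximation, which is reasonable here only because the degrees of freedom are very large.

```python
import math


def welch_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Welch's unequal-variance t-test from summary statistics.

    Hypothetical helper for illustration; the difference is group 1 minus group 2.
    """
    v1 = sd1 ** 2 / n1  # squared standard error of the mean, group 1
    v2 = sd2 ** 2 / n2  # squared standard error of the mean, group 2
    se = math.sqrt(v1 + v2)  # standard error of the difference
    diff = mean1 - mean2
    t = diff / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    z = 1.96  # assumption: normal critical value, adequate for very large df
    return {"diff": diff, "se": se, "t": t, "df": df,
            "ci": (diff - z * se, diff + z * se)}


# Inputs copied from Table 2.1 (NO group first, YES group second)
result = welch_from_summary(58968, 1.4565, 1.96211, 37273, 2.1046, 2.40083)
print(f"t = {result['t']:.2f}, df = {result['df']:.1f}")
print(f"95% CI: ({result['ci'][0]:.4f}, {result['ci'][1]:.4f})")
```

Because the means in Table 2.1 are rounded to four decimals, the recomputed values agree with the SPSS output only to about two decimal places.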


3. The variable PHYSHLTH_YES_NO is a categorical, binary, nominal variable. Either people report poor physical health (Yes=1), or they do not (No=0). Based on this categorical variable, use the appropriate statistical test to determine if there is an association between PHYSHLTH_YES_NO and VETERAN. Conduct a formal hypothesis test to answer this question (choose the appropriate statistical test, explain why you chose it, write out your null and alternative hypotheses, run the test, and interpret the results). Include appropriate output from SPSS to show what you did. In addition, answer the following question: Is there a difference in the rate of reporting poor physical health between people who are veterans and those who are not? How do you know?

ANSWER BELOW:

The appropriate statistical test in this…


…continuous variable, and the independent variables are either continuous or categorical.

Table 6.1
Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .394 (a) | .155 | .154 | 10.646 |

a. Predictors: (Constant), Body Mass Index, Own or Rent Home (Housing Status), Current employment status, Respondent Sex, Urban or Rural Location, During the past month, other than your regular job, did you participate in any physical exercise?, Do you have any type of health insurance?, Marital Status, Total Adverse Childhood Experiences Score (ACES), During the last 30 days what is the largest number of alcoholic drinks you had on any occasion?, Education, How often do you smoke cigarettes?, Are you a veteran?, Is your annual household income from all sources:, Age categories

From the R square value presented in the model summary (Table 6.1), the independent variables in the model account for 15.5 percent of the variation in the dependent variable PHYSHLTH_DAYS. The multiple correlation coefficient (R) of 0.394 indicates a relatively weak correlation between the dependent variable and the set of independent variables.

Table 6.2
ANOVA (a)

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 309155.086 | 15 | 20610.339 | 181.839 | .000 (b) |
| Residual | 1682820.827 | 14847 | 113.344 | | |
| Total | 1991975.913 | 14862 | | | |

a. Dependent Variable: If you had any days in the last month when your physical health was not good, how many days was it not good?
b. Predictors: (Constant), Body Mass Index, Own or Rent Home (Housing Status), Current employment status, Respondent Sex, Urban or Rural Location, During the past month, other than your regular job, did you participate in any physical exercise?, Do you have any type of health insurance?, Marital Status, Total Adverse Childhood Experiences Score (ACES), During the last 30 days what is the largest number of alcoholic drinks you had on any occasion?, Education, How often do you smoke cigarettes?, Are you a veteran?, Is your annual household income from all sources:, Age categories

The ANOVA table (Table 6.2) reports how well the regression equation predicts the dependent variable PHYSHLTH_DAYS. The F-test is significant (p < 0.001, which is less than 0.05), indicating that the model as a whole predicts PHYSHLTH_DAYS significantly better than a model with no predictors.

Results of the linear regression are presented in Table 6.3 below. Generally, one’s sex, marital status, education level, employment status, household income level, smoking status, alcohol consumption, participation in physical exercise, ACES score, age, and BMI significantly predict the self-reported number of days of poor physical health (p < 0.05 for each). At the same time, one’s health insurance status, veteran status, home ownership status, and urban or rural location do not significantly influence the self-reported number of days of poor physical health. The specific interpretation of each independent variable is presented below the table:

Table 6.3
Coefficients (a)
B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

| Model 1 | B | Std. Error | Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 15.185 | .719 | | 21.133 | .000 |
| Respondent Sex | -1.169 | .201 | -.050 | -5.810 | .000 |
| Do you have any type of health insurance? | .441 | .327 | .011 | 1.349 | .177 |
| Marital Status | .516 | .135 | .032 | 3.822 | .000 |
| Education | -.452 | .102 | -.037 | -4.413 | .000 |
| Own or Rent Home (Housing Status) | .181 | .213 | .007 | .849 | .396 |
| Are you a veteran? | -.057 | .258 | -.002 | -.221 | .825 |
| Current employment status | -4.082 | .223 | -.169 | -18.335 | .000 |
| Is your annual household income from all sources: | -1.318 | .096 | -.131 | -13.762 | .000 |
| How often do you smoke cigarettes? | .876 | .196 | .037 | 4.468 | .000 |
| During the last 30 days what is the largest number of alcoholic drinks you had on any occasion? | -.161 | .026 | -.049 | -6.101 | .000 |
| During the past month, other than your regular job, did you participate in any physical exercise? | -4.197 | .185 | -.177 | -22.690 | .000 |
| Total Adverse Childhood Experiences Score (ACES) | .284 | .036 | .064 | 7.788 | .000 |
| Urban or Rural Location | -.235 | .234 | -.008 | -1.005 | .315 |
| Age categories | .495 | .088 | .059 | 5.643 | .000 |
| Body Mass Index | .048 | .012 | .030 | 3.898 | .000 |

a. Dependent Variable: If you had any days in the last month when your physical health was not good, how many days was it not good?

Sex: sex is a significant predictor of self-reported number of days of poor physical health (p
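The model summary figures in Table 6.1 follow directly from the sums of squares and degrees of freedom in Table 6.2, so the two tables can be checked against each other. The Python sketch below is an illustration and not part of the original SPSS output. The function name model_summary is hypothetical; the formulas are the standard definitions of R square, adjusted R square, the F ratio, and the standard error of the estimate.

```python
import math


def model_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Derive regression fit statistics from an ANOVA table (hypothetical helper)."""
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    ms_regression = ss_regression / df_regression  # mean square, regression
    ms_residual = ss_residual / df_residual        # mean square, residual
    r_squared = ss_regression / ss_total           # share of variation explained
    return {
        "r": math.sqrt(r_squared),
        "r_squared": r_squared,
        # adjusted R square penalizes for the number of predictors
        "adj_r_squared": 1 - ms_residual / (ss_total / df_total),
        "f": ms_regression / ms_residual,
        "std_error_estimate": math.sqrt(ms_residual),
    }


# Inputs copied from Table 6.2
summary = model_summary(309155.086, 1682820.827, 15, 14847)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The recomputed values round to the figures reported in Tables 6.1 and 6.2: R = .394, R Square = .155, Adjusted R Square = .154, F = 181.839, and Std. Error of the Estimate = 10.646.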

"Biostatistics Physical Health" (2023, August 08) Retrieved June 5, 2026, from
https://www.aceyourpaper.com/essays/biostatistics-physical-health-2179808
