STATISTICAL ANALYSIS OF AIDS IN QUEENS: CORRELATION COEFFICIENT

         Determining whether two variables are correlated is an important step before drawing any conclusion about whether one factor relates to the other.  Statistical correlation measures the strength of the relationship between two variables; however, it does not establish causation.  I live in Woodside, a community in Queens; with 61 zip codes in my data, the value of n in my case is 61, which gives me degrees of freedom = (61-2) = 59.  Therefore, at a 95% confidence level, I used 0.250 as the critical value of the correlation coefficient to determine whether a correlation that I computed was statistically significant.  To be statistically significant, the absolute value of a correlation that I calculated has to be greater than or equal to 0.250.  Statistically significant here means that, if the two variables were truly unrelated, a correlation this strong would arise by chance less than 5% of the time; in other words, at a 95% confidence level there is a 5% chance that we may be wrong in concluding that a relationship exists.  Because I had a lot of zip codes, my n value gave me higher degrees of freedom, meaning that I did not need a high correlation in order to conclude that a result is statistically significant.  In fact, most of my correlation coefficients were significant at the 97.5% and 99% confidence levels, indicating that there is an even smaller chance of making a mistake.
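The cutoffs 0.250 and 0.325 used in this report can be cross-checked against the t distribution, since a critical correlation equals t / sqrt(t^2 + df).  The following is only a minimal sketch under stated assumptions: a two-tailed test, and critical t values of 2.000 (95%) and 2.660 (99%) read from the df = 60 row of a standard t table, the row closest to the 59 degrees of freedom here.  The function name critical_r is hypothetical.

```python
from math import sqrt

def critical_r(t_crit, df):
    """Convert a critical t value into the matching critical correlation."""
    return t_crit / sqrt(t_crit ** 2 + df)

# Assumed two-tailed critical t values from the df = 60 row of a t table.
r_95 = critical_r(2.000, 60)
r_99 = critical_r(2.660, 60)
print(round(r_95, 3), round(r_99, 3))
```

Under these assumptions the two results round to 0.250 and 0.325, which suggests that the cutoffs correspond to the df = 60 row of a table rather than to df = 59 exactly; the difference is small.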

 

Correlation Coefficient for cumulative AIDS rates v. Family Income:          

 

            When I thought about the relationship between family income and the AIDS rate, I expected the correlation to be strongly negative, since I was left with the impression that a better socioeconomic status leads to better education and health benefits.  At least this was what I was told by my high school teacher in Health class.  The overall pattern of my data is that as the percentage of families with income less than $10,000 goes up, the cumulative AIDS rate goes up.  Likewise, as the percentage of families with income between $100,000 and $124,999 goes up, the AIDS rate goes down.  However, when I performed an in-depth analysis of the data, East Elmhurst seemed to deviate from this pattern.  While the percentage of family income less than $10,000 for East Elmhurst, zip code 11370, in 1999 was 7.124228, the AIDS rate soared to 3713 cases per 100,000.  Meanwhile, Far Rockaway, zip code 11691, has a percentage of family income between $100,000 and $124,999 of 14.24615, and its AIDS rate was 1196.1.  This ratio is far less than the ratio for East Elmhurst.  As the percentage of family income below $10,000 decreases, the AIDS rate should decrease as well, because this means that fewer people are in poverty.  What could be going on in East Elmhurst?  I researched the unemployment rate for East Elmhurst and found that it is 5.2% (2).  The normal unemployment rate is about 5.5%, so there is no apparent problem with East Elmhurst’s unemployment rate.  I tried to research the drug use rate in this area; however, I could not find any data.  My guess is that the unavailability of the data is related to the stigma of using drugs.  Nevertheless, the correlation coefficient for the AIDS rate and the percentage of family income less than $10,000 is 0.563877.  At the 95% confidence level, this correlation is statistically significant, using the critical value 0.250 as a guide.  Even further, at the 99% confidence level, the correlation is also statistically significant, using the critical value 0.325 as a basis.  This gives us even higher confidence that as the percentage of family income less than $10,000 increases, the AIDS rate tends to increase as well.  The percentage of family income between $100,000 and $124,999 and the AIDS rate have a correlation equal to -0.48761 (Table A).  This means that as the percentage of people with higher income increases, the AIDS rate tends to decrease.  This correlation is also significant at the 99% confidence level.
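A correlation can also be tested directly by converting it to a t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2), and comparing the result with a t table.  The following is a minimal sketch, assuming a two-tailed test with n = 61; the function name t_statistic is hypothetical, and the two r values are the family income coefficients reported above.

```python
from math import sqrt

def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

N_ZIP_CODES = 61  # n used in this report

t_low_income = t_statistic(0.563877, N_ZIP_CODES)   # income below $10,000
t_high_income = t_statistic(-0.48761, N_ZIP_CODES)  # income $100,000 to $124,999
print(round(t_low_income, 2), round(t_high_income, 2))
```

Both magnitudes (roughly 5.2 and 4.3) are well above 2.66, the approximate two-tailed 99% critical t value for about 60 degrees of freedom, which agrees with the conclusion that both correlations are significant at the 99% confidence level.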

 

 

1) American Fact Finder. U.S. Census Bureau. 15 July, 2008.  http://factfinder.census.gov/home/saff/main.html?_lang=en

 

NYC Cumulative AIDS diagnoses by Borough and Zip Code (January 1, 1981 thru December 31, 2006)

HIV Epidemiology and Field Services Program, NYC Department of Health and Mental Hygiene

 

2) http://realestate.yahoo.com/New_York/East_Elmhurst/neighborhoods

 

 

Correlation Coefficient for Cumulative AIDS rates and Poverty Level

 

Poverty Level: A minimum income level below which a person is officially considered to lack adequate subsistence and to be living in poverty.  (http://dictionary.reference.com/browse/poverty%20level)

 

Correlation Coefficient for cumulative AIDS rates vs Poverty Level

 

% below poverty level = 0.51936329

 

% above poverty level = -0.51936329

 

            Before doing research regarding the poverty levels in Queens, New York, I thought that there would be a strong correlation between the AIDS rate and poverty level, since socioeconomic status tends to affect a person’s knowledge about his/her surroundings and health situation.  The data supported my claim: the correlation coefficient for the AIDS rate v. the percentage of people below the poverty level is 0.51936329.  This positive correlation means that as the percentage of people below the poverty level increases, the AIDS rate tends to increase as well.  This makes sense because impoverished people tend to be deprived of basic knowledge regarding AIDS/HIV, and financial concerns prevent them from obtaining proper medical treatment.  Furthermore, impoverished neighborhoods tend to have higher rates of sex and drug activity driven by desperation to acquire money.  On the other hand, the correlation between the AIDS rate and the percentage above the poverty level is -0.51936329, which also makes sense because as the percentage of people above the poverty level increases, the AIDS rate should go down (Table F).  The poverty levels in 1999 were $8,501 for a one-person household, $10,869 for two people, $13,290 for three people, $17,029 for four people, and $20,127 for five people.  These thresholds do not include children under the age of 18.
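The two poverty coefficients are exact mirror images because the percentage above the poverty level is simply 100 minus the percentage below it, and subtracting a variable from a constant flips the sign of its correlation without changing its size.  The sketch below illustrates this with made-up numbers; the five values in each list are hypothetical and are not taken from the Queens data, and pearson_r is an illustrative helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy values for five zip codes (NOT the study's data).
pct_below = [5.0, 8.0, 12.0, 20.0, 15.0]
aids_rate = [300.0, 450.0, 700.0, 1500.0, 900.0]
pct_above = [100.0 - p for p in pct_below]  # complement of pct_below

r_below = pearson_r(pct_below, aids_rate)
r_above = pearson_r(pct_above, aids_rate)
print(r_below, r_above)  # same magnitude, opposite sign
```

Because of this property, the "% above poverty level" coefficient carries no information beyond the "% below poverty level" coefficient; either one alone describes the relationship.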

 

US Census Bureau.  16 July 2008. http://www.census.gov/hhes/www/poverty/threshld/thresh99.html

 

Correlation Coefficient for Cumulative AIDS rates and Race

 

 % Black People = 0.492176657

 

% White People = -0.52399

 

            I knew that racial background was one of the predominant factors with regard to AIDS rates, particularly because of the numerous articles that I have read pertaining to African Americans and their high risk for AIDS.  For instance, according to an article written by Linda Villarosa, “AIDS is the leading killer of young African-American men” (1).  The media have also portrayed African American women as having a higher risk for AIDS than women of any other racial group.  Therefore, I guessed that the correlation between AIDS rates and the percentage of Black people would be strong and positive.  My findings supported my claim.  According to my calculation, the correlation between AIDS rates and the percentage of Black people is equal to 0.492176657 (Table B).  Although this was not as strong as I expected, it is still statistically significant at the 99% confidence level.  One theory as to why AIDS is more prevalent among Black people is that their socioeconomic status tends to be lower than that of other races.  For instance, according to the data, the third largest population of Black people lives in Jamaica, Queens, and based on what I saw when I went to the location, the neighborhood is not the most affluent neighborhood in Queens.  Many gangs hang around the street corners, and the environment reflected a low socioeconomic status.  Also, in my opinion, this goes back in history to when African Americans were not given an education equal to that of other races, which affected their educational background.

The negative correlation between the AIDS rate and the percentage of the White population (-0.52399) fits the rest of the data as well.  Breezy Point, zip code 11697, has the highest percentage of the White population, and it also has the highest percentage of Family Income $100,000 to $124,999 (Table C).  This indicates that Breezy Point consists largely of rich, white people.  Family income, as stated earlier, correlates significantly with the cumulative AIDS rate.  Therefore, a possible reason why the correlation between AIDS rates and the percentage of White people is negative is that they tend to be wealthier than other races.  Thus, they are capable of obtaining higher education and better health benefits due to their socioeconomic status.  It can therefore be inferred that as the percentage of the White population increases, the AIDS rate tends to decrease.  This inference, however, rests only on the computed correlations.

 

1) http://query.nytimes.com/gst/fullpage.html?res=9B01E3D8103FF930A35757C0A9679C8B63&fta=y

 

Educational Background and AIDS rates

 

The educational background of the people also correlates significantly with the cumulative AIDS rate.  For instance, the correlation for the percentage of males without a high school diploma is 0.40665475.  This is significant at the 99% confidence level, meaning that there is less than a 1% chance of seeing a correlation this strong if the two variables were unrelated; as the percentage of males without a high school diploma increases, the AIDS rate tends to increase.  As males gain further education, the AIDS rate decreases as well.  This can be concluded from the negative correlations that were computed for the percentages of males who earned bachelor’s and doctorate degrees, with correlations of -0.5418 and -0.2760 respectively (Table D).  The same can be concluded for females.  The correlation between the AIDS rate and the percentage of females without a high school diploma is 0.4750, and for the percentage of females with a bachelor’s degree and the AIDS rate, the correlation is -0.3964 (Table D).  As the educational background of the people progresses, the AIDS rate decreases.  A possible explanation is that the more education people are able to obtain, and the more informed they are about AIDS, the more aware they will be of their surroundings and the wiser the decisions they will make.
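The education coefficients can be sorted by confidence level using the critical values 0.250 (95%) and 0.325 (99%) already used in this report.  The following is a minimal sketch, assuming those cutoffs apply to every coefficient (n = 61 throughout); the function name confidence_level and the dictionary labels are hypothetical.

```python
def confidence_level(r, r_95=0.250, r_99=0.325):
    """Return the highest confidence level at which |r| is significant."""
    if abs(r) >= r_99:
        return "99%"
    if abs(r) >= r_95:
        return "95%"
    return "not significant"

# Coefficients reported in this section (Table D).
education_r = {
    "male, no HS diploma": 0.40665475,
    "male, bachelor's": -0.5418,
    "male, doctorate": -0.2760,
    "female, no HS diploma": 0.4750,
    "female, bachelor's": -0.3964,
}

levels = {group: confidence_level(r) for group, r in education_r.items()}
for group, level in levels.items():
    print(f"{group}: {level}")
```

With these cutoffs, the male doctorate coefficient (-0.2760) clears the 95% threshold but not the 99% one, while the other four clear both.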

 

Average Household Size

 

            The average household size correlates significantly with the AIDS rate as well, with a correlation coefficient of 0.2987 (Table E).  This exceeds the 95% critical value of 0.250 but not the 99% critical value of 0.325.  A possible explanation is that as the household size increases, financial trouble may increase too, since more members of the family need to be fed and supported.

 

AIDS rate v Household relationship

 

            The correlation between AIDS rates and Household relationship was interesting in that it matched the guesses that I had in mind.  For instance, I guessed that men or women who were never married would have a higher AIDS rate, since, not being committed to anyone, they would tend to have sex with more people than men and women who are in a relationship.  Consistent with this, the correlation between the percentage of males married with their spouse present and the AIDS rate is equal to -0.8337.  This indicates that as the percentage of males with their spouse present increases, the AIDS rate decreases.  Likewise, as the percentage of females married with their spouse present increases, the AIDS rate decreases; the correlation coefficient is -0.7156 (Table G).  This may be due to the committed relationship that married couples have, so that infidelity does not occur often.  Moreover, if the percentage of males married with their spouse absent increases, so does the AIDS rate.  This may be due to the infidelity that results when the spouse is absent.  The same holds for the percentage of females married with their spouse absent, which has a positive correlation of 0.7243.  Thus, men and women who are in a committed relationship tend to have lower AIDS rates.

            Analyzing the statistics is an important step before concluding that there is any meaningful relationship between two variables.  By examining the correlation coefficients between the cumulative AIDS rate and certain demographic factors, I was able to better understand why the prevalence of AIDS is higher in certain areas than in others.  Several factors contribute to our current understanding of AIDS, such as race, socioeconomic background, educational background, and marital status.  This research has certainly given me a wider perspective on and a deeper understanding of AIDS.

 

Appendix Page

Map AIDS rate in Queens