Paper 2: Correlations |
|
To find correlations, I took data from the 2000 U.S. Census and compared it to the AIDS case counts I had from the California Department of Health Services. Each comparison produces a correlation coefficient, and the size of that coefficient determines how confident one can be that the correlation is real. For a correlation to be strong and reliable, the two sets of data need to correlate with 95% confidence. Alameda County has 17 cities, so the coefficient needed to reach that confidence level was 0.455. Some of the data correlated with more than 95% confidence and some with less, but overall the data I will be discussing correlated with around 95% confidence. When I graphed the data, the correlation was sometimes higher or lower because of one or two cities that did not fit the general trend of the rest of the county. These cities are called outliers, and I removed them from the data set to test how much the correlation depended on them. Unfortunately, Oakland, the city I am from, was usually an outlier because it has more AIDS cases, and different racial and financial demographics, than the rest of Alameda County. The demographic differences between Oakland and the other cities made it difficult for me to find things that correlated across all cities. I also had trouble finding correlations because AIDS has affected two different demographics (gay men and African Americans) since the discovery of the disease. As I learned about the AIDS problems throughout Alameda County, and especially Oakland, I began to see similarities in the issues that were affecting AIDS rates. The AIDS cases in Alameda County seemed to be strongly correlated with race and poverty, so that is where I began to look. |
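As a minimal sketch of the procedure described above, the Python below computes a correlation coefficient, checks it against the 0.455 threshold quoted in the text, and recomputes it with named outliers removed. It assumes the statistic is a Pearson correlation coefficient, since the text does not name the statistic or the software used. The function names, the city labels, and every number in `example` are invented for illustration and are not the census or AIDS case data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def meets_threshold(r, critical_r=0.455):
    """True if |r| reaches the critical value (0.455 is the figure quoted in the text)."""
    return abs(r) >= critical_r


def r_with_and_without(data, outliers):
    """Return (r for all cities, r with the outlier cities removed).

    data maps a city label to a (census_value, aids_cases) pair.
    """
    def r_for(cities):
        xs = [data[c][0] for c in cities]
        ys = [data[c][1] for c in cities]
        return pearson_r(xs, ys)

    all_cities = list(data)
    kept = [c for c in all_cities if c not in outliers]
    return r_for(all_cities), r_for(kept)


# Invented values for illustration only: (percent of population, AIDS cases).
example = {
    "city_a": (5.0, 10),
    "city_b": (8.0, 14),
    "city_c": (12.0, 25),
    "city_d": (15.0, 27),
    "city_e": (20.0, 41),
    "big_city": (36.0, 400),  # hypothetical outlier, far above the rest
}

r_all, r_trimmed = r_with_and_without(example, outliers={"big_city"})
print("all cities:", round(r_all, 3), meets_threshold(r_all))
print("outlier removed:", round(r_trimmed, 3), meets_threshold(r_trimmed))
```

Removing cities also lowers the number of data points, so the threshold for 95% confidence would normally be looked up again for the smaller sample. The sketch keeps 0.455 fixed only for simplicity.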
|
RACE I correlated the four most prevalent racial subcategories in Alameda County (White, Black or African American, Asian, and Hispanic) with AIDS cases in the different cities of Alameda County. What I found was somewhat surprising. I had correctly assumed that AIDS cases and the Black or African American population would strongly correlate: the correlation was .717 with the outliers, Oakland and Emeryville, included and an even stronger .772 without the two cities. (Chart 1A and 1B) I also correctly assumed that the number of AIDS cases in a city would not correlate with its Asian population, because the number of AIDS cases in the Asian community has been very low. What surprised me was the correlation between AIDS cases and the White population and the lack of correlation between AIDS cases and the Hispanic population. Multiple articles in Bay Area newspapers have discussed the growing number of AIDS cases within Hispanic communities. Because of this media attention, I thought there would be a correlation between AIDS cases and the Hispanic population, but the percentage of Hispanic population correlated with AIDS cases at only .210. Another surprise for me was the correlation between AIDS cases and the White population, which was -.402. Because AIDS is so strongly viewed as a “black problem” in the media throughout Alameda County, I had forgotten that AIDS also strongly affects white gay men. I think this correlation reflects the earlier state of the disease rather than the current effect AIDS is having in Alameda County. This correlation could also account for many cases at the beginning of the epidemic, since the AIDS cases for each city are cumulative over the last three decades. Despite the initial correlation, when the outlier, Oakland, was taken away, the correlation dropped dramatically to -.16949 and became insignificant. (Chart 2A and 2B) |
|
POVERTY/ MEDIAN INCOME Because poverty is such a large factor in AIDS case prevalence, I correlated the percent of population below the poverty line and median household income with AIDS cases. I found some drastic results. AIDS cases and the percent of population below the poverty level correlated at the 99% confidence level, with a correlation of .674. I found it interesting that even once the outliers, Oakland and Emeryville, were removed, the correlation stayed exactly the same. (Chart 3A and 3B) Similarly, the correlation for median household income did not change after the outliers, Piedmont and Oakland, were taken out of the calculation. Median household income correlated with above 90% confidence, with a correlation of .41518. (Chart 4A and 4B) I also tried to correlate the median household income of each race with AIDS cases, but no race correlated on its own. Because in Alameda County one's economic status is usually connected to one's race, I was interested to see the connection between income level and race. I created a chart of the median income levels for each city and separated the incomes by race. The chart shows the disparity in income level by race and the disparity in wealth between different cities throughout Alameda County. When I compared the cities within Alameda County with the most AIDS cases to the cities with the lowest median incomes, I saw that the majority of the cities with a high number of AIDS cases also had low median income levels. (Chart 5) |
|
HOMOSEXUALITY One of the only ways to identify gay males through census data is to look at male householders living with a male partner. I correlated the percent of households with a male householder and a male partner with AIDS cases and found a strong correlation of .651839. When the outlier, Oakland, was taken out, the data was still strongly correlated, at .491376. (Chart 6A and 6B) |
|
EDUCATION LEVEL AND FOREIGN BORN Poverty and race are both strongly connected to the amount of education one receives, so I decided to look at the correlation between different levels of education and AIDS cases. Unfortunately, I found confusing data. The only levels of education that correlated were less than high school and more than college. These two levels are at opposite ends of the spectrum and provided conflicting evidence, so I decided not to treat that correlation as valid. I also looked at the foreign-born population within the different cities in Alameda County, because AIDS cases are sometimes brought into an area by people from, or visiting, areas of the world with high AIDS case prevalence. In Alameda County, however, the foreign-born population and the number of AIDS cases did not correlate. |
|