Wednesday, April 25, 2012

The SAT Bell Curve


In The Mismeasure of Man, Stephen Jay Gould wrote an extended criticism of the quantification of general intelligence with factor analysis. I half expected public rebukes of Gaussian normal curves following the publication of The Bell Curve by Richard Herrnstein and Charles Murray. When the IQ distributions of whites and African Americans appear on the same graph in proportion to each group’s population, it can evoke a sense that one bell curve is physically dominating or even raping the other.
What the graph actually illustrates is that about as many dim white people live among us as dim black people because the graphs overlap at the left tail. This point can escape attention when this data, which Herrnstein and Murray borrowed from the National Longitudinal Survey of Youth, appears under the assumption of equal group size.
I decided to attempt to replicate these graphs with SAT data. Like it or not, the SAT is a sort of intelligence test, more so than the ACT exam that college applicants in the American heartland so commonly take. I shall quote extensively from a paper by Satoshi Kanazawa because he fairly succinctly summarized the case for the SAT as an intelligence test.
The SAT has a significant advantage as a proxy IQ test over other standardized academic tests, such as the American College Testing (ACT), an alternative university admissions test, or the National Assessment of Educational Progress (NAEP), administered to representative samples of fourth and eighth graders in public schools every year. While the SAT measures the students' critical reasoning ability, both the ACT and the NAEP measure their learned knowledge of academic subjects. This distinction between the SAT and the ACT is well recognized by both testing services…. A principal component analysis of SAT and ACT scores shows that the former load on two factors (verbal and quantitative) while the latter load on four additional factors (information, English, natural sciences, and social studies). Frey and Detterman (2004) show that the correlation between SAT scores and g is .857 (corrected for nonlinearity) when the measure of g is the Armed Services Vocational Aptitude Battery, and it is .72 (corrected for restricted range) when the measure of g is Raven's Advanced Progressive Matrices.

This is not to deny the complicating nuances of the research. After all, a genome-wide association study of intelligence determined that the examined single nucleotide polymorphisms of our DNA influenced fluid intelligence, which was partially derived from Raven’s Matrices, more than crystallized intelligence, which tests of acquired knowledge (like vocabulary) can measure. However, the mysterious Flynn effect of rising intelligence in the industrialized world has elevated Raven’s Matrices scores more rapidly than scores on other intelligence tests.
The available SAT data allow score distribution graphs to be constructed for racial groups, but only for four years in the 1980s. In the case of black and white students, the years in question still likely reflect the present situation because the rapid decline in the black-white score gap occurred just prior to these years, and these score differences have, more or less, persisted since then.
[Three graphs]
Though the verbal and writing subtests might not elicit a Pavlovian reaction to bell curves, this seems to result from the test range chopping the black students’ curves into wedges. If the true IQ distribution of African Americans follows a bell-shaped Gaussian curve, then an artificial minimum SAT score could be misrepresenting the full ability spectrum of black students.

In 1996, SAT score distributions were “recentered” to reflect a new 1990 reference group that replaced the old 1941 reference group. Prior to the recentering, the greater decline of average verbal scores relative to mathematics subtest scores had concerned the College Board. Recentering also lowered the mathematics standard deviation to make black, Hispanic, and female students “appear less below average.” The following graph shows that recentering increased verbal scores even more for the black students in the 1990 reference group, giving them a bell-shaped distribution.
[Graph]
This does not convince me that the same occurred for actual post-recentering black SAT scores because the black-white gap remained virtually unchanged.
Certainly, the SAT verbal and math subtest distributions for the general population shifted higher, as shown below:
[Three graphs]
Notice that the percentages with the highest scores continued to increase even after the recentering, especially on the mathematics subtest.

Shifting all groups higher could hurt the black average verbal and writing SAT scores by revealing a full bell curve and thereby allowing the artificial floor to fall out from under the worst students, unless black performance improved simultaneously, causing the two phenomena to mask each other. However, if African Americans suddenly attained an extended bell-shaped distribution, I would expect an increase in their score variance on the verbal subtest, which would be reflected in an increased standard deviation. On the contrary, black students have long held the lowest standard deviations, and the graph of this quantity has been equally flat for math and verbal subtests.
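
The floor argument above can be made concrete with a minimal sketch. It assumes a normal distribution and uses made-up numbers (mean 350, standard deviation 100, floor 200), not actual SAT figures, and `censored_normal_stats` is a hypothetical helper written only for this illustration. It computes how much of the distribution piles up at the floor and what mean and standard deviation would be observed when every score below the floor is reported as the floor.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def censored_normal_stats(mu, sigma, floor):
    """Return (share at floor, observed mean, observed sd) when every
    score below `floor` is reported as `floor` (left-censoring)."""
    alpha = (floor - mu) / sigma
    p_floor = normal_cdf(alpha)          # share of scores stuck at the floor
    p_above = 1.0 - p_floor
    lam = normal_pdf(alpha) / p_above    # inverse Mills ratio
    # Mean and variance of the part of the distribution above the floor.
    mean_above = mu + sigma * lam
    var_above = sigma ** 2 * (1.0 + alpha * lam - lam ** 2)
    # Mix the point mass at the floor with the part above it.
    mean = p_floor * floor + p_above * mean_above
    second_moment = p_floor * floor ** 2 + p_above * (var_above + mean_above ** 2)
    sd = math.sqrt(second_moment - mean ** 2)
    return p_floor, mean, sd


# Hypothetical values chosen for illustration only.
share, mean, sd = censored_normal_stats(mu=350.0, sigma=100.0, floor=200.0)
print(f"share at floor: {share:.3f}, observed mean: {mean:.1f}, observed sd: {sd:.1f}")
```

Under these assumptions the observed standard deviation comes out smaller than the true one, which is why lifting a group clear of the floor would be expected to raise its measured standard deviation.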
The following graphs show the black and white score distributions without the population sizes being held equal. At the time, African Americans were the largest minority, and similar graphs for Hispanics and Asians make the respective groups’ curves almost imperceptible puddles, so I shall forgo posting them.
[Three graphs]
As the standard deviations graph above already revealed, Asians constitute the most heterogeneous group, and I find their distribution to be the most fascinating.
[Three graphs]
The most obvious characteristic of the verbal and writing graphs is the bimodal distribution, which one would expect in a group for whom English is frequently a second language. This matches the writing subtest distribution for Hispanics below, but the Asian verbal subtest graph has one other aspect lacking in the Hispanic counterpart. Despite the large number of poor performers on the left side, the most elite performers of the Asian graph appear to be present in roughly equal proportion to those of the white graph. In fact, a slightly higher proportion of Asians achieved the highest two verbal score ranges compared to the white group for each of the four years, and these were years prior to most of the Asian score improvement that I previously discussed.

On the mathematics subtest graph, the Asian distribution extends noticeably more into the higher ranges than the white distribution. Thus, a much greater proportion of Asians achieve the highest range of math performance, a point that I shall also extend to men.
[Three graphs]
Larry Summers resigned from his position as president of Harvard in 2006, a year after a faculty no-confidence vote, because he gave a speech in which he said the following:
There are three broad hypotheses … with respect to the presence of women in high-end scientific professions…. The second is what I would call different availability of aptitude at the high end…. It does appear that on many, many different human attributes—height, weight, propensity for criminality, overall IQ, mathematical ability, scientific ability—there is relatively clear evidence that whatever the difference in means—which can be debated—there is a difference in the standard deviation, and variability of a male and a female population…. Even small differences in the standard deviation will translate into very large differences in the available pool substantially out.
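
The last sentence of that quote is a statement about the tails of a distribution, and a minimal sketch can show the arithmetic. It assumes both groups are normally distributed with the same mean and uses hypothetical numbers (mean 500, standard deviations 110 and 100) rather than actual SAT values; `upper_tail` and `tail_ratio` are names invented for this illustration.

```python
import math


def upper_tail(threshold, mean, sd):
    """Share of a normal(mean, sd) population scoring above `threshold`."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def tail_ratio(threshold, mean_a, sd_a, mean_b, sd_b):
    """How many times more common group A is than group B above `threshold`,
    assuming equal group sizes."""
    return upper_tail(threshold, mean_a, sd_a) / upper_tail(threshold, mean_b, sd_b)


# Hypothetical groups: identical means, standard deviations 10% apart.
for threshold in (500, 600, 700, 800):
    ratio = tail_ratio(threshold, 500.0, 110.0, 500.0, 100.0)
    print(f"above {threshold}: ratio = {ratio:.2f}")
```

With equal means the ratio is exactly 1 at the mean and grows as the threshold moves further out, so a modest gap in standard deviation becomes a large gap in representation at the extreme. Adding a difference in means, as the SAT math data show, would widen it further.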

The standard deviations graph above validates Summers’ observation about differing aptitude variability between the sexes, and this is especially the case on the mathematics subtest. The following graphs illustrate just how much the standard deviation difference in math (plus a difference in mean) translates into substantially more male students in the highest aptitude levels.
[Three graphs]
Dr. Summers, on behalf of Harvard University, I would like to offer you your job back.



Davies G, Tenesa A, Payton A, Yang J, Harris SE, Liewald D, Ke X, Le Hellard S, Christoforou A, Luciano M, McGhee K, Lopez L, Gow AJ, Corley J, Redmond P, Fox HC, Haggarty P, Whalley LJ, McNeill G, Goddard ME, Espeseth T, Lundervold AJ, Reinvang I, Pickles A, Steen VM, Ollier W, Porteous DJ, Horan M, Starr JM, Pendleton N, Visscher PM, & Deary IJ (2011). Genome-wide association studies establish that human intelligence is highly heritable and polygenic. Molecular Psychiatry, 16 (10), 996-1005. PMID: 21826061

Hiscock, M. (2007). The Flynn effect and its relevance to neuropsychology. Journal of Clinical and Experimental Neuropsychology, 29 (5), 514-529. DOI: 10.1080/13803390600813841

Kanazawa, S. (2006). IQ and the wealth of states. Intelligence, 34 (6), 593-600. DOI: 10.1016/j.intell.2006.04.003

18 comments:

  1. What do you know about mutational loads and whether they vary across races?

  2. I don't know about genetic load (deleterious mutations), but recent allele fixations differ between Africans, Europeans, and Asians, per Coop et al. I would expect the putting off of childbirth until later in life to have a significant influence on population differences in genetic load.

  3. Differences in genetic load is the most plausible explanation of most differences in general intelligence, i.e. people with fewer deleterious mutations have higher intelligence. The genetic load theory is compatible with everything we know about g (e.g. heritability, correlations with health). Evolutionarily, it makes much more sense than the idea that intelligence is selected for, in which case everybody would be smart.

  4. Can you point to any good references for genetic load in relation to intelligence, behavior, etc.?

  5. Cochran is working on the genetic load idea.

  6. Wow, those are creepy graphs. The undulating Blob ...

  7. Something you might consider doing is calculating just how small the number of ultra-elite black scorers is. How many 1500+ blacks are left after Harvard, Yale, Princeton, and Stanford get their bites at the apple?

  8. Another myth that won't die is that we could just switch affirmative action from race to class. But, as far as I can tell, virtually all the 1300+ black students come from the middle class or above, or mixed race homes, or foreign elites.

  9. Interesting data set.

    May I suggest labeling the axes?

    Also, concerning the male-female difference, you write that 'The standard deviations graph above validates Summers’ observation'. Summers' observation was that 'there is a difference in the standard deviation, and variability of a male and a female population'.

    1) What do you mean by a standard deviations graph? I cannot find a single value (or graphical representation) for the standard deviations of men and women in this post.

    2) Looking at the male and female plots makes me doubt whether male and female standard deviations will differ. The main difference seems to be one of means. You could check that with a t-test (comparing means) and a standard deviation test (comparing standard deviations).

    3) What is the directionality? Are women worse on these tests, or are the tests less geared towards women? By agreeing with Summers, you suggest it is the former. Why?

  10. The graphs labeled "verbal/critical reading standard deviations" and "math standard deviations" graph standard deviations for men, women, and racial/ethnic groups. Men always had higher variance/standard deviations than women on the math subtest. This is reflected in a shorter and wider math bell curve for men. Summers was correct because far fewer women than men scored at the highest levels of the math subtest. I have never heard a case made that the math subtest has a gender bias, but a 2010 study determined that it does not have a cultural bias against black students, but the verbal subtest does, even though the black-white gap is larger on the math subtest.

  11. I really appreciate what you post. You have a new subscriber now.

  12. J said:

    Differences in genetic load is the most plausible explanation of most differences in general intelligence, i.e. people with fewer deleterious mutations have higher intelligence. The genetic load theory is compatible with everything we know about g (e.g. heritability, correlations with health). Evolutionarily, it makes much more sense than the idea that intelligence is selected for, in which case everybody would be smart.


    There is a healthy dose of nonsense there.

    The brain is a very expensive organ, developmentally and operationally. There will be an optimum brain size that balances the cost of increasing brain size (and in any social species there will always be selection for increase in brain size to allow for exploitation of other members of your species as well as a conversion of some behaviors from learned to innate) against the benefits of that increased brain size in terms of reproductive success. In addition, we can expect the size of the selection effect to be correlated with the average group size. Thus selection for intelligence goes up as different groups start engaging in large-scale civilizations, like the Indians, the Chinese and the Caucasians.

    Also, have you thought about how the argument about genetic load applies to other species? Why do those with lower genetic load not have higher intelligence than humans?

  13. J said:

    Differences in genetic load is the most plausible explanation of most differences in general intelligence, i.e. people with fewer deleterious mutations have higher intelligence. The genetic load theory is compatible with everything we know about g (e.g. heritability, correlations with health). Evolutionarily, it makes much more sense than the idea that intelligence is selected for, in which case everybody would be smart.


    If you are going to appeal to genetic load to explain differences in intelligence, then I think you also have to appeal to genetic load to explain:

    1. Differences in running ability (long distance and sprinting)

    2. Differences in propensity for violence

    3. Differences in civilization building

    Now, the problems with those things are that different groups are good at different things, so now you are reduced to claiming that Africans (and African Americans) have higher genetic load in their intelligence genes, whites and East Asians have higher genetic load in their running/sporting genes, and so on, which seems rather ad-hoc and silly.

  14. I've never seen the SAT distribution by race before -- that's very interesting. Can you post a link to the source data? I'd love to take a look at it!

  15. Why the genetic load paradox is based on a fallacy. http://www.handsongenetics.com/PIFFLE/GeneticLoad.pdf

    Environment has as great an impact on individuals as genetics, if not a greater one. Even Darwin espoused this. "...there is a frequently recurring struggle for existence, it follows that any being, if it vary however slightly in any manner profitable to itself, under the complex and sometimes varying conditions of life, will have a better chance of surviving, and thus be naturally selected."

    Graphs can be created to show anything one wants. However, those graphs don't show the underlying causal factors skewing the numbers to be what they are. If children are raised in a stable, encouraging household during the impressionable ages from birth to three, their self-control, self-esteem, and ability to learn increase. Add stability in the classroom through high school and you'll likely have a productive AMERICAN citizen. We need to focus on improving these areas as we look to compete in the future with the likes of China and India. In 2008 China was already producing 10 times as many engineers as the U.S. http://www.itif.org/content/chinas-engineering-degrees-were-about-10-times-us-number-and-represented-much-higher-share-a. Imagine what that will be by 2050 if we don't focus on making our entire nation better.

  16. Random mutation said...

    "If you are going to appeal to genetic load to explain differences in intelligence, then I think you also have to appeal to genetic load to explain:

    1. Differences in running ability (long distance and sprinting)

    2. Differences in propensity for violence

    3. Differences in civilization building

    Now, the problems with those things are that different groups are good at different things, so now you are reduced to claiming that Africans (and African Americans) have higher genetic load in their intelligence genes, whites and East Asians have higher genetic load in their running/sporting genes, and so on, which seems rather ad-hoc and silly."

    Agree. You also have to look at the correlation between nation building and the propensity toward violence. There's only one race that, throughout history, has consistently used the cause of nation building to advocate violence. Violence has also been exported to every continent as a result...all in the name of slaughtering Native Americans, oops I mean nation building. Look in the mirror.

    With respect to running ability, athletes come in all races. There are many more Caucasian American athletes than African American athletes. The pursuit of athletics slows down around the high school/college age and is discouraged over academics, whereas the pursuit of athletics over academics in the African American community is almost expected. At least that's what the press and internet tell us.

  17. Intelligence probably differs from athletics and temperament, in that the genetics of intelligence is complex with many alleles of small effect. Running ability can be fairly well manipulated, as Olympic athletes sometimes illegally do, and something as simple as the relative proportion of fast- versus slow-twitch muscle fibers might have great influence. Likewise, psychiatrists change patients' mood and temperament with drugs to influence neurotransmitters. Changing intelligence is not so simple.

    If natural selection plays an important role in intelligence evolution, then David Piffer's approach might be the best direction for identifying the genetic contribution to racial gaps.

  18. LORD ANON says:

    I apologize in advance for the informal speech and lack of correct jargon that will inevitably follow.

    I'd say I'm a pretty smart upper-middle-class white guy; I got a 35 on the ACT and scored a 215 on the PSAT. Concepts generally come quicker to me than anyone else; I've only met one person who I can be sure understood a specific concept quicker than me.

    That being said, I'm also pretty arrogant so it's quite possible that I'm overestimating my intelligence.

    I recently had a theory that you could easily call fucked up. Here goes the premise:
    I live in Alabama. Here in the south, there are a lot of dumb African Americans. If you look at score breakdowns on intelligence-measuring tests, blacks generally score lower. Even when correcting for things such as class differences and the fact that whites tend to try harder, it still seems that blacks have a distinctive lesser capacity to understand concepts than whites. Specifically, these blacks I'm referring to are African Americans with ancestral slavery living in the deep south. Why is there this intelligence discrepancy?
    Imagine being a slave. There is nothing more degrading. There is an obvious desire to get out of slavery and an accessible way - escape to the North. Many slaves, such as Harriet Tubman, devised plans to help slaves escape or devised plans solely for their own escape (or their family's.) Those who were not good at devising elaborate plans were caught and remained in the south, or killed. Those who remained in servitude were either loyal or very dumb.
    From the slaveowner's perspective, a slave that is constantly attempting to usurp your authority is not even worth owning. You would selectively breed, effectively, for the dumb loyal slave who does exactly what he/she is told.

    These processes seem to select, among other things (loyal, etc) the least intelligent African Americans for continued survival in the South.
