Intelligence Testing: Yesterday and Today

  • Intelligence tests were developed in response to under-education and to measure mental abilities.
  • Binet-Simon test looked at individual differences in mental functioning (focus on academic ability).
  • 1971: Court case Larry P. v. Wilson Riles. In 1975 the state of California placed a moratorium on using IQ tests to place African-American students in special education (EMR) classes.

Review of Reliability

  • Reliability—consistency with which individuals respond to test stimuli. The types are:
  • Test-Retest: Consistency of responses to the same test stimuli on repeated occasions.

o Clients may become “test-wise,” which influences their scores the second time, or they may show practice effects.

  • Equivalent-Forms: Equivalent or parallel forms of a test are developed (ex: test forms A, B, C with different colors for an exam).
  • Split-Half: Test is divided into halves (or odd numbered items vs. even numbered items) & participant's scores on the two halves are compared (allows for internal-consistency reliability).
  • Internal Consistency: Do the items on a test measure the same thing? The usual index of internal consistency is the average of all possible split-half correlations (Cronbach's alpha).
  • Inter-Rater: Independent observers agree about their ratings of an aspect of someone's behavior.
  • A test must be reliable to be valid, but reliability alone does not guarantee validity.

Measures for Reliability

  1. Test-Retest reliability: Pearson's r and intraclass correlation
  2. Equivalent forms reliability: Pearson's r
  3. Split-half reliability: Pearson's r
  4. Internal consistency reliability: Cronbach's alpha and Kuder-Richardson-20
  5. Inter-rater reliability: Pearson's r, intraclass correlation and kappa
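
Three of the indices listed above can be sketched in a few lines of Python. This is only an illustration using the standard textbook formulas: the function names and all scores are made up for the example, and Cohen's kappa is assumed as the kappa variant (the notes do not say which one is meant).

```python
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def cronbach_alpha(items):
    """items: one list of scores per test item (same respondents, same order).
    Assumes at least two items and total scores that are not all identical."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(pvariance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / pvariance(totals))

def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters on categorical ratings, corrected for chance.
    Assumes chance agreement is below 1 (the raters use more than one category)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical data, for illustration only.
time_1 = [10, 12, 14, 16, 18]          # scores at first testing
time_2 = [11, 13, 13, 17, 19]          # same people, retested
print("test-retest r:", pearson_r(time_1, time_2))

item_scores = [[1, 2, 3, 4], [2, 2, 3, 5], [1, 3, 3, 4]]  # 3 items, 4 respondents
print("Cronbach's alpha:", cronbach_alpha(item_scores))

print("kappa:", cohen_kappa(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"]))
```

The same `pearson_r` function covers equivalent-forms and split-half reliability: pass in the two form scores, or the odd-item and even-item half scores, instead of the two testing occasions.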

Review of Validity

  • Validity: The extent to which an assessment technique measures what it is supposed to measure
  • Content Validity: Measures comprehensiveness in assessing the variable of interest (does it measure all areas of the construct of interest).
  • Predictive Validity: Type of criterion-related validity. Extent to which test scores indicate some behavior or event in the future.
  • Concurrent Validity: Type of criterion-related validity. Extent to which test scores correlate with scores on other relevant measures given at the same time.
  • Construct Validity: Extent to which test scores demonstrate all aspects of validity in a consistent manner (involves demonstrating both convergent and discriminant validity).

Definitions of Intelligence: Three Classes of Emphasis (not mutually exclusive)

  1. Adjustment or adaptation to the environment—adapting to situations or dealing with situations.
  2. Ability to learn—educability in the broad sense of the term
  3. Abstract thinking—ability to use a wide range of symbols and concepts, ability to use verbal and numerical symbols.

Theories of Intelligence

Factor Analytic Approaches

  • Spearman—general intelligence g (general tests) and specific intelligence s (unique test aspects).
  • Spearman viewed intelligence as a broad generalized entity. Used principal components.
  • Thurstone: viewed intelligence as a series of “group factors” rather than the basic g. Used principal factors.

o 7 factors (Thurstone's Primary Mental Abilities)

  • Spearman and Thurstone also used different data sets (broad range vs. academic institutions).

Cattell's Theory (Hierarchical Model of Intelligence)

  • Emphasized g. He developed 17 ability concepts. Divided Spearman's g into 2 components:

o Fluid Ability: Genetically based intellectual capacity

o Crystallized Ability: Capacities that are tapped by intelligence tests, (culture based learning).

Guilford's Classification (Viewed as a classification or taxonomy; not really a theory)

  • Structure of Intellect Model (SOI)—used model as a guide in generating data.
  • Intelligence components can be divided into 3 areas: operations, contents and products.
  • Operations: Cognition, memory, divergent production (constructing logical alternatives), convergent production (constructing logic-tight arguments) and evaluation.
  • Content: Areas of information in which the operations are performed (figural, symbolic, semantic and behavioral).
  • Products: When a mental operation is applied to a content there are 6 types of products.

o Units, classes, systems, relations, transformations and implications.

Gardner's Theory of Multiple Intelligences (Viewed as "Talents" not intelligences)

  • Gardner—theory of multiple intelligences (8 intelligences):

o Linguistic, Musical, Logical-Mathematical, Spatial, Bodily-Kinesthetic, Naturalistic, Interpersonal and Intrapersonal.

Sternberg's Triarchic Theory of Intelligence

  • People function on the basis of three aspects of intelligence: componential, experiential and contextual.
  • Emphasis on planning responses and monitoring them and de-emphasis on speed & accuracy.
  • Componential: Analytical thinking (good test-taker)
  • Experiential: Creative thinking (combine separate elements of experience)
  • Contextual: “street smart”—practical, can play the game and manipulate the environment.

Today's Focus—More on Spearman + Thurstone Contributions

  • Focus is largely still on a single IQ or Spearman's g.
  • Current intelligence tests are made up of subtest scores (Thurstone factors).

The IQ: Its Meaning and Its Correlates (The Intelligence Quotient)

Ratio IQ

  • Mental Age (MA): Index of mental performance (X items passed)
  • Chronological Age (CA): Individual's given age
  • IQ: Used to overcome problems caused by using the difference between CA and MA to express deviance
  • IQ = (MA / CA) x 100
  • IQ is not an equal-interval measurement, so scores cannot be meaningfully added or subtracted (an IQ of 100 is not twice an IQ of 50).
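
The ratio IQ formula above is simple enough to check by hand, but a small sketch makes the arithmetic explicit. The function name and the example ages are hypothetical; both ages are assumed to be in the same unit (years here).

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (MA / CA) x 100. Both ages must use the same unit."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100

# An 8-year-old performing at the level of a typical 10-year-old:
print(ratio_iq(10, 8))
# An 8-year-old performing at the level of a typical 6-year-old:
print(ratio_iq(6, 8))
```

When MA equals CA the result is 100, which is why 100 marks average performance for one's age.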

Deviation IQ

  • Ratio IQ is limited and not fully applicable to older age groups.
  • Compares an individual's performance on an IQ test with that of his/her same-age peers.
  • Same IQ has a different meaning for different ages (ex: same IQ for 22 year vs. 80 year old).
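
A minimal sketch of the deviation IQ idea: the person's score is expressed as a distance from the mean of same-age peers. The scale used here (mean 100, standard deviation 15) is the Wechsler convention and is an assumption, since these notes do not state it; the function name and the norm-group figures are invented for illustration.

```python
def deviation_iq(raw_score, peer_mean, peer_sd, iq_mean=100.0, iq_sd=15.0):
    """Convert a raw score to a deviation IQ relative to same-age peers.
    iq_mean=100 and iq_sd=15 follow the Wechsler convention (assumed here)."""
    z = (raw_score - peer_mean) / peer_sd   # distance from the peer mean in SD units
    return iq_mean + iq_sd * z

# Hypothetical norms: the same raw score of 50 in two age groups.
print(deviation_iq(50, peer_mean=50, peer_sd=10))  # average for a 22-year-old group
print(deviation_iq(50, peer_mean=40, peer_sd=10))  # above average for an 80-year-old group
```

Because each age group has its own norms, the same raw performance maps to different IQs at different ages, and the same IQ reflects different raw performance.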

Correlates of the IQ: School Success, Occupational Status and Success, Demographic Group Differences

  • School

o General IQ predicts success in school; specific ability tests indicate in which areas.

o Correlation between IQ scores and grades: about .50

  • Occupation

o Based on educational level acquired (income, race, prestige...)

o IQ scores are also good predictors of job performance

  • Demographic Group

o Differences between sexes for specific abilities; males on spatial and quantitative ability and females on verbal ability.

o Hispanic Americans and African Americans obtain lower IQ scores on average than European Americans.
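
As a rough reading of the IQ-grades correlation of .50 listed under School, squaring a correlation gives the proportion of variance the two measures share. The symbol r is introduced here only for illustration:

```latex
r = .50 \quad\Rightarrow\quad r^2 = (.50)^2 = .25
```

On this reading, roughly a quarter of the variance in grades is associated with IQ scores, which leaves most of it to other factors.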

Heredity and Stability of Intelligence

  • Intelligence is influenced by genetic factors (behavioral genetics)
  • Similarity in intelligence is a result of the amount of genetic material shared (monozygotic more similar than dizygotic twins or siblings).
  • IQ variance associated with genetics varies from 30% to 80%.
  • Environment plays a role—biological relatives raised together are more similar.
  • Heritability of intelligence is not stable; 20% in infancy and 60% in young adults, 80% in old age.

Stability of IQ Scores and the Flynn Effect

  • IQ scores tend to be less stable for children than for adults, and are more open to influence (e.g., from the environment) at younger ages.
  • Flynn Effect: From 1972 onwards, Americans' IQ scores have increased on average by 3 points each decade.
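
One practical consequence of the Flynn effect is that a test scored against old norms will, on average, give higher IQs than a freshly normed one. A back-of-the-envelope sketch, assuming the rise is linear at the 3 points per decade given above (the function name is hypothetical):

```python
def expected_norm_inflation(years_since_norming, points_per_decade=3.0):
    """Average IQ points by which outdated norms overstate scores,
    assuming a steady linear rise (a simplification)."""
    return points_per_decade * years_since_norming / 10

# Norms collected 20 years before the test is given:
print(expected_norm_inflation(20))
```

This is an average trend across a population, not a correction to apply to any individual score.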

The Clinical Assessment of Intelligence Scale 1: The Stanford-Binet Scales

The 1972 revised test kit of the Stanford-Binet was followed by a fourth revision in 1986 and by the most recent revision in 2003, the Stanford-Binet Fifth Edition (SB-5).


  • Hierarchical model of intelligence with 5 factors that tap non-verbal & verbal abilities:
  1. Fluid Reasoning: Ability to solve new problems; measured by sub-tests.
  2. Quantitative Reasoning
  3. Visual-Spatial Processing
  4. Working Memory
  5. Knowledge
  • Each sub-test is made up of items of varying difficulty (age 2 to adulthood).
  • Multistage Testing: Two routing subtests, Object Series/Matrices and Vocabulary, are given first.

o Routing: The examinee's performance on these two sub-tests determines which item to start with for each remaining subtest.

Standardization and Reliability and Validity:

  • Included 4,800 participants aged 2-96 years old; participants were tested using various areas.
  • The SB-5 was also administered to individuals with disabilities or mental retardation to ensure the utility of scores.
  • Comparisons with other scales, such as the Wechsler scales, indicate that the SB-5 has strong validity.

The Clinical Assessment of Intelligence Scale 2: The Wechsler Scales

  • Wechsler-Bellevue Intelligence Scale; developed to correct flaws in Stanford-Binet Scale.
  • Test was designed for adults, and items were grouped into subtests rather than by age level.
  • Used a deviation IQ concept; intelligence is normally distributed, compare with same-age peers.



  • 1955: Wechsler Adult Intelligence Scale (WAIS); revised version 1981 (WAIS-R).
  • 1997—(WAIS-III); and most recent version 2008 (WAIS-IV)
  • Inclusion of reversal items in the subtests introduced first in WAIS-III

o All examinees begin with the same two basal items; depending on performance, the preceding items are then presented in reverse sequence until a perfect score on two consecutive items is obtained.
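
The reversal rule can be sketched as a short procedure, reading it as: start at a pair of basal items and, if either is missed, step backwards through the earlier items until two in a row are passed perfectly. This is a simplified illustration of the rule as described in these notes, not the official administration procedure; the function name, the `is_perfect` scoring callback and the 0-based item numbering are all hypothetical.

```python
def administer_with_reversal(start, is_perfect):
    """Return the order in which items are given up to the end of any reversal.

    start      -- 0-based index of the first of the two basal items
    is_perfect -- function taking an item index and returning True when the
                  examinee earns a perfect score on that item (hypothetical)
    """
    order = [start, start + 1]
    if is_perfect(start) and is_perfect(start + 1):
        return order                      # both basal items passed: no reversal

    streak = 0                            # consecutive perfect scores so far
    item = start - 1
    while item >= 0 and streak < 2:       # stop at two in a row or at the first item
        order.append(item)
        streak = streak + 1 if is_perfect(item) else 0
        item -= 1
    return order

# Hypothetical examinee who passes only items 0, 1, 2 and 5 perfectly.
passed = {0, 1, 2, 5}
print(administer_with_reversal(4, lambda i: i in passed))
```

After the reversal ends, testing would continue forward from the items following the basal pair; that part is left out of the sketch.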

  • WAIS-IV—provided Index scores in addition to the Full Scale IQ Scores.

Obtaining the Full Scale IQ Score and Index Scores + Standardization:

  • Raw scores converted to standardized scores for a given age group.
  • Full Scale IQ and Index scores: obtained by adding the scaled scores of the relevant subtests and converting the sums to IQ equivalents.

Reliability and Validity

  • Scores from previous WAIS-III and WISC-IV are strongly correlated with WAIS-IV scores (good).
  • Over-relying on a global IQ score (the Full Scale IQ) can nevertheless be misleading.

The Wechsler Intelligence Scale for Children (WISC-IV)—Description and Standardization

  • 1949: WISC first published; multiple revisions since then, with the latest version (WISC-IV) published in 2003.
  • Used to test children aged 6-16 years; has 10 core and 5 supplemental subtests. A reduced version of the WAIS.
  • Individual subtests define 4 major indices and make up the Full Scale IQ (*see pg. 212).

o Verbal Comprehension Index (VCI), Perceptual Reasoning Index (PRI), Working Memory Index (WMI), Processing Speed Index (PSI)

The Wechsler Preschool and Primary Scale of Intelligence (WPPSI-III)

  • 1967: WPPSI developed; revised since then, with the latest version (WPPSI-III) published in 2002.
  • Similar to the WISC-IV but targeted at younger children (below the age of 6).
  • Only 3 indices (Full Scale IQ, Verbal IQ and Performance IQ), with the addition of the PSI for ages 4+; it also has several subtests specific to young children.

Clinical Use of Intelligence Tests

Estimating General Intelligence Level

  • Determining the person's g level—what is the patient's intellectual potential?
  • Intellectual ability level can also assist with helping individuals recover cognitive abilities following head trauma, injury.
  • IQ scores need to be interpreted and placed in an appropriate context.

Prediction of Academic Success and Appraisal of Style

  • Intelligence tests should predict academic success in school.
  • Intelligence tests allow us to observe patient at work (observations; help with interpretation).
  • Some clinicians have made diagnoses of mental disorders from intelligence tests (intertest scatter), but this is not at all reliable.

Final Observations and Conclusions—IQ is an Abstraction

  • Look at IQ as “present functioning” not innate potential; it is an abstraction that allows us to predict specific behaviors.
  • Most believe that there is a “true IQ” and that intelligence tests assess it.

Final Observations and Conclusions—Generality Versus Specificity of Measurement

  • Intelligence tests can provide a broad, general index of intellectual functioning across a range of situations. They can thus be used to compare similar individuals in the same situations.