Psychological testing

Psychological testing refers to the administration of psychological tests.<ref name=":1">Template:Cite book</ref> Psychological tests are administered or scored by trained evaluators.<ref name=":1" /> A person's responses are evaluated according to carefully prescribed guidelines. Scores are thought to reflect individual or group differences in the theoretical construct the test purports to measure.<ref name=":1" /> The science behind psychological testing is psychometrics.<ref name=":1"/><ref Name = "Nunnally">Nunnally, J.C., & Bernstein, I.H. (1994). Psychometric theory. New York: McGraw-Hill.</ref>

Psychological tests

According to Anastasi and Urbina, psychological tests involve observations made on a "carefully chosen sample [emphasis authors] of an individual's behavior."<ref name=":1"/> A psychological test is often designed to measure unobserved constructs, also known as latent variables. Psychological tests can include a series of tasks, problems to solve, and characteristics (e.g., behaviors, symptoms) the presence of which the respondent affirms/denies to varying degrees. Psychological tests can include questionnaires and interviews. Questionnaire- and interview-based scales typically differ from psychoeducational tests, which ask for a respondent's maximum performance. Questionnaire- and interview-based scales, by contrast, ask for the respondent's typical behavior.<ref>Mellenbergh, G.J. (2008). Chapter 10: Surveys. In H.J. Adèr & G.J. Mellenbergh (Eds.) (with contributions by D.J. Hand), Advising on Research Methods: A consultant's companion (pp. 183-209). Huizen, The Netherlands: Johannes van Kessel Publishing.</ref> Symptom and attitude tests are more often called scales. A useful psychological test/scale must be both valid, i.e., show evidence that the test or scale measures what it is purported to measure,<ref name=":1"/><ref>American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.</ref> and reliable, i.e., show evidence of consistency across items and raters and over time, etc.

It is important that people who are equal on the measured construct (e.g., mathematics ability, depression) have an approximately equal probability of answering a test item accurately or acknowledging the presence of a symptom.<ref>Template:Cite journal</ref> An example of an item on a mathematics test that might be used in the United Kingdom but not the United States could be the following: "In a football match two players get a red card; how many players are left on the pitch?" This item requires knowledge of football (soccer) to be answered correctly, not just mathematical ability. Thus, group membership can influence the probability of correctly answering items, as encapsulated in the concept of differential item functioning. Often tests are constructed for a specific population and the nature of that population should be taken into account when administering tests outside that population. A test should be invariant between relevant subgroups (e.g., demographic groups) within a larger population.<ref name="Putnick">Template:Cite journal</ref> For example, for a test to be used in the United Kingdom, the test and its items should have approximately the same meaning for British males and females. That invariance does not necessarily apply to similar groups in another population, such as males and females in the United States or between populations, for example, the populations of the UK and the US. In test construction, it is important to establish invariance at least for the subgroups of the population of interest.<ref name = Putnick/>
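
One way to check an item for differential item functioning is to compare two groups' odds of answering it correctly among test takers with similar total scores. The sketch below computes a Mantel-Haenszel common odds ratio, which is one widely used index and is not named in the text above. The function name, the stratum layout, and the example counts are hypothetical and serve only as illustration.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio for one item across score strata.

    Each stratum is a tuple of counts for test takers with similar total
    scores: (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    A value near 1 suggests the item behaves alike in both groups.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


# Hypothetical counts: within each score band both groups perform alike.
no_dif = mantel_haenszel_odds_ratio([(6, 4, 6, 4), (9, 1, 9, 1)])

# Hypothetical counts: the reference group does better at the same score level.
possible_dif = mantel_haenszel_odds_ratio([(8, 2, 5, 5)])
```

A ratio well above or below 1 only flags an item for review. Whether the item is actually biased, as with the football example, remains a judgment about its content.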

Psychological assessment is similar to psychological testing but usually involves a more comprehensive assessment of the individual. According to the American Psychological Association, psychological assessment involves the collection and integration of data for the purpose of evaluating an individual’s "behavior, abilities, and other characteristics."<ref name="APA Dict">American Psychological Association. (n.d.). Psychological assessment. APA Dictionary of Psychology. Accessed October 11, 2023 [1]</ref> Each assessment is a process that involves integrating information from multiple sources, such as personality inventories, ability tests, symptom scales, interest inventories, and attitude scales, as well as information from personal interviews. Collateral information can also be collected from occupational records or medical histories; information can also be obtained from parents, spouses, teachers, friends, or past therapists or physicians. One or more psychological tests are sources of information used within the process of assessment. Many psychologists conduct assessments when providing services. Psychological assessment is a complex, detailed, in-depth process. Examples of assessments include providing a diagnosis,<ref name = "APA Dict"/> identifying a learning disability in schoolchildren,<ref>Template:Cite book</ref> determining if a defendant is mentally competent,<ref>Template:Citation</ref><ref>Template:Cite journal</ref> and selecting job applicants.<ref>Template:Cite journal</ref>

History

A Song dynasty painting of candidates participating in the imperial examination, a rudimentary form of psychological testing.
Physiognomy was used to assess personality traits based on an individual's outer appearance.

The first large-scale tests may have been part of the imperial examination system in China. The tests, an early form of psychological testing, assessed candidates based on their proficiency in topics such as civil law and fiscal policies.<ref name="gregory">Template:Cite book</ref> Early tests of intelligence were made for entertainment rather than analysis.<ref name="inthandbook">Template:Cite book</ref> Modern mental testing began in France in the 19th century. It contributed to identifying individuals with intellectual disabilities for the purpose of humanely providing them with an alternative form of education.<ref name=":0">Template:Cite book</ref>

Englishman Francis Galton coined the terms psychometrics and eugenics. He developed a method for measuring intelligence based on nonverbal sensory-motor tests. The test was initially popular but was abandoned.<ref name=":0" /><ref>Template:Cite journal</ref> In 1905 French psychologists Alfred Binet and Théodore Simon published the Échelle métrique de l'Intelligence (Metric Scale of Intelligence), known in English-speaking countries as the Binet–Simon test. The test focused heavily on verbal ability. Binet and Simon intended that the test be used to aid in identifying schoolchildren who were intellectually challenged, which in turn would pave the way for providing the children with professional help.<ref name=":0" /> The Binet-Simon test became the foundation for the later-developed Stanford–Binet Intelligence Scales.

The origins of personality testing date back to the 18th and 19th centuries, when phrenology was the basis for assessing personality characteristics. Phrenology, a pseudoscience, involved assessing personality by way of skull measurement.<ref name="psychassess"/> Early pseudoscientific techniques eventually gave way to empirical methods. One of the earliest modern personality tests was the Woodworth Personal Data Sheet, a self-report inventory developed during World War I to be used by the United States Army for the purpose of screening potential soldiers for mental health problems and identifying victims of shell shock (the instrument was completed too late to be used for the purposes it was designed for).<ref name="psychassess">Template:Cite book</ref><ref name=":1"/> The Woodworth Inventory, however, became the forerunner of many later personality tests and scales.<ref name=":1"/>

Principles

The development of a psychological test requires careful research. Some of the elements of test development involve the following:

  • Standardization - All procedures and steps must be conducted with consistency from one testing site/testing occasion to another. Examiner subjectivity is minimized (see objectivity next). Major standardized tests are normed on large try-out samples in order to understand what constitutes high, low, and intermediate scores.
  • Objectivity - Scoring such that subjective judgments and biases are minimized; scores are obtained in a similar manner for every test taker (see below).
  • Discrimination - Scores on a test should discriminate between members of extreme groups; for example, each subscale of the original MMPI distinguished hospitalized patients suffering from mental illness from members of a healthy comparison group.<ref>Template:Cite book</ref><ref>Template:Cite book</ref>
  • Test Norms - Part of the standardization of large-scale tests (see above). Norms help psychologists learn about individual differences. For example, a normed personality scale can help psychologists understand how some people are high in negative affectivity (NA) and others are low or intermediate in NA. With many psychoeducational tests, test norms allow educators and psychologists to obtain an age- or grade-referenced percentile rank, for example, in reading achievement.
  • Reliability - Refers to test or scale consistency. It is important that individuals score about the same if they take a test and an alternate form of the test or if they take the same test twice, within a short time window. Reliability also refers to response consistency from test item to test item.
  • Validity - Refers to evidence that demonstrates that a test or scale measures what it is purported to measure.<ref Name = "Nunnally"/><ref>Template:Cite book</ref>
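
Item-to-item consistency, mentioned under Reliability above, is often summarized with Cronbach's alpha. The text does not name a specific coefficient, so the choice of alpha here is an assumption, and the function name and data layout are hypothetical. This is a minimal sketch.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a table of item scores.

    responses: one row per respondent, each row a list of scores on the
    same k items (k >= 2).
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents, summed over items.
    item_variance_sum = sum(pvariance(item) for item in zip(*responses))
    # Variance of the respondents' total scores.
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: three respondents who answer all three items identically.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Values closer to 1 indicate that the items rank respondents more consistently. Test-retest and alternate-form reliability are instead usually examined by correlating the two sets of scores.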

Sample of behavior

The term sample of behavior refers to an individual's performance on tasks that have usually been prescribed beforehand. For example, a spelling test for middle school students cannot include all the words in the vocabularies of middle schoolers because there are thousands of words in their lexicon; a middle school spelling test must include only a sample of words in their vocabulary. The samples of behavior must be reasonably representative of the behavior in question. The samples of behavior that make up a paper-and-pencil test, the most common type of psychological test, are written into the test items. Total performance on the items produces a test score. A score on a well-constructed test is believed to reflect a psychological construct such as achievement in a school subject like vocabulary or mathematics knowledge, cognitive ability, dimensions of personality such as introversion/extraversion, etc. Differences in test scores are thought to reflect individual differences in the construct the test is purported to measure.<ref Name = "Nunnally"/>

Types

There are several broad categories of psychological tests:

Achievement tests

Achievement tests assess an individual's knowledge in a subject domain. Some academic achievement tests are designed to be administered by a trained evaluator. By contrast, group achievement tests are often administered by a teacher. A score on an achievement test is believed to reflect the individual's knowledge of a subject area.<ref name=":1" />

There are generally two types of achievement tests: norm-referenced and criterion-referenced. Most achievement tests are norm-referenced. The individual's responses are scored according to standardized protocols and the results can be compared to the results of a norming group.<ref name=":1" /> Norm-referenced tests can be used to highlight individual differences, that is to say, to compare each test-taker to every other test-taker. By contrast, the purpose of criterion-referenced achievement tests is to ascertain whether the test-taker has mastered a predetermined body of knowledge rather than to compare the test-taker to everyone else who took the test. These types of tests are often a component of a mastery-based classroom.<ref name=":1" />

The Kaufman Test of Educational Achievement is an example of an individually administered achievement test for students.<ref>Template:Cite web</ref>

Aptitude tests

Psychological tests have been designed to measure both specific abilities (e.g., clerical skill, as in the Minnesota Clerical Test) and general abilities (e.g., traditional IQ tests such as the Stanford-Binet or the Wechsler Adult Intelligence Scale). A brief but widely used aptitude test in business is the Wonderlic Test. Aptitude tests have been used to assess the specific abilities or the general ability of potential new employees (the Wonderlic was once used by the NFL).<ref>NFL Wonderlic</ref> Aptitude tests have also been used for career guidance.<ref>Template:Cite book</ref>

Evidence suggests that aptitude tests like IQ tests are sensitive to past learning and are not pure measures of untutored ability.<ref>Template:Cite journal</ref> The SAT, which used to be called the Scholastic Aptitude Test, had its name changed because performance on the test is sensitive to training.<ref>Template:Cite book</ref>

Attitude scales

An attitude scale assesses an individual's disposition regarding an event (e.g., a Supreme Court decision), person (e.g., a governor), concept (e.g., wearing face masks during a pandemic), organization (e.g., the Boy Scouts), or object (e.g., nuclear weapons) on a unidimensional favorable-unfavorable attitude continuum. Attitude scales are used in marketing to determine individuals' preferences for brands. Historically social psychologists have developed attitude scales to assess individuals' attitudes toward the United Nations and race relations.<ref>Brown, R. (1965). Social psychology. New York: The Free Press.</ref> Typically Likert scales are used in attitude research. Historically, the Thurstone scale was used prior to the development of the Likert scale. The Likert scale has largely supplanted the Thurstone scale.<ref name=":1"/>

Biographical Information Blank

The Biographical Information Blank (BIB) is a paper-and-pencil form that includes items asking about detailed personal and work history. It is used to aid in the hiring of employees by matching individuals' backgrounds to the requirements of the job.

Clinical tests

The purpose of clinical tests is to assess the presence of symptoms of psychopathology.<ref name="Psychological Corporation">Template:Cite book</ref> Examples of clinical assessments include the Minnesota Multiphasic Personality Inventory (MMPI), Millon Clinical Multiaxial Inventory-IV,<ref>Millon, T. (1994). Millon Clinical Multiaxial Inventory-III. Minneapolis, MN: National Computer Systems.</ref> Child Behavior Checklist,<ref>Template:Cite book</ref> Symptom Checklist 90<ref>Derogatis L. R. (1983). SCL90: Administration, Scoring and Procedures Manual for the Revised Version. Baltimore: Clinical Psychometric Research.</ref> and the Beck Depression Inventory.<ref name="Psychological Corporation"/>

Many large-scale clinical tests are normed. For example, scores on the MMPI are rescaled such that 50 is the middlemost score on the MMPI Depression scale and 60 is a score that places the individual one standard deviation above the mean for depressive symptoms; 40 represents a symptom level that is one standard deviation below the mean.<ref>Ben-Porath, Y.-S., Tellegen, A. (2011). Minnesota Multiphasic Personality Inventory Manual of Administration-2-RF. Minneapolis: University of Minnesota Press</ref>
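
The rescaling described above matches a linear T-score, which places the norm-group mean at 50 and each standard deviation at 10 points. The sketch below assumes that simple linear transformation; published instruments may use a more elaborate conversion. The function name and the norm values in the example are hypothetical.

```python
def t_score(raw_score, norm_mean, norm_sd):
    """Convert a raw score to a linear T-score (mean 50, SD 10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw_score - norm_mean) / norm_sd  # distance from the mean in SD units
    return 50 + 10 * z


# Hypothetical norm group with mean 20 and standard deviation 5.
at_mean = t_score(20, norm_mean=20, norm_sd=5)       # 50.0
one_sd_above = t_score(25, norm_mean=20, norm_sd=5)  # 60.0
one_sd_below = t_score(15, norm_mean=20, norm_sd=5)  # 40.0
```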

Criterion-referenced

A criterion-referenced test is an achievement test in a specific knowledge domain.<ref name=":1" /> An individual's performance on the test is compared to a criterion. Test-takers are not compared to each other. A passing score, i.e., the criterion performance, is established by the teacher or an educational institution. Criterion-referenced tests are part and parcel of mastery-based education.

Direct observation

Psychological assessment can involve the observation of people as they engage in activities. This type of assessment is usually conducted with families in a laboratory or at home. Sometimes the observation can involve children in a classroom or the schoolyard.<ref>Reid, J. B., Eddy, J. M., Fetrow, R. A., & Stoolmiller, M. (1999). Description and immediate impacts of a preventive intervention for conduct problems. American Journal of Community Psychology, 27, 483–517.</ref> The purpose may be clinical, such as to establish a pre-intervention baseline of a child's hyperactive or aggressive classroom behaviors or to observe the nature of parent-child interaction in order to understand a relational disorder.<ref>Template:Cite journal</ref> Time sampling methods are also part of direct observational research. The reliability of observers in direct observational research can be evaluated using Cohen's kappa.
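
Cohen's kappa corrects the raw agreement between two observers for the agreement expected by chance. The sketch below shows the calculation for two observers who coded the same sequence of intervals. The function name and the behavior codes in the example are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two observers' codes of the same observations."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty sequence")
    n = len(codes_a)
    # Proportion of observations on which the observers agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each observer's code frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical time-sampled codes for four observation intervals.
observer_1 = ["on-task", "on-task", "off-task", "off-task"]
observer_2 = ["on-task", "off-task", "off-task", "off-task"]
kappa = cohens_kappa(observer_1, observer_2)  # 0.5
```

Here the observers agree on three of four intervals (0.75), but half of that agreement would be expected by chance, so kappa is 0.5.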

The Parent-Child Interaction Assessment-II (PCIA)<ref>Template:Cite journal</ref> is an example of a direct observation procedure that is used with school-age children and parents. The parents and children are video recorded playing at a make-believe zoo. The Parent-Child Early Relational Assessment<ref>Template:Cite journal</ref> is used to study parents and young children and involves a feeding and a puzzle task. The MacArthur Story Stem Battery (MSSB)<ref>Bretherton, I., Oppenheim, D., Buchsbaum, H., Emde, R. N., & the MacArthur Narrative Group. (1990). MacArthur Story-Stem battery. Unpublished manual.</ref> is used to elicit narratives from children. The Dyadic Parent-Child Interaction Coding System-II<ref>Template:Cite journal</ref> tracks the extent to which children follow the commands of parents and vice versa and is well suited to the study of children with Oppositional Defiant Disorders and their parents.

Interest inventories

Psychological tests include interest inventories.<ref>Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall.</ref> These tests are used primarily for career counseling. Interest inventories include items that ask about the preferred activities and interests of people seeking career counseling. The rationale is that if the individual's activities and interests are similar to the modal pattern of activities and interests of people who are successful in a given occupation, then the chances are high that the individual would find satisfaction in that occupation. A widely used instrument is the Strong Interest Inventory, which is used in career assessment, career counseling, and educational guidance.<ref>Template:Cite journal</ref><ref>Template:Cite journal</ref>

Neuropsychological tests

Neuropsychological tests are designed to assess behaviors that are linked to brain structure and function. An examiner, following strict pre-set procedures, administers the test to a single person in a quiet room largely free of distractions.<ref name=":1" /> An example of a widely used neuropsychological test is the Stroop test.

Norm-referenced tests

Items on norm-referenced tests have been tried out on a norming group, so scores on the test can be classified as high, medium, or low, with gradations in between.<ref name=":1" /> These tests allow for the study of individual differences. Scores on norm-referenced achievement tests are associated with percentile ranks vis-à-vis other individuals who are the test-taker's age or grade.
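
A percentile rank can be computed by locating a score within the norming group's scores. The sketch below uses one common convention, counting scores below the target plus half of the scores tied with it. Test publishers may use other conventions, and the function name and norm data are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentile rank of a score relative to a norming group's scores."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(s < score for s in norm_scores)
    tied = sum(s == score for s in norm_scores)
    # Ties count as half below, a common convention.
    return 100 * (below + 0.5 * tied) / len(norm_scores)


# Hypothetical reading scores from a same-grade norming group.
grade_norms = [10, 20, 30, 40, 50]
rank = percentile_rank(30, grade_norms)  # 50.0
```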

Personality tests

Personality tests assess constructs that are thought to be the constituents of personality. Examples of personality constructs include traits in the Big Five, such as introversion-extroversion and conscientiousness. Personality constructs are thought to be dimensional. Personality measures are used in research and in the selection of employees. They include self-report and observer-report scales.<ref>Ashton, M. C. (2017). Individual Differences and Personality (3rd ed.). Amsterdam: Elsevier.</ref> Examples of norm-referenced personality tests include the NEO-PI, the 16PF Questionnaire, the Occupational Personality Questionnaires,<ref name="psychassess" /> and the Five-Factor Personality Inventory.<ref>Hendriks, A. A. J., Hofstee, W. K. B., & De Raad, B. (1999). The Five-Factor Personality Inventory (FFPI). Personality and Individual Differences, 27(2), 307-325. https://doi.org/10.1016/S0191-8869(98)00245-1</ref>

The International Personality Item Pool (IPIP) scales assess the same traits that the NEO and other personality scales assess. All IPIP scales and items are in the public domain and, therefore, are available free of charge.<ref>International Personality Item Pool. Accessed July 14, 2020</ref>

Projective tests

Projective testing originated in the first half of the 1900s.<ref name="Wasserman">Template:Cite book</ref> The idea animating projective tests is that the examinee projects hidden aspects of his or her personality, including unconscious content, onto the ambiguous stimuli presented in the test. Examples of projective tests include the Rorschach test,<ref>Template:Cite journal</ref> the Thematic Apperception Test,<ref>Murray, H. (1943). The Thematic Apperception Technique. Cambridge, MA: Harvard University Press. OCLC 223083.</ref> and the Draw-A-Person test.<ref>Template:Cite book</ref> Available evidence, however, suggests that projective tests have limited validity.<ref>Template:Cite journal</ref>

Psychological symptom scales

Public safety employment tests

Candidates in public safety vocations (e.g., fire service, law enforcement, corrections, emergency medical services) are often required to take industrial or organizational psychological tests for initial employment and promotion. The National Firefighter Selection Inventory, the National Criminal Justice Officer Selection Inventory, and the Integrity Inventory are prominent examples of these tests.<ref>Public Safety Self Assessment. National Testing Network</ref><ref>National Firefighter Selection Inventory Technical Report, 2011, I/O Solutions, Inc., Westchester, Illinois 60154</ref><ref>National Criminal Justice Officer Selection Inventory Squared</ref><ref>Integrity Inventory</ref>

Sources of psychological tests

Thousands of psychological tests have been developed. Some were produced by commercial testing companies that charge for their use. Others have been developed by researchers, and can be found in the academic research literature. Tests to assess specific psychological constructs can be found by conducting a database search. Some databases are open access, for example, Google Scholar (although many tests found in the Google Scholar database are not free of charge).<ref>Template:Cite web</ref> Other databases are proprietary, for example, PsycINFO, but are available through university libraries and many public libraries (e.g., the Brooklyn Public Library and the New York Public Library).<ref>Template:Cite web</ref>

There are online archives available that contain tests on various topics.

  • APA PsycTests. Requires subscription<ref>Template:Cite web</ref>
  • Mental Measurements Yearbook<ref>Template:Cite web</ref> - a non-profit that provides independent reviews of thousands of distinct psychological tests.
  • Assessment Psychology Online has links to dozens of tests for clinical assessment.<ref>Template:Cite web</ref>
  • International Personality Item Pool (IPIP) contains items to assess more than 100 personality traits, including those of the Five Factor Model.<ref>Template:Cite web</ref>
  • Organization of Work: Measurement Tools for Research and Practice. NIOSH site devoted to occupational health and safety.<ref>Archive of NIOSH's website</ref>

Test security

Many psychological and psychoeducational tests are not available to the public. Test publishers put restrictions on who has access to the test. Psychology licensing boards also restrict access to the tests used in licensing psychologists.<ref>Template:Cite web</ref><ref>Template:Cite journal</ref> Test publishers hold that both copyright and professional ethics require them to protect the tests. Publishers sell tests only to people who have proved their educational and professional qualifications. Purchasers are legally bound not to give test answers or the tests themselves to members of the public unless permitted by the publisher.<ref>Template:Cite web</ref>

The International Test Commission (ITC), an international association of national psychological societies and test publishers, publishes the International Guidelines for Test Use, which prescribes measures to take to "protect the integrity" of the tests by not publicly describing test techniques and by not "coaching individuals" so that they "might unfairly influence their test performance."<ref>International Test Commission (2000) International Guidelines for Test Use</ref>
