This page is under construction.
Article: "Overview of Psychometrics" by Catherine Price
1. Why is a psychometric test different to a list of
questions?
2. Choosing appropriate tests for your use
3. Names of tests
4. History of psychometric testing
5. Level A/B etc.
6. Different categories of test: ability, interest and
personality
There are three distinct
categories of test:
- Ability Tests
- Interest Inventories
- Personality Questionnaires
Ability Tests measure how well someone can do something, how much
they know, and sometimes how great their potential is.
They may test intellectual
abilities such as verbal fluency or numerical reasoning, or they may test
something practical such as clerical skills or programming ability.
These tests have answers
which are either right/wrong or good/bad.
They are usually timed.
Examples of ability tests
are:
- VMT3 – Verbal Reasoning (SHL)
- NMT 4 – Numerical (SHL)
- Critical Thinking Appraisal (Watson Glaser)
- The Able Series (OPP)
Interest Inventories usually describe a person’s
preferred working style, likes and dislikes, and attitudes to external things
such as scientific and artistic activities.
They are self-reporting
and do not have right or wrong answers.
They are not usually
timed, although candidates are usually encouraged to complete them at a single
sitting and not to dwell too long on their responses.
Examples of interest
inventories are:
- Strong Inventory
- Career Interest Inventory
- Occupational Interest Profile
Personality Questionnaires measure someone’s
preferred or typical ways of thinking or acting, that is, their underlying
characteristics or “traits”.
They are self-reporting
and do not have right or wrong answers.
They are not usually timed
although, as with interest inventories, candidates are usually encouraged to
complete them at a single sitting and not to dwell too long on their responses.
Most personality
questionnaires which are used occupationally, rather than clinically, fall into
two broad categories: type and trait.
Type indicators have a very venerable history going right back to
ancient times. The ancient Greek physician Hippocrates identified four
fundamental human “temperaments”, which were based on “predominant body
fluids”. These characterised people variously as sanguine, melancholic,
choleric or phlegmatic. These “types” were the basis of most Western psychology
and medical practice for centuries, through the Middle Ages, Renaissance and
beyond.
In the twentieth century,
Carl Jung undertook a modern scientific study of personality types, most
famously identifying the difference between extravert and introvert personality
types. Jung’s work was developed further by the mother and daughter team of
Katharine Briggs and Isabel Myers, who extended Jung’s classifications and
constructed the Myers-Briggs Type Indicator (MBTI). MBTI remains among
the best-known type-based occupational psychometric instruments.
Trait indicators seek to measure the individual dimensions which
make up someone’s personality. The number of dimensions has been variously put at 21 (Hans
Eysenck), 16 (Raymond Cattell), 32 (Saville and Holdsworth) and, most recently,
five: “The Big 5” (Costa and McCrae). The ways in which the individual dimensions
interact can give an indication of the individual’s likely approaches
to working with people, and of their thinking styles, feelings and emotions.
Although they are not
ability tests, trait indicators can be used to draw inferences about an
individual’s likely “fit” to competencies, although this always needs to be
tested in a competency-based interview.
Examples of trait
indicators are:
- 16PF (OPP)
- 15FQ (Psytech)
- OPQ 32 (SHL)
Some tests are designed to
home in on one specific set of competencies or approaches as, for example, in the Customer
Contact Styles Questionnaire (SHL), which is aimed at sales staff.
Others are hybrids between
a personality questionnaire and an interest inventory, and are sometimes called
Values Questionnaires. An example of this kind of test would be SHL’s Motivation
Questionnaire.
7. Using tests fairly and ethically: validity,
reliability and fairness
When considering any new
psychometric instrument or questionnaire, two key questions need to be
answered:
- What does the test measure?
- How accurately does it do that measurement?
The first of these
questions deals with the validity of the test, the second with its reliability.
Validity may be defined as follows:
“Validity is the extent
to which a test measures what it claims to be measuring, the extent to which it
is possible to make appropriate inferences from the test score.”
British Psychological Society Steering Committee on Test
Standards, 1989
There are a number of
issues to be taken into account when considering the validity of a test, and if
you do the British Psychological Society (BPS) Level A certificate you will go
into them in some detail. But they all come back to the basic questions, “Does
this test really measure what it says it is measuring?” and “Does it relate
sensibly to the real world?”
When you are choosing an
instrument to use in your organisation, however, you also need to be aware of
something called face validity. This means that the test looks
appropriate and relevant to the purpose for which it is being used, and to the
environment in which you work. It can be very important in getting buy-in from
line managers in your organisation, or staff or candidates who are taking the
test. Remember, though, that face validity is not by itself a guarantee of
actual validity.
Reputable test publishers
will be able to show you the research that they’ve done into the validity of
their tests, and if you’re thinking of investing in a test then you need either
to review this data yourself or get a suitably qualified person to advise you
on it.
Reliability deals with the repeatability or reproducibility of
a test or measure. If any measure is to be useful, it needs to be one which
gives more or less the same result each time it is applied. Having said this,
people’s personality and preferences can change over time, and for this reason
the reliability measures for personality questionnaires are generally lower than
those for ability and aptitude tests.
The most common way of
measuring reliability is to ask volunteers to complete the questionnaire twice,
and then calculate the correlation between the scores on the first and second
occasion. This is known as “test-retest”. However, because in practice this can
be difficult to arrange, there are other, more statistically based ways of
measuring reliability. These are also dealt with in some detail in the BPS
Level A certificate.
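As a rough illustration of the test-retest calculation described above, the sketch below computes the Pearson correlation between two sets of scores from the same volunteers. The function name pearson_r and the scores themselves are made up for the example; a real reliability study would use a much larger sample and the publisher's own procedure.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    var_x = sum((x - mean_x) ** 2 for x in first)
    var_y = sum((y - mean_y) ** 2 for y in second)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary on both occasions")
    return cov / sqrt(var_x * var_y)


# Hypothetical raw scores for six volunteers on two occasions.
first_sitting = [12, 15, 9, 20, 17, 11]
second_sitting = [13, 14, 10, 19, 18, 12]

reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))
```

A value close to 1 means that volunteers kept roughly the same rank order and spacing on both occasions; a value near 0 means the second set of scores tells you little about the first.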
Fairness is extremely important, particularly if the tests
are being used as part of a selection process. The rules for ensuring that
tests are administered fairly can be summarised as follows:
- Choose a test which measures a genuine job
requirement
- Choose a test which has been validated to
reduce the potential for discrimination against minority groups
- Issue practice booklets where these are
available
- Follow the administration instructions exactly
- Make sensible adjustments for disabled
candidates
- Always offer feedback to candidates
8. Keeping abreast of new developments
In order to keep up to
date with the latest developments in the field, those administering
psychometric tests should make use of the following resources.
- Publications
You can often find general
articles on psychometrics in publications such as People Management that
serve the needs of human resources professionals. For those specialising in
psychometrics, there is also the BPS’s Selection and Development Review,
which deals with issues relating to psychometrics in much greater depth.
- Conferences and networking
Find out how other
organisations are using psychometric instruments, either by going to talks
given by professionals, or by talking to colleagues in your network.
- Test reviews
A new development from the
BPS is that they now undertake test reviews according to European guidelines
and publish full reviews of all tests that have been submitted by their
publishers. These are quite expensive to buy, but could be well worth it if you
are about to invest heavily in a particular instrument. If you don’t have any
certificate holders in your organisation, then it may be worth consulting an
independent practitioner and asking them to give you some professional advice.
9. Norms
Most psychometric
instruments are scored by comparing the answers given by a staff member or
candidate to those given by a population of people who have taken the same test
in the past.
The scores are, therefore,
essentially comparative.
In the case of an ability
test, the candidate’s score will be described as a “percentile”, e.g. “This
candidate did better than 22% of people taking this test,” or “This candidate
did better than 75% of the people taking this test.”
The population of people
with whom the candidate is compared is known as the “norm group”. This may be a
general population, or a subset derived by occupational group (e.g. public
sector workers, or those engaged in manufacturing), management or professional
level (e.g. senior managers and directors, or administrators), educational
attainment (school leavers, or graduates) or even gender. Test publishers will
produce norm groups for their tests, and it is the responsibility of the person
marking the test to select an appropriate norm group for their context.
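The percentile comparison, and the effect of the choice of norm group, can be sketched as follows. This is a minimal illustration only: the function percentile_rank, the group names and all the scores are invented for the example, and it uses the simple "percentage scoring strictly below" convention, whereas publishers' norm tables may treat tied scores differently.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of the norm group who scored strictly below raw_score."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_scores if s < raw_score)
    return 100.0 * below / len(norm_scores)


# Hypothetical norm groups (20 raw scores each); real norm groups are
# supplied by the test publisher and are far larger.
norm_groups = {
    "graduates": [8, 10, 11, 12, 12, 13, 14, 14, 15, 15,
                  16, 16, 17, 18, 18, 19, 20, 21, 22, 24],
    "school_leavers": [5, 6, 8, 9, 10, 10, 11, 12, 12, 13,
                       13, 14, 15, 15, 16, 17, 17, 18, 19, 20],
}

candidate_raw_score = 17
for group_name, scores in norm_groups.items():
    pct = percentile_rank(candidate_raw_score, scores)
    print(f"Against {group_name}: better than {pct:.0f}% of the group")
```

With these made-up figures the same raw score of 17 is better than 60% of the graduate group but better than 75% of the school-leaver group, which is why the choice of norm group matters.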
For personality questionnaires,
the candidate will be judged as being “as likely as most people”, or “less
likely” or “very much more likely”, to behave in a certain way or adopt a
certain approach.
There is a standard
statistical approach for translating the candidate’s “raw score” into these
comparative scores. You need to understand it in order to pass BPS Level A.
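The article does not name the approach, so the sketch below is an assumption: it shows one widely used family of conversions, in which the raw score is first expressed as a z-score (its distance from the norm group mean in standard deviations) and then rescaled to a T-score (mean 50, standard deviation 10) or a sten (mean 5.5, standard deviation 2, limited to 1 to 10). The function names and the example figures are hypothetical.

```python
from math import floor


def z_score(raw_score, norm_mean, norm_sd):
    """Distance of a raw score from the norm group mean, in SD units."""
    if norm_sd <= 0:
        raise ValueError("norm group standard deviation must be positive")
    return (raw_score - norm_mean) / norm_sd


def t_score(z):
    """Rescale a z-score to a T-score (mean 50, SD 10)."""
    return 50 + 10 * z


def sten(z):
    """Rescale a z-score to a sten (mean 5.5, SD 2), limited to 1..10."""
    unrounded = 5.5 + 2 * z
    rounded = floor(unrounded + 0.5)  # round half up to a whole sten
    return max(1, min(10, rounded))


# Hypothetical example: raw score 34 against a norm group with
# mean 30 and standard deviation 8.
z = z_score(34, norm_mean=30, norm_sd=8)
print(z, t_score(z), sten(z))
```

Because each scale is only a rescaling of the z-score, a score is comparable across tests only when it is read against the same, appropriately chosen norm group.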
It is very important, in
the interests of both fairness and accuracy, that you:
- Administer the test exactly as instructed, so
that in making comparisons with the norm group you are genuinely comparing
“like with like”.
- Choose an appropriate norm group.