Pre-experimentally, the experiment should be designed so as to minimise the chance that the data are produced by anything but the phenomenon of interest. It is a simple fact of experimental life that countless things can go wrong, often in unforeseeable ways. To begin with, we prefer the data to be the result of the subjects' input, and not of a malfunctioning software code or a terminal gone crazy. It is therefore a good idea to investigate the equipment for proper functioning beforehand. We also prefer the data to result from an experimental subject's own reflections about the task, and not from one person copying another person's answers, and so we screen the subjects off from each other. Further, a subject's decisions should result from his or her reflections on the task and not from error and chance, and so we provide very clear instructions before the experiment begins, allow for time to think during the experiment, and make the on-screen prompts as unambiguous as possible. And so on.
Even a perfectly designed experiment will not immediately reveal a phenomenon of scientific interest. As we have seen, an experiment, whether well or ill designed, produces data, not phenomena. Thus, post-experimentally, the resulting data have to be aggregated and (statistically) analysed so that inferences can be drawn about the phenomenon of interest.
An experimental result - say, a measured difference (of size d, say) in average contributions between heterogeneous and homogeneous groups of players - can then be said to be ‘internally valid’ if (and only if) it correctly indicates that a causal effect (of size d) of agent heterogeneity on contribution levels exists in the experimental population. (The internal validity of an experimental result thus implies the existence of a phenomenon.) A result is said to be ‘externally valid’ if it correctly indicates that the causal effect exists in other populations.
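The aggregation step behind a result such as ‘a difference of size d in average contributions’ can be sketched as follows. This is only an illustration under stated assumptions: the contribution figures are invented, the function names are hypothetical, and the permutation test is one of several analyses that could be used. Note also that a statistically reliable difference is at most evidence for internal validity; whether the difference is caused by heterogeneity depends on the design, not on the arithmetic.

```python
import random
from statistics import mean


def difference_in_means(group_a, group_b):
    """Return the measured difference d in average contributions."""
    return mean(group_a) - mean(group_b)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Share of random relabellings whose absolute difference in means
    is at least as large as the observed one (two-sided)."""
    rng = random.Random(seed)
    observed = abs(difference_in_means(group_a, group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        shuffled_d = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if shuffled_d >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Invented contribution data, for illustration only (not from any study).
heterogeneous = [4, 6, 5, 3, 7, 5]
homogeneous = [8, 9, 7, 10, 8, 9]

d = difference_in_means(heterogeneous, homogeneous)
p = permutation_p_value(heterogeneous, homogeneous)
print(f"d = {d}, permutation p-value = {p}")
```

The permutation test is used here because it relies only on the random assignment of subjects to groups, which is the feature of the design that licenses the causal reading in the first place.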
Slovic and Lichtenstein (1983) suggest that preference reversals are the result of information-processing effects, and occur because the mental processes brought to bear on valuation tasks are different from those brought to bear on choice tasks. Thus, one explanation of the reversal phenomenon holds that people have (at least) two different sets of preferences which are activated in different decision situations. Another explanation holds that there is something wrong with the experimental procedure. Yet another explanation is offered by regret theory. Regret theory is a theory of choice under uncertainty which models choice as the minimising of a function of the regret vector, defined as the difference between the outcome yielded by a given choice and the best outcome that could have been achieved in that state of nature.
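The notion of a regret vector can be made concrete with a small sketch. Two assumptions are made here that the text does not settle: regret is computed as the best achievable outcome in a state minus the outcome actually obtained (so that it is never negative), and the function to be minimised is the worst-case regret. The second is only one possible choice of function, picked for simplicity. The payoffs and names are hypothetical.

```python
def regret_vectors(payoffs):
    """Map each action to its regret vector.

    payoffs: dict mapping action name -> list of outcomes, one per state.
    Regret in a state = best outcome achievable in that state
                        minus the outcome the action yields there.
    """
    n_states = len(next(iter(payoffs.values())))
    best_per_state = [
        max(outcomes[s] for outcomes in payoffs.values())
        for s in range(n_states)
    ]
    return {
        action: [best_per_state[s] - outcomes[s] for s in range(n_states)]
        for action, outcomes in payoffs.items()
    }


def minimax_regret_choice(payoffs):
    """Pick the action whose largest regret is smallest.

    Assumption: 'max' stands in for the unspecified function of the
    regret vector; other functions would give other choices.
    """
    regrets = regret_vectors(payoffs)
    return min(regrets, key=lambda action: max(regrets[action]))


# Hypothetical bets over two states of nature: (win, lose).
bets = {
    "safe_bet": [4, 4],
    "risky_bet": [16, 0],
}

regrets = regret_vectors(bets)
choice = minimax_regret_choice(bets)
print(regrets)
print(choice)
```

In this invented example the safe bet forgoes 12 units if the win state occurs, whereas the risky bet forgoes only 4 if the lose state occurs, so the worst-case rule favours the risky bet.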
The phenomenon of preference reversals is therefore consistent with at least three theoretical interpretations. This is a general fact about hypothesis testing: because theoretical hypotheses are never tested in isolation, and there is always some uncertainty concerning auxiliary assumptions, an apparent conflict between an experimentally established phenomenon and a theoretical hypothesis can always be interpreted in more than one way - as a refutation of the hypothesis at stake, as a refutation of another theoretical claim made in the derivation of the prediction, or as a violation of an assumption about the experimental set-up.
The problem of judging whether an inference from an experimental situation to another situation (which may but does not have to be a policy situation) is warranted has come to be known as the ‘problem of external validity’. The problem is essentially to decide under what conditions an experimental result can be projected onto a hitherto unobserved situation of interest. This is a genuine problem because experimental situations, by their very nature, differ more or less dramatically from the situations about which we would ultimately like to learn. One reason for this we have already encountered: experimental control makes experimental situations artificial to some extent, and people's behaviour may differ between these artificial laboratory situations and more ‘natural’ policy situations.
There is another reason: university students are cheap, available and relatively reliable. They are used in economic experiments because it is convenient to do so, not for some good epistemic reason. And there is a good chance that their behaviour differs from the behaviour of others, be they the population at large or specific target populations such as ‘managers of companies interested in buying licences for the use of the electromagnetic spectrum’.
People's behaviour in economic experiments appears to vary with factors such as the level of monetary incentives, culture and social norms, experience, the distribution of information, social status and so on. It is therefore difficult to predict whether an experimental result will still hold when some of these factors differ between experiment and target situation - as is invariably the case. As far as I can see, the attempts at solutions that philosophers of science have developed can be organised into four groups: solutions based on causal mechanisms, on causal tendencies, on engineering, and on field experiments.
An ‘artifactual field experiment’ (AFE), then, is one that is just like a laboratory experiment, except that the subject pool is more representative of the target population of interest.
A ‘framed field experiment’ (FFE) is also tightly controlled but has field context in the commodity, task, stakes or information set of the subjects.
A ‘natural field experiment’ (NFE), finally, moves the environment from the lab to the field. That is, subjects are observed in their natural habitat rather than in a university laboratory. The only difference between NFEs and naturally occurring situations is that subjects are randomised into treatment and control groups (usually, unwittingly).
Experiments are a powerful tool of scientific investigation because they allow the control of background factors in a way that makes causal inferences very reliable. But the reliability comes at the cost of increased unrealisticness with respect to the situations economists are ultimately interested in: ‘natural’ market or policy situations.