
Lecture 6: Motives, intrinsic and extrinsic

Your author uses the term “incentive” to refer to an environmental event that attracts or repels a person toward or away from an action: “incentives always precede behavior” (p. 101). It took me a while to get used to this use of the term because everyday language tends to use incentive as another word for consequence, what might follow a behavior to affect its frequency.

Dr. Reeve’s understanding of “behavioral psychology” seems based on very old models, such as the stimulus-response theories of Miller and Dollard. These were built on a single-factor learning theory (respondent conditioning or learning, Pavlovian conditioning: the associative connection of stimuli occurring contiguously so that both now elicited responses originally produced by only one). A limitation of these theories was that they could only explain how behavior came under the control of different stimuli, not how new behaviors were developed in the organism. These early learning theories were supplanted by two-factor models (the instrumental learning of Thorndike and the operant learning of Skinner). One of the contributions of Dr. Skinner was a clear picture of how novel behavior could be “shaped” by differentially rewarding successive approximations of a desired behavior. (The two-factor models were, in turn, supplanted by the three-factor learning models of Rotter and Bandura as cognitive factors gained importance in understanding behavior: respondent conditioning, operant learning, observational learning [modeling, imitation].)

In the language of operant psychology, events such as those Dr. Reeve calls incentives are referred to as “discriminative stimuli” (they inform organisms about the prevailing reinforcement contingencies). The light comes on in the Skinner box and signals that bar pressing by the rat will now be followed by food pellets. The 40 MPH sign on Main Street signals that it is now safe to increase your speed above the 30 MPH city-street limit without attracting the attention of a passing patrol car. Discriminative stimuli can be thought of as signals that let us know what the current “rules” are for different outcomes (“contingencies of reinforcement”).

Your author believes that “a reinforcer must be defined in a manner that is independent from its effects on behavior” (p. 101). Well, Dr. Skinner did not do this; in fact he did the exact opposite: he defined reinforcement in terms of its effects on behavior. (This raises its own problems for both theory and application, but it is how he did it and how most discussions of “operant psychology” continue to define it.) A practical difficulty of this “functional definition” of reinforcement is that you never know if a stimulus is going to be a reinforcer for a person at any given point in time; the best you know is that, in similar circumstances in the past, it functioned as a reinforcer.

Despite all my carping about Ch. 5, Dr. Reeve does a nice job on p. 103 of discussing many of the factors that will influence reinforcer effectiveness (or “power”): quantity or intensity, immediacy, degree of deprivation for that stimulus, and value of that stimulus to the recipient (what Dr. R. calls “fit” or “perceived value”).

Dr. Reeve says there are three types of consequences: positive reinforcers, negative reinforcers, and punishers (p. 103). Most texts will tell you there are four types of consequences, corresponding to either the presentation or withdrawal of two categories of stimuli, positive and negative.

  • Positive reinforcement: an action is followed by a reinforcing stimulus
  • Negative reinforcement: an action is followed by removal of an aversive stimulus
  • Punishment: an action is followed by an aversive stimulus

Dr. Reeve gets all this correct (not a trivial success; I have seen college textbooks confuse negative reinforcement and punishment). What he leaves out is the fourth possibility:

Response cost: an action is followed by removal of a reinforcing stimulus.

Response costs (library fines, speeding tickets, loss of privileges, taking away something you want) are a common and important aspect of environmental control of behavior. Because humans tend to be “risk averse,” we are more heavily influenced by possible losses than by possible gains. (We’ll need to talk about that more later.)
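The four consequence types amount to a 2 × 2 grid: what is done with the stimulus (presented or removed) crossed with the kind of stimulus (appetitive or aversive). A minimal sketch in Python (the labels and function name here are mine, not the text’s):

```python
# The four operant consequences as a 2x2 grid:
# operation (present/remove) x stimulus type (appetitive/aversive).
CONSEQUENCES = {
    ("present", "appetitive"): "positive reinforcement",  # behavior increases
    ("remove",  "aversive"):   "negative reinforcement",  # behavior increases
    ("present", "aversive"):   "punishment",              # behavior decreases
    ("remove",  "appetitive"): "response cost",           # behavior decreases
}

def classify(operation, stimulus):
    """Name the operant procedure for a given operation/stimulus pair."""
    return CONSEQUENCES[(operation, stimulus)]

# A library fine takes away money (an appetitive stimulus):
print(classify("remove", "appetitive"))  # response cost
```

Note that the grid makes the symmetry obvious: the two “reinforcement” cells increase behavior, the other two decrease it.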

Another important aspect of reinforcement (and the other operations) is the schedule the contingencies arrange between the behavior and the consequence. This is usually discussed in terms of schedules of positive reinforcement (but the same schedules apply to the other three operations):

A “continuous” schedule of reinforcement means that every instance of the behavior under consideration is followed by a positive reinforcer (assuming we know what is reinforcing for the organism at that moment). Every time you perform the action you are paid off by the universe. This is actually rather rare in everyday life; most natural contingencies we observe are on some type of “variable schedule” (sometimes you get paid, sometimes you don’t).

A continuous schedule of consequences generates the most rapid acquisition of behavior. It is often used in training a new behavior. Variable schedules of reinforcement tend to generate more lasting histories of performance. The variable schedules of reinforcement (ratio, interval, delayed) have interesting and sometimes practical characteristics (well understood by the people who operate gambling casinos), but that is for another course. What is especially important to our discussion is what happens when a schedule of reinforcement is discontinued: the behavior may occur but is now no longer followed by a positive reinforcer. The technical term is “extinction.”
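A rough illustration of how these schedules relate: a continuous schedule is just a fixed-ratio schedule with a ratio of 1, and extinction is a schedule under which no response is ever reinforced. The sketch below uses a deterministic fixed-ratio rule for simplicity (a true variable-ratio schedule would only reinforce every nth response on average):

```python
def reinforced_responses(n_responses, ratio):
    """Fixed-ratio schedule: every `ratio`-th response earns a reinforcer.
    ratio=1 is continuous reinforcement (CRF); ratio=None models extinction."""
    if ratio is None:  # extinction: the behavior occurs, nothing pays off
        return []
    return [r for r in range(1, n_responses + 1) if r % ratio == 0]

# Continuous reinforcement: every response pays off.
print(reinforced_responses(4, 1))     # [1, 2, 3, 4]
# FR-3: only every third response pays off.
print(reinforced_responses(9, 3))     # [3, 6, 9]
# Extinction: responding continues, but no reinforcer ever follows.
print(reinforced_responses(5, None))  # []
```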

There are predictable short term effects associated with extinction:

Extinction phenomena

  1. An increase in frequency of the response (“response bursting”)
  2. An increase in the magnitude of the response
  3. Variation in the topography (form) of the response
  4. Emotional behavior
  5. Aggression (possibly)

Extinction is what you experience when your favorite pop/junk-food machine “cheats” you by failing to deliver your selection. Consider your typical response: punch the button several more times, punch the button harder, hold the button depressed for a longer period, feel frustrated and angry, and (possibly) curse at or kick the machine. The machine has reliably functioned in the past and now appears to be broken; you make an effort to restore appropriate functioning. This is how an organism responds to being placed on an extinction schedule.
I believe that understanding extinction phenomena is important in unraveling the results of experiments on the so-called cost of extrinsic reinforcement.

[Truth in lecturing warning (also known as Teacher Bias): I don’t believe there is a hidden cost effect of extrinsic reinforcement, and I don’t believe that there is an essential difference between extrinsic and intrinsic reinforcement. I do believe that different consequences have different (and varying) values to us, and I do believe we all experience multiple schedules of reinforcement and multiple degrees of responsiveness to different consequences continuously throughout our lives.]

I believe that extinction effects and a restricted range of experimental circumstances are a better explanation for the reputed negative effects of extrinsic reinforcement than a hypothesized erosion of intrinsic motivation. But, you should not believe something just because I believe it. If this question interests you, go take a look at the experimental and clinical literature on reinforcement as a learning and therapeutic tool.

Let us consider some crude categories of consequences:

Reinforcement Selection

One heuristic for selection of potential reinforcers is to consider two aspects of a potential reinforcer: potency and hassle. Potency refers to the “power” or “impact” of the stimulus on the individual; hassle refers to how difficult and complicated using the stimulus therapeutically will be. My experience is that these two aspects are inversely related. Choosing a reinforcer for therapeutic use involves selecting the least powerful stimulus that will “get the job done” (and also be the easiest to manipulate). Consider the following rough hierarchy of classes of stimulus events:

  • information (knowledge of results, interesting stuff)
  • social events (praise, hugs, pats, smiles, vocalizations)
      • physical: a hug; but your author is correct, the child might prefer that bowl of chocolate pudding, and a child on the spectrum might find a hug aversive
      • verbal: praise; public praise is sometimes advocated over private praise, but if a child is socially anxious, being praised in front of the class may be aversive
  • activities (doing things)
      • enjoyable activities
      • stimulating activities
      • challenging activities
      • the Premack Principle: high-frequency behavior will reinforce lower-frequency behavior
  • material (stuff)
      • consumable (food, drink, scratch-off lottery tickets)
      • tangible (toys, clothes, a new smart phone)
      • generalized conditioned (tokens, money)

I would suggest that the power of these events increases as you go down the list, as do the difficulties and complications of you or a family employing these events (satiation, cost, availability, complexity, and other problems). Your task will be easiest if you work at staying as close to the top of the list as possible while still obtaining a therapeutic effect.
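The “least powerful stimulus that gets the job done” heuristic can be sketched as a walk down the hierarchy, stopping at the first class strong enough for the task. The potency numbers below are invented purely for illustration; in practice, potency is something you estimate for a particular person:

```python
# Hypothetical reinforcer hierarchy, ordered from least to most potent
# (and, by the inverse relationship, least to most hassle).
# Potency values are invented for illustration only.
HIERARCHY = [
    ("information",   1),
    ("social events", 2),
    ("activities",    3),
    ("material",      4),
]

def select_reinforcer(required_potency):
    """Return the first (lowest-hassle) class whose potency suffices."""
    for name, potency in HIERARCHY:
        if potency >= required_potency:
            return name
    return None  # nothing in the hierarchy is strong enough

print(select_reinforcer(2))  # social events
```

Because the list is ordered top-down, the first match is automatically the lowest-hassle option, which is exactly the heuristic described above.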

Punishment

Skinner defined aversive stimuli as consequences that could negatively reinforce a behavior that terminated the stimulus (again, note: a “functional definition”: the stimulus is defined by the effect it has on behavior). When an aversive stimulus is presented following a behavior, the operation is known as punishment.

Skinner’s research (and that of others) demonstrated some important differences between the effects of punishment and extinction. Punishment can suppress behavior but tends to have little (Skinner believed no) effect on response strength. Extinction actually decreases response strength.
Punishment does have a number of undesirable side effects, which your author discusses reasonably accurately. It often leads to emotional behavior and sometimes aggression in the recipient, can motivate non-helpful/nonadaptive avoidance behavior, models aversive control, and can lead to the punishing agent being feared (or, in extreme cases, over-identified with, as in abused-child and Stockholm-syndrome cases). Finally, any beneficial effects of punishment are very dependent upon the punishment procedure being carried out close to optimally with respect to the elements associated with effective suppression (maximum intensity, immediate delivery, for every instance of the target behavior); deviation from these optimum aspects leads to rapid decline of any effect. [In contrast, positive reinforcement procedures are very “robust”: you can deviate quite a bit from optimal characteristics and still get a good result.] Moral of the day: it is far easier to encourage desirable behavior than to get rid of undesirable behavior (except by crowding it out with good behavior).
