Chapter 4
Operant Conditioning: Skinner's Radical Behaviorism
Emphasized practical usefulness of science of human
behavior rather than testing formal theories.
Believed notions of the subconscious and other fictions, like Hull’s concepts of
intervening variables, are misleading and wasteful.
Believed explanations of behavior should rely exclusively on directly observable
phenomena.
Believed psychology should be an objective science whose methods involve the
analysis of behavior without appeal to subjective mental events.
Basic Assumptions
Human behavior follows certain laws.
Causes of behavior lie outside the person and can be studied (contrary to most
of psychology, which looks for causes within the person).
Discovering and describing laws between organism and environment occurs by
specifying 3 things:
occasion upon which a response occurs
the response itself
the reinforcing consequences
2 kinds of variables
IVs – manipulated experimentally
DVs – actual behavior; to be controlled
Departure from Pavlov
Skinner believed that Pavlov’s classical conditioning explained behaviors only
where the initial response can be elicited by a known stimulus.
Responses elicited by a stimulus = respondents
Responses emitted by an organism = operants
Believed that the experimental analysis of behavior requires analysis of (a) what
the organism does, (b) the circumstances under which it acts, and (c) the
consequences of its actions.
Classical conditioning = Type S (stimulus) conditioning
Operant conditioning = Type R (response) conditioning
Comparison with Pavlov
Skinner’s box – experimental chamber, a cage-like structure that can be equipped
with a lever, a light, a food tray, a food-releasing mechanism, and an electric
grid through the floor
Pavlov’s dogs responded predictably when placed in a harness and presented with
food powder
Skinner’s rats did not respond as predictably when placed in the box
Operant Learning – when a reinforcer follows a response, the probability that
the response will occur again under similar conditions increases; the conditions
then may have control over the response, serving as a discriminative stimulus.
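Read as a rule, this says: when a response is reinforced in the presence of some
conditions, the probability of that response under those conditions goes up. A
minimal Python sketch of the idea (purely illustrative; the probability table,
the 0.1 increment, and all names are assumptions, not anything Skinner specified):

    import random

    # Toy operant learner: probability of pressing a lever under each stimulus
    # condition, nudged upward whenever a press is followed by a reinforcer.
    press_prob = {"light_on": 0.05, "light_off": 0.05}

    def trial(stimulus, reinforce_presses):
        pressed = random.random() < press_prob[stimulus]
        if pressed and reinforce_presses:
            # Reinforcement raises the probability of the response under these
            # conditions, so the stimulus begins acting as a discriminative stimulus.
            press_prob[stimulus] = min(1.0, press_prob[stimulus] + 0.1)
        return pressed

    for _ in range(500):
        trial("light_on", reinforce_presses=True)    # presses reinforced
        trial("light_off", reinforce_presses=False)  # presses never reinforced

    print(press_prob)  # pressing ends up far more likely when the light is on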
Reinforcer
A reinforcer is an event that follows a response and that changes the
probability of a response’s occurring again.
In experimentation, the reinforcer should be operationally defined in terms of
observable and measurable behaviors, eliminating the need for speculation.
Reinforcement
Positive reinforcement is a satisfying consequence of a behavior.
Occurs when the probability of a response’s occurring increases as a function of
something being added to a situation
Referred to as “reward”
Negative reinforcement
Occurs when the probability of a response’s occurring increases as a function of
something being taken away from a situation
Referred to as “relief”
Punishment
punishment results in the suppression of behavior
removal punishment (penalty) = when a positive contingency is removed following a
behavior
presentation punishment (castigation) = when a negative contingency follows a
behavior
punishment vs. negative reinforcement
negative reinforcement increases the probability of a behavior; punishment does not
negative reinforcement involves terminating an event that might be considered
aversive; punishment involves introducing a negative contingency or terminating a
positive one
Reinforcement Schedules
what reinforcements are used and how and when they will be used = schedules of
reinforcement
continuous reinforcement = every desired response is reinforced; no further
choices to make
intermittent (partial) reinforcement = reinforcement occurs only some of the
time
ratio schedule = based on a proportion of responses
interval schedule = based on the passage of time
fixed or random
fixed = exact time or precise response that will be followed by a reinforcing
event is predetermined and unchanging
fixed-ratio = occurs after every 5th (for example) correct response
fixed-interval = reinforcement will be available as soon as the chosen time
interval has elapsed, immediately following the next correct response.
random = provide reinforcing events at unpredictable times
random ratio = occurs at a rate of 1 reinforcement to 5 responses on average;
exact point of reinforcement is unknown
Random interval = occurs at an interval of 1 reinforcement to 5 minutes on
average; exact point of reinforcement is unknown.
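Each of these schedules is just a different rule for deciding whether a given
correct response earns reinforcement. A rough Python sketch of those rules
(illustrative only; the function names and the crude per-time-step approximation
for the random-interval case are assumptions):

    import random

    def fixed_ratio(responses_since_last_reinforcer, ratio=5):
        # Reinforcement after every 5th (for example) correct response.
        return responses_since_last_reinforcer >= ratio

    def fixed_interval(time_since_last_reinforcer, interval=5.0):
        # Reinforcement becomes available once the interval has elapsed;
        # the next correct response is then reinforced.
        return time_since_last_reinforcer >= interval

    def random_ratio(ratio=5):
        # On average 1 reinforcement per 5 responses; the exact point is unknown.
        return random.random() < 1.0 / ratio

    def random_interval(time_step=1.0, mean_interval=5.0):
        # On average 1 reinforcement per 5 minutes; the exact point is unknown.
        return random.random() < time_step / mean_interval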
Superstitious Schedules
In many cases, outcomes are noncontingent on organism’s behavior
Superstitious schedule = special kind of noncontingent fixed-interval schedule
in which reinforcement occurs at fixed time intervals without the requirement
that there be a correct response.
It follows from the law of operant conditioning that whatever behavior happens to
occur just before reinforcement is strengthened.
Effects of Reinforcement Schedules
effects of schedules on acquisition
continuous schedule = more rapid learning
intermittent schedules = haphazard and slow learning
effects on extinction
continuous = more rapid extinction
fixed schedules = shorter acquisition time than random, as well as more rapid
extinction
Effects on rate of responding
continuous = high response rate
random (variable) ratio = much higher than interval
fixed interval = response rate drops dramatically immediately after
reinforcement but rises again to a high rate just before the next reinforcement
is due
spontaneous recovery = Pavlov noted that, even after extinction, if the CS is
presented again after the passage of time, the CR might occur again; spontaneous
recovery also occurs in operant conditioning after the behavior has been
extinguished.
extinction and forgetting
extinction = when an animal or person who has been reinforced for engaging in
behavior ceases to be reinforced; the outcome is a relatively rapid cessation of
the responses in question.
Forgetting = a much slower process that also results in the cessation of a
response, but not as a function of withdrawal of reinforcement; occurs simply
with the passage of time when there is no repetition of the behavior during this
time.
Shaping
technique used to train animals to perform acts that are not ordinarily in their
repertoire
not required for simple behaviors like pressing a bar, but for very complex,
impressive behaviors
method of successive approximations = the differential reinforcement of responses
that come progressively closer to the desired behavior
environment must be controlled; environment arranged to facilitate the
appearance of the desired response
important technique for animal trainers
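Shaping can be pictured as a loop that reinforces only responses meeting the
current criterion, then tightens the criterion toward the target act. A toy
Python sketch (the numeric "closeness" score, step sizes, and thresholds are
invented for illustration; real shaping is guided by a trainer's judgment):

    import random

    # Toy shaping loop: a response is a number in [0, 1] measuring how close it
    # comes to the target act; reinforced responses pull future behavior toward
    # themselves (differential reinforcement of successive approximations).
    typical_response = 0.1   # where the animal's behavior starts
    criterion = 0.2          # first approximation good enough to reinforce
    target = 1.0

    for _ in range(500):
        response = max(0.0, min(1.0, random.gauss(typical_response, 0.1)))
        if response >= criterion:
            typical_response += 0.3 * (response - typical_response)   # reinforced
            criterion = min(target, criterion + 0.02)  # demand a closer approximation

    print(round(typical_response, 2), round(criterion, 2))  # both approach 1.0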
Chaining
discriminative stimuli, such as sound of food mechanism, become secondary
reinforcers.
Over time, those discriminative stimuli that are further removed (like smell of
the lever) can become secondary reinforcers
Thus, a chain of responses can be woven together by a sequence of discriminative
stimuli, each of which is a secondary reinforcer associated with the primary
reinforcer.
when a behavior is shaped, chains are established.
Involves differentially reinforcing certain responses leading to the final and
complete sequence of responses
Fading, Generalization, and Discrimination
The animal first learns to discriminate between grossly different stimuli; the
obvious differences are then slowly removed, or faded, over a number of sessions.
Through fading, animals learn to discriminate between 2 highly similar stimuli.
Discrimination involves making different responses in similar but discriminably
different situations; making distinctions between similar situations to respond
appropriately to each.
Generalization is making similar responses in different situations.
Generalization involves engaging in previously learned behaviors in response to
new situations that resemble those in which the behaviors were first learned.
Applications of Positive Contingencies
Societies make greater use of aversive contingencies when positive contingencies
would be far more humane.
Classrooms are like giant Skinner boxes, and teachers are experimenters; they
schedule reinforcements, rewards, and punishments.
More information about actual sources of reinforcement is needed for teachers.
Reinforcement is relative: what is reinforcing varies among organisms, and a
consequence may be reinforcing initially but punishing later.
Premack Principle = a more probable (preferred) behavior can serve as a
reinforcer for a less probable behavior.
Applications of Aversive Consequences
the case against punishment
Punishment draws attention to undesirable behavior but does little to indicate
what desirable behavior should be.
Instead of eliminating behavior, punishment usually only suppresses it; what is
affected is the rate of responding
Punishment can lead to negative emotions, which may become associated with
punisher rather than undesirable behavior.
Often does not work.
less objectionable forms of punishment
time-out – children are removed from a situation where they might expect
reinforcement and placed in another situation where they are less likely to be
reinforced.
Response-cost punishment – when children who have received tangible reinforcers
for good behavior later have some of these reinforcers taken away for
misbehaviors.
Reprimands – soft verbal or nonverbal communication of disapproval (negative
shake of the head or a frown)
Behavior Management
behavior management = deliberate and systematic application of learning
principles in an attempt to modify behavior
behavior therapy = application of Pavlovian principles
behavior modification = operant learning principles
systematic use of rewards and occasionally of punishment is common in behavior
modification programs
exact relationship between specific behaviors and rewards is spelled out in a
contingency contract
specific, well-planned behavior management strategies are typically more
effective than are more informal, less-organized approaches
Counterconditioning
counterconditioning – when undesirable habits that have been conditioned to
certain stimuli are replaced with different, incompatible responses to the same
stimuli; successful in both medical and psychological settings
systematic desensitization – used in psychotherapy to treat anxieties and
phobias.
Critique
strengths:
well-defined, highly researched system
reflects facts
clear and understandable
allows predictions to be verified
isn’t based on many unverified assumptions
criticisms:
does not explain symbolic processes
says little about contemporary topics such as decision making, problem solving,
perception, etc.
critics are dissatisfied with attempts to explain language through reinforcement
theory
neglected role of biology in learning
defenses:
Skinner’s rejection of the usefulness of mental processes does not mean
rejection of their existence.
Contribution to understanding verbal behavior is overlooked.
Skinner’s explanation for “mentalistic” concepts such as self-awareness is
overlooked.
Some psychologists consider Skinner’s view an assault on freedom and dignity; if
we are controlled by the environment, we cannot be free. Skinner insists that
humans are controlled by their environment, but did humans not build these
environments?
Defenses:
Skinner wasn’t trying to prove that free will does not exist; rather, he was
arguing against what he considered unscientific and futile explanations for
human behavior
Many critics objected not to the theory but to their interpretation of its
implications. They don’t like what humanity seems to be. As Skinner noted, “No
theory changes what it is a theory about; man remains what he has always been”
(1971, p. 215)