Chapter 4
Operant Conditioning: Skinner's Radical Behaviorism
Emphasized practical usefulness of science of human
behavior rather than testing formal theories.
Believed notions of the subconscious and other fictions, like Hull’s concepts of
intervening variables, are misleading and wasteful.
Believed explanations of behavior should rely exclusively on directly observable
phenomena.
Believed psychology should be an objective science whose methods involve the
analysis of behavior without appeal to subjective mental events.
Basic Assumptions
Human behavior follows certain laws.
Causes of behavior lie outside the person and can be studied (contrary to most
of psychology, which looks for causes within the person).
Discovering and describing laws between organism and environment occurs by
specifying 3 things:
occasion upon which a response occurs
the response itself
the reinforcing consequences
2 kinds of variables
IVs – manipulated experimentally
DVs – actual behavior; to be controlled
Departure from Pavlov
Skinner believed that Pavlov’s classical conditioning explained behaviors only
where the initial response can be elicited by a known stimulus.
Responses elicited by a stimulus = respondents
Responses emitted by an organism = operants
Believed that the experimental analysis of behavior requires analysis of (a) what
the organism does, (b) the circumstances under which it acts, and (c) the
consequences of its actions.
Classical conditioning = Type S (stimulus) conditioning
Operant conditioning = Type R (response) conditioning
Comparison with Pavlov
Skinner’s box – experimental chamber, a cage-like structure that can be equipped
with a lever, a light, a food tray, a food-releasing mechanism, and an electric
grid through the floor
Pavlov’s dogs responded predictably when placed in a harness and presented with
food powder
Skinner’s rats did not respond as predictably when placed in the box
Operant Learning – when a reinforcer follows a response, the probability that
the response will occur again under similar conditions increases; the conditions
then may have control over the response, serving as a discriminative stimulus.
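Read as a rule, this says: when a response is reinforced in the presence of some
conditions, the probability of that response under those conditions goes up. A
minimal Python sketch of the idea (purely illustrative; the probability table,
the 0.1 increment, and all names are assumptions, not anything Skinner specified):

    import random

    # Toy operant learner: probability of pressing a lever under each stimulus
    # condition, nudged upward whenever a press is followed by a reinforcer.
    press_prob = {"light_on": 0.05, "light_off": 0.05}

    def trial(stimulus, reinforce_presses):
        pressed = random.random() < press_prob[stimulus]
        if pressed and reinforce_presses:
            # Reinforcement raises the probability of the response under these
            # conditions, so the stimulus begins acting as a discriminative stimulus.
            press_prob[stimulus] = min(1.0, press_prob[stimulus] + 0.1)
        return pressed

    for _ in range(500):
        trial("light_on", reinforce_presses=True)    # presses reinforced
        trial("light_off", reinforce_presses=False)  # presses never reinforced

    print(press_prob)  # pressing ends up far more likely when the light is on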
Reinforcer
A reinforcer is an event that follows a response and that changes the
probability of a response’s occurring again.
In experimentation, the reinforcer should be operationally defined in terms of
observable and measurable behaviors, eliminating the need for speculation.
Reinforcement
Positive reinforcement is a satisfying consequence of a behavior.
Occurs when the probability of a response’s occurring increases as a function of
something being added to a situation
Referred to as “reward”
Negative reinforcement
Occurs when the probability of a response’s occurring increases as a function of
something being taken away from a situation
Referred to as “relief”
Punishment
punishment results in the suppression of behavior
removal punishment (penalty) = when a positive contingency is removed following a
behavior
presentation punishment (castigation) = when a negative contingency follows a
behavior
punishment vs. negative reinforcement
negative reinforcement increases the probability of a behavior; punishment does not
negative reinforcement involves terminating an event that might be considered
aversive; punishment involves introducing a negative contingency or terminating a
positive one
Reinforcement Schedules
what reinforcements are used and how and when they will be used = schedules of
reinforcement
continuous reinforcement = every desired response is reinforced; no further
choices to make
intermittent (partial) reinforcement = reinforcement occurs only some of the
time
ratio schedule = based on a proportion of responses
interval schedule = based on the passage of time
fixed or random
fixed = exact time or precise response that will be followed by a reinforcing
event is predetermined and unchanging
fixed-ratio = occurs after every 5th (for example) correct response
fixed-interval = reinforcement will be available as soon as the chosen time
interval has elapsed, immediately following the next correct response.
random = provide reinforcing events at unpredictable times
random ratio = occurs at a rate of 1 reinforcement to 5 responses on average;
exact point of reinforcement is unknown
Random interval = occurs at an interval of 1 reinforcement to 5 minutes on
average; exact point of reinforcement is unknown.
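Each of these schedules is just a different rule for deciding whether a given
correct response earns reinforcement. A rough Python sketch of those rules
(illustrative only; the function names and the crude per-time-step approximation
for the random-interval case are assumptions):

    import random

    def fixed_ratio(responses_since_last_reinforcer, ratio=5):
        # Reinforcement after every 5th (for example) correct response.
        return responses_since_last_reinforcer >= ratio

    def fixed_interval(time_since_last_reinforcer, interval=5.0):
        # Reinforcement becomes available once the interval has elapsed;
        # the next correct response is then reinforced.
        return time_since_last_reinforcer >= interval

    def random_ratio(ratio=5):
        # On average 1 reinforcement per 5 responses; the exact point is unknown.
        return random.random() < 1.0 / ratio

    def random_interval(time_step=1.0, mean_interval=5.0):
        # On average 1 reinforcement per 5 minutes; the exact point is unknown.
        return random.random() < time_step / mean_interval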
Superstitious Schedules
In many cases, outcomes are noncontingent on organism’s behavior
Superstitious schedule = special kind of noncontingent fixed-interval schedule
in which reinforcement occurs at fixed time intervals without the requirement
that there be a correct response.
It follows from the law of operant conditioning that whatever behavior happens to
occur just before reinforcement is strengthened.
Effects of Reinforcement Schedules
effects of schedules on acquisition
continuous schedule = more rapid learning
intermittent schedules = haphazard and slow learning
effects on extinction
continuous = more rapid extinction
fixed schedules = shorter acquisition time than random, as well as more rapid
extinction
Effects on rate of responding
continuous = high response rate
random (variable) ratio = much higher than interval
fixed interval = response rate drops dramatically immediately after
reinforcement but rises again to a high rate just before the next reinforcement
is due
spontaneous recovery = Pavlov noted that, even after extinction, if the CS is
presented again after the passage of time, the CR might occur again; spontaneous
recovery also occurs in operant conditioning after the behavior has been
extinguished.
extinction and forgetting
extinction = when an animal or person who has been reinforced for engaging in
behavior ceases to be reinforced; the outcome is a relatively rapid cessation of
the responses in question.
Forgetting = a much slower process that also results in the cessation of a
response, but not as a function of withdrawal of reinforcement; occurs simply
with the passage of time when there is no repetition of the behavior during this
time.
Shaping
technique used to train animals to perform acts that are not ordinarily in their
repertoire
not required for simple behaviors like pressing a bar, but for very complex,
impressive behaviors
method of successive approximations = the differential reinforcement of responses
that come progressively closer to the desired behavior
environment must be controlled; environment arranged to facilitate the
appearance of the desired response
important technique for animal trainers
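Shaping can be pictured as a loop that reinforces only responses meeting the
current criterion, then tightens the criterion toward the target act. A toy
Python sketch (the numeric "closeness" score, step sizes, and thresholds are
invented for illustration; real shaping is guided by a trainer's judgment):

    import random

    # Toy shaping loop: a response is a number in [0, 1] measuring how close it
    # comes to the target act; reinforced responses pull future behavior toward
    # themselves (differential reinforcement of successive approximations).
    typical_response = 0.1   # where the animal's behavior starts
    criterion = 0.2          # first approximation good enough to reinforce
    target = 1.0

    for _ in range(500):
        response = max(0.0, min(1.0, random.gauss(typical_response, 0.1)))
        if response >= criterion:
            typical_response += 0.3 * (response - typical_response)   # reinforced
            criterion = min(target, criterion + 0.02)  # demand a closer approximation

    print(round(typical_response, 2), round(criterion, 2))  # both approach 1.0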
Chaining
discriminative stimuli, such as sound of food mechanism, become secondary
reinforcers.
Over time, those discriminative stimuli that are further removed (like smell of
the lever) can become secondary reinforcers
Thus, a chain of responses can be woven together by a sequence of discriminative
stimuli, each of which is a secondary reinforcer associated with the primary
reinforcer.
when a behavior is shaped, chains are established.
Involves differentially reinforcing certain responses leading to the final and
complete sequence of responses
Fading, Generalization, and Discrimination
The animal first learns to discriminate between grossly different stimuli; the
obvious differences are then slowly removed, or faded, over a number of sessions.
Through fading, animals learn to discriminate between 2 highly similar stimuli.
Discrimination involves making different responses in similar but discriminably
different situations; making distinctions between similar situations to respond
appropriately to each.
Generalization is making similar responses in different situations.
Generalization involves engaging in previously learned behaviors in response to
new situations that resemble those in which the behaviors were first learned.
Applications of Positive Contingencies
Societies make greater use of aversive contingencies when positive contingencies
would be far more humane.
Classrooms are like giant Skinner boxes, and teachers are experimenters; they
schedule reinforcements, rewards, and punishments.
More information about actual sources of reinforcement is needed for teachers.
Reinforcement is relative: what is reinforcing varies among organisms, and a
consequence may be reinforcing initially but punishing later.
Premack Principle = a more probable (preferred) behavior can serve as a
reinforcer for a less probable behavior.
Applications of Aversive Consequences
the case against punishment
Punishment draws attention to undesirable behavior but does little to indicate
what desirable behavior should be.
Instead of eliminating behavior, punishment usually only suppresses it; what is
affected is the rate of responding
Punishment can lead to negative emotions, which may become associated with
punisher rather than undesirable behavior.
Often does not work.
less objectionable forms of punishment
time-out – children are removed from a situation where they might expect
reinforcement and placed in another situation where they are less likely to be
reinforced.
Response-cost punishment – when children who have received tangible reinforcers
for good behavior later have some of these reinforcers taken away for
misbehaviors.
Reprimands – soft verbal or nonverbal communication of disapproval (negative
shake of the head or a frown)
Behavior Management
behavior management = deliberate and systematic application of learning
principles in an attempt to modify behavior
behavior therapy = application of Pavlovian principles
behavior modification = operant learning principles
systematic use of rewards and occasionally of punishment is common in behavior
modification programs
exact relationship between specific behaviors and rewards is spelled out in a
contingency contract
specific, well-planned behavior management strategies are typically more
effective than are more informal, less-organized approaches
Counterconditioning
counterconditioning – when undesirable habits that have been conditioned to
certain stimuli are replaced with different, incompatible responses to the same
stimuli; successful in both medical and psychological settings
systematic desensitization – used in psychotherapy to treat anxieties and
phobias.
Critique
strengths:
well-defined, highly researched system
reflects facts
clear and understandable
allows predictions to be verified
isn’t based on many unverified assumptions
criticisms:
does not explain symbolic processes
says little about contemporary topics such as decision making, problem solving,
perception, etc.
critics are dissatisfied with attempts to explain language through reinforcement
theory
neglected role of biology in learning
defenses:
Skinner’s rejection of the usefulness of mental processes does not mean
rejection of their existence.
Contribution to understanding verbal behavior is overlooked.
Skinner’s explanation for “mentalistic” concepts such as self-awareness is
overlooked.
Some psychologists consider Skinner’s view an assault on freedom and dignity; if
we are controlled by the environment, we cannot be free. Skinner insists that
humans are controlled by their environment, but did humans not build these
environments?
Defenses:
Skinner wasn’t trying to prove that free will does not exist; rather, he was
arguing against what he considered unscientific and futile explanations for
human behavior
Many critics objected not to the theory but to their interpretation of its
implications. They don’t like what humanity seems to be. As Skinner noted, “No
theory changes what it is a theory about; man remains what he has always been”
(1971, p. 215)