How often to reward?

The rate at which a behaviour is reinforced determines how quickly that behaviour is learnt and how quickly it becomes extinct once no further reward is given.

The schedules below were identified by B. F. Skinner as part of his theory of operant conditioning.

Types of Reinforcement

Continuous reinforcement

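Reinforcement is given every time the behaviour occurs. Each type below is described by its response rate (how quickly the behaviour is performed while it is being reinforced) and its extinction rate (how quickly the behaviour fades once the reinforcement stops):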

- Response rate is slow

- Extinction rate is fast

Fixed Ratio Reinforcement

Reinforcement is given after a set number of times the behaviour occurs.

- Response rate is fast

- Extinction rate is medium

Fixed Interval Reinforcement

One reward is given after a fixed time interval, as long as the behaviour occurs at least once in that time.

- Response rate is medium

- Extinction rate is medium

Variable Ratio Reinforcement

Behaviour is reinforced after a random number of occurrences.

- Response rate is fast

- Extinction rate is slow (due to the unpredictable rate of reward)

Variable Interval Reinforcement

One reward is given after a random time interval, as long as the behaviour occurs at least once in that time.

- Response rate is fast

- Extinction rate is slow
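For readers who like to see rules written out, the five types above can be sketched as a small Python function. This is only an illustration under stated assumptions, not Skinner's own formulation: the name make_schedule, the labels, and the parameters n and seconds are all made up for this example, the "random" schedules are assumed to average out to n or seconds, and the interval schedules are assumed to reward the first behaviour seen once the interval has passed.

```python
import random

def make_schedule(kind, n=1, seconds=0.0, rng=None):
    """Return decide(now) -> bool, to be called once each time the behaviour occurs.

    kind: "continuous", "fixed_ratio", "variable_ratio",
          "fixed_interval" or "variable_interval" (hypothetical labels).
    n: behaviours per reward for ratio schedules (the average, if variable).
    seconds: time per reward for interval schedules (the average, if variable).
    """
    rng = rng or random.Random()
    state = {"count": 0, "last_reward": 0.0}

    def next_target():
        if kind == "continuous":
            return 1
        if kind == "fixed_ratio":
            return n
        if kind == "variable_ratio":
            return rng.randint(1, 2 * n - 1)    # assumed spread, averages n
        if kind == "fixed_interval":
            return seconds
        if kind == "variable_interval":
            return rng.uniform(0, 2 * seconds)  # assumed spread, averages seconds
        raise ValueError(f"unknown schedule: {kind}")

    state["target"] = next_target()

    def decide(now):
        """now: time of this behaviour, in seconds since training started."""
        state["count"] += 1
        if kind in ("fixed_interval", "variable_interval"):
            due = now - state["last_reward"] >= state["target"]
        else:
            due = state["count"] >= state["target"]
        if due:
            # Reward given: reset the counters and pick the next target.
            state["count"] = 0
            state["last_reward"] = now
            state["target"] = next_target()
        return due

    return decide

# Reward every 3rd sit.
sit = make_schedule("fixed_ratio", n=3)
print([sit(now=t) for t in range(6)])  # [False, False, True, False, False, True]
```

Swapping "fixed_ratio" for "variable_ratio" keeps the same average number of rewards but makes it impossible to predict which sit will pay off.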

When to apply the rates of reinforcement?

When we teach a dog a new behaviour, the rate of reinforcement we use depends on the behaviour we want.

To start, we use continuous reinforcement. We ask the dog to sit, they do so, and they get a reward. They learn to associate sitting with the word "sit". However, we don't want to reward a dog every single time they sit: they come to expect the reward, which then loses its value. We still want them to respond when asked.

We then move to fixed ratio reinforcement. Now that they know how to sit, we might decide to reward every 5 occurrences. The number of occurrences depends on the dog's competence. Moving straight from every time to every 10th time might be too much in the early stages of learning, so we might reward every 3 times, then every 5 times, and so on.

The final stage is to move to variable ratio reinforcement. As listed above, this has the slowest extinction rate. Because the reward is unpredictable, the dog may think that this is the time they get a reward, and the reward keeps a high value. As time goes on, the number of repetitions between rewards grows. You may only reward the sit every few weeks or so, or when you think of it.
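The three stages can be laid out as a rough plan in Python. This is a sketch only: the stage labels, the ratios and the number of sits in PLAN are invented for illustration, and the variable stage assumes a random gap that averages out to the chosen ratio. Adjust the numbers to your own dog.

```python
import random

# Hypothetical plan: (stage label, ratio, variable?, number of sits practised)
PLAN = [
    ("continuous", 1, False, 5),
    ("fixed ratio 3", 3, False, 9),
    ("fixed ratio 5", 5, False, 10),
    ("variable ratio ~5", 5, True, 20),
]

def run_plan(plan, rng):
    """Return {stage label: list of sit numbers (counted from 1) that earn a reward}."""
    results = {}
    for label, ratio, variable, sits in plan:
        rewarded = []
        count = 0
        # Variable stages draw a fresh random gap after every reward.
        target = rng.randint(1, 2 * ratio - 1) if variable else ratio
        for sit in range(1, sits + 1):
            count += 1
            if count >= target:
                rewarded.append(sit)
                count = 0
                target = rng.randint(1, 2 * ratio - 1) if variable else ratio
        results[label] = rewarded
    return results

for label, rewarded in run_plan(PLAN, random.Random()).items():
    print(f"{label}: rewarded sits {rewarded}")
```

In the fixed stages the rewarded sits fall in a regular pattern (every 3rd, then every 5th). In the last stage they land at irregular points, which is what keeps the dog guessing.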

The fixed and variable interval reinforcements are best suited for behaviours that require duration such as requiring the dog to stay quiet, or to hold the stay position in a dog trial.

What about behaviours you didn't train?

The rates of reinforcement also explain why dogs continue to do behaviours we don't want.

Your dog is outside and starts to bark to be let in. Your dog continues to bark and bark and won't be quiet. There may be many reasons why your dog is outside, but that isn't important. What matters is why your dog is still barking and howling after so long.

Perhaps in the past, when your dog barked, you got up and let them in.

Cue: Your dog hears you inside

Behaviour: Dog barks to be let in

Reinforcement: You let the dog in, and they are now sitting next to you getting a scratch.

But they don't get let into the house every time, so why do they still try? Because sometimes it works! Variable ratio reinforcement occurred without you even meaning to.

So if you want to extinguish the behaviour of the dog barking to be let in, you need to make sure the dog is only rewarded when they are performing the right behaviour. Given that the barking has been reinforced using the method with the slowest extinction rate, be prepared for it to take a long time to stop.

You may want to start by rewarding the dog every time they are quiet, then slowly move to fixed ratio reinforcement and finally to variable ratio reinforcement.

