Operant Conditioning

We’ve all used operant conditioning. Just admit it. You know, when you take someone into a plain white room and hook them up to electrodes and then start–wait, what??!?!

No, not that.

As it applies to UX, operant conditioning is when an application attempts to change the frequency of specific user behaviors by attaching consequences to them. Confirmation modals, success and zero states, some kinds of error messages, and form validation are all examples. It comes in several varieties, but I want to quickly hit on three common ones:

Positive Reinforcement

Rewards, payouts, praise, humor and value are all forms of positive reinforcement that encourage users to continue to do whatever-they-just-did to get more. It’s the addition (positive) of something wanted.

Negative Reinforcement

Contrary to the colloquial understanding, negative reinforcement is not punishment. Silencing the alarm, hiding the error state, removing the danger, and opening the path are all forms of negative reinforcement that encourage users to continue to do whatever-they-just-did. It’s the removal (negative) of something unwanted. 
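To make "hiding the error state" concrete, here is a minimal sketch in TypeScript. Everything in it is an assumption for illustration: the `FieldState` shape, the `validateEmail` rule (it only checks for an "@"), and the `onBlur`/`onChange` handlers are hypothetical and not tied to any real UI framework. The idea is that the error only ever appears when the user leaves the field, and it is removed the moment they fix the input.

```typescript
// Hypothetical form-field state; not tied to any real UI framework.
interface FieldState {
  value: string;
  error: string | null;
}

// Assumed rule for illustration only: a value is valid when it contains an "@".
function validateEmail(value: string): string | null {
  return value.includes("@") ? null : "Please enter a valid email address.";
}

// When the user leaves the field, show an error if the value is invalid.
function onBlur(state: FieldState): FieldState {
  return { ...state, error: validateEmail(state.value) };
}

// While the user types, only ever remove an existing error; never add a new one.
// Removing the unwanted error is the negative reinforcement.
function onChange(state: FieldState, value: string): FieldState {
  const stillInvalid = state.error !== null && validateEmail(value) !== null;
  return { value, error: stillInvalid ? state.error : null };
}
```

The design choice worth noticing is that `onChange` never introduces an error mid-typing. It can only take one away, so the user's corrective action is what makes the unwanted thing disappear.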

Punishment = Bad

Not going to say a lot here because computers should not be used for punishment. Like the first law of robotics, applications should not harm humans. Don’t do this or… or… you’ll be punished? But to be clear, punishment is the addition of something bad or the removal of something good as a consequence for specific behaviors. Remember that punishment can only be marginally effective at stopping behavior; it can’t encourage behavior. As you know from your horrible experience with your second-grade teacher, you never forget punishment. Not only is it a damaging method for changing behavior, but once the punishment is removed, that behavior often returns!

So What is Proper Reinforcement?

So the studies have been done. You can read more here, here and here, and much of it applies to humans as well as lab rats. Everybody likes good things and hates bad things, and there’s some morality here too: if you’re hoping to induce improper behavior in others, or worse, behavior that is good for you and bad for them, stop reading now.

But in software applications there are “good” behaviors and “bad” behaviors. Or, more accurately, user actions that will get them the value they need and actions that will ultimately frustrate them. Incentivizing proper behavior is smart and will actually contribute positively to the tenuous relationship humans often have with software. And there is a real science to optimizing the change in behavior with proper reinforcement. While you might think that rewarding every “good” behavior is best, it’s not. Users will actually have a higher response rate to reinforcement (i.e. they will do it more often) and a slower extinction rate (i.e. they will keep doing it longer once the rewards stop) if the reinforcement is variable, both in timing and in frequency. In short, this means that we shouldn’t reward our users every time, but closer to 50% of the time.
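As a rough sketch of what variable reinforcement could look like in code, the TypeScript below rewards a "good" action only some of the time and varies the reward when it does. The function names (`shouldReinforce`, `pickReinforcement`), the `praise` messages, and the injectable `random` parameter are all hypothetical, and the 0.5 default is simply the "about 50%" figure from above, not a tuned value.

```typescript
// A source of randomness returning a value in [0, 1); injectable so it can be tested.
type RandomSource = () => number;

// Hypothetical helper: decide whether this "good" action gets a reward at all.
// The 0.5 default mirrors the "about 50% of the time" suggestion in the text.
function shouldReinforce(
  rewardProbability: number = 0.5,
  random: RandomSource = Math.random,
): boolean {
  if (rewardProbability < 0 || rewardProbability > 1) {
    throw new RangeError("rewardProbability must be between 0 and 1");
  }
  return random() < rewardProbability;
}

// Hypothetical pool of messages, so the reward itself also varies.
const praise: readonly string[] = ["Good job!", "Nicely done.", "That worked."];

// Returns a message to show, or null when the app should stay quiet this time.
function pickReinforcement(random: RandomSource = Math.random): string | null {
  if (!shouldReinforce(0.5, random)) {
    return null;
  }
  const index = Math.floor(random() * praise.length);
  return praise[index] ?? null;
}
```

A caller would show a toast only when `pickReinforcement()` returns a string. Passing in a fixed `random` function makes the behavior deterministic for tests, while the default `Math.random` gives the unpredictability that the variable schedule depends on.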

Key Takeaways

  1. Help users find their value by reinforcing supporting actions (e.g. the animation of a lock icon turning unlocked)
  2. Never punish
  3. Mix up the reinforcement (e.g. don’t toast “Good job!” every time)