**Does More Deterrence Require More Punishment?**

**[or Should the Punishment Fit the Crime?]**

**John Henderson**

**Business Department**

**Lambton College**

**Sarnia, Ontario N7T 7K4**

**e-mail: john.henderson@lambton.on.ca**

**and**

**John P. Palmer**

**Department of Economics**

**The University of Western Ontario**

**London, Ontario N6A 5C2**

**e-mail: jpalmer@uwo.ca**

Abstract

Deterrence
as a policy issue is usually a societal, or macro, concern. However, the
traditional approach in the economic analysis of law, building on the work by
Gary Becker, treats deterrence at an individual, or micro, level: an increase
in deterrence is accomplished simply by adding marginal criminals to the set of
already-deterred individuals. In the absence of zero prices, raising *both*
the level of enforcement and punishment is, *ceteris paribus*, the *only*
cost-efficient way to increase deterrence because these two factors are
complements in deterring an individual.

In
this paper we aggregate the Becker-type model to derive a societal level
production function for deterrence. We demonstrate that aggregation can cause
enforcement and punishment to become independent or even substitutes in
production. The expansion path for deterrence can be like that for the Becker
model (positively sloped), or it can be vertical, horizontal, or even
backward-bending. Therefore, a cost-effective increase in deterrence can call
for a rise in the level of one deterrent factor while the other is held
constant or even lowered. In the latter case, some previously deterred
individuals may exit the deterred set, while a larger number enter the set.

Our
paper shows that the Becker-type prescription for increased deterrence may not
be optimal because enforcement and punishment may no longer be complements at
the macro level.

We
conclude with references to considerable recent empirical work that is
consistent with our theoretical results.

**March, 2001**

**Key words**

Deterrence, Crime,
Punishment

*JEL*** classification: K (law and economics)**

**1. Introduction**

An important policy problem in the traditional literature on the economics of crime and punishment can be stated as follows.[1] The level of deterrence for a specific crime depends, *ceteris paribus*, on the expected fine faced by persons considering the crime. Suppose that the legal authority selects a given level of deterrence that, in turn, determines the appropriate expected fine. The expected fine can then be decomposed into the product of (1) a fine and (2) a probability of detecting, convicting, and actually punishing criminals. How should the expected fine and its components, the fine and the probability of conviction, change if the legal authority wanted to, say, increase the level of deterrence? The traditional approach in the field of law and economics has been to postulate that positive costs are associated both with the imposition of a punishment and with increasing the probability of conviction; under these postulates, only an increase in both the fine *and* the probability of conviction can be optimal (i.e., cost-minimizing for any given level of deterrence).[2] However, we will show that this conclusion does not necessarily follow when the theoretical models are aggregated across individuals with different tastes and preferences. Rather, an increase in the desired level of deterrence can call for a lower fine; indeed, it might even call for a lower *expected* fine. Furthermore, these results cannot readily be explained within the traditional law and economics framework.
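As a small numerical illustration of the decomposition of an expected fine into a fine and a probability of conviction, the following sketch lists several combinations that yield the same expected fine. The function name and all numbers are hypothetical and chosen only for illustration; they do not come from the literature cited here.

```python
def fine_for_expected_fine(expected_fine, p):
    """Return the fine f such that p * f equals the target expected fine."""
    if not 0 < p <= 1:
        raise ValueError("p must be a probability in (0, 1]")
    return expected_fine / p


# Hypothetical target: an expected fine of 100 (arbitrary currency units).
target = 100.0
for p in (0.10, 0.25, 0.50, 1.00):
    f = fine_for_expected_fine(target, p)
    # Every row has the same product p * f, but a very different mix.
    print(f"p = {p:.2f}   f = {f:8.2f}   p*f = {p * f:.2f}")
```

The policy question is which of these combinations, or which change in the product itself, is the least costly way of obtaining a given level of deterrence.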

In our approach, we follow the traditional literature in our analysis of deterring any one individual from committing a crime. We make use of the fact that deterrence of any given crime is a discrete variable: either the crime is deterred or it isn’t. We begin with the individual potential criminal, discussing optimizing behaviour for just one person. We then relax an assumption that is usually implicit in many models, namely that all individuals are identical (or, if different, then different in such a way that aggregating across them causes no problems). Once this assumption is relaxed, aggregating the individual results across a large group of different people leads to a very important potential confusion about substitution and complementarity in the production of deterrence. From this step, we approach the problem using conventional production function theory; that is, we treat the probability of conviction and the size of the imposed fine as inputs into the production of deterrence, and explore how the productivity of these inputs affects the decomposition of an expected fine or punishment into its two components: the probability of being punished and the size of the punishment.[3]

In the next section we look at the problem of deterring individuals; then we look at the aggregation problem. In the ensuing sections, we develop a model of optimal selection of (1) the probability of being convicted and punished and (2) the amount of the fine. We establish the necessary and sufficient conditions for an equilibrium choice and examine the comparative statics associated with changes in the level of deterrence, and we show that the effects on factor productivity of changes in factor usage play a decisive role in the decomposition of the expected fine. Further, we use our results to establish the conditions under which it is efficient for the punishment to fit the crime, and we see that having the punishment fit the crime might not always be optimal.

**2. Expected Utility and Potential Criminal Behaviour**

We begin by considering any
particular individual who might be contemplating whether to commit a specific
crime. We assume that the person maximizes expected utility, and that the
decision the person must make is a zero-one decision — commit the crime or
don’t commit the crime. We let ** p** be the

**(1) V(p,f) = (1 - p)U(w + b) + pU(w - f) > U(a) = A,** where

**V** is the expected utility function,

**U** represents the utility of wealth,

**w** is the person's initial wealth (before committing the crime),

**a** is the person’s wealth if the non-crime alternative is selected (

**b** is the gain the person receives from committing the crime (we assume this gain is confiscated upon conviction)

This bare-bones expected utility function reduces the
individual's choice to one of either committing the crime or not committing the
crime. Furthermore, all the variables are exogenous to the individual; there
are no continuous choice variables in this simplified model. With this
simplification, it is easy to explore the effects on criminal behaviour of
changing the levels of the parameters,** b**

It is easily seen that **V_{p} = -U(w + b) + U(w - f)**, which is always negative, assuming the individual has a positive marginal utility of wealth, because

**3. Aggregating beyond one individual**

Because we have up to now been considering only one individual and only one crime decision, the result of the simplified model considered in the previous section is that each individual either commits the crime or doesn't commit the crime. Hence, either the **(p,f)** combination deters crime or it doesn't.
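The zero-one nature of the individual's decision can be sketched in a few lines. The code below evaluates the expected utility in equation (1) and finds, by bisection, the smallest fine that deters a given individual at a given probability of conviction. The function names, the treatment of a tie (V equal to A) as deterrence, the risk-neutral utility function, and all parameter values are assumptions made for illustration only.

```python
def expected_utility_of_crime(p, f, w, b, U):
    """V(p, f) = (1 - p) * U(w + b) + p * U(w - f), as in equation (1)."""
    return (1.0 - p) * U(w + b) + p * U(w - f)


def is_deterred(p, f, w, b, a, U):
    """True when V(p, f) does not exceed A = U(a); a tie counts as deterred."""
    return expected_utility_of_crime(p, f, w, b, U) <= U(a)


def threshold_fine(p, w, b, a, U, f_max, tol=1e-9):
    """Smallest fine in [0, f_max] that deters, found by bisection.

    Relies on V being decreasing in f, which holds when the marginal
    utility of wealth is positive. Returns None if f_max does not deter.
    """
    if is_deterred(p, 0.0, w, b, a, U):
        return 0.0
    if not is_deterred(p, f_max, w, b, a, U):
        return None
    lo, hi = 0.0, f_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_deterred(p, mid, w, b, a, U):
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical risk-neutral individual: U(x) = x, w = a = 1000, b = 100.
risk_neutral = lambda x: x
for p in (0.2, 0.5):
    f_star = threshold_fine(p, 1000.0, 100.0, 1000.0, risk_neutral, 1000.0)
    print(f"p = {p:.1f}: smallest deterring fine is about {f_star:.2f}")
```

For this one individual, a higher probability of conviction lowers the fine needed to deter, and every (p, f) pair falls cleanly on one side of the threshold or the other.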

If individuals
were identical, then any combination of ** p** and

At the level
of the individual, ** p** and

**Figure 1. An individual is deterred by combinations of p and f for which V<A.** [Figure not reproduced. Vertical axis: *V*; horizontal axis: *f*. Curves: *V(p',f)* and *V(p",f)*; reference level: *A*; marked fines: *f"* and *f'*.]

(Note:
in this figure, even though ** p**">

Later, we will use an extension of Figure 1 to show that, in the aggregate, the inputs into the production of deterrence may be complements, substitutes or independent. That is, if we think of deterrence as an output produced according to the production function **Q = D(p,f)**, aggregation can give either

When different
individuals are considered within the community, it is difficult to assume that
a given level of police effort will lead to the same probability of detection, apprehension, and conviction for
each of them. It seems more likely that some individuals have more skills than
others at avoiding conviction and punishment. Given these differences, it is
probably inappropriate to talk about a given ** p** for an aggregated
collection of individuals. Consequently, as we extend the analysis to cover a community,
we must now think of

Below, we show
an extension of the analysis to a community composed of different individuals
with different preference functions and different abilities. In Figure 2, the
negatively sloped curves represent the expected utilities of crime for five
different individuals, from the map of all individuals in the ** V-f **plane.
Without loss of generality, we assume that

**Figure 2. Different Effects of Increasing the Probability of Being Convicted and Punished.** [Figure not reproduced. Vertical axis: *V*; horizontal axis: *f*. Labels recovered: *V_{1}*, 0, *f'*.]

The expected
utilities of crime in Figure 2 are drawn for a level of police activity, ** p**'. At fine

From Figure 2, it becomes clear that the relationship
between the size of the punishment, the amount of policing activity, and the
resulting amount of deterrence of crime in the economy is not necessarily a
nice, neat, well-defined function. Nevertheless, many writers argue their
conclusions from the implied assumption that the aggregation problem doesn’t
exist or doesn’t matter. Recent summary articles by Cameron[6]
and by Ehrlich[7] are examples
of this omission. Also, the recent special issue on “Penalties: Public and
Private” of the *Journal of Law and Economics* (April, 1999) contains no
articles which concern themselves, even in passing, with this aggregation
problem. Interestingly, the work by Kessler and Levitt[8]
in this latter volume takes notice of the ambiguities in empirical tests for
whether a deterrence effect exists in addition to an incapacitation effect. But
they, too, seem unaware of the aggregation problem and that this problem might
very well be the source of some of the observed empirical ambiguities.

Ehrlich and Liu[9], in the same volume, present evidence that “…the probability of imprisonment has a greater deterrent effect than its severity” (p. 486). In drawing this conclusion, they seem unaware that there might be an aggregation problem when jumping from a model of deterring an individual to a model of deterring crime within a community.

Perhaps the clearest examples of the potential problems of jumping from a model of individual behaviour to a policy discussion of deterrence within a community appear in recent works by Polinsky and Shavell[10]. In their articles, the leap is implied, at best, but the aggregation problem does not appear to have occurred to them.

And yet, as we demonstrate in our discussion of Figure 2, it might very well be incorrect to apply to a community, as a whole, a model of deterrence designed to explain the behaviour of only a single individual. And it might very well be the case that the sometimes ambiguous empirical results reported in the literature are a result of this aggregation problem.
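The aggregation step can be made concrete with a small simulation. The sketch below builds a hypothetical community of risk-neutral individuals who differ in wealth, in the gain from crime, in their non-crime alternative, and in how strongly a given level of policing translates into their own probability of conviction. It then computes the share of the community deterred at a policy (p, f) and a finite-difference analogue of the cross effect of p and f on that share. Every distribution, attribute name, and parameter value is an assumption made for illustration; the sign of the cross effect depends on the assumed population and on where it is evaluated.

```python
import random


def make_population(n, seed=0):
    """Hypothetical community: all distributions are illustrative only."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        w = rng.uniform(500.0, 1500.0)
        people.append({
            "w": w,                              # initial wealth
            "b": rng.uniform(10.0, 500.0),       # gain from the crime
            "a": w + rng.uniform(-50.0, 50.0),   # non-crime alternative
            "exposure": rng.uniform(0.2, 1.0),   # scales p for this person
        })
    return people


def deterred_share(p, f, population):
    """Share of individuals with V <= A at policy (p, f), with U(x) = x."""
    deterred = 0
    for person in population:
        p_i = p * person["exposure"]  # individual-specific probability
        v = (1.0 - p_i) * (person["w"] + person["b"]) + p_i * (person["w"] - f)
        if v <= person["a"]:
            deterred += 1
    return deterred / len(population)


def cross_effect(p, f, population, dp=0.05, df=50.0):
    """Finite-difference analogue of the cross-partial of the share."""
    d = lambda pp, ff: deterred_share(pp, ff, population)
    return (d(p + dp, f + df) - d(p + dp, f) - d(p, f + df) + d(p, f)) / (dp * df)


community = make_population(2000, seed=42)
for p in (0.2, 0.5, 0.8):
    for f in (200.0, 600.0, 1000.0):
        share = deterred_share(p, f, community)
        cross = cross_effect(p, f, community)
        print(f"p = {p:.1f}  f = {f:6.0f}  share = {share:.3f}  cross = {cross:+.2e}")
```

Each individual in this sketch behaves exactly as in the single-person model; any variation in the sign of the cross effect across the grid comes only from how the individual thresholds are distributed.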

**4. A Model of Optimal Fine and Probability of Conviction**

Once the aggregation problem is recognized, it becomes
clearer that when the community is trying to produce deterrence, it may have
little *a priori* reason for believing that the inputs (policing resources
and punishment) are substitutes or complements. Each community might, in fact,
have to explore the possibilities for its own residents. In general, it is easy
to think of policing, ** p**, and fines,

Let ** Q** be a monotonically increasing
index of deterrence of a particular crime.
We assume that

We assume that *D _{p},D_{f} > 0*

We assume that there is an autonomous legal authority that exogenously sets the desired deterrence level for any specific crime. The legal authority might be an elected person or body (or possibly some benevolent or not-so-benevolent dictator). We do not explore the public choice implications of having such an authority, nor do we examine the mechanisms implemented for selecting the desired level of deterrence. We do, however, assume that the society is composed of a large number of individual units, and that their preferences are somehow conveyed, albeit perhaps imperfectly, to this legal authority.

**Figure 3.** In this case *p* and *f* are very close substitutes and *f* behaves like an “inferior” input: to obtain an efficient increase in the desired level of deterrence, society should seek a higher probability of conviction and punishment, along with *less* punishment.[11] [Figure not reproduced. Vertical axis: *p*; horizontal axis: *f*. Elements: isoquants *Q_{o} = D(p,f)* and *Q_{1} = D(p,f)*, isocost *I_{o}*, expansion path, point *M*; axis marks *p_{o}*, *p_{1}*, *f_{o}*, *f_{1}*.]

Now, suppose that this authority exogenously sets the
desired level of deterrence at ** Q_{o}**. We assume that the
authority seeks to minimize the total cost,

(2) ** L(p, f, **

The first-order conditions for a minimum of (2) are:

(3a) *L _{f}* =

(3b) *L _{p}* =

(3c) *L _{λ}* = *Q _{o}* -

The second-order condition is that the determinant of the bordered Hessian of second-order partials be negative, *H* < 0.
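The cost-minimization problem can also be explored numerically. The sketch below uses a brute-force grid search to find the cheapest (p, f) combination that reaches a target level of deterrence. The names `cost_p` and `cost_f` are hypothetical unit costs of policing and of fines, and the Cobb-Douglas form of the deterrence function is an assumption adopted only because its solution is easy to check by hand. It has a positive cross-partial, so it reproduces the Becker-type case and says nothing about the shape of an aggregated deterrence function.

```python
def cheapest_mix(D, q_target, cost_p, cost_f, p_grid, f_grid):
    """Grid search for the least-cost (p, f) with D(p, f) >= q_target.

    Returns (p, f, cost), or None if no grid point reaches the target.
    """
    best = None
    for p in p_grid:
        for f in f_grid:
            if D(p, f) >= q_target:
                cost = cost_p * p + cost_f * f
                if best is None or cost < best[2]:
                    best = (p, f, cost)
    return best


# Hypothetical deterrence function (illustrative assumption only).
def deterrence(p, f):
    return (p * f) ** 0.5


p_grid = [k / 100 for k in range(1, 101)]   # p in (0, 1]
f_grid = [k / 100 for k in range(1, 401)]   # f in (0, 4], arbitrary units

# Trace a few points on the expansion path for rising deterrence targets.
for q in (0.4, 0.5, 0.6):
    result = cheapest_mix(deterrence, q, 4.0, 1.0, p_grid, f_grid)
    if result is not None:
        p_star, f_star, cost = result
        print(f"Q = {q:.1f}: p = {p_star:.2f}, f = {f_star:.2f}, cost = {cost:.2f}")
```

With this particular functional form both inputs rise together as the target rises. That is a property of the assumed function, not a general result; replacing `deterrence` with an aggregated function is what allows other expansion paths to be examined.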

Equations (3) can be solved to find the optimal values** p_{*}**
and

**(4)**

That is, to minimize the cost
of achieving the deterrence level ** Q_{o}**, the fine and
probability of conviction (and punishment) should be chosen so that the ratio
of their marginal products equals the ratio of their prices. Thus, equilibrium occurs at a point where
the isoquant

Next, we investigate the comparative statics associated
with equations (3). Specifically, we
wish to examine the expansion path for equilibrium. Suppose that the legal authority decides to increase the level of
deterrence to some new level, ** Q_{1}** >

By the Implicit Function Theorem, we can solve equations
(3) for the exogenous variables in the system, and substituting these solutions
back into equations (3) yields these equations as identities. Differentiating these identities with
respect to ** Q** and solving gives

(5a) _{}

(5b) _{} _{}

Note that the first term in the right-most brackets of equations (5a) and (5b) is negative by the strict concavity of *D*(·), and that the sign of **D_{pf}** is indeterminate.

Inspection of
equations (5) reveals, as one would reasonably anticipate from standard
production theory, that the shape of the expansion path depends, *ceteris paribus*, on the cost of
administering fines and on the substitution properties of ** p** and

But we cannot demonstrate on theoretical grounds alone
that seeking more deterrence will always, unambiguously, require the imposition
of higher fines. Consider the situation
with ** p**
and

A cost efficient increase in deterrence that calls for an
increase in policing but a decrease in the fine is not possible at the level of
the individual, because ** p** and

Accordingly,
we define the elasticity of the marginal product of factor ** i** with respect to factor

** _{} ** for

Then a sufficient condition for
an increase in deterrence, ** Q**, to lead to a reduction in the
optimal fine is that the own elasticity of policing be greater than the cross
elasticity of the fine, or

(6) _{}.

To see why, expand equation (6)
and multiply both sides of this inequality by ** D_{p}D_{f}** to get

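**Figure 4.** [Figure not reproduced. Vertical axis: *p*; horizontal axis: *f*. Elements: isoquants *Q_{o} = D(p,f)* and *Q_{1} = D(p,f)*, isocost *I_{o}*, expansion path, and a hyperbola labelled *E_{o} = p_{o} ∙ f_{1}*; axis marks *p_{o}*, *p_{1}*, *f_{o}*, *f_{1}*.]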

One might reasonably expect, after having studied the
traditional treatments of the economics of crime,[13]
that even if it is possible to obtain increased deterrence optimally with a
decreased fine (and increased policing), surely society must increase the
expected fine, ** E=p∙f**, to obtain the desired increase in deterrence. Such
a conclusion might be incorrect, however. Consider Figure 4, which is a
reproduction of Figure 3 with one curve added. This curve is a rectangular
hyperbola representing a constant value for the expected fine. It is drawn
through the expansion path at the point

To investigate this possibility further, we differentiate
** _{} **with respect to

_{}

_{}

Now, the last term on the right
hand side above is negative from the first- and second-order conditions.
Therefore, by a method similar to that used in equation (6) above, we can show
that a sufficient condition for _{} is that

(7) _{}.

That is, a sufficient condition
under which the optimal *expected*
fine, ** E*,** will decrease as the desired level of deterrence increases
is that the sum of the own elasticities exceed the sum of the cross
elasticities.

**5. Expansion Paths and Marginal Deterrence**

It can be seen that ** D_{pf} > 0** is sufficient
for an expansion path that moves in a northeast direction in Figures 3 and
4. In these instances, the optimal size
of the punishment will vary directly with the amount of deterrence desired and
hence will vary with the severity of the crime, assuming that society wishes
more deterrence for more serious crimes.
However, we have also shown that the expansion path could have a
negative slope for

We can link our results for the expansion path to the
problem of marginal deterrence by making an assumption about why the legal
authority might increase the deterrence level, ** Q**. First, fixing the crime at the theft of
hubcaps, an increase in

Alternatively, in addition to hubcap theft, consider the
full range of complementary crimes involving automobiles (e.g.,
break-and-enter, car theft, car-jacking), and let ** q** be a
monotonically increasing index of the legal authority’s evaluation of the
seriousness of these crimes. They might
range from, say, theft of a gascap to carjacking with murder, with the crime of
hubcap theft presumably lying somewhere in between these extremes. Now, assume that, over the range of these
crimes, the desired

This assumption permits us to examine how an increase in
the seriousness of the crimes affects the choice of ** p** and

(8a) _{}

(8b) _{}

The term _{} is positive, so that equations
(8) have the same implications for a change in ** q** as equations (5)
had for a change in

It is reasonable to expect the legal authority to choose
to seek more deterrence for more serious crimes; it also seems reasonable to
expect that the residents of society will want to choose ** f **and

**6. Empirical Relevance**

While it is theoretically possible to develop a model, as
we do in sections 3, 4, and 5, showing that it might be optimal to increase
deterrence by reducing the size of the punishment, it turns out that this
result has some empirical explanatory power as well. Many studies have indicated
that deterrence and/or the reduction of recidivism can best be promoted by
increasing the probability that criminals will be punished.[14]
There is even some indication in these studies that the size of the punishment
is less important than the expected probability that perpetrators will be
convicted and punished. To the extent that these implications from these
studies are correct, it is difficult to reject out of hand the results of our
model as “theoretically possible but empirically implausible.” Instead, it
appears, both from our theoretical model and from these studies, that by
focusing on the amount of the punishment rather than on the probability that
criminals are punished at all, some policy-makers may be emphasizing the wrong
variable. This finding runs directly counter to a resolution of the U.S. Congress stating that “Congress encourages all … states to adopt as quickly as possible legislation to increase the time served by violent felons.”[15] At the very least, we can readily see that emphasizing only one variable in the production of deterrence is *not* consistent with efficiency in the standard Becker-type model of crime, but can easily be efficient within the context of our revision of the model to allow for aggregation effects.

Public opinion in some jurisdictions seems to be at least somewhat consistent with the reported empirical findings.[16] For example, a recent poll indicated that Canadians are not very confident in the effectiveness of the prison system, but have more confidence in the courts and even more confidence in the law enforcement authorities. One possible implication of these results is that people are not so much interested in having the size of the punishment increased for any given crime, but they do seem to attach importance to making sure that perpetrators face a much higher probability of being punished.

Another example might come from the recent security
procedures at the Summits of the Americas held first in Seattle, Washington,
and then in Québec City, Québec. Policy makers appear to have learned from the
first of the two summits that increasing ** p**, the policing effort,
while not increasing (or possibly even decreasing)

This interpretation of these empirical studies, along with our theoretical results, suggests that the traditional economic approach to crime and punishment may be mistaken. It may be that it is incorrect to generalize from an individual to an aggregation of individuals.

**7. Conclusions**

The traditional approach to the economics of crime and punishment has mistakenly focused on the expected fine or expected punishment as society’s choice variable for creating deterrence. This focus has led many writers to try to explain why we don’t choose very small probabilities of conviction along with very large punishments. The unfortunate result of this focus has been that scholars have devoted far too much time to discussions of marginal deterrence problems or philosophical issues in an attempt to explain why the punishment should fit the crime. If, instead, we address the problem of optimal deterrence from a standard production function approach, we can more easily identify the conditions under which it is optimal to have the punishment fit the crime.

Interestingly, when these conditions are not met, that is, when the expansion path of Figure 3 or 4 bends backward, then it would not be optimal to have the punishment fit the crime; rather, it would be desirable to increase police protection and enforcement, substituting these activities for higher fines (or other forms of punishment) to obtain the desired level of deterrence.

We wonder if perhaps the crime of assassinating a political leader might also fall into this category. Society views such a crime as being so serious that it devotes considerable resources to its prevention and to increasing the probability that the perpetrator would be apprehended and convicted. The punishment for the perpetrator might be no more than for the murder of any other person even though the crime might be viewed as more serious. At this point, it is likely that an increase in the level of punishment might have very little additional deterrence value, and that the desired additional deterrence must be obtained through prevention and detection. The expansion path becomes vertical and might even conceivably bend backward if prevention and detection become extremely good substitutes for punishment in the deterrence production function.

The key to our results lies in our exploration of the production of deterrence. While the traditional literature begins by looking at individual utility functions, we show that by aggregating our basic results, we can avoid the fallacy of composition. This avenue of research led us to look at production functions for deterrence. A secondary point, to give our results empirical relevance, is that these results should be interpreted as dealing with the probability of receiving the punishment (not just conviction) in comparison with the level of the punishment.

Throughout, our point has been very simple: it may not be correct to aggregate from individual analysis to societies without considering the aggregation problem. Many of society’s scarce resources have been deployed trying to understand why the simple, individual models of deterrence do not have the expected explanatory power in empirical tests when, in fact, the problem may not lie with the models or with the empirical tests – rather the problem may simply be another example of the fallacy of composition.

[1] See the Winter 1996 Symposium on *The Economics of Crime* in **The Journal of Economic Perspectives**, Volume 10, Number 1, especially Isaac Ehrlich, “Crime, Punishment, and the Market for Offenses,” and the references contained therein.

[2] The most recent exposition of this analysis is provided by A. Mitchell Polinsky and Steven Shavell, “The Economic Theory of Public Enforcement of Law,” **Journal of Economic Literature**, Volume 38, Number 1, March, 2000, pp. 45–76. Unfortunately, these writers seem unaware of the potential aggregation problem in going from the individual to the societal level of analysis.

[3] The traditional models are usually written in terms of fines and expected fines. We tend to carry on this terminology in this paper, but there is no reason that the analysis would not apply equally to other forms of punishment.

[4] We assume that the option for comparison is not committing the crime. It is likely, of course, that for many criminals the choice is which crime to commit. In such cases, the result of increased deterrence of one particular crime will simply be an increase in some other crime, not necessarily less serious than the one deterred.

[5] The curves are straight lines for risk neutral, convex from below for risk averse, and concave from below for risk-seeking individuals.

[6] Samuel Cameron, “The Economics of Crime Deterrence: A Survey of Theory and Evidence,” **Kyklos**

[7] *Op. cit.*, 1996.

[8] Daniel Kessler and Steven D. Levitt, “Using Sentence Enhancements to Distinguish between Deterrence and Incapacitation,” **Journal of Law and Economics** 42 (April, 1999): 343-63.

[9] Isaac Ehrlich and Zhiqiang Liu, “Sensitivity Analysis of the Deterrence Hypothesis: Let’s Keep the Econ in Econometrics,” **Journal of Law and Economics** 42 (April, 1999): 455-87.

[10] A. Mitchell Polinsky and Steven Shavell, “On the Disutility and Discounting of Imprisonment and the Theory of Deterrence,” **Journal of Legal Studies**, 28 (January, 1999): 1-16; see also n2,

[11] In Figures
3 and 4, *f _{1}*

[12] However, consider the situation in which collecting the fine is costless so that *s* = 0. In this case the isocost is horizontal, and the outcome for equilibrium depends on the shape of the isoquant map associated with the production function *D*(*p, f*). If the isoquant turns upward (i.e., the marginal productivity of fines eventually becomes negative), there would still be a limit to the optimal size of the fine. But if the isoquant never turns up, no matter how large the fine, then the optimal fine would be infinite when **s = 0**. See Palmer and Henderson, “The Economics of Cruel and Unusual Punishment,”

[13] See, for example, Cooter and Ulen, *Law and Economics*

[14] In addition to the references provided *supra*, nn. 1, 6, and 7, see Vijay K. Mathur, “Economics of Crime: An Investigation of the Deterrent Hypothesis for Urban Areas,” **Rev Econ and Stat** 60:459-66 (Aug, 1978); Paul E. Tracy, Marvin E. Wolfgang, and Robert M. Figlio,

[15] H Con Resolution 105. September 29, 1995.

[16] The CTV/National Angus Reid Poll, July 11, 1997; see http://www.angusreid.com/pressrel/crimeandjustice_july1997