DIAGNOSING JUDICIAL PERFORMANCE: TOWARD A TOOL TO HELP GUIDE JUDICIAL REFORM PROGRAMS

Linn Hammergren
World Bank


This is a working draft, prepared for Transparency International. The contents reflect only the author's views and in no way should be equated with the official position of the World Bank.

TABLE OF CONTENTS

Introduction
Other Efforts and Lessons of Experience
General Principles, Assumptions, and Working Hypotheses
A Methodological Detour: If the Goal is to Eliminate Inappropriate Behaviour, Why Not Focus on it Directly?
The Dependent Variables: What Are We Trying to Predict?
Elements of a Checklist
Table I: A Proposed Checklist for Evaluating Judicial Performance
Methodology: How is the Checklist Developed?
Methodology: How is the Checklist Applied?
Methodology: Scoring
Targets and Use
Some Caveats and Further Considerations
References
Annexes 1-6
Annexes 7-9


DIAGNOSING JUDICIAL PERFORMANCE: TOWARD A TOOL TO HELP GUIDE JUDICIAL REFORM PROGRAMS
"These [checklists of indicators of judicial independence] are excellent heuristic devices, but they are not as useful for assessing whether a court is more or less independent than one would hope. . . . An index, which combines scores on diverse criteria to produce a single number, is the standard social science technique for reducing the data to a form that permits comparison; but information about the severity of violations and the relative importance of different measures is lost. . . . In the absence . . . of reliable indicators, the best method may be to allow those who have the most contact with the judicial system, the lawyers, to offer their own evaluation of trends in their countries." (Widner, p. 178)

"But why would you want to do that?" (Comment from a participant in several judicial reform programs in response to TI's checklist proposal)

Introduction

This paper responds to a request from the U.S. Chapter of Transparency International's ad hoc working group on judicial integrity. The author has been asked to develop a checklist for evaluating the transparency and related aspects of judicial performance, suggest how it might be applied, and discuss its use to promote judicial reform. The idea for the list clearly draws on Transparency's experience with its Corruption Index, but the working group just as evidently is expecting differences in methodology as well as in content. Nonetheless, there are important similarities. Like the Index, the list is not primarily a research tool, but is intended to promote reform programs. It is thus aimed at a diverse audience of national governments and judiciaries, their citizens, foreign and domestic investors, assistance agencies, and other potential reform constituents. Consequently, it must target common areas of interest and understanding. The list should also be suitable for global application. It is not to be written with any specific legal system or tradition in mind, but should capture certain universal factors that would help identify real or potential problems in judicial operations.

I have accepted the invitation with some trepidation, inasmuch as such checklists or judicial report cards have been an item on the agenda of judicial reformers for at least fifteen years.1 Over that period I have collected some half dozen examples (see annexes), participated in a few efforts, and seen most float into oblivion. The present task is somewhat easier than what my predecessors attempted. Rather than encompassing all of judicial performance, it is to focus on those aspects related to efficacy, transparency, accountability, and independence. I have been told, for example, that the adequacy of the legal framework will be handled by someone else and that I may assume its existence. Given its purposes and target audiences, the list should also be simpler and less exhaustive than the massive judicial inventories prepared prior to an actual reform program. This narrowing of the topic eases the burden but does not affect the overriding methodological challenge - our ability to identify a short list of characteristics that are good predictors of downstream behaviour for the universe of judicial systems.

I will begin with my own list of explanations and caveats. First, I have accepted the invitation not because I think I can adequately fulfil the working group's expectations, but because they are targeting issues receiving far too little emphasis in current judicial reform efforts, especially those sponsored by foreign assistance agencies. Whatever is implicitly or explicitly understood as the goal of judicial reform, it arguably transcends mere technological innovation or "modernisation," but these have increasingly occupied the resources, plans, and indicators of progress incorporated in reform programs. For a variety of reasons, ranging from the difficulty of defining them and their highly political nature to our own ignorance as to how to proceed, internal and external reformers have tended to shy away from the more qualitative aspects of judicial performance. In doing so, they run the risk of producing superficially modern, but otherwise unsatisfactory, organisations in their wake. As I have written elsewhere,2 many reforms, while implicitly acknowledging the importance of these imponderables, have resorted to providing judiciaries with the tools to produce improvements, trusting that they would be used to these ends.3 Experience suggests that at least over the short run, more direct action is required.

Second, at the present time, I doubt that anyone is in a position to produce the perfect checklist, first because our knowledge of the factors shaping judicial performance is too imperfect; second because on a global level, both judicial operations and the standards for evaluating them vary widely; and third because in the best of worlds, we are talking in terms of probabilities, not absolute laws. (There is also a fourth reason, addressed in the next point - that judicial performance depends on more than the judiciary). However, with the understanding that perfection is not a reasonable goal, we can advance some general rules of thumb, identify potentially problematic situations, and on this basis prioritise areas where reform is most needed. There will always be exceptions -- judiciaries which despite violating every rule, seem to function better than most, or those which despite an optimal organisation still harbour an abundance of undesirable practices. Nonetheless, in the majority of cases, the rules of thumb should be a useful guide for would-be reformers - and probably more so than the special circumstances explaining the exceptions.

Third, assuming our ability to devise the checklist (which I do promise to advance in the bulk of this report), the judiciary is not an isolated organisation, but rather operates within a surrounding institutional environment. That environment in turn puts limits on what even the most perfectly structured judiciary can accomplish and thus on our ability to resolve judicial problems by addressing only the judiciary or even its immediate links with society writ large. Two years ago, I was a member of a team asked to devise a rule of law reform program for a country recently emerged from a civil war and to our eyes (correctly, it turned out) on the brink of a second one. We were able to identify a series of necessary changes in judicial and police operations. However, our recommendation was to postpone any action until there was a government in place with sufficient interest in promoting an equitable, transparent, rule of law system and even then to proceed with utmost caution. This was an extreme case, but not a unique one.

Even in instances of less dramatic social breakdown, there are any number of environmental factors which will undermine the performance of the best structured system. Highly inequitable distributions of resources (not just wealth, but also status, information, education, and relationships); extreme regional, ethnic, and social cleavages; value systems and social expectations in conflict with formal norms; institutional breakdowns in other sectors; or government's inability to resolve a wide range of pressing critical problems will all interfere with the ability of the judiciary to perform its own work. The scarcity of financial and other resources can inhibit reform efforts in still more mundane fashion. Where monies do not exist to pay salaries or there is no pool of qualified candidates for the bench,4 reform designers will have to be innovative in adapting their rules of thumb.

Finally, it is well to recognise that both in the more developed nations and in many developing countries where reforms have been underway for some time, there is an emerging body of criticism directed at some of our most traditional beliefs about the judicial role and the way it is conventionally performed. I won't address this discussion here, but it has obvious relevance for the task at hand. My checklist, like its predecessors, is based on a conventional, minimalist understanding of judicial performance, one which posits that a well functioning judiciary will apply the law and underlying social norms in an equitable, predictable, and transparent fashion, will be protected from extraneous political and other pressures, and will be governed by some mechanisms to ensure both internal and external accountability. I will not delve into questions of the adequacy of judicial decision making for resolving different forms of conflict; debates about the public or private nature of the judicial good and the implications for what should be provided as a public service and how costs will be assigned; the desirability and possible limits of judicial independence; the recognition of alternative (indigenous) legal systems; or suggestions that certain traditional judicial functions be redistributed among a variety of judicial and nonjudicial bodies.5 Nonetheless, countries in the process of a radical redesign of their judiciaries should be aware of these arguments to avoid repeating some apparently ill-advised policies.6

Other Efforts and Lessons of Experience

Some additional inquiries revealed that my first estimate of the number of relevant prior efforts was far too conservative. Unfortunately, I didn't err as to their overall success. Collectively, they have significantly expanded our notions as to what to look at in reviewing judicial performance, but they have yet to provide a single list of variables. One impediment has been their differing sponsors, themes, and intended applications. None, it should be noted, coincides perfectly with what the working group is proposing, although a few come close in their hidden if not overt agenda.

Roughly speaking, prior efforts include three kinds of activities. The first are the judicial or sector inventories typically conducted following a decision to undertake a reform project and thus as an input to project design. Their terms of reference (i.e. checklists) were usually developed to inform a single assessment and subsequently adopted or proposed as guidelines for others. Often running to dozens of pages, they are intended to ensure that the assessment team collects all information of potential relevance for program design. Early assessments sponsored by USAID in Central America7 produced several volumes of descriptive analysis and quantitative profiles for each country studied, and, it has been said, amassed far more data than anyone ever used. Some subsequent efforts,8 sponsored by the World Bank, have produced equally massive documents and, in addition to the usual statistical and descriptive sections, incorporated input from focus groups and surveys of users and judicial personnel.

Project design will always require detailed inventories, but most of the bits of information they contain have no significance in isolation. Planners need to know the number of judges, administrative, and support staff; prison population, condition of facilities, and budget; number and distribution of courts and how each is furnished and staffed; number of vehicles and other equipment, to whom they are assigned, and how used; person hours per year devoted to judicial training, who is trained, and in what subjects; and total and average caseload in the aggregate and by different types of cases. They also require more descriptive analysis as regards the content of basic codes; rules and real practices for selecting judges and staff; or organisation and activities of bar associations. Few of these items in and of themselves tell us much, even in the context of a single system and certainly not as a basis for cross-system comparison. Collectively, these and a mass of similar data help analysts develop a picture of how a national system operates, and where its weaknesses may lie. Apart from their utility for the programs already proposed, the resulting assessments did and continue to do a great service for countries where such overviews had never existed. They exposed unimagined problems and revealed the inaccuracy of some conventional wisdom about what was wrong and why.9 They substantially expanded our collective knowledge about justice operations and are still a source of ideas as to where reforms might focus.

The size of these first efforts has a second explanation. The absence of prior studies, reliable databases, and even much understanding of what was wrong made it difficult to predict what might be relevant. The only way to reach that determination was to collect all the information and analyse it en masse. Early studies and improved national statistics have eliminated part of this problem, although contemporary inventories still assume massive proportions, constrained only by tighter budgets or timeframes. A recognition that these fishing trips pulled in much that was not relevant thus generated a demand for a second approach: a shorter set of questions that could more quickly identify problem areas prior to intensive data collection. The exhaustive inventories could then be targeted to these areas, and the amassing of interesting but unnecessary information curtailed.

This second exercise comes closest to what the working group has envisioned, though its purpose, application, and intended audience are still somewhat different. The demand arose largely from assistance agencies' desire for a simple analytic tool that would allow them to determine the need for reform and the most productive areas for their work. The intended application is a series of individual countries; the audience, the donors and their local counterparts in each; and the purpose, to guide joint determination of the outlines of a reform effort. There have been some suggestions that these quick assessments be provided to a larger national audience, but agency guidelines or country sensitivities often preclude that.10 More importantly, there has been little thought to using the results comparatively. Even with a common format, the studies don't lend themselves to easy comparison. While far briefer than the inventories, they remain highly descriptive and qualitative, and their quantitative elements once again have little obvious significance outside the specific national context.11

Moreover, even in single agencies, the formats have rarely been uniform. The priorities of different offices and country teams (whether, for example, this is seen as a democracy project or a market reform, whether the emphasis is on access for the poor, human rights, or efficiency) shape the thrust of each checklist, and in fact these are less checklists than sets of questions or topics to be explored. Those charged with applying them usually take their own liberties and occasionally attach their own analytic tools.12 There has also been considerable debate as to who should apply the tools. USAID's Democracy Center has for some time sought a checklist that a generalist could manage. Others have argued that most of the questions, even on the abbreviated list, require expert advice and that what a nonspecialist can count or identify as present will not tell us much.

As with the inventories, the results have fallen short of the goals, but the process has been useful. Despite agency and professional jealousies (and a consequent disinclination to use a checklist developed by anyone else13), there has been much cross-fertilisation of efforts and a growing convergence on the general topics that ought to be featured. Researchers embarking on a quick judicial reconnaissance now have a variety of examples to inform their work and some helpful suggestions as to how to enter specific topics. Unfortunately, much of this is hard to access, and to be truthful, there are by now far too many examples to allow easy dissemination. A further grand obstacle to the overall goal is that the lists, much like the inventories, still contain too much information and as a result can be used to justify a variety of follow-ups. The latter are usually negotiated, and often seem to avoid the high-priority problems in favour of those that are less threatening. Sometimes that is the fault of the analysts who pulled their punches or came with their own pre-conceived recipes. More often, it is a consequence of stakeholder politics and a format which requires that problems be identified, but not ordered, prioritised, or weighted. If weaknesses are identified in salaries, budgets, equipment, court management, training, political intervention in appointments, external pressures on judges, and irregular contacts with clients, it is a sure bet that most judiciaries and governments will prefer a focus on the first half of the list and not on the second.

A third set of activities, now underway in all major donor agencies, aims at developing a list of indicators of reform progress. Now that we all plan for results, the idea is to set measurable goals for judicial performance, track them, and design projects to meet them. As suggested by USAID's eventual publication of some 75 indicators, which it was forced to term "illustrative," the project is less easily realised than had been imagined. USAID's catalogue does attempt to cover a far wider range of performance than envisioned by Transparency, and initially faced some major battles as to how the major categories would be defined and conformed. It's arguable whether a category of "human and gender rights" makes sense, or whether impact on market-oriented reforms should be separated from general efficiency and efficacy. However, once these compromises were reached, the most serious challenges were found within the individual categories -- here variations in where nations started, their legal traditions, and what the authors regarded as good performance or as structural characteristics and practices likely to produce it were major impediments. Moreover, when USAID field tested the indicators in selected countries, it found many to be irrelevant and some to point in the wrong direction. Whereas the signing of international conventions on human rights, anti-corruption, or the rights of the child might represent a major benchmark in some nations, others have been parties to these agreements for years with no noticeable decline in the related problems. Increases or decreases in reported abuses in any of these areas would have different significance depending on the country or the point of time in its reform process. Much the same could be said of other hard indicators - public opinion on judicial fairness, the percent of the national budget given to the courts, or case backlogs and average delay.14

Qualitative indicators fared no better. Once you know there is a bar association, judicial training program, or system for selecting judges, what else must you know to evaluate its performance? My favourite, the creation of separate commercial courts, added at the insistence of those working in the former Soviet Union, would put many Latin American nations, and, as an informant noted, the state of Wisconsin, beyond the pale. The list of indicators had been intended to track advances in individual countries and to compare progress across them. One immediate conclusion was that it could at best be selectively used for the former purpose, but could not be recommended for the second. Hence, the initial dream of creating a short list of universal indicators of judicial performance and reform progress has devolved into a heuristic tool to assist reform planners in specifying their objectives and designing qualitative or quantitative benchmarks of change.

This catalogue of misadventures is not intended to dismiss either the real accomplishments or the possibility of now doing what has failed before. Certainly, past efforts to design and use diagnostic tools have advanced our knowledge of what counts in defining and shaping judicial performance, encouraged discussion over how to identify it, and improved the content of actual reforms. Even if we do not always apply it in real programs, we have an increasing understanding of relative priorities and of how the various parts of the system interact with each other. And finally, the experience does illustrate the two major challenges faced by these efforts. One is purely technical, relating to the designers' ability to select and prioritise the key categories and to develop criteria for evaluating the status of each. The second is operational: how to ensure the tool will be used for its intended purpose, whether that be reaching an agreement on reform goals and components among stakeholders in one country or one agency, or mobilising broader support for doing the right thing. The two challenges cannot be neatly separated. The tool's design should be technically informed but must also be shaped by a knowledge of the eventual audience and their desired reactions. It has to play to their prejudices, understandings, and interests as well as to the broader cause. The design also requires its own internal politics. An instrument perceived as reflecting a broad consensus of acknowledged experts is likely to have more force than one produced by a single genius, especially as the excluded experts may form their own opposition. As discussed in the next section, this second set of considerations may have still more relevance for Transparency's proposal, given that its expected uses are far more ambitious than anything yet attempted.15

General Principles, Assumptions, and Working Hypotheses

The task is more than the development of an analytic tool, but admittedly that's where I, and everyone I've consulted, first focused. Transparency's request only starts with the list. Its larger purpose is to have an instrument which will convince governments to undertake performance-enhancing reforms, either directly, because they acknowledge the list's validity and authority, or indirectly, through the actions of the rest of the target audience. As far as the list and its application are concerned this imposes certain further requirements:

  • It must be technically sound, especially in its selection of the key variables affecting judicial performance and its criteria for assessing their content.
  • In its normative or prescriptive elements, the values it reflects should be widely shared and it should recognise reasonable variations in the means for their realisation as well as areas of emerging consensus and lingering disagreement.
  • To allow comparison, the list must incorporate a grading or scoring system. This must be credibly and transparently applied, reducing so far as possible charges of subjectivity, cultural insensitivity and bias.
  • It must be relevant and intelligible to all of its target audiences.
  • While it should cover all the relevant categories, it should not be lengthy. Presentation and discussion should be brief and to the point and should lead readers to policy-relevant conclusions.

As the first two points suggest, the list will be based on theoretical assumptions linking three analytic categories: the behaviours (dependent variables) to which we are predicting; the independent variables or judicial characteristics which determine those behaviours; and the criteria used to evaluate those characteristics. The list itself will be composed of the independent variables or judicial characteristics, but its value hinges on their relationship to the other two levels. The selection of the dependent variables or desired behaviours is essentially normative, and thus should represent a broad consensus on what constitutes good performance. The elements of the other levels ideally arise from empirical theory. As that theory is far from complete, logical arguments and some normative preferences are likely to be as influential. By the time we get to evaluation criteria we are going to encounter increasing disagreements as to what really counts and how we will count it. At best the entire structure will derive from an emerging consensus as to the determinants of judicial performance. Where that consensus has not formed, there will be considerable room for disputing the particulars and even the overall objectives.
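
To make these three levels concrete, the following sketch (in Python, with entirely hypothetical names and linkages) shows one way the structure could be represented: each desired behaviour is tied to the judicial characteristics believed to produce it, and each characteristic to the criteria used to evaluate it. It illustrates the logic, not a verified set of relationships.

    from dataclasses import dataclass, field

    @dataclass
    class Criterion:
        """A question or standard used to evaluate one judicial characteristic."""
        text: str

    @dataclass
    class Characteristic:
        """An independent variable: a structural feature believed to shape behaviour."""
        name: str
        criteria: list[Criterion] = field(default_factory=list)

    @dataclass
    class DesiredBehaviour:
        """A dependent variable: the performance we are predicting to."""
        name: str
        determinants: list[Characteristic] = field(default_factory=list)

    # Hypothetical linkage, resting on a working hypothesis, not verified theory.
    selection = Characteristic(
        "Selection of judges",
        [Criterion("Criteria for judicial selection are set, publicised and followed"),
         Criterion("Criteria are based on job-relevant merit")],
    )
    predictability = DesiredBehaviour("Predictability of decisions", [selection])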

The likelihood that all these assumptions are not shared leads to the third point: the need for maximum transparency in the application of the checklist and for ensuring that those charged with this task are perceived, individually and collectively, as credible evaluators. I will discuss some means for meeting these criteria in a later section. The point for the moment is that the authority accorded to the list depends as much on how it is applied as on its own internal quality. Authority hinges in part on who the evaluators are; the other part depends on an adequate explanation of how they arrived at their determinations. This second requisite conflicts in some sense with the emphasis on a concise presentation and my notion that the list's impact will vary inversely with its length and complexity. The solution is a compromise. First, no matter how brief the basic presentation of an assessment, it will require some explanation of the grades or scores assigned. (While it has other weaknesses, the Blackton report card16 is a good example in this respect.) Second, a far more extensive documentation should be readily available. This serves two purposes: the obvious one of justifying the conclusions, and the less obvious but equally important provision of more details for those interested in heeding the implied recommendations.
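
The compromise just described - a concise grade accompanied by a brief published rationale, with fuller documentation available separately - might be recorded as simply as the following sketch (the field names, scale, and values are invented for illustration):

    from dataclasses import dataclass

    @dataclass
    class ScoredItem:
        """One checklist grade plus its two layers of explanation."""
        item: str            # the category or criterion being graded
        score: int           # e.g. on a 1-5 scale; the scale itself is a design choice
        rationale: str       # the brief explanation published with the score
        documentation: str   # pointer to the fuller, separately available record

    example = ScoredItem(
        item="Selection of judges",
        score=3,
        rationale="Criteria exist and are publicised, but exceptions are common.",
        documentation="annex-selection-report",  # hypothetical reference
    )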

The last two points, relevance and brevity, address the working group's concern that this list serve as a direct and indirect incentive for reform programs. Their desire that the list reach and influence a broader and fairly diverse public means that it must have significance for nonspecialists and for those whose interest in judicial performance is fairly narrowly focused. While the list can educate as well, there are practical limits. What entrepreneurs think they want from a judiciary, and how they think it is achieved, is likely to be different from the views of judges or advocates of social justice. All might be encouraged to take a broader viewpoint and recognise connections they had not understood. Still, if the list is to have a wide impact, it is going to have to sacrifice some detail and breadth in favour of a message that is readily and immediately understood.

This sacrifice will only be worthwhile if certain other working hypotheses hold. Most of these relate to the audience's intrinsic interest in having such a tool and the likelihood that it will inspire them to take actions supporting its explicit or implicit recommendations. The purpose will not be served if this becomes just another means of bludgeoning the political opposition, punishing the judges, or a pretext for increasing the extra-judicial side of actors' operations. The operative assumption is thus not that the actions listed below will occur automatically. Instead, it posits that the methodology can be tailored to encourage these responses:

  • Governments and judiciaries will grant sufficient authority to the product to attempt to follow its recommendations
  • When they do not, other audiences will internalise the assessments and recommendations and mobilise their own resources to pressure for change
  • Two potentially powerful constituencies, assistance agencies and investors, will find the list relevant to their concerns and use it to make decisions as to where they will start or expand operations.

As discussed above, even less ambitious efforts have fallen short in this area, and it is in fact the key to the value of the entire endeavour. Reformers already have enough checklists and related tools. What they need is one with an impact. I return to these issues in the last section.

A Methodological Detour: If the Goal is to Eliminate Inappropriate Behaviour, Why Not Focus on it Directly?

Given the difficulties in developing structural predictors of behaviour, this question needs to be taken seriously. If you want to know whether a judiciary is corrupt, unaccountable, and vulnerable to political and other pressures, why not just ask those questions? We have tools to do that, ranging from public opinion polls focusing on the overall judicial image, to surveys tapping real experiences with the judiciary (Did those interviewed pay bribes? Did they face any irregular obstacles in getting their case to court? Did the judge explain her decision, and was it consistent with other similar cases? Were there indications of outside pressures being exercised?)17 to more anthropological observations of actual practices (which may extend to the deployment of simulated users).18 I remain convinced that such techniques are important and perhaps the most direct way of ascertaining the existence of certain kinds of problems. I surmise that the working group dismissed this option for two reasons: first, they have their doubts about the utility and accuracy of these methods; and second, their interest goes beyond identifying problematic behaviours.

Turning first to the doubts, these and other techniques to measure judicial corruption do have their limits. In some cases, judiciaries or political authorities will not permit this kind of investigation. That alone should be a signal that something is amiss, but it is hardly conclusive evidence.19 Beyond that, what the public perceives or informed observers are willing to admit may be inaccurate, incomplete, or outdated. Perceptions and even experiential reports are based on past events - if a judiciary has reformed or done some backsliding, the general public or the occasional user may be slow to recognise the change. Conversely, they may be extremely vulnerable to the vagaries of press coverage and whatever incident is currently attracting attention. Results of surveys in particular are difficult to compare across countries. They provide a valuable benchmark for tracking advances in resolving problems within a single nation, but may not be a good indication of whether such problems are unusually pervasive.20

Moreover, problems may be systematically over- or under-reported. Losers in a legal case tend to perceive injustices even where they have not occurred; the tendency will be aggravated when legal counsel blames the judge rather than their own incompetence. Lawyers in some countries have been known to solicit and pocket "bribes" from clients; the judges never see the money. Unsophisticated clients may confuse legitimate court fees with pay-offs. Users may also be unaware of irregularities. In the famous Greylord investigation21 in Cook County, Illinois, some clients who paid bribes claimed not to realise that was the purpose of the monies solicited. Many other kinds of irregularities are virtually invisible - pressures on judges or other court personnel from upper ranges of the judicial hierarchy, concerns about vindictive disciplinary actions or denial of benefits,22 or calculations about the best career moves. Finally, while interviewers have developed techniques to encourage reports of systematic, petty irregularities (as things "others" commonly do), it is unlikely that parties to grand-scale corruption will be as forthcoming or that simulated users or random observers will detect it.

For all their shortcomings, such efforts to identify the real incidence of corruption and other problematic behaviour are an essential step in evaluating judicial performance. They may constitute input to the checklist23 or be used to verify its findings, and they can guide the prioritisation of eventual reform objectives. They may also be the most dramatic way of calling attention to the need for change. Judicial and political leaders can easily dismiss an expert panel's finding that the judicial selection system is flawed. They will have more trouble with a survey indicating that 90 percent of the citizenry have no faith in their courts. Presumably such quantitative measures also have more impact on potential investors or agencies interested in providing assistance, two additional sources of pressure for reform.

Nonetheless, I share the working group's apparent doubts about their utility as a stand-alone tool - especially if the overall objective is to encourage countries and their judiciaries to take positive steps to correct problems. The quantitative nature of the data almost inevitably encourages their conversion into single scores, the creation of national rankings, and, as in the case of Transparency's Corruption Index, endless, often unproductive debates about objectivity, validity, and reliability. Furthermore, while the numbers may speak for themselves, it takes a careful listener (or reader) to interpret their implications, which as suggested above may be more or less than their face value. Those who take the time to read through the methodological explanations or entire questionnaires may have a fairly good, if partial, picture of what is happening on the ground; those who don't, or who just rely on an aggregate score, may form their own, possibly very distorted view of events.

More importantly, the focus on real or perceived outcomes is less helpful in identifying underlying causes or possible remedies and does little toward explaining the vulnerability of a judicial system or its potential for eventual abuses. Having indisputably attracted the judiciary's attention, along with that of elites and the public, figures on the real or perceived extent of corruption, backlogs, or "irrational" judgements frequently produce equally dramatic reactions which over the short and long run have only made things worse. Examples include the post-1992 judicial purges and executive interventions in Peru, justified by the judiciary's abysmal rating in opinion polls, and threatened attempts to imitate them in Venezuela, Guatemala, and Haiti.

The working group's desire for a judicial checklist incorporates a broader purpose - to call attention to a problem while simultaneously suggesting remedies and encouraging their adoption. This means that the list will focus on structures, characteristics, and practices which, while one step removed from the outputs or behaviours we want to influence, are key determinants of their content and which, furthermore, lend themselves to modification through the normal reform inputs - resources, training, legal and procedural change, reorganisations, policy dialogue,24 and so on. The list may not tell us what the real level of corruption, political intervention in decisions, delay, or arbitrariness is, but it ideally should identify points of vulnerability worthy of correction. It offers an invitation to a dialogue rather than a confrontation.

This more complex strategy is far harder to mount and faces numerous obstacles of detail. One challenge is to identify characteristics that are sufficiently generic as to have a universal application. Many of the lists so far developed have fallen down on that point - developed from the authors' experience with a few systems, they tend to suffer from a marked ethnocentrism, equating all the characteristics of a specific system which seems to work with what is necessary for any and all systems to operate well. Once applied beyond the author's area of expertise, their flawed logic quickly becomes apparent. Latin Americans, for example, and many experts working in the region, have come to equate judicial independence with an earmarked 6 percent of the national budget and the elimination of any role for the executive (especially the Ministry of Justice) in appointments and administrative management. Many countries with judiciaries marked by fair to excellent performance would fail on one or both counts. The European Union's current efforts to set judicial standards for aspiring members have been beset by similar arguments over how generic requirements can be separated from what specific countries commonly do.

A second challenge is the issue of objectivity. The list will inevitably require subjective judgement calls, both as regards its composition and any scoring system, and these will just as inevitably raise charges of bias or lack of cultural sensitivity. Means to lessen these problems are discussed below but there is no way to make them disappear. Its subjectivity and the kinds of details included may also make it a less effective rallying point for the reform coalition. A flawed selection system and lack of access to information on cases are not the kinds of issues that elicit street protests or a reduction in foreign investment whereas highly publicised opinion polls could conceivably do just that. Finally, a dialogue implies negotiation and raises the risk of bargaining away the key points or of entering into unacceptable agreements. It's all well and good to suggest that economic subsidies may be decreased incrementally. The argument is hard to push for political interference in appointments (only in every other one this year?) or the incidence of bribe taking or human rights abuses. True reforms do require incremental change, but for obvious reasons often do not lend themselves to even tacit acknowledgement of this principle.

The Dependent Variables: What Are We Trying to Predict?

Although the central topic is corruption, the working group's interest extends to other related aspects of judicial performance. I am calling these "dependent variables." The quotation marks are important; the exercise is marked by considerable subjectivity and a substantial lack of rigour as regards the predictors, what we are predicting to, and the linkages between them. The group's suggestions as to what they want predicted have left me considerable initial freedom in its further definition. I would include, in no particular order: the efficiency and efficacy of judicial operations; the courts' equitable treatment of and accessibility to all citizens; the timeliness and predictability of decisions; their consistency with the formal law, common standards of interpretation, and certain broadly shared notions of justice; the absence of internal biases and susceptibility to external pressures; and a reasonable match between what the public expects and the quantity and quality of what the courts are able to provide. To this we might also add a satisfactory legal framework for the judiciary to apply and mechanisms ensuring the enforcement of judicial decisions. I am taking the liberty of omitting the first of these additional dimensions, and slighting the second, but they obviously will affect performance.

As one of the key elements of the proposed strategy is the impact on foreign audiences (especially investors and donor agencies), it should be noted that some of the desired behaviours extend beyond their immediate interests. Investors, in particular, will be interested in how commercial cases are handled and less concerned with access for the poor, equitable treatment, or broadly shared notions of justice. I am including these dimensions for two reasons. First, a well functioning justice system must address them, and that is our fundamental aim, not just good service for foreign clients. Second, if they do not do so already, investors (and donors whose main entry to the topic is economic growth) should be encouraged to look beyond the immediate impact on business disputes. In the end, these can be resolved by insisting on international arbitration, creating special courts, or just by striking deals with the government. However, in a country where the rest of the justice system does not operate well or at all, there are other, possibly more important negative consequences, ranging from uncontrolled crime and civil violence to business' difficulties in keeping permanent staff, who may face their own, unrelated legal problems. For this reason, the checklist also should probably extend beyond the courts, to include prosecution, police, and the independent legal profession.

As regards the extended list of desired behaviours or dependent variables, several obvious comments are in order. First, the various components can be conceived as dimensions of a single construct - the quality of judicial performance. I have resisted ordering them because they are all usually regarded as essential. Advances in one or two dimensions without comparable advances in the others could well produce disastrous results -- efficiency without equitability, timely but arbitrary results, internally consistent standards which remain unknown or unintelligible to users, and so on. Second, the standards against which any dimension is measured are relative and at their extremes probably not only unattainable but also undesirable -- at least as regards perfectly predictable, immediate decisions, machine-like efficiency, or excruciatingly detailed justifications of each and every one. This will clearly produce scoring problems -- can a judiciary be too independent or too efficient, and if so, how is that reflected in the score assigned? Finally, just as there is some overlap among the dimensions, there are also internal contradictions. Efficiency may interfere with accessibility and conformity with legal norms, and all three may contradict public expectations. And although corruption was where we started, it is not the paramount objective. It is completely conceivable, in fact highly likely, that a system designed to eliminate all corruption would deliver no other results.
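
One conceivable answer to the "too independent or too efficient" question is a non-monotonic score: rather than rewarding ever-higher values of a dimension, the scale awards full marks over a target band and declines toward either extreme. The sketch below is purely illustrative; the scale, band, and penalty are all assumptions made for the example.

    def band_score(value: float, low: float, high: float, max_score: float = 5.0) -> float:
        """Score a dimension that is desirable in moderation.

        Values inside [low, high] earn the full score; values outside it
        lose points in proportion to their distance from the band.
        """
        if low <= value <= high:
            return max_score
        distance = (low - value) if value < low else (value - high)
        width = high - low
        return max(0.0, max_score - max_score * distance / width)

    # Hypothetical: "independence" rated 0-10, with 4-7 treated as the healthy range.
    print(band_score(5.5, 4, 7))   # full marks inside the band
    print(band_score(9.0, 4, 7))   # penalised for excess, not only for deficiency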

As a partial solution to these dilemmas, I am grouping the behavioural characteristics in three broader dimensions, suggesting that performance hinges on their dynamic interaction: the judiciary's creation of an internally consistent process (institutional integrity), its accountability to society writ large, and its maintenance of a certain level of independence vis-a-vis its external environment. The first dimension relates to the judiciary's ability to set and enforce standards for its own operations, and the second and third to its cross-boundary exchanges with other socio-political systems.25 Like the more detailed list of desired behaviours, the three dimensions are essentially normative. This is probably most true of independence and accountability, both as regards their initial selection and the identification of the factors determining their achievement. Whereas in the case of institutional integrity the factors derive from a more complex, if very basic, model of requirements for organisational sustainability, those for the other two dimensions more closely resemble extended definitions. This is the difference between saying that a transparent organisation requires a transparent selection system and noting that institutional integrity demands that an organisation choose members on their ability to perform necessary functions. As the author of a checklist, I'm not happy with my taxonomic apples and oranges, but I see no way around the problem.

Elements of a Checklist

The extended definition of judicial performance is the basis for the checklist; the list itself is composed of the characteristics believed to be critical in producing the desired patterns of behaviour. They are presented as general categories (e.g. selection of judges) and a series of criteria or questions for evaluating them. So where did I get the characteristics and criteria? The short answer is that many were cribbed from pre-existing lists and from the working group's own suggestions as to what they thought might be important. The longer, more intellectually respectable answer is that they draw on an accumulated body of knowledge and understandings about how judicial systems operate, and that this in turn is based on academic studies, theoretical arguments, and the experience provided by real reform programs. As in all social science, the reach and sheer quantity of theory and hypotheses far exceed the support of empirical verification. Thus the relationships between judicial performance, the selected characteristics, and the evaluation criteria are at best based on working hypotheses. We believe the ways judges are selected and subsequently treated by their institution have a strong effect on what they do; we believe irregular intervention by the other branches of government in these processes will have undesirable consequences (lesser predictability, equitability, and consistency with the law, etc.) on their behaviour. The beliefs can be defended logically and are supported by some evidence, but the strength and precise details of the relationship by no means constitute self-evident truths.

There are two logical ways to represent the relationships figuring in the checklist - one is to take each desired behaviour (e.g. efficiency, access, predictability) and specify the structural characteristics most likely to produce it. The other is to select certain broad structural categories and leave the linkages for the subcriteria. I am taking the second tack, as have most other authors of such lists. This loses the one-to-one correspondence, but offers the advantage of focusing on systems as reformers will see them. It also avoids much repetition or excessive detail, and finesses the very real gaps in our knowledge of the linkages. I will, however, present the major categories under the three overarching dimensions of institutional integrity (not just corruption but the ability of the judiciary as an institution to set and enforce internal standards of performance), transparency and independence. The first dimension is much broader than, and even in this simple format imperfectly distinguished from, the other two. (Are adequate salaries a part of institutional integrity or independence? Is public input into selection systems part of transparency, or institutional integrity?) I will discuss scoring and other details below. Whether or not an overall score is assigned, each of the three dimensions and each category within them should receive a score, based on the scores or answers for each of the evaluation criteria.
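
A minimal sketch of that roll-up, using invented criterion scores and simple unweighted averages (the real weighting scheme would itself have to be settled by the working group):

    # Hypothetical criterion-level scores on a 0-5 scale, grouped by
    # category within each overarching dimension.
    checklist_scores = {
        "Institutional Integrity": {
            "Selection of judges": [4, 3, 5],
            "Management of the judicial career": [2, 3, 3, 4, 2, 3],
        },
        "Independence": {
            "Selection of judges": [3, 4, 2],
            "Resources": [2, 2],
        },
    }

    def mean(values):
        return sum(values) / len(values)

    def roll_up(scores):
        """Average criteria into category scores and categories into dimension scores."""
        report = {}
        for dimension, categories in scores.items():
            category_scores = {cat: mean(vals) for cat, vals in categories.items()}
            report[dimension] = {
                "categories": category_scores,
                "dimension_score": mean(list(category_scores.values())),
            }
        return report

    for dimension, result in roll_up(checklist_scores).items():
        print(dimension, round(result["dimension_score"], 2))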

The checklist is, I would stress, only a first cut and hardly intended as the final product. It draws blatantly on others' work, and might well have been replaced by one of the examples included in the annex. The purpose is to put something on the table for discussion, and the only possible advantage of my version is that it is more specifically tailored to the further requirements of the working group. For lack of time, I have not done two additional tasks that the working group undoubtedly expected.26 The first is the lengthy explanation of each category, its intellectual provenance (especially as regards theoretical arguments, any empirical research, and more casual observation) and its impact on the dimension in question. Impact would be largely addressed in the second missing task, a discussion of the various criteria as they would be applied to concrete cases. As demonstrated by ABA/CEELI's checklist for judicial independence (included as Annex V), this latter discussion could be lengthy, incorporating a specification of the linkages (e.g. why are adequate salaries important?), a review of the range of known variations (from the irregularly paid $15 or $20 for Cambodian or Liberian judges to the amounts received by judges in Singapore, the US, or Europe), and a discussion of how they would be treated (how do real salaries compare with the external reference points, and what should be regarded as adequate?). Technically, the checklist as presented publicly would not incorporate any of this detailed discussion. It should be available for those interested, would necessarily inform the work of application, and would be subject to constant modification on the basis of the results of the list's use.

In reviewing the list, some readers may be surprised by the absence of the usual quantitative indicators figuring in other examples. As I've discussed above, many of them (number of judges, judges per 100,000 population, average caseload, percent of budget spent on the judiciary) have little known independent significance. If there is a range of acceptable answers it is probably broad and has yet to be defined. Others (average time to resolution, backlog, percentage of users or of the general population expressing satisfaction with their courts) look more like downstream behaviour, although they too have to be interpreted in context. Should time to resolution, for example, be weighed against user expectations, legally set limits, or some universal standard? It would be helpful, as an independent exercise, to establish a data base incorporating these statistics for a wide range of developed and developing countries. This would discourage the misuse of individual statistics27 and help identify those quantities with some broader significance as well as the acceptable range of variations. Obviously some of the criteria included below require that the scorers know these numbers, but they will only be part of the input in answering a single question.
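
Such a database might begin as something as simple as the sketch below: per-country records of the common quantitative indicators, with a helper that reports the observed range so that any one statistic is read against its peers rather than in isolation. All figures are placeholders, not real data.

    # Placeholder records; real entries would come from national statistics.
    country_stats = [
        {"country": "A", "judges_per_100k": 6.5, "judicial_budget_pct": 1.2,
         "avg_months_to_resolution": 14},
        {"country": "B", "judges_per_100k": 11.0, "judicial_budget_pct": 2.8,
         "avg_months_to_resolution": 30},
        {"country": "C", "judges_per_100k": 3.2, "judicial_budget_pct": 0.9,
         "avg_months_to_resolution": 9},
    ]

    def observed_range(stats, indicator):
        """Return (min, max) for one indicator across the sample."""
        values = [row[indicator] for row in stats if indicator in row]
        return min(values), max(values)

    # A single country's figure is then interpreted against the distribution,
    # not against an arbitrary universal standard.
    print(observed_range(country_stats, "judges_per_100k"))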


TABLE I

A PROPOSED CHECKLIST FOR EVALUATING JUDICIAL PERFORMANCE

I. INSTITUTIONAL INTEGRITY

A. Selection of judges

  • Criteria for judicial selection are set, publicised and followed
  • Criteria are based on job-relevant28 merit
  • Criteria incorporate exclusions (with background checks) for those with criminal records, outstanding cases or professional disciplinary actions pending against them

B. Management of the judicial "career"29

  • Judges have permanent tenure or fixed, renewable appointments
  • Rules of conduct exist as does a process for monitoring compliance, disciplining violators, and appealing disciplinary decisions
  • Standards for performance (number of cases decided, average time limits, reversals on appeal, service to users, etc) exist and are monitored to help judges improve their work and where relevant, to affect decisions on tenure, promotions, transfers, and discipline
  • Promotions, transfers, dismissals, and/or renewal of appointments are based on publicised, transparent criteria
  • There is a transparent appeals process for judges in the case of denial of promotion, transfer, or renewal
  • Training programs are available and participation is encouraged and facilitated; some sort of entry level training or orientation is compulsory.

C. Internal administration

  • Administrative processes (at the systemic and court room level) follow set rules and procedures
  • Budgets, procurement and management of resources are monitored and audited
  • There is a management information system (manual or automated) to facilitate planning and budgetary oversight
  • Administrative staff are chosen, promoted and retained through transparent, merit-based procedures
  • Administrative staff have an ethics code, performance standards and their own career and disciplinary systems.
  • Adequate training is provided for administrative staff

D. Resources

  • Changes in the overall judicial budget are commensurate with the growth of the national budget and also reflect increases (or decreases) in demands for judicial services
  • Staffing, equipment, and offices provided to judges and administrators are adequate to allow performance of their duties
  • Staffing, equipment and offices provided to judges and administrators are no worse (no better?) than that for the rest of the public sector
  • Internal resource distribution is based on need and workload

E. Judicial Processes

  • Procedures for handling cases are standardised and mechanisms exist for ensuring they are followed
  • Rules of evidence and standards for evaluating arguments exist and are applied in a predictable fashion
  • Assignment of cases follows standardised procedures and results in a reasonably equitable distribution of work
  • Procedures are reasonably efficient and designed and reformulated in the interests of eliminating unnecessary steps and bottlenecks.
  • Judges have the power to move cases ahead and to punish or deny efforts to create additional delays
  • Where judicial decisions are not complied with, courts have additional means to enforce them
  • There is a regularised process for appealing judicial decisions, and decisions are not reversed in any other fashion
  • The pre-trial settlement of disputes is encouraged but not forced
  • There exist duly recognised alternative dispute resolution mechanisms, both court annexed and free standing, which provide a viable alternative to judicial processes

F. Legal Profession:

  • There is a transparent process for entrance into the profession, based on educational background and other relevant criteria
  • There exist laws and professional codes of ethics to govern the profession; they are widely known and enforced
  • Denial of entry or disbarment is subject to transparent rules, and has its own appeals process
  • Where there is a shortage of qualified professionals, there is a provision for lay representation or performance of some legal duties, but these individuals are also subject to rules of conduct

II. INDEPENDENCE30

A. Selection of Judges

  • Any external input (by other branches of government or private individuals and organisations) to the appointment process is subject to transparent rules and occurs only in accordance with established procedures
  • Evaluation of candidates is done by a body or office separate from that making the final selections
  • Judicial appointments are made as vacancies occur, not to coincide with changes in national administration

B. Management of the Judicial Career

  • Judicial salaries meet a living wage and some reasonable proportion of a good private-sector wage
  • Additional privileges (housing, vehicles, trips, training) are allocated through a transparent process with no nonjudicial input
  • Where external actors have complaints about judicial performance these can only be entered through the normal disciplinary process

C. Internal Administration

  • The selection and further management of administrative staff is handled through transparent rules and regulations and is not subject to intervention by officials not legally authorised to provide specific inputs.
  • Whether handled by an external body (e.g. Ministry of Justice) or by the judiciary itself, oversight of internal administration responds to judicial needs, not to the administrators' agenda

D. Resources

  • Salaries and budgets cannot be reduced nor their distribution altered by other branches of government
  • When judicial workload reaches unmanageable limits, the judiciary is able to obtain more resources

E. Judicial Processes

  • Other branches of government do not override or ignore judicial decisions, and when they do, they are subject to legal action
  • Decisions and powers accorded to the judiciary are not usurped by other governmental actors
  • Judiciary is able to set its own rules for internal operations; where those rules are limited by enacted law, they have substantial input into shaping the latter

F. Legal Profession

  • Access to professional status is managed only according to official rules
  • Whether the judiciary or the bar association is responsible for admittance and discipline, it does this without irregular outside intervention
  • Ability of lawyers to form professional associations is reasonably open.
  • Internal operations of bar associations are determined by the members themselves

III. TRANSPARENCY/ ACCOUNTABILITY

A. Selection of Judges

  • Public input is solicited as a part of the judicial selection process
  • Appointments are adequately publicised
  • Selection process is open and transparent

B. Management of the Judicial Career

  • Standards for judicial performance and ethical behaviour are publicised
  • There is a process for registering complaints about judicial misconduct
  • Public input is solicited as part of the judicial evaluation process

C. Internal Administration

  • There is a process for registering complaints about administrative misconduct
  • Adequate information is publicly provided on the roles and responsibilities of administrative officials who deal with the public

D. Resources

  • Judicial budgets, salaries, and results of audits are publicly available
  • Judicial requests for additional resources are presented publicly
  • Proposals for major investments in infrastructure or equipment are presented publicly with opportunity for discussion

E. Judicial Processes

  • The rules for how cases will be processed are well publicised
  • Court users have access to information on the status of their case
  • Hearings are publicly announced and open to the public
  • Judicial decisions are publicised
  • Press and other nonjudicial groups may comment on decisions without fear of reprisals
  • Court services are readily accessible to the entire population, and there are no unreasonable geographic, monetary, or legal barriers

F. Legal Profession

  • Information as to accredited bar members and any paralegal profession is easily available to the public
  • Disciplinary actions and disbarments are publicised
  • There is an easily accessible process for lodging complaints about attorneys' actions

Methodology: How is the Checklist Developed?

My suggested checklist is intended as a tool for discussion. The first rule for the creation of a final product is that it should not be developed by one person, but should be the product of extensive discussions among a representative group (or groups) of judges, lawyers, and others with experience in judicial reforms. Ideally, most if not all of the members should have familiarity with several legal systems. Given the variety of backgrounds and perspectives to be reflected in the product, it seems inconceivable that any single working group would suffice. Hence, the process would involve a series of subgroups, organised to reflect different legal traditions or geographic regions. An alternative organisation might be structured around the three broad dimensions or six basic categories, with regional variations reflected in each of them. There would be an interactive exchange between the two levels - the principal working group designs a first version, which is sent to the subgroups for their comments, which in turn are sent back to the principal group. As the USAID experience with its list of indicators demonstrates, field tests and revisions based on their results will also be required.

From my own experience with these exercises, I would add two cautions. First, while judges and lawyers may be the best sources of information on how a judicial system operates, the list could easily come to reflect only their perspectives and thus a series of unrealistic, skewed, or impossible requirements. Several other viewpoints need to be incorporated, both for technical soundness and to ensure relevance and buy-in. These include policy-makers (both from national governments and assistance agencies), reform practitioners (especially those who have authored prior lists), court users, and other groups the list is intended to reach. Some may be included in the working groups. This is probably most important as regards the assistance agencies, as they are a major audience with their own views on the subject. (They also incorporate some of the other categories and constitute a far smaller group than government policy makers or court users.) Others should have an opportunity to offer opinions at some point in the process. Obviously, in the case of large constituencies like court users or national governments, one will have to rely on proxies. The subgroups, especially if they are geographically dispersed, may be able to contact associations or individuals who can speak authoritatively for these categories. Here, as in the other demands for representativeness, the concern is more than technical quality. The tool's authority and credibility will also hinge on who, besides its immediate authors, buys into the process.

A second caution is that the process of setting up the working groups and their design of the product could easily take so long as to preclude its ever being used. If Transparency goes ahead with the exercise, it must set itself some temporal and substantive limits. It is entirely reasonable to allow a few weeks to define the participants and working methodology and to provide either a first working draft or a collection of examples and a general format for what is wanted. The rest of the process and its various iterations should also have time limits and specified products. In the end, a very few individuals will do the actual writing - the others will have their chance to provide input, and if they object too much to the final version, to provide alternatives. And of course, one individual should be charged with monitoring the entire process and ensuring it occurs more or less on schedule. My further advice is that this individual not see him or herself as the major author - leaving this to the most eminent judge or legal expert is a virtual guarantee either that this person will dominate the entire procedure or that it will never get done at all.

Methodology: How is the Checklist Applied?

Leaving scoring for the next section, I will address the organisational issues here. The list will inevitably be dominated by questions with a highly subjective content and thus lend itself to application by expert panels. Questions that can be answered with a simple yes or no, or with a quantitative measure, will be conspicuously few. To answer them adequately, panel members will need an in-depth knowledge of specific cases - unless of course one plans to increase the costs astronomically by having someone else do the field research for them. In the interests of comparability and consistency, we cannot have one panel of experts for each country, but ideally would have one panel do all the ratings. Obviously, such a panel would either be too large to function or, if limited to experts familiar with every country, have no members at all.

The solution, which may also resolve some of the complaints about cultural biases, appears to be a small number of panels, most probably specialised in specific regions, with care taken to ensure that the combined experience of the individual members provides a reasonable familiarity with all the countries covered. (Care would also have to be taken that this familiarity not extend to marked biases, and that the affected expert recuse him or herself in the case of any country where objections might arise - calling in a substitute or relying on the knowledge of the remaining panel members.) I think in all cases three to five members would be adequate - and where they are not, an extra member might be called in for any country the panel feels it cannot cover.

Staffing the panels and putting them into operation is the only easy part of the process. Members could be drawn from comparative legal experts (lawyers, political scientists, sociologists, and other obvious disciplines; judges are less likely candidates, which further counters the likely judicial bias in the list itself) and especially those with reform or other field experience. It is a growing group, and I suspect most members would be willing to join - especially as most exchanges could take place by internet or in a brief series of real meetings. It would be useful to have an orientation session to discuss methodology, possibly uniting all the panels or at least each regional one. The orientation would use the results of the field tests (which I assume will already have been done) and the problems encountered in them. I suspect an initial ranking of an entire region might require a month's work or less from each member, but this could be spread out over a longer period - though not too long, or the advantage of immediacy is lost.

A central working group, composed either of separate members or of a representative from each of the regional groups, would review the combined results, identify problems or inconsistencies (some groups will be harder graders), and find ways to resolve them. As a precautionary measure, the first exercise might cover only a few countries in each region. That way the problems of cross-regional comparisons could be addressed while the sample is still manageable.
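
By way of illustration only, the following sketch shows one statistical device the central group might use to detect and correct for harder graders: re-expressing each panel's grades against that panel's own average before making cross-regional comparisons. The panel names and numbers are invented assumptions of mine, and the actual method of reconciliation would of course be for the group to decide.

  from statistics import mean, stdev

  # Hypothetical grades from two regional panels; the names and numbers
  # are invented purely to illustrate the "harder graders" problem.
  panels = {
      "panel_a": [3.2, 2.8, 3.5, 2.9],  # a relatively lenient panel
      "panel_b": [1.9, 2.1, 1.5, 2.4],  # a relatively harsh panel
  }

  def standardise(grades):
      # Express each grade as its deviation from the panel's own mean,
      # in units of that panel's spread, so panels become comparable.
      m, s = mean(grades), stdev(grades)
      return [(g - m) / s for g in grades]

  for name, grades in panels.items():
      print(name, [round(z, 2) for z in standardise(grades)])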

Methodology: Scoring

Comparability within and across regions requires a common unit of exchange. One hundred descriptive analyses will not serve this purpose, and thus some kind of scores will have to be introduced. The scoring should be a transparent process, and one not pretending to more precision than it can deliver. Thus, I take heed of John Blackton's excellent device of conceiving of these as grades, not scores, and perhaps adopting his title of a "judicial report card." As he notes, the term grade already connotes a certain subjectivity and less than absolute precision, and in this case, that seems an appropriate aim. The grade for each criterion, category, dimension, and possibly the overall status could be given in terms of letters (A to F) or numbers (a zero-to-four or zero-to-five point scale).
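
To make the arithmetic concrete, here is a minimal sketch (in Python, purely for illustration) of how letter grades might be reduced to numbers and averaged within a category. The four-point mapping and the sample grades are my own assumptions, not part of any agreed scale.

  # A minimal sketch of reducing letter grades to numbers, assuming the
  # A-to-F option is chosen; the point mapping and the sample grades
  # below are assumptions, not an agreed standard.
  GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

  def category_grade(criterion_grades):
      # Average of the numeric equivalents of the criterion grades.
      points = [GRADE_POINTS[g] for g in criterion_grades]
      return sum(points) / len(points)

  # e.g. three criteria within one category graded B, C, and A
  print(category_grade(["B", "C", "A"]))  # prints 3.0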

There is a further problem of how to aggregate each country's scores as one moves up from level to level. Even as I look at the categories I have suggested, I cannot pretend that they are all equally important. One could ignore that and work on the basis of cumulative averages, reconfigure the evaluation criteria to make them more equivalent, or attempt some weighting system. While I might give equal weight to all the criteria under selection systems, I might give career systems for administrative staff twice the value of a management information system. Another solution (which I have adopted by accident in some cases - for example, separating selection from career management) is the constructive use of redundancy. If something is very important, break it into parts or take several cuts at defining it. This, I suspect, is the best remedy. The larger point is that the checklist constructors should do a series of simulations before they go to press, or even to a field trial. As anyone who has attempted such a list can attest, some of the biggest errors develop from a failure to consider what the scores will look like. When countries we intuitively believe to be good or bad performers receive scores incompatible with that belief, or when all the scores cluster at one extreme or the other, the list has to go back to the drawing board.
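
The following sketch illustrates both a simple weighting scheme and the kind of pre-publication simulation recommended above. Every category name and weight in it is an assumption of mine, included only to show the mechanics; the working groups would set the real values.

  import random

  # Illustrative weights over six hypothetical categories; the names and
  # values are assumptions, not the paper's proposal.
  WEIGHTS = {
      "selection": 1.0,
      "career_management": 2.0,   # e.g. judged twice as important
      "internal_administration": 1.0,
      "resources": 1.0,
      "judicial_processes": 1.5,
      "legal_profession": 1.0,
  }

  def dimension_grade(category_grades):
      # Weighted average of the category grades within one dimension.
      total = sum(WEIGHTS[c] * g for c, g in category_grades.items())
      return total / sum(WEIGHTS[c] for c in category_grades)

  # Simulate random country profiles (0-4 scale) and inspect the spread
  # of resulting grades before going to press or to a field trial.
  random.seed(0)
  for _ in range(5):
      profile = {c: random.uniform(0.0, 4.0) for c in WEIGHTS}
      print(round(dimension_grade(profile), 2))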

Do you want an overall grade? It poses some of the same problems addressed above, and I have compounded them by my introduction of the three dimensions. Maybe that is sufficient reason for eliminating the dimensions, or perhaps they deserve different weightings. On the whole, I think there is more to be said in favour of an overall grade than against one. If you don't devise one, the readers will. If you do, you have more control over how it is calculated and explained.

Which raises the final issue of explanations. The grades are inevitable, but the important message lies in the brief explanation that accompanies them. Once again I reference Blackton's report card as a good example of how this might be done. This is especially important for any overall score and at the level of dimensions and categories where countries with similar rankings will suffer from a variety of different problems and deserve to have those recognised. The explanations, like those on a conventional report card, will also indicate what improvements are required. This information is important for those responsible for making them and also for the broader audience comprising the potential reform constituency.

Targets and Use

As the working group indicated from the start, their goal is not to rank judiciaries, but to give them an indication of where they stand in terms of performance and where their particular weaknesses lie. A second purpose is to provide publics, potential users, and international partners a basis on which they may select with whom they work and, it is hoped, to encourage them to mobilise their own resources to press for change. It is not entirely clear how Transparency envisions the release and further dissemination of the results. Individual countries, users and donors might request or be provided the scores on single judiciaries, but clearly the format requires a release in regional or global groups. It might be of some help to know that Colombia scored 3.0 overall, with lower grades in judicial processes and transparency and higher grades in independence, but both impact and utility hinge on knowing how that compares to everyone else, or at least its regional neighbours.

Judging from the experience with the Corruption Index, to the extent I know it, wider publication is guaranteed by ranking countries. Putting Colombia in category B might arouse some press interest; ranking it X out of a total of 100 countries will guarantee far more. However, publicity is no guarantee of longer-term impact, and the jury is still out as to how many concrete improvements the Corruption Index actually encouraged. For that reason I have opted against overall rankings, and thereby probably undermined the potential for automatic, free dissemination of the list.

The further question, however one does the initial presentation, is how to encourage the various audiences to react in the desired fashion. Street protests may serve some purpose, but the intent is a direct and indirect impact on reform efforts - getting those responsible to read beyond the scores or grades to the explicit and implicit recommendations. The recommended format attempts to enhance those chances by including a transparent, disaggregated scoring system with brief explanations of how each score was derived. A lot depends on how Transparency explains the overall effort and even its further agenda. However, relying on readers' common sense and rationality is not foolproof. Thus, as I explain in the final section, still more depends on how the process is organised from the start, who is involved, and how their buy-in can be further guaranteed. The undertaking is valuable and worth supporting; in the real world, that alone is rarely sufficient to make something work.

Some Caveats and Further Considerations

I began with a question raised by one of the people I consulted for further insights on the task: "Why would you want to do that?" Allowing for my possible misrepresentation of the goals, it is still a good question. Coming from a judge, albeit one involved in international reform programs, the response undoubtedly reflects normal professional reservations about an effort, not only to evaluate individual judiciaries, but also to publicise those evaluations and compare them across countries. Even from outside the profession, it is not hard to recognise the potential dangers, some of which I've raised above.

Beyond this, there is the question of whether the exercise would really serve its intended purpose of pressing judicial and political elites to initiate fundamental reforms, which might undercut their immediate interests, or of rallying wider support to force them to do so. One issue is whether this tool, in any of its possible variations, will really tell people things they don't already know, and whether that knowledge would affect their actions. Looking just at Latin America, the region I know best, it strikes me that the public, elites, and court members are hardly unaware of problems of corruption, inefficiency, or just plain incompetence. They may not know who is responsible, why this occurs, or what could be done about it, and their present impressions may err on all those counts. The fact that 90 percent of the public believed the Peruvian judiciary was corrupt does not mean that 90 percent of the judicial personnel participated in those practices - although on the basis of actions taken in 1992, that appeared to be the conclusion reached. Arguably, those actions did not resolve the problem, and better information as to its causes and incidence might have produced a different kind of reform. Hence, to the extent the tool can go beyond calling attention to problems to diagnosing them and presenting effective remedies, it could be a help.

However, this also assumes the audience will act on those recommendations. I've already discussed some reasons why that might not be the case for donor agencies or local reformers.31 There is a third category added by Transparency - entrepreneurs and especially foreign investors - which deserves further attention. With few exceptions, I don't believe this group suffers from extreme naivete about problematic judicial practices. The real problem, as another colleague volunteered, is that judicial performance is about twelfth on the list of things concerning individuals considering an investment in a foreign country, or in their own. Which is to say that by the time they have reached that factor, they have done a lot of other research, and if it affects them at all, it is likely to be in terms of how to avoid a problematic judiciary, not whether or not to invest in the country.32 We could be wrong, and maybe if such a judicial checklist existed, entrepreneurs would consult it first, rather than last or not at all.

If we're right, this means that the main consumers of the list will be the usual suspects - political and judicial elites, assistance agencies, citizens of the surveyed countries, and various public interest and advocacy groups. It also means the main source of financing will come from the donor community, private foundations, and possibly private businesses, but as a public service, not something they intend to use extensively or would buy on a pay-per-use basis. (This would be hard to do anyway without excluding the rest of the audience.) Excluding the business community as targeted users (or paying clients) does offer several advantages. It eliminates the need to focus on topics which might interest only them, as well as that of justifying the inclusion of some topics they might find less relevant. Now of course the question is how to justify to the usual suspects the financing and use of the ultimate checklist or report card, when they have already paid for so many.

Past failure and unfulfilled need may appear to be sufficient justifications, but the real task is to convince the potential financiers and users (i.e. the assistance agencies) that it will work this time. The problem is not just funding. I'm sure Transparency can find some foundation willing to fund a year or two's worth of effort. However, that foundation is not a major consumer and its interest will not guarantee use. There may be other ways to bridge that gap, but a quicker solution is to enlist the donor community from the start, including them both as financiers and participants in the endeavour.

Participation is important for another reason. After all, why should Transparency, with no track record in judicial reform, suddenly propose to provide the framework to guide future efforts? The answer is that Transparency will be the co-ordinator, not the author, of a joint effort, and that it will actively elicit contributions from interested parties at all stages of the process. Perhaps Transparency could run a competition for checklists, or ask parties to nominate participants to the various working groups. It might also want to exclude areas (court administration, delay reduction, infrastructure, legal framework) which are extremely technical and less directly related to the quality of judicial performance. My proposed checklist either ignores or downplays them, revealing my own biases. There are other, less personal reasons for their exclusion - the most important of these is to focus the list on the forgotten variables, not those which already receive more than their share of attention. Among the neglected themes are those related to improving the quality of judicial personnel. I would argue that this (with a few internal nods to independence and accountability) is the most important element in combating corruption and thus most directly related to Transparency's own mandate.

The need for a universal judicial report card is a given. It could provide a way of uniting reform efforts and emphasising themes currently receiving too little attention. Many of those involved in reform efforts would welcome a tool which supported an emphasis on the right programs rather than the merely politically feasible ones. As is well known by anyone who has ever confronted a roomful of politicians and bureaucrats dead set on buying computers and ignoring the appointment system, technical correctness is not a very valuable weapon. Having an external authority weigh in on those decisions could be the only way of turning the tide.

However, as is often the case, Transparency is only one of a long line of organisations to propose to fill an unmet need. Some of its most powerful detractors could be those who tried and failed. I've devoted a lot of time (because that was expected) to discussing how such a checklist or report card might be developed and applied. The most important issues, however, are the last ones: how Transparency can build on past experience, turn potential detractors into supporters, and ensure that the final product is both technically superior and used as intended. The short discussion of these last themes is not intended to discourage these efforts. It is, rather, in the context of the group who will be reviewing this paper, advanced as a challenge. Many of the Durban participants have considerable familiarity with what has been tried. Some have been direct participants in prior efforts. The question is whether they can help Transparency build a better mousetrap and ensure that the world (at least of judicial reformers) beats a path to its door.


Notes and References

Notes

  1. I am sure the efforts go back still further, but not in this universal form. More importantly, they appear to be enjoying a resurgence at present. Conversations with colleagues revealed several on-going projects (the tip of the iceberg, I suspect), ranging from improved judicial inventories (see later sections for an explanation) to undertakings that look a good deal like what Transparency has in mind. Despite the obvious disincentives, I've forged ahead, but the reminder that this is not a unique experience accounts for my emphasis on some methodological and operational issues the working group may not have anticipated.

  2. Hammergren (1998).

  3. This may give too much credit to the purveyors of infrastructure, equipment, and modern management techniques. Some of them clearly believe that the real cause of judicial inadequacies lies in the absence of these elements. Many, however, do justify their focus by arguing about the broader impact of technology (or better buildings) on behaviour. Their argument may be valid, but it discounts the many countervailing forces working to diminish or distort that effect.

  4. As another example, I was part of a second team asked to evaluate a judicial assistance program in Cambodia. Aside from the other problems faced in that country, the fact that it had only about 40 law graduates (virtually none of whom were currently on the bench) suggests problems for some time to come in recreating even a minimally adequate judicial organization.

  5. For those interested in these topics, especially as they have been addressed in Latin America and Southern Europe, see Correa and Pena, Pastor, Garapon, Toharia, and articles in Tate and Vallinder.

  6. Toharia argues for example that both Europe and Latin America have gone to questionable extremes in privileging judicial independence. A case in point might be Ecuador which lets the Supreme Court select its own members (cooptacion) as well as other judges, and at present, exempts them from impeachment. A more common error is the aspiration to provide all services to all comers, with no consideration for budgetary limitations.

  7. These were designed and conducted by Florida International University's Center for the Administration of Justice (FIU/CAJ) and ILANUD (the United Nations Latin American Institute for Crime Prevention and Treatment of the Delinquent). Condensed versions were subsequently published privately by FIU/CAJ. A list is appended in the annexes.

  8. See Chemonics. An example of the terms of reference developed by Waleed Malik (World Bank) is appended in the annexes.

  9. In Latin America, these studies exposed, for example, the enormous percentage of pretrial detainees in the prison system, a previously unrecognized problem. They also raised questions about the assumed importance of an inadequate number of judges or an excessive workload in explaining court delays and large backlogs.

  10. This has been a continuing problem with the inventories. Two early USAID assessments had their release delayed for years because of complaints from the country under evaluation (Guatemala) or USAID's concern about political repercussions (Panama). The Bolivian Supreme Court only released a World Bank funded study after leaks to the press made that preferable to allowing the rumors about its content to continue unchallenged.

  11. Improvements in the quality and availability of basic judicial statistics in developing (and developed) nations have allowed some initial efforts to investigate the impact of a few potential quantitative indicators. However, the results, comparing obvious choices like the number of judges per 100,000 population, percentage of the national budget spent on the judiciary, or average caseload, against a more intuitive assessment of how the courts are doing, suggest that these are not what matters. A preliminary Bank Study (Buscaglia and Dakolias) finds some relationship between the percentage spent on infrastructure and use of computers, and delay reduction. However, of their six "good performers," three are widely regarded as having other serious performance problems (i.e. corruption), at the very least raising doubts that delay can serve as a proxy for overall quality.

  12. For example, one local contractor which has done a lot of work for USAID has introduced a stakeholders analysis (inventory of groups opposed to or supporting reform) which occasionally pays more attention to defining the political actors than the content of the reform they might support. Another, which has worked for both AID and the World Bank, is big on focus groups composed of users and judges, as a principal analytic tool, as well as a means of generating support for reform during the assessment.

  13. When lists have been used repeatedly, it is usually because the same project manager or consultant directs the process.

  14. As my colleague, Richard Messick, has noted, success in increasing confidence in the courts, should increase demand, and absent other changes, may well lead to more delay.

  15. The problem of encouraging the use of technically sound solutions is not unique to judicial reform. See Reimers and McGinn for a discussion in the context of education programs, where, as with judicial reform, "…problems as they are faced by policy makers lack the precision to be found in a systematic study that can prespecify all relevant variables" (p. 27).

  16. Annex II.

  17. Although focused more on the executive bureaucracy, the World Bank Institute (WBI) is testing such instruments to develop a universally applicable tool.

  18. This method is explained in some detail for the detection of administrative corruption in Lopez Presa et al. It was also used in the Greylord investigation in Cook County, Illinois. See Special Commission.

  19. It is not just the retrogrades who object to these techniques. Many of the caveats raised here were also suggested in informal interviews with judges already engaged in reform work. Many are speaking from experience and the observable fact that even "good" judiciaries often receive less than perfect scores on public confidence or impressions of bias.

  20. Survey responses may say as much about the expectations of the informants as they do about the behaviour being reported. Differing norms on conflicts of interest or nepotism or about what constitute bribes (as opposed to normal social attentions) will obviously affect results and impede cross national comparisons. Outsiders (international entrepreneurs) may have a far different perception of the level of corruption, based on their different standards and possibly on a certain level of ignorance as to how local systems operate. Although lying outside the theme at hand, Jose Juan Toharia's example of Venezuelans' relatively low reporting of victimization by street crime is a case in point. It was only after repeated interviewing that the researchers discovered respondents were not including "minor" incidents like purse snatching or thefts of objects from cars. Cited in a lecture for the World Bank, July 14, 1999.

  21. See Special Commission.

  22. For example, in Ukraine and probably in other countries in the region, local authorities provide housing to judges and are said to use this, rather effectively, to influence judicial decisions. Ephraim Ugwuanye in private conversations and a paper written for the Bank recounts how such benefits (especially luxury vehicles) have been used to the same end in Africa. In El Salvador, until recently Court Presidents had their own slush fund, the use of which was not reported. The last president to enjoy that privilege used the fund as a campaign chest (providing meals, trips, vehicles and other equipment to lower ranking judges and private bar members) in his unsuccessful bid for reelection.

  23. The Messick checklist, Annex IV, in fact has two types of information - one focusing on problems, many of which derive from surveys, and the other focusing on structural and procedural traits. Unfortunately, this makes it extremely long and probably inappropriate for comparative use. It also may fall short on linking some of its extra-judicial problems to the judiciary - violence, crime, and undesirable business practices certainly have causes other than judicial failings.

  24. The policy dialogue is important in those areas (e.g. higher budgets and salaries, some legal and constitutional change) where external actors cannot operate directly. Assistance agencies may lobby for higher judicial salaries, but they rarely if ever provide funds for implementing them.

  25. Reddy and Pereira summarise the required changes as "micro institutional reform, concerning the modification of the internal procedures," "state-societal reform, involving the restructuring of links between public entities and civil society," and "macro institutional reform, involving the inter-institutional relationships between different branches and agencies of the state." (p. 27).

  26. However, Annex I does address some of these issues very briefly.

  27. It is very common for project justifications to incorporate country level data as indicators of massive problems. A little comparative work often reveals that the number represents an average or better-than-average rating. One example is the Latin American fixation on allocating 6 percent of the budget to the judiciary, a phenomenally high proportion on a universal basis.

REFERENCES

Buscaglia, Ed and Maria Dakolias (1999), "Comparative International Study of Court Performance Indicators: A Descriptive and Analytical Account." Working Document, Legal and Judicial Reform Unit, World Bank.

Chemonics (1996), "Informe del Diagnostico Institucional y su entorno." Report prepared under World Bank Proyecto de Reformas Judiciales, for Supreme Court of Bolivia.

Correa, Jorge S., Carlos Pena G., and Juan Enrique Vargas (1999), "Poder Judicial y Mercado: Quien debe pagar por la Justicia," Draft paper, Centro de Investigacion, Facultad de Derecho, Universidad Diego Portales, Santiago, Chile.

Dakolias, Maria (1995), "A Strategy for Judicial Reform: The Experience in Latin America." Virginia Journal of International Law, 36:1, pp. 167-231.

Florida International University (see Annex VIII)

Garapon, Antoine (1996), Le Gardien des Promesses, Justice et Democratie. Paris: Editions Odile Jacob.

Hammergren, Linn (1998) "Fifteen Years of Judicial Reform in Latin America: Where We Are and Why We Haven't Made More Progress," Paper presented at Regional Conference on Judicial Reform, Corporacion de Excelencia en la Justicia, Bogota, Colombia, July.

Lopez Presa, Jose Octavio (1998), coordinator, Corrupcion y cambio. Mexico: Secretaria de Contraloria y Desarrollo Administrativo.

Pastor, Santos (1998), Ah de la Justicia! Politica judicial y economia. Madrid: Editorial Civitas.

Reddy, Sanjay, and Anthony Pereira (1998), "The Role and Reform of the State," United Nations Development Program, Bureau for Policy Development, Office of Development Studies, Working Paper Series, No. 8, August.

Reimers, Fernando and Noel McGinn (1997), Informed Dialogue: Using Research to Shape Education Policy Around the World. Westport, Connecticut: Praeger.

Shihata, Ibrahim (1998), "The World Bank" in Edmundo Jarquin and Fernando Carrillo, editors, Justice Delayed: Judicial Reform in Latin America. Inter-American Development Bank, pp. 117-31.

Special Commission on the Administration of Justice in Cook County (1998), Final Report. September.

Tate, C. Neal and Torbjorn Vallinder (1995), editors, The Global Expansion of Judicial Power. New York University Press.

Toharia, Jose Juan (1999), "La Independencia Judicial y La Buena Justicia," unpublished draft based on a paper presented at the Workshop, "El Juez ante el Signo XXI: Etica y Funcion Judicial," Spanish Judicial School, May 18-21.

Ugwuanye, Ephraim (1999), "Rule of Law and Anti-Corruption Reform: Issue Paper," unpublished paper written for World Bank Institute.

USAID, Center for Democracy and Governance (1998), Handbook of Democracy and Governance Program Indicators. Washington, August.

Widner, Jennifer (1999), "Building Judicial Independence in Common Law Africa," in Andreas Schedler, Larry Diamond, and Marc F. Plattner, eds., The Self-Restraining State: Power and Accountability in New Democracies. Boulder: Lynne Rienner, pp. 151-176.
