12th Greek Australian Legal and Medical Conference
Samos, Greece 2009

The Question of Proof - Contrasting Approaches in Medicine and the Law

by Nicholas Bett

Some years ago I was asked to represent the Cardiac Society on a panel of medical practitioners appointed to assist the Professional Services Review (PSR) scheme [1], which was established to protect patients and the community from inappropriate practice by health practitioners and to protect the Commonwealth from having to meet the cost of these services. In short, it is there to discourage or to deal with those who seek to rort the system. The government takes this very seriously, given the very large sums of money involved. Australia spends around 10% of GDP on health care, two thirds of which is government expenditure.

Typically investigations by Medicare Australia into a practice are prompted by scrutiny and statistical analysis of claims for reimbursement, which may disclose a high volume of services or excessive prescribing of drugs under the Pharmaceutical Benefits Scheme. One trigger is known as the “80/20 rule”: if a medical practitioner renders more than eighty professional attendances on each of twenty or more days in a 12-month period he or she (nearly always he) is deemed to have practised inappropriately unless there are exceptional circumstances. Other examples of inappropriate service include carrying out unnecessary procedures or not providing adequate clinical input.
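As a rough illustration only, the “80/20” trigger as I have described it can be expressed in a few lines of Python. The function name, its parameters and the sample data are my own inventions for the purpose of the sketch; I do not know how Medicare Australia actually implements the rule, and the sketch assumes that the list of attendance dates has already been restricted to a single 12-month period.

```python
from collections import Counter
from datetime import date, timedelta


def breaches_80_20(attendance_dates, per_day_limit=80, day_limit=20):
    """Return True if more than `per_day_limit` attendances were rendered
    on each of `day_limit` or more days.

    `attendance_dates` holds one date per attendance and is assumed to
    cover a single 12-month period (hypothetical input format).
    """
    per_day = Counter(attendance_dates)
    busy_days = sum(1 for count in per_day.values() if count > per_day_limit)
    return busy_days >= day_limit


# Hypothetical data: 81 attendances a day on 20 days, and on only 19 days.
start = date(2008, 1, 1)
twenty_busy_days = [start + timedelta(days=d) for d in range(20) for _ in range(81)]
nineteen_busy_days = [start + timedelta(days=d) for d in range(19) for _ in range(81)]

print(breaches_80_20(twenty_busy_days))    # the trigger is met
print(breaches_80_20(nineteen_busy_days))  # one busy day short of the trigger
```

The sketch says nothing about exceptional circumstances, which in practice would have to be considered before any conclusion was drawn.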

If there appear to be grounds for concern, Medicare Australia may refer the practitioner to the Director of PSR, who will examine a sample of the practitioner's medical records and produce a report to which the practitioner is invited to respond. If the Director then believes that there is a case to be answered he may negotiate with the practitioner (eg to reimburse Medicare or to modify practice) or refer the matter to a PSR Committee.

Typically a Committee comprises as chairperson a Deputy Director, who has had some training and experience in conducting hearings, and two members chosen from the PSR Panel because their practices are similar to that of the person under review. The members of this Panel are appointed by the Minister for Health and Ageing.

A Committee examines and considers the clinical records provided in response to a demand from the Director of PSR and any other evidence including submissions from the person under review; it then produces a draft report for the Director. The role of such a Committee may be similar to that of a tribunal of judges or examining magistrates such as we see in the French inquisitorial system of justice.

Inappropriate practice may be deemed to have occurred if the practitioner's conduct would be unacceptable to the general body of his peers. An unfavourable decision by the Director or the PSR Committee is referred to a Determining Authority to decide what sanctions should apply: these may include reprimand and counselling by the Director, repayment of Medicare benefits and partial or full disqualification from Medicare for a time. The practitioner may make submissions and seek judicial review in the Federal Court. Medicare Australia may refer up to one hundred cases a year to PSR.

The Director may publish such details as the practitioner's name and address, profession or speciality, the nature of the inappropriate practice and the sanctions imposed [1]. These are three fairly typical cases.

Dr T is clearly a man of great energy and stamina but was caught in the “80/20” net. He saw so many patients in the course of a year that in order to have provided appropriate care he would have to have worked for up to twenty-three hours each working day. Sadly, his clinical management was inadequate and his records seriously deficient. He issued a prescription, usually for more than one drug, at almost every consultation. He was disqualified from Medicare for two years and required to repay $138,000. He has employed every possible legal action to postpone the enforcement of these sanctions.

Dr S provided more services than 99% of the medical practitioners in Australia. His clinical input was scanty and his records rudimentary. His practice of asking patients if they felt hot instead of using a thermometer was thought to be unacceptable. He was disqualified from Medicare and directed to repay $117,000.

Dr B’s practice was inappropriate in that he had prescribed antibiotics when there was no indication of bacterial infection. He lacked knowledge of the use of common drugs and failed to examine his patients properly. He conceded that he did not understand the readings generated by his respiratory function testing machine. He was disqualified from Medicare, required to repay $250,000 and brought to the attention of his state Medical Board.

As I have noted, an investigation begins with a data-dredging exercise by Medicare Australia. I have no doubt that they have a number of algorithms to identify patterns of practice which are outside statistical norms, of which the “80/20” rule is a well-known example. There are standard techniques of sampling information and of determining the likelihood that particular patterns might be found by chance; these are used to identify those whose practices warrant scrutiny. Recently there has been some concern about ensuring the confidentiality of medical records if Medicare or another government agency has access to them.

We must remember that in exercises of this kind there is always the possibility that a particular conclusion will be reached by chance. Chance, or luck, may be a treacherous mistress.
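A small sketch may make the point about chance concrete. It assumes, purely for illustration, that each statistical check is independent and has the same probability (alpha) of flagging a perfectly proper practice by chance; the function name and the figures are mine and do not describe any procedure actually used by Medicare Australia.

```python
def chance_of_any_false_alarm(n_checks, alpha=0.05):
    """Probability that at least one of `n_checks` independent checks
    flags an innocent practice purely by chance, when each check has a
    false-alarm probability of `alpha` (both values are assumptions)."""
    return 1 - (1 - alpha) ** n_checks


# The more patterns we look for, the more likely it is that chance
# alone will produce at least one apparent anomaly.
for n in (1, 5, 20):
    print(n, round(chance_of_any_false_alarm(n), 3))
```

On these assumptions a single check misleads us one time in twenty, but twenty such checks applied to the same blameless practice will more often than not turn up something.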

I found my limited experience of the PSR procedures fascinating although they may be very stressful for someone being investigated. I was particularly interested in the process and in comparing this to how our legal system works and how we try to make scientific judgements and to arrive at what might be truth.

For my purposes I have identified three kinds of proof which differ in how they are derived and the degree of certainty we may attach to them. What I term absolute proof is exemplified by Pythagoras’ theorem [2]. Lawyers weigh evidence, which is often unclear and conflicting, in order to reach a categorical decision, whereas physicians and scientists employ a variety of statistical tests to answer the question: “What is the probability that the same data would be found and the same conclusion reached by chance?”

It is appropriate that my first example of absolute proof is the theorem [2] of Pythagoras [3] of Sámos. Once this has been understood there is nothing more to be said. The facts admit of no other interpretation. Epicurus [4], another native of Sámos, declared: “For in the study of nature we must not conform to empty assumptions and arbitrary laws, but follow the promptings of the facts.”

My other example of absolute proof is the demonstration by William Harvey [5] of the circulation of the blood. Harvey studied at Gonville and Caius College in the University of Cambridge and in Padua under Fabricius of Aquapendente before returning to become the Lumleian lecturer of the College of Physicians in London and Physician to James I and Charles I. He was unwilling to accept the authority of the great anatomists and natural philosophers of antiquity, including Galen and Vesalius. In his words: “It is base to receive instructions from others’ comments without examination of the objects themselves, especially as the book of nature lies so open and is so easy of consultation” [6].

In 1628 he published the account of his investigations as Exercitatio anatomica de motu cordis et sanguinis in animalibus [7]. It includes this figure which illustrates how venous valves permit blood to flow only towards the heart. This was part of his elegant demonstration that blood does not ebb and flow but circulates constantly throughout the body.

I expect that those who practise law rarely have the luxury of such certainty. As I understand it, the level of proof required in civil cases is substantially less than it is in criminal cases: it may be sufficient to establish that there is a preponderance of evidence in favour of a case.

Lord Denning, who died ten years ago in his one hundred and first year, spoke of a balance of probabilities whereby it is sufficient to establish that an outcome is more probable than not. The learned judge may be best known for his inquiry into the Profumo scandal. He had some very firm views on a number of issues and has become something of a cult figure [8]; there is now a Lord Denning Appreciation Society on Facebook where you may find some more or less scurrilous anecdotes.

In criminal cases the criteria are more stringent, so that a verdict of guilty requires clear and convincing evidence, beyond reasonable doubt. In British law the man on the Clapham omnibus is a fictional character who may be invoked as a hypothetical person against whom a defendant's conduct might be judged. His Australian cousin is the man on the Bondi tram.

AP Herbert [9] was a law-reform activist, novelist, poet, playwright and the independent MP for Oxford University. His comment on the reasonable person is worth noting, as are some of his views on other aspects of contemporary society. The reasonable person has been called an excellent but odious character. “He is an ideal, a standard, the embodiment of all those qualities which we demand of the good citizen ... [he] invariably looks where he is going, ... is careful to examine the immediate foreground before he executes a leap or bound; ... neither star-gazes nor is lost in meditation when approaching trapdoors or the margins of a dock; ... never mounts a moving [bus] and does not alight from any car while the train is in motion, ... uses nothing except in moderation, and even flogs his child in meditating only on the golden mean.”

In science and medicine our task is commonly to make statistical sense of data acquired from observations of a number of subjects or patients, be they rats or men. The question posed is generally: “What is the probability that these findings could occur by chance?” It usually comes down to using one or more of the statistical tests which have evolved since the early years of the 20th century.

The smaller the effects of treatment or of an intervention the greater will be the number of patients or subjects we shall have to study in order to show a statistically significant effect.

An example of this is a study of two drugs which interfere with the function of platelets and hence reduce the ability of blood to clot. This was published over ten years ago [10]. The drugs compared were aspirin, which has been around for over a century, and clopidogrel. Patients with arterial disease were randomised to take one tablet each day of aspirin or clopidogrel.

Over nineteen thousand patients were enrolled around the world and followed for up to three years. In order to keep us honest we and our patients were blinded so that we were unaware of who was taking which drug. The end-points were ischæmic stroke, myocardial infarction (heart attack) or vascular death.

The first undertaking for those designing a trial of this sort is to come up with a suitable acronym, such as CAPRIE. The next task is to estimate how many patients or subjects will be needed in order to have a reasonable chance of finding a statistical difference between the treated groups. This involves making some assumptions and an educated guess about the relative efficacy of the drugs you are comparing.
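To show why a small difference demands a large trial, here is a minimal sketch of one textbook calculation, the normal-approximation formula for comparing two proportions. I must stress that this is not the calculation used by the CAPRIE investigators: the function name, the annual event rates, the significance level and the power are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(p_control, p_treated, alpha=0.05, power=0.90):
    """Approximate number of patients needed in each group to detect a
    difference between two event rates (two-sided test, normal
    approximation). All inputs are assumptions supplied by the designer."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical event rates: halving the expected difference
# roughly quadruples the number of patients required.
large_effect = patients_per_group(0.10, 0.08)
small_effect = patients_per_group(0.10, 0.09)
print(large_effect, small_effect)
```

The number required rises with the inverse square of the expected difference, which is why a trial looking for a difference of a fraction of a percent must enrol many thousands of patients.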

The usual convention is that a difference is statistically significant if the probability that it has occurred by chance is less than one in twenty (5%). This may be expressed as a value for p of less than 0.05. In the case of CAPRIE this magic threshold was achieved, with a p value of 0.043.
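For those who like to see the arithmetic, the following sketch computes a two-sided p value for the difference between two proportions using a simple pooled z-test. It is offered only as an illustration of the convention: it is not the analysis used in CAPRIE, and the function name and the counts in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value for the difference between two event rates,
    using a pooled z-test (normal approximation)."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts, not taken from any trial.
p = two_proportion_p_value(100, 1000, 50, 1000)
print("significant at the 5% level" if p < 0.05 else "not significant at the 5% level")
```

Whether a difference which clears this threshold is large enough to matter to a patient is a separate question which no p value can answer.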

This plot shows on the y-axis the cumulative rate of events in the two groups against time (x-axis). The lines (aspirin dashed, clopidogrel dotted) diverged slightly out to three years but the risk of an event differed by only around half a percent a year. You may make your own judgement about whether this difference is important. Although this trial cost around two hundred and fifty million dollars the pharmaceutical companies involved probably felt that the expense was justified.

Annual sales of clopidogrel are now around seven billion dollars, making it the world's second-largest selling drug. These sales are driven largely by the need for patients with coronary artery stents to take clopidogrel, often indefinitely.

My other example is of a rather ludicrous situation which arose because of an obsession with levels of statistical significance. This comes from a study which was initiated and driven by Boston Scientific, a company which manufactures medical devices including coronary stents. The market for these is around five billion dollars annually, of which Boston Scientific has a large share.

In order to persuade the FDA to approve their new Liberté stent they conducted a study known as ATLAS in which they compared the outcomes of patients in whom these had been used with another group treated with their first-generation stent. They were delighted to be able to report that their Liberté stent was not inferior to their previous model [11]. Unfortunately for Boston Scientific, the Wall Street Journal published its own analysis of the ATLAS findings, using a number of other statistical tests [12].

This plot is modified from one which appeared in the Journal: it shows that whereas the Wald interval test used by Boston Scientific gave a value for p which was less than the magic level of 0.05 (to the left of the vertical dashed line) a number of other perfectly reputable tests produced values which were very slightly greater (p > 0.05), so that on these criteria the non-inferiority test had failed. In short, the Journal alleged that the ATLAS value for p was based on a flawed statistical equation that skewed the results in favour of the Liberté stent.
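How two reputable methods can fall on opposite sides of a threshold may be seen in a much simplified analogue. The sketch below compares the Wald interval with the Wilson score interval for a single proportion; it does not reproduce the ATLAS analysis, which concerned a difference between two groups, and the counts and the margin of 0.14 are hypothetical values chosen to make the point.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(events, n, confidence=0.95):
    """Wald (simple normal-approximation) interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = events / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_interval(events, n, confidence=0.95):
    """Wilson score interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = events / n
    denominator = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denominator
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    return centre - half_width, centre + half_width


# Hypothetical data: 8 events in 100 patients, judged against an
# arbitrary margin of 0.14 for the upper limit of the interval.
margin = 0.14
wald_upper = wald_interval(8, 100)[1]
wilson_upper = wilson_interval(8, 100)[1]
print("Wald upper limit below margin:", wald_upper < margin)
print("Wilson upper limit below margin:", wilson_upper < margin)
```

With these invented figures the Wald interval stays inside the margin while the Wilson interval crosses it, so that the same data pass on one criterion and fail on the other.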

You may feel that this argument is as pointless as debating how many angels can dance on the head of a pin. The FDA [13] has approved this stent, which is a perfectly acceptable device which most of us find easier to use than its predecessor.

The choice and use of appropriate statistical tools is a discipline in its own right and is much more complex than I have the time or indeed the understanding to explore further today.

Learned professional societies such as the American College of Cardiology and the Cardiac Society of Australia and New Zealand and government organisations publish guidelines on a number of clinical issues. In Britain the National Institute for Health and Clinical Excellence [14] (this forms the catchy acronym NICE) was established to remove inequalities and inefficiencies and to rationalise the provision of health care by providing clear and robust advice to NHS staff. Some of the issues were discussed in a very erudite way by the chairman of NICE, Michael Rawlins, in his Harveian lecture to the Royal College of Physicians last year [6]. He outlined a hierarchy of levels of evidence which may support decisions about the choice of therapy.

NICE faced a good deal of public scrutiny and criticism over its deliberations on the provision of trastuzumab [15] for patients with breast cancer. This is an antibody which is effective for some patients, but its use is complex and expensive. It may cause heart failure, so that those who are treated require repeated tests of cardiac function. The annual cost for treating a patient with this drug is around $50,000; it is estimated that the median increase in survival is around five months. NICE was castigated for taking several years to approve its use for NHS patients, although clearly the issues are complex.

I have tried to examine some of the differences in the way in which lawyers and physicians examine and analyse evidence to discern what is truth. Lawyers seem to have a fairly pragmatic or commonsense attitude whereas doctors or scientists (some are both) try, sometimes successfully, to formulate a more structured or mathematical approach. The processes followed by Medicare Australia and Professional Services Review may have some elements of both cultures.

I am very grateful to the organizers and sponsors for inviting us to this splendid meeting on this beautiful island. It has been a memorable experience.

Appendix

Epicurus made some other memorable comments.

Lord Denning was sometimes outrageous but often amusing and always interesting.

I alluded also to some remarks by AP Herbert on the reasonable man. Here are some of his other thoughts on life.

Footnotes

[1] www.psr.gov.au

[2] a² + b² = c²

[3] Pythagoras, Sámos circa 580 BC-Metapontum circa 490 BC

[4] Epicurus, Sámos 341 BC-Athens 270 BC

[5] 1578-1657

[6] quoted in Rawlins M. De testimonio: on the evidence for decisions about the use of therapeutic interventions. Lancet 2008; 372: 2152

[7] An Anatomical Exercise Concerning the Motion of the Heart and Blood in Animals

[8] http://www.guardian.co.uk/uk/1999/mar/06/claredyer1

[9] 1890–1971

[10] Clopidogrel versus Aspirin in Patients at Risk of Ischæmic Events. Lancet 1996; 348: 1329

[11] Turco MA, Ormiston JA, Popma JJ et al. Polymer-Based, Paclitaxel-Eluting TAXUS Liberté Stent in De Novo Lesions: The Pivotal TAXUS ATLAS Trial. JACC 2007; 49: 1676

[12] Winstein KJ. Boston Scientific Study Flawed. Wall Street Journal August 14th 2008

[13] Food and Drug Administration (www.fda.gov)

[14] Rawlins M. De testimonio: on the evidence for decisions about the use of therapeutic interventions. Lancet 2008; 372: 2152

[15] Herceptin™