Chirantan Banerjee, MD

Howard G, Voeks JH, Meschia JF, Howard VJ, Brott TG. Picking the Good Apples: Statistics Versus Good Judgment in Choosing Stent Operators for a Multicenter Clinical Trial. Stroke. 2014.

Despite the most compulsive adherence to pre-, intra-, and post-procedure protocols, notable disparities in patient outcomes after surgery or procedures persist. Current guidelines specify that the procedural morbidity and mortality rate for carotid revascularization should be <3% in asymptomatic carotid stenosis patients and <6% in symptomatic patients. The Carotid Revascularization Endarterectomy versus Stenting Trial (CREST) was lauded for its low periprocedural stroke and death rates in both asymptomatic and symptomatic carotid stenosis.

In this study, Howard et al. examine the data from this important trial and assess how much statistics alone can contribute to estimating an operator's true complication rate. In CREST, a lead-in registry was used to evaluate potential stent operators, with an average of 24 cases reviewed per operator by an interventional management committee. For an operator to demonstrate a <3% complication rate over 24 cases, he or she could not have even a single complication, since 1 of 24 is roughly 4.2%. Yet if an operator performs 24 procedures without any complication, concluding that his or her true complication rate is 0% may be wrong; the same operator might have 3 complications in the next 24 patients. Assessing performance by this method alone is therefore prone to error.

Because the outcome is binary (complication vs. success) and thus follows a binomial distribution, the probability of exactly x complications out of 24 cases can be calculated. If we were to arbitrarily establish that an operator with, say, x or fewer complications out of 24 is a "good" operator, we can then quantify how well this rule identifies good and bad operators. The authors defined the "error rate" as the sum of the expected percent of good operators (with a "true" 2% event rate) excluded, plus the percent of poor operators (with a true 6%, 8%, or 10% event rate) included. They then calculate, for each cohort of 100 operators with "true" complication rates ranging from 0 to 20% who each perform 24 procedures, how many operators on average will have no complications, how many 1 complication, how many 2 complications, and so forth. This reveals that if statistics alone is used, classifying operators as "good" or "bad" is fraught with significant error unless the sample size is much larger. The fact that the committee was nonetheless able to successfully include "good" operators strongly suggests the merit of combining review of performance, case volume, and a subjective review of technique over pure statistics.
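
The binomial reasoning above can be illustrated with a short calculation. The sketch below (Python; not the authors' code) computes the probability of x complications in 24 lead-in cases at a given true rate, and scores an acceptance rule of the form "include an operator with at most x complications out of 24" using the error-rate idea described in the paper. The 2% "good" rate, the 6%/8%/10% "poor" rates, and the 24-case lead-in come from the text; the function names and the way the poor rates are combined (averaged here) are my assumptions.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x complications in n cases at true rate p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob_at_most(x_max: int, n: int, p: float) -> float:
    """Probability an operator has x_max or fewer complications in n cases."""
    return sum(binom_pmf(x, n, p) for x in range(x_max + 1))

def error_rate(x_max: int, n: int = 24, good_p: float = 0.02,
               poor_ps=(0.06, 0.08, 0.10)) -> float:
    """Percent of 'good' operators excluded plus percent of 'poor' operators
    included (averaged over the poor rates) under the rule 'accept if <= x_max'."""
    pct_good_excluded = 100 * (1 - prob_at_most(x_max, n, good_p))
    pct_poor_included = 100 * sum(prob_at_most(x_max, n, p)
                                  for p in poor_ps) / len(poor_ps)
    return pct_good_excluded + pct_poor_included

if __name__ == "__main__":
    # Even a "perfect" lead-in is weak evidence: a truly 2% operator has a
    # ~62% chance of zero complications in 24 cases, while a truly 10%
    # operator still has a ~8% chance of the same clean record.
    print(f"P(0 of 24 | p=0.02) = {binom_pmf(0, 24, 0.02):.3f}")
    print(f"P(0 of 24 | p=0.10) = {binom_pmf(0, 24, 0.10):.3f}")
    for x_max in range(4):
        print(f"accept if <= {x_max} complications: "
              f"error rate = {error_rate(x_max):.1f}%")
```

Running this shows the point of the paper numerically: with only 24 cases per operator, no threshold cleanly separates 2% operators from 6% to 10% operators, so any purely statistical rule misclassifies a substantial fraction of them.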

When I performed a cursory search on how to quantify operator performance, I was surprised to learn that the data are very limited. An article in the NEJM last year found that, among practicing bariatric surgeons in Michigan, higher peer ratings of operative technical skill were associated with fewer postoperative complications. At the end of the day, it probably comes down to the fact that procedural skill is as much an art as a science, and experience and peer evaluation may offer the best odds of correctly triaging operator skill and technical prowess!