Aurora Semerano, MD

Kamel H, Navi BB, Parikh NS, Merkler AE, Okin PM, Devereux RB, Weinsaft JW, Kim J, Cheung JW, Kim LK, et al. Machine Learning Prediction of Stroke Mechanism in Embolic Strokes of Undetermined Source. Stroke. 2020;51:e203–e210.

In 2014, when the concept of embolic stroke of undetermined source (ESUS) was proposed,1 there was confidence that ESUS represented a single entity that would benefit from a unified treatment. However, after two randomized clinical trials failed to show a benefit of direct oral anticoagulants for secondary prevention in ESUS patients,2,3 the prevailing opinion is that these patients instead represent a heterogeneous population and are likely to benefit from tailored, personalized therapies. Today, ESUS remains a useful definition for identifying patients who deserve an extended diagnostic workup, but the optimal preventive therapy for these patients remains elusive, and clinical stroke recurrence is still an issue. Subgroup analyses of the above-mentioned trials and new research studies have been completed or are ongoing, with the aim of better understanding the pathophysiology of ESUS and improving patient selection.

In such a phenotypically heterogeneous population, a major effort is to identify patient subsets that share a single underlying mechanism, or a group of mechanisms, likely to respond to an established treatment. With precisely this aim of uncovering the “hidden structure” in a complex scenario, the recent study by Kamel et al.4 employs a machine-learning approach. First, a supervised machine-learning algorithm was developed to distinguish cardioembolic from non-cardioembolic strokes in a population of 1083 patients with known stroke etiology, using data on demographics, comorbidities, vital signs, laboratory results, and echocardiograms. After training, the classifier distinguished cardioembolic from non-cardioembolic strokes with excellent discrimination (area under the curve, AUC=0.85).
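For readers less familiar with the AUC, it can be read as the probability that a randomly chosen cardioembolic stroke receives a higher predicted score than a randomly chosen non-cardioembolic stroke. The minimal sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not data or code from Kamel et al.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 = cardioembolic, 0 = non-cardioembolic (hypothetical coding)
    scores: predicted probability of a cardioembolic mechanism
    Tied pairs count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each class")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example with invented values (not study data).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.7, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported value of 0.85 should be read.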

Second, the classifier was applied to a population of 580 ESUS patients and predicted that 44% of ESUS cases (95% credibility interval, 39%–49%) in the studied cohort could have resulted from occult cardiac embolism. When ESUS patients in the lowest quartile of predicted likelihood of occult cardioembolism were compared with those in the highest quartile, patients with the highest probability of a cardioembolic mechanism were older; more likely to be White; more likely to have a history of coronary artery disease, heart failure, and peripheral vascular disease; and less likely to be active smokers. They also presented with more severe strokes, lower ejection fractions, larger left atria, and lower blood pressure.
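The quartile comparison described above can be sketched as follows. This is a simplified illustration, not the authors' procedure: it takes the mean predicted probability as a crude estimate of the cardioembolic fraction (the published estimate and its credibility interval are not reproduced here), and all names and values (`predicted`, `ages`, `extreme_quartiles`) are hypothetical.

```python
from statistics import mean


def extreme_quartiles(probabilities):
    """Return (lowest, highest) quartile as lists of patient indices."""
    order = sorted(range(len(probabilities)), key=lambda i: probabilities[i])
    k = len(order) // 4  # quartile size, rounded down
    if k == 0:
        raise ValueError("need at least four patients")
    return order[:k], order[-k:]


# Invented predicted probabilities and ages for eight patients (not study data).
predicted = [0.10, 0.80, 0.35, 0.60, 0.20, 0.90, 0.45, 0.40]
ages = [58, 79, 66, 72, 61, 83, 70, 68]

# Crude estimate of the fraction of strokes attributed to cardiac embolism.
estimated_fraction = mean(predicted)

lowest, highest = extreme_quartiles(predicted)
mean_age_lowest = mean(ages[i] for i in lowest)
mean_age_highest = mean(ages[i] for i in highest)

print(estimated_fraction, mean_age_lowest, mean_age_highest)
```

The same grouping can then be reused for any other characteristic (comorbidities, ejection fraction, left atrial size) to compare the two extreme groups.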

Finally, to test the validity of the classifier, the model's predictions were compared with the rate of atrial fibrillation diagnosed after hospital discharge, since ESUS patients at the authors’ center routinely underwent 30 days of continuous heart-rhythm monitoring with an external loop recorder. Interestingly, atrial fibrillation was detected in 9.1% of patients in the highest quartile vs. 2.9% of those in the lowest quartile of predicted likelihood of a cardioembolic source.
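To put the two detection rates in perspective, the short calculation below derives the relative and absolute difference from the reported percentages alone. Patient counts per quartile are not given in this commentary, so no confidence interval or significance test is attempted; the variable names are illustrative.

```python
# Reported atrial fibrillation detection rates, in percent.
af_rate_highest_quartile = 9.1
af_rate_lowest_quartile = 2.9

# Relative difference: how many times more often AF was detected.
rate_ratio = af_rate_highest_quartile / af_rate_lowest_quartile

# Absolute difference, in percentage points.
rate_difference = af_rate_highest_quartile - af_rate_lowest_quartile

print(f"ratio: {rate_ratio:.1f}, difference: {rate_difference:.1f} percentage points")
```

In other words, atrial fibrillation was detected roughly three times as often in the highest quartile, although it remained uncommon in absolute terms in both groups.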

The article is an encouraging example of how machine-learning techniques can complement and extend conventional statistical methods in stroke medicine, providing unique tools to better understand complex data sets. What, then, are the potential clinical implications and perspectives of this study? The proposed machine-learning approach may help set the stage for future clinical trials by informing the definition of appropriate eligibility criteria. It is interesting to note that the two ongoing randomized clinical trials5,6 assessing the efficacy of oral anticoagulants for secondary prevention in ESUS enroll a selected population of patients with markers of cardiac disease, which illustrates the recent shift towards personalized treatment in stroke medicine. Machine learning could also help guide decisions about the intensity of the diagnostic workup in these patients and the most appropriate protocols for it. In addition, the data-driven comparison between the two predicted stroke subpopulations (cardioembolic vs. non-cardioembolic) may provide novel pathophysiological insights.

Limitations are also discussed extensively in the article. The results need validation in other cohorts, since local factors can influence the performance of artificial intelligence models. In addition, the final validity assessment, based on post-discharge diagnosis of atrial fibrillation (lasting > 30 seconds), provides indirect but not definitive evidence of an underlying cardioembolic mechanism, and longer-lasting atrial fibrillation could be proposed as an alternative validation criterion. Brain imaging features from a subset of patients were also analyzed but did not add value to the prediction model. However, comprehensive data from imaging of the large vessels should be the object of further investigation. Future machine-learning approaches that focus on other occult stroke mechanisms (e.g., non-stenosing large artery atherosclerosis) and/or examine recurrent stroke rates to identify the characteristics of patients at the highest risk are just two further examples of possible applications of artificial intelligence to the investigation of stroke etiology.

Artificial intelligence is an exciting resource in stroke research, with promising implications for various aspects of clinical stroke care.7,8 Every stroke physician records large amounts of data in electronic records as part of daily practice; these data are often underutilized in the busy clinical setting and are difficult to handle with conventional statistical methods. Nonetheless, it is becoming clear that, if properly integrated and interpreted by machine-learning approaches, they may yield information that could influence clinical evaluation. Intensifying the dialogue with data scientists and providing specific training for interested residents, fellows, and specialists are both desirable for the present and the future of stroke medicine. Publication in the scientific literature with full details of the methods employed, data transformation, handling of missing data, etc., is pivotal for the generalizability of research and the development of useful clinical algorithms.


1. Hart RG, Diener HC, Coutts SB, et al. Embolic strokes of undetermined source: the case for a new clinical construct. Lancet Neurol. 2014;13(4):429-438.

2. Hart RG, Sharma M, Mundell H, et al. Rivaroxaban for Stroke Prevention after Embolic Stroke of Undetermined Source. N Engl J Med. 2018;378(23):2191-2201.

3. Diener HC, Sacco RL, Easton JD, et al. Dabigatran for Prevention of Stroke after Embolic Stroke of Undetermined Source. N Engl J Med. 2019;380(20):1906-1917.

4. Kamel H, Navi BB, Parikh NS, et al. Machine Learning Prediction of Stroke Mechanism in Embolic Strokes of Undetermined Source. Stroke. 2020;51:e203-e210.

5. Kamel H, Longstreth WT Jr, Tirschwell DL, et al. The AtRial cardiopathy and antithrombotic drugs in prevention after cryptogenic stroke randomized trial: rationale and methods. Int J Stroke. 2019;14:207-214.

6. Geisler T, Poli S, Meisner C, et al. Apixaban for treatment of embolic stroke of undetermined source (ATTICUS randomized trial): Rationale and study design. Int J Stroke. 2017.

7. Shameer K, Johnson KW, Glicksberg BS, Dudley JT, Sengupta PP. Machine learning in cardiovascular medicine: are we there yet? Heart. 2018;104(14):1156-1164.

8. Mouridsen K, Thurner P, Zaharchuk G. Artificial Intelligence Applications in Stroke. Stroke. 2020;51(8):2573-2579.