Abstract: In this article, we propose a novel scheme for diagnosing intermittent faults in cloud systems. We investigated the characteristics of high-level symptomatic behavior on top of a cloud system and identified that (1) the arrival counts of high-level symptoms increase with the number of fault injections at different rates, which may help differentiate one fault model from another; (2) the nesting level of fatal traps is indicative of fault duration, which is helpful for fault model diagnosis; and (3) the fatal traps triggered by certain faulty units provide useful information for locating faults. Based on these features, we define an n-dimensional space in which each dimension is a symptom's arrival rate (the growth slope of its arrival count), which formulates the diagnosis problem as a pattern recognition problem. We then propose an online hardware fault diagnosis scheme based on a backpropagation neural network. Experimental results show that the diagnosis accuracy of fault location is 99.2%,...