Abstract: Existing diversity models for sensitive attributes do not capture the semantic similarity between sensitive values, so they cannot resist semantic similarity attacks. To address this problem, we present a method for measuring the semantic similarity of the values of a categorical sensitive attribute, based on the attribute's semantic hierarchy tree. On the basis of this measure, we propose an (l, e)-diversity model that imposes two constraints on each equivalence class: (1) the class contains at least l well-represented sensitive values; (2) no two sensitive values in the class are e-similar. Furthermore, we design a linear-complexity maximum-bucketization greedy algorithm to implement the model. Experimental results show that anonymized data satisfying (l, e)-diversity has a higher degree of diversity than data satisfying l-diversity, so (l, e)-diversity protects privacy more effectively than l-diversity.
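The two constraints of the model can be illustrated with a minimal Python sketch. The abstract does not give the similarity measure itself, so everything concrete here is an assumption made for illustration: the hierarchy tree `PARENT`, the rule that two distinct values are e-similar when their path length in the tree is at most e, and the reading of "well-represented" as "distinct" (the simplest instantiation of l-diversity). The names `tree_distance`, `is_e_similar`, and `satisfies_le_diversity` are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical semantic hierarchy tree for a "diagnosis" attribute,
# stored as child -> parent. The root "any" has no parent entry.
PARENT = {
    "flu": "respiratory",
    "pneumonia": "respiratory",
    "gastritis": "digestive",
    "ulcer": "digestive",
    "respiratory": "disease",
    "digestive": "disease",
    "disease": "any",
    "fracture": "injury",
    "injury": "any",
}


def ancestors(value):
    """Return the path from value up to the root, starting with value."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a, b):
    """Number of edges on the path between a and b in the hierarchy tree."""
    path_b = ancestors(b)
    index_in_b = {node: i for i, node in enumerate(path_b)}
    for i, node in enumerate(ancestors(a)):
        if node in index_in_b:  # first shared node is the lowest common ancestor
            return i + index_in_b[node]
    raise ValueError("values do not share a hierarchy tree")


def is_e_similar(a, b, e):
    """Assumed rule: distinct values are e-similar if their tree distance <= e."""
    return a != b and tree_distance(a, b) <= e


def satisfies_le_diversity(sensitive_values, l, e):
    """Check one equivalence class against both (assumed) constraints."""
    distinct = set(sensitive_values)
    # Constraint 1 (simplified): at least l distinct sensitive values.
    if len(distinct) < l:
        return False
    # Constraint 2: no pair of distinct sensitive values is e-similar.
    return not any(is_e_similar(a, b, e) for a, b in combinations(distinct, 2))
```

Under these assumptions, a class holding `flu`, `pneumonia`, and `gastritis` has three distinct values but fails for e = 2, because `flu` and `pneumonia` are siblings under `respiratory`. A class holding `flu`, `gastritis`, and `fracture` passes for l = 3 and e = 2, since every pair is at least four edges apart.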