摘要:Insertion sequence (IS) 6110 is found at multiple sites in the Mycobacterium tuberculosis genome and displays a high degree of polymorphism with respect to copy number and insertion sites. Therefore, IS 6110 is considered to be a useful molecular marker for diagnosis and strain typing of M. tuberculosis . Generally IS 6110 elements are identified using experimental methods, useful for analysis of a limited number of isolates. Since short read genome sequences generated using next-generation sequencing (NGS) platforms are available for a large number of isolates, a computational pipeline for identification of IS 6110 elements from these datasets was developed. This study shows results from analysis of NGS data of 1377 M. tuberculosis isolates. These isolates represent all seven major global lineages of M. tuberculosis . Lineage specific copy number patterns and preferential insertion regions were observed. Intra-lineage differences were further analyzed for identifying spoligotype specific variations. Copy number distribution and preferential locations of IS 6110 in different lineages imply independent evolution of IS 6110 , governed mainly through ancestral insertion, fitness (gene truncation, promoter activity) and recombinational loss of some copies. A phylogenetic tree based on IS 6110 insertion data of different isolates was constructed in order to understand genome level variations of different markers across different lineages.