Multiple-input multiple-output (MIMO) technology enables higher transmission capacity without additional frequency spectrum and is becoming part of many wireless system standards. Sphere detection has been introduced in MIMO systems to achieve maximum likelihood (ML) or near-ML estimation with reduced complexity. This paper reviews related work on sphere detector implementations and presents an application-specific instruction set processor (ASIP) implementation of a K-best list sphere detector (LSD) based on the transport triggered architecture (TTA). The implementation uses memory and a heap data structure to sort symbol vectors. The design space is explored by presenting several variations of the implementation and comparing their latencies and hardware complexities. An early proposal for a parallelized architecture with a decoding throughput of approximately 5.3 Mbps is also presented.
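As an illustration of the heap-based sorting idea, the following is a minimal software sketch of a K-best list search over an upper-triangular system. It is not the TTA implementation described in this paper. The function name `k_best_lsd`, its parameters, and the assumption that the QR decomposition has already produced the triangular matrix `R` and the rotated received vector `y` are all hypothetical choices made for this example. A bounded max-heap keeps the K candidates with the smallest partial Euclidean distances at each layer.

```python
import heapq


def k_best_lsd(R, y, constellation, K):
    """Return up to K (distance, symbols) candidates, smallest distance first.

    R is an upper-triangular n x n matrix (list of lists) and y a length-n
    vector; both are assumed to come from a QR preprocessing step done
    elsewhere. All names here are illustrative, not taken from the paper.
    """
    n = len(y)
    # Each survivor is (partial distance, symbols of the layers detected so far).
    survivors = [(0.0, ())]
    for i in range(n - 1, -1, -1):  # detect from the last layer to the first
        heap = []  # max-heap emulated with negated distances, at most K entries
        tie = 0    # unique counter so tuple comparison never reaches the symbols
        for ped, tail in survivors:
            # Interference from the already-detected layers i+1 .. n-1.
            interference = sum(R[i][i + 1 + j] * s for j, s in enumerate(tail))
            for s in constellation:
                d = ped + abs(y[i] - interference - R[i][i] * s) ** 2
                item = (-d, tie, (s,) + tail)
                tie += 1
                if len(heap) < K:
                    heapq.heappush(heap, item)
                elif d < -heap[0][0]:
                    # New candidate beats the current worst survivor.
                    heapq.heapreplace(heap, item)
        survivors = sorted(((-nd, cand) for nd, _, cand in heap),
                           key=lambda t: t[0])
    return survivors


if __name__ == "__main__":
    # Toy 2x2 real-valued example with BPSK symbols; y equals R times (1, -1).
    R = [[1.0, 0.5], [0.0, 1.0]]
    y = [0.5, -1.0]
    for dist, symbols in k_best_lsd(R, y, [-1, 1], K=2):
        print(dist, symbols)
```

The root of the max-heap is always the worst retained candidate, so each new candidate needs only one comparison to decide whether it enters the list. This is one common way to maintain a K-best list; a hardware design may organize the sorting differently.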