Transactional memory proposes an alternative synchronization primitive to traditional locks. Its promise is to simplify the software development of multi-threaded applications while at the same time delivering the performance of parallel applications using (complex and error prone) fine grain locking.
This study reports our experience implementing a realistic application using transactional memory. The application is Lee's routing algorithm and was selected for its abundance of parallelism but actual difficulty of expressing it with locks. Each route between a source and a destination point in a grid can be considered a unit of parallelism. Starting from this simple approach, we evaluate the performance of a transactional parallel implementation and explore how it can be adapted to deliver better performance. The adaptations do not introduce locks nor alter the essence of the implemented algorithm, but deliver up to 20 times performance improvements. The adaptations are derived from understanding the application itself and transactional memory.