I was told Google engineers discuss algorithm efficiency every day, even at lunchtime. I guess the intensity could be even higher at HFT shops.
I feel the latency due to the software algorithm might be a small component of overall latency. However, the bigger portions of that latency may be unavoidable: network latency, disk write(?), serialization for transmission(?), … So the only part we could tune might be the software algorithm.
Further, it's also possible that all the competitors are already using the same tricks to minimize network latency. In that case the competitive advantage is in the software algorithm.
I feel algorithm efficiency could be more permanent and fundamental than threading. If I compare 2 algorithms A1 and A2 and find that A2 is twice as fast as A1, then whatever threading or hardware solution I apply to both of them, A2 should still beat A1.
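
To make the A1 vs A2 comparison concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: `linear_search` stands in for A1 and `binary_search` for A2 (both names and the test data are made up for this example), and I count comparisons instead of measuring wall-clock time, because the step count is the part that stays the same on any machine.

```python
from typing import Sequence, Tuple


def linear_search(items: Sequence[int], target: int) -> Tuple[int, int]:
    """Hypothetical A1: scan left to right.

    Returns (index, comparisons); index is -1 when target is absent.
    """
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons


def binary_search(items: Sequence[int], target: int) -> Tuple[int, int]:
    """Hypothetical A2: halve the search range each step (items must be sorted).

    Returns (index, probes); each probe of the middle element counts as one step.
    """
    comparisons = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if items[mid] == target:
            return mid, comparisons
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons


# Made-up input: one million sorted integers, searching for the last one,
# which is the worst case for the linear scan.
data = list(range(1_000_000))
target = 999_999

idx_a1, steps_a1 = linear_search(data, target)
idx_a2, steps_a2 = binary_search(data, target)

print(f"A1 (linear): index={idx_a1}, steps={steps_a1}")
print(f"A2 (binary): index={idx_a2}, steps={steps_a2}")
```

Both functions find the same index, but A1 needs a step per element while A2 needs at most about 20 steps for a million elements. Faster hardware shrinks the cost of each step for both algorithms alike, so the gap in step counts remains.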