Locks and condition variables are important to threading and Wall street systems, but the highest performance market data systems don't use those. Many of them don't use java at all.
They dedicate a CPU to a single thread, eliminating context switch. The thread reads a (possibly fixed sized) chuck of data from a single socket and puts the data into a buffer, then it goes back to the socket, non-stop, until there's no more data to read. At that time, the read operation blocks the thread and the exclusive CPU. Subsequent Processing on the buffer is asynchronous and on a different thread. This 2nd thread can also get a dedicated CPU.
This design ensures that the socket is consumed at the highest possible speed. (Can you apply 2 CPUs on this socket? I doubt it.) You may notice that the dedicated CPU is idle when socket is empty, but in the context of a high-volume market data feed that's unlikely or a small price to pay for the throughput.
Large live feeds often require dedicated hardware anyway. Dedication means possible under-utilization.
What if you try to be clever and multiplex a single thread among 2 sockets? Well you can apply only one cpu on any one thread! Throughput is slower.