(Fastest is FPGA, but Let’s put that aside. )
Low latency developers generally prefer C and avoid OO (runtime overhead), but they do use C++ templates to achieve power and flexibility.
In terms of data structures, I think they use STL too. Array is essential. In contrast, graph data structures incur additional allocation due to the graph Node objects — no such overhead in arrays.
Java’s issues —
* autoboxing — market data use mostly of primitive objects
* Every Object.java instance takes something like 8+ bytes.
* Indeterminate garbage collection
* virtual function overhead
* Even purely local variables are often put into heap for delayed clean-up
* JVM could reach good throughput wrt c++, but only after a slow warm-up.