You can use java.util.concurrent.CyclicBarrier to maximize concurrency in your test case.
Start a fixed number of threads and have each one wait at a CyclicBarrier created for that many parties; the barrier releases them all at once when the last thread arrives.
Also make sure that once a thread has finished its task, it waits at another CyclicBarrier.
That way further test logic (assertions for instance) will only run after every thread has finished.
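
Here is a minimal sketch of the two-barrier approach, not a definitive implementation. The class name BarrierTestSketch and the helper runConcurrently are made up for illustration. It also assumes the test thread itself takes part in the second barrier (hence threadCount + 1 parties), which is one way to make the assertions wait for the workers.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierTestSketch {

    // Hypothetical helper: runs `task` on `threadCount` threads that all start
    // together, and returns only after every one of them has finished.
    static void runConcurrently(int threadCount, Runnable task)
            throws InterruptedException, BrokenBarrierException {
        // Start barrier: only the worker threads wait here.
        CyclicBarrier startBarrier = new CyclicBarrier(threadCount);
        // End barrier: the workers plus the calling (test) thread.
        CyclicBarrier endBarrier = new CyclicBarrier(threadCount + 1);

        for (int i = 0; i < threadCount; i++) {
            new Thread(() -> {
                try {
                    startBarrier.await(); // blocks until all workers have arrived
                    task.run();
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    try {
                        endBarrier.await(); // report completion, even if the task threw
                    } catch (InterruptedException | BrokenBarrierException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }).start();
        }

        // The test thread is released once the last worker reaches the end barrier.
        endBarrier.await();
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger counter = new AtomicInteger();
        runConcurrently(8, counter::incrementAndGet);
        // Assertions belong here, after all workers are done.
        System.out.println(counter.get()); // prints 8
    }
}
```

If a worker is interrupted, the end barrier breaks and the test thread gets a BrokenBarrierException instead of hanging. In a real test you may also prefer the await(timeout, unit) overload so a stuck worker cannot block the test forever.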