Yes, each thread in a JMeter Thread Group is like a virtual user, simulating the actions of a real user.
When creating a test plan, it's important to keep in mind the actual use cases - that is, how different users use the system.
Any system will have at least a dozen use cases related to its domain, and usually many more.
Not all users will be performing the same use case.
And not all 3000 *registered* users will be using the system every second (registered is not the same as concurrent).
So you need to work out a representative usage scenario - that is, what percentage of users perform which use case, and how often?
Usually, this input comes from the functional specs, the client and/or domain experts.
If it's a new product, take the figures you get from them with a pinch of salt: unless they are based on some objective survey or study, the client may not have any real idea either and is likely to produce a number out of thin air that simply sounds good to them.
But if it's an existing product, they can provide such data from their access or analytics logs.
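As a rough illustration of what such a usage mix looks like once you have it (the use case names and percentages below are made up purely for the example), you can translate it into thread counts per Thread Group:

```python
# Hypothetical usage mix: the share of active users performing each use case.
# These names and percentages are placeholders - substitute the real ones
# from your specs, client input or analytics logs.
usage_mix = {
    "browse_catalogue": 0.60,
    "search": 0.25,
    "checkout": 0.15,
}

concurrent_users = 3000  # target concurrency, not registered users

# One JMeter Thread Group per use case, sized by its share of the mix.
for use_case, share in usage_mix.items():
    threads = round(concurrent_users * share)
    print(f"{use_case}: {threads} threads")
```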
The other number you need is the target number of concurrent users the system should support while maintaining some quality of service (such as response times within a defined limit).
Again, this number should be reached using a representative sampling of the different use cases.
The client's input and their specs should also be taken into account, though they are only likely to give you a ceiling on this number.
Take that ceiling and add a buffer of around 20% to it - that is the number the system should handle. It's never a good idea to aim for exactly the number the client estimated; aim slightly higher. For example, if the client's ceiling is 3000 concurrent users, design and test for around 3600.
This number is only a desirable upper limit - the system could start failing much earlier.
What the engineers of the system are interested in is not just a yes/no answer to "does it work with 3000 concurrent users?".
What they are interested in is how the system performs both externally (i.e., as seen by users: response times, availability, error rates, etc.) and internally (i.e., resources such as memory usage, I/O speeds, etc.) at various load levels.
If they can find faults at lower load levels, they can think them over, redesign the system, try out various scenarios, and possibly make it endure much higher loads.
So you should start with a small number of concurrent users and then build it up in increments.
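As a minimal sketch of stepping up the load (the plan file name and the `users` property are assumptions - your Thread Group would need to read its thread count via something like `${__P(users,10)}`), each increment can be a separate non-GUI JMeter run:

```python
import subprocess

# Assumed: load_test.jmx reads its thread count from the "users" property,
# e.g. via ${__P(users,10)} in the Thread Group. Adjust names to your setup.
for users in (50, 100, 250, 500, 1000, 1500):
    result_file = f"results_{users}.jtl"
    subprocess.run(
        [
            "jmeter", "-n",             # non-GUI mode
            "-t", "load_test.jmx",      # the test plan (assumed name)
            "-Jusers=" + str(users),    # pass the thread count as a property
            "-l", result_file,          # write sample results here
        ],
        check=True,
    )
    print(f"Finished run with {users} users -> {result_file}")
```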
All the servers should have monitoring software installed that can alert if any resource goes above its limit.
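In practice you'd use proper monitoring tooling for this, but just to illustrate the idea (the thresholds are arbitrary and the third-party `psutil` package is assumed to be installed on the monitored server):

```python
import time
import psutil  # third-party package; assumed installed on the monitored server

CPU_LIMIT = 85.0   # percent - arbitrary thresholds, for illustration only
MEM_LIMIT = 90.0

while True:
    cpu = psutil.cpu_percent(interval=5)     # CPU usage averaged over 5 seconds
    mem = psutil.virtual_memory().percent    # current memory usage
    if cpu > CPU_LIMIT or mem > MEM_LIMIT:
        # Replace with a real alert: email, chat webhook, pager, etc.
        print(f"ALERT: cpu={cpu:.1f}% mem={mem:.1f}%")
    time.sleep(10)
```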
At each increment, measure response times and the standard deviation (SD).
The SD is a measure of the reliability of those results - if the SD is small, then response times are fairly consistent and close to the mean value.
But if the SD is large, the system may already be starting to buckle at that load level.
Developers have a habit of taking the most optimistic values from a set of results and dismissing the rest as anomalies. The SD will help decide whether some results really are anomalies, or whether the system is buckling.
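For example (assuming the results were saved as a CSV-format .jtl file with JMeter's default `elapsed` column holding each sample's response time in milliseconds), the mean and SD of a run can be computed like this:

```python
import csv
import statistics

# Assumes a CSV-format JMeter results file whose default "elapsed" column
# holds the response time of each sample in milliseconds.
def response_time_stats(jtl_path):
    with open(jtl_path, newline="") as f:
        times = [int(row["elapsed"]) for row in csv.DictReader(f)]
    mean = statistics.mean(times)
    sd = statistics.stdev(times) if len(times) > 1 else 0.0
    return mean, sd

# File name taken from the earlier incremental-run sketch.
mean, sd = response_time_stats("results_500.jtl")
print(f"mean={mean:.0f} ms, sd={sd:.0f} ms")
```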
The other thing to keep in mind when simulating large numbers of users - say, 1500 concurrent users - is: don't run the test from a single machine running JMeter.
The problem with a single machine is that it's likely to saturate the outgoing network bandwidth of the test machine, making the test unrealistic.
Instead, distribute the test over a cluster of test machines, with each machine simulating a smaller number of users - a number its bandwidth can comfortably handle.
The JMeter wiki explains how to set up such a cluster of test machines and run a test plan across them.
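For reference (the host names here are placeholders, and each remote machine must already be running `jmeter-server` as described on the wiki), kicking off a distributed run from the controller machine looks roughly like this:

```python
import subprocess

# Placeholder host names - each remote machine must be running jmeter-server
# and be reachable from this controller machine.
remote_hosts = ["loadgen1.example.com", "loadgen2.example.com", "loadgen3.example.com"]

subprocess.run(
    [
        "jmeter", "-n",
        "-t", "load_test.jmx",          # same (assumed) test plan as before
        "-R", ",".join(remote_hosts),   # run it on these remote JMeter servers
        "-l", "distributed_results.jtl",
    ],
    check=True,
)
```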