Yes, threads are allowed to keep copies of a variable "somewhere". This is intentionally vague, so that platforms are free to optimize code as they see fit. In practice, "somewhere" can mean CPU registers as well as CPU caches.
If two threads operate on the same variable, each may actually be working on its own cached copy, and when one of them finally writes its result back to memory, it can overwrite a value the other thread wrote earlier. A volatile variable must not be cached in this way, so when one thread writes to it, any other thread that subsequently reads it will see the new value. This does not guarantee, however, that operations from different threads are performed in the proper order: a read-modify-write such as count++ can still interleave with another thread's update. For that you need to synchronize access to the variable.
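To make the difference between visibility and ordering concrete, here is a minimal sketch (the class and method names are made up for illustration). Several threads increment a volatile counter with ++ and, alongside it, a counter guarded by synchronized methods:

```java
class CounterDemo {
    // Hypothetical names, for illustration only.
    private volatile int volatileCount = 0; // visible to all threads, but ++ is not atomic
    private int syncCount = 0;              // only accessed while holding the monitor

    void unsafeIncrement() {
        volatileCount++; // read, add, write: another thread can slip in between
    }

    synchronized void safeIncrement() {
        syncCount++; // the whole read-modify-write happens under the lock
    }

    int getVolatileCount() {
        return volatileCount;
    }

    synchronized int getSyncCount() {
        return syncCount;
    }

    // Starts the given number of threads, each incrementing both counters perThread times.
    static CounterDemo runDemo(int threads, int perThread) throws InterruptedException {
        CounterDemo demo = new CounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.unsafeIncrement();
                    demo.safeIncrement();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join(); // wait until every worker has finished
        }
        return demo;
    }

    public static void main(String[] args) throws InterruptedException {
        CounterDemo demo = runDemo(4, 100_000);
        System.out.println("synchronized counter: " + demo.getSyncCount());
        System.out.println("volatile counter:     " + demo.getVolatileCount());
    }
}
```

The synchronized counter always ends up at threads * perThread. The volatile counter can end up lower, because two threads may read the same value and both write back value + 1; how many updates are lost, if any, varies from run to run.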
Note that Java requires all platforms to update the variable in main memory when a thread leaves a critical section (synchronized block), so the change is visible to any thread that subsequently enters a block synchronized on the same lock. This means that the volatile keyword is rarely needed, because synchronizing your variable access is sufficient. Volatile variables are usually only used when they serve to signal one thread from another, and no other synchronization between them is needed:
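A minimal sketch of such a signalling flag might look like the following. The names Worker, stopRequested and requestStop are hypothetical, and the loop body is only a stand-in for real work:

```java
class Worker implements Runnable {
    // volatile: the worker thread is guaranteed to see a write made by another thread
    private volatile boolean stopRequested = false;

    // Only touched by the worker thread; stands in for "some action".
    private long iterations = 0;

    @Override
    public void run() {
        // Re-reads the flag on every pass because the field is volatile.
        while (!stopRequested) {
            iterations++;
        }
    }

    // Called from another thread to signal the worker to stop.
    public void requestStop() {
        stopRequested = true;
    }

    public static void main(String[] args) throws InterruptedException {
        Worker worker = new Worker();
        Thread thread = new Thread(worker);
        thread.start();

        Thread.sleep(100);    // let the worker run for a moment
        worker.requestStop(); // a single write to the volatile flag
        thread.join();        // returns once run() has left its loop

        System.out.println("worker stopped");
    }
}
```
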
Here the run() method runs continuously, performing some action until another thread signals the object to stop. A volatile boolean can be used instead of two synchronized methods, because each thread performs only an atomic operation on it (a simple assignment in one thread, a simple read in the other).