aspose file tools*
The moose likes Threads and Synchronization and the fly likes CopyOnWriteArrayList Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Threads and Synchronization
Bookmark "CopyOnWriteArrayList" Watch "CopyOnWriteArrayList" New topic
Author

CopyOnWriteArrayList

sarah Marsh
Ranch Hand

Joined: Mar 06, 2001
Posts: 282
Hello,

When, why to use CopyOnWriteArrayList?

Thanks!
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 4175
    
  21

When you need a List which is:
1) thread safe
2) highly performant on reads
3) safe to traverse without synchronization
4) likely to have more read operations than write operations



Steve
luri ron
Ranch Hand

Joined: Dec 11, 2008
Posts: 87
p.s., I have not yet seen a pratical use case for this class. the idea of duplicating the data structure anytime a new member is added is quite expensive.
Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

luri ron wrote:p.s., I have not yet seen a pratical use case for this class.


How about when maintaining a list of listeners for an event in your system. These listeners are usually added once in a while during startup but no one really stops them from being added during processing.
You will usually be iterating over this list (read) when giving a callback. Although a very remote possibility but someone can register a listener while you are iterating.
To avoid data corruption, one way is to synchronize the reads and writes and the other way is to safely use a CopyOnWriteArrayList.

luri ron wrote:the idea of duplicating the data structure anytime a new member is added is quite expensive.


Sure it is, that is why it is used when writes on the list are very occassional but reads are frequent.


apigee, a better way to API!
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 4175
    
  21

I am a bit late re-chiming in here, but I wanted to add some more info:

The more time you spend in read operations - either because you iterate frequently over the list, or because your iteration takes time (a lot of work to do with each object in the list), then the more you benefit from the CopyOnWriteArrayList.

1) Multi-threaded reading is faster because one thread doesn't have to wait for another thread to finish reading.
2) Writing is faster because it doesn't have to wait for the Reads.

The fact is that when you have frequent or long reads from the List, then writing to CopyOnWriteArrayList may be faster than nearly any other kind of thread-safe list because the writes don't have to wait for the reads to complete. The next fastest implementation - when iterating heavily - is with a LinkedList where access is synchronize with ReadWriteLocks. Of course the LinkedList + ReadWriteLock is rougher looking code, and is easier to make a mistake with (as well as being ever so slightly slower). So even if the LinkedList was a bit faster I would probably choose COWAL in most cases for the shear simplicity of not having to worry about concurrent modification.

[note] Above description is based on tests done on mid-sized Lists (size = 10,000 to start) and situations where stress is given to Reading the list. When Reading and Writing are given equal precedence then the ReadWriteLock - synchronized LinkedList is the most performant on the Write side by far, and behaves nearly equally on the Read side compared to CopyOnWriteArrayList. Real-world situations are going to be much more complex than simple speed tests. My tests showed the above behavior when small chunks of data are written at each Write-point. The more data (more records) to be Written, then less performant COWAL became on the Write side compared to the ReadWriteLock LinkedList, but the more performant the Read side was in comparison to the RWLocked LinkedList. So your mileage may vary...
luri ron
Ranch Hand

Joined: Dec 11, 2008
Posts: 87
hm.. why use lock? use ConcurrentSkipListSet for your listeners. no need to lock to iterate through the listeners, and there is no concurrentmodificationexception.
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 4175
    
  21

luri ron wrote:hm.. why use lock? use ConcurrentSkipListSet for your listeners. no need to lock to iterate through the listeners, and there is no concurrentmodificationexception.


If a Set is an appropriate holder for the data you are using. If you want to permit duplicates or need to maintain insertion order (rather than natural order) then the Set you describe is not useful. And for those limitations, in the test scenario I set up for the lists I got no increase in performance using the ConcurrentSkipListSet on either the Read or Write side (it is basically the same speed as the CopyOnWriteArryList). This, to me, suggests that the ConcurrentSkipListSet is an appropriate Set implementation that feeds the same niche as the CopyOnWriteArrayList does for List implementations. I would have to do some more tests to see how bulk additions behave with CSLS, since my implementation took advantage of the ability to add the same value to the list lots of times and so isn't directly copy-able to the Set...

[note] 'Basically the same speed' means that in 10 seconds, the CSLS and COWAL both ran 2300 +/- 100 Reads per thread, 5 threads per test (no sleeps per Read. A Read is a complete traversal of the Collection using the Enhanced For loop), and 95 Writes in the same 10 seconds (100ms sleep between Writes, one value written per cycle).
luri ron
Ranch Hand

Joined: Dec 11, 2008
Posts: 87
how many listeners you have in the structure for your performance test?

the search time a list is O(n) compared with skiplist is O(log(n)) which is faster for removal and search.

in term of duplicate, it is probably better not to have duplicated listener?

send notification in order is an advantage for the list, but in a distributed environment. this has a lot to do with the network rather than what order the listeners are stored in.
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 4175
    
  21

luri ron wrote:how many listeners you have in the structure for your performance test?


I was running general iteration tests. The tests begin with 10,000 elements and with additions could grow from there.

the search time a list is O(n) compared with skiplist is O(log(n)) which is faster for removal and search.


If a large portion of your application is spent searching and removing data, then yes, you would be better with a sorted collection rather than one where order is maintained on insertion order and searches are linear (Lists). Choose the right collection for the job.

in term of duplicate, it is probably better not to have duplicated listener?

For listeners I generally agree, but I usually let the client determine that behavior, so I use a List since it doesn't impose any limits (for the most part I think the Swing components and default implementation of Bean property listeners also use lists). I don't necessarily think the List is a better approach than a Set, just the one I use...

Anyway, the Listeners is a simple example. But the List interface exists for those cases where duplicates are allowed to exist, and/or insertion order is needed. A lot of heavy data maintenance applications fall into this category (whether it is keeping other apps/clients up to date on current information, distributing work to multiple clients, then pooling the results together, or any situation where the collection is considered a 'data' source only ...)

The point is that the choice between a List and a Set is distinct and independent of the choice between concurrent/thread safe or not, and if concurrent then what type of synchronization works best. The first choice (List or Set) should be based on:
1) Duplicates allowed?
2) Ordering requirements? (including considerations for searches, removals, etc...)

Then choose within the List/Set family to choose synchronization based on expected/measured usage.

And like I said, real world examples are much more complex than any speed test. You can manage an initial decision based on relevant speed tests. Then implement and do your own benchmarks to compare speeds in situations closer to your needs.

I suspect the 'advantage' to the COWAL is that niche where you need a collection that is :
1) A List
2) Accessible by multiple Threads (probably best-case for multiple reading threads with few writing threads)
3) High performant Reads that consume more program time (either because they take longer or occur more often) than modifications
4) Infrequently modified, or modifications can be accumulated to use the bulk (addAll(...), removeAll(...)) operations
5) Eays to use, you don't want to worry about implementing your own Locking system (which could probably be made to outperform COWAL in some cases, but require more work and would be more prone to error).

The ConcurrentSkipListSet seems to work similarly, except the collection is:
1) A Set
2) Accessible by multiple Threads (no knowledge of balance between Read vs Write threads)
4) Possibly frequently modified, but Reads don't necessarily need to include the recent changes



send notification in order is an advantage for the list, but in a distributed environment. this has a lot to do with the network rather than what order the listeners are stored in.
Steve Luke
Bartender

Joined: Jan 28, 2003
Posts: 4175
    
  21

The code I used for tests is as below (other Runners/Producers were used to compare synchronized collection, read/write locked collections, and vectors).

 
With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.
 
subject: CopyOnWriteArrayList