Tracking "free" lines in a dedicated set

Norbert Lebenthal
Ranch Hand

Joined: Sep 23, 2010
Posts: 74
Hi

While coding the URLyBird data access class, I started to wonder whether dealing with the free lines (meaning the ones containing deleted records) could in fact be quite easy.

The concept would be the following: when the data access object is created and the file is parsed, each time a line isn't valid I would add the file offset where that line starts to deletedRecordsLocations, a thread-safe collection, presumably a queue (like ConcurrentLinkedQueue).

Then, when I need to create a new record, I would first try to retrieve an element from this deletedRecordsLocations collection (using poll()). If I get a non-null element, I reuse that line; otherwise I go to the end of the file.

When a line is deleted, it would just be a matter of marking it as such in the data file and then adding its location to deletedRecordsLocations.
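
A minimal sketch of this idea, to make it concrete. It assumes a hypothetical layout (fixed-length lines whose first byte is a flag, with 1 meaning deleted) and scans an in-memory image of the file rather than the real URLyBird format; the class and method names other than deletedRecordsLocations are made up for illustration.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class FreeLineTracker {
    // Assumption for this sketch only: the first byte of each line is a flag, 1 = deleted.
    static final byte DELETED = 1;

    private final Queue<Long> deletedRecordsLocations = new ConcurrentLinkedQueue<>();

    /** Scans an in-memory image of the data file and remembers where each deleted line starts. */
    FreeLineTracker(byte[] fileImage, int recordLength) {
        for (long offset = 0; offset + recordLength <= fileImage.length; offset += recordLength) {
            if (fileImage[(int) offset] == DELETED) {
                deletedRecordsLocations.offer(offset);
            }
        }
    }

    /** Returns a reusable offset, or endOfFile when no deleted line is available. */
    long locationForNewRecord(long endOfFile) {
        Long free = deletedRecordsLocations.poll();
        return free != null ? free : endOfFile;
    }

    /** To be called once a line has been marked as deleted in the data file. */
    void lineDeleted(long offset) {
        deletedRecordsLocations.offer(offset);
    }
}
```

With three 4-byte lines of which the second is deleted, the first call to locationForNewRecord(12) would hand out offset 4 and the next one would fall back to 12, the end of the file.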

Does it sound crazy? Is there any obvious gotcha I missed?

thanks in advance
nono
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5126
    

Seems like a valid approach. The only gotcha I can see is that using a thread-safe collection will NOT guarantee that your actions on this thread-safe collection are thread-safe. Sounds funny, doesn't it? So you would be better off using a simple Queue or List, and using it in a thread-safe way (which you implement yourself).


SCJA, SCJP (1.4 | 5.0 | 6.0), SCJD
http://www.javaroe.be/
Norbert Lebenthal
Ranch Hand

Joined: Sep 23, 2010
Posts: 74
hi

I was wondering about this thread safety issue.

It looks like I just need a single poll() to get the first available empty slot, which is an atomic operation, and likewise a single offer() to register a new empty one.

Indeed, the deleteRecords operation (or whatever its name will be) should already be synchronized somehow, since I want to avoid deleting the same content twice anyway. I guess that will probably be done through the lock on the activeRecordsLocations map.

So if I'm in this already synchronized context, I should be fine, don't you think?
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5126
    

Hi Norbert,

Let me try to illustrate the possible problem. Let's take the StringBuffer class for example. According to the API it is "A thread-safe, mutable sequence of characters." So the append and length methods are thread-safe, but if I use them like this (check length() and then append "a" only while the buffer holds fewer than 10 characters), the combination is not thread-safe anymore.

And in a main-method, two threads (t1 and t2) run this code on the same StringBuffer:
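
A sketch of the kind of usage meant here. The class names UnsafeAppender and UnsafeAppenderDemo are illustrative assumptions, not taken from the original post; only the length-then-append pattern and the two threads t1 and t2 come from the description.

```java
// Illustrative names; the point is the check-then-act sequence inside run().
class UnsafeAppender implements Runnable {
    private final StringBuffer buffer;

    UnsafeAppender(StringBuffer buffer) {
        this.buffer = buffer;
    }

    @Override
    public void run() {
        for (int i = 0; i < 10; i++) {
            // length() and append() are each thread-safe on their own,
            // but another thread can run between the two calls.
            if (buffer.length() < 10) {
                buffer.append("a");
            }
        }
    }
}

class UnsafeAppenderDemo {
    public static void main(String[] args) throws InterruptedException {
        StringBuffer buffer = new StringBuffer();
        Thread t1 = new Thread(new UnsafeAppender(buffer));
        Thread t2 = new Thread(new UnsafeAppender(buffer));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Usually 10, but the result can exceed 10 when the threads interleave badly.
        System.out.println(buffer.length());
    }
}
```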

Because each thread (t1 and t2) can be kicked from the CPU (by the scheduler) between the length-call and append-call, you can end up with a StringBuffer containing more than 10x "a".

If you are in a synchronized context, then there should not be a problem (but then you don't need the StringBuffer and you can just use a StringBuilder):
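
A sketch of such a synchronized variant, again with an illustrative class name (SafeAppender) that is an assumption of this example. The check and the append happen inside one synchronized block, so a plain StringBuilder is enough.

```java
// Illustrative name; the check and the append now form one atomic step.
class SafeAppender implements Runnable {
    private final StringBuilder builder;

    SafeAppender(StringBuilder builder) {
        this.builder = builder;
    }

    @Override
    public void run() {
        for (int i = 0; i < 10; i++) {
            // Every thread must lock on the same object for this to work;
            // here that object is the shared builder itself.
            synchronized (builder) {
                if (builder.length() < 10) {
                    builder.append("a");
                }
            }
        }
    }
}
```

Two threads running SafeAppender on the same StringBuilder can no longer be interleaved between the length-call and the append-call, so the builder never grows beyond 10 characters.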
Hope it helps!
Kind regards,
Roel
Norbert Lebenthal
Ranch Hand

Joined: Sep 23, 2010
Posts: 74
thanks a lot for your detailed answer.

I'm fully aware of this issue: a class being thread-safe doesn't mean that every usage of it is thread-safe as well.

Yet, in my use case, I thought my usage of the queue would be thread-safe. Which now makes me wonder if I need the queue to be thread-safe at all... lol

Let's review this point.
- first parsing of the file and filling of the locationsMap and freeLinesList: single-threaded, so no concurrency issue
- deleting a record:
  - get the writeLock on the locationsMap
  - synchronized block on the RandomAccessFile to actually modify the file
  - add the location to the freeLinesList
  - release the writeLock
- writing a new record:
  - get the writeLock
  - poll the freeLinesList
  - synchronize on the RandomAccessFile
  - if poll returned null, go to the end of the file, otherwise go to the returned location
  - write the new record
  - populate the locationsMap
  - release the writeLock

=> looks like I don't need a synchronized queue, a normal one would do since it's always accessed while having the writeLock
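
The steps above could be sketched roughly as follows. This is only an illustration under assumptions that are not part of the assignment: fixed-length lines of 16 bytes whose first byte is the deleted flag, a record number derived from the file offset, and a made-up class name (WriteLockedStore). The names locationsMap and freeLinesList are the ones used in the list.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class WriteLockedStore {
    // Hypothetical layout: fixed-length lines, first byte is the deleted flag.
    static final int RECORD_LENGTH = 16;
    private static final byte VALID = 0;
    private static final byte DELETED = 1;

    private final RandomAccessFile file;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<Integer, Long> locationsMap = new HashMap<>();
    // Plain queue: it is only ever touched while the write lock is held.
    private final Queue<Long> freeLinesList = new ArrayDeque<>();

    WriteLockedStore(RandomAccessFile file) {
        this.file = file;
    }

    int create(byte[] data) throws IOException {
        lock.writeLock().lock();
        try {
            Long free = freeLinesList.poll();
            long location;
            synchronized (file) {
                // Reuse a deleted line if there is one, otherwise append.
                location = (free != null) ? free : file.length();
                file.seek(location);
                file.writeByte(VALID);
                // Pad or truncate the payload to the fixed line length.
                file.write(Arrays.copyOf(data, RECORD_LENGTH - 1));
            }
            int recNo = (int) (location / RECORD_LENGTH);
            locationsMap.put(recNo, location);
            return recNo;
        } finally {
            lock.writeLock().unlock();
        }
    }

    void delete(int recNo) throws IOException {
        lock.writeLock().lock();
        try {
            Long location = locationsMap.remove(recNo);
            if (location == null) {
                throw new IllegalArgumentException("No such record: " + recNo);
            }
            synchronized (file) {
                file.seek(location);
                file.writeByte(DELETED);
            }
            freeLinesList.add(location);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Because both create and delete hold the write lock for their whole duration, the ArrayDeque is never accessed by two threads at once, which is why no concurrent collection is needed in this variant.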

looks good... if I'm right. do you think so ?



thanks again
bye
Norbert

Norbert Lebenthal
Ranch Hand

Joined: Sep 23, 2010
Posts: 74
btw, I just realized this locking strategy is unnecessarily restrictive: the lock on the record is already acquired before these methods are called, so the writeLock/readLock could be held for less time.

hopefully the rest of the logic isn't flawed...
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5126
    

Norbert Lebenthal wrote:=> looks like I don't need a synchronized queue, a normal one would do since it's always accessed while having the writeLock

looks good... if I'm right. do you think so ?

I also used a simple Map to store my record cache and the locked record numbers. So you don't need a synchronized queue.
Norbert Lebenthal
Ranch Hand

Joined: Sep 23, 2010
Posts: 74
Roel De Nijs wrote:
Norbert Lebenthal wrote:=> looks like I don't need a synchronized queue, a normal one would do since it's always accessed while having the writeLock

looks good... if I'm right. do you think so ?

I also used a simple Map to store my record cache and the locked record numbers. So you don't need a synchronized queue.


thanks again

Side note: the website in your signature is linked to a single picture website..??
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5126
    

Norbert Lebenthal wrote:Side note: the website in your signature is linked to a single picture website..??

I know. It's the website of my own consultancy company (started January 2010) with just 1 employee (me) at this moment. No time so far to make the company website a bit decent. I finished 2 other websites: one for my brother's futsal team and one for a glazier (a former neighbour). So I can do it, if I have enough time.
Norbert Lebenthal
Ranch Hand

Joined: Sep 23, 2010
Posts: 74
Regarding this tracking business: in the end I use a ConcurrentLinkedQueue.

The reason is the following: when I'm creating a record, I poll this queue outside of any lock context. Then I write the new record to the file (in a synchronized block), and only once this is done do I lock the recordsMap to put the new recNo-to-Record entry into it (a Record being a location and its cached content).

That way, finding out whether there is a free location (or whether I have to go to the end of the file) doesn't hold any lock or block any other thread.
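
A rough sketch of that create flow, under assumptions made only for illustration: fixed-length lines of 8 bytes, a record number derived from the file offset, a map from recNo to location instead of a full Record object, and invented names (LockFreePollStore, markFree, mapLock).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockFreePollStore {
    static final int RECORD_LENGTH = 8; // hypothetical fixed line length

    private final RandomAccessFile file;
    // Concurrent queue, because createRecord polls it without holding any lock.
    private final Queue<Long> freeLocations = new ConcurrentLinkedQueue<>();
    private final Map<Integer, Long> recordsMap = new HashMap<>();
    private final ReentrantReadWriteLock mapLock = new ReentrantReadWriteLock();

    LockFreePollStore(RandomAccessFile file) {
        this.file = file;
    }

    /** Hypothetical hook: the delete path would call this after marking a line as deleted. */
    void markFree(long location) {
        freeLocations.offer(location);
    }

    int createRecord(byte[] data) throws IOException {
        // 1. One atomic call, no lock held. A sequence such as isEmpty()
        //    followed by remove() would not be safe here.
        Long free = freeLocations.poll();

        // 2. Persist the record first.
        long location;
        synchronized (file) {
            // The end of the file must be read inside the block, otherwise
            // two appending threads could pick the same offset.
            location = (free != null) ? free : file.length();
            file.seek(location);
            file.write(Arrays.copyOf(data, RECORD_LENGTH));
        }

        // 3. Only then publish the new entry in the map.
        int recNo = (int) (location / RECORD_LENGTH);
        mapLock.writeLock().lock();
        try {
            recordsMap.put(recNo, location);
        } finally {
            mapLock.writeLock().unlock();
        }
        return recNo;
    }
}
```

One consequence of this ordering is that a record exists in the file for a short moment before it is visible in recordsMap, so readers going through the map simply don't see it until step 3 has finished.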

does it sound crazy ?
Roel De Nijs
Bartender

Joined: Jul 19, 2004
Posts: 5126
    

Norbert Lebenthal wrote:does it sound crazy ?

It doesn't sound crazy, but it seems a bit odd to me. In your delete method I'll guess you access your queue from a synchronized context (to add the record number to the free locations) and thus you don't need a concurrent queue. In your create method you don't access the queue from a synchronized context and thus need a concurrent queue (and you may not use a sequence of method calls on this queue, otherwise it will not be thread safe).
Don't forget simplicity (for a junior programmer) is a bigger item than performance. Using different approaches in different methods is not helping simplicity and easy-to-read and easy-to-maintain code.
Norbert Lebenthal
Ranch Hand

Joined: Sep 23, 2010
Posts: 74
Roel De Nijs wrote:
Norbert Lebenthal wrote:does it sound crazy ?

It doesn't sound crazy, but it seems a bit odd to me. In your delete method I'll guess you access your queue from a synchronized context (to add the record number to the free locations) and thus you don't need a concurrent queue. In your create method you don't access the queue from a synchronized context and thus need a concurrent queue (and you may not use a sequence of method calls on this queue, otherwise it will not be thread safe).
Don't forget simplicity (for a junior programmer) is a bigger item than performance. Using different approaches in different methods is not helping simplicity and easy-to-read and easy-to-maintain code.


You're right, the approach to this queue isn't symmetric, and a symmetric one would be better.

The reason for this mismatch is that, in createRecord, I first of all want to make sure I persist the record to the DB. So I write it to the file first and only then update the recordsMap (containing the recNo, the location and the cached record).

I could certainly get the writeLock before looking into the queue, but that would put the synchronized block that writes to the DB inside the writeLock, which could potentially slow down other threads. Yet maybe I should do so for clarity's sake... The performance gain isn't that huge anyway...
 