
Checkpoint question

jason williams
Greenhorn

Joined: Nov 17, 2004
Posts: 14
I am learning to program a system that needs to survive process crashes in a cluster environment. After reading papers and searching the internet, I vaguely understand that this requires the program to take checkpoints, so that its state can be saved to stable (replicated) storage and recovered from there later. I understand that achieving fault tolerance also requires other components, e.g. a failure detector, but at the moment I want to better understand the checkpointing issue.

However, most of the papers stay at the level of abstractions. For instance, "Design Patterns for Checkpoint-Based Rollback Recovery" explains that communication-induced checkpointing can prevent the domino effect, and it provides diagrams of the interaction between components such as the failure detector and the checkpointer. My problem is more concrete: how can I checkpoint to stable storage and recover seamlessly? For instance, if I checkpoint a running program to storage such as Hadoop HDFS, how can I ensure that, on recovery, the program resumes and continues executing as if nothing had happened?
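As a concrete reference point for the question, here is a minimal sketch of application-level checkpointing in Java. It is only an illustration under stated assumptions: the class `CheckpointSketch`, its `State` fields, and the `run`/`save`/`loadOrNew` methods are hypothetical names, a local file stands in for stable storage (a replicated store such as HDFS would need its own client API, which is not shown), and the "work" is a trivial deterministic sum. The idea it shows is that the program saves everything it needs to resume, including its position in the loop, and on start-up loads the last checkpoint instead of starting from scratch.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CheckpointSketch {

    /** Hypothetical application state: everything needed to resume the loop. */
    public static final class State implements Serializable {
        private static final long serialVersionUID = 1L;
        public int nextStep; // first step that has NOT been applied yet
        public long sum;     // result accumulated by steps 0..nextStep-1
    }

    /** Write to a temp file, then rename it over the previous checkpoint. */
    public static void save(State s, Path file) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(s);
        }
        // A reader should see either the old or the new checkpoint, never a
        // half-written one. Whether an atomic move replaces an existing target
        // is implementation specific (it does on typical POSIX file systems).
        Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Load the last checkpoint, or start fresh if none exists. */
    public static State loadOrNew(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new State();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (State) in.readObject();
        }
    }

    /**
     * Runs steps up to totalSteps, checkpointing every `every` steps.
     * crashAt simulates a process crash before that step (negative = never).
     */
    public static long run(Path file, int totalSteps, int every, int crashAt)
            throws IOException, ClassNotFoundException {
        State s = loadOrNew(file);
        while (s.nextStep < totalSteps) {
            if (s.nextStep == crashAt) {
                throw new IllegalStateException("simulated crash");
            }
            s.sum += s.nextStep; // the actual "work" of one step
            s.nextStep++;
            if (s.nextStep % every == 0) {
                save(s, file);
            }
        }
        save(s, file);
        return s.sum;
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempDirectory("ckpt").resolve("state.bin");
        try {
            run(file, 10, 3, 7); // crashes before step 7
        } catch (IllegalStateException e) {
            System.out.println("crashed, restarting from last checkpoint");
        }
        System.out.println(run(file, 10, 3, -1)); // resumes and finishes
    }
}
```

In this sketch the work done between the last checkpoint and the crash is simply redone after restart, which is only safe because each step is deterministic and has no external side effects. Messages sent to other processes, or writes to other systems, would need extra care, and that is where the coordination protocols discussed in the papers come in.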

I would appreciate any suggestions.

Many thanks.

 