Mark Jame
Greenhorn, since May 19, 2014

Recent posts by Mark Jame

I have been a member of both the IEEE and ACM for several years; I am based in the UK and also have a BCS membership.

I am downsizing a little on memberships and want to drop either the IEEE or ACM. I spent a little time trying to see the main differences between these groups but am undecided.

Which do you think is a better organization for experienced software engineers?
9 years ago
Not had time to look at this in more detail yet, but trying to compile for 1.7 gives a warning (I prefer not to have any warnings):
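
It looks something like this (quoting loosely; the exact wording depends on the JDK doing the compiling):

    warning: [options] bootstrap class path not set in conjunction with -source 1.7
    1 warning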

What is this?

I will look at changing the path next.
9 years ago
I have several versions of Java installed.

Considering parts of the JRE and JDK are added to the PATH, and that Oracle has now added C:\ProgramData\Oracle\Java\javapath (what is this?) to the PATH, is there an easy way, with a script of some sort, to quickly alternate between Java versions?

The main objective is to change Java version (for both compiling and running) possibly several times a day without rebooting.

I normally use Cygwin for command line stuff on Windows but could resort to a DOS box if needed.

In an ideal world I would use several versions at the same time, for example two Cygwin (or DOS) windows, one with Java 7 and one with Java 8; if that's not possible then I could use a single window to dynamically switch versions.
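
For the two-window case, I imagine something like this per window (the install path is just an example, not my real one):

    export JAVA_HOME="/cygdrive/c/Program Files/Java/jdk1.7.0"
    export PATH="$JAVA_HOME/bin:$PATH"
    java -version   # this window now runs Java 7

Prepending to the PATH like this should also shadow that javapath entry, I think.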

Anyone know if/how this can be done?
9 years ago
I have general queries about the bonus content (CD-ROM) for K&B7.

The two main pieces of bonus material (for OCA/OCP 7) are a chapter about classpaths and one about serialization.

For classpaths, there is no mention of them as an exam topic in OCA or OCP; is this chapter just included for general information? Assuming I read the book from start to finish, at what point should the classpath chapter be read?

Similarly for the serialization chapter: the first few pages of the book say serialization was re-introduced as an exam topic, but a few pages on it says it is not. So is serialization in the exam (in which case the extra chapter is not a bonus but a requirement), and is this chapter the only resource? As above, at what point should it be read?

Sorry for being pedantic, but I would prefer clear, explicit advice on study material and study order.

I will close with a well done to the team; I have several books from these authors and this one looks just as good!
Good news, but what about the UK?

Amazon UK is saying January!

The McGraw website mentions a PDF version; is this DRM-free like O'Reilly's (great for GoodReader on iPad)? Adobe Digital Editions is unusable for me.

My apologies for incorrect terminology.

I have looked at this a little more but have to stop for a few days; when I have some free time again I will post some example programs (mine are mixed into larger apps with several threads).

The examples I mentioned above assume I need to deserialize a very large object (that was created by an external source). I do not have time to set up my different machines at the moment, but I can simulate machines with different resources by tweaking the JVM options for maximum heap.

I can (and did) create a very large object (a list) that I successfully serialized and then deserialized on a (simulated) machine with lots of memory.

I then tried to deserialize that large object on a (simulated) machine with much less memory and, as I expected, it ran out of memory. This was one of the examples I was trying to look into; in this case I do not believe there is any way to deserialize the file?

The other example was writing and reading many objects, which I can now do successfully: writing Integer.MAX_VALUE objects one at a time (on my machine the file was about 32GB), then reading them back one at a time, with a simple check to confirm they were deserialized correctly.
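
For anyone who finds this later, the writing side boiled down to a loop like this (a simplified sketch; the periodic reset() is what stops ObjectOutputStream keeping a back-reference to every object ever written):

    import java.io.*;

    public class BigWrite {
        public static void main(String[] args) throws IOException {
            try (ObjectOutputStream out = new ObjectOutputStream(
                    new BufferedOutputStream(new FileOutputStream("big.dat")))) {
                for (int i = 0; i < Integer.MAX_VALUE; i++) {
                    out.writeObject(Integer.valueOf(i)); // one small object at a time
                    if (i % 10_000 == 0) {
                        // clear the stream's handle table, otherwise it pins
                        // every object written and memory grows without bound
                        out.reset();
                    }
                }
            }
        }
    }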

Thanks again for all the feedback; I'll be asking more questions soon ;)
9 years ago
Once again thank you.

It is not easy (for me at least) to express true intent in words alone (without tone of voice and facial expressions). I really am grateful to everyone and do not mean to be difficult; I am just trying to understand the details.

I know, for example, that I cannot create a simple object, add it to a list and repeat until there are billions of simple objects in the list, because I would run out of memory before the list is complete and before I can serialize it.

But could I create a simple object, serialize it, and then repeat (billions of times) to end up with a huge file? If so, how do I then deserialize it? I assume I just read it back one simple object at a time, using readObject?
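
In other words, is the reading side just a loop like this (an untested sketch)?

    import java.io.*;

    public class ReadOneAtATime {
        public static void main(String[] args) throws IOException, ClassNotFoundException {
            try (ObjectInputStream in = new ObjectInputStream(
                    new BufferedInputStream(new FileInputStream("big.dat")))) {
                while (true) {
                    Object obj;
                    try {
                        obj = in.readObject();
                    } catch (EOFException end) {
                        break; // no more objects in the stream
                    }
                    // use obj, then let it become garbage; only one object is live
                }
            }
        }
    }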

Similarly, another external machine with, say, ten times the amount of memory that I have may create a massive list with billions of objects (that do fit into its memory) and then serialize it.

In this case, is it only possible to deserialize the whole object (a list of billions, which I do not have memory for) or can I read x number of objects from the list at a time? If I can, what method(s) should I be looking at?

Thanks again all.
9 years ago
Thanks for the hints about parsing, Junilu and Paul.

I have worked on programs that read a continuous stream of data (from several servers) every second, 24 hours a day, but this question is more about serialization.

Perhaps I am getting myself confused with the details and need to read the Javadoc more, but I would like to get a head start if anyone can help.

Is there a difference between (for example) serializing 3 objects (say Widgets) one at a time and serializing a list containing 3 objects, in terms of how they can be deserialized?

Are the same methods used (and which ones?) to read a file with 3 single Widget objects and a file with a list of 3 Widget objects?
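
To make it concrete, this is the comparison I mean (Widget here is just a stand-in for the real class):

    import java.io.*;
    import java.util.*;

    public class WidgetDemo {
        static class Widget implements Serializable {
            final int id;
            Widget(int id) { this.id = id; }
        }

        public static void main(String[] args) throws Exception {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                // Case 1: three separate objects
                out.writeObject(new Widget(1));
                out.writeObject(new Widget(2));
                out.writeObject(new Widget(3));
                // Case 2: one list holding three objects
                out.writeObject(new ArrayList<>(Arrays.asList(
                        new Widget(4), new Widget(5), new Widget(6))));
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                // Case 1 reads back as three readObject() calls, one Widget each
                for (int i = 0; i < 3; i++) {
                    Widget w = (Widget) in.readObject();
                }
                // Case 2 reads back as a single readObject() call that
                // materialises the whole list (all Widgets in memory at once)
                List<Widget> all = (List<Widget>) in.readObject(); // unchecked cast, fine for a demo
            }
        }
    }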

9 years ago
There are certainly cases where the preconditions of a problem are known and guaranteed.

For example, you may work on a system where one part creates files and another part reads and processes those files. In this case you know, before you start to design the reading part, that the size and/or number of files created are within certain limits.

Now suppose you have to design a program to read files that are created by an external system. You do not know the maximum size of the files so you want to design defensively to ensure your program will never break.

The actual size of the externally created files is irrelevant; they could be 1K or 1000TB. The point is you want a program that can process them safely (OK, it may take hours or even weeks) but will process them without breaking.

This is the main point of my question, how to design an effective solution to this type of problem.
9 years ago
Thanks again for the feedback.

Distributed processing, clustered servers, etc. obviously improve the situation, but again there will always be a case where it is not possible to fit all of the data that needs to be processed into the available memory, whether it is 1 machine or 1 million machines.

I now know I can read (or write) one object at a time, the object in question could be a single object or a larger object that is itself a collection of objects like a list.

So in terms of performance is it more efficient to read (or write) one object at a time or several?

I would assume it is better to read/write several objects at a time, in which case how do I know how many I can read/write without running out of memory?

In pseudocode for example, is there a way of doing something like the following:
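
(A rough sketch only; freeMemory() is a crude gauge at best, and LOW_WATER_MARK is a made-up threshold.)

    Runtime rt = Runtime.getRuntime();
    List<Object> batch = new ArrayList<>();
    while (moreObjectsInStream) {                // pseudocode condition
        batch.add(in.readObject());
        if (rt.freeMemory() < LOW_WATER_MARK) {  // getting close to the limit?
            process(batch);                      // use the objects...
            batch.clear();                       // ...then let them go
        }
    }
    process(batch);                              // whatever is left over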

9 years ago
Don't stop yet - please!

Winston Gutkowski wrote: If the things you're deserializing are fairly simple and don't need much in the way of context in order to "record" what you need, the chances are that you can just deal with them individually in batches; but if there's anything more needed...


Eureka! That's what I want, batches! The objects are simple, I read one or two fields then discard them.

So the question is how to read a large file (of serialized objects) in batches?
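
Something like this is what I am picturing (a sketch only; BATCH_SIZE is a number I would have to tune):

    import java.io.*;
    import java.util.*;

    public class BatchReader {
        static final int BATCH_SIZE = 10_000; // tuning guess

        public static void main(String[] args) throws Exception {
            try (ObjectInputStream in = new ObjectInputStream(
                    new BufferedInputStream(new FileInputStream(args[0])))) {
                List<Object> batch = new ArrayList<>(BATCH_SIZE);
                boolean endOfStream = false;
                while (!endOfStream) {
                    batch.clear();
                    for (int i = 0; i < BATCH_SIZE; i++) {
                        try {
                            batch.add(in.readObject());
                        } catch (EOFException end) {
                            endOfStream = true;
                            break;
                        }
                    }
                    // read the one or two fields from each object here;
                    // after that the batch is cleared and becomes garbage
                }
            }
        }
    }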
9 years ago
Thanks for the feedback so far everyone.

I understand what you say about using a database, etc.; and of course I agree.

But my question is more about how to solve the problem assuming I cannot change certain things, so in this example the assumption is that the file I want to read and/or write will NEVER fit into memory.

Similar problems do exist; sorting very large files too big for memory, solved for example using some kind of external merge sorting algorithm.

What I am looking for is advice and/or examples of how to process (read and/or write) very large files of serialized objects that are too large for memory; this will always be a possibility that is out of my (program's) control.

Perhaps another example may help. I may have a program that polls a data directory looking for files (of serialized objects) that are created by an external program. My program has access to the API (just the classes that define the objects, a Widget class for example), so I know what the objects look like and how to get information about them (getters, for example). The job of my program is to deserialize the data files, recording some information about each object and then discarding it. My program has no idea how big the files will be or how many objects they may contain.

So how can I design my program (and process the files and objects) to read the files and deserialize safely without running into memory issues?
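
This is the kind of shape I have in mind (a sketch; the directory name, file pattern, and poll interval are all made up):

    import java.io.*;
    import java.nio.file.*;

    public class DataDirPoller {
        public static void main(String[] args) throws Exception {
            Path dataDir = Paths.get("data"); // wherever the external program drops files
            while (true) {
                try (DirectoryStream<Path> files = Files.newDirectoryStream(dataDir, "*.ser")) {
                    for (Path file : files) {
                        processFile(file);
                        Files.delete(file); // or move it to an archive directory
                    }
                }
                Thread.sleep(5_000); // poll interval
            }
        }

        // Streams one object at a time, so the file size never matters.
        static void processFile(Path file) throws IOException, ClassNotFoundException {
            try (ObjectInputStream in = new ObjectInputStream(
                    new BufferedInputStream(Files.newInputStream(file)))) {
                while (true) {
                    Object widget;
                    try {
                        widget = in.readObject();
                    } catch (EOFException end) {
                        break;
                    }
                    // record the interesting fields, then let the object go
                }
            }
        }
    }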

9 years ago
I'm just trying to learn more by thinking about problems that are not easy to solve.

At the moment I am looking at how to deal with huge numbers of objects, too many to fit into memory (pretend my memory will never be big enough) and files that are too large to fit into memory.

Perhaps I need to turn the question around.

Let's say I have a loop to create a large number (say millions) of objects (that will never fit into memory at the same time) and I have to serialize them into a single new file in each iteration (say file1, file2, file3, ...).

How can this be done if there is not enough memory to hold all the objects at once?
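
Is it as simple as never holding more than one object at a time, something like this (a sketch; the chunk size and object creation are placeholders)?

    import java.io.*;

    public class ChunkedWriter {
        static final long TOTAL_OBJECTS = 10_000_000L;  // "millions"
        static final int OBJECTS_PER_FILE = 1_000_000;  // placeholder chunk size

        public static void main(String[] args) throws IOException {
            ObjectOutputStream out = null;
            try {
                int fileNumber = 0;
                for (long i = 0; i < TOTAL_OBJECTS; i++) {
                    if (i % OBJECTS_PER_FILE == 0) { // roll over to file1, file2, file3, ...
                        if (out != null) out.close();
                        out = new ObjectOutputStream(new BufferedOutputStream(
                                new FileOutputStream("file" + (++fileNumber))));
                    }
                    out.writeObject(createObject(i)); // the object is garbage once written
                    if (i % 10_000 == 0) out.reset(); // stop the stream pinning old objects
                }
            } finally {
                if (out != null) out.close();
            }
        }

        static Serializable createObject(long i) {
            return Long.valueOf(i); // stand-in for the real object creation
        }
    }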

Then the next step (assume another program) would be my original question of how to read the files (and millions of objects) that are too large to fit into memory. Once they are deserialized, assume I get some values from each object to use in a calculation.

Any ideas?
9 years ago
Assume I don't know. Why, does it make a difference?
9 years ago