JavaRanch » Java Forums » Java » Performance

hyper fast storage system?

Elhanan Maayan
Ranch Hand

Joined: May 04, 2009
Posts: 113
hi..

We are looking for a storage framework that works at memory-swapping speeds. We don't care about DML or query options, just something that can store and retrieve data, so HSQL and the like are not suitable, as they add overhead.

I was also wondering if there are any recommendations for a good NIO framework (I saw some comparisons between Netty and MINA and am looking for more info).

Also, on the same note, I'm looking for a way to create automated micro-benchmarking test cases. I've heard there are frameworks for that, but I can't remember where.
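While I can't recall the framework names either, the core of a micro-benchmark can be sketched by hand. This is a minimal illustrative harness (the `Benchmark` class and `measure` method names are my own, not from any framework); the warm-up loop before timing is the main trap in Java micro-benchmarks, since the JIT must compile the hot path first:

```java
public class Benchmark {
    // Runs `task` for `warmup` untimed iterations (so the JIT can compile
    // hot paths), then `runs` timed iterations, and returns the average
    // cost per run in nanoseconds.
    static long measure(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) task.run();
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) task.run();
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        long avgNanos = measure(() -> sb.append('x'), 10_000, 100_000);
        System.out.println("avg per run: " + avgNanos + " ns");
    }
}
```

A real harness would also repeat whole measurement batches and report variance, since a single average from one JVM run is easily skewed by GC pauses and compilation.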
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659

memcached.
Elhanan Maayan
Ranch Hand

Joined: May 04, 2009
Posts: 113
Thanks, I'll look into it, but is there anything like that with a pure Java implementation?
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659

I thought you wanted hyper fast.
Elhanan Maayan
Ranch Hand

Joined: May 04, 2009
Posts: 113
I still do; however, this thing is going to be distributed between many clients, and I see that memcached has several implementations (Linux, Windows, etc.).
Elhanan Maayan
Ranch Hand

Joined: May 04, 2009
Posts: 113
Actually, I've just been told that even memcached is not fast enough.
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659

Elhanan Maayan wrote: Actually, I've just been told that even memcached is not fast enough.


Then you are in deep doo doo.

memcached is how Google gets its speed (in addition to having brilliant algorithms, people, etc.).

Have you shown that memcached is actually slow in your case, or is this just speculation? If it's too slow on serious hardware, then I'd change the requirements.
Elhanan Maayan
Ranch Hand

Joined: May 04, 2009
Posts: 113
I know; I'm still trying to learn the requirements. It should be crazy fast. The project is called sopf.

I've been told that memcached is not close to memory-swapping speeds.

I've been meaning to create automated micro-benchmarking test cases (that's why I asked about a framework for it).
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12803
Exactly how much data is being stored/retrieved here?

Why are you sure that storage speed is the critical link? It seems to me that network speed limits might overwhelm the problem.

(Perhaps it is time for my -patent pending- psychic time traveller's database - it returns the data you are going to ask for before you request it.)

Bill
Marco Ehrentreich
best scout
Bartender

Joined: Mar 07, 2007
Posts: 1282

Hi Elhanan,

I've been told that memcached is not close to memory-swapping speeds.

Who told you that memcached won't be fast enough? Probably they just told you something without really knowing the facts ;-)

As all the others here have already said, there are surely other problems regarding performance that will hit you long before memcached (or something similar) reaches its performance limits.

Moreover, it seems that you're looking for a solution that gives you the ultimate performance boost "for free", and that will definitely not be the case with any framework or caching library. I saw this a few years ago on a real project where memcached could give only small performance improvements because the rest of the application (code design, database, network traffic etc.) was bullshit. Maybe this works better on your project, but take it as advice that you most probably can't simply plug in something like memcached and expect it to greatly improve overall performance if you didn't take care of the rest right from the beginning.


Marco

labi yaps
Greenhorn

Joined: Jun 08, 2009
Posts: 4
Wanted to make a quick note, since I am driving some of the requirements Elhanan noted. As Elhanan mentioned, the requirement of this project is to read data at memory-swapping speeds. We are trying to analyse and design a cache module (our project consists of 3 modules and two plugins).

If you need to understand the need for speed: it is because we have an execution platform that persists data (via a local lateral cache) at defined cycle intervals. We want the persistence to be as fast as possible. Based on my knowledge, disk writes are at best 3 Gb/s (with SATA 3). Anyway, my analysis reveals that persistence is one of the bottlenecks of our execution platform, as is data synchronization. Although data synchronization is the more significant bottleneck, I'll save that for another conversation.

Any suggestions from performance addicts?
Pat Farrell
Rancher

Joined: Aug 11, 2007
Posts: 4659

labi yaps wrote: Any suggestions from performance addicts?


This is really a simple problem: if disks are too slow, add more RAM. This argument was addressed in Peter Denning's original working-set paper/thesis.

memcached is one obvious implementation that uses memory. I (and others) don't buy that you have shown memcached can't provide the speed you need; you have to test it.

If it's too slow, you need to either (1) invent something much faster or (2) change your algorithm to need less.

While I'm sure it would be possible to speed up memcached by 20% in some cases, I don't believe you will find something easy that is ten times faster. So in that case, all you have is #2.
labi yaps
Greenhorn

Joined: Jun 08, 2009
Posts: 4
Pat Farrell wrote: This is really a simple problem: if disks are too slow, add more RAM.

The data is already in memory; I don't consider data in memory as persisted. At some point the data has to be put on disk before it can be considered persisted. Please note that I am trying to optimize data persistence, not data reads; I don't need to optimize reads, since I will have the data in memory.

I don't believe in benchmarking everything; a framework's architecture tends to reveal enough detail to come to a decision. In the case of memcached, you introduce two sources of performance cost in addition to the disk speed: network limitations and framework overhead. Putting two and two together, I can assure you that you cannot achieve memory-swapping speed with memcached.
Elhanan Maayan
Ranch Hand

Joined: May 04, 2009
Posts: 113
Just a pop question..

Looking over the requirements and trying to form a mental picture (persisted to the enclosed JPEG), I see that there are two types of persistence: one for the lateral cache and one for the backup (aggregating the laterals). My question is: who needs the persisted data? Why are we waiting for it to be persisted?

To put it differently: can't we extract the component doing the persistence so that it operates asynchronously, event-based? Perhaps even send back an asynchronous event response notifying that the data was persisted, and perform an action based on that?



[Thumbnail for cache.jpg]
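The asynchronous hand-off described above could be sketched like this (a minimal sketch; the `AsyncPersister` class and its method names are illustrative, not from the project). A single-threaded writer decouples the execution platform from disk latency, and a `CompletableFuture` delivers the "data was persisted" event back to the caller:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncPersister {
    // One dedicated writer thread keeps persistence off the hot path.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Queues `data` for persistence; the returned future completes once
    // the (here simulated) write has finished. Callers never block on disk.
    public CompletableFuture<Void> persist(byte[] data) {
        return CompletableFuture.runAsync(() -> {
            // Real code would write `data` to a FileChannel and force() it.
        }, writer);
    }

    public void shutdown() {
        writer.shutdown();
    }

    public static void main(String[] args) {
        AsyncPersister p = new AsyncPersister();
        p.persist("payload".getBytes())
         .thenRun(() -> System.out.println("persisted"))  // event-style notification
         .join();                                         // only the demo waits
        p.shutdown();
    }
}
```

The trade-off is durability: until the future completes, the data only exists in memory, which is exactly the synchronous-vs-asynchronous choice discussed later in this thread.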

Marco Ehrentreich
best scout
Bartender

Joined: Mar 07, 2007
Posts: 1282

Hi,

from your description of the requirements it seems to me that your real bottleneck won't be any problem with an in-memory cache, but the disc and network I/O you have to expect when persisting the said data. Is this correct?

I think it's hard to discuss this here without being involved in all the details of the project, but to me it seems like you would like to make your disc (and network) hardware faster than it really is. In my opinion there is no caching solution which will gain you anything if you have to persist the data synchronously to disc. Disc I/O is definitely many, many times slower than in-memory transfer speeds, and if you write to disc synchronously, any cache will just add an additional layer which probably costs some performance overhead.

Additionally, from the drawing I guess there is more than one machine involved. Then network traffic comes into play, too. This will be even many times slower than disc I/O, not to speak of memory speeds.

If I got this right, your only solution will probably be to use specialized hardware. In particular, any high-speed (and expensive) storage solution would surely give you some performance boost. A supercomputer would most probably do, too. Besides that, I can't imagine any library, programming language or any other piece of software which can increase your physical disc speed - not even to the theoretical transfer speed of the bus system (like S-ATA).

Marco
labi yaps
Greenhorn

Joined: Jun 08, 2009
Posts: 4
from your description of the requirements it seems to me that your real bottleneck won't be any problem with an in-memory cache, but the disc and network I/O you have to expect when persisting the said data. Is this correct?

That is correct.

Thanks for your response, Marco; I thought as much. I think our best optimization might be to offer both synchronous and asynchronous persistence, allowing the user to choose which performance hit they are willing to accept. The persistence does not involve non-local machines; the data backup does involve non-local machines.
Marco Ehrentreich
best scout
Bartender

Joined: Mar 07, 2007
Posts: 1282

Hi Labi,

now that you have actually confirmed my assumptions, I guess you'd be better off looking for ways to improve the performance of the persistence part of your application. In my opinion there's not much memcached can do for you in this situation. Clearly it may help you to manage the data in memory, but this won't avoid hitting the discs as soon as you have to persist some data.

Of course it's hard to give you any helpful advice without knowing all the details of your application, the environment and the requirements. But from what you described here, I would guess that it's not really the programming language, a library or a framework which will give you trouble regarding performance. A good optimization you can make on the software side is probably your plan to choose wisely between synchronous and asynchronous I/O.

Having said that, I think it's very likely that your real problem is the storage and network hardware. I'm pretty sure there are existing solutions which will be nearly as fast as memory speed, but this is definitely just a question of how much money you are willing to spend.

Marco
labi yaps
Greenhorn

Joined: Jun 08, 2009
Posts: 4
I am not trying to achieve memory speed, I am trying to achieve paging speed - specifically, the speed at which the kernel writes memory to disk, which is what I am using as the bar for the file-write speed of an application. I suspect it involves some sort of file mapping rather than a sequential byte read/write. I know the Java NIO package provides file memory-mapping functions.
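The NIO memory-mapping mentioned here looks roughly like the following sketch (file name and sizes are arbitrary). `FileChannel.map` hands back a `MappedByteBuffer`, so writes become plain memory stores into the page cache and the kernel pages them out, which is the path being discussed:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedWrite {
    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("cache", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            // Map a 4 KiB region read-write; mapping past EOF extends the file.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buf.put("hello".getBytes());  // a memory write, no syscall per byte
            buf.force();                  // flush the dirty mapped pages to disk
        }
        System.out.println("file size after mapping: " + f.length());
    }
}
```

One caveat: Java offers no explicit way to unmap a `MappedByteBuffer`; the mapping lives until the buffer is garbage-collected, which can matter if you map many files.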
Marco Ehrentreich
best scout
Bartender

Joined: Mar 07, 2007
Posts: 1282

Aah, I see. This makes the problem less hard :-)

Unfortunately I haven't had to worry too much about high-speed file I/O in Java, so I can't give you reliable information. But from what I've read, NIO should be able to do the job. I'm not sure if memory-mapped files are a solution for speed or just a more convenient way to access file data. Perhaps someone else can tell you more about it.

What I can tell you is that, at least on UNIX/Linux, tuning the underlying RAID and/or file system makes a very big difference if you want to push your hardware to its limit! So you should take care that a wrong file system with bad parameters doesn't create a bottleneck for which you will blame Java ;-)

Marco
Elhanan Maayan
Ranch Hand

Joined: May 04, 2009
Posts: 113
Yeah, I just thought about the portability issues between platforms, CPUs and operating systems at this level; I have no idea how to test this correctly (using emulators?).

BTW, are there any NIO-based Java cache solutions?
Marco Ehrentreich
best scout
Bartender

Joined: Mar 07, 2007
Posts: 1282

Your doubts regarding portability issues are definitely justified. But in this case I think it is a trade-off between performance tuning and keeping the application portable.

From the description of your requirements, I'd say it won't be possible to achieve very high performance and at the same time keep the application 100% portable. To be more precise: it may be possible to keep the application portable, but you can't completely ignore the environment (OS, file system, hardware etc.) where your application will be running.

For your question regarding a cache solution: I'm pretty sure that almost all modern caching frameworks use Java NIO. Moreover, even the standard I/O classes of modern JDK versions partly use NIO under the hood.

Marco