What impedance mismatch?

Frank Silbermann
Ranch Hand

Joined: Jun 06, 2002
Posts: 1387
I've been thinking about Hibernate and object-relational mapping, and am beginning to wonder whether the problem goes deeper than a clash in styles.

Even when programming in a non-OO language, such as C, code to interface with a relational database was long, clumsy, error prone, tedious and boring.

Is there any programming language that can interact with a relational database in a natural, easy way? Somehow, I doubt it.

There seems to be a fundamentally different approach we take to data in a program versus data in a database, between information in volatile memory versus information on disk. In a programming language, it is easy to create structures that become irretrievable (memory leaks or food for garbage collection), but in a database there is an unstated assumption that anything put in there should be retrievable. When dealing with databases it seems obvious that we should be able to say "Give me all the objects of this kind with these properties" -- but nobody expects Java to have a utility saying "Make a list of all the objects in the system that implement this interface."
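
A minimal sketch of that asymmetry, assuming hypothetical Registry, Billable, Invoice and Note types (none of these come from a real library): in plain Java, a query such as "all Billable objects with an amount over 100" only works for objects the program has explicitly registered somewhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration: Java offers no built-in "find every live object
// of type T" query, so the program has to keep its own registry.
final class ObjectRegistryDemo {

    interface Billable {
        double amountDue();
    }

    static final class Invoice implements Billable {
        private final double amount;
        Invoice(double amount) { this.amount = amount; }
        @Override public double amountDue() { return amount; }
    }

    static final class Note { }

    static final class Registry {
        private final List<Object> objects = new ArrayList<>();

        // Objects become findable only if somebody remembers to call this.
        <T> T register(T obj) {
            objects.add(obj);
            return obj;
        }

        // "Give me all the objects of this kind with these properties."
        <T> List<T> find(Class<T> type, Predicate<? super T> filter) {
            List<T> result = new ArrayList<>();
            for (Object o : objects) {
                if (type.isInstance(o)) {
                    T candidate = type.cast(o);
                    if (filter.test(candidate)) {
                        result.add(candidate);
                    }
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(new Invoice(50.0));
        registry.register(new Invoice(250.0));
        registry.register(new Note());
        new Invoice(999.0); // never registered, so no query can ever find it

        List<Billable> large =
                registry.find(Billable.class, b -> b.amountDue() > 100.0);
        System.out.println("Large billables found: " + large.size());
    }
}
```

The unregistered Invoice is the in-memory analogue of a structure that has become irretrievable; a database row has no such state.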

Is there some fundamental, philosophical difference between data in memory versus data on disk that lies behind our difficulties in interfacing the two? Perhaps if we understood this better, we would not have so many difficulties.

Comments?
Peter Rooke
Ranch Hand

Joined: Oct 21, 2004
Posts: 802

Maybe it’s to do with SQL being a high level language (a functional one - I believe) and relational databases being based on (proven) mathematics, and Java being an imperative OO (not quite pure!) language. But this is hard computer science - way too complex for me to comment [knowledgeably] any further. I did attend a talk by Hugh Darwen a few years back; he didn’t half have a go at the database vendors.

Here is a bit from an old InfoWorld interview with C.J. Date. It's a bit old, but still:

-----------------------
InfoWorld: So what do you make of Java?

Date: I haven't looked really closely at Java. I must admit, this is kind of heretical thing to say, but I find it strange that people think the salvation to everything is a new programming language. We seem to have had that before.

InfoWorld: So as an industry, are we making any real progress in the last 10 years?

Date: Yes. Some of the new object-relational products are implementing a piece of the relational model that was never implemented before, and that is domains. It's such a shame, if you take the original papers, it was all there.
-----------------------------

Maybe I'm hitting a broken drum, but there was a structured language called Informix 4GL. It compiled down to C / (E)SQL and used database cursors etc. Only it never had any real modern user interface capabilities - awful green screen stuff. I'll maybe post a small bit of code later, to demonstrate.

---------
"The hearsay of duplicates,
The hearsay of nulls,
Lead to great complications,
A combination of dulls." [Hugh Darwen]


Regards Pete
Ellen Zhao
Ranch Hand

Joined: Sep 17, 2002
Posts: 581
I agree with Peter. I noticed this thread days ago and was tempted to comment, but didn't feel qualified. :FaceRed: The original question by Frank certainly reaches some depth in computer science. For this question I dug out the scripts of the most headache-causing lectures I have had, but didn't find any definitive answer.

I guess the key could lie in the differences between addressing on the hard disk (which houses the database) and addressing in memory. The address of data in a database is relatively static, while in memory the address of a certain block of data (be it an object or a simple row fetched from the hard disk) is much more dynamic. Once a database is created and a data schema is defined, it's not difficult to arrange data addresses on the hard disk. But when the data is loaded into memory, it has to cooperate with other operative bytes... To get the addressing issue in memory straight, there has to be some treatment, so the O/R mapping is more wordy than purer manipulation of data... I know this is very inarticulate and it's just my guess. Any enlightenment will be highly appreciated. I'm very interested in this topic. Thanks in advance.

Regards,
Ellen
[ December 08, 2004: Message edited by: Ellen Zhao ]
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

Originally posted by Frank Silbermann:

Is there some fundamental, philosophical difference between data in memory versus data on disk that lies behind our difficulties in interfacing the two? Perhaps if we understood this better, we would not have so many difficulties.

There absolutely is, and I wish I had time right now to lay out a few of them. But we can start with a few that seem to have little or no bearing on the immediate question, and see if others can work out a few reasoned steps from there. If no one's "figured it out" by the time I can address this in more detail, I'll have a go.

1) The foremost principle of any operating system is the protection of data integrity.

2) The fundamental benefit of disk is the ability to preserve data in the event of loss of power (or host access).

3) The fundamental benefit of data in system memory is speed of access to the CPU.

That's the heart of it.


Make visible what, without you, might perhaps never have been seen.
- Robert Bresson
Jayesh Lalwani
Ranch Hand

Joined: Nov 05, 2004
Posts: 502
I think a factor in how we use data depends on the speed of storage versus the cost of storage

Accessing data in RAM is inherently faster than accessing data on disk. This is because RAM has a) random access functionality (hence the name) and b) faster communication between the CPU and RAM. But both of these things come at a cost. You cannot keep all your data in RAM without having a freakish amount of RAM (100GB of RAM, anyone?). If you want more RAM then you need bigger addresses for the RAM, and hence you have to either a) increase your bus size between CPU and RAM or b) increase your bus speed between CPU and RAM. And that is why you use RAM only for data that you need immediately. Since disk storage is cheaper, you can afford to waste space and store data that you don't need immediately (or may never need at all). Hence the paradigm: use RAM for volatile data and use disk storage for permanent data.
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

Originally posted by Jayesh Lalwani:
I think a factor in how we use data depends on the speed of storage versus the cost of storage

In part. Even if all permanent storage operated at the speed of on-board memory and cost the same, I don't think we'd fully reorganize nonvolatile storage to resemble "in-memory" data. If all data is "live" all the time you have much bigger problems than so-called data impedance to solve.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24183

Originally posted by Michael Ernest:
If all data is "live" all the time you have much bigger problems than so-called data impedance to solve.


But I sure used to like programming for the Apple Newton, which fits exactly this description -- from the programmer's perspective, anyway.

In any case, lots of "pure object database" solutions exist, but they've all got one problem in common: you can't find your objects unless you store them in indexed data structures -- thereby replicating some of what an RDBMS has to do. Object databases just push some of the database's traditional job off onto the programmer.
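
A rough sketch of what that pushed-off work looks like, assuming a hypothetical PersonStore that indexes Person objects by city (all names here are invented for illustration): the programmer has to build the index and also keep it correct whenever an indexed property changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a hand-maintained index over plain objects.
final class HandBuiltIndexDemo {

    static final class Person {
        final String name;
        private String city;
        Person(String name, String city) {
            this.name = name;
            this.city = city;
        }
        String city() { return city; }
    }

    static final class PersonStore {
        // The "index": city name -> people living there.
        private final Map<String, List<Person>> byCity = new HashMap<>();

        void add(Person p) {
            byCity.computeIfAbsent(p.city(), k -> new ArrayList<>()).add(p);
        }

        // Changing an indexed property means repairing the index by hand,
        // which an RDBMS would do automatically on UPDATE.
        void move(Person p, String newCity) {
            List<Person> oldBucket = byCity.get(p.city());
            if (oldBucket != null) {
                oldBucket.remove(p);
            }
            p.city = newCity;
            add(p);
        }

        List<Person> inCity(String city) {
            List<Person> bucket = byCity.getOrDefault(city, Collections.emptyList());
            return Collections.unmodifiableList(bucket);
        }
    }

    public static void main(String[] args) {
        PersonStore store = new PersonStore();
        Person alice = new Person("Alice", "Leeds");
        Person bob = new Person("Bob", "Leeds");
        store.add(alice);
        store.add(bob);
        store.move(bob, "York");
        System.out.println("In Leeds: " + store.inCity("Leeds").size());
        System.out.println("In York: " + store.inCity("York").size());
    }
}
```

If code elsewhere changed a person's city without going through move, the index would silently go stale, which is the kind of bookkeeping the database normally owns.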


[Jess in Action][AskingGoodQuestions]
Warren Dew
blacksmith
Ranch Hand

Joined: Mar 04, 2004
Posts: 1332
Frank Silbermann:

Is there some fundamental, philosophical difference between data in memory versus data on disk that lies behind our difficulties in interfacing the two? Perhaps if we understood this better, we would not have so many difficulties.

I think that there is.

More specifically, I think there's a difference between how we use direct access memory and how we use databases.

I think that data in memory is there to support an application's doing a specific job. That's why we throw it away, by freeing it or allowing it to be garbage collected, when we're done with it.

In contrast, a database is designed to preserve data to be used by unspecified future applications. Given this, it doesn't make sense to throw away database data when you no longer have a current reference to it, because some new application might want it in the future.

I don't think that the database paradigm applies to all use of disk, though. When using the disk only for persistence, the application memory paradigm applies. That's why save files are often stored in undocumented application specific binary formats - only the application that stored the data to disk needs to be able to read it back.
Peter Rooke
Ranch Hand

Joined: Oct 21, 2004
Posts: 802


I've been thinking about Hibernate and object-relational mapping, and am beginning to wonder whether the problem goes deeper than a clash in styles.


There's a mismatch between the way an RDBMS works, using values to link (join) the various normalised tables (relations) together, and the way programming languages like Java (C, C++) work with memory (of course Java hides most of this from us). In the case of OO we also encapsulate data and methods. I know Rumbaugh (et al.) talks about this mismatch in "Object Oriented Modelling And Design"; he suggests a few ways round the mapping:

1) Each class into its own relation, and a lookup relation with foreign keys. (Poor performance)
2) Collapse into one table (Good performance)
3) Two classes to two tables with one referencing the other. (Poor performance) [Most developers will see that, this looks familiar].
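
Options 2 and 3 can be sketched roughly as follows. This is only an illustration under assumed names: the Person and Employee classes, the T_PERSON and T_EMPLOYEE tables, and every column name are hypothetical, and rows are modelled as plain maps rather than real database rows.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of two ways to lay out a small class hierarchy
// in tables. Rows are modelled as column-name -> value maps.
final class InheritanceMappingDemo {

    static class Person {
        final long id;
        final String name;
        Person(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static final class Employee extends Person {
        final double salary;
        Employee(long id, String name, double salary) {
            super(id, name);
            this.salary = salary;
        }
    }

    // Option 2: collapse the hierarchy into one table. A discriminator column
    // records the class, and subclass-only columns are null for plain Persons.
    static Map<String, Object> toSingleTableRow(Person p) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", p.id);
        row.put("name", p.name);
        if (p instanceof Employee) {
            row.put("type", "EMPLOYEE");
            row.put("salary", ((Employee) p).salary);
        } else {
            row.put("type", "PERSON");
            row.put("salary", null);
        }
        return row;
    }

    // Option 3: two tables, with the subclass row referencing the superclass
    // row by key. Reading an Employee back then needs a join.
    static Map<String, Map<String, Object>> toJoinedRows(Employee e) {
        Map<String, Object> personRow = new LinkedHashMap<>();
        personRow.put("id", e.id);
        personRow.put("name", e.name);

        Map<String, Object> employeeRow = new LinkedHashMap<>();
        employeeRow.put("person_id", e.id); // foreign key to T_PERSON.id
        employeeRow.put("salary", e.salary);

        Map<String, Map<String, Object>> rows = new LinkedHashMap<>();
        rows.put("T_PERSON", personRow);
        rows.put("T_EMPLOYEE", employeeRow);
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(toSingleTableRow(new Person(1L, "Ann")));
        System.out.println(toJoinedRows(new Employee(2L, "Bob", 1000.0)));
    }
}
```

The single-table form avoids the join at the price of nullable columns; the two-table form keeps each table tidy but pays for the join on every read.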

There's also a fundamental difference between a functional language (SQL) and an imperative one (Java) - in a functional language, you say what you want (SELECT * FROM T_PERSON) and the language does most of the work (but can be slow). Of course with the imperative type you have to say how the program is going to do it. I'll let more academic members explain in more detail!
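
The contrast between saying what you want and saying how to get it can be sketched as below. This is only an illustration: the Person class and the name and age columns on T_PERSON are assumptions, and the SQL string is shown for comparison rather than executed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration: the same question asked declaratively (SQL text)
// and imperatively (explicit Java steps).
final class WhatVersusHowDemo {

    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Declarative: states the result wanted; the database picks the plan.
    // The column names are assumed for this example.
    static final String SQL =
            "SELECT name FROM T_PERSON WHERE age >= 18 ORDER BY name";

    // Imperative: the program spells out the scan, the filter and the sort.
    static List<String> adultNames(List<Person> people) {
        List<String> names = new ArrayList<>();
        for (Person p : people) {
            if (p.age >= 18) {
                names.add(p.name);
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Zoe", 30),
                new Person("Tim", 12),
                new Person("Amy", 45));
        System.out.println(SQL);
        System.out.println(adultNames(people));
    }
}
```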

Lastly, can anyone explain to me Date's argument that:

It should be: Domain (type) = Object class AND NOT Relation (table) = Object class.

I guess we all use the second one. Never quite understood his objection.

"Of complication, despond and general distress, are two nulls equal? I fear both No and Yes!" [Hugh Darwen (again)].
[ December 08, 2004: Message edited by: Peter Rooke ]
Frank Silbermann
Ranch Hand

Joined: Jun 06, 2002
Posts: 1387
Originally posted by Peter Rooke:

There's a mismatch between the way an RDBMS works, using values to link (join) the various normalised tables (relations) together ... I know Rumbaugh (et al.) talks about this mismatch in "Object Oriented Modelling And Design"; he suggests a few ways round the mapping:

1) Each class into its own relation, and a lookup relation with foreign keys. (Poor performance)
2) Collapse into one table (Good performance)
3) Two classes to two tables with one referencing the other. (Poor performance) [Most developers will see that, this looks familiar].
Rumbaugh's list of options gives a few ad-hoc approaches to dealing with one class inheriting from another. Now, what are we to do if we wish to persist objects from a rich (deep and broad) class hierarchy with many levels of inheritance? What if we also want our DBMS schema to reflect interfaces implemented by objects?

I suppose people would say I'm trying to reinvent the "Java-based object-oriented database" wheel (which wasn't particularly successful even when the DB experts tried it). Would there be any sense to attempting to implement such a thing on top of a RDBMS, or would that be stretching the RDBMS paradigm too far?

Is it even a reasonable hope to eliminate or substantially reduce the "impedance mismatch" -- and if so, which paradigm (programming language or DBMS) is the one that should change? (I think PeopleSoft's PeopleCode minimizes the impedance mismatch, but to me it seems quite a crippled and ugly programming language.)
[ December 08, 2004: Message edited by: Frank Silbermann ]
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

EFH: But I sure used to like programming for the Apple Newton, which fits exactly this description -- from the programmer's perspective, anyway.

ME: Oh? Which RDBMS were you running on it?

EFH: Object databases just push some of the database's traditional job off onto the programmer.

ME: Ding-ding-ding-ding! Persistent data requires an index to reliably organize entries. It doesn't really matter what form the persistence takes. Live data has one problem: it's all live, meaning every thread of control in your system must link to it -- i.e., index it. And cache it. And provide for local serialization to conserve memory when necessary. And provide a transfer scheme for synchronizing live data with persisted data.

That's a lot of work, and it's reduced considerably by bringing into physical memory only those elements that are needed.
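
As a small sketch of that idea, assuming a hypothetical RecordCache in which a plain map stands in for the persisted data: only the records actually requested are brought into the live set, and the least recently used one is dropped once a fixed capacity is exceeded. The eviction uses the standard LinkedHashMap access-order mode; everything else is invented for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: keep only the needed records "live" and load
// the rest on demand from a stand-in for persistent storage.
final class LoadOnDemandDemo {

    static final class RecordCache {
        private final Map<Integer, String> disk; // stand-in for persisted data
        private final Map<Integer, String> live; // bounded in-memory working set
        private int loads = 0;

        RecordCache(Map<Integer, String> disk, final int capacity) {
            this.disk = disk;
            // accessOrder = true makes iteration order least-recently-used first.
            this.live = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                    return size() > capacity;
                }
            };
        }

        String get(int id) {
            String value = live.get(id);
            if (value == null) {
                value = disk.get(id); // simulated (slow) load from storage
                if (value != null) {
                    loads++;
                    live.put(id, value);
                }
            }
            return value;
        }

        int loads() { return loads; }
        int liveCount() { return live.size(); }
    }

    public static void main(String[] args) {
        Map<Integer, String> disk = new HashMap<>();
        for (int i = 1; i <= 5; i++) {
            disk.put(i, "record-" + i);
        }
        RecordCache cache = new RecordCache(disk, 2);
        cache.get(1);
        cache.get(2);
        cache.get(1); // served from memory, no load
        cache.get(3); // evicts record 2, the least recently used
        System.out.println("loads=" + cache.loads() + " live=" + cache.liveCount());
    }
}
```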

The performance cost of impedance mismatch isn't so much a problem as a cost of doing business, although people like to problematize such issues in order to raise an industry for 'solutions'.

As for programmerly inconvenience: well, sorrrrrREEEEE, but your job has some tedium in it just like mine does. Deal wid it.
Frank Silbermann
Ranch Hand

Joined: Jun 06, 2002
Posts: 1387
Originally posted by Michael Ernest:
The performance cost of impedance mismatch isn't so much a problem as a cost of doing business, although people like to problematize such issues in order to raise an industry for 'solutions'.

As for programmerly inconvenience: well, sorrrrrREEEEE, but your job has some tedium in it just like mine does. Deal wid it.
Aren't most commercial uses of computing dominated by the need to create, maintain, retrieve and analyze persisted data? If there is no _nice_ way to express class and interface hierarchies of persisted data, that would severely reduce the great productivity advance promised by the evangelists of the object oriented paradigm.

I mean, it's really nice that interfaces and inheritance finally give us a type system that can combine the needed type flexibility with a fairly rigorous level of type safety, and nicer control structures that allow us to separate stable business logic from that which is likely to change. But the advantage of object-oriented analysis and design seems terribly compromised if the business object description bears little resemblance to the way information is organized in the system.

I used to assume that the techniques of object-oriented analysis and design were developed to teach organizations how to use OO-technology. Sure, they said you _could_ implement an OO design using a non-OO language, but is OOAD really such a more efficient way of gathering system requirements and specifications as to justify the process even in the absence of OO-technology?

And with no natural way to represent objects when persisted, we really are only using object technology within a few small niches (e.g. the controller part of a MVC web application, and maybe also the view in those rare cases where we still build a fat GUI client).
Warren Dew
blacksmith
Ranch Hand

Joined: Mar 04, 2004
Posts: 1332
Frank Silbermann:

Is there any programming language that can interact with a relational database in a natural, easy way?

PL/SQL.
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

FS: Aren't most commercial uses of computing dominated by the need to create, maintain, retrieve and analyze persisted data?

ME: Absolutely. And you therefore have an inescapable tension: getting the product you want from the resources available at the lowest possible cost. Compare this to, say, academic or research computing, where costs and technology limit what it is possible to research, but the work is not measured against a potential for revenue.

FS: If there is no _nice_ way to express class and interface hierarchies of persisted data, that would severely reduce the great productivity advance promised by the evangelists of the object oriented paradigm.

ME: The current solutions are aimed at the EFH camp, i.e., reducing the cost in developer time of creating such structures. I don't know Hibernate from Adam, but JDO is aimed directly at automating the translation of relational data structures to OO interfaces.

Impedance mismatch, as I have read about it, is a problem for the same kind of person for whom mechanical disks are a "problem." Well, ok, but you have sooooo many other issues, such as developer time, which can be directly attacked and mitigated right now. Why wring hands over something that has no apparent solution when you can solve real problems today? You'll be on your way to solving the "big" problem when you have the foundation you need to attack it. Baby steps.

FS: But the advantage of object-oriented analysis and design seems terribly compromised if the business object description bears little resemblance to the way information is organized in the system.

ME: You'd think that, wouldn't you? But in asking for a simple "straight-through" transliteration of a business object into a coherent set of data blocks on a disk, you're casting aside the primary strategy we have always used to deal with these translations: namely, the interface itself.

The promise of a programming interface is that in return for relying on the services available through it -- the service contract, if you will -- the programmer is freed from responsibility for knowing the internal implementation. Naturally some cost is associated with having to go through some 'protocol' to get what we want. Some costs are real; the interface may not be very well laid out, or the internal implementation might be less than optimal. Some aren't: the time taken to 'worry' that the implementation could be better if only one could examine it, test it, manipulate it, and assure oneself the contract has been fulfilled to the degree possible.
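
A minimal sketch of such a contract, assuming a hypothetical CustomerRepository interface (the names are invented for illustration and do not come from JDO or Hibernate): the business logic depends only on the interface, so the storage behind it could be a map, a file, or a relational database without the caller changing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration: callers rely on the service contract and stay
// ignorant of how (or where) the data is actually stored.
final class RepositoryContractDemo {

    // The contract. Nothing here says "table", "row" or "disk block".
    interface CustomerRepository {
        void save(long id, String name);
        Optional<String> findName(long id);
    }

    // One possible implementation; a JDBC-backed one could replace it
    // without touching the code that uses the interface.
    static final class InMemoryCustomerRepository implements CustomerRepository {
        private final Map<Long, String> names = new HashMap<>();

        @Override
        public void save(long id, String name) {
            names.put(id, name);
        }

        @Override
        public Optional<String> findName(long id) {
            return Optional.ofNullable(names.get(id));
        }
    }

    // Business logic written purely against the contract.
    static String greeting(CustomerRepository repo, long id) {
        return repo.findName(id)
                .map(name -> "Hello, " + name)
                .orElse("Unknown customer");
    }

    public static void main(String[] args) {
        CustomerRepository repo = new InMemoryCustomerRepository();
        repo.save(7L, "Frank");
        System.out.println(greeting(repo, 7L));
        System.out.println(greeting(repo, 8L));
    }
}
```

The cost of the 'protocol' is visible here too: every access goes through findName and an Optional, in exchange for not caring what sits behind the interface.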

It may seem simpler at first glance to say, "but if everything were organized the same way everywhere, we'd always be able to access things the same way no matter where we were, and that would be more efficient." On first blush, that sounds to me like saying a forklift in a warehouse should operate no differently than a cook in a pantry, just on a different scale. And in an abstract sense, that doesn't seem so silly; but in the details, the forklift operator and the cook might not care for the comparison.

FS: Is OOAD really such a more efficient way of gathering system requirements and specifications as to justify the process even in the absence of OO-technology?

ME: Is CORBA faster than RMI? How can you tell without applying these protocols in context and seeing what you get? Same thing for OOA&D versus other approaches to design. It depends on how well suited the principles are to the task at hand, and on how well they are applied by the implementers.

On paper, though, OOA&D has several inherent advantages one only hopes its adopters will exploit: it provides a more succinct way to control code on a larger scale. Anyone who has seen the chaos that can emerge even from a disciplined, but large-scale, use of the Niklaus Wirth approach (algorithms + data structures = programs) can see the potential, but not automatic, benefit.

Good OO design makes it easier to divide the task of coding among large groups while maintaining a coherent overall plan. One could argue that this goal is possible through good design period, but I'd maintain that OO approaches still simplify and emphasize the design effort.

Many groups have demonstrated the power of that facility. Still more projects have shown how the best exploits of an OOA&D approach can be rendered nutless in short order. One still has to understand the benefits and be willing to commit to a discipline that sets those benefits as a primary goal.

FS: [W]ith no natural way to represent objects when persisted, we really are only using object technology within a few small niches (e.g. the controller part of a MVC web application, and maybe also the view in those rare cases where we still build a fat GUI client).

ME: It's not that I disagree, it's that I don't think this issue is currently 'interesting' outside the academic realm. Impedance mismatch is not, generally speaking, a primary bottleneck in contemporary computing. It's compute-bound, but only by scale, not in principle.

I wonder if we have a mathematician with us who could state whether object-to-relational mapping is fundamentally an NP-complete problem or not: that would be interesting to know.
[ December 09, 2004: Message edited by: Michael Ernest ]
Ellen Zhao
Ranch Hand

Joined: Sep 17, 2002
Posts: 581
Ach Michael the Ernest you are my new idol.

Thank you very much for ringing the bell for me from your first follow-up in this thread on.
Peter Rooke
Ranch Hand

Joined: Oct 21, 2004
Posts: 802

I said this above:
There's also a fundamental difference between a functional language (SQL)

However, I've just found out SQL is NOT a functional language. Mind you, it does feel like one at times! Apologies for posting unqualified information.

Is OOAD really such a more efficient way of gathering system requirements and specifications as to justify the process even in the absence of OO-technology?

It's strange that we technical people will argue about which is the 'best' process, method, methodology, language (I could go on) to develop a given project. I would like to suggest that any "successful" project relies on the people who work on and manage it, more than on the various technologies.
It's becoming a common theory that projects tend to fail because of people problems, not technical ones.

One of my favorite military quotes: "I would not fear a pack of lions led by a sheep, but I would always fear a flock of sheep led by a lion."
 