Call Me Michael

Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

Hello, XMLer's -
My name is Michael Ernest, and I'll be helping Ajith moderate this forum, as certain of his professional duties will reduce the time he can spend with us. My job is to make sure you really, really wish he didn't have to cut back his contributions.
Seriously, though, as my contact page entry says, I have maintained ignorance on XML up until now; I have a leering suspicion of technologies that get praised as silver bullets, which is why I spend no time on EJBs either.
Having said that, it's time to walk a few miles in the shoes of my enemy, and learn through moderating why you all are excited. I'm picking up a few books on the subject, and I'm going to get in there and see if I can't even be of some technical help after a little experimenting.
The question I hope to answer for myself is, 'Why do I want to switch to a text-encoded means of interprocess communication when binary-encoded forms weren't a problem to begin with?' Well of course I've heard the marketing-oriented answers; now it's time to see if the technology itself really supports the pat answers.
I'm ready to be educated! Let me ask a few ignorant questions, if you don't mind, and I'll hassle you about the naming policy in return.


Make visible what, without you, might perhaps never have been seen.
- Robert Bresson
Scott Bain
Ranch Hand

Joined: Dec 21, 2001
Posts: 46
Michael:
Moving from binary representations of process communication to text basically enables platform-neutrality. When your remote method call, for example, does not "care" whether the system at the receiving end is NT, Linux, Unix, Solaris, etc., or whether the object being addressed is written in Java, C++, VB, Python, etc., then we're creating an environment truly amenable to distributed objects and a "semantic web".
XML is not a silver bullet, you are right, and in fact it's not even a good idea for many applications. However, where you want to create platform and language neutrality, it is very powerful. The "Web Services" concept, which is really just a way to allow systems to grow more organically, is made possible by this neutrality.


Scott Bain
Senior Consultant
Net Objectives
425-591-5844
http://www.netobjectives.com
----------------------------
* Sign up for our free newsletter by sending an e-mail to info@netobjectives.com
* Learn about and join our design pattern community of practice by going to www.netobjectives.com/dpexplained
* Alan Shalloway & Jim Trott's - Design Patterns Explained: A New Perspective on Object-Oriented Design is now available
* Our new CDROM-based XML training is now available as well
Corey McGlone
Ranch Hand

Joined: Dec 20, 2001
Posts: 3271
XML is not a silver bullet, you are right, and in fact it's not even a good idea for many applications.

One thing I had run into in the past was what I referred to as "file bloat." I started with an application that utilized an applet to parse a comma-delimited file and my job was to convert the comma-delimited file to an XML file and use that instead (I won't get into any more details).
Anyway, what I found was that, after adding all of the necessary markup, my file (which was already ~500KB) inflated to ~1.5MB! This ended up having a significant impact on the performance of the application.
When I looked into this a little more, I found that the XML files I created were roughly 67% markup and 33% data. Given that, it only made sense that my 500 KB file tripled to 1.5 MB with the extra markup (a 200% increase). Likewise, I would expect any file to grow proportionately. For example, a comma-delimited file that was 10 MB to begin with would become 30 MB after converting it to XML - an increase of 20 MB! It would seem to me that converting large files to XML in this way could lead to some pretty serious performance problems.
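The arithmetic above can be checked on a single record. The following is only a rough sketch: the field names (`id`, `name`, `price`) and the element-per-field layout are assumptions made for illustration, not the format of the application described here, and values are not XML-escaped.

```java
import java.nio.charset.StandardCharsets;

final class XmlBloatSketch {
    // Hypothetical record layout: three fields per comma-delimited row.
    static final String[] FIELDS = {"id", "name", "price"};

    /** Wraps each comma-separated value in an element named after its field. */
    static String toXmlRow(String csvRow) {
        String[] values = csvRow.split(",", -1);
        StringBuilder sb = new StringBuilder("<item>");
        for (int i = 0; i < FIELDS.length; i++) {
            sb.append('<').append(FIELDS[i]).append('>')
              .append(values[i]) // no escaping: this sketch assumes plain values
              .append("</").append(FIELDS[i]).append('>');
        }
        return sb.append("</item>").toString();
    }

    /** Size of the text once encoded as UTF-8. */
    static int byteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String csv = "1001,widget,9.99";
        String xml = toXmlRow(csv);
        int csvBytes = byteLength(csv);
        int xmlBytes = byteLength(xml);
        System.out.println(xml);
        System.out.printf("CSV: %d bytes, XML: %d bytes, ratio: %.2f%n",
                csvBytes, xmlBytes, (double) xmlBytes / csvBytes);
    }
}
```

The growth factor depends mostly on how long the tag names are relative to the values, so short values under long element names inflate the most.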
Usually, I seem to hear the solution of "Throw more hardware at it, and it'll run faster." Unfortunately, that doesn't really solve the problem - it simply avoids it. Is this, in practice, a problem with XML files, or is this easily solved in some way?
Sorry that I got away from binary files, Michael, but I thought the growth of ASCII files was somewhat along the same path of thought.
Corey


SCJP Tipline, etc.
Mapraputa Is
Leverager of our synergies
Sheriff

Joined: Aug 26, 2000
Posts: 10065
Welcome aboard, Michael Ernest!
It's nice to see that your daily communications with other JavaRanch members and moderators finally prompted you to abandon your ignorance on the most prominent technology trends.
Should I update our contact page?
'Why do I want to switch to a text-encoded means of interprocess communication when binary-encoded forms weren't a problem to begin with?'
One of the best sources of XML wisdom is XML.com weekly reports from the XML developer mailing lists. This is my favorite text "what XML is about":
http://www.xml.com/pub/a/2001/01/24/deviant.html
I wouldn't recommend it for absolute beginners, but I recommend it for you with gloating delight.
I hope this article answers Corey's concern as well.
-------------------------------
Mapraputa Is,
Sun Certified Programmer for the Java™ 2 Platform.
IBM Certified Developer - XML and Related Technologies.
Co-moderator of XML Certification forum
Co-author of Java 2 Certification Passport


Uncontrolled vocabularies
"I try my best to make *all* my posts nice, even when I feel upset" -- Philippe Maquet
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

At the risk of starting yet another "Why XML?" topic, you both make good points I'd like to respond to.
If using text representation is an enabler, it's because it goes back to the lowest common denominator for data exchange -- pure string recognition. We've been doing binary data exchange for decades with well-known (sometimes widely used) standards -- MIME, RPC & XDR, EDI, to name a few. We can teach our browsers to understand a new MIME type and boom, we can just play a music file instead of storing it. This approach conserves network resources by making the "DTD" a prerequisite to the exchange. We've been doing it this way long enough that the problems of distributed computing are well-known.
In the meantime, some proponents of XML -- usually people simply hellbent on pushing a new technology, whatever it is -- intend simply to exploit those problems as bad, or business-hostile, simply to push their own new simple thing.
In XML we say "let's add the data definition to the data itself and let's make it both readable and self-describing." Fine. There are appropriate uses for that, no question. I personally wouldn't hang my hat on "platform-neutrality" because I think it's a perceived benefit, not a measurable one. And the inherent danger is clear to Corey as well as me, a former RPCer who sees this truck coming and sidesteps it: you can easily spend more resources describing and re-describing data than you do pushing it, and it's an open-ended problem. If you don't spend time designing that text representation -- we more or less have to on the binary side -- then the hidden bear of wildly inflated I/O cost bites you good.
And in the end, you haven't solved the main problem, you've just made it accessible to a wider audience. Namely, you still have to agree with your exchange partners on the data format. Ok, so now you don't have a standards committee telling you what the standard is. You can argue without waiting for someone else's ruling, and that has benefits too, but I don't think they are technical ones.
This is my snobby, elitist argument: XML in the short run will fool people into believing that data exchange is now easy because design is now irrelevant. I hypothesize that my preaching here is largely to the choir (I will gladly be wrong if the evidence permits), and so here my expressed anxiety may not hold much weight. But it helps describe my position: I haven't shunned XML up to this point because it's a bad or dangerous technology; I've shunned it to avoid using it alongside bad and dangerous programmers who jump at anything that looks easy.
[ January 15, 2002: Message edited by: Michael Ernest ]
Mapraputa Is
Leverager of our synergies
Sheriff

Joined: Aug 26, 2000
Posts: 10065
In the meantime, some proponents of XML -- usually people simply hellbent on pushing a new technology, whatever it is -- intend simply to exploit those problems as bad, or business-hostile, simply to push their own new simple thing.
Isn't it too simple a view, Michael?
There are many benefits in using XML and many ways it can be used, some of which can be quite orthogonal. You may not need all of XML's virtues for a particular application. Mapraputa, for example, used XML as a cheap, simple alternative to a database, to keep data for our contact page, and benefited from standard tools for data transformation (XSLT). She wasn't interested in platform portability, internationalization, validation etc., but inexpensive maintenance should be named. I think you do not really object to such modest applications; it is the "simple" tag that arouses your animus.
This is my snobby, elitist argument: XML in the short run will fool people into believing that data exchange is now easy because design is now irrelevant.
I understand your concern. I won't say any more here, since I was already criticized by G Vanin for my malicious intentions to complicate essentially easy XML stuff, but I want to express a certain sympathy for your elitist vision.
But it helps describe my position: I haven't shunned XML up to this point because it's a bad or dangerous technology; I've shunned it to avoid using it alongside bad and dangerous programmers who jump at anything that looks easy.
Mm... Just a few words of caution. Rephrasing Joseph M. Williams, "since our language seems to reflect our quality of mind more directly than does our ZIP code, it is easy for those inclined to look down on others to imagine that using XML signals mental or moral deficiency."
And in the end, you haven't solved the main problem, you've just made it accessible to a wider audience. Namely, you still have to agree with your exchange partners on the data format.
Isn't it an inherent "defect" of any language - it has to be shared to be understood?
To all other participants in this discussion: please ignore Michael's and my courteous hostility; it's no more than a shared language.
[ January 15, 2002: Message edited by: Mapraputa Is ]
Scott Bain
Ranch Hand

Joined: Dec 21, 2001
Posts: 46
There is little doubt that an XML document is going to be less svelte than a comma-separated-values document will be. So, I submit, will a SQL table or just about any other format for information that "does more" than simple CSV.
It's also possible to reduce the XML overhead through better design -- an attribute-heavy approach, for one. Leaving out tabs and linefeeds (flat serialized XML) helps too.
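To make the design point concrete, here is a small comparison of the same record written two ways. This is an illustrative sketch only: the `item` record and its fields are made-up names, and the values are assumed to need no escaping.

```java
final class XmlShapeSketch {
    /** Element-per-field layout, pretty-printed with tabs and linefeeds. */
    static String elementStyle(String id, String name, String price) {
        return "<item>\n"
             + "\t<id>" + id + "</id>\n"
             + "\t<name>" + name + "</name>\n"
             + "\t<price>" + price + "</price>\n"
             + "</item>\n";
    }

    /** Attribute-heavy layout, serialized flat with no extra whitespace. */
    static String attributeStyle(String id, String name, String price) {
        return "<item id=\"" + id + "\" name=\"" + name + "\" price=\"" + price + "\"/>";
    }

    public static void main(String[] args) {
        String elements = elementStyle("1001", "widget", "9.99");
        String attributes = attributeStyle("1001", "widget", "9.99");
        System.out.println("element style:   " + elements.length() + " chars");
        System.out.println("attribute style: " + attributes.length() + " chars");
    }
}
```

Each field name appears once in the attribute form instead of twice (open and close tag), which is where most of the saving comes from.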
Also, issues like compression and security are intentionally left out of the XML specification, allowing the developer to use whatever is desired (or, perhaps, mandated by a government or industry standard). That's part of the whole "Extensible" thing.
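As one example of layering compression on top, the sketch below gzips a repetitive XML document with the JDK's `java.util.zip.GZIPOutputStream` and compares sizes. The document content is invented for the demonstration, and gzip is just one choice a developer might make; nothing in XML prescribes it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class XmlGzipSketch {
    /** Compresses a byte array in memory using gzip. */
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Builds a made-up document whose markup repeats on every row. */
    static String repetitiveXml(int rows) {
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < rows; i++) {
            sb.append("<item><id>").append(i).append("</id><name>widget</name></item>");
        }
        return sb.append("</items>").toString();
    }

    public static void main(String[] args) {
        byte[] raw = repetitiveXml(1000).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.printf("raw: %d bytes, gzipped: %d bytes%n", raw.length, packed.length);
    }
}
```

Repeated tag names compress well, so this helps with transfer size; it does nothing for the cost of parsing the document once it is decompressed.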
XML is certainly not a magic bullet, nor will it replace SQL on the database side. Whether or not platform ubiquity is "relatively minor" depends an awful lot on your problem domain. To accomplish the goals of Web Services and/or .Net absolutely requires an approach that crosses languages, binary formats, and the like. So does any attempt to introduce semantics into content.
-S-
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

ME: In the meantime, some proponents of XML -- usually people simply hellbent on pushing a new technology, whatever it is -- intend simply to exploit those problems as bad, or business-hostile, simply to push their own new simple thing.
MI: Isn't it too simple a view, Michael?
ME: On that note: XML for me carries some of the same red flags as "Y2K" stuff. It seems to me like it solves a problem that doesn't really exist, and it's quickly become a buzz-technology. That said, I admit my slant/bias/bigotry so that I may overcome it, not simply inflame honest, hard-working Americans with jobs.
MI: I think you do not really object such modest applications, it is "simple" tag that arouses your animus.
ME: This is the nasty, snobby, devil's advocate side of what I said: XML "opens data exchange to the lowest common denominator." The good side: "When you want to do something that is simple, text-based exchange makes all the sense in the world. Awesome." It's when people think XML will do the simplifying that I get worried.
MI: "...it is easy for those inclined to look down on others to imagine that using XML signals mental or moral deficiency"
ME: Well I'm not so young anymore that I make gross associations like that, but when I see XML as an "enabler" (with every apology to Scott, I don't mean to pick on what you said), I squirm in my seat. The fair thing to do would be to say "I just don't understand why people are as excited as they seem about this." I don't even think PDF got this kind of media attention, and that's an innovation that completely changed data exchange in one part of our industry.
And in the end, you haven't solved the main problem, you've just made it accessible to a wider audience. Namely, you still have to agree with your exchange partners on the data format.
MI: Isn't it an inherent "defect" of any language - it has to be shared to be understood?
ME: I was on the cusp of asking "why in the world do you want to repeatedly send your data description with your data? Can we say redundant?" and never quite got there. I get the impression, again from people who are not here at the moment, that XML's "self-describing" approach somehow solves the need to agree on exchange.
MI: To all other participants in this discussion: please, ignore Michael's and mine courteous hostility, it's no more than a shared language.
ME: I prefer "vigorous debate" to "courteous hostility," but as a disclaimer: no feelings were intentionally abused in the making of this rant.
Mapraputa Is
Leverager of our synergies
Sheriff

Joined: Aug 26, 2000
Posts: 10065
This is nasty, snobby, devil's advocate side of what I said
Just wanted to give you a hard time, so you wouldn't think it's an easy job to moderate the XML forum.
On that note: XML for me carries some of the same red flags as "Y2K" stuff. It seems to me like it solves a problem that doesn't really exist, and it's quickly become a buzz-technology.
Hm... doesn't really exist... As a former honest hard-working database programmer I wrote a lot of code whose sole mission was to read a file in one proprietary format, do some trivial modifications and write the result into a database - this kind of "data exchange". I wish I had had my input in XML format and an XSLT processor ready to do all the necessary transformations.
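For readers who have not seen that kind of transformation, here is a minimal sketch using the JDK's standard `javax.xml.transform` API. The input shape (`items`/`item` with `id` and `name` attributes) and the comma-delimited output are assumptions chosen for the example, not the proprietary format mentioned above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltSketch {
    // Hypothetical stylesheet: emits one "id,name" line per <item>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/items'>"
      + "<xsl:for-each select='item'>"
      + "<xsl:value-of select='@id'/><xsl:text>,</xsl:text>"
      + "<xsl:value-of select='@name'/><xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML text and returns the result. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item id='1' name='a'/><item id='2' name='b'/></items>";
        System.out.print(transform(xml));
    }
}
```

The reshaping rules live in the stylesheet rather than in hand-written parsing code, so changing the output format means editing the stylesheet, not the program.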
It's when people think XML will do the simplifying that I get worried.
Simplifying what? Data exchange? I think XML (here by "XML" we really mean XML + all surrounding technologies) does simplify it because it provides standard tools to read/parse data (DOM/SAX parsers), validate it (DTD and Schema) and transform it (XSLT). That saves the programmer's effort. But if by "simplifying" you mean "free the programmer from thinking", "no need to worry about design, performance etc" - then everybody can only agree with your worries, except for marketing people, maybe. It seems to me like you are trying to fight a problem that doesn't really exist.
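As a small illustration of the "standard tools" point, the sketch below reads attribute values with the JDK's DOM parser instead of hand-written string handling. The document shape (`item` elements with a `name` attribute) is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class DomReadSketch {
    /** Returns the "name" attribute of every <item> element, in document order. */
    static List<String> itemNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList items = doc.getElementsByTagName("item");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                names.add(((Element) items.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            // Parser configuration, I/O, and well-formedness errors all end up here.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(itemNames("<items><item name='a'/><item name='b'/></items>"));
    }
}
```

A DOM parser builds the whole tree in memory; for very large files, a SAX parser that streams events would be the usual alternative.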
I was on the cusp of asking "why in the world do you want to repeatedly send your data description with your data? Can we say redundant?" and never quite got there.
Repeatedly. If you have two applications that you know will repeatedly exchange data, you may very well manage without XML! It is when you do not know who your clients are or will be tomorrow that you want to employ a standard, well-defined, widely used data format, because chances are your recipient will be able to read the format.
How was that for "vigorous debate"?
Michael Ernest
High Plains Drifter
Sheriff

Joined: Oct 25, 2000
Posts: 7292

Just wanted to give you a hard time, so you wouldn't think it's an easy job to moderate the XML forum.
I was told you weren't permitted to leave XML Certification. Now I can see Ajith set me up all along. Don't you worry, both of you will be repaid for your treachery.
It seems to me like you are trying to fight a problem that doesn't really exist.
I wonder if this guy agrees with you.
It is when you do not know who your clients are or will be tomorrow that you want to employ a standard, well-defined, widely used data format.
What is it XML does that really improves on that scenario? I guess I'll be finding that out as I go...

[ January 15, 2002: Message edited by: Michael Ernest ]
Ajith Kallambella
Sheriff

Joined: Mar 17, 2000
Posts: 5782
I apologise for not formally introducing Michael to the citizens of the XML forum.
Welcome Michael, glad to have you with us, and I'm sure you will be able to usurp my throne very soon.
Really, I appreciate your co-moderation and very valuable contributions.
Thank you and welcome.
 