Raf Szczypiorski

Ranch Hand
since Aug 21, 2008

Recent posts by Raf Szczypiorski

It is actually quite an exaggeration to say that you can write statically typed Groovy code. Groovy just allows you to declare types, and at runtime it performs casts - that's all. You can still call any method on such instances, including methods that are not defined or not visible in the declared type - welcome to duck typing.
To be clear - I am not bashing Groovy in any way. I am a huge fan of both it and Python, but I use them for different purposes. That, however, is a story for a different topic.
12 years ago
I explained what I meant to the unnamed persons, and they understood it. The answer I got was that all such data has to be read / generated beforehand and sent as a single HTTP response, which doesn't seem right, because...
I did my experiment on Tomcat 7 and managed to generate a few gigabytes of data and stream / send it to the client, even though the server had only a few hundred megabytes of memory. Also, sending such data as a response payload requires Content-Length to be set, which pretty much requires the whole payload to be generated first just to count the bytes.
But you are wrong to say that this simple test proves anything - it only proves that one of many servers (Tomcat 7) happens to be smart, and it still doesn't tell me what the standard behavior is and what it is not. I was looking for a general answer about how HTTP servers must behave when data is sent with Content-Disposition: attachment. If there is no such rule that servers must obey, then so be it, but I am not able to answer that on my own. Hence the question, to learn from more experienced people who might have the answer.

raf
12 years ago
So you are saying that streaming == media streaming? Or rather, that whenever anybody says 'streaming', everybody else understands it as 'media streaming'? That's an interesting point of view, but not the only correct one, I would say. Where I come from, streaming data simply means processing data (uploading, transforming, whatever) in a way that does not require buffering all of it in memory first, because there can be a whole lot of it. I guess the inventors of StAX had a similar notion. Media streaming is just one use case.

To get things straight - no, I did not have media streaming in mind, but rather the ability of the server to start transmitting the data of a huge file / dynamically created bytes before all of it has been read / generated, thus reducing the memory footprint on the server. Does Content-Disposition allow this? Is there an RFC that defines this?

Paul, I know you are a bartender here, so don't get me wrong, but I consider your post to be a) a little aggressive in tone and content; and b) not really contributing to the topic, since you gave no answers, just implications of your own. That might just be my point of view, though.

raf
12 years ago
Hi Paul. Yes, that's what I mean by streaming - the client starts receiving while the server has not yet finished producing the response.
Please educate me if what I say is wrong - can you provide links that describe this scenario, and that also explain what streaming is and what it is not? Because I obviously don't get the difference.

raf
12 years ago
My main concern is this: we use this library and we allow the users to export some data as XML. The export data set might be small, or it might be huge - the actual configuration is done per 'export configuration'. It was all fine when the amount of data was small, but now we have problems, as users export more and more data - and all of it is buffered on the server side. So, if the XML takes 100 MB of memory, 10 users exporting at the same time (which is not actually that many, and is a realistic scenario) take up 1 GB, and it all sits on the server side for a while. We do have 1 GB to spare, but then again, there are also times when 100 users export... and then we have memory problems. So I asked support about it, and they say it is impossible for HTTP to stream, which I don't believe, as there is a multitude of sites that serve huge downloads (Linux distros, Rapidshare, whatever), and I don't believe they all buffer everything before sending it to the client - it just doesn't make sense.
So, I just wanted to ask what actually happens when I do that with Content-Disposition and the rest of the story. I certainly don't care whether the data is sent in a single packet, or whether some gateways perform buffering - not my problem. I just find the whole story the support guys tell me really hard to believe.
12 years ago
The thing is that I _don't_ want the server to buffer ;d But if you say there is nothing I can do to force it, there is probably also nothing that I can do to prevent it.
What does the venerable Tomcat do in such a case?
12 years ago
Hi. Yes, I know about the TCP/IP stack and that HTTP is an application-layer protocol a few floors above it. The question was more: when does the HTTP server start writing to the connection to the client, and when do the TCP packets actually get sent? Only after the whole response is ready somewhere (buffered on the server), or does it happen on the fly? It must be on the fly, as I don't believe downloads of big files (like whole Ubuntu distributions) are buffered on the server, but I haven't found anything that would support that claim.
Yes, I know I can run a practical experiment, but I was rather searching for the theory (articles, tips, maybe some RFCs) behind the practice.
The reason I ask is that we are using a funny UI library that was imposed on my team; it is HTTP based, but it is not a webapp framework. You can save files on the client machine, but the whole payload must first be prepared on the server, and I debugged it - it is all buffered there! I asked support about it, and they said that HTTP doesn't support streaming anyway, so they can't do anything about it. I want to respond to that, but all I have is common sense and no links to back it up. The RFC where Content-Disposition is described didn't help either.
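As far as I can tell, the servlet-side pieces involved are the response buffer and the commit point; here is a rough sketch of what I mean (the 8 KB buffer size is just an example value):

    import java.io.IOException;
    import javax.servlet.ServletOutputStream;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class CommitDemoServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            resp.setBufferSize(8 * 1024);              // the container may hold back this much before writing
            ServletOutputStream out = resp.getOutputStream();
            out.print("first part of the payload");   // may still sit in the container's buffer
            out.flush();                               // pushes the status line, headers and buffered bytes out
            // From here on the response is committed: status and headers are on the wire
            // and cannot be changed any more, but further body bytes can keep flowing.
            System.out.println("committed? " + resp.isCommitted());  // true after the flush
        }
    }

Whether those bytes immediately become TCP packets is of course up to the container and the OS, which is exactly the part I have no references for.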

raf
12 years ago
I should probably have mentioned that I meant a response with Content-Type: application/octet-stream and Content-Disposition: attachment; filename=...
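In code, the response I am talking about is set up roughly like this (the filename is just a placeholder):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class AttachmentServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            // An opaque byte stream that the browser should save to disk rather than render
            resp.setContentType("application/octet-stream");
            resp.setHeader("Content-Disposition", "attachment; filename=\"export.xml\"");
            // ...and then the payload is written to resp.getOutputStream()...
        }
    }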
Does this support streaming?

Regards,
Raf
12 years ago
Hi. When I get the ServletOutputStream for the response and try to write a large file to it - reading the file in chunks into a buffer and then writing that buffer to the output:

(please ignore closing streams, this code is just for demonstration)
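Something along these lines, with the file path being just a placeholder:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import javax.servlet.ServletOutputStream;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class FileStreamingServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            InputStream in = new FileInputStream("/tmp/huge-export.xml");  // path is just an example
            ServletOutputStream out = resp.getOutputStream();
            byte[] buffer = new byte[8 * 1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);  // only ever one buffer's worth of the file in memory
                out.flush();                 // hand the chunk over to the container
            }
        }
    }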

The ServletOutputStream can at a certain point get 'committed', whatever that means. The question is - when is the data sent to the client? Is it buffered somehow on the server and then sent as a single HTTP response with a huge payload, or does something else happen?
12 years ago
Hi. The EJB 3.1 spec, section 21.2.2 (Programming Restrictions), says:

The enterprise bean must not attempt to create a class loader; obtain the current class loader;
set the context class loader; set security manager; create a new security manager; stop the
JVM; or change the input, output, and error streams.

OK, so my EJB can't do it. But what about classes that the EJB uses, possibly deep in the call hierarchy? Do they also fall under this restriction? They are technically different classes, but they execute in the EJB context (they can, for example, use the wrapping EJB's java:comp naming context).

I am asking because we are thinking of using Groovy for ad-hoc scripting (formatting output, defining a configuration DSL for some parts, and the like), and the way it works is via a custom GroovyClassLoader. If I can't create my own class loader, I can't use Groovy either, which would be bad.
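Roughly the kind of usage I mean (the script source is just a toy example):

    import groovy.lang.GroovyClassLoader;
    import groovy.lang.Script;

    public class GroovyScriptingDemo {
        public static void main(String[] args) throws Exception {
            // GroovyClassLoader compiles script source into a Class at runtime -
            // which is exactly the "create a class loader" part the spec forbids.
            GroovyClassLoader gcl = new GroovyClassLoader();
            Class<?> scriptClass = gcl.parseClass("def x = 21\nreturn x * 2");
            Script script = (Script) scriptClass.newInstance();
            System.out.println(script.run());  // prints 42
        }
    }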

As a side note - do you EJB gurus actually comply with these restrictions? I mean, no filesystem, no class loaders, no threading... Some of them make sense, others seem overly restrictive. For example, we have a WAR that packages JPA and EJBs in WEB-INF/lib (separate jars). The servlet container can use the filesystem - there is even a dedicated method somewhere (ServletContext?) to get the real path of a resource - but as soon as the Servlet / EJB border is crossed, I can't access files any more, even though it is the same JVM (in our case).

szczyp
Hi ranchers. I have some trouble understanding the added value of async servlets / filters. Here is the deal: I have a servlet, and it has to wait for something, like a response from a database or a web service endpoint, which can take time. So, instead of wasting the thread serving the request, my servlet starts async processing and its thread returns immediately, ready to serve other requests. This is supposed to improve server scalability and so on.
What I don't seem to grasp is how this is supposed to work. Most of the examples I have seen either start a new thread that works on the returned AsyncContext instance and simply start it (see the Tomcat 7 examples in $CATALINA_HOME/webapps/examples/WEB-INF/classes/async), or use some executor / thread pool they create themselves; there is even an AsyncContext.start(Runnable) method to take care of the thread, make it managed, and possibly add some Java EE services to it (like security context propagation). So this new thread has to wait instead of the container-managed one (but it can also be container-managed!). What's the point of that? Wouldn't it be easier to just add threads to the container, so that it has more of them to handle requests? Of course you can't add threads without limits - that's why this async stuff was invented, right? But async requests require new threads anyway - so how is this different? If the threads are not container-managed, I create more and more of them and kill the machine. If they are container-managed, they sit waiting while other requests could have been served by them.
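For reference, the pattern I have in mind looks roughly like this (the sleep just stands in for the slow database / web service call):

    import java.io.IOException;
    import javax.servlet.AsyncContext;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet(urlPatterns = "/slow", asyncSupported = true)
    public class SlowResourceServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            final AsyncContext ctx = req.startAsync();  // request thread goes back to the pool once doGet returns
            ctx.start(new Runnable() {                  // ...but some other thread still has to do the waiting
                public void run() {
                    try {
                        Thread.sleep(5000);             // stand-in for the slow backend call
                        ctx.getResponse().getWriter().println("done");
                    } catch (Exception e) {
                        // ignored for the sake of the example
                    } finally {
                        ctx.complete();                 // tell the container the response is finished
                    }
                }
            });
        }
    }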

I am obviously missing something important here, as I must be wrong in my understanding. Please, ranchers, guide me to understanding async servlets!
12 years ago
OK, but here you are using the implicit *.jsp mapping, not the default servlet, and since the file doesn't exist, you get a 404.
Also, the fact that you get something from your (presumably) Tomcat doesn't mean it is valid according to the specification - it might be a 'custom' bug.
13 years ago
Aah - the difference seems to be in the servletPath: for '/' the servlet path is everything after the context path, whereas for '/*' it is an empty string.
This is quite confusing.
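A concrete example, assuming a webapp deployed under the context path /app and a request to /app/foo/bar:

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class MappingDemoServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            // mapped to "/"  (default servlet): servletPath = "/foo/bar", pathInfo = null
            // mapped to "/*" (catch-all)      : servletPath = "",         pathInfo = "/foo/bar"
            resp.getWriter().println("servletPath=" + req.getServletPath()
                    + " pathInfo=" + req.getPathInfo());
        }
    }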
13 years ago
I am not sure if this is correct. Here:
http://www2.roguewave.com/support/docs/leif/leif/html/bobcatug/7-3.html
under 7.3.3 it says:
A mapping that contains the pattern <url-pattern>/</url-pattern> matches a request if no other pattern matches.

Additionally, the Servlet 3.0 spec says (12.2, Specification of Mappings):
The empty string ("") is a special URL pattern that exactly maps to the application's context root, i.e., requests of the form http://host:port/<context-root>/. In this case the path info is '/' and the servlet path and context path is empty string ("").
A string containing only the '/' character indicates the "default" servlet of the application. In this case the servlet path is the request URI minus the context path and the path info is null.

So, what you said about the '/' mapping seems to be incorrect - that behavior belongs to the empty '' mapping. Please correct me if I am wrong.


What I don't understand is this: a '/*' mapping catches everything. The default servlet mapping '/' is invoked when no other mapping matches, so it is also a catch-all. What is the difference between the two?
13 years ago
As in the topic - what is the semantic difference between these two mappings? I fail to see any, but I am most likely wrong about that.
13 years ago