Not sure if you have tried this. I was testing a POST of a big file to Tomcat 5.5 from a MacBook Pro. Even though the URL does not exist in web.xml, Tomcat seems to save the file somewhere (maybe to a cache) and only then returns 404. If I post the file to the same non-existent URL from Windows, it returns the 404 right away. Why?
Details are not clear. Tomcat is not a file server, so the only way to upload a file from a client is to provide code in a webapp that can receive and process a data stream that gets sent when you do an HTTP POST of a multipart form from the web client. Tomcat may cache that data somewhere, but it's not actually part of the J2EE standard, so it doesn't have to if it doesn't want to.
What really matters is that there should be a URL configured to route the incoming data to the code that will process it. If you don't provide handler code for an upload URL, then 404 is exactly what you will get. Tomcat itself doesn't care about file uploads and neither does web.xml. That's strictly the responsibility of the web application itself.
Joined: Dec 01, 2011
When I use Windows and Mac to post the large file to a non-existent URL, Tomcat returns 404 for both. The main difference is that the post from Windows gets the 404 right away, while the post from the Mac finishes sending the whole file and then gets the 404. I am running Tomcat on Windows. I can see the network traffic in Task Manager while the Mac is doing the post, but not when Windows does it. Don't you think that is strange?
I do not really care about the differences between browsers. All I care about is why Tomcat accepts the POST for a non-existent URL from some browsers or operating systems. No matter which client does the POST, Tomcat should not accept it if the URL is not registered in web.xml at all. That is the part that confuses me.
Uploading a file in HTTP is not a matter of initiating a file copy from client to server. Like I said, Tomcat is not a File Server.
What actually happens is that IF you define a form as MIME multipart (ENCTYPE="multipart/form-data"), METHOD=POST, then the browser will prompt for a client-side file. When the form is submitted (POSTed), the file is opened and read by the client, the data is wrapped in a MIME multipart envelope so that binary content survives the trip, and the entire stream is transmitted to the server (Tomcat). This is unconditional and defined by the Internet's RFC standards. It has nothing to do with Java, J2EE or Tomcat. The same data stream will be posted to an Apache server running PHP or Perl-CGI. Or any other webserver running any other platform in any other language.
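To make that concrete, here is a minimal Python sketch of what a client assembles for a one-file multipart/form-data POST. The field name `userfile`, the filename `big.bin`, and the helper `build_multipart_body` are made up for illustration, and real browsers add their own boundary strings and headers. The point is only that the file bytes sit inside the request body itself, so they travel with the request no matter what the server does with the URL.

```python
import uuid


def build_multipart_body(field_name, filename, file_bytes, boundary=None):
    """Return (content_type_header, body_bytes) for a one-file multipart POST."""
    # Hypothetical helper: a browser does the equivalent of this internally.
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("ascii")
    # The closing delimiter ends with two extra dashes.
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    # The raw file bytes are embedded between the part header and the delimiter.
    body = head + file_bytes + tail
    return f"multipart/form-data; boundary={boundary}", body


content_type, body = build_multipart_body(
    "userfile", "big.bin", b"\x00\x01" * 8, boundary="XYZ"
)
print(content_type)
print(len(body), "bytes in the request body")
```

The client would send `content_type` as the Content-Type header and `len(body)` as Content-Length, followed by the body. For a 4GB file the body is a little over 4GB.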
Therefore the same amount of time and bandwidth will be required regardless. The data stream is shaped by the client, and what language the server is running does not matter.
Nor is there any such thing as a "non-existing URL". URL stands for Uniform Resource Locator. It looks like - but is not - a filesystem path. It's simply a string of characters following RFC-defined syntactical rules that can be parsed and used by a web application. And, in the case of multi-application servers such as Tomcat, pre-parsed to determine which application shall receive the data stream.
Every URL gets routed somewhere. The patterns in the webapp's web.xml are the first place that is checked. If the URL doesn't match any of them, Tomcat's master web.xml is checked. The master web.xml defines a default servlet that receives URLs that no other web.xml URL pattern matches. That servlet attempts to locate a resource (loosely speaking, a "file") in the webapp's WAR based on parsing the tail end of the URL as a resource path. If the resource is found, it is opened and its contents copied to the response output stream sent back to the client. If the resource is not found, the default servlet will return a response with a "404" HTTP status code. End of story.
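The lookup order described above can be sketched in a few lines of Python. This is a simplified illustration, not Tomcat's actual mapper: it assumes the webapp's web.xml and Tomcat's master web.xml have already been merged into one table, and the servlet names `UploadServlet` and `ReportServlet` are invented for the example (`default` and `jsp` are the names used in Tomcat's stock conf/web.xml).

```python
def resolve_servlet(path, mappings):
    """Pick a servlet name for a context-relative request path.

    mappings is a dict of url-pattern -> servlet name.
    """
    # 1. An exact pattern wins outright.
    if path in mappings:
        return mappings[path]
    # 2. Otherwise try path-prefix patterns ("/reports/*"), longest first.
    prefixes = [p for p in mappings if p.endswith("/*")]
    for pattern in sorted(prefixes, key=len, reverse=True):
        base = pattern[:-2]
        if path == base or path.startswith(base + "/"):
            return mappings[pattern]
    # 3. Then extension patterns ("*.jsp"), based on the last path segment.
    last_segment = path.rsplit("/", 1)[-1]
    if "." in last_segment:
        ext_pattern = "*." + last_segment.rsplit(".", 1)[-1]
        if ext_pattern in mappings:
            return mappings[ext_pattern]
    # 4. Nothing matched: fall through to the default servlet mapped to "/".
    return mappings.get("/")


mappings = {
    "/upload": "UploadServlet",     # hypothetical application servlet
    "/reports/*": "ReportServlet",  # hypothetical application servlet
    "*.jsp": "jsp",
    "/": "default",
}
for path in ("/upload", "/reports/2011/q4", "/index.jsp", "/foobar"):
    print(path, "->", resolve_servlet(path, mappings))
```

A request for `/foobar` lands on the default servlet, which then looks for a resource of that name and answers 404 when it finds none.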
Tomcat cannot and must not behave differently depending on what client sends data to it. Anybody can write an HTTP client application. As long as the client adheres to the RFC-defined HTTP protocols, Tomcat must process the request in accordance with those protocols.
In fact, one of the simplest ways to test an HTTP server is to simply send requests to it using the "telnet" text terminal program. No GUI, no client-side web logic, no nothing. Just straight text send/receive.
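The same kind of bare-text exchange can be scripted instead of typed into telnet. The following Python sketch builds the request text by hand and reads back only the status line. The helper names are invented for illustration, and the host, port, and path in the usage note are assumptions about a local test setup.

```python
import socket


def build_request(host, path, method="GET"):
    """Return the raw bytes you would otherwise type into a telnet session."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(host, port, path, timeout=5.0):
    """Send a bare request and return the first line of the server's reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        reader = sock.makefile("rb")
        return reader.readline().decode("iso-8859-1").strip()
```

Assuming a Tomcat instance listening on localhost port 8080, calling `fetch_status_line("localhost", 8080, "/foobar")` should return the status line for that path, which would be a 404 line if nothing is mapped there.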
Thanks Tim! According to your explanation, if the URL does not match any pattern in web.xml, Tomcat will not process the data stream, right? If so, why do I see network traffic after I POST a 4GB file to the bogus URL?
The test is very simple. Save the following HTML as upload.html and post a large file from Safari on the Mac to a Tomcat server.
<!-- The data encoding type, enctype, MUST be specified as below -->
<form enctype="multipart/form-data" action="<your server >/foobar" method="POST">
<!-- MAX_FILE_SIZE must precede the file input field -->
Tomcat processes the data stream in the CPU, not on the network. The network is just the mechanism that brings in the request to Tomcat and sends Tomcat's response back to the client. So there's no way that graph can be a network graph and still represent "Tomcat processing" in the literal sense of the word. A graph of the actual application processing would show CPU usage. And maybe disk or database I/O.
I'm afraid I can't tell what part of the process your graph represents, but the HTTP protocol doesn't signal Tomcat to request an upload of a file based on application logic, in case that's what you're thinking. When a FORM containing one or more file upload controls is POSTed, the file data is bundled by the client (browser) as part of a multi-part MIME-encoded POST data stream and shoved down the network at Tomcat. Unconditionally. Tomcat can - and will - allow that data to simply fall on the floor once it arrives, if there's no application logic set up to store and/or process it, but neither Tomcat nor any other web application server in any programming language can keep that data from being sent by the client, and it will definitely show as incoming network data whether the web app uses it or not.
The network traffic on the response leg of that transaction is controlled by the web application, and it's whatever the application wants it to be, as big or as small as the app makes it. If the URL handler accepts an incoming CSV and outputs a PDF report, then the output network traffic would typically be fairly large, because the PDF itself would be fairly large. But if there was no URL handler and Tomcat's default servlet took over, the outbound network traffic would be very short. Mainly the "404" message, some cookies (if applicable), and assorted minimal HTTP headers.
Tim Holloway wrote:... if there's no application logic set up to store and/or process it, but neither Tomcat nor any other web application server in any programming language can keep that data from being sent by the client, and it will definitely show as incoming network data whether the web app uses it or not.
Ah! This is what I have been talking about. I only see the network traffic increase when the request comes from Safari, Chrome, or Firefox on the Mac, but I don't see any network traffic if I POST the file from IE. The CPU, memory, and disk I/O graphs stay flat. I did not see any increase in activity there.
I understand and believe everything you said. However, it really does happen the way I describe.
Well, IE is notorious for not playing by the rules, but this particular rule is not an option. The RFC that defines HTTP file upload for the whole world (RFC 1867) very definitely defines the file data as part of the form submission process. The client and server cannot negotiate for the data, the data must be uploaded unconditionally.
Actually, since HTTP itself is a series of one-off request/response cycles and only clients can initiate them, there's no technical way that the server could prompt for the file data in any event.
Author and all-around good cowpoke
Joined: Mar 22, 2000
How about using the Tomcat Manager App to look at the bytes received and bytes sent totals? The state of the request threads would also be interesting.