Downloading a file from web site

Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
Hi,
I need to download a file from a web site, mimicking the behavior of a user who goes to the site and clicks the link. The problem is that the link is not the URL of the file itself, but rather a link that triggers the download (I'm not sure what the technical name for those is: the kind that opens a small window or a tab, asks you whether to open or save, and then closes the window).
Simplistically, it seems I need a facility to click that link and catch the file that the click returns.

Any idea how to do that?

Thank you,
Ehud
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18991
    

Yeah, that's how a browser does it because it needs interaction with the client to decide what to do with a downloaded file. You don't have to do anything like that and you don't have to interact with a browser to do it.

It's really not complicated at all. Here's a tutorial with an example: Reading Directly from a URL.
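
As a minimal sketch of that approach (not the tutorial's own code), the following opens a stream on the URL and copies the raw bytes to a file. The class name `UrlDownload` and the method `download` are made up for illustration, and the sketch assumes the URL serves the file directly, with no login or cookie required.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, named for illustration only.
final class UrlDownload {

    /** Copies the bytes served at the given URL into target and returns the byte count. */
    static long download(URL source, Path target) throws IOException {
        // openStream() returns a byte stream, so binary content is copied unchanged.
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

No browser is involved here: the "save" dialog is simply replaced by the target path you pass in.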
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61766
    

Ehud Kaldor wrote:the kink that opens a small window or a tab, then ask you for the open/save option and then closes the window).

"Kink"?

Such "links" aren't special -- they're just a link like any other. The behavior of opening up a Save Dialog is browser behavior; not anything specified in the HTML.

[Edit: Paul beat me to it!]


Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
Bear Bibeault wrote:
Ehud Kaldor wrote:the kink that opens a small window or a tab, then ask you for the open/save option and then closes the window).

"Kink"?

Such "links" aren't special -- they're just a link like any other. The behavior of opening up a Save Dialog is browser behavior; not anything specified in the HTML.

[Edit: Paul beat me to it!]


That was meant to be "the kind that opens a small window". Corrected.
Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
here is the code:



I am trying to download a binary .iso file. This is the console output I get:


If I paste the URL into the browser, it pops up the download dialog. Why is it saying it is http/text? Why is length=0?
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18991
    

Perhaps you haven't done something the browser would have done, like authenticating or returning a cookie to the site or something like that.

why is it saying it is http/text?


Where does it say that?

why is length=0?


The content length header is optional.

By the way, you said you were downloading a binary file, so why are you using a Reader to copy the data? Readers are for text, not binary data.
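
One way to see what the server actually sent back, before copying anything, is to look at the status code and headers. This is a rough sketch under the assumption that the URL is http or https (the cast fails otherwise); `ResponseCheck` and `summarize` are hypothetical names.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical diagnostic helper, named for illustration only.
final class ResponseCheck {

    /** Opens the URL and returns "status contentType declaredLength" as one line. */
    static String summarize(URL url) throws IOException {
        // Assumes an http/https URL; other schemes do not return an HttpURLConnection.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            int status = conn.getResponseCode();        // e.g. 200, 401, 403
            String type = conn.getContentType();        // null if the header is absent
            long length = conn.getContentLengthLong();  // -1 if the header is absent
            return status + " " + type + " " + length;
        } finally {
            conn.disconnect();
        }
    }
}
```

A status other than 200 combined with a text/html content type usually means you received an error or login page rather than the file, which would be consistent with a missing login or cookie.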
Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
Paul Clapham wrote:Perhaps you haven't done something the browser would have done, like authenticating or returning a cookie to the site or something like that.

why is it saying it is http/text?


Where does it say that?

why is length=0?


The content length header is optional.

By the way you said you were downloading a binary file. So why are you using a Reader to copy the data? Readers are for text, not binary data.


I meant text/html; I was typing without thinking.
Why do you think I am using a Reader? The InputStreamReader and BufferedReader in the imports are remnants of prior iterations and are not used in the code itself (and have already been removed from the code file).
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19785
    

Ehud Kaldor wrote:

That last line should be fos.write(buffer, 0, tempCount);. You've only read tempCount bytes, which may be any number between 0 and buffer.length. If it's smaller, and for the last block it probably is, you don't want to write everything in the buffer - only the newly read data.
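
A minimal sketch of such a copy loop, reusing the names `buffer` and `tempCount` from the post; the class `StreamCopy` and the 8192-byte buffer size are assumptions made for illustration, not the original poster's code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, named for illustration only.
final class StreamCopy {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int tempCount;
        // read() returns how many bytes it placed in the buffer, or -1 at end of stream.
        while ((tempCount = in.read(buffer)) != -1) {
            // Write only the bytes just read, not the whole buffer.
            out.write(buffer, 0, tempCount);
            total += tempCount;
        }
        return total;
    }
}
```

With `out.write(buffer)` instead, every pass would write the full buffer, so the output would contain stale bytes from earlier reads whenever a read came back short.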


Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
Rob Spoor wrote:
Ehud Kaldor wrote:

That last line should be fos.write(buffer, 0, tempCount);. You've only read tempCount bytes, which may be any number between 0 and buffer.length. If it's smaller, and for the last block it probably is, you don't want to write everything in the buffer - only the newly read data.


Thanks, Rob, but I have not reached that problem yet.
URLConnection.getContentLength() returns 0, so an empty file gets created. I tried setting the default authenticator, but to no avail.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18991
    

So why did you try to set an authenticator?
Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
Paul Clapham wrote:So why did you try to set an authenticator?

The site is credential-protected. I tried http://user:pass@site.com and I tried an authenticator, but I am still getting contentLength==0. I brought that up because it might be the issue.
I tried it on another, non-protected site, pointing to the page file (http://web.site.com/index.html), and it worked. On downloadable files on the site I am getting a 403. What am I doing wrong here?

Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18991
    

Ehud Kaldor wrote:
Paul Clapham wrote:So why did you try to set an authenticator?

the site is credential protected.


Okay. But the authenticator you used works with basic authentication. Does the site you are accessing use basic authentication, or does it use its own internal authentication process?
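
For reference, this is roughly what installing a default authenticator looks like; the original poster's code is not shown in the thread, so the class `BasicAuthSetup` and its `install` method are hypothetical. The JVM consults the authenticator only when a server answers with an HTTP authentication challenge (a 401 with a scheme such as Basic or Digest).

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

// Hypothetical helper, named for illustration only.
final class BasicAuthSetup {

    /** Installs a JVM-wide authenticator that answers HTTP auth challenges with fixed credentials. */
    static void install(String user, char[] password) {
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                // Called by the HTTP stack when the server sends an authentication challenge.
                return new PasswordAuthentication(user, password);
            }
        });
    }
}
```

A site that logs users in through its own HTML form never sends such a challenge, so this authenticator is never asked for anything and the request stays unauthenticated.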
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19785
    

Perhaps Apache's HttpClient is a better solution, especially if you need to login through an HTML form.
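
To illustrate the idea without pulling in a third-party library, here is a sketch that uses the JDK's own `java.net.http.HttpClient` (Java 11 and later) in place of Apache's HttpClient. It posts a login form, keeps the session cookie, and then requests the file. Everything site-specific is an assumption: the form field names `username` and `password`, the login and file URIs, and the class name `FormLoginDownload`. A real site may also require hidden form fields or tokens.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, named for illustration only.
final class FormLoginDownload {

    /** Posts the login form, then downloads fileUri to target; returns the download's status code. */
    static int loginAndDownload(URI loginUri, String user, String pass, URI fileUri, Path target)
            throws IOException, InterruptedException {
        // The CookieManager stores the session cookie set at login and resends it afterwards.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        // Field names are assumptions; check the site's actual login form.
        String form = "username=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&password=" + URLEncoder.encode(pass, StandardCharsets.UTF_8);
        HttpRequest login = HttpRequest.newBuilder(loginUri)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        client.send(login, HttpResponse.BodyHandlers.discarding());

        // The body is written to target whatever the status is, so check the returned code.
        HttpResponse<Path> response = client.send(
                HttpRequest.newBuilder(fileUri).GET().build(),
                HttpResponse.BodyHandlers.ofFile(target, StandardOpenOption.CREATE,
                        StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING));
        return response.statusCode();
    }
}
```

If the returned status is not 200, the target file most likely holds an error page rather than the download.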
Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
Paul Clapham wrote:
Ehud Kaldor wrote:
Paul Clapham wrote:So why did you try to set an authenticator?

the site is credential protected.


Okay. But the authenticator you used works with basic authentication. Does the site you are accessing use basic authentication, or does it use its own internal authentication process?


Good point. It is internal.
Ehud Kaldor
Greenhorn

Joined: Nov 24, 2002
Posts: 8
Rob Spoor wrote:Perhaps Apache's HttpClient is a better solution, especially if you need to login through an HTML form.


I will give it a try.
 