• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Tim Cooke
  • Campbell Ritchie
  • Ron McLeod
  • Junilu Lacar
  • Liutauras Vilda
Sheriffs:
  • Paul Clapham
  • Jeanne Boyarsky
  • Henry Wong
Saloon Keepers:
  • Tim Moores
  • Tim Holloway
  • Stephan van Hulst
  • Piet Souris
  • Carey Brown
Bartenders:
  • Jesse Duncan
  • Frits Walraven
  • Mikalai Zaikin

Any way to get whole webpage content into a notepad?

 
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
i am using this link news
i want to get whole url content into notepad...there is a More  stories button in this link which contains hidden contents..i also want to get these hidden contents into a notepad..any idea?
 
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
What does "into notepad" mean - saving the entire HTML into a text file? If so, the HtmlUnit library is a great way to access web pages programmatically. It can also simulate clicking on buttons or links, which should take care of your use case.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
yes...i want to read entire webpage content and write it into the notepad..
thanks for your response..then i will try it with HTML Unit...
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

for this code....i am getting this error..
Exception in thread "AWT-EventQueue-0" java.lang.NoClassDefFoundError: org/apache/commons/collections/set/ListOrderedSet
what should i do?
 
Java Cowboy
Posts: 16084
88
Android Scala IntelliJ IDE Spring Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
This NoClassDefFoundError means that you are missing a JAR file in the classpath when you run the application.

Make sure that you have all the necessary JAR files in the classpath. You are probably using some web client library (where do the classes WebClient and HtmlPage come from? They are not from the standard library) which needs other JAR files. You need to find out what libraries are necessary and put them in the classpath.
 
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
By the way, the "More" button does not show hidden contents.
It retrieves more content from the server.

The former implies that the content is there on the page already, just hidden by some class or other.
The latter says that it isn't there at all until the button is clicked to retrieve it.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
thanks....i am trying
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Dave Tolls wrote:By the way, the "More" button does not show hidden contents.
It retrieves more content from the server.

The former implies that the content is there on the page already, just hidden by some class or other.
The latter says that it isn't there at all until the button is clicked to retrieve it.



is there any way to get the content from MORE button without click using coding?
 
Tim Moores
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

surya preethaaa wrote:java.lang.NoClassDefFoundError: org/apache/commons/collections/set/ListOrderedSet


The library comes with a whole lot of jar files in the "lib" directory - you need to put them all on the classpath.

is there any way to get the content from MORE button without click using coding?


The library emulates a browser, so whatever you need to do in a browser manually, you need to do via code programmatically.
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
That is the same as asking "is there any way to get the content of a web page without making a request".
The content sits on the server, so you have to make a request to get it.
In this case that request is made by clicking the More button.

You could look at the request made by the button and duplicate it in code I suppose.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
@TIM   I tried with different jar file versions ...but same result...still i am trying

Dave Tolls wrote:That is the same as asking "is there any way to get the content of a web page without making a request".
The content sits on the server, so you have to make a request to get it.
In this case that request is made by clicking the More button.

You could look at the request made by the button and duplicate it in code I suppose.



if it is a button there is a chance of using the request which is used by the button..
but in other cases like  http://www.templatemonster.com/lazy-load-effect-website-templates/ " target="_blank" rel="nofollow">lazy load without using button ...is there a chance to get the request?

 
Tim Moores
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

I tried with different jar file versions ...but same result...


If there are error messages, post them in full. Also post the entire command line you're using, and describe the directory layout of all involved files.

but in other cases like  http://www.templatemonster.com/lazy-load-effect-website-templates/ lazy load without using button ...is there a chance to get the request?


Again, the library emulates a browser. If the web page loads additional stuff if the user scrolls the page, then that's what you need to do programmatically. HtmlUnit supports scrolling.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator


this is my complete code to scroll the webpage using HTML unit..i am sure  that the path of geckodriver is correct..still getting this error

Exception in thread "main" java.lang.IllegalStateException: The path to the driver executable must be set by the webdriver.gecko.driver system property; for more information, see https://github.com/mozilla/geckodriver. The latest version can be downloaded from https://github.com/mozilla/geckodriver/releases
at com.google.common.base.Preconditions.checkState(Preconditions.java:199)
at org.openqa.selenium.remote.service.DriverService.findExecutable(DriverService.java:109)
at org.openqa.selenium.firefox.GeckoDriverService.access$000(GeckoDriverService.java:37)
at org.openqa.selenium.firefox.GeckoDriverService$Builder.findDefaultExecutable(GeckoDriverService.java:95)
at org.openqa.selenium.remote.service.DriverService$Builder.build(DriverService.java:296)
at org.openqa.selenium.firefox.FirefoxDriver.createCommandExecutor(FirefoxDriver.java:277)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:247)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:242)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:238)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:127)
at uu.main(uu.java:12)
C:\Users\Acer\AppData\Local\NetBeans\Cache\8.1\executor-snippets\run.xml:53: Java returned: 1
BUILD FAILED (total time: 3 seconds)
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
htmlunit-core-js-2.5
htmlunit-2.9.jar
commons-codec-1.10
commons-io-2.5
commons-lang3-3.4
commons-logging-1.2
cssparser-0.9.20
htmlunit-2.23
htmlunit-core-js-2.23
httpclient-4.5.2
httpcore-4.4.4
having all this jar files

 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
now the above coding runs and giving this error

Nov 03, 2016 2:02:43 PM org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Attempting bi-dialect session, assuming Postel's Law holds true on the remote end
log4j:WARN No appenders could be found for logger (org.apache.http.client.protocol.RequestAddCookies).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.openqa.selenium.WebDriverException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:30992 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
Build info: version: 'unknown', revision: '1969d75', time: '2016-10-18 09:43:45 -0700'
System info: host: 'TVA29', ip: '192.168.1.30', os.name: 'Windows 7', os.arch: 'x86', os.version: '6.1', java.version: '1.8.0_66'
Driver info: driver.version: FirefoxDriver
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:91)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:601)
at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:241)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:128)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:259)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:247)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:242)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:238)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:127)
at uu.main(uu.java:16)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:30992 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:158)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at org.openqa.selenium.remote.internal.ApacheHttpClient.fallBackExecute(ApacheHttpClient.java:142)
at org.openqa.selenium.remote.internal.ApacheHttpClient.execute(ApacheHttpClient.java:88)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:108)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:64)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:141)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:82)
... 9 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:74)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
... 24 more
C:\Users\Acer\AppData\Local\NetBeans\Cache\8.1\executor-snippets\run.xml:53: Java returned: 1
BUILD FAILED (total time: 38 seconds)
 
Tim Moores
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

htmlunit-core-js-2.5
htmlunit-2.9.jar


You should delete these two files, since the current versions (2.23) are also included; having older jar files with the same classes in the classpath as well can only confuse matters.

I looked a bit into this using the pure Java version (rather than the geckodriver, with which I'm not familiar), and came to 2 conclusions: 1) that page uses a whole lot of JavaScript, making it doubtful that the pure Java versions can handle it (it can't handle all JavaScript libraries), and 2) contrary to what I said earlier, HtmlUnit actually may not properly handle page scrolling at all - I found a lot of questions around this very topic all over the place, but no solution that seems to work unequivocally.
 
Tim Moores
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Here's the code I used. It produces lots of Javascript warnings and errors, so you'll have to dig around a bit for the actual output.

 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Tim Moores wrote:Here's the code I used. It produces lots of Javascript warnings and errors, so you'll have to dig around a bit for the actual output.



thanks..let me try using this code..
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
i tried the code ...and finally i am getting this error

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.http.conn.scheme.Scheme.<init>(Ljava/lang/String;ILorg/apache/http/conn/scheme/SchemeSocketFactory;)V
at com.gargoylesoftware.htmlunit.HttpWebConnection.createHttpClient(HttpWebConnection.java:504)
at com.gargoylesoftware.htmlunit.HttpWebConnection.getHttpClient(HttpWebConnection.java:470)
at com.gargoylesoftware.htmlunit.HttpWebConnection.setUseInsecureSSL(HttpWebConnection.java:659)
at com.gargoylesoftware.htmlunit.WebClient.setUseInsecureSSL(WebClient.java:1085)
at pp.main(pp.java:20)
C:\Users\Acer\AppData\Local\NetBeans\Cache\8.1\executor-snippets\run.xml:53: Java returned: 1
BUILD FAILED (total time: 1 second)
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
also the getOptions() method shows error..so i used as

       webClient.setRedirectEnabled(true);
       webClient.setCssEnabled(true);
       webClient.setThrowExceptionOnFailingStatusCode(false);
       webClient.setThrowExceptionOnScriptError(false);
       webClient.setUseInsecureSSL(true);
       webClient.setAjaxController(new NicelyResynchronizingAjaxController());
       webClient.setJavaScriptTimeout(2147483647);
 
Tim Moores
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

java.lang.NoSuchMethodError: org.apache.http.conn.scheme.Scheme.<init>(Ljava/lang/String;ILorg/apache/http/conn/scheme/SchemeSocketFactory;)V


Make sure to compile and run against exactly the libraries that come with HtmlUnit, no others.

the getOptions() method shows error


What errors?
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
get option method not found..
if i run the coding only with htmlunit library...it shows error for other libraries such as htmlclient,common log jar etc..i add some jar files related to htmlunit..for this final i have not found any answer
 
Tim Moores
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

if i run the coding only with htmlunit library


Not only with HtmlUnit - with all jar files it comes with. There's about 20 of them. And be sure to use the same set of jar files for both compilation and runtime.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
let me try again by deleting the existing jar files and  add new set of htmlunit related jar files to the library..
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
This is where something like Maven would be handy for you.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
thanks DAVE and TIM
i am trying...after a try i will say how it works for me
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Nov 04, 2016 10:19:19 AM org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Attempting bi-dialect session, assuming Postel's Law holds true on the remote end
log4j:WARN No appenders could be found for logger (org.apache.http.client.protocol.RequestAddCookies).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.openqa.selenium.WebDriverException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:26345 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
Build info: version: 'unknown', revision: '1969d75', time: '2016-10-18 09:43:45 -0700'
System info: host: 'TVA29', ip: '192.168.1.30', os.name: 'Windows 7', os.arch: 'x86', os.version: '6.1', java.version: '1.8.0_66'
Driver info: driver.version: FirefoxDriver
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:91)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:601)
at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:241)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:128)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:259)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:247)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:242)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:238)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:127)
at tt.main(tt.java:18)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:26345 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:158)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at org.openqa.selenium.remote.internal.ApacheHttpClient.fallBackExecute(ApacheHttpClient.java:142)
at org.openqa.selenium.remote.internal.ApacheHttpClient.execute(ApacheHttpClient.java:88)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:108)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:64)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:141)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:82)
... 9 more
Caused by: java.net.ConnectException: Connection refused: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:74)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
... 24 more
C:\Users\Acer\AppData\Local\NetBeans\Cache\8.1\executor-snippets\run.xml:53: Java returned: 1
BUILD FAILED (total time: 29 seconds)



getting same error as before...
any idea for  initialize the log4j system properly
 
Tim Moores
Saloon Keeper
Posts: 7355
170
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
That output cannot be from the code I posted earlier - it doesn't use any class in a "org.openqa.selenium" package. In fact, HtmlUnit doesn't even come with that package. So you must be running something else.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hai TIM

from the code you suggest me yesterday,giving this output

log4j:WARN No appenders could be found for logger (com.gargoylesoftware.htmlunit.WebClient).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

title = Lazy Loading Website Templates | TemplateMonster


length = 9524
BUILD SUCCESSFUL (total time: 1 minute 0 seconds)
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
thanks for the help...but i want to open a webpage and automatically scroll to bottom of the page
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
i get the text from the webpage but only few text..not scrolling automatically and din't get the text till the bottom of the page
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Is there a "bottom of the page"?
At least, before you've drained their database of articles.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

surya preethaaa wrote:i get the text from the webpage but only few text..not scrolling automatically and din't get the text till the bottom of the page



I mean the text after loading in the webpage...for example if we consider facebook newfeeds there will be the loading process after a few post...by this coding I am getting content above that loading and I can't able to get the content after loading
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Ah sorry, I see what you mean now.
I misread that post.

As I suggested before, if it was me I would look into the call made by the page to retrieve more data rather than emulating a browser.
I have no idea, for example, if selenium or HtmlUnit or any of those frameworks will successfully emulate the functionality.
 
surya preethaaa
Ranch Hand
Posts: 146
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
i already used selenium and htmlunit that extracts only limited contents...
any idea to know to call made by the webpage?
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
You need to find out what call it makes.
For that you need to use the developer tools available i most browsers and track the network traffic when you scroll to the bottom.
 
pie. tiny ad:
Building a Better World in your Backyard by Paul Wheaton and Shawn Klassen-Koop
https://coderanch.com/wiki/718759/books/Building-World-Backyard-Paul-Wheaton
reply
    Bookmark Topic Watch Topic
  • New Topic