Returning A List Of Variables From A Folder Of Documents And Returning Them Into A New Document

Nick Rowe
Ranch Hand

Joined: May 26, 2010
Posts: 88
Hi guys, I'm new to Java, literally brand new, and this is basically the first program I have been asked to develop.
My program captures a value between two strings and stores it as an entry in an array. The array is populated until all documents have been read in line by line. The sorted list is then printed into a new document.
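
For readers following along, the flow described above (read each document line by line, capture the value between two delimiters, sort, write to a new file) might look roughly like the sketch below. This is not the original poster's code, which is not shown in the thread. The class name FolderExtractSketch, the method names and the command-line arguments are all hypothetical, and the sketch assumes UTF-8 text files and captures only the first match on each line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

class FolderExtractSketch {

    /** Returns the trimmed text between beg and end, or null if either delimiter is missing. */
    static String between(String line, String beg, String end) {
        int start = line.indexOf(beg);
        if (start < 0) {
            return null;
        }
        start += beg.length();
        int stop = line.indexOf(end, start);
        return stop < 0 ? null : line.substring(start, stop).trim();
    }

    /** Collects the first match on every line of every regular file in the folder, sorted. */
    static List<String> collect(Path folder, String beg, String end) throws IOException {
        List<String> values = new ArrayList<>();
        try (Stream<Path> files = Files.list(folder)) {
            for (Path file : (Iterable<Path>) files::iterator) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-folders
                }
                for (String line : Files.readAllLines(file)) {
                    String value = between(line, beg, end);
                    if (value != null) {
                        values.add(value);
                    }
                }
            }
        }
        Collections.sort(values);
        return values;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: FolderExtractSketch <inputFolder> <outputFile>");
            return;
        }
        // Delimiters taken from the thread; both are plain text, not regex.
        List<String> values = collect(Paths.get(args[0]), "<@dynamichtml", "@>");
        Files.write(Paths.get(args[1]), values);
    }
}
```

Because indexOf works on plain text, this version would accept "<$" as a delimiter without any escaping.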

This all works fine; however, I have now been asked to develop my program further so that it also extracts Idoc variable names from the documents. So far I have tried simply changing the search criteria to return what I want, but when it comes to writing data to the document there is nothing there. I get no errors in the command prompt window, but I also get no results.

I basically need to return my original values, plus another set of values.

Please can someone help? Thanks so much for your time.
regards S

My code so far is below

akhter wahab
Ranch Hand

Joined: Mar 02, 2009
Posts: 151

Check this; it might work.


Nick Rowe
Ranch Hand

Joined: May 26, 2010
Posts: 88
Hi Akhter, guys

Thanks for the post, but I don't think that's really what I had in mind. The code I posted already works fine at pulling out the references between the values "<@dynamichtml" and "@>". Basically, any string occurring between these two values is captured and stored in my array.

Now I also need to capture the values between "<$" and "=", following the same method.
So my program should bring back two different sets of results.
I'm a little confused as to how I should go about it, as my attempts so far are bringing back blank documents.

In my last effort I simply took my old code and changed the string criteria so that I could see the output in a separate file. But I'm not having any luck for some reason.

Does Java not accept the "$" as part of a string search? I think this could be why I am not returning any results.

regards Nick
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

The dollar sign has a special meaning in regular expressions.
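
To make the distinction concrete, here is a small sketch (the class name and the sample line are made up for illustration). String methods that take plain text treat "$" as an ordinary character, while methods that take a regular expression treat an unescaped "$" as "end of input".

```java
class DollarSearchSketch {
    public static void main(String[] args) {
        String line = "<$docName$>"; // hypothetical sample line

        // indexOf, contains and replace take plain text: "$" is just a character.
        System.out.println(line.indexOf("<$"));          // 0
        System.out.println(line.contains("$>"));         // true
        System.out.println(line.replace("$", "#"));      // <#docName#>

        // replaceAll takes a regex: a bare "$" matches the end of the input,
        // so the "#" is appended instead of replacing the dollar signs.
        System.out.println(line.replaceAll("$", "#"));   // <$docName$>#

        // Escaped, the regex matches the literal character again.
        System.out.println(line.replaceAll("\\$", "#")); // <#docName#>
    }
}
```

Which case applies depends on which methods the extraction code calls, and that code is not visible in the thread.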

Method Javadoc comments should be written in third-person declarative form. For example, you have a comment that reads:
/** method to read html file line by line and return string to tore in array */

First, we know it's a method, so that part is automatically redundant. Third-person declarative would read:
/** Reads HTML file line-by-line and returns string array. */

Problem is, in this case, that's nothing like what it does. You're doing everything in the constructor, which from a technical standpoint is fine, but wholly unexpected. In the constructor, at most, I'd expect to pass in a filename.

It also doesn't return anything (it's a constructor), and there's no array in sight. So the comment is completely misleading.

Better example:
/** method to find the value between the beginning and end of a string

Again, we know it's a method. "Returns the string from resourceline found between "beg" and "end" delimiters." would be much more effective, but it should also include what it does if the delimiters aren't found.

It also doesn't handle cases where either the "beg" or "end" parameters contain regular expression characters as in the case of the $.
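
One way to cover both points, sketched here under the assumption that the method builds a regex from its two delimiters (the original code is not shown, so the class name, signature and Optional return type are illustrative only): Pattern.quote makes every character in a delimiter match literally, and the Javadoc states what happens when nothing is found.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenDelimitersSketch {

    /**
     * Returns the trimmed text from resourceLine found between the first
     * "beg" delimiter and the next "end" delimiter, or an empty Optional
     * if the delimiters are not found in that order.
     */
    static Optional<String> between(String resourceLine, String beg, String end) {
        // Pattern.quote turns "<$" into a pattern that matches those two
        // characters literally, so "$" no longer means "end of input".
        Pattern pattern = Pattern.compile(
                Pattern.quote(beg) + "(.*?)" + Pattern.quote(end));
        Matcher matcher = pattern.matcher(resourceLine);
        return matcher.find()
                ? Optional.of(matcher.group(1).trim())
                : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(between("<$docName = 1$>", "<$", "="));  // Optional[docName]
        System.out.println(between("no delimiters", "<$", "="));    // Optional.empty
    }
}
```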
Nick Rowe
Ranch Hand

Joined: May 26, 2010
Posts: 88
Hi David, apologies about that. I haven't changed the comments since building it up from the original.

I realise the "$" symbol has a special meaning, but is there a way around it? For example, I was told a \ before the $ could enable me to use it within the string. Is this possible?

Also, I am new. Although my account has been on here for a month or two, this is the only subject I've actually posted on. This is a developed version of my original program, that's all.


regards Nick
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 40052
    
I presume you know all about regular expressions? You would have to escape the $ (which, in a regex, means the end of the line rather than the beginning) with \, but inside a Java string literal you have to escape the \ with another \, giving \\. If that doesn't work try \\\\, and if that doesn't work try \\\\\\\\
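
The two levels of escaping can be checked directly. In the sketch below (class and method names are hypothetical), the Java literal "\\$" is the two-character string \$, which the regex engine reads as a literal dollar sign; "\\\\" is the regex for one literal backslash. Pattern.quote is shown as an alternative that avoids counting backslashes.

```java
import java.util.regex.Pattern;

class EscapeLevelsSketch {

    /** True if the regex (given as an ordinary Java string) is found somewhere in the text. */
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // The compiler turns "\\$" into two characters: a backslash and a dollar sign.
        System.out.println("\\$".length());                              // 2

        // The regex engine then reads \$ as "a literal dollar sign".
        System.out.println(found("<\\$", "<$docName$>"));                // true

        // Unescaped, <$ means "a '<' at the very end of the input".
        System.out.println(found("<$", "<$docName$>"));                  // false

        // Four backslashes in source are two in the string: regex for one backslash.
        System.out.println(found("\\\\", "C:\\temp"));                   // true

        // Pattern.quote escapes the whole delimiter for you.
        System.out.println(found(Pattern.quote("<$"), "<$docName$>"));   // true
    }
}
```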
Nick Rowe
Ranch Hand

Joined: May 26, 2010
Posts: 88
Hi Ritchie,

Lol, I've literally just tried the "//" method and that works, returning a lot of values. However, it also returns a lot of values that I don't want, lol.

I'm actually trying to return a list of Idoc variables within the resource folder of a component. So all values are going to be <$value$>. I'm playing around with the search criteria but I'm stuck. If I'm escaping the $ symbol, how am I going to return exactly what I want?

regards S
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

If it's returning matches you don't want then your regex is wrong--but we can't see it, so there's little we can do.

Plus you said "//", which is wrong.
Nick Rowe
Ranch Hand

Joined: May 26, 2010
Posts: 88
Apologies, "\\"

Initially I was asked to return a value between the beginning string "<$" and the end string "=".
I tried this, but the values returned were not only Idoc variables but also other code as well.

All Idoc variables begin with "<$" and end with "$>", so I really only want the data (the Idoc variable name) from between those two markers stored. This should narrow the search down a bit; however, the results returned are still massive.

I can't just read in line by line and store every instance of <$---$> either, as some values and other bits and pieces also start and end like this; for example, see below.

Ideally I would only like the variable to be printed if the variable ID is not already stored within the array.
I'm not too sure how to tackle the rest just yet. It's rather frustrating.
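
A possible direction for both requirements, offered as a sketch rather than a definitive answer: narrow the regex so that it only accepts a tag containing a single bare name, and collect the names in a Set so duplicates are dropped automatically. The rule "a variable is a tag holding nothing but one identifier" is an assumption made for this example, not something confirmed in the thread, and the class and method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdocNameSketch {

    // Assumption: a variable reference looks like <$name$>, optionally with
    // spaces around the name. Tags with anything else inside (spaces between
    // words, "=", operators) are skipped.
    private static final Pattern BARE_NAME =
            Pattern.compile("<\\$\\s*([A-Za-z_][A-Za-z0-9_]*)\\s*\\$>");

    /** Adds every bare name found on the line; a Set ignores names it already holds. */
    static void addNames(String line, Set<String> names) {
        Matcher matcher = BARE_NAME.matcher(line);
        while (matcher.find()) { // find() in a loop picks up several tags per line
            names.add(matcher.group(1));
        }
    }

    public static void main(String[] args) {
        Set<String> names = new TreeSet<>(); // TreeSet also keeps the names sorted
        addNames("<$docName$> text <$ dID $>", names);
        addNames("<$include std_header$>", names); // skipped: two words
        addNames("<$docName$>", names);            // duplicate, ignored
        System.out.println(names);                 // [dID, docName]
    }
}
```

If tags of the form <$name = value$> should count as well, the pattern would need to be widened, which depends on rules the thread does not spell out.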

David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

What are we supposed to do with 3000 lines of whatever that is?!?!?!
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 40052
    
David Newton wrote:What are we supposed to do with 3000 lines of whatever that is?!?!?!
. . . and 3000+ lines which are illegible because they need horizontal scrolling?

Unsuitable for "beginning Java". Moving thread. Not sure where to; let's try JavaScript since some of those 3295 lines mention JavaScript.
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

No no, it's still Java, it's a regex question, with *FAR* too much sample data :(
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 40052
    
Still unsuitable for "beginning", however.
Nick Rowe
Ranch Hand

Joined: May 26, 2010
Posts: 88
I'm not actually sure how to move it.

Just a thought: could I not create an if statement along the following lines?
If the arrayList already contains the value from this instance of resourceline and the value contains a symbol, then call the main method and continue to read the document.

Otherwise, trim the instance and add it to the arrayList. This should remove a lot of values. But how would I go about
this? Is there a way I can use regex to do this? And if so, how?
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 40052
    
Nick Rowe wrote:Im not actually sure how to move it.
You can't move threads. I can, and have moved it. Twice, by the looks of it
Nick Rowe
Ranch Hand

Joined: May 26, 2010
Posts: 88
Oh OK, thanks. As I mentioned, this is only a development of my original program. I haven't really done much else with forums or Java, so I'm kind of teaching myself as I go. Thank you.

Could I use something that references symbols stored within an existing variable to easily remove unwanted instances?
i.e. pseudocode below:

if this instance of resourceline is already contained in arrayList
{
main();
}
else if resourceline contains any of (!, =, %, *, ")
{
main();
}
else
{
resourceline = resourceline.trim();
arrayList.add(resourceline);
}
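
The pseudocode above could be turned into Java along these lines. This is only a sketch with hypothetical names. Calling main() again would start the whole program over, so the sketch skips an unwanted value with continue and moves on to the next one; it also trims before checking, so that " docName " and "docName" count as the same value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FilterSketch {

    // The symbols listed in the pseudocode: ! = % * "
    // Inside a character class none of them needs escaping for the regex.
    private static final Pattern UNWANTED = Pattern.compile("[!=%*\"]");

    /** Returns the trimmed candidates, minus blanks, duplicates and values containing a listed symbol. */
    static List<String> filter(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            String value = candidate.trim();
            if (value.isEmpty() || kept.contains(value)) {
                continue; // nothing to store, or already stored
            }
            if (UNWANTED.matcher(value).find()) {
                continue; // contains one of the unwanted symbols
            }
            kept.add(value);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList(" docName ", "docName", "a=b", "x%y", "dID", "");
        System.out.println(filter(found)); // [docName, dID]
    }
}
```

For large result sets, a Set would make the "already stored" check cheaper than ArrayList.contains.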
 