read text file and perform an operation on specific columns

Maria Filonczuk
Greenhorn

Joined: Dec 08, 2011
Posts: 6
I am somewhat of a Java newbie (I learned it a while back and am just getting back into programming). I have a text file of data much larger than this, but here are the first ten rows and first ten columns.



I can read and write all the data back out to files. But I am not sure about a few things.

First, would it be best to use BufferedReader and OutputStream, such as:


Second, I was then going to initialize an array, but I might have 30 or more columns of data, of which I might operate on and output only 10 columns to my output.txt file. Is this a good way?

Third, I only want to perform operations (such as averaging the data, converting units or multiplying two columns together) on 10 of the 30-plus columns and print the results to the output.txt file. For example, in the testdata.txt file above, I want to skip the first seven columns. I then want to take column 8, multiply it by 100, then divide by 3, and output the result to the output file. I want to do the same for column 9. Column 10 I want to multiply by a different number (3.5).

thank you for the help!
Maria




Mohamed Sanaulla
Saloon Keeper

Joined: Sep 08, 2007
Posts: 3071
    

Hi Maria, Welcome to Coderanch!

Have you tried some initial code? We would like to see your attempt and help you out.

I would suggest using a List to store the data read from the file. Because a List can grow dynamically, you need not worry about giving it an initial size. Retrieval by index also lets you get the 8th element, the 10th element and so on.
Do these numerical values have some significance? For example, column 1 stands for X data and column 2 stands for Y data. If so, you can create a class to represent this information.


This way you can create a generic List which would contain objects of type ClassName only (or which can be assigned to a reference of type ClassName).
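A minimal sketch of that idea, assuming a hypothetical class called SensorReading with two placeholder fields x and y (the thread never names the real columns, so these names are illustrative only):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: SensorReading, x and y are placeholder names,
// not taken from the original data file.
final class SensorReading {
    private final double x;
    private final double y;

    SensorReading(double x, double y) {
        this.x = x;
        this.y = y;
    }

    double getX() { return x; }
    double getY() { return y; }
}

final class ReadingListDemo {
    // Builds a generic list that can only hold SensorReading objects.
    static List<SensorReading> sample() {
        List<SensorReading> readings = new ArrayList<SensorReading>();
        readings.add(new SensorReading(1.0, 2.0));
        readings.add(new SensorReading(3.0, 4.0));
        return readings; // the list grew as needed; no initial size was given
    }
}
```

Calling ReadingListDemo.sample().get(1).getX() retrieves a field of the second element by index.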


Mohamed Sanaulla
Saloon Keeper

Joined: Sep 08, 2007
Posts: 3071
    

And in future, please UseCodeTags to post your source code. This time I have done it for you.
Maria Filonczuk
Greenhorn

Joined: Dec 08, 2011
Posts: 6
Mohamed, Thank you for prompt response. Sorry I didn't read the rules so well. I'm in a hurry to figure this out which is not good. I will try to figure out how to use CodeTags in a few minutes and send some code I am working on.

I am guessing by List you mean to use ArrayList?

I will make an attempt to tidy up some code I wrote and post it here tonight. Yes, the numerical values have some significance. For example, the data in columns 8 and 9 are actually latitude and longitude in radians, and I will convert them to degrees. Column 10 is height in meters, and I need to convert it to altitude in feet. I do know those conversions, although in the initial post I just put some simple numbers in. I am eventually going to need to average the data, say every 10 lines, and then print it to the output file, but I am trying to do these initial steps first. It will all be double data, I believe, except that I have to add an identifier string to the first and second column of each row so the data can be read by a weather model.

I hope you are around to help me some more tonight. thank you!!
Mohamed Sanaulla
Saloon Keeper

Joined: Sep 08, 2007
Posts: 3071
    

Maria Filonczuk wrote:

I am guessing by List you mean to use ArrayList?

Yes, List is an interface and ArrayList is one of its implementations.
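A small sketch of what that distinction buys you. The class and method names here (ListInterfaceDemo, sum, demo) are made up for illustration; only List, ArrayList and LinkedList come from the standard library.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class ListInterfaceDemo {
    // Depends only on the List interface, so any implementation can be passed in.
    static double sum(List<Double> values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    static double demo() {
        List<Double> a = new ArrayList<Double>();   // array-backed implementation
        a.add(1.5);
        a.add(2.5);
        List<Double> b = new LinkedList<Double>(a); // different implementation, same contents
        return sum(a) + sum(b);
    }
}
```

Declaring variables and parameters as List rather than ArrayList keeps the rest of the code unchanged if the implementation is swapped later.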

Maria Filonczuk wrote:

I hope you are around to help me some more tonight. thank you!!

Sure, I will if I am around; otherwise there are other members who would surely help.
Maria Filonczuk
Greenhorn

Joined: Dec 08, 2011
Posts: 6
ok I put here the start of some of my code. Any help would be greatly appreciated!

Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39478
    
If those data in the file represent something, you should be creating an object representing that something. You are not interested in the individual pieces of text. You can create an object by passing the whole line to a constructor, and splitting the String with the split method (try "\\s+" as a parameter). Or you can use a Scanner (if your file is text) and read nextInt, nextDouble, next, etc., passing those values as constructor arguments. In both cases note your constructor will have to be tightly coupled to the format of your text file.
That way, you can get all the code out of the main method.
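Both approaches can be sketched roughly as follows. The Measurement class and its three fields are hypothetical (a label followed by two numbers per line); the real class would mirror whatever the file's columns actually mean.

```java
import java.util.Scanner;

// Hypothetical record type: a text label followed by two numbers on each line.
final class Measurement {
    final String label;
    final double first;
    final double second;

    // Option 1: pass the whole line in and split it on runs of whitespace.
    Measurement(String line) {
        String[] parts = line.trim().split("\\s+"); // trim() avoids an empty first token
        this.label = parts[0];
        this.first = Double.parseDouble(parts[1]);
        this.second = Double.parseDouble(parts[2]);
    }

    // Plain constructor used by the Scanner-based factory below.
    Measurement(String label, double first, double second) {
        this.label = label;
        this.first = first;
        this.second = second;
    }

    // Option 2: let a Scanner hand over typed tokens one at a time.
    // nextDouble() is locale-sensitive, so the caller is assumed to have
    // configured the Scanner (for example with useLocale(Locale.US)).
    static Measurement fromScanner(Scanner in) {
        String label = in.next();
        double first = in.nextDouble();
        double second = in.nextDouble();
        return new Measurement(label, first, second);
    }
}
```

Either way, the constructor is tied to the column order of the file, which is the coupling mentioned above.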
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8059
    

Maria Filonczuk wrote: Yes, the numerical values have some significance. For example, the data in columns 8 and 9 are actually latitude and longitude in radians, and I will convert them to degrees. Column 10 is height in meters, and I need to convert it to altitude in feet. I do know those conversions, although in the initial post I just put some simple numbers in.
Yes, but what Campbell is trying to find out is: does all the data in a line represent something? For example, is it something like METAR data?
It could even be more than one "something", but knowing what it is will help you to create a class (or classes) for it - and if it is an established standard like METAR, there may well be code out there already written to process it.

I am eventually going to need to average the data say every 10 lines and then print in the output file, but I am trying to do these initial steps first.
A very good idea. Don't get distracted by other stuff before you've got the input working.

It will be all double data I believe except I have to add an identifier string to the first and second column of each row so the data can be read by a weather model.
Ah, so where do you get that from, or is it a constant?
Also, make absolutely sure that your data should be in floating-point form. Doubles cannot hold certain values (e.g., 0.1) exactly, although they may well be "close enough" for your purposes. This page may help you decide.
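A short illustration of that point, using only standard library classes; the class and method names (FloatingPointDemo and so on) are invented for the example.

```java
import java.math.BigDecimal;

final class FloatingPointDemo {
    // In binary floating point, 0.1 + 0.2 is not exactly equal to 0.3.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings keeps the decimal digits exactly.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }
}
```

Here doubleSumIsExact() returns false and decimalSumIsExact() returns true. For sensor data that is averaged and rounded on output, the small error in double is often acceptable, but that is a judgment to make for the data at hand.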

Winston


Maria Filonczuk
Greenhorn

Joined: Dec 08, 2011
Posts: 6
Thank you for the help with my problem. No, the data is not a line of METAR code. It is numerical data output by various sensors. I don't need some of the output at this time, so I want just the 10 columns of data that I need, and I want to perform an operation on each before writing it to the file.

Thank you for the info on floating point versus decimal. I will look into that more. The input data is precise, some of it past 6 or 7 decimal places, which is why I didn't think to use float. But once it is averaged and the unit conversion is applied, it is rounded in the output file.

I have looked at parsing the data as a String with the split method. I don't know how to call just the columns I need. I may have 30 or more columns of data but only want to call 10 of those and perform an operation on them. I do understand the \\s+, and that is what I will need to output the data, as I want columns with one blank space between each, so thank you for pointing me in that direction.

I will try and post some more code I am working on this morning. Thank you for all the assistance!
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8059
    

Maria Filonczuk wrote:I have looked at parsing the data by String with the split method. I don't know how to call just the columns I need. I may have 30 or more columns of data but only want to call 10 of those and perform an operation on.

It'll be something like this:

but Campbell's quite right, you'd be much better off defining a class for those fields, even if they are only a subset.
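A minimal sketch of picking specific columns out of a split line, not necessarily the code originally posted here. It uses the placeholder arithmetic from the first post (columns 8 and 9 times 100 divided by 3, column 10 times 3.5); the class and method names are hypothetical, and every line is assumed to have at least 10 whitespace-separated numeric columns.

```java
final class ColumnPicker {
    // Columns are counted from 1 in the posts, but array indices start at 0,
    // so column 8 is cols[7], column 9 is cols[8] and column 10 is cols[9].
    static String convert(String line) {
        String[] cols = line.trim().split("\\s+");
        double c8 = Double.parseDouble(cols[7]) * 100 / 3;
        double c9 = Double.parseDouble(cols[8]) * 100 / 3;
        double c10 = Double.parseDouble(cols[9]) * 3.5;
        // One blank space between output columns.
        return c8 + " " + c9 + " " + c10;
    }
}
```

Note that "\\s+" is only used to split the input; the single space in the output comes from the string concatenation on the last line.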

Winston
Maria Filonczuk
Greenhorn

Joined: Dec 08, 2011
Posts: 6
Thank you Winston. I understand what you just showed me. My file is just text with data in it. Can you or Campbell point me in the right direction on reading in just certain columns to pass as constructor arguments? I think I might understand that I will need subclasses for each data operation and then will call them to output result in new output.txt file. But I am not sure where to start on reading the data in now. Thank you!
Mohamed Sanaulla
Saloon Keeper

Joined: Sep 08, 2007
Posts: 3071
    

I think Winston pointed out in a previous post how to go about reading the input file.

Since these data/columns are related information, they can be collected into a class definition.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8059
    

Maria Filonczuk wrote:Thank you Winston. I understand what you just showed me. My file is just text with data in it.

Yes, but that data presumably has some meaning. In fact, I'm going to take a stab and say that it refers to some sort of location, so you might do something like:

and then in your main code you might do something like

at the end of which you'll have a List of Locations that you can do whatever you like with.
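One way such a Location class and read loop might look, as a sketch under assumptions rather than the code originally posted. It assumes columns 8 and 9 hold latitude and longitude in radians and column 10 holds height in meters, as described earlier in the thread; the field names, the getters and the readAll method are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class Location {
    // Conversion constant kept here for ease of reading (1 foot = 0.3048 m).
    private static final double METERS_PER_FOOT = 0.3048;

    private final double latitudeDegrees;
    private final double longitudeDegrees;
    private final double altitudeFeet;

    // Parses one line of the file; assumes at least 10 whitespace-separated columns.
    Location(String line) {
        String[] cols = line.trim().split("\\s+");
        latitudeDegrees = Math.toDegrees(Double.parseDouble(cols[7]));  // column 8
        longitudeDegrees = Math.toDegrees(Double.parseDouble(cols[8])); // column 9
        altitudeFeet = Double.parseDouble(cols[9]) / METERS_PER_FOOT;   // column 10
    }

    double getLatitudeDegrees() { return latitudeDegrees; }
    double getLongitudeDegrees() { return longitudeDegrees; }
    double getAltitudeFeet() { return altitudeFeet; }

    // Reads every non-blank line from the reader into a List of Locations.
    static List<Location> readAll(BufferedReader in) throws IOException {
        List<Location> result = new ArrayList<Location>();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            result.add(new Location(line));
        }
        return result;
    }
}
```

In main, the reader could be created with new BufferedReader(new FileReader("testdata.txt")) inside a try-with-resources block, which keeps main down to opening the file, calling readAll and writing the results.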

Winston

PS: The conversion constants are just there for ease of reading. It might, in fact, be better to define them in a separate 'Conversions' class or enum.
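As a rough sketch of that PS, the constants could live in an enum along these lines. The name Conversion, its two constants and the apply method are assumptions made for illustration.

```java
// Hypothetical enum that gathers unit-conversion factors in one place.
enum Conversion {
    RADIANS_TO_DEGREES(180.0 / Math.PI),
    METERS_TO_FEET(1.0 / 0.3048); // 1 foot is defined as exactly 0.3048 m

    private final double factor;

    Conversion(double factor) {
        this.factor = factor;
    }

    // Multiplies the value by this conversion's factor.
    double apply(double value) {
        return value * factor;
    }
}
```

A caller would then write, for example, Conversion.METERS_TO_FEET.apply(heightInMeters), which keeps the numbers out of the class that models the data.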
Maria Filonczuk
Greenhorn

Joined: Dec 08, 2011
Posts: 6
You guys have been a great help and I think I am finally getting a clue. I have another newbie question, to do with formatting the code. I have played with several code examples to get a feel for running short jobs here and there successfully. I have always put the main method first. But as I see in your code, and have seen on other projects, some put it at the end. Just wondering what the value in that is, or is it more necessary when you have several classes?

Second, the error "non-static variable this cannot be referenced from a static context" is throwing me for a loop. I know it has to do with declaring my variables correctly in my classes. And this probably goes hand in hand with first question above. I think I am not quite clear on how the code is called in a program with more than two layers of info.

Thank you again for saving me today! I am actually enjoying learning this even though not my field of expertise!
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8059
    

Maria Filonczuk wrote:I have always put the main method first. But as I see in your code below and have seen on other projects, some put it at the end. Just wondering what the value in that is, or is it more necessary when you have several classes?

There's no hard and fast rule, but in general methods are usually defined after fields, and it's quite common to put static methods last of all (and oddly, static fields first).

Second, the error "non-static variable this cannot be referenced from a static context" is throwing me for a loop. I know it has to do with declaring my variables correctly in my classes. And this probably goes hand in hand with first question above.

I suspect you're right, and it also suggests to me that you have too much code in your main() method.
1. main() should be only used to launch your application; most (all?) of the application logic should be contained in its classes.
2. You don't need to put main() methods in every class that you write. In fact, an application only needs one class to have it; and I usually write that one at the very end.
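A small sketch of both points, with hypothetical names (Launcher, Scaler). One common way to get that exact error message is creating a non-static inner class from inside main; whether that is what happened in your code is a guess.

```java
final class Launcher {
    // Declared 'static' so it can be created without an enclosing Launcher object.
    // Without 'static', 'new Scaler(3.5)' inside main would be rejected by the
    // compiler, because main is static and has no 'this' to attach the inner object to.
    static final class Scaler {
        private final double factor;

        Scaler(double factor) {
            this.factor = factor;
        }

        double scale(double value) {
            return value * factor;
        }
    }

    // Declared last; it only creates an object and hands the work to it.
    public static void main(String[] args) {
        Scaler scaler = new Scaler(3.5);
        System.out.println(scaler.scale(2.0)); // prints 7.0
    }
}
```

The same rule applies to fields and methods: anything not marked static belongs to an object, so main has to create an object first and call through it.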

Winston
 
 