Reading a multiline as if it were a single line?

 
Ranch Hand
Posts: 74
Is it possible in core Java to read a multiline record as if it were a single line?

Case in point: a CSV that should be like this:



Instead will appear like this:



Yet I need to somehow read this as two lines instead of four.  If possible, how is that done, considering you have embedded line breaks?

All third-party Java class libraries are out of the picture in this case; I can only use the available core Java libraries.

Thanks
 
Saloon Keeper
Posts: 10930
You could write Java that would do it, but it would be brittle, in that a change to your data would probably change how certain lines are split. I don't know if it is possible, but a better tactic would be to tackle the split where it is occurring. What program are you using to generate the CSV, and what program are you using to display it?
 
Carey Brown
Saloon Keeper
Posts: 10930
There are Java libraries that will parse CSV. I've not used one myself, but they may be able to handle this situation.
 
Phillip Powell
Ranch Hand
Posts: 74

Carey Brown wrote: There are Java libraries that will parse CSV. I've not used one myself, but they may be able to handle this situation.



Those Java libraries would have to be part of core Java; otherwise we are forbidden to use them here. All non-core Java libraries are blocked from access.
 
Phillip Powell
Ranch Hand
Posts: 74

Carey Brown wrote: You could write Java that would do it, but it would be brittle, in that a change to your data would probably change how certain lines are split. I don't know if it is possible, but a better tactic would be to tackle the split where it is occurring. What program are you using to generate the CSV, and what program are you using to display it?



Java is used to generate the CSV, and the clients use Excel simply to read it locally. However, they also want one of the CSV files to eventually be read by Java, to extract a value from the line that will be used to program another object to run.
 
Carey Brown
Saloon Keeper
Posts: 10930
If Java is creating the CSV, can you modify it so that it does not replace the comma with a new line?
 
Carey Brown
Saloon Keeper
Posts: 10930
If you want to merge the lines together with core Java:

Assume that all double quotes come in pairs and that each pair should sit on a single line.
If you find a line with an odd number of double quotes, append the next line to it with a joining comma.

This should work, but it depends on how the code creating this file behaves.
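
The quote-counting idea above could be sketched roughly like this, using only JDK classes. The class name, method names and sample rows are made up for illustration, since the real data is not shown in the thread. The sketch also keeps a running quote count, so a field that was split across more than two lines is still merged, which goes slightly beyond the two-line case described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: merges physical lines into logical CSV records.
final class QuoteLineMerger {

    // Counts the double-quote characters in one physical line.
    static int countQuotes(String line) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '"') {
                count++;
            }
        }
        return count;
    }

    // A record stays "open" while it holds an odd number of double quotes;
    // following lines are appended with the given joiner until it balances.
    static List<String> mergeLines(List<String> physicalLines, String joiner) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        int quotes = 0;
        for (String line : physicalLines) {
            if (current == null) {
                current = new StringBuilder(line);
                quotes = countQuotes(line);
            } else {
                current.append(joiner).append(line);
                quotes += countQuotes(line);
            }
            if (quotes % 2 == 0) {
                records.add(current.toString());
                current = null;
            }
        }
        if (current != null) {
            // Unbalanced quotes at end of input: keep what we have.
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        // Invented sample: four physical lines that should be two records.
        List<String> lines = Arrays.asList(
                "1,\"Smith", "John\",NY",
                "2,\"Doe", "Jane\",CA");
        for (String record : mergeLines(lines, ",")) {
            System.out.println(record);
        }
    }
}
```

With the invented sample, the four input lines come out as the two records `1,"Smith,John",NY` and `2,"Doe,Jane",CA`. Pass `" "` as the joiner if the line break should become a space instead of a comma.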
 
Sheriff
Posts: 28329

Carey Brown wrote: There are Java libraries that will parse CSV. I've not used one myself, but they may be able to handle this situation.



But I expect they will parse

into

i.e. they will preserve the line-break character. That's one of the useful features of those libraries. But I believe that Phillip wants his code to remove the line-break character. There may also be a requirement to replace it by a space character, at least that's what the examples given suggest. We don't know what the requirements are for quote marks which are inside a quoted string but from the examples given it looks like we can assume that doesn't happen.

 
Paul Clapham
Sheriff
Posts: 28329
The code is easy enough to write. Read the file one character at a time and use a flag to tell whether you're inside a quote. Then for each character: if it's not quote or end-line, write it to the output. If it's end-line, write end-line or space depending on whether you're inside a quote. If it's quote, write it and flip the flag.
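
A minimal sketch of that character-by-character filter, under a few assumptions that are not confirmed in the thread: line endings are `\n` or `\r\n`, an embedded line break should become a single space, and quote marks never appear inside a quoted string. The class and method names are invented for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical filter: replaces line breaks inside quotes with a space.
final class EmbeddedNewlineFilter {

    static void flatten(Reader in, Writer out) throws IOException {
        boolean insideQuotes = false; // the flag described above
        int c;
        while ((c = in.read()) != -1) {
            if (c == '"') {
                insideQuotes = !insideQuotes; // flip the flag, keep the quote
                out.write(c);
            } else if (c == '\n') {
                // Inside quotes the break is data; outside it ends a record.
                out.write(insideQuotes ? ' ' : '\n');
            } else if (c == '\r') {
                // Assumption: CR only occurs as part of CRLF. Drop it inside
                // quotes so CRLF turns into one space, keep it otherwise.
                if (!insideQuotes) {
                    out.write(c);
                }
            } else {
                out.write(c);
            }
        }
    }

    // Convenience overload for in-memory text.
    static String flatten(String csv) {
        try {
            StringWriter out = new StringWriter();
            flatten(new StringReader(csv), out);
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Invented sample: one quoted field contains a line break.
        System.out.print(flatten("1,\"Smith\nJohn\",NY\n2,\"Doe\nJane\",CA\n"));
    }
}
```

For a real file, the `Reader` would normally be a buffered one, for example from `Files.newBufferedReader(path)`, so that reading one character at a time stays cheap.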
 
Phillip Powell
Ranch Hand
Posts: 74

Carey Brown wrote: If Java is creating the CSV, can you modify it so that it does not replace the comma with a new line?



That is possible, but the client wants the CSV to have new lines instead of commas as soon as it is generated, and they also want to be able to use Java to read one of the CSVs configured with new lines instead of commas.
 
Phillip Powell
Ranch Hand
Posts: 74

Paul Clapham wrote:The code is easy enough to write. Read the file one character at a time and use a flag to tell whether you're inside a quote. Then for each character: if it's not quote or end-line, write it to the output. If it's end-line, write end-line or space depending on whether you're inside a quote. If it's quote, write it and flip the flag.



I will have to research that one. I tried something similar, but it failed completely because of how the CSV is configured.
 
Master Rancher
Posts: 5060
The no-third-party-libraries requirement sounds like maybe this is for a course of some sort. Is that the case?

Elaborating on Paul C's suggestions: you may want to use an enum rather than a boolean if there are more than two states of interest. But so far there are two: you're either inside double quotes, or you're not. Imagine looping through the whole file one character at a time and deciding what to do for each character you read. You might have a StringBuilder to hold what you have read of the current field. And maybe you want to build a List<List<String>> of the results parsed so far, where the outer List holds the records and each record is a List<String> of its fields. In your example, there's typically one record for every two lines, thanks to the newlines used as separators between names.

When you encounter a comma outside of quotes, you take whatever is in your StringBuilder as the content of the current field, add that to the List<String> that represents the current record, then reset the StringBuilder for the next field (i.e. with setLength(0), or just replace it with a new StringBuilder). When you encounter a newline outside of double quotes, you finish the current field just as if you had encountered a comma, but then also add the current List<String> record to the List<List<String>> that holds all the records, and start a new ArrayList for the new current record.
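
Here is one way the parser described above might look, as a sketch rather than a finished implementation. The class name and sample text are invented. It assumes quote marks only delimit fields (no escaped quotes inside a quoted string), drops the quote marks from the stored values, and keeps any line break found inside quotes as part of the field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser: one pass over the text, one boolean of state.
final class SimpleCsvParser {

    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean insideQuotes = false;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                insideQuotes = !insideQuotes; // delimiter only, not stored
            } else if (c == ',' && !insideQuotes) {
                record.add(field.toString()); // end of field
                field.setLength(0);           // reset the StringBuilder
            } else if (c == '\n' && !insideQuotes) {
                record.add(field.toString()); // end of field and of record
                field.setLength(0);
                records.add(record);
                record = new ArrayList<>();
            } else if (c == '\r' && !insideQuotes) {
                // Skip the CR of a CRLF line ending outside quotes.
            } else {
                field.append(c); // ordinary data, including quoted newlines
            }
        }
        // Flush the last record when the text has no trailing newline.
        if (field.length() > 0 || !record.isEmpty()) {
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Invented sample: two records spread over four physical lines.
        String text = "1,\"Smith\nJohn\",NY\n2,\"Doe\nJane\",CA\n";
        for (List<String> rec : parse(text)) {
            System.out.println(rec.size() + " fields, name = "
                    + rec.get(1).replace('\n', ' '));
        }
    }
}
```

If more states turn up later (for example, handling a doubled quote inside a quoted field), the boolean could be swapped for an enum as suggested above without changing the overall loop.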
 
Phillip Powell
Ranch Hand
Posts: 74
It's not a course but a company requirement.
 
Carey Brown
Saloon Keeper
Posts: 10930
 
Saloon Keeper
Posts: 28321
If by "core Java" you mean that you are limited to only the classes that come with the JDK, your employer is doing themselves a serious disservice, not to mention locking themselves out of any meaningful Internet development or, for that matter, the JDBC database drivers.

I started out long ago when we had to write everything in assembly language for the mainframe. It was an excruciatingly slow process, but we had no choice, since none of the higher-level programming languages such as COBOL or Fortran could be used at the system level where we were working. I suppose in that sense, we were "more core than core". My boss hated it, we hated it. We had to create every single sort, search and what-have-you algorithm from scratch, and the only thing that justified it was that when a mainframe costs $1 million (in last-century dollars), having the most expensive talent in the shop constantly re-inventing the wheel was justifiable economically, even if not in terms of productivity.

Java's true utility comes from its trusted extensions. Java EE, for example, isn't "core" Java and never was, even when Sun owned it. It now lives at the Eclipse Foundation as Jakarta EE. Then there is the Apache Foundation, which brings you such venerable projects as the Tomcat webapp server and innumerable support libraries, including the Apache Commons CSV parser, and much more. Many of those libraries, in fact, form components of other Apache projects. For example, Tomcat is built using the Apache Digester, Commons pooling, and probably a lot of the JNDI support. Big-name commercial products like WebLogic Server and Jira, even though proprietary, are laced with "non-core" products like the ANTLR parser.

The alternative to using ready-made is to re-invent it all yourself. Which requires more manpower, more time to develop, maintain, and support. All of which suck resources out of your core business projects.

While it's certainly not safe to just pick up libraries off the street, there are certain resources that have major weight behind them: sources like the Apache Foundation, IBM/Red Hat (Hibernate) and others. These projects have serious support behind them and can be readily obtained from secure sources like the Maven Repository.

There's being prudent and then there's being penny-wise and pound-foolish. While it would be wise to keep a review team meeting occasionally to vet the safety and in some cases licensing issues before approving library use, throwing out the baby, bathwater and tub is a good way to end up at the mercy of the competition at best.

Having said all that, I'm reminded of a major corporation that provides downloadable financial information in CSV form. Their algorithm is an abomination, and having downloaded the file, the first thing I have to do before feeding it to a spreadsheet or database is run it through a custom cleanup program to repair all the misuses of quotes and commas in what is supposed to be a simple data file format. The data at the head of this thread is a lot like that.
 
Phillip Powell
Ranch Hand
Posts: 74

Paul Clapham wrote:The code is easy enough to write. Read the file one character at a time and use a flag to tell whether you're inside a quote. Then for each character: if it's not quote or end-line, write it to the output. If it's end-line, write end-line or space depending on whether you're inside a quote. If it's quote, write it and flip the flag.



That's what I wound up doing, and I got it to work with further guidance from this page.

Thanks for the help!

 
Carey Brown
Saloon Keeper
Posts: 10930
You have an error on line 13; it should be 'true'.

 
Phillip Powell
Ranch Hand
Posts: 74
Line 11 was a typo; it should have been

 
Paul Clapham
Sheriff
Posts: 28329

Phillip Powell wrote:It's not a course but a company requirement


Those packages which you can download to process CSV and so on are actually all written in core Java. (Whatever that means, Java is all core Java to me.) I'm guessing that your company requirement is "No code written by other people", is that right?
 
Tim Holloway
Saloon Keeper
Posts: 28321
You can call me a cynic, but I'd wager that a primary reason that your incoming data is such a mess is that it was produced using only "core" Java instead of tested publicly-available Java resources.
 
Mike Simmons
Master Rancher
Posts: 5060
I agree with the cynic here, excellent point.
 
Ranch Hand
Posts: 79
It is possible to use a shell script to insert an empty line (or any other unique separator) between the records. Reading such a file would then be easier.
Unless that is forbidden too.
 