
writing delimiters with csv files

 
Sunetra Sen
Ranch Hand
Posts: 43
Hi,
I have to write a csv file for which the delimiter is to be picked up from a config file.
Now I am facing a problem if the delimiter is a special character like \t.
Because the config file is read and its values are returned as Strings, the \t is written out literally instead of being converted to a tab character.
Please let me know if any of you have an idea on how to tackle this situation.

Thanks,
Sunetra.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Interesting. I think you'll have to test for the "slash followed by another character" delimiter and convert it. You might make a map of supported characters, or try

dlm = " ".replace(" ", "/" + dlm );

The hard-coded "/" plus the one in the config file makes "//t" which regex will recognize as tab. Maybe. Try it and let me know. If it works it should accept anything that regex can recognize with a leading /
 
Ryan McGuire
Ranch Hand
Posts: 1048
Given exactly the constraints you've given, I think the easiest way would be to convert the delimiter as read from the config file to the one you need to use. You might do this with a series of replaceAll() calls on the string you read in, or you might set up a HashMap or whatever if you have to do that conversion many times.

Of course, if "\t" is the only escaped delimiter you'll ever see and all others will be single characters, then you only have to do one conversion.
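
The lookup-table idea above could be sketched roughly as follows. The class name DelimiterLookup, the decode() method, and the particular set of supported escapes are assumptions made up for illustration; they are not from the original posts.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps the escape text found in the config file
// to the delimiter that should actually be written to the CSV file.
class DelimiterLookup {
    private static final Map<String, String> ESCAPES = new HashMap<>();
    static {
        ESCAPES.put("\\t", "\t");  // backslash + 't' -> TAB
        ESCAPES.put("\\f", "\f");  // backslash + 'f' -> FORMFEED
        ESCAPES.put("\\\\", "\\"); // two backslashes -> one backslash
    }

    // Returns the decoded delimiter, or the input unchanged when it is
    // not one of the supported escapes (for example "," or ";").
    static String decode(String raw) {
        String decoded = ESCAPES.get(raw);
        return decoded != null ? decoded : raw;
    }
}
```

Only the escapes listed in the map are converted, so any new escaped delimiter has to be added to the table by hand.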



Do you have control of the config file? If so, you might encode the delimiter differently.
  • If the file is XML, you could use CDATA or a single encoded char for the delimiter: <delimiter>&#9;</delimiter>
  • Could you put the ascii code for the delimiter in the file instead? delimiter=09
  • You could encode the delimiter using some scheme for which a decoder already exists. For instance, URLDecoder.decode() will take %09 and convert it to a TAB. (%09 is ok to have as an XML value without escaping any of the chars.)
  • If the file format is readable by a Properties object, then just use Properties.load(). That automatically does \t conversion.


But I don't know of anything built into the base classes that will convert \t to a TAB, \f to a FORMFEED, etc. in a String that's already in memory without enumerating all the ones you might see in your application.
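
A few of the config-side options listed above might look like this in code. This is only a sketch: the class and method names are invented for illustration, the property key "delimiter" is taken from the example above, and the character code is assumed to be decimal.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Properties;

// Hypothetical helper showing three ways to store the delimiter in a config file.
class ConfigDelimiterOptions {
    // Option: decimal character code in the file, e.g. delimiter=09
    static String fromCharCode(String code) {
        return String.valueOf((char) Integer.parseInt(code));
    }

    // Option: URL-encoded value in the file, e.g. delimiter=%09
    static String fromUrlEncoded(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Option: let Properties.load() interpret the backslash-t escape itself.
    // fileContents stands in for the text of a .properties file.
    static String fromProperties(String fileContents) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("delimiter");
    }
}
```

With a real file, a FileReader or FileInputStream would take the place of the StringReader used here.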

    Stan:
    dlm = " ".replace(" ", "/" + dlm );

    I think you meant replaceAll() or replaceFirst(). Also, I think you were thinking of a '\' instead of a '/'.

    Stan:
    The hard-coded "/" plus the one in the config file makes "//t" which regex will recognize as tab. Maybe. Try it and let me know. If it works it should accept anything that regex can recognize with a leading /


    I'm not positive it won't work, but I don't think String addition really works like that. If you change your source code to...
    dlm = " ".replace(" ", "\" + dlm);
    ...the compiler would complain that you have an unterminated String literal, because it would read the \" sequence as an escaped quote and think you wanted your string to start quote-space-plus-space-dee-ell-emm-paren-semi. If you change it to...
    dlm = " ".replace(" ", "\\" + dlm);
    ...when dlm was read in as backslash-tee, you'll have dlm equal to backslash-backslash-tee. OK, but then what would you do with it to replace that with a TAB? One of us must be missing something.

    Ryan
    [ May 20, 2005: Message edited by: Ryan McGuire ]
     
    Stan James
    (instanceof Sidekick)
    Ranch Hand
    Posts: 8791
    Yeah, I was out to lunch on that one. You're right about doubling up the BACK slashes.
     
    Alan Moore
    Ranch Hand
    Posts: 262
    If you want to use replaceAll() to replace a backslash followed by a 't' with a tab character, you have to use four backslashes in the first argument. Also, as of JDK 1.5, String has a replace() method that takes two CharSequences, in addition to the one that takes two char arguments. If you can use that method, it will cut down on the backslash madness.
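
The code snippets from this post did not survive in the page text. The two calls it describes would probably look something like the sketch below; the class and method names are made up for illustration, and dlm is the variable name used earlier in the thread.

```java
// Hypothetical wrapper around the two String methods mentioned above.
class BackslashTee {
    // Regex route: four backslashes in the literal, because the first
    // argument of replaceAll() is a regular expression.
    static String viaReplaceAll(String dlm) {
        return dlm.replaceAll("\\\\t", "\t");
    }

    // Literal route (JDK 1.5+): replace(CharSequence, CharSequence) does
    // no regex processing, so two backslashes in the literal are enough.
    static String viaReplace(String dlm) {
        return dlm.replace("\\t", "\t");
    }
}
```

Both methods leave a delimiter such as "," untouched, since it contains no backslash-tee to replace.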
     
    Sunetra Sen
    Ranch Hand
    Posts: 43
    Thanks a lot Alan!!
    It works now.
    However, I am not sure what difference the four backslashes make.
    Can you please explain?
    Thanks in advance,
    Sunetra.
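
The thread does not include a reply to this question. As a hedged sketch of the usual explanation: the backslash is an escape character twice over, once for the Java compiler and once for the regex engine, so each layer halves the number of backslashes. The class and member names below are invented for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo of the two escaping layers.
class FourBackslashes {
    // After compilation, the literal "\\\\t" is only three characters:
    // backslash, backslash, 't'.
    static final String REGEX_TEXT = "\\\\t";

    // The regex engine reads backslash-backslash as one literal backslash,
    // so the pattern matches the two-character text backslash + 't'.
    static boolean matchesBackslashTee(String s) {
        return Pattern.matches(REGEX_TEXT, s);
    }

    // With only two backslashes in the literal, the regex engine receives
    // backslash + 't', which it treats as the escape for a TAB character.
    static boolean twoBackslashPatternMatches(String s) {
        return Pattern.matches("\\t", s);
    }
}
```

So the four-backslash pattern finds the text that came out of the config file, while the two-backslash pattern would only find a real tab.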
     