How to compress String?

Chetan Parekh
Ranch Hand

Joined: Sep 16, 2004
Posts: 3636
I have a String variable that contains some data and I want to compress it.

How can I achieve the same?

I don't want to store it in a file; I want to compress it in a variable only.


My blood is tested +ve for Java.
Stuart Ash
Ranch Hand

Joined: Oct 07, 2005
Posts: 637
Originally posted by Chetan Parekh:
I have a String variable that contains some data and I want to compress it.

How can I achieve the same?

I don't want to store it in a file; I want to compress it in a variable only.


Does the String contain some unutterable utterances by yourself?

I guess wrapping one stream in another will do? Something like a ZipOutputStream (or GZIPOutputStream) wrapped around an OutputStream that keeps everything in memory, say a ByteArrayOutputStream. Do the compression, then take the bytes from the ByteArrayOutputStream and store the result in a variable. The compressed result is binary, so it is safer to keep it as a byte[] than to call toString on it.
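A rough sketch of that idea, in case it helps. It assumes Java 8 or later (for java.util.Base64) and uses GZIPOutputStream rather than ZipOutputStream, since there is only one blob and no archive entries are needed. The class and method names are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: compresses a String entirely in memory.
final class StringGzip {

    // Compress the UTF-8 bytes of the text into a byte[].
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The try block has closed the GZIP stream, so the trailer is written.
        return buffer.toByteArray();
    }

    // Reverse of compress(): inflate the bytes and decode them as UTF-8.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Only if the result really must be a String: Base64 keeps the binary
    // data intact, at the cost of roughly one third more characters.
    static String compressToString(String text) {
        return Base64.getEncoder().encodeToString(compress(text));
    }

    static String decompressFromString(String encoded) {
        return decompress(Base64.getDecoder().decode(encoded));
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("id,name,value\n");
        }
        String original = sb.toString();
        byte[] packed = compress(original);
        System.out.println(original.length() + " chars -> " + packed.length + " bytes");
        System.out.println("round trip ok: " + decompress(packed).equals(original));
    }
}
```

For short strings the GZIP header and trailer can make the result larger than the input, so it is worth measuring on real data first.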


ASCII silly question, Get a silly ANSI.
Chetan Parekh
Ranch Hand

Joined: Sep 16, 2004
Posts: 3636
Originally posted by Stuart Ash:
Does the String contain some unutterable utterances by yourself?

Please simplify what you want to convey in the above sentence.
[ December 26, 2005: Message edited by: Chetan Parekh ]
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Remember String is unicode, so it's using 2 bytes of memory for each character. You could just use char[] and save half. At least until you want to turn them back to string for output.

There are some simple techniques that work with strings with certain characteristics. For example I used to work with mainframe records with lots of repeated spaces and zeros and such. I replaced 5-to-99 repeated characters with an escape, a counter and the character so x999999x becomes x!069x. It got dramatic compression with very low overhead but only because the data had extensive repetition.
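The run-length scheme described above could be sketched roughly like this. The '!' escape, the two-digit counter and the 5-to-99 limits come from the description; everything else (class name, splitting longer runs into chunks, rejecting input that already contains the escape character) is an assumption made for the example.

```java
// Hypothetical sketch of the escape + counter + character scheme.
final class RunLength {
    private static final char ESCAPE = '!';
    private static final int MIN_RUN = 5;   // shorter runs are copied as-is
    private static final int MAX_RUN = 99;  // counter is two decimal digits

    static String encode(String input) {
        // Assumption: the data never contains the escape character itself.
        if (input.indexOf(ESCAPE) >= 0) {
            throw new IllegalArgumentException("input must not contain '" + ESCAPE + "'");
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int run = 1;
            while (i + run < input.length()
                    && input.charAt(i + run) == c
                    && run < MAX_RUN) {
                run++;
            }
            if (run >= MIN_RUN) {
                out.append(ESCAPE);
                if (run < 10) {
                    out.append('0'); // pad the counter to two digits
                }
                out.append(run).append(c);
            } else {
                for (int k = 0; k < run; k++) {
                    out.append(c);
                }
            }
            i += run; // runs longer than 99 continue as a new run
        }
        return out.toString();
    }

    static String decode(String encoded) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c == ESCAPE) {
                int run = Integer.parseInt(encoded.substring(i + 1, i + 3));
                char repeated = encoded.charAt(i + 3);
                for (int k = 0; k < run; k++) {
                    out.append(repeated);
                }
                i += 4; // escape, two digits, repeated character
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String packed = encode("x999999x");
        System.out.println(packed);          // x!069x
        System.out.println(decode(packed));  // x999999x
    }
}
```

A run only pays off from five characters up because the encoded form always costs four characters.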

I worked in one language that had six-bit characters. If you can get by with only 64 characters you can compress your char[] to a non-standard byte[] and compress 25%. I think I'm only kidding. Unless it sounds useful to you.


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Scott Selikoff
author
Saloon Keeper

Joined: Oct 23, 2005
Posts: 3716
What's the purpose of the compression? Reducing the size of the string, or encoding it securely? What is the range of elements in the string, such as the character set?


My Blog: Down Home Country Coding with Scott Selikoff
u johansson
Ranch Hand

Joined: Dec 27, 2005
Posts: 47
The String contains some data.

It's so BASIC'ish.

Split it up. Put the data where they belong. Introduce types (classes and interfaces).
u johansson
Ranch Hand

Joined: Dec 27, 2005
Posts: 47
Any compression requires data with similarities. If your data hasn't got any - you cannot compress it.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by u johansson:
Any compression requires data with similarities. If your data hasn't got any - you cannot compress it.


That is not true. Huffman coding, for example, doesn't need any similarities, but compresses texts by encoding more often used characters with shorter bit sequences.
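To make that concrete, here is a minimal sketch of how Huffman codes can be derived from character frequencies. It is only an illustration: the names are invented, the output is a String of '0' and '1' characters rather than packed bits, and a real encoder would also have to store the code table alongside the data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch: builds a Huffman code table for the characters of a text.
final class HuffmanSketch {

    private static final class Node {
        final long weight;
        final Character symbol; // null for internal nodes
        final Node left;
        final Node right;

        Node(long weight, Character symbol, Node left, Node right) {
            this.weight = weight;
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }
    }

    static Map<Character, String> buildCodes(String text) {
        // Count how often each character occurs.
        Map<Character, Long> freq = new HashMap<>();
        for (char c : text.toCharArray()) {
            freq.merge(c, 1L, Long::sum);
        }

        // Repeatedly merge the two least frequent nodes into one subtree.
        PriorityQueue<Node> queue =
            new PriorityQueue<>((a, b) -> Long.compare(a.weight, b.weight));
        for (Map.Entry<Character, Long> e : freq.entrySet()) {
            queue.add(new Node(e.getValue(), e.getKey(), null, null));
        }

        Map<Character, String> codes = new HashMap<>();
        if (queue.isEmpty()) {
            return codes;
        }
        if (queue.size() == 1) {
            // Only one distinct character: give it a one-bit code.
            codes.put(queue.poll().symbol, "0");
            return codes;
        }
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node(a.weight + b.weight, null, a, b));
        }
        assign(queue.poll(), "", codes);
        return codes;
    }

    // Walk the tree: left edges append '0', right edges append '1'.
    private static void assign(Node node, String prefix, Map<Character, String> codes) {
        if (node.symbol != null) {
            codes.put(node.symbol, prefix);
            return;
        }
        assign(node.left, prefix + "0", codes);
        assign(node.right, prefix + "1", codes);
    }

    static String encode(String text, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            bits.append(codes.get(c));
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        String text = "aaaaaaaabbbc";
        Map<Character, String> codes = buildCodes(text);
        System.out.println(codes);
        System.out.println(encode(text, codes).length() + " bits instead of "
            + (text.length() * 8) + " at one byte per character");
    }
}
```

The gain comes purely from the skewed frequencies: the text does not need repeated runs or repeated substrings, only some characters that occur more often than others.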


The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by Stan James:
Remember String is unicode, so it's using 2 bytes of memory for each character. You could just use char[] and save half.


Isn't a char two bytes in Java?
u johansson
Ranch Hand

Joined: Dec 27, 2005
Posts: 47
Sorry, Ilja Preuss, Huffman coding is based on similarities.
Rick O'Shay
Ranch Hand

Joined: Sep 19, 2004
Posts: 531
Originally posted by Chetan Parekh:
I have a String variable that contains some data and I want to compress it.

How can I achieve the same?

I don't want to store it in a file; I want to compress it in a variable only.


More generally, how does one compress data? Let's assume you want to recover the information at some later date. This rules out compression schemes that can squeeze an infinite amount of data into zero bytes. Let's assume you need full recovery. Use the tools in the java.util.zip package.
Ilja Preuss
author
Sheriff

Joined: Jul 11, 2001
Posts: 14112
Originally posted by u johansson:
Sorry, Ilja Preuss, Huffman coding is based on similarities.


Well, then your definition of "similarities" must be quite wide - wide enough that almost any data contains some. Not sure how your argument is going to help, then...
Chetan Parekh
Ranch Hand

Joined: Sep 16, 2004
Posts: 3636
Reason to compress data is:

I have an Applet that gets a CSV value as a String from a Servlet, and I need to compress it.

Is there any other way to pass data in a compressed format between Applet and Servlet?
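One possible shape for this, sketched with plain streams so that it does not depend on the servlet API: the sender wraps its output stream in a GZIPOutputStream and the receiver wraps its input stream in a GZIPInputStream. The class and method names are hypothetical. In a real setup the servlet would presumably pass response.getOutputStream() and the applet the stream from URLConnection.getInputStream(); that wiring is an assumption and is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for sending text in compressed form over any stream pair.
final class CompressedTransfer {

    // Sender side (e.g. the servlet): write the CSV through GZIP.
    static void send(String csv, OutputStream rawOut) throws IOException {
        GZIPOutputStream gzip = new GZIPOutputStream(rawOut);
        gzip.write(csv.getBytes(StandardCharsets.UTF_8));
        gzip.finish();   // write remaining data and the trailer, keep rawOut open
        rawOut.flush();
    }

    // Receiver side (e.g. the applet): read everything back and inflate it.
    static String receive(InputStream rawIn) throws IOException {
        GZIPInputStream gzip = new GZIPInputStream(rawIn);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = gzip.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the network connection.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        String csv = "id,name\n1,alpha\n2,beta\n";
        send(csv, wire);
        String back = receive(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println("round trip ok: " + back.equals(csv));
    }
}
```

Both sides have to agree on the character encoding; UTF-8 is used here only as an assumption.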
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
By "compress" we've all meant to reduce a string in size. That doesn't sound like a solid part of your requirements and from that last post I'm not entirely sure that's what you meant.

Compression is sometimes done to reduce the number of bytes sent over the network. I wouldn't do it until somebody can prove I'm sending too many bytes over the network ... stressing some part of the network or experiencing very long response times. I mentioned the simple compression scheme I did for mainframe data. It gave no performance gain and was not put into production. It was great fun, but only proved that data size was NOT the problem.

What does your data look like? Could you show a sample? How big is the string you're dealing with?