tony roberts wrote:
Here is what I'm looking at. Essentially I need to know how to do this in Java.
This is the C code:
Now I know Java doesn't have unsigned primitives, so I'll have to bit shift or somehow convert the byte data into something that Java can deal with, but I can't for the life of me figure it out.
So any advice would be super helpful, and it might just stop me from tearing my hair out.
Thanks in advance
Tony
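On the unsigned question, a minimal sketch of the usual workaround: mask each byte with 0xFF so it widens to an int in the range 0..255. The class and method names are made up for illustration, and the low-byte-first order in the second method is an assumption, not something the original C code confirms.

```java
final class UnsignedBytes {
    // Widen a signed Java byte (-128..127) to its unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Combine two bytes into an unsigned 16-bit value, low byte first.
    // ASSUMPTION: little-endian order; swap the arguments if the data is big-endian.
    static int toUnsignedShortLE(byte low, byte high) {
        return (low & 0xFF) | ((high & 0xFF) << 8);
    }
}
```

Without the mask, a byte such as 0xB6 would be sign-extended to a negative int before any shifting takes place.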
For example, the metadata 0x0018 is converted into the offset b'000000000011', and the length b'000'. The offset is '-4', computed by inverting the offset bits, treating the result as a 2's complement, and converting it to an integer.
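The 0x0018 example can be checked with a few lines of Java. This is only a sketch: it assumes the metadata is a 16-bit value whose low 3 bits are the length field and whose remaining upper bits are the offset field, which matches the example above but is not confirmed for every case. The names are hypothetical.

```java
final class MetadataDecoder {
    // Low 3 bits of the 16-bit metadata value (ASSUMED layout).
    static int lengthBits(int metadata) {
        return metadata & 0x7;
    }

    // Upper bits hold the offset field; inverting them and reading the
    // result as two's complement gives a negative distance, e.g. ~3 == -4.
    static int offset(int metadata) {
        int offsetBits = (metadata & 0xFFFF) >>> 3;
        return ~offsetBits;
    }
}
```

Because Java ints are already two's complement, the bitwise NOT does the "invert and reinterpret" step in one operation.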
tony roberts wrote:
Thank you everyone for all your help.
I'm still getting myself super confused.
The problem I'm having is tying all my code together.
I guess it would help if I put down exactly what I need to do and then compare your answers to what I currently have.
I have a byte[] containing approx 64k of compressed data.
The first four bytes need to be read in as a 32-bit bitmask (I'm not sure of the endianness; that's problem one).
tony roberts wrote:
I've also created a byteBuffer which is a wrapper for the byte[] -
We'll assume you mean bit position 32.
tony roberts wrote:
Then I shift through the bitmask to check if the bit is a zero or one. -
The indicator bit starts at 32 and is decremented by one on each pass through a loop; when it reaches zero, another 4-byte bitmask is read in.
If the data are written so that this is a short int, then this is a case where the big- or little-endian setting would make a difference, but only if you read a short int from the ByteBuffer instead of getting two bytes from a byte array.
tony roberts wrote:
If a zero is detected in the bitmask then the corresponding byte in the inputbuffer gets copied into an output buffer and the pointer locations get moved along one byte:
If a one is read, then the next two bytes will be metadata, containing an offset and a length. So I read the next two bytes into a separate byte[]:
tony roberts wrote:
... rest of quote omitted for brevity...
tony roberts wrote:
Hey Ralph
...
Let's say the metadata generates this: (15, 4). This means go back 15 bytes in the stream and copy 4 bytes from that position to the end of the stream. There is more processing to do if the length is 7, but I won't go into that as it's fairly annoying.
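The (15, 4) copy step might look something like the sketch below. It assumes the output is a plain byte[] with an int write position, and the method name is hypothetical. Copying one byte at a time matters, because the source and destination ranges can overlap when the length is larger than the distance.

```java
final class BackReferenceCopy {
    // Copies `length` bytes, starting `distance` bytes before outPos,
    // to outPos onward. Returns the new write position.
    static int copy(byte[] out, int outPos, int distance, int length) {
        for (int i = 0; i < length; i++) {
            // Read from already-written output, one byte at a time,
            // so overlapping ranges repeat correctly.
            out[outPos] = out[outPos - distance];
            outPos++;
        }
        return outPos;
    }
}
```

For (15, 4) with 15 bytes already written, this duplicates bytes 0 to 3 into positions 15 to 18 and returns 19.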
tony roberts wrote:
The first four bytes of the input stream are the bitmask/bitarray, they are (hex) 00 01 B6 02;
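Since the byte order is the open question, here is a sketch that assembles those four bytes into an int both ways using shifts, so the two candidate bitmasks can be compared. The helper names are made up; which order the format actually uses is not settled by anything in this thread.

```java
final class BitmaskReader {
    // First byte is the most significant: 00 01 B6 02 -> 0x0001B602.
    static int bigEndian(byte[] b, int pos) {
        return ((b[pos] & 0xFF) << 24)
             | ((b[pos + 1] & 0xFF) << 16)
             | ((b[pos + 2] & 0xFF) << 8)
             |  (b[pos + 3] & 0xFF);
    }

    // First byte is the least significant: 00 01 B6 02 -> 0x02B60100.
    static int littleEndian(byte[] b, int pos) {
        return  (b[pos] & 0xFF)
             | ((b[pos + 1] & 0xFF) << 8)
             | ((b[pos + 2] & 0xFF) << 16)
             | ((b[pos + 3] & 0xFF) << 24);
    }
}
```

Printing both results with Integer.toBinaryString and comparing them against the literal and metadata bytes that follow in the stream is one way to work out which reading is the right one.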
To distinguish data from metadata in the compressed byte stream, the data stream begins with a 4-byte bitmask that indicates to the decoder whether the next byte to be processed is data (a "0" value in the bit), or if the next byte (or series of bytes) is metadata (a "1" value in the bit). If a "0" bit is encountered, the next byte in the input stream is the next byte in the output stream. If a "1" bit is encountered, the next byte or series of bytes is metadata that MUST be interpreted further.
Ralph Cook wrote:
I did not know this feature existed -- thanks for leading me to it.
Reading the documentation for ByteBuffer, it seems to me you might be expecting it to do something it isn't going to do. From what I read, the method getInt() on that class will return an integer, reading the bytes in little or big-endian, as specified. But reading through things a byte at a time is not going to change according to the endian setting, because endian refers to the order of the bytes, and if you're only getting one, then it is just going to return the next one regardless of the endian setting. And if all you're going to do with the ByteBuffer is put it into a byte array, I don't think the endian setting of the byte buffer has any effect.
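A small sketch of that point, using only standard java.nio calls (the wrapper class and method names are just for illustration): a single-byte get() returns the same value under either byte order, while getShort() on the same two bytes does not.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderDemo {
    // Single-byte read: the order setting has no effect.
    static byte firstByte(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).get();
    }

    // Two-byte read: the order setting decides which byte is most significant.
    static short firstShort(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getShort();
    }
}
```

With the bytes 18 00 (hex), firstShort gives 0x1800 for BIG_ENDIAN and 0x0018 for LITTLE_ENDIAN, while firstByte gives 0x18 either way.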
The bitmask must also contain a "1" in the bit following the last encoded element, to indicate the end of the compressed data. For example, given a hypothetical 8-bit bitmask, the string "ABCABCDEF" should be compressed as (0,0)A(0,0)B(0,0)C(3,3)D(0,0)E(0,0)F. Its bitmask would be b'00010001' (0x11). This would indicate three bytes of data, followed by metadata, followed by an additional 3 bytes, finally terminated with a "1" to indicate the end of the stream.
The final end bit is always necessary, even if an additional bitmask has to be allocated. If the string in the above example was "ABCABCDEFG", for example, it would require an additional bitmask. It would begin with the bitmask b'00010000', followed by the compressed data, and followed by another bitmask with a "1" as the next bit to indicate the end of the stream.
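To make the bitmask examples concrete, here is a sketch that walks a mask from the most significant bit down and labels each position 'D' (data) or 'M' (metadata or end marker). Reading the bits left to right is an assumption taken from the b'00010001' example; the class and method names are hypothetical.

```java
final class BitmaskWalk {
    // Returns one character per bit, most significant bit first:
    // 'D' for a 0 bit (literal data), 'M' for a 1 bit (metadata or end marker).
    static String describe(int mask, int width) {
        StringBuilder sb = new StringBuilder();
        for (int bit = width - 1; bit >= 0; bit--) {
            boolean isOne = ((mask >>> bit) & 1) == 1;
            sb.append(isOne ? 'M' : 'D');
        }
        return sb.toString();
    }
}
```

For the 8-bit examples, describe(0x11, 8) gives "DDDMDDDM" (the last M being the end marker) and describe(0x10, 8) gives "DDDMDDDD", which is why "ABCABCDEFG" needs a second bitmask to carry the terminating 1. The same loop with a width of 32 covers the real 4-byte bitmask.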