Parsing encoded data?

Alan Wake
Greenhorn

Joined: Jul 23, 2011
Posts: 4
I have a torrent file with a list of announce URLs. For example, this is part of it:

It has one array under the key «announce-list» containing three elements (the data is bencoded, see http://en.wikipedia.org/wiki/Bencode). I am using the BDecoder.java class from Aelitis to decode it. After parsing, I get the following values in the Map:

So the announce list is filled with what look like hashes. How can I convert them to normal strings (such as «http://iptorrents.com:2790/b6d18a815ab4421a86de672d6833369d/announce»)? Or is this a problem with the algorithm in BDecoder.java?

This is the method of the above class that decodes the data: http://pastebin.com/HimqF0ye
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19679

Welcome to the Ranch!

All these parts that start with [B are byte arrays. When you print a byte[], its toString() method will print the type name ([B) followed by an @ and the hexadecimal hash code. This is how you get to something like [B@141d683.
created by=[B@141d683 => byte[]
announce=[B@16a55fa => byte[]
encoding=[B@32c41a => byte[]
announce-list=[[[B@e89b94], [[B@13e205f], [[B@1bf73fa]] => see below
comment=[B@5740bb => byte[]
creation date=1310060702 => int or long, probably long
info={pieces=[B@5ac072, name=[B@109a4c, length=34209795, piece length=65536, private=1} => see below
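To see where a value like [B@141d683 comes from, here is a minimal sketch. The class name, the helper name and the example.com-style URL are made up for illustration; only the behaviour of the inherited toString() is the point.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayPrintDemo {
    /** Returns what you get when a byte[] is printed or concatenated to a String. */
    static String defaultToString(byte[] raw) {
        // Arrays do not override Object.toString(), so the result is
        // the class name ("[B"), an "@", and the hash code in hexadecimal.
        return raw.toString();
    }

    public static void main(String[] args) {
        // Hypothetical tracker URL, used only as sample data.
        byte[] raw = "http://tracker.example/announce".getBytes(StandardCharsets.UTF_8);

        // Prints something of the form [B@<hex>; the hex part varies between runs.
        System.out.println(defaultToString(raw));

        // The same string built by hand, to show there is no content in it.
        System.out.println("[B@" + Integer.toHexString(raw.hashCode()));
    }
}
```

Both lines print the same text, and neither contains any of the actual bytes.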

First of all, the announce list. You see not one but three starting [ characters. Without a matching closing ] character, each extra [ means another array dimension. For example, [[B@e89b94 would be a byte[][] and [[[B@e89b94 a byte[][][]. Here, however, there are closing ] characters. That means it's not another array dimension but a List, Set or other Collection; these are usually printed as [] with the elements inside separated by a comma and a space. Matching the brackets you get this structure (using Collection since List and Set both extend Collection):
- Collection
-- Collection with one byte[]
-- Collection with one byte[]
-- Collection with one byte[]
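The difference between an extra array dimension and a nested collection can be checked with a small sketch like the one below. It assumes the decoder hands back java.util.List instances, which the printed output suggests but does not prove; the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NestedPrintDemo {
    /** Builds a list of three single-element lists, mirroring the announce-list shape. */
    static List<List<byte[]>> sampleAnnounceList() {
        return Arrays.asList(
                Collections.singletonList(new byte[] {1}),
                Collections.singletonList(new byte[] {2}),
                Collections.singletonList(new byte[] {3}));
    }

    public static void main(String[] args) {
        // A two-dimensional array: two opening brackets and no closing one.
        byte[][] twoDimensional = new byte[1][1];
        System.out.println(twoDimensional.toString());

        // Lists print as [element, element, ...], so every opening bracket
        // that belongs to a list has a matching closing bracket.
        System.out.println(sampleAnnounceList());
    }
}
```

The second line has the same shape as the announce-list value above: an outer list, three inner lists, and one [B@... entry in each.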

Now, info. Curly braces {} with key=value pairs inside usually indicate a Map. In this case it has the following entries:
- pieces => byte[]
- name => byte[]
- length => Integer or Long
- piece length => Integer or Long
- private => Byte, Short, Integer or Long

Afterwards, I'm guessing that you can turn each byte[] into a String, by using the right String constructor and providing the right encoding. Which one that is I can't tell you, but I'd probably start with UTF-8.
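A rough sketch of that idea follows. It assumes UTF-8 is the right encoding and that the decoded structure consists of Map, Collection and byte[] values as described above. The class name, the method names and the sample URLs are hypothetical, and the map is built by hand here rather than produced by BDecoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TorrentStringsDemo {
    /** Decodes one byte[] as text, assuming UTF-8. */
    static String text(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Collects every byte[] found in arbitrarily nested collections as a String. */
    static List<String> announceUrls(Object node) {
        List<String> urls = new ArrayList<>();
        collect(node, urls);
        return urls;
    }

    private static void collect(Object node, List<String> out) {
        if (node instanceof byte[]) {
            out.add(text((byte[]) node));
        } else if (node instanceof Collection) {
            for (Object child : (Collection<?>) node) {
                collect(child, out);
            }
        }
        // Anything else (numbers, nested maps) is ignored in this sketch.
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the decoded torrent map; the URLs are sample data.
        byte[] first = "http://a.example/announce".getBytes(StandardCharsets.UTF_8);
        byte[] second = "http://b.example/announce".getBytes(StandardCharsets.UTF_8);

        Map<String, Object> m = new HashMap<>();
        m.put("announce", first);
        m.put("announce-list", Arrays.asList(
                Collections.singletonList(first),
                Collections.singletonList(second)));

        System.out.println(text((byte[]) m.get("announce")));
        System.out.println(announceUrls(m.get("announce-list")));
    }
}
```

Not every byte[] in a torrent is text. The pieces entry holds raw hash bytes, so decoding it as a String would only produce garbage.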


Alan Wake
Greenhorn

Joined: Jul 23, 2011
Posts: 4
Yes, it works with a String constructor such as:

So the resulting working code is:

Here "m" is the map of torrent data from the torrent file (announce URLs, pieces and so on). This code handles only one announce URL from the announce list, because I need just one.
But I don't understand why an error occurs in this class when I change these lines:

(where tempElement contains a byte array) to this:

Here is the whole method:

tempList contains only one string, and somewhere it breaks and the method never returns. I can't describe it well; you would have to see it once to understand what I mean. It breaks only when I change those three lines.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19679

What error are you getting? Please post the full stack trace.
Alan Wake
Greenhorn

Joined: Jul 23, 2011
Posts: 4
Hm, no exceptions are thrown, but it stops running (hangs) at some point.
 