Parsing encoded data?

Alan Wake
Greenhorn

Joined: Jul 23, 2011
Posts: 4
I have a torrent file with a list of announce URLs; for example, this is part of it:

So here is one list under the key «announce-list» which contains three elements (the data is bencoded, see http://en.wikipedia.org/wiki/Bencode). I am using the BDecoder.java class from Aelitis to decode it. While parsing, I get the following values in the Map:

So the announce list is filled with what look like hashes. How can I convert them to normal strings (such as «http://iptorrents.com:2790/b6d18a815ab4421a86de672d6833369d/announce»)? Or is it an issue with the algorithm in BDecoder.java?

This is the method of the class above that decodes the data: http://pastebin.com/HimqF0ye
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19762

Welcome to the Ranch!

All these parts that start with [B are byte arrays. When you print a byte[], its toString() method will print the type name ([B) followed by an @ and the hexadecimal hash code. This is how you get to something like [B@141d683.
created by=[B@141d683 => byte[]
announce=[B@16a55fa => byte[]
encoding=[B@32c41a => byte[]
announce-list=[[[B@e89b94], [[B@13e205f], [[B@1bf73fa]] => see below
comment=[B@5740bb => byte[]
creation date=1310060702 => int or long, probably long
info={pieces=[B@5ac072, name=[B@109a4c, length=34209795, piece length=65536, private=1} => see below
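
To see where output like [B@141d683 comes from, here is a minimal sketch. The class name ByteArrayPrinting, the helper asText and the sample URL are made up for illustration, and UTF-8 is only an assumed encoding. It uses StandardCharsets, which needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteArrayPrinting {

    /** Decodes the bytes as text. UTF-8 is an assumption, not something the torrent guarantees. */
    static String asText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample value standing in for a decoded announce URL.
        byte[] raw = "http://tracker.example/announce".getBytes(StandardCharsets.UTF_8);

        // Arrays inherit Object.toString(): type tag "[B", an '@', then the hex hash code.
        System.out.println(raw.toString());

        // Arrays.toString prints the individual byte values instead.
        System.out.println(Arrays.toString(raw));

        // Decoding the bytes gives back the readable text.
        System.out.println(asText(raw));
    }
}
```

The hash code in the first line will differ from run to run, so it says nothing about the contents of the array.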

First of all, the announce list. You see not one but three opening [ characters. Without closing ] characters, each additional [ would mean another array dimension: for example, [[B@e89b94 would be a byte[][] and [[[B@e89b94 a byte[][][]. However, there are closing ] characters here. That means these are not extra array dimensions but a List, Set or other Collection; these are usually printed as [] with the elements inside, separated by a comma and a space. Matching the brackets, you get this structure (using Collection since List and Set both extend Collection):
- Collection
-- Collection with one byte[]
-- Collection with one byte[]
-- Collection with one byte[]
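
Walking a structure like the one above could look roughly like this. It is only a sketch: the class name AnnounceListDemo, the method flatten and the example tracker URLs are hypothetical, the element types are checked at runtime because I don't know what exact types the decoder returns, and UTF-8 is assumed.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class AnnounceListDemo {

    /** Collects every byte[] found one level down and decodes it as UTF-8 (assumed encoding). */
    static List<String> flatten(Collection<?> announceList) {
        List<String> urls = new ArrayList<>();
        for (Object tier : announceList) {
            if (!(tier instanceof Collection)) {
                continue; // skip anything that is not an inner collection
            }
            for (Object entry : (Collection<?>) tier) {
                if (entry instanceof byte[]) {
                    urls.add(new String((byte[]) entry, StandardCharsets.UTF_8));
                }
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up data with the same shape: an outer collection of one-element collections.
        List<Object> sample = new ArrayList<>();
        sample.add(Collections.singletonList(
                "http://tracker-a.example/announce".getBytes(StandardCharsets.UTF_8)));
        sample.add(Collections.singletonList(
                "http://tracker-b.example/announce".getBytes(StandardCharsets.UTF_8)));

        for (String url : flatten(sample)) {
            System.out.println(url);
        }
    }
}
```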

Now, info. The {} with key=value pairs inside usually indicates a Map. In this case it has the following entries:
- pieces => byte[]
- name => byte[]
- length => Integer or Long
- piece length => Integer or Long
- private => Byte, Short, Integer or Long

Afterwards, I'm guessing that you can turn each byte[] into a String, by using the right String constructor and providing the right encoding. Which one that is I can't tell you, but I'd probably start with UTF-8.
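
A rough sketch of that idea, applied to the whole decoded structure, is below. The names TorrentValues and toReadable are made up, the map keys are assumed to be Strings (they print as readable text in the output above), and UTF-8 is a guess. One caveat: in a torrent file the pieces entry holds raw SHA-1 hashes rather than text, so the sketch reports its length instead of decoding it.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TorrentValues {

    /** Returns a copy of the value in which text-like byte[] entries are decoded to Strings. */
    static Object toReadable(String key, Object value) {
        if (value instanceof byte[]) {
            byte[] raw = (byte[]) value;
            if ("pieces".equals(key)) {
                // Binary hash data: decoding it as text would only produce garbage.
                return "<" + raw.length + " bytes of binary data>";
            }
            return new String(raw, StandardCharsets.UTF_8); // assumed encoding
        }
        if (value instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                String k = String.valueOf(e.getKey()); // assumes String keys
                out.put(k, toReadable(k, e.getValue()));
            }
            return out;
        }
        if (value instanceof Collection) {
            List<Object> out = new ArrayList<>();
            for (Object item : (Collection<?>) value) {
                out.add(toReadable(key, item));
            }
            return out;
        }
        return value; // numbers and anything else are passed through unchanged
    }

    public static void main(String[] args) {
        // Made-up stand-in for the decoded "info" map.
        Map<String, Object> info = new LinkedHashMap<>();
        info.put("name", "example.bin".getBytes(StandardCharsets.UTF_8));
        info.put("pieces", new byte[40]);
        info.put("length", 34209795L);
        System.out.println(toReadable("info", info));
    }
}
```

On Java versions before 7 there is no StandardCharsets; there you can pass the charset name "UTF-8" to the String constructor instead and handle the UnsupportedEncodingException it declares.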


Alan Wake
Greenhorn

Joined: Jul 23, 2011
Posts: 4
Yep, it works with a String constructor such as:

So the resulting working code is:

Here "m" is the map of torrent data from the torrent file, such as announce URLs, pieces and so on. This code handles only one announce URL from the announce list, because I need just one.
But I don't understand why an error happens in this class when I change these lines:

where tempElement contains a byte array, to this:

Here is the whole method:

tempList contains only one string, and somewhere it breaks and never reaches the return statement. I can't describe it well; you would have to see it once to understand what I mean. It breaks only when I change those three lines.
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19762

What error are you getting? Please post the full stack trace.
Alan Wake
Greenhorn

Joined: Jul 23, 2011
Posts: 4
Hm, no exceptions are thrown, but it stops running (hangs) at some point.