ASCII to number conversion in Java

 
luri ron
Ranch Hand
Posts: 87
Is there a library out there that converts an ASCII byte array to a Java primitive data type such as double, int, or long?
 
David Newton
Author
Rancher
Posts: 12617
If it's bytes (already a primitive) why would you need a library to convert it to another, different primitive? It's just a new array and a single loop, no?
 
Rob Spoor
Sheriff
Posts: 20527
Or even simpler using ByteBuffer:
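
The snippet that followed here is missing from the thread. As a rough sketch of the kind of ByteBuffer usage presumably meant (an assumption; the class and method names below are made up for illustration), this reads binary, big-endian bytes, not ASCII digits:

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not code from the original post.
final class BinaryBytes {
    // Interprets exactly 4 bytes as a big-endian binary int.
    static int toInt(byte[] fourBytes) {
        // wrap() shares the array; ByteBuffer defaults to big-endian byte order.
        return ByteBuffer.wrap(fourBytes).getInt();
    }

    // Interprets exactly 8 bytes as an IEEE 754 binary double.
    static double toDouble(byte[] eightBytes) {
        return ByteBuffer.wrap(eightBytes).getDouble();
    }

    public static void main(String[] args) {
        System.out.println(toInt(new byte[] {0, 0, 1, 2})); // 0x00000102 = 258
    }
}
```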
 
David Newton
Author
Rancher
Posts: 12617
I didn't even know that existed... something new every day 'round these parts.
 
luri ron
Ranch Hand
Posts: 87
This would work for a binary byte array, but not for an ASCII byte array. I am looking for ASCII to number, something like the C function atoi.
 
David Newton
Author
Rancher
Posts: 12617
What do you think ASCII is?
 
Matt Cartwright
Ranch Hand
Posts: 152
well, atoi() converts (char)0x30 to (int) 0, (char)0x31 to (int) 1
through to 0x39.

so before using the wrap() method you would need to
subtract 48 from each byte.

 
Matt Cartwright
Ranch Hand
Posts: 152
Matt Cartwright wrote: well, atoi() converts (char)0x30 to (int) 0, (char)0x31 to (int) 1
through to 0x39.

so before using the wrap() method you would need to
subtract 48 from each byte.



all hat no cattle

that dog won't hunt

the following code:


gives:


Exception in thread "main" java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:480)
at java.nio.HeapByteBuffer.getInt(HeapByteBuffer.java:336)
at aero.tekware.ranch.CLike.atoi(CLike.java:13)
at aero.tekware.ranch.CLike.main(CLike.java:18)


it gets its knickers in a twist in java.nio.HeapByteBuffer
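
The code that produced this trace is missing from the thread. A minimal sketch that reproduces the same exception, assuming the attempt was to subtract 48 from each byte and then call getInt() on the wrapped array (the class and method names are hypothetical):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical reconstruction, not the original CLike class.
final class WrapAscii {
    static int viaByteBuffer(byte[] ascii) {
        byte[] shifted = new byte[ascii.length];
        for (int i = 0; i < ascii.length; i++) {
            shifted[i] = (byte) (ascii[i] - 48); // '0'..'9' becomes 0..9
        }
        // getInt() needs 4 bytes and reads them as one binary, big-endian int.
        return ByteBuffer.wrap(shifted).getInt();
    }

    public static void main(String[] args) {
        // Four digits do not fail, but the result is 0x01020304, not 1234.
        System.out.println(viaByteBuffer("1234".getBytes(StandardCharsets.US_ASCII)));
        // Three digits: fewer than 4 bytes remain, so BufferUnderflowException is thrown.
        System.out.println(viaByteBuffer("123".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Either way the approach fails: with fewer than four bytes getInt() underflows, and with four or more it returns the binary interpretation of the bytes rather than the decimal number.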
 
Henry Wong
author
Marshal
Posts: 21115
Matt Cartwright wrote: well, atoi() converts (char)0x30 to (int) 0, (char)0x31 to (int) 1
through to 0x39.

so before using the wrap() method you would need to
subtract 48 from each byte.



If the bytes in the array are holding characters, then ByteBuffer won't work. Your best option is to loop through the array: for each byte, first subtract '0', as already mentioned, then multiply the running total by 10 and add the digit.

Henry
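
The loop described above can be sketched roughly like this. It is an illustration under a few assumptions, not a drop-in atoi: the names are made up, an optional leading sign is accepted, parsing stops at the first non-digit, and there is no overflow check.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the subtract-'0', multiply-by-10, add loop.
final class AsciiInt {
    static int atoi(byte[] ascii) {
        int i = 0;
        boolean negative = false;
        if (i < ascii.length && (ascii[i] == '-' || ascii[i] == '+')) {
            negative = ascii[i] == '-';
            i++;
        }
        int result = 0;
        // Stop at the first byte that is not an ASCII digit.
        while (i < ascii.length && ascii[i] >= '0' && ascii[i] <= '9') {
            result = result * 10 + (ascii[i] - '0'); // no overflow check in this sketch
            i++;
        }
        return negative ? -result : result;
    }

    public static void main(String[] args) {
        System.out.println(atoi("-42".getBytes(StandardCharsets.US_ASCII))); // -42
    }
}
```

No String or other temporary object is created per call; the only state is a few local variables.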
 
Matt Cartwright
Ranch Hand
Posts: 152
Henry Wong wrote:
Matt Cartwright wrote: well, atoi() converts (char)0x30 to (int) 0, (char)0x31 to (int) 1
through to 0x39.

so before using the wrap() method you would need to
subtract 48 from each byte.



If the bytes in the array are holding characters, then ByteBuffer won't work. Your best option is to loop through the array: for each byte, first subtract '0', as already mentioned, then multiply the running total by 10 and add the digit.

Henry


and perform a circular shift to speed up multiplication?

I think the easiest way to implement the exact atoi() behaviour is:
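
The snippet that followed is missing from the thread. Judging by the replies it went through the Integer and Double wrapper classes, so a sketch along those lines might look like this (an assumption; the class and method names are invented). Note that parseInt throws NumberFormatException on bad input, whereas C's atoi typically returns 0, so the behaviour is not identical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decode the bytes to a String, then use the JDK parsers.
final class ViaString {
    static int toInt(byte[] ascii) {
        // Allocates a temporary String per call.
        return Integer.parseInt(new String(ascii, StandardCharsets.US_ASCII).trim());
    }

    static double toDouble(byte[] ascii) {
        return Double.parseDouble(new String(ascii, StandardCharsets.US_ASCII).trim());
    }

    public static void main(String[] args) {
        System.out.println(toInt(" 123 ".getBytes(StandardCharsets.US_ASCII))); // 123
    }
}
```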
 
luri ron
Ranch Hand
Posts: 87
atoi is simple. The complicated one is ASCII to double.
I looped through the ASCII digits. For the fractional part, I divided it by 10
and then added it. But I got a double precision error. For example, for the ASCII
input 0.0007, I got 6.999999e-4. Is there any way to get 0.0007?
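
One way to avoid this kind of accumulated rounding error is to collect all the digits into a single integer and divide once by a power of ten at the end. The sketch below is only an illustration under stated assumptions: the names are made up, it handles plain decimal notation only (no exponent, NaN or Infinity), and it rejects inputs with more than 15 digits, because the trick relies on both the digit value and the power of ten being exactly representable as doubles.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: parse ASCII decimal digits to a double with one division.
final class AsciiDouble {
    // 10^0 .. 10^15 are all exactly representable as doubles.
    private static final double[] POW10 = {
        1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
    };

    static double parse(byte[] ascii) {
        int i = 0;
        boolean negative = false;
        if (i < ascii.length && (ascii[i] == '-' || ascii[i] == '+')) {
            negative = ascii[i] == '-';
            i++;
        }
        long mantissa = 0;       // all digits, with the decimal point ignored
        int digits = 0;          // total digits seen
        int fractionDigits = 0;  // digits seen after the decimal point
        boolean seenPoint = false;
        for (; i < ascii.length; i++) {
            byte b = ascii[i];
            if (b == '.' && !seenPoint) {
                seenPoint = true;
                continue;
            }
            if (b < '0' || b > '9') {
                throw new NumberFormatException("unexpected byte at index " + i);
            }
            if (++digits > 15) {
                throw new NumberFormatException("more than 15 digits; outside this sketch");
            }
            mantissa = mantissa * 10 + (b - '0');
            if (seenPoint) {
                fractionDigits++;
            }
        }
        if (digits == 0) {
            throw new NumberFormatException("no digits");
        }
        // Both operands are exact, so this single division is correctly rounded.
        double value = mantissa / POW10[fractionDigits];
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        // Prints 7.0E-4, which is how Java displays the double 0.0007.
        System.out.println(parse("0.0007".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

For "0.0007" this computes 7 / 10000.0 in one step instead of adding up several rounded fractions, and no temporary objects are created along the way.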
 
luri ron
Ranch Hand
Posts: 87
Matt, if you use Java's Double or Integer to get the numeric value,
you will create at least some temporary objects, such as a String and a FloatingDecimal. If you are parsing
millions of these, the performance won't be good.
 
Matt Cartwright
Ranch Hand
Posts: 152
luri ron wrote: you will create at least some temporary objects, such as a String and a FloatingDecimal. If you are parsing
millions of these, the performance won't be good.


I assume you are referring to the passing of method arguments
by value. So, if you wanna do it manually you might wanna
have a look at

and regarding performance, my computer does a million
of these atoi() calls in 108 milliseconds

What is your target?

Matt
 
Matt Cartwright
Ranch Hand
Posts: 152
luri ron wrote: atoi is simple. The complicated one is ASCII to double.
I looped through the ASCII digits. For the fractional part, I divided it by 10
and then added it. But I got a double precision error. For example, for the ASCII
input 0.0007, I got 6.999999e-4. Is there any way to get 0.0007?


could you please provide the code in question and some stack trace?
 
Jeff Hansen
Greenhorn
Posts: 1
Luri, did you ever find what you were looking for? I think I'm looking for the same thing: some nice utilities for converting an ASCII (or even UTF-8) input stream of bytes directly to the expected primitive values without creating any objects per operation (static reusable buffer objects are fine). I feel like there have to be some utility classes for this in Apache Commons; I just can't find them. I'm also wondering if Google's Protocol Buffers can do this, but I haven't spent enough time looking in that direction. Please let me know if you found anything.

Matt, it's not just about the speed, it's about the memory. Creating all those local String objects fills up the heap. If you're parsing a REALLY HUGE ASCII comma-delimited file of numbers and you go with the lazy approach of using a LineReader that passes you back each line as a String, then parse the String with substring, regex or StringTokenizer, as the core Java API pretty much wants to force you to do, you end up filling up the heap with String objects and char[] arrays. If you're not careful, for every byte of input it's easy to generate 40 or more bytes of temporary objects on the heap. All that junk on the heap eventually has to get garbage collected. When you're running through a couple of petabytes of input, it makes a difference to come up with a utility method that works in constant memory. It's not that hard to create your own, but wouldn't it be nice if there was already one available so we don't all have to reinvent the wheel?
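
For what it's worth, a constant-memory parser of the kind described can be sketched in a few lines. This is only an illustration with made-up names, assuming the input is integers separated by commas or newlines. It reuses one byte buffer, treats any non-digit byte as a delimiter, and does no overflow or format checking.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.LongConsumer;

// Hypothetical helper: streams delimited ASCII integers without per-value allocation.
final class StreamingLongParser {
    static void parse(InputStream in, LongConsumer sink) throws IOException {
        byte[] buffer = new byte[8192]; // the only allocation, reused for every read
        long value = 0;
        boolean negative = false;
        boolean inNumber = false;
        int n;
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                byte b = buffer[i];
                if (b >= '0' && b <= '9') {
                    value = value * 10 + (b - '0'); // no overflow check in this sketch
                    inNumber = true;
                } else if (b == '-' && !inNumber) {
                    negative = true;
                } else {
                    // Any other byte (comma, newline, ...) ends the current number.
                    if (inNumber) {
                        sink.accept(negative ? -value : value);
                    }
                    value = 0;
                    negative = false;
                    inNumber = false;
                }
            }
        }
        if (inNumber) {
            sink.accept(negative ? -value : value); // last number, no trailing delimiter
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "12,-7\n300,4".getBytes(StandardCharsets.US_ASCII);
        parse(new ByteArrayInputStream(data), v -> System.out.println(v));
    }
}
```

Because the parser state lives in a few local variables, a number that straddles two buffer reads is handled the same way as any other.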
 