Writing a faster String split.

 
Greenhorn
Posts: 26
I wrote a simple split method to save time.
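A minimal sketch of what such a single-character split might look like, assuming empty tokens are skipped. The names `CharSplit` and `split` are hypothetical, and this is not necessarily the code from the post; it only illustrates the temp-array-then-copy pattern described below.

```java
import java.util.Arrays;

// Hypothetical sketch, not the original poster's code.
final class CharSplit {
    /** Splits line on a single delimiter character, skipping empty tokens. */
    static String[] split(String line, char delimiter) {
        // With empty tokens skipped, a line of length n holds at most n/2 + 1 tokens.
        String[] temp = new String[line.length() / 2 + 1];
        int count = 0;
        int start = 0;
        int end;
        while ((end = line.indexOf(delimiter, start)) >= 0) {
            if (end > start) {
                temp[count++] = line.substring(start, end);
            }
            start = end + 1;
        }
        // Whatever follows the last delimiter is the final token.
        if (start < line.length()) {
            temp[count++] = line.substring(start);
        }
        // The copy step: trim temp down to exactly count cells.
        return Arrays.copyOf(temp, count);
    }
}
```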





It's much faster than String.split and StringTokenizer, but I doubt it's optimal. Either way, I had what I thought was a good idea.
Instead of using a temp array and copying every String into a result array, I wanted to store the size of the array in the last cell, like this:



and just return the temp array. When iterating over the array later, I thought I could use



and just ignore the empty cells.
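A sketch of the idea as described, assuming empty tokens are skipped and one extra cell is reserved for the count. The names `SentinelSplit` and `tokenCount` are hypothetical; the original snippets may have looked different.

```java
// Hypothetical sketch of the "count in the last cell" idea.
final class SentinelSplit {
    static String[] split(String line, char delimiter) {
        // At most n/2 + 1 non-empty tokens, plus one cell reserved for the count.
        String[] temp = new String[line.length() / 2 + 2];
        int count = 0;
        int start = 0;
        int end;
        while ((end = line.indexOf(delimiter, start)) >= 0) {
            if (end > start) {
                temp[count++] = line.substring(start, end);
            }
            start = end + 1;
        }
        if (start < line.length()) {
            temp[count++] = line.substring(start);
        }
        // Store the count as a String in the reserved last cell; no copy is made.
        temp[temp.length - 1] = Integer.toString(count);
        return temp;
    }

    /** Reads the token count back out of the last cell. */
    static int tokenCount(String[] parts) {
        return Integer.parseInt(parts[parts.length - 1]);
    }

    public static void main(String[] args) {
        String[] parts = split("a b c", ' ');
        // Iterate only over the filled cells; the null cells are never touched.
        for (int i = 0; i < tokenCount(parts); i++) {
            System.out.println(parts[i]);
        }
    }
}
```

Reserving the count cell on top of the maximum token count matters: if the array is sized for the tokens alone, a line of single-letter tokens fills every cell and the count overwrites the last token.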

However, this doesn't work for some reason... Am I missing something really obvious here?
 
Master Rancher
Posts: 4796
Hmmm... I don't see the problem offhand, but you haven't shown the complete current code, right? You've shown an earlier version of the code and then described some changes. Let's see what you've actually got now.

Also, in what way does it "not work"? Throw an exception? Do nothing? Give results that are not what you expect? Can you show us a sample of how you call it, what result you expect, and what result you get?

This smells strongly of premature optimization. It's possible you might really, really need the extra speed. But I think it's pretty unlikely in most cases. It may be a fun programming challenge to get this as fast as you can, as a learning experience. But if it's for work, there are probably more productive ways to spend your time.

I think the biggest concern, however, is that storing the length as a string at the end of the array seems very error-prone. Users (other programmers) will not in general expect that sort of thing unless they've carefully read your documentation, and frankly many people won't read it until *after* it's blown up in their faces. If you really want to do something like this, just to avoid a single array copy, I might suggest creating a new class that wraps the results and makes them easier to use. Something like this:
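A minimal sketch of such a wrapper, assuming it simply pairs the oversized array with the number of valid cells. The class name `SplitResult` and its methods are hypothetical, not taken from the original post.

```java
// Hypothetical wrapper; the class and method names are illustrative.
final class SplitResult {
    private final String[] cells; // oversized backing array, shared rather than copied
    private final int size;       // number of cells that actually hold tokens

    SplitResult(String[] cells, int size) {
        if (size < 0 || size > cells.length) {
            throw new IllegalArgumentException("size out of range: " + size);
        }
        this.cells = cells;
        this.size = size;
    }

    /** Number of tokens, independent of the backing array's length. */
    int size() {
        return size;
    }

    /** Returns the token at index, refusing access to the unused cells. */
    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return cells[index];
    }
}
```

The caller never sees the unused cells, so there is nothing to document or forget about.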

Actually as I think about it, you can just use existing utilities to create a List like this:
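One way to do this with the standard library, assuming the split loop has produced an oversized array `temp` and a token count `count` (both names, and the class `SplitViews`, are illustrative): `Arrays.asList` wraps the array without copying it, and `subList` returns a view of the filled portion.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; only Arrays.asList and List.subList are standard API.
final class SplitViews {
    /** Returns a List view of the first count cells of temp, without copying. */
    static List<String> asList(String[] temp, int count) {
        return Arrays.asList(temp).subList(0, count);
    }
}
```

Because the list is a view, later writes to `temp` show through; that is the price of skipping the copy.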
 
David Phluphy
Greenhorn
Posts: 26

Mike Simmons wrote: Hmmm... I don't see the problem offhand, but you haven't shown the complete current code, right? You've shown an earlier version of the code and then described some changes. Let's see what you've actually got now.

Also, in what way does it "not work"? Throw an exception? Do nothing? Give results that are not what you expect? Can you show us a sample of how you call it, what result you expect, and what result you get?

This smells strongly of premature optimization. It's possible you might really, really need the extra speed. But I think it's pretty unlikely in most cases. It may be a fun programming challenge to get this as fast as you can, as a learning experience. But if it's for work, there are probably more productive ways to spend your time.



Thanks for your reply. It is a fun challenge, nothing more.
I found the error. When splitting a string made up of single letters, the array was completely full, so the count stored in the last cell was overwriting the last letter. I changed

String[] temp = new String[line.length()/2]; to String[] temp = new String[(line.length()/2)+1]; and now it works.

It was actually a little slower than my original code, though.
 
Marshal
Posts: 79153
That is completely different from the java.lang.String.split(java.lang.String) method. You are splitting on a single character, whereas the built-in method splits on a regular expression. Matching a regular expression generally takes longer than finding a single character.

I think this is a more difficult question than we usually get in "beginning", so I shall move this thread.
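A short illustration of the difference, using only standard `String.split` and `Pattern.quote`; the class and helper names are hypothetical. Because the argument is a regular expression, a delimiter such as "." has to be escaped or quoted before it behaves like a plain character.

```java
import java.util.regex.Pattern;

// Illustrative demo; RegexSplitDemo and splitLiteral are hypothetical names.
final class RegexSplitDemo {
    /** Splits on the delimiter taken literally, not as a regular expression. */
    static String[] splitLiteral(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // "." as a regex matches every character, so only empty strings remain
        // and split drops trailing empty strings.
        System.out.println("a.b.c".split(".").length);          // prints 0
        // Escaping the dot makes it a literal delimiter.
        System.out.println("a.b.c".split("\\.").length);        // prints 3
        System.out.println(splitLiteral("a.b.c", ".").length);  // prints 3
    }
}
```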
 
David Phluphy
Greenhorn
Posts: 26
Sorry about that. As you know, I'm fairly new here, so I don't really know what goes where yet.

I know it's different from .split, which is why I wrote it. Most problems I've encountered require that I split by single characters, so this method is usually all I need.

I tried skipping the split altogether and parsing the string directly, but I couldn't notice any difference in performance. I guess I can't do it much faster than this? ^^
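As an illustration of what parsing directly can mean, here is a sketch that sums delimiter-separated integers in one pass without creating any substrings. It assumes the input holds only non-negative decimal numbers and delimiters; the task and the names `DirectParse` and `sum` are hypothetical, since the post does not say what was being parsed.

```java
// Hypothetical example of parsing without splitting first.
final class DirectParse {
    /** Sums non-negative decimal integers separated by delimiter. */
    static long sum(String line, char delimiter) {
        long total = 0;
        long current = 0;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == delimiter) {
                // End of a number: fold it into the total and start over.
                total += current;
                current = 0;
            } else {
                // Assumes c is an ASCII digit; this sketch does no validation.
                current = current * 10 + (c - '0');
            }
        }
        // The last number has no trailing delimiter.
        return total + current;
    }
}
```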
 
author and iconoclast
Posts: 24207
The storing-the-length-at-the-end bit seems unnecessary and, as Mike said, error-prone. I would think the most performant thing to do would be to return the String array with possibly some extra nulls at the end, and the user is just supposed to check for them; i.e.
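A sketch of what that check might look like on the caller's side, assuming the tokens are packed at the front of the array and every unused cell is null. The names `NullPadded` and `countTokens` are hypothetical.

```java
// Hypothetical consumer of a null-padded result array.
final class NullPadded {
    /** Counts the leading non-null cells, stopping at the first null. */
    static int countTokens(String[] parts) {
        int n = 0;
        // The loop condition does the work: stop at the end or at the first null.
        for (int i = 0; i < parts.length && parts[i] != null; i++) {
            n++;
        }
        return n;
    }
}
```

The same loop condition works for any processing of the tokens, not just counting them.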

 
Don't get me started about those stupid light bulbs.