I keep hearing this advice about using hasNext() instead of hasNextLine(), so I had to try it for myself. I purposely made a txt file without a line terminator (0d 0a) on the last line and verified it with a hex dump. I then scanned it with both hasNext() and hasNextLine(), and both work identically.
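For anyone who wants to repeat that experiment without a file, here is a minimal sketch that runs both loop conditions over in-memory text. The class name, method names and sample strings are made up for illustration, and it assumes the loop body calls nextLine(); the post does not say which read method was used.

```java
import java.util.Scanner;

public class LoopConditionDemo {
    // Counts iterations of a hasNextLine()/nextLine() loop.
    static int countWithHasNextLine(String text) {
        int count = 0;
        try (Scanner scan = new Scanner(text)) {
            while (scan.hasNextLine()) {
                scan.nextLine();
                count++;
            }
        }
        return count;
    }

    // Counts iterations of a hasNext()/nextLine() loop.
    static int countWithHasNext(String text) {
        int count = 0;
        try (Scanner scan = new Scanner(text)) {
            while (scan.hasNext()) {
                scan.nextLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String noTerminator = "alpha\r\nbeta";      // last line has no 0d 0a
        String emptyLastLine = "alpha\r\nbeta\r\n\r\n"; // ends with an empty line
        System.out.println(countWithHasNextLine(noTerminator));
        System.out.println(countWithHasNext(noTerminator));
        System.out.println(countWithHasNextLine(emptyLastLine));
        System.out.println(countWithHasNext(emptyLastLine));
    }
}
```

With no terminator on the last line both loops run twice, which agrees with the result above. The two only diverge when the text finishes with an empty line: the hasNextLine() loop then runs a third time and reads an empty string, while the hasNext() loop stops after two.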
Campbell Ritchie wrote: I would prefer to use hasNext() rather than hasNextLine(). That obviates any problems you might have if the file finishes with an empty line.
I didn't intentionally write an empty line, but maybe it was added regardless. I can imagine all sorts of ways to make that code empty line resistant, one of which is to change hasNextLine() to hasNext(). This is what the names file looks like:-
I didn't see the empty line, but it is obviously there, or hasNextLine() would return false.
jshell> class EmptyLineDemo {
   ...>     public static void main(String... args) {
   ...>         new EmptyLineDemo(args[0]).readLines();
   ...>     }
   ...>     private final String fileName;
   ...>     EmptyLineDemo(String fileName) {
   ...>         this.fileName = fileName;
   ...>     }
   ...>     void readLines() {
   ...>         int count = 1;
   ...>         try (Scanner scan = new Scanner(Paths.get(fileName))) {
   ...>             // hasNextLine() is still true while only a line terminator or empty line remains
   ...>             while (scan.hasNextLine()) {
   ...>                 System.out.printf("line %d has %s-%s%n", count++,
   ...>                         scan.next(), scan.next());
   ...>             }
   ...>         } catch (IOException ex) {
   ...>             ex.printStackTrace(); // assumed; the handler body is missing from the post
   ...>         }
   ...>     }
   ...> }
| modified class EmptyLineDemo
line 1 has Gurtej-Grewal
line 2 has Carey-Brown
line 3 has Campbell-Ritchie
| Exception java.util.NoSuchElementException
| at Scanner.throwFor (Scanner.java:937)
| at Scanner.next (Scanner.java:1478)
| at EmptyLineDemo.readLines (#2:23)
| at EmptyLineDemo.main (#2:5)
| at (#3:1)
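The hasNext() change mentioned above can be sketched as follows. This is only an illustration: it reads from a String instead of the names file, the class and method names are invented, and the sample text is inferred from the three lines of output plus a trailing empty line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class HasNextDemo {
    // Reads whitespace-separated name pairs. hasNext() looks for another
    // token, so trailing line terminators do not start a further iteration.
    static List<String> readPairs(String text) {
        List<String> pairs = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            while (scan.hasNext()) {
                String first = scan.next();
                String last = scan.next();
                pairs.add(first + "-" + last);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Assumed file content: three names followed by an empty line.
        String text = "Gurtej Grewal\nCarey Brown\nCampbell Ritchie\n\n";
        int count = 1;
        for (String pair : readPairs(text)) {
            System.out.printf("line %d has %s%n", count++, pair);
        }
    }
}
```

This prints the same three lines as the run above and then stops without an exception. It is not bulletproof: a line with only one name would still make the second next() throw NoSuchElementException.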
. . . hasNextLine() . . . there was no explicit code for handling end of file. The method instead relies on a regular expression that matches up to '$'. . . .
Scanner does lots of things with regular expressions behind the scenes; it even says that in its documentation:-
A simple text scanner which can parse primitive types and strings using regular expressions.
It took me a long time to realise that the adjectives associate to the left, not to the right; it is the text that is described as simple, presumably as opposed to HTML/XML tags.
What's EOF in hex? Wikipedia isn't that helpful on that question, but suggests −1 is sometimes used. I presume there isn't an EOF appended to any of the files. Obviously the OS has a mechanism for reporting an end-of-file situation, even if it doesn't use an EOF character. You can find “$” in the Pattern documentation.
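A small sketch of both points, assuming nothing about Scanner's internals: in Java the −1 is the value read() returns once the data has run out, not a byte stored in the data, and “$” in a Pattern matches at the end of the input. The class and method names are made up, and the pattern ".+$" is a simplified stand-in rather than the expression Scanner actually uses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class EndOfInputDemo {
    // Counts bytes until read() signals end of stream by returning -1.
    static int countBytes(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    // True if at least one character runs up to the end of the input.
    static boolean hasTextBeforeEnd(String remaining) {
        return Pattern.compile(".+$").matcher(remaining).find();
    }

    public static void main(String[] args) {
        byte[] data = "Campbell Ritchie".getBytes(StandardCharsets.US_ASCII);
        // Every byte in the array is counted; no end-of-file byte is among them.
        System.out.println(countBytes(data));
        System.out.println(hasTextBeforeEnd("Ritchie"));
        System.out.println(hasTextBeforeEnd(""));
    }
}
```

The count equals the array length (16), so nothing extra was appended to mark the end. The last two lines print true and false: text with no line terminator after it still matches up to “$”, whereas empty remaining input does not.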