
Book indexing in bad shape – your help needed

Martha Simmons
Ranch Hand

Joined: Jun 24, 2008
Posts: 130
How often do you use book indexes? And why “not very often”, I wonder? One of my reasons is that the probability of an indexer choosing the same word I have in mind isn't very high.

Or is it?

It turns out nobody knows. There are not many index usability studies, and yesterday I used the “Ask a Librarian” service of the Library of Congress to search for any study on what keywords readers use when searching a book index. Their answer:

“We can find no such study.”

Well, not any more!

Today you have a chance to participate in our own, very first study on readers searching book indexes. It's very short. Just one question. Here:
http://www.surveymonkey.com/s/DN3Y6S5

(I would ask it in this esteemed forum, but I didn't want your responses to influence later researchers. ;)

The results, of course, will be posted in this thread for further study...
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 30085
    

Very short as promised. Nice.


Sumit Bisht
Ranch Hand

Joined: Jul 02, 2008
Posts: 329

Interesting study, keep us posted.
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3606
    

Do you want people for whom English is not the first language to take part in this? I'd say that might skew the results (as a foreigner, I'd have to take a look into a dictionary for a few synonyms).
Hesham Gneady
Ranch Hand

Joined: Feb 26, 2007
Posts: 66
You are right ... It depends on the indexer to choose the most suitable words. He must think like the book's reader to choose the right words.
But I still use the index a lot if I do not want to browse the whole book and just want to check a specific topic ... maybe I will be lucky


Hesham
John Jai
Bartender

Joined: May 31, 2011
Posts: 1776
lol... learned a new word, 'plagiarism'. It somewhat resembles this thread
Martha Simmons
Ranch Hand

Joined: Jun 24, 2008
Posts: 130
Martin:
Do you want people for whom English is not the first language to take part in this? I'd say that might skew the results (as a foreigner, I'd have to take a look into a dictionary for a few synonyms).


Almost 12% of the US population is foreign-born, so we foreigners are a legitimate part of the audience. A dictionary is OK if you would realistically use one when using an index.

Hesham: You are right ... It depends on the indexer to choose the most suitable words. He must think like the book's reader to choose the right words.

Yes, and this is a big problem. Traditionally, indexers use the author's terminology, whatever it is, in their indexes, and then (perhaps) add 2-3 alternative terms with a “See” reference pointing to the word the author used. How do they decide whether more synonyms are needed and what they should be? Each indexer makes his/her own subjective decision. And there is no feedback to improve these decisions over time. That's why I started my home-made research.

But I still use the index a lot if I do not want to browse the whole book and just want to check a specific topic ... maybe I will be lucky
It's easier with computer books, because terminology is more restricted. JSP is JSP is JSP. How many synonyms would you use to look up info on JSP? So yeah, we are lucky.
Martha Simmons
Ranch Hand

Joined: Jun 24, 2008
Posts: 130
Intermediate results:

We have 12 responses, 43 non-unique keywords, 23 unique. Overlap (words used by more than one researcher): 4 words.

Pretty amazing. I expected around 10 unique keywords, but 23...!

I'll wait a couple more days and post the list here.
Martha Simmons
Ranch Hand

Joined: Jun 24, 2008
Posts: 130
Results:

We have 15 responses and 49 non-unique search terms (that's 3.27 terms per respondent on average), and 25 unique. Here is the list with frequencies in parentheses:

consequences
conviction
copy copying
copyright (2)
crime
disciplinary
end user
expulsion
fair use
fine
intellectual property
jail
legal penalties for plagiarism
penal code
penalty (5)
plagiarism (10)
plagiarism and the law
proof
punishment (9)
sanctions
sentence, sentencing
terms and conditions
tools
violation
what is plagiarism

Conclusion: readers search indexes with a much broader variety of search terms than indexes offer. How readers are supposed to guess which term is in the index remains a mystery. More experiments are needed...
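
The tallies above can be checked with a few lines of Python. This is only a sketch working from the list as posted: the variable and function names are made up for illustration, and the per-respondent share assumes that no respondent entered the same term twice, which the thread does not confirm. The listed frequencies add up to 47 rather than the reported 49; the raw responses are not shown, so the sketch simply takes the reported total as given for the average.

```python
# Frequencies as posted in the results list; unnumbered terms count once.
REPEATED = {"copyright": 2, "penalty": 5, "plagiarism": 10, "punishment": 9}
SINGLE = [
    "consequences", "conviction", "copy copying", "crime", "disciplinary",
    "end user", "expulsion", "fair use", "fine", "intellectual property",
    "jail", "legal penalties for plagiarism", "penal code",
    "plagiarism and the law", "proof", "sanctions", "sentence, sentencing",
    "terms and conditions", "tools", "violation", "what is plagiarism",
]
RESPONDENTS = 15     # reported in the thread
REPORTED_USES = 49   # reported in the thread (non-unique search terms)

term_counts = {term: 1 for term in SINGLE}
term_counts.update(REPEATED)

unique_terms = len(term_counts)
listed_uses = sum(term_counts.values())
terms_per_respondent = REPORTED_USES / RESPONDENTS


def share_of_respondents(term):
    """Fraction of respondents who used `term`.

    Assumption: each respondent used a given term at most once.
    """
    return term_counts.get(term, 0) / RESPONDENTS


print(unique_terms, listed_uses, round(terms_per_respondent, 2))
for term in sorted(REPEATED, key=REPEATED.get, reverse=True):
    print(f"{term}: {share_of_respondents(term):.0%}")
```

Under that assumption, the share for a single term is a rough measure of how many readers an index would serve by listing just that term.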
John Jai
Bartender

Joined: May 31, 2011
Posts: 1776
Martha Simmons wrote: what is plagiarism

Somebody mistakenly searched in the survey and not in Google
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3606
    

How many of the 15 respondents did not use either penalty or punishment?

Maybe if the book contained these two entries in the index, it might cover most of the searches. I didn't include plagiarism as it does not seem very useful to me; the text as a whole is about plagiarism. If it actually was in the index, it would probably refer to a page where the term was defined.
dennis deems
Ranch Hand

Joined: Mar 12, 2011
Posts: 808
Martha Simmons wrote: Conclusion: readers search indexes with a much broader variety of search terms than indexes offer. How readers are supposed to guess which term is in the index remains a mystery. More experiments are needed...

I think Martin's observation is most astute: a book's content defines a context in which the utility of any indexed term must be weighed. Imagine a vegetarian cookbook that indexed the word "vegetarian".

There is considerable confusion in the public consciousness between the quite distinct concepts of plagiarism and copyright violation. It is no surprise to see evidence of this confusion in the survey results, but the fact that confusion is widespread does not mean an index should be guided by it.
Martha Simmons
Ranch Hand

Joined: Jun 24, 2008
Posts: 130
John Jai: Somebody mistakenly searched in the survey and not in Google

Actually, this is a very valid point: I would be surprised if our Google-induced searching habits did not affect our index-searching habits. And unfortunately, not in a positive direction. When searching on Google we can be sloppy: use whatever words come to mind and hope that we will find better, more precise terms while browsing the results. Indexes don't work this way, at least not yet. :-)

Martin Vajsar: How many of the 15 respondents did not use either penalty or punishment?

Five. "Punishment" was an artifact of how I formulated the question, though. Whichever word I used in the question would have shown up more often than it normally would.

Martin Vajsar: Maybe if the book contained these two entries in the index, it might cover most of the searches.

That would be a good solution – if only indexers knew which terms readers choose most often. I have been looking for statistics of this kind, but so far couldn't find much besides the "Google Insights for Search" service (http://www.google.com/insights/search/). I asked the Safari online library whether they have search logs available publicly (or on other conditions), and they answered "no". Me keeps searching...

Martin Vajsar: I didn't include plagiarism as it does not seem very useful to me; the text as a whole is about plagiarism. If it actually was in the index, it would probably refer to a page where the term was defined.

Dennis Deems: I think Martin's observation is most astute: a book's content defines a context in which the utility of any indexed term must be weighed. Imagine a vegetarian cookbook that indexed the word "vegetarian".


It *is* astute! Indeed, indexers are warned against indexing the whole book under a "metatopic" entry. On the other hand, there is a study that found that readers use the metatopic entry as a table of contents – as our little study confirms. IMHO, we should use this usability bug... feature to *design* a metatopic entry as a TOC: in our case, to put whatever term an indexer chose (let's say "sanctions") under "plagiarism":

plagiarism
    etymology
    definition
    ...
    sanctions

In our case 10 out of 15 participants used "plagiarism", which gives us pretty good coverage.

Dennis Deems: There is considerable confusion in the public consciousness between the quite distinct concepts of plagiarism and copyright violation. It is no surprise to see evidence of this confusion in the survey results, but the fact that confusion is widespread does not mean an index should be guided by it.

Guided -- no, but I think it's useful to know about it and to help these readers with a cross-reference. Something like this:

copyright violation
...
...See also plagiarism
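
The two index shapes sketched above (a metatopic entry used as a TOC, plus cross-references) could be modelled roughly as follows. This is a minimal sketch, not how any indexing tool works: the data structures, the `look_up` function, and the "See" redirects from "penalty" and "punishment" to "sanctions" are all hypothetical and chosen only to match the terms discussed in this thread.

```python
# Hypothetical miniature index built from the thread's examples.
ENTRIES = {
    "plagiarism": ["etymology", "definition", "sanctions"],
    "copyright violation": [],
}
# "See" references: a reader's term redirects to (heading, subentry).
SEE = {
    "penalty": ("plagiarism", "sanctions"),
    "punishment": ("plagiarism", "sanctions"),
}
# "See also" references: a heading points at related headings.
SEE_ALSO = {"copyright violation": ["plagiarism"]}


def look_up(term):
    """Resolve a reader's search term to an index heading, or None."""
    key = term.strip().lower()
    if key in ENTRIES:
        return {
            "heading": key,
            "subentries": ENTRIES[key],
            "see_also": SEE_ALSO.get(key, []),
        }
    if key in SEE:
        heading, subentry = SEE[key]
        return {"heading": heading, "subentries": [subentry], "see_also": []}
    return None  # the reader's term is not covered by the index


for query in ("Punishment", "copyright violation", "jail"):
    print(query, "->", look_up(query))
```

A term such as "jail" falls through to `None`, which is the situation the survey is trying to measure: the more reader terms land in `SEE` or `ENTRIES`, the better the coverage.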



-------------------------
Olason, Susan C. 2000. "Let's get usable! Usability studies for indexes." The Indexer 22(2): 91-95. http://www.asindexing.org/files/DTTF/Lets_get_usable.pdf
dennis deems
Ranch Hand

Joined: Mar 12, 2011
Posts: 808
This work you're doing is fascinating. I'll be interested to see how it develops. Thanks for sharing with us!
Matthew Brown
Bartender

Joined: Apr 06, 2010
Posts: 4343
    

Martha Simmons wrote: It *is* astute! Indeed, indexers are warned against indexing the whole book under a "metatopic" entry. On the other hand, there is a study that found that readers use the metatopic entry as a table of contents – as our little study confirms. IMHO, we should use this usability bug... feature to *design* a metatopic entry as a TOC: in our case, to put whatever term an indexer chose (let's say "sanctions") under "plagiarism":

plagiarism
    etymology
    definition
    ...
    sanctions


That makes sense to me. From a purely personal perspective, I've tended to find indexes most useful in two cases:
- I'm looking for something very specific and unambiguously named (especially a technical term)
- I'm able to quickly identify a relevant subsection of the index (not necessarily a metatopic, but a major concept) that's small enough for me to browse effectively.


And I'd agree with Dennis - very interesting work. My partner's recently been creating an index for her book - I think she'll be interested as well.
 
 