As a computer scientist I have often marvelled at the fact that GCC can make use of the same backend for a number of frontends. Does the book go into any detail about how to make use of this separation, i.e., how to use the frontend w/o the backend or vice-versa? An example would be a frontend for a new language, or substituting other code-processing tools for the code-generating backend. If it is covered, would you say that the book might enable a reader to start something like this? If not, are you aware of other resources that help out in this regard?
I agree with you 100% - GCC is one of the most impressive and largest pieces of open source software I've ever had the pleasure of working with. However, my book is much more of a user's guide and reference for GCC and related tools than an implementor's guide, which is what it sounds like you're looking for. Though you're probably already aware of it, there's an excellent guide to GCC internals from the FSF folks on the page at http://gcc.gnu.org/onlinedocs (the specific doc is at http://gcc.gnu.org/onlinedocs/gccint). This talks about implementing new front ends, etc. There is an online GCC manual called "Using and Porting the GNU Compiler Collection" for older versions of GCC, but I haven't seen an update to that for the 4.x family - I think they've split the manual up into language-specific manuals, plus the internals doc I mentioned earlier.
The GCC manuals themselves are great - I used them for years before writing this book. My book is intended to be a more user-friendly guide to using the GCC compilers and related tools that can also serve as a useful reference for specific questions.
With all due respect to GCC, that's hardly a new concept. In 1986, I had a deal with Lattice (remember when other companies besides Microsoft sold DOS C compilers?). I'd ported the AT&T C++ translator to run as a preprocessor to their Amiga C compiler, but we wanted tighter integration, so they gave me the specs for the workfile that their frontend passed to their backend. Like GCC, Lattice C had common code for the lex/parse/AST part of the compiler and target-specific code for the code generator part.
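To make that split concrete, here is a minimal C++ sketch of the general idea. It is not taken from Lattice C or GCC. One frontend turns text into a small tree, and two interchangeable backends consume that tree. All the names (Node, Frontend, evaluate, emitStackCode) and the toy stack-machine instructions are invented for illustration, and the grammar only handles non-negative integers, '+' and '*' with no spaces or parentheses.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Shared intermediate form: the only thing frontend and backends agree on.
struct Node {
    char op;                      // '+', '*', or 'n' for a number leaf
    int value;                    // meaningful only when op == 'n'
    std::unique_ptr<Node> lhs;
    std::unique_ptr<Node> rhs;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr makeNode(char op, int value, NodePtr lhs, NodePtr rhs) {
    return NodePtr(new Node{op, value, std::move(lhs), std::move(rhs)});
}

// Frontend: source text -> tree. Knows nothing about any backend.
class Frontend {
public:
    explicit Frontend(std::string text) : text_(std::move(text)) {}

    NodePtr parse() {
        NodePtr tree = parseSum();
        assert(pos_ == text_.size());  // toy error handling: all input consumed
        return tree;
    }

private:
    bool peek(char c) const { return pos_ < text_.size() && text_[pos_] == c; }
    bool atDigit() const {
        return pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]));
    }

    NodePtr parseNumber() {
        assert(atDigit());
        int v = 0;
        while (atDigit()) v = v * 10 + (text_[pos_++] - '0');
        return makeNode('n', v, nullptr, nullptr);
    }

    NodePtr parseProduct() {       // '*' binds tighter than '+'
        NodePtr left = parseNumber();
        while (peek('*')) {
            ++pos_;
            NodePtr right = parseNumber();
            left = makeNode('*', 0, std::move(left), std::move(right));
        }
        return left;
    }

    NodePtr parseSum() {
        NodePtr left = parseProduct();
        while (peek('+')) {
            ++pos_;
            NodePtr right = parseProduct();
            left = makeNode('+', 0, std::move(left), std::move(right));
        }
        return left;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Backend 1: walk the tree and compute the value directly.
int evaluate(const Node& n) {
    if (n.op == 'n') return n.value;
    int l = evaluate(*n.lhs);
    int r = evaluate(*n.rhs);
    return n.op == '+' ? l + r : l * r;
}

// Backend 2: emit text for an imaginary stack machine instead.
std::string emitStackCode(const Node& n) {
    if (n.op == 'n') return "push " + std::to_string(n.value) + "\n";
    return emitStackCode(*n.lhs) + emitStackCode(*n.rhs) +
           (n.op == '+' ? "add\n" : "mul\n");
}
```

Parsing once with Frontend("1+2*3").parse() and handing the same tree to evaluate or emitStackCode is the whole point: a new source language only needs a new frontend that builds Node trees, and a new target or analysis tool only needs a new function that walks them.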
We were planning on replacing their C frontend with a C++ frontend. Alas, it came to naught, since I was never able to define a reliable function overloading subsystem. C++ is a real horror on overloading, since unlike Java, you can implicitly invoke constructors and conversion operators as part of the overloaded method call, which can result in some major ambiguities.
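A minimal C++ sketch of the kind of ambiguity meant here, using hypothetical types (Meters, Seconds) and a hypothetical overloaded function (describe) invented for illustration. Both overloads can be reached from a plain double through an implicit constructor, so the compiler has no way to pick one. The small canDescribeImpl helper is also an assumption of this sketch; it just lets the ambiguity be checked at compile time without breaking the build.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

struct Meters {
    double value;
    Meters(double v) : value(v) {}    // non-explicit: double converts implicitly
};

struct Seconds {
    double value;
    Seconds(double v) : value(v) {}   // non-explicit: double converts implicitly
};

std::string describe(Meters)  { return "meters"; }
std::string describe(Seconds) { return "seconds"; }

// describe(3.0);  // would not compile: each overload needs exactly one
//                 // user-defined conversion, so neither is a better match.

// Compile-time probe: true_type if describe(T) resolves to a single overload.
template <class T>
auto canDescribeImpl(int)
    -> decltype(void(describe(std::declval<T>())), std::true_type{});
template <class T>
std::false_type canDescribeImpl(...);

template <class T>
using CanDescribe = decltype(canDescribeImpl<T>(0));

static_assert(!CanDescribe<double>::value,
              "describe(double) is ambiguous between Meters and Seconds");
static_assert(CanDescribe<Meters>::value,
              "describe(Meters) is an exact match");

// Spelling out the conversion removes the ambiguity.
void demo() {
    assert(describe(Meters(3.0)) == "meters");
    assert(describe(Seconds(3.0)) == "seconds");
}
```

Marking the constructors explicit would make describe(3.0) fail for a different reason (no viable overload at all), which is usually the easier error to diagnose. Conversion operators on the argument type add a second route into the same kind of tie, and a frontend has to rank all of those routes correctly for every call.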
Layered compilers in general go back to almost the beginning. The old IBM mainframe COBOL ran in 6 or 7 distinct passes. It had to. Memory back then meant the logic for each pass had to be able to fit in something like 16KB RAM.
Speaking of IBM mainframes, there has been a fair amount of work done on getting GCC to run in IBM's OS/MVS 3.8 - a free ancestor of the modern z/OS that runs in the Hercules mainframe emulator environment. I looked at the GCC internals docs as part of this project and was quite impressed.
GCC frontends exist for C, C++, and Ada that I can list offhand. I believe there's been work on a COBOL frontend, and I'm pretty sure GCC is at the core of the Linux FORTRAN compiler. I'm sure there are others. So the docs are pretty good for that sort of effort.
Customer surveys are for companies who didn't pay proper attention to begin with.
subject: curious about frontend/backend separation in GCC