Apparently, this is either a very hard, or very boring, question. :)

 
Stevens Miller
Bartender
Posts: 1464
32
Netbeans IDE C++ Java Windows
So, I posted a question at Dream in Code, but no one has tried to answer it. Thinking it was more appropriate to StackOverflow, I directed any further interest to a new, more detailed version of it there. I must have named it really badly, or something, because, a whole day later, it's only had like twelve views. What kind of StackOverflow question only gets twelve views?

In short, I've exported the undecorated form of a C++ function from a DLL via a module definition file, and am seeing that the decorated form is what actually appears in the import library (all under Visual C++). Microsoft says this does not work, but it does, and I'd like to hear from someone who either knows more about it than I do, or, in the alternative, will look over my analysis and tell me if they think it checks out.

The full item is visible at my post at SO.

I know multi-posting the same question on different forums is frowned upon, but I'm getting no replies. Hoping someone here will give it a look and let me know if they think I'm on the right track, or have gone off the rails.
 
Steffe Wilson
Ranch Hand
Posts: 165
12
I haven't had time to study your posts in detail, but in general terms you have to be aware of specifics like this when linking object modules generated by compilers for different languages, or even by compilers for the same language from different vendors. Each has its own algorithm for generating decorations. When the object modules (including those in DLLs) all come from the same compiler, reference resolution is usually taken care of by the compiler and linker. My guess is that the MS documentation needs to add this clarification.
 
Tim Holloway
Saloon Keeper
Posts: 27762
196
Android Eclipse IDE Tomcat Server Redhat Java Linux
Having actually implemented a C++ product, I'll hazard a guess.

"Decoration" is done to support overloading. Since the linker must choose one and only one of the possible functions named "xfunction", the actual function names seen at the linker level are the decorated function names. The compiler is responsible for taking the undecorated (overloaded) method name and resolving to one of the potential decoration names. Which was no small task back in the early days of the language!

However, unlike Java signatures, which are always decorated to the hilt, in C++ there was a possibility - I think even a desirability - that there be a candidate function whose name translated undecorated. The exact details of why I did this I no longer remember, although I think it was mostly to make it easier to connect C and C++ code. It wasn't a requirement of the language; it was strictly implementation, so it was very implementation-dependent. There may have even been cases where a pragma could specifically map an overloaded method name to an undecorated name.

So in short, it's a feature that often worked but, unless you were coding close to the implementation details, not one to rely on.
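
To make the overloading point concrete, here is a minimal sketch. The function name twice is made up for illustration, and the decorated names in the comments are what Visual C++ would typically emit for these signatures; other compilers use entirely different schemes.

```cpp
// Two overloads share one source-level name, so the compiler has to hand the
// linker two distinct symbols. Visual C++ would typically produce something
// like ?twice@@YAHH@Z for the int version and ?twice@@YANN@Z for the double
// version (an assumption about one compiler, not a language rule).
int twice(int x) { return 2 * x; }
double twice(double x) { return 2.0 * x; }

// The caller writes the undecorated name; the compiler picks the overload
// from the argument type and emits a reference to the matching symbol.
int twice_of_twenty_one() { return twice(21); }
```

The linker never sees the name "twice" on its own here, only the two decorated forms.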
 
Stevens Miller
Bartender
Posts: 1464
32
Netbeans IDE C++ Java Windows

Steffe Wilson wrote:...you have to be aware of specifics like this when linking object modules generated by different language compilers, or even same language compilers from different vendors. Each will have their own algorithm for generating decorations.


Quite right, Steffe. Decoration (or, as it is perhaps more accurately sometimes called, "name mangling") is an implementation detail that is not part of the C++ language specification. No two compilers necessarily have to do it the same way. MS does make it clear that, if you want your DLL code to link successfully with your client code, you have to use the same compiler on both, to be sure the mangled names match.

Alternatively, if your compiler supports it, you can dictate that a symbol not be mangled at all. In Visual C++, you do this with the extern "C" prefix in both your server's and client's declarations. Works great, unless you want an overloaded version of the symbol, since overloading requires mangling to avoid ambiguity. You can drop the extern "C" usage in your server and export each overloaded version under an alias, with the alias being a symbol still declared as extern "C" in your client. But that means you have to use some under-the-hood tool like a map file or dumpbin.exe to reveal the decorated name, leading to the inclusion in your module definition file of horrors like this:

In Visual C++, you can get around this completely by prefixing your declarations with __declspec(dllexport) and __declspec(dllimport), though you can't export a symbol privately anymore if you do that (that is, in the module definition file, you can add PRIVATE after the symbol, and it will be available to clients, but they will have to know the symbol name and look it up at run-time, as it will not be included in the import library available at link time).

Those modifiers are, of course, unique to Visual C++. How other compilers deal with this is up to their vendors. The issue isn't (I don't think) directly addressed in the C++ language spec.
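
As a rough illustration of the __declspec route, a header shared by server and client often looks something like the sketch below. MYLIB_API, MYLIB_BUILD, and area are hypothetical names, and the macro expands to nothing on compilers other than Visual C++ so that the snippet still builds there.

```cpp
// Hypothetical shared header content. The DLL project defines MYLIB_BUILD;
// client projects do not, so they see dllimport instead of dllexport.
#define MYLIB_BUILD  // pretend this translation unit is the DLL itself

#if defined(_MSC_VER)
#  if defined(MYLIB_BUILD)
#    define MYLIB_API __declspec(dllexport)
#  else
#    define MYLIB_API __declspec(dllimport)
#  endif
#else
#  define MYLIB_API  // not Visual C++: the keyword does not apply
#endif

// Both overloads are exported under their decorated names. No module
// definition file and no extern "C" are involved.
MYLIB_API int area(int side);
MYLIB_API int area(int width, int height);

int area(int side) { return side * side; }
int area(int width, int height) { return width * height; }
```

Because the compiler handles the export itself, overloads need no aliases, at the cost of the PRIVATE option mentioned above.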

Stuff like this is part of why I try to avoid Microsoft's compiler products. Their API is simply littered with MS-only constructs that are not only unportable, they are documented solely by MS and its satellite community. Some MS documentation is pretty good. But, as here, sometimes you find out it is just plain wrong.

Oh well, I had to look into this as part of boning up on DLLs and COM servers in general, and, to do that, I had to dig into a twenty-year-old copy of Charles Petzold's venerable "Programming Windows 95." Most likely, no one developing code with modern tools would even encounter this issue.
 
Stevens Miller
Bartender
Posts: 1464
32
Netbeans IDE C++ Java Windows

Tim Holloway wrote:...in C++ there was a possibility - I think even a desirability - that there be a candidate function whose name translated undecorated. The exact details of why I did this I no longer remember, although I think it was mostly to make it easier to connect C and C++ code.


Your memory is pretty close, Tim. To avoid having to deal with decorated names, if your exported function is not overloaded, you can use extern "C" in both your server and client code declarations, and the compiler will not decorate the name in either place. In that situation, you can use the undecorated name in your list of exported symbols, since that's the form of the name the compiler will actually generate.

What I've discovered is that, if you don't use extern "C", it all still works anyway, because the linker knows enough to translate the undecorated name you include in your list of exported symbols to its decorated form (provided there are no overloaded versions), and then include that in the import library that the client code will use to resolve its reference to the exported function. Since the compiler will also decorate the client reference, it will be the decorated form that the linker needs to find in the import library and, indeed, that's what will be there. What MS doesn't tell us is that this translation from undecorated to decorated form is done by the linker when it uses a module definition file to create the import library. Instead, they say you must use the decorated form. That's just not true, and I'm kind of astonished that they don't document this (I think) very useful feature.
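
The arrangement described above might look like the following sketch. The names Server and Add are hypothetical, the module definition file is shown inside a comment because it is not C++, and the linker behaviour noted is what this post reports for Visual C++, not something the snippet itself proves.

```cpp
// --- server.cpp (built into Server.dll) ------------------------------------
// No extern "C" and no __declspec, so the compiler emits a decorated symbol.
int Add(int a, int b) { return a + b; }

// --- Server.def -------------------------------------------------------------
//   LIBRARY Server
//   EXPORTS
//       Add
// Only the undecorated name is listed. As reported in the post, the linker
// matches it to the single decorated Add and puts that decorated form into
// the import library.

// --- client.cpp (links against the import library) --------------------------
// The client's declaration is plain C++ too, so its reference is decorated
// the same way and resolves against the import library.
int Add(int a, int b);

int sum_of_three(int a, int b, int c) { return Add(Add(a, b), c); }
```

If a second overload of Add existed, the undecorated entry in the .def file would be ambiguous, which is the proviso noted above.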
 
Tim Holloway
Saloon Keeper
Posts: 27762
196
Android Eclipse IDE Tomcat Server Redhat Java Linux
Yep, it's been a while. The term "name mangling" is an old familiar one.

The extern "C" qualification dates all the way back to the beginning. Its primary use was to allow C++ code to reference C resources (such as stdio.h) without the compiler attempting to mangle the C function name references and thereby fail to resolve them come linker time. It also works in the other direction, though, so C code can call C++ code without knowing the specific mangling algorithm of the C++ development system in question.
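
A common way to get both directions from one header is sketched below. The header name shapes.h and the function shape_area are made up for the example.

```cpp
// Hypothetical header "shapes.h", usable from both C and C++ sources.
#ifdef __cplusplus
extern "C" {   // C++ compilers: give these declarations C linkage
#endif

int shape_area(int width, int height);

#ifdef __cplusplus
}              // end of the extern "C" block
#endif

// C++ implementation file. The declaration above already has C linkage, so
// this definition inherits it and the symbol is not mangled; a C caller can
// link to it without knowing anything about the C++ compiler's scheme.
int shape_area(int width, int height) { return width * height; }
```

A C compiler never defines __cplusplus, so it sees only the bare prototype, while a C++ compiler sees it wrapped in extern "C".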
 
Stevens Miller
Bartender
Posts: 1464
32
Netbeans IDE C++ Java Windows

Tim Holloway wrote:It also works in the other direction, though, so C code can call C++ code without knowing the specific mangling algorithm of the C++ development system in question.


That's where I'm encountering it. To write a COM server, you must export a few mandatory functions, and all must be callable from ancient system code that expects them to have fixed names. The mangled versions would never do, as the system wouldn't know what they were. Interestingly, a COM server is a DLL, and DLLs can have their entry symbol set by the linker. In the same way that C programs conventionally start with a "main" function, DLLs start with a "DllMain" function. But, in a bizarre inversion of where it makes sense to hard-code this sort of thing, and where it doesn't, MS says its references to a "DllMain" routine are all "place holders," and that the actual symbol for entry to a DLL is set by the vendor of the code your DLL wants to link to (or something like that; it really makes no sense). The truth is that MS conventionally calls a function literally named "DllMain" when it loads or unloads your DLL, and when it initializes or uninitializes your DLL on any threads other than the loading thread. This is all pretty silly, though, since they also advise against doing much of anything in those calls, for reasons that are too tedious to get into here. What is relevant is that they say you can name the entry point anything you want, with a linker option. Okay, that could even be a decorated name, if you knew what it was. But, if you want your DLL to be a COM server, you must implement a few symbols like, for example, "DllCanUnloadNow." Those cannot be mangled, as the system expects them to be exported under fixed names. There's no reason why they couldn't have had, say, DllMain return the entry points to functions that do what DllCanUnloadNow (and the others) do, and then you'd be free to name them whatever you want, and a compiler would be free to mangle them as much as it wanted.

But, no, those symbols are set in stone, so extern "C", a Microsoft-specific construct, is here to stop the C++ compiler from doing its job (and to stop us from overloading their precious reverse-API calls).
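
For readers who have not seen these entry points, here is a bare-bones sketch of the two functions named above. The live-object counter is a hypothetical stand-in for whatever bookkeeping a real server does, and the typedefs in the #else branch are placeholders so the snippet also compiles off Windows; the real definitions come from <windows.h>.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins (assumptions) so the sketch compiles without the Windows SDK.
typedef int BOOL;
typedef unsigned long DWORD;
typedef void* HINSTANCE;
typedef void* LPVOID;
typedef long HRESULT;
#define WINAPI
#define STDAPI extern "C" HRESULT
#define TRUE 1
#define S_OK ((HRESULT)0L)
#define S_FALSE ((HRESULT)1L)
#define DLL_PROCESS_DETACH 0
#define DLL_PROCESS_ATTACH 1
#define DLL_THREAD_ATTACH 2
#define DLL_THREAD_DETACH 3
#endif

#include <atomic>

// Hypothetical count of live COM objects and outstanding server locks.
static std::atomic<long> g_liveObjects{0};

// Called by the loader when the DLL is loaded or unloaded, and when threads
// attach or detach. Following the usual advice, it does almost nothing.
extern "C" BOOL WINAPI DllMain(HINSTANCE, DWORD reason, LPVOID) {
    switch (reason) {
    case DLL_PROCESS_ATTACH:  // DLL loaded into the process
    case DLL_THREAD_ATTACH:   // a new thread started
    case DLL_THREAD_DETACH:   // a thread is exiting
    case DLL_PROCESS_DETACH:  // DLL is being unloaded
        break;
    }
    return TRUE;
}

// COM looks this function up by its fixed, unmangled name, so it has C
// linkage. It reports S_OK when nothing is still using the server.
STDAPI DllCanUnloadNow(void) {
    return g_liveObjects.load() == 0 ? S_OK : S_FALSE;
}
```

In a real project the COM entry points are usually also listed in the module definition file, typically with PRIVATE, so that they are exported under exactly these names.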

I wonder how much of C++'s loss of market share is due to MS's cumbersome implementation, and related issues like this one.
 
Tim Holloway
Saloon Keeper
Posts: 27762
196
Android Eclipse IDE Tomcat Server Redhat Java Linux
DLLs. Another thing I haven't dealt with in an æon.

Actually, a pure DLL doesn't need methods at all - it can be data-only. Then there are generic DLLs, which can have any function names they want; runnable DLLs (RUNDLL32, a blight upon the Universe), which I think is where the DllMain requirement comes in; OLE/COM/ActiveX DLLs, which compound the issue; and worse things.

But Microsoft didn't invent the 'extern "C"' construct. I was using it to link C++ for the Amiga to the Amiga's Exec and AmigaDOS OS functions before Microsoft even got into the C++ business (roughly 4 years later, I think it was). And, as I said before, it also allowed C++ to use the venerable functions from the original Unix/C libraries.

Microsoft was actually the 4th C++ vendor I ever dealt with. I implemented C++ for the Amiga (sold as Lattice/SAS C++ for Amiga) under license from AT&T Bell Labs. Then I did a stint with one of the first Windows/DOS C++ compilers, a product whose name I forget, though I remember calling Ireland for support. That was for Windows version 2. Then Windows 3 and Borland C++ with OWL. Finally Microsoft C++/MFC. From there, I ended up on OS/2 (IBM compiler) and Linux (g++) before deciding I liked Java better.
 
Stevens Miller
Bartender
Posts: 1464
32
Netbeans IDE C++ Java Windows

Tim Holloway wrote:Actually, a pure DLL doesn't need methods at all - it can be data-only.


Well, almost. If you don't include a "DllMain" of your own, the linker supplies one for you, as even a "data only" DLL will have that function called when it is loaded (I think).

Tim Holloway wrote:But Microsoft didn't invent the 'extern "C"' construct. I was using it to link C++ for the Amiga to the Amiga's Exec and AmigaDOS OS functions before Microsoft even got into the C++ business...


That's quite interesting. In fact, I was wrong: extern "C" is part of the formal language specification, in Section 7.5, "Linkage specifications." Amazing that Bjarne Stroustrup's book doesn't mention it (or, at least, I can't find it in his book anywhere).

Yes, DLLs have a long and inglorious history. Up to Windows 3.1, the alleged ".exe" files their compiler products produced were all actually DLLs. That's why shared memory between applications was so dangerously easy to use: they were all really DLLs loaded by the same application: win.exe.
 