Article: 4779 of comp.programming.threads From: maslen@best.com (Thomas M. Maslen) Newsgroups: comp.programming.threads Subject: Re: gethostbyXXXX() and threads Date: 1 May 1997 01:37:13 -0700 Organization: BEST Internet Communications Message-ID: My officemate (who may wish to remain anonymous) and I specified and implemented the getXXXX_r() routines that went into SunOS 5.3 (Solaris 2.3), so I have a fading recollection (from 1991 or 1992) of how and why things were done then. Some of this may have changed since, and I haven't kept close tabs on it. First, we checked with the Sun folks who went to UI (Unix International) and POSIX meetings, and discovered that all we had to go on were: - the SVID (System V Interface Definition), which defined pre-MT interfaces in painful detail but had nothing to say about MT, - a UI (Unix International) document about MT-safe APIs for libc, which defined libc routines like getpwnam_r() and its friends but didn't say anything about libsocket or libnsl, - if there was a similar POSIX document then we had it, but offhand I don't remember whether POSIX had got as far as libc, much less libsocket or libnsl. I took some time out to curse - whoever it was that originally defined getpwnam() et al to return a pointer-to-static, - the same person, presumably, who defined the setpwent/getpwent/endpwent enumeration routines, - whoever gave us the global variable 'errno', - whoever it was at Berkeley who slavishly followed suit when they created gethostbyname() et al and h_errno (can we blame this one on Bill Joy?). We also noted that whoever defined the transport-independent naming API, netdir_getbyname() and netdir_getbyaddr(), had at least got the no-pointers- to-static part right, even if the interface wasn't exactly beautiful. It was probably at about this time that I began to yearn very strongly for a useful programming language with garbage collection built in. Then we sat down to define MT-safe interfaces to gethostbyname() et al that would, as much as possible, follow the intent of the getpwnam_r() definitions in the documents we had, i.e. (1) Don't attempt to make the existing interface MT-safe by playing games with TSD (Thread-Specific Data): instead, leave that interface alone and introduce a new _r interface [I happen to agree with this... I think that returning pointer-to-TSD is only marginally better than returning pointer-to-static -- it still means that, in a given thread, all bets are off if I call a second routine without doing a deep copy of the results from the first] (2) The _r routine should take three extra arguments: - a pointer to space for a (struct XXX), - a pointer to a buffer for all the stuff that the struct XXX might point to, - the size of the buffer [There are obviously alternatives, but at least this approach keeps the performance folks from freaking out] (3) The routine would set a TSD errno, and the return value of the function would be either NULL or a pointer to struct XXX [This is pretty silly. The caller knows what the pointer-to-struct is because it was one of the arguments in (2), and the return value should have been an integer 0 or errno. However, that's what getpwnam_r() and friends did, and we slavishly followed suit] (4) Nobody had said anything about h_errno, so we mumbled something about the hobgoblin of small minds and, instead of creating a TSD h_errno [bleah], we made it an 'out' parameter of gethostbyname_r() and gethostbyaddr_r() [h_errno is pretty cheesy anyway, maybe we should have been more bold and just nuked it] (5) The enumeration routines get a getXXXent_r(), but the enumeration state is still process-global rather than, say, thread-specific or held in a client-controlled state object [This makes the enumeration routines pretty useless, but they were pretty useless already -- they stopped making sense the minute that name-service information was put in, say, DNS rather than /etc/hosts. Under the covers, our implementation uses a state object, so it could easily adapt to a thread-specific or client-controlled state API] ... which meant that, for better or worse, we ended up with interfaces like struct hostent *gethostbyname_r(const char *name, struct hostent *result, char *buffer, int buflen, int *h_errnop); and struct netent *getnetbyname_r(const char *name, struct netent *result, char *buffer, int buflen); Other people clearly made different, maybe more sensible, decisions, and the POSIX committee changed its mind a few times (which led to a brief panic late in the development cycle of 5.3 or 5.4, I forget which). Ben Self writes: [...] >I think to some extent that the interfaces in nsl/socket are second >class citizens. Amen to that. >In practice, every platform that I have worked on has solved this >problem differently. Most just specify a XXX_r() function which takes >its buffers in a format that requires them to rewrite the least amount >of code. Actually, it would have been easier for us to implement pointer-to-TSD; instead, the client-supplied buffers meant that we pretty much rewrote the code from scratch (partly because it meant that, for the first time in the history of this code, the code actually did some bounds-checking). >A few have really fixed the interface by using thread specific >data or another thread safe mechanism. I guess we disagree on whether pointer-to-TSD is a feature or a bug. >In all cases, the documentation of what was done and why is in short supply. Yup. >Good luck, I for one would love to see this standardized. Likewise. However, before sinking too much time into trying to standardize MT-safe versions of the old get{host,net,service...} routines, it might be a good idea to make sure that future API work is headed in the right direction. I'm thinking of RFC 2133 Section 6, particularly the use of the global variable _res.options, which (apart from being pretty BIND-specific) does seem to tie all threads in the process to a single policy. Section 6.3 of that RFC also mentions POSIX 1003.1g and the getaddrinfo() function, which seems to be a rather more MT-friendly API. (Mr Stevens presumably knows all this, since he's one of the authors of the RFC, but I certainly didn't). In the short term, does anyone know whether there's likely to be an MT-hot version of the BIND libresolv code? Last time I looked, both the code and the APIs were pretty MT-hostile. Thomas Maslen maslen@pobox.com (Maybe I'll just refuse to write MT networking code in anything except Java...)