diff options
Diffstat (limited to 'REORG.TODO/manual/nss.texi')
-rw-r--r-- | REORG.TODO/manual/nss.texi | 734 |
1 files changed, 734 insertions, 0 deletions
diff --git a/REORG.TODO/manual/nss.texi b/REORG.TODO/manual/nss.texi new file mode 100644 index 0000000000..ee70ad309d --- /dev/null +++ b/REORG.TODO/manual/nss.texi @@ -0,0 +1,734 @@ +@node Name Service Switch, Users and Groups, Job Control, Top +@chapter System Databases and Name Service Switch +@c %MENU% Accessing system databases +@cindex Name Service Switch +@cindex NSS +@cindex databases + +Various functions in the C Library need to be configured to work +correctly in the local environment. Traditionally, this was done by +using files (e.g., @file{/etc/passwd}), but other nameservices (like the +Network Information Service (NIS) and the Domain Name Service (DNS)) +became popular, and were hacked into the C library, usually with a fixed +search order. + +@Theglibc{} contains a cleaner solution to this problem. It is +designed after a method used by Sun Microsystems in the C library of +@w{Solaris 2}. @Theglibc{} follows their name and calls this +scheme @dfn{Name Service Switch} (NSS). + +Though the interface might be similar to Sun's version there is no +common code. We never saw any source code of Sun's implementation and +so the internal interface is incompatible. This also manifests in the +file names we use as we will see later. + + +@menu +* NSS Basics:: What is this NSS good for. +* NSS Configuration File:: Configuring NSS. +* NSS Module Internals:: How does it work internally. +* Extending NSS:: What to do to add services or databases. +@end menu + +@node NSS Basics, NSS Configuration File, Name Service Switch, Name Service Switch +@section NSS Basics + +The basic idea is to put the implementation of the different services +offered to access the databases in separate modules. This has some +advantages: + +@enumerate +@item +Contributors can add new services without adding them to @theglibc{}. +@item +The modules can be updated separately. +@item +The C library image is smaller. +@end enumerate + +To fulfill the first goal above, the ABI of the modules will be described +below. For getting the implementation of a new service right it is +important to understand how the functions in the modules get called. +They are in no way designed to be used by the programmer directly. +Instead the programmer should only use the documented and standardized +functions to access the databases. + +@noindent +The databases available in the NSS are + +@cindex ethers +@cindex group +@cindex hosts +@cindex netgroup +@cindex networks +@cindex protocols +@cindex passwd +@cindex rpc +@cindex services +@cindex shadow +@table @code +@item aliases +Mail aliases +@comment @pxref{Mail Aliases}. +@item ethers +Ethernet numbers, +@comment @pxref{Ethernet Numbers}. +@item group +Groups of users, @pxref{Group Database}. +@item hosts +Host names and numbers, @pxref{Host Names}. +@item netgroup +Network wide list of host and users, @pxref{Netgroup Database}. +@item networks +Network names and numbers, @pxref{Networks Database}. +@item protocols +Network protocols, @pxref{Protocols Database}. +@item passwd +User passwords, @pxref{User Database}. +@item rpc +Remote procedure call names and numbers, +@comment @pxref{RPC Database}. +@item services +Network services, @pxref{Services Database}. +@item shadow +Shadow user passwords, +@comment @pxref{Shadow Password Database}. +@end table + +@noindent +There will be some more added later (@code{automount}, @code{bootparams}, +@code{netmasks}, and @code{publickey}). + +@node NSS Configuration File, NSS Module Internals, NSS Basics, Name Service Switch +@section The NSS Configuration File + +@cindex @file{/etc/nsswitch.conf} +@cindex @file{nsswitch.conf} +Somehow the NSS code must be told about the wishes of the user. For +this reason there is the file @file{/etc/nsswitch.conf}. For each +database, this file contains a specification of how the lookup process should +work. The file could look like this: + +@example +@include nsswitch.texi +@end example + +The first column is the database as you can guess from the table above. +The rest of the line specifies how the lookup process works. Please +note that you specify the way it works for each database individually. +This cannot be done with the old way of a monolithic implementation. + +The configuration specification for each database can contain two +different items: + +@itemize @bullet +@item +the service specification like @code{files}, @code{db}, or @code{nis}. +@item +the reaction on lookup result like @code{[NOTFOUND=return]}. +@end itemize + +@menu +* Services in the NSS configuration:: Service names in the NSS configuration. +* Actions in the NSS configuration:: React appropriately to the lookup result. +* Notes on NSS Configuration File:: Things to take care about while + configuring NSS. +@end menu + +@node Services in the NSS configuration, Actions in the NSS configuration, NSS Configuration File, NSS Configuration File +@subsection Services in the NSS configuration File + +The above example file mentions five different services: @code{files}, +@code{db}, @code{dns}, @code{nis}, and @code{nisplus}. This does not +mean these +services are available on all sites and neither does it mean these are +all the services which will ever be available. + +In fact, these names are simply strings which the NSS code uses to find +the implicitly addressed functions. The internal interface will be +described later. Visible to the user are the modules which implement an +individual service. + +Assume the service @var{name} shall be used for a lookup. The code for +this service is implemented in a module called @file{libnss_@var{name}}. +On a system supporting shared libraries this is in fact a shared library +with the name (for example) @file{libnss_@var{name}.so.2}. The number +at the end is the currently used version of the interface which will not +change frequently. Normally the user should not have to be cognizant of +these files since they should be placed in a directory where they are +found automatically. Only the names of all available services are +important. + +@node Actions in the NSS configuration, Notes on NSS Configuration File, Services in the NSS configuration, NSS Configuration File +@subsection Actions in the NSS configuration + +The second item in the specification gives the user much finer control +on the lookup process. Action items are placed between two service +names and are written within brackets. The general form is + +@display +@code{[} ( @code{!}? @var{status} @code{=} @var{action} )+ @code{]} +@end display + +@noindent +where + +@smallexample +@var{status} @result{} success | notfound | unavail | tryagain +@var{action} @result{} return | continue +@end smallexample + +The case of the keywords is insignificant. The @var{status} +values are the results of a call to a lookup function of a specific +service. They mean: + +@ftable @samp +@item success +No error occurred and the wanted entry is returned. The default action +for this is @code{return}. + +@item notfound +The lookup process works ok but the needed value was not found. The +default action is @code{continue}. + +@item unavail +@cindex DNS server unavailable +The service is permanently unavailable. This can either mean the needed +file is not available, or, for DNS, the server is not available or does +not allow queries. The default action is @code{continue}. + +@item tryagain +The service is temporarily unavailable. This could mean a file is +locked or a server currently cannot accept more connections. The +default action is @code{continue}. +@end ftable + +@noindent +The @var{action} values mean: + +@ftable @samp +@item return + +If the status matches, stop the lookup process at this service +specification. If an entry is available, provide it to the application. +If an error occurred, report it to the application. In case of a prior +@samp{merge} action, the data is combined with previous lookup results, +as explained below. + +@item continue + +If the status matches, proceed with the lookup process at the next +entry, discarding the result of the current lookup (and any merged +data). An exception is the @samp{initgroups} database and the +@samp{success} status, where @samp{continue} acts like @code{merge} +below. + +@item merge + +Proceed with the lookup process, retaining the current lookup result. +This action is useful only with the @samp{success} status. If a +subsequent service lookup succeeds and has a matching @samp{return} +specification, the results are merged, the lookup process ends, and the +merged results are returned to the application. If the following service +has a matching @samp{merge} action, the lookup process continues, +retaining the combined data from this and any previous lookups. + +After a @code{merge} action, errors from subsequent lookups are ignored, +and the data gathered so far will be returned. + +The @samp{merge} only applies to the @samp{success} status. It is +currently implemented for the @samp{group} database and its group +members field, @samp{gr_mem}. If specified for other databases, it +causes the lookup to fail (if the @var{status} matches). + +When processing @samp{merge} for @samp{group} membership, the group GID +and name must be identical for both entries. If only one or the other is +a match, the behavior is undefined. + +@end ftable + +@noindent +If we have a line like + +@smallexample +ethers: nisplus [NOTFOUND=return] db files +@end smallexample + +@noindent +this is equivalent to + +@smallexample +ethers: nisplus [SUCCESS=return NOTFOUND=return UNAVAIL=continue + TRYAGAIN=continue] + db [SUCCESS=return NOTFOUND=continue UNAVAIL=continue + TRYAGAIN=continue] + files +@end smallexample + +@noindent +(except that it would have to be written on one line). The default +value for the actions are normally what you want, and only need to be +changed in exceptional cases. + +If the optional @code{!} is placed before the @var{status} this means +the following action is used for all statuses but @var{status} itself. +I.e., @code{!} is negation as in the C language (and others). + +Before we explain the exception which makes this action item necessary +one more remark: obviously it makes no sense to add another action +item after the @code{files} service. Since there is no other service +following the action @emph{always} is @code{return}. + +@cindex nisplus, and completeness +Now, why is this @code{[NOTFOUND=return]} action useful? To understand +this we should know that the @code{nisplus} service is often +complete; i.e., if an entry is not available in the NIS+ tables it is +not available anywhere else. This is what is expressed by this action +item: it is useless to examine further services since they will not give +us a result. + +@cindex nisplus, and booting +@cindex bootstrapping, and services +The situation would be different if the NIS+ service is not available +because the machine is booting. In this case the return value of the +lookup function is not @code{notfound} but instead @code{unavail}. And +as you can see in the complete form above: in this situation the +@code{db} and @code{files} services are used. Neat, isn't it? The +system administrator need not pay special care for the time the system +is not completely ready to work (while booting or shutdown or +network problems). + + +@node Notes on NSS Configuration File, , Actions in the NSS configuration, NSS Configuration File +@subsection Notes on the NSS Configuration File + +Finally a few more hints. The NSS implementation is not completely +helpless if @file{/etc/nsswitch.conf} does not exist. For +all supported databases there is a default value so it should normally +be possible to get the system running even if the file is corrupted or +missing. + +@cindex default value, and NSS +For the @code{hosts} and @code{networks} databases the default value is +@code{dns [!UNAVAIL=return] files}. I.e., the system is prepared for +the DNS service not to be available but if it is available the answer it +returns is definitive. + +The @code{passwd}, @code{group}, and @code{shadow} databases are +traditionally handled in a special way. The appropriate files in the +@file{/etc} directory are read but if an entry with a name starting +with a @code{+} character is found NIS is used. This kind of lookup +remains possible by using the special lookup service @code{compat} +and the default value for the three databases above is +@code{compat [NOTFOUND=return] files}. + +For all other databases the default value is +@code{nis [NOTFOUND=return] files}. This solution gives the best +chance to be correct since NIS and file based lookups are used. + +@cindex optimizing NSS +A second point is that the user should try to optimize the lookup +process. The different service have different response times. +A simple file look up on a local file could be fast, but if the file +is long and the needed entry is near the end of the file this may take +quite some time. In this case it might be better to use the @code{db} +service which allows fast local access to large data sets. + +Often the situation is that some global information like NIS must be +used. So it is unavoidable to use service entries like @code{nis} etc. +But one should avoid slow services like this if possible. + + +@node NSS Module Internals, Extending NSS, NSS Configuration File, Name Service Switch +@section NSS Module Internals + +Now it is time to describe what the modules look like. The functions +contained in a module are identified by their names. I.e., there is no +jump table or the like. How this is done is of no interest here; those +interested in this topic should read about Dynamic Linking. +@comment @ref{Dynamic Linking}. + + +@menu +* NSS Module Names:: Construction of the interface function of + the NSS modules. +* NSS Modules Interface:: Programming interface in the NSS module + functions. +@end menu + +@node NSS Module Names, NSS Modules Interface, NSS Module Internals, NSS Module Internals +@subsection The Naming Scheme of the NSS Modules + +@noindent +The name of each function consists of various parts: + +@quotation + _nss_@var{service}_@var{function} +@end quotation + +@var{service} of course corresponds to the name of the module this +function is found in.@footnote{Now you might ask why this information is +duplicated. The answer is that we want to make it possible to link +directly with these shared objects.} The @var{function} part is derived +from the interface function in the C library itself. If the user calls +the function @code{gethostbyname} and the service used is @code{files} +the function + +@smallexample + _nss_files_gethostbyname_r +@end smallexample + +@noindent +in the module + +@smallexample + libnss_files.so.2 +@end smallexample + +@noindent +@cindex reentrant NSS functions +is used. You see, what is explained above in not the whole truth. In +fact the NSS modules only contain reentrant versions of the lookup +functions. I.e., if the user would call the @code{gethostbyname_r} +function this also would end in the above function. For all user +interface functions the C library maps this call to a call to the +reentrant function. For reentrant functions this is trivial since the +interface is (nearly) the same. For the non-reentrant version the +library keeps internal buffers which are used to replace the user +supplied buffer. + +I.e., the reentrant functions @emph{can} have counterparts. No service +module is forced to have functions for all databases and all kinds to +access them. If a function is not available it is simply treated as if +the function would return @code{unavail} +(@pxref{Actions in the NSS configuration}). + +The file name @file{libnss_files.so.2} would be on a @w{Solaris 2} +system @file{nss_files.so.2}. This is the difference mentioned above. +Sun's NSS modules are usable as modules which get indirectly loaded +only. + +The NSS modules in @theglibc{} are prepared to be used as normal +libraries themselves. This is @emph{not} true at the moment, though. +However, the organization of the name space in the modules does not make it +impossible like it is for Solaris. Now you can see why the modules are +still libraries.@footnote{There is a second explanation: we were too +lazy to change the Makefiles to allow the generation of shared objects +not starting with @file{lib} but don't tell this to anybody.} + + +@node NSS Modules Interface, , NSS Module Names, NSS Module Internals +@subsection The Interface of the Function in NSS Modules + +Now we know about the functions contained in the modules. It is now +time to describe the types. When we mentioned the reentrant versions of +the functions above, this means there are some additional arguments +(compared with the standard, non-reentrant versions). The prototypes for +the non-reentrant and reentrant versions of our function above are: + +@smallexample +struct hostent *gethostbyname (const char *name) + +int gethostbyname_r (const char *name, struct hostent *result_buf, + char *buf, size_t buflen, struct hostent **result, + int *h_errnop) +@end smallexample + +@noindent +The actual prototype of the function in the NSS modules in this case is + +@smallexample +enum nss_status _nss_files_gethostbyname_r (const char *name, + struct hostent *result_buf, + char *buf, size_t buflen, + int *errnop, int *h_errnop) +@end smallexample + +I.e., the interface function is in fact the reentrant function with the +change of the return value, the omission of the @var{result} parameter, +and the addition of the @var{errnop} parameter. While the user-level +function returns a pointer to the result the reentrant function return +an @code{enum nss_status} value: + +@vtable @code +@item NSS_STATUS_TRYAGAIN +numeric value @code{-2} + +@item NSS_STATUS_UNAVAIL +numeric value @code{-1} + +@item NSS_STATUS_NOTFOUND +numeric value @code{0} + +@item NSS_STATUS_SUCCESS +numeric value @code{1} +@end vtable + +@noindent +Now you see where the action items of the @file{/etc/nsswitch.conf} file +are used. + +If you study the source code you will find there is a fifth value: +@code{NSS_STATUS_RETURN}. This is an internal use only value, used by a +few functions in places where none of the above value can be used. If +necessary the source code should be examined to learn about the details. + +In case the interface function has to return an error it is important +that the correct error code is stored in @code{*@var{errnop}}. Some +return status values have only one associated error code, others have +more. + +@multitable @columnfractions .3 .2 .50 +@item +@code{NSS_STATUS_TRYAGAIN} @tab + @code{EAGAIN} @tab One of the functions used ran temporarily out of +resources or a service is currently not available. +@item +@tab + @code{ERANGE} @tab The provided buffer is not large enough. +The function should be called again with a larger buffer. +@item +@code{NSS_STATUS_UNAVAIL} @tab + @code{ENOENT} @tab A necessary input file cannot be found. +@item +@code{NSS_STATUS_NOTFOUND} @tab + @code{ENOENT} @tab The requested entry is not available. + +@item +@code{NSS_STATUS_NOTFOUND} @tab + @code{SUCCESS} @tab There are no entries. +Use this to avoid returning errors for inactive services which may +be enabled at a later time. This is not the same as the service +being temporarily unavailable. +@end multitable + +These are proposed values. There can be other error codes and the +described error codes can have different meaning. @strong{With one +exception:} when returning @code{NSS_STATUS_TRYAGAIN} the error code +@code{ERANGE} @emph{must} mean that the user provided buffer is too +small. Everything else is non-critical. + +In statically linked programs, the main application and NSS modules do +not share the same thread-local variable @code{errno}, which is the +reason why there is an explicit @var{errnop} function argument. + +The above function has something special which is missing for almost all +the other module functions. There is an argument @var{h_errnop}. This +points to a variable which will be filled with the error code in case +the execution of the function fails for some reason. (In statically +linked programs, the thread-local variable @code{h_errno} is not shared +with the main application.) + +The @code{get@var{XXX}by@var{YYY}} functions are the most important +functions in the NSS modules. But there are others which implement +the other ways to access system databases (say for the +password database, there are @code{setpwent}, @code{getpwent}, and +@code{endpwent}). These will be described in more detail later. +Here we give a general way to determine the +signature of the module function: + +@itemize @bullet +@item +the return value is @code{enum nss_status}; +@item +the name (@pxref{NSS Module Names}); +@item +the first arguments are identical to the arguments of the non-reentrant +function; +@item +the next four arguments are: + +@table @code +@item STRUCT_TYPE *result_buf +pointer to buffer where the result is stored. @code{STRUCT_TYPE} is +normally a struct which corresponds to the database. +@item char *buffer +pointer to a buffer where the function can store additional data for +the result etc. +@item size_t buflen +length of the buffer pointed to by @var{buffer}. +@item int *errnop +the low-level error code to return to the application. If the return +value is not @code{NSS_STATUS_SUCCESS}, @code{*@var{errnop}} needs to be +set to a non-zero value. An NSS module should never set +@code{*@var{errnop}} to zero. The value @code{ERANGE} is special, as +described above. +@end table + +@item +possibly a last argument @var{h_errnop}, for the host name and network +name lookup functions. If the return value is not +@code{NSS_STATUS_SUCCESS}, @code{*@var{h_errnop}} needs to be set to a +non-zero value. A generic error code is @code{NETDB_INTERNAL}, which +instructs the caller to examine @code{*@var{errnop}} for further +details. (This includes the @code{ERANGE} special case.) +@end itemize + +@noindent +This table is correct for all functions but the @code{set@dots{}ent} +and @code{end@dots{}ent} functions. + + +@node Extending NSS, , NSS Module Internals, Name Service Switch +@section Extending NSS + +One of the advantages of NSS mentioned above is that it can be extended +quite easily. There are two ways in which the extension can happen: +adding another database or adding another service. The former is +normally done only by the C library developers. It is +here only important to remember that adding another database is +independent from adding another service because a service need not +support all databases or lookup functions. + +A designer/implementer of a new service is therefore free to choose the +databases s/he is interested in and leave the rest for later (or +completely aside). + +@menu +* Adding another Service to NSS:: What is to do to add a new service. +* NSS Module Function Internals:: Guidelines for writing new NSS + service functions. +@end menu + +@node Adding another Service to NSS, NSS Module Function Internals, Extending NSS, Extending NSS +@subsection Adding another Service to NSS + +The sources for a new service need not (and should not) be part of @theglibc{} +itself. The developer retains complete control over the +sources and its development. The links between the C library and the +new service module consists solely of the interface functions. + +Each module is designed following a specific interface specification. +For now the version is 2 (the interface in version 1 was not adequate) +and this manifests in the version number of the shared library object of +the NSS modules: they have the extension @code{.2}. If the interface +changes again in an incompatible way, this number will be increased. +Modules using the old interface will still be usable. + +Developers of a new service will have to make sure that their module is +created using the correct interface number. This means the file itself +must have the correct name and on ELF systems the @dfn{soname} (Shared +Object Name) must also have this number. Building a module from a bunch +of object files on an ELF system using GNU CC could be done like this: + +@smallexample +gcc -shared -o libnss_NAME.so.2 -Wl,-soname,libnss_NAME.so.2 OBJECTS +@end smallexample + +@noindent +@ref{Link Options, Options for Linking, , gcc, GNU CC}, to learn +more about this command line. + +To use the new module the library must be able to find it. This can be +achieved by using options for the dynamic linker so that it will search +the directory where the binary is placed. For an ELF system this could be +done by adding the wanted directory to the value of +@code{LD_LIBRARY_PATH}. + +But this is not always possible since some programs (those which run +under IDs which do not belong to the user) ignore this variable. +Therefore the stable version of the module should be placed into a +directory which is searched by the dynamic linker. Normally this should +be the directory @file{$prefix/lib}, where @file{$prefix} corresponds to +the value given to configure using the @code{--prefix} option. But be +careful: this should only be done if it is clear the module does not +cause any harm. System administrators should be careful. + + +@node NSS Module Function Internals, , Adding another Service to NSS, Extending NSS +@subsection Internals of the NSS Module Functions + +Until now we only provided the syntactic interface for the functions in +the NSS module. In fact there is not much more we can say since the +implementation obviously is different for each function. But a few +general rules must be followed by all functions. + +In fact there are four kinds of different functions which may appear in +the interface. All derive from the traditional ones for system databases. +@var{db} in the following table is normally an abbreviation for the +database (e.g., it is @code{pw} for the password database). + +@table @code +@item enum nss_status _nss_@var{database}_set@var{db}ent (void) +This function prepares the service for following operations. For a +simple file based lookup this means files could be opened, for other +services this function simply is a noop. + +One special case for this function is that it takes an additional +argument for some @var{database}s (i.e., the interface is +@code{int set@var{db}ent (int)}). @ref{Host Names}, which describes the +@code{sethostent} function. + +The return value should be @var{NSS_STATUS_SUCCESS} or according to the +table above in case of an error (@pxref{NSS Modules Interface}). + +@item enum nss_status _nss_@var{database}_end@var{db}ent (void) +This function simply closes all files which are still open or removes +buffer caches. If there are no files or buffers to remove this is again +a simple noop. + +There normally is no return value other than @var{NSS_STATUS_SUCCESS}. + +@item enum nss_status _nss_@var{database}_get@var{db}ent_r (@var{STRUCTURE} *result, char *buffer, size_t buflen, int *errnop) +Since this function will be called several times in a row to retrieve +one entry after the other it must keep some kind of state. But this +also means the functions are not really reentrant. They are reentrant +only in that simultaneous calls to this function will not try to +write the retrieved data in the same place (as it would be the case for +the non-reentrant functions); instead, it writes to the structure +pointed to by the @var{result} parameter. But the calls share a common +state and in the case of a file access this means they return neighboring +entries in the file. + +The buffer of length @var{buflen} pointed to by @var{buffer} can be used +for storing some additional data for the result. It is @emph{not} +guaranteed that the same buffer will be passed for the next call of this +function. Therefore one must not misuse this buffer to save some state +information from one call to another. + +Before the function returns with a failure code, the implementation +should store the value of the local @var{errno} variable in the variable +pointed to be @var{errnop}. This is important to guarantee the module +working in statically linked programs. The stored value must not be +zero. + +As explained above this function could also have an additional last +argument. This depends on the database used; it happens only for +@code{host} and @code{networks}. + +The function shall return @code{NSS_STATUS_SUCCESS} as long as there are +more entries. When the last entry was read it should return +@code{NSS_STATUS_NOTFOUND}. When the buffer given as an argument is too +small for the data to be returned @code{NSS_STATUS_TRYAGAIN} should be +returned. When the service was not formerly initialized by a call to +@code{_nss_@var{DATABASE}_set@var{db}ent} all return values allowed for +this function can also be returned here. + +@item enum nss_status _nss_@var{DATABASE}_get@var{db}by@var{XX}_r (@var{PARAMS}, @var{STRUCTURE} *result, char *buffer, size_t buflen, int *errnop) +This function shall return the entry from the database which is +addressed by the @var{PARAMS}. The type and number of these arguments +vary. It must be individually determined by looking to the user-level +interface functions. All arguments given to the non-reentrant version +are here described by @var{PARAMS}. + +The result must be stored in the structure pointed to by @var{result}. +If there are additional data to return (say strings, where the +@var{result} structure only contains pointers) the function must use the +@var{buffer} of length @var{buflen}. There must not be any references +to non-constant global data. + +The implementation of this function should honor the @var{stayopen} +flag set by the @code{set@var{DB}ent} function whenever this makes sense. + +Before the function returns, the implementation should store the value of +the local @var{errno} variable in the variable pointed to by +@var{errnop}. This is important to guarantee the module works in +statically linked programs. + +Again, this function takes an additional last argument for the +@code{host} and @code{networks} database. + +The return value should as always follow the rules given above +(@pxref{NSS Modules Interface}). + +@end table |