.\" $Id: xclate.man,v 2.38 2012/10/07 16:00:53 ksb Exp $
.\" by Kevin Braunsdorf
.\" $Compile: Display%h
.\" $Display: groff -tbl -Tascii -man %f | ${PAGER:-less}
.\" $Display(*): groff -tbl -T%s -man %f
.\" $Install: %b -mDeinstall %o %f && cp %f $DESTDIR/usr/local/man/man1/xclate.1
.\" $Deinstall: ${rm-rm} -f $DESTDIR/usr/local/man/[cm]a[nt]1/xclate.1*
.TH XCLATE 1 LOCAL
.SH NAME
xclate - collate output from parallel tasks
.SH SYNOPSIS
.ds PN "xclate
\fI\*(PN\fP \fB\-m\fP [\fB\-AdnQrsv\fP] [\-\fIdepth\fP]
[\fB\-e\fP\~\fIvar=value\fP]
[\fB\-H\fP\~\fIhr\fP]
[\fB\-i\fP\~\fIinput\fP]
[\fB\-L\fP\~\fIcap\fP]
[\fB\-N\fP\~\fInotify\fP]
[\fB\-O\fP\~\fIoutput\fP]
[\fB\-T\fP\~\fItitle\fP]
[\fB\-u\fP\~\fIunix\fP]
[\fB\-W\fP\~\fIwidow\fP]
[\fIutility\fP]
.br
\fI\*(PN\fP [\fB\-nQsvw\fP] [\fB\-IEDY\fP] [\-\fIdepth\fP]
[\fB\-e\fP\~\fIvar=value\fP]
[\fB\-H\fP\~\fIhr\fP]
[\fB\-L\fP\~\fIcap\fP]
[\fB\-T\fP\~\fItitle\fP]
[\fB\-u\fP\~\fIunix\fP]
[\fIxid\fP]\~[\fIclient\fP]
.br
\fI\*(PN\fP \fB\-h\fP
.br
\fI\*(PN\fP \fB\-V\fP
.br
\fI\*(PN\fP \fB\-P\fP [\fB\-nqQv\fP] [\-\fIdepth\fP] [\fB\-u\fP\~\fIunix\fP]
.SH DESCRIPTION
The UNIX pipe model for input/output redirection is simple and elegant for
most single-thread tasks, and scales well for some multi-thread applications.
New issues arise when the speed of computation far exceeds the speed of
an output device.
Processes block quickly, output from one process gets mixed into the output
from another, and overall throughput is limited to the slowest tasks using
the slowest device.
.PP
\fBXclate\fP is an attempt to recover from these issues.
Processes may output freely to a buffered stream of at least \fIcap\fP bytes
before they might block (on output).
The output from each task is flushed completely before a new task is
allowed to output.
Overall throughput is greater because tasks are more likely to be cleared
in the order they finish than in the order they started
(thus a slow task may run with deferred output while several
generations of quicker tasks complete).
.PP
To accomplish this \fBxclate\fP has three modes.
.TP
Manager mode
Under \fB\-m\fP it forks a \fIutility\fP with a special environment,
then arbitrates use of the original \fIstdout\fP channel for any
descendant processes created by \fIutility\fP which request access.
The default \fIutility\fP in this mode is a shell, either the one specified
by $SHELL, or \fI/bin/sh\fP.
A semi-permanent instance may be created when the \fIutility\fP is
specified as \fB:\fP (a single colon), see \fB\-Q\fP and \fB\-u\fP.
.TP
Filter mode -- merge output
Without \fB\-m\fP or any \fIclient\fP the program reads from \fIstdin\fP
at most \fIcap\fP bytes then waits for exclusive access to the
manager's original \fIstdout\fP.
It flushes all the buffered input, then pumps the remainder of its
\fIstdin\fP to the acquired \fIstdout\fP.
.TP
Command mode -- filter and merge output
Given a \fIclient\fP, and without \fB\-m\fP, the program buffers \fIstdin\fP,
as it would without a command, waiting for exclusive access to the
manager's \fIstdout\fP.
When it gains access to the exclusive output it starts the given
\fIclient\fP with the buffered stream as \fIstdin\fP and the
manager's \fIstdout\fP, then pumps the remainder of its \fIstdin\fP to the
new process.
When any options in the set \fB\-I\fP, \fB\-E\fP, \fB\-D\fP are set
this behavior is modified quite a bit.
.PP
In either of the last two modes the positional parameter \fIxid\fP is
the identification for this stream, usually \fBxapply\fP's %1 or %u escape
or by default \fI\*(PN\fP's process id.
Other useful values are the name of a host, directory, or device which is
the topic of the output.
Specify \fB\-T\fP and \fB\-H\fP to include a title and/or horizontal break
around named sections.
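.PP
The buffer-then-flush behavior of filter mode can be approximated without
\fBxclate\fP, which may help when reading the modes above.
The sketch below is an emulation only: it assumes temporary files and a
\fBmkdir\fP(1) lock in place of the manager and its socket, and the names
\fIcollate\fP, \fIdemo\fP and \fIwork\fP are hypothetical, not part of
\fBxclate\fP.
.nf
```shell
# Emulation only: xclate itself arbitrates through a manager process and a
# unix domain socket; this sketch substitutes temp files and a mkdir lock.

# Run one task with its output held in a private buffer, then copy the
# buffer to the shared stdout while holding an exclusive lock.
collate() {
	xid=$1
	shift
	"$@" >"$work/$xid.out" 2>&1
	# mkdir is atomic, so only one task at a time can hold the lock
	until mkdir "$work/lock" 2>/dev/null
	do
		sleep 1
	done
	echo "== $xid"
	cat "$work/$xid.out"
	rmdir "$work/lock"
}

# Start a slow and a fast task together (hypothetical example tasks).
demo() {
	work=$(mktemp -d) || return 1
	collate slow sh -c 'echo s1; sleep 2; echo s2' &
	collate fast sh -c 'echo f1; echo f2' &
	wait
	rm -rf "$work"
}

demo
```
.fi
.PP
Each task's output arrives as one contiguous section, and the fast task
should report before the slow one even though it was started second.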
.PP
The \fB\-P\fP option allows \fBxapply\fP to send synthetic acknowledgments
for commands that were suppressed by the receipt of a \fBUSR1\fP signal.
In that case \fIstdin\fP is taken as a list of \fIuid\fP,\fIstatus\fP pairs
to be delivered to the enclosing \fIxclate\fP's notification stream.
The \fB\-r\fP option provided to that diversion is honored
(that is, the exit code is removed when \fB\-r\fP was not provided).
.SH OPTIONS
If the program is called \fI\*(PN\fP then no options are forced.
The environment variable \fB$XCLATE\fP is read for command line options,
before the explicit command line.
That variable may also be changed by \fBxclate\fP for descendant processes,
see ENVIRONMENT below.
.TP
\-\fIdepth\fP
Specify a collation sequence beyond the tightest enclosing.
A \fIdepth\fP of 0 is the default.
A \fIdepth\fP of 1 accesses the next outer instance of \fBxclate\fP \fB\-m\fP.
.TP
\fB\-A\fP
Add the name of the controlling socket as the first line of the
notification stream.
This option is used by \fIsshw\fP to locate the service, as it is used
in global mode to manage the remote process redirection.
.TP
\fB\-d\fP
Do not link the new manager's diversion socket in the
\fB$\fP\fIxcl_link\fP environment chain.
Rather the manager socket name is recorded in \fB$\fP\fIxcl_d\fP,
while \fB$\fP\fIxcl_link\fP is left unchanged.
When the manager socket is specified (via \fB\-u\fP \fIunix\fP)
the environment variable may be totally ignored, as you already know
where the socket is.
.TP
.nf
\fB\-e\fP \fIvar=value\fP
.fi
Set the environment variable \fIvar\fP to the given \fIvalue\fP after
the path to the \fIclient\fP (or \fIutility\fP) is found.
When no \fIvalue\fP is provided, the implied \fIvalue\fP is the empty string.
.TP
.nf
\fB\-e\fP \fB!\fP\fIvar\fP
.fi
As above, but remove \fIvar\fP from the \fIutility\fP process's environment.
The character tilde (\fB~\fP) is also accepted.
It is not an error to remove a nonexistent variable name.
.TP
.nf
\fB\-h\fP
.fi
Print a help message.
.TP
\fB\-i\fP \fIinput\fP
The \fIstdin\fP given to the \fI\*(PN\fP process is replaced by this stream
for the \fIutility\fP process.
The original \fIstdin\fP is available only via the \fB\-I\fP option
to client \fI\*(PN\fP instances.
The \fIinput\fP specification may be any of "|\fIcommand\fP", "<\fIfile\fP", "<>\fIfile\fP", "<&\fIfd\fP", "<&-", "\fIfile\fP", "\fIsocket\fP" or "-" as a shorthand for "\fIfile\fP", or ">>\fIfile\fP", or "\fIfile\fP"), a local (UNIX) domain socket (as "\fIsocket\fP"), any already opened file descriptor (as ">&\fIfd\fP"), or a dash (as "-") for \fIstderr\fP.
.SH "COMMAND MODE OPTIONS"
These options will let you blow your foot completely off, placing it
into orbit.
.\" It is not an accident that they spell "improvised explosive device".
They are not used by most automation, but allow for some very clever
shell scripts (see sapply):
.TP
\fB\-I\fP
The \fIclient\fP is started with \fBstdin\fP connected to the
master's \fBstdin\fP.
No buffering of \fBstdin\fP can be provided.
.TP
\fB\-E\fP
The \fIclient\fP is started with \fBstderr\fP connected to the
master's \fBstderr\fP.
.TP
\fB\-D\fP
The \fIclient\fP is launched from the same current working directory
as the master's.
Really, even if that directory is hidden under a mount point.
See \fBfchdir\fP(2).
.TP
\fB\-Y\fP
Change the controlling tty to the new \fIstdin\fP, \fIstdout\fP,
or \fIstderr\fP (in that order).
See \fBtty\fP(4).
.PP
These are the inspiration for the \fIescrow\fP wrapper.
I'll even claim that \fI\*(PN\fP is the first true wrapper, and
inspired all the others.
.SH ENVIRONMENT
Since \fBxclate\fP is intended to be recursive, a provision is made in the
processing of environment variables for passing options to nested
\fBxclate\fP processes.
.TP
$xcl_link
The environment variable \fB$xcl_link\fP is set to the number of nested
\fBxclate\fP processes presently running.
.TP
$xcl_1, $xcl_2, ...
Each manager instance of \fBxclate\fP sets a variable
(named for the value of $xcl_link established when it started)
to the path to the unix domain socket used to chat with that instance.
.TP
$xcl_d
The path to the unix domain socket for the tightest enclosing instance
of \fI\*(PN\fP \fB\-m\fP with \fB\-d\fP in effect.
.TP
$XCLATE and $XCLATE_1, $XCLATE_2, \fI...\fP
After reading command line options from variable \fB$XCLATE\fP, any
master (\fB\-m\fP) process removes it from the environment
(before executing \fIutility\fP in the child process).
It then installs the variable \fB$XCLATE\fP_\fInesting\fP in its place,
if it is present in the environment.
This allows header and horizontal rule options to be set before they are
needed, for example before any \fBxapply\fP commands are started.
.TS
l l.
$XCLATE	read by the first manager instance
$XCLATE_1	read by the children of the first manager
$XCLATE_1	read by the second manager, if not reset.
$XCLATE_2	read by the children of the second manager
\...	and so on...
.TE
.SH EXAMPLES
.TP
\fBxclate\fP \-vV
Show version information and any details in the present environment.
.TP
\fBxclate\fP \-L \e? \-V
Display the table of suffixes supported by \fB\-L\fP, and some version
information.
.TP
\fBxclate\fP \fB\-vm\fP tasks
Assuming that the file \fItasks\fP is a shell script that has a list of
(long running) background jobs in it, for example:
.nf
xclate \-T%x passwd sort /etc/passwd &
xclate \-T%x group sort /etc/group &
\&...
wait
.fi
The output of this command would be the output of the jobs in some order,
with each output section contiguous.
.TP
\fBxapply\fP \fB\-m\fP \-P5 ...
Wrap a controlled environment around an \fBxapply\fP process.
There is some real hoodoo going on here, and you don't want to mess with
it until you can use \fBxclate\fP very well.
.TP
\fBxclate\fP \-m \fBxapply\fP \-P5 '... |\fBxclate\fP %u' ...
This puts the pixie dust on the \fBxapply\fP by hand, and is in effect
exactly what \fBxapply\fP does under \fB\-m\fP.
This is akin to making home-made distilled spirits
(as in one is apt to get blown up, poisoned, or shot by the law).
.TP
\fBxclate\fP \-m \fBxapply\fP \-m \-P6 '...' ...
Force a new \fBxclate\fP manager into the process tree for this instance
of \fBxapply\fP.
Nested instances of \fBxapply\fP share a common manager by default;
by explicitly starting a new managed stream we can group all the output
of this \fBxapply\fP together in the common output.
.TP
eval chmod 0750 \e${xcl_$xcl_link%/\e*}
A \fBksh\fP spell to make the present manager socket visible to our group.
.TP
eval chmod 0750 \e`dirname \e${xcl_$xcl_link}\e`
The same spell under Bourne shell, or csh.
.TP
xclate \-\- \-\- ls
The spell to skip the \fIxid\fP, one double-dash to end the options,
one to skip the optional \fIxid\fP.
It is better form to put a meaningful \fIxid\fP on each stream, as the
default of question-mark ("?") is silly.
.TP
xclate \-mr \-N /tmp/log.0 \fIprogram\fP ; tr '\e000' '\en' </tmp/log.0
Run a collated \fIprogram\fP, then replay the exit codes and \fIxid\fP
for each task.
.TP
.nf
xclate \-mu /var/run/cheater : >>/var/log/cheater &
.fi
Start a persistent diversion on the socket \fB/var/run/cheater\fP.
Clients that can connect to that socket can write messages into
\fB/var/log/cheater\fP, until someone issues a client with \fB\-Q\fP set.
.TP
.nf
xclate \-m \-i "/etc/motd" tr '[A\-Z]' '[a\-z]'
.fi
Force \fI\*(PN\fP to open \fB/etc/motd\fP as \fIstdin\fP to \fBtr\fP(1).
.TP
.nf
xclate \-mi "\-" <$DATA xapply \fI...\fP
.fi
All child processes of the \fBxapply\fP may try to acquire a file
descriptor on the $DATA file with "xclate \-I".
Note \fBxapply\fP doesn't implement this since this option would
serialize all of the child processes.
.TP
.nf
xclate \-mi "$SOCK_TMP" nc log.example.com 4321
.fi
Use xclate to connect to the local domain socket \fI$SOCK_TMP\fP,
use that as \fIstdin\fP to a \fBnetcat\fP to \fBlog.example.com\fP
on port 4321.
.TP
.nf
xclate \-mi "$SOCK_TMP" sh \-c "exec nc log.example.com 4321 1>&0"
.fi
Same as the above, with a duplex connection to force the replies back to
the UNIX-domain socket.
Use \fBnc \-l \-U $SOCK_TMP\fP to create a local end-point (in a window).
In another window, start a network listener with \fBnc \-l \-p 4321\fP.
In a third window run the above command on the same host as the local
end-point, replacing "log.example.com" with the name of the host
running the listener.
You should be bidirectionally connected between the two windows.
An interrupt (^C) in the "nc \-l \-U" window or in the xclate window breaks
the connection.
.SH BUGS
This program is very confusing to novice users.
The complex file descriptor manipulations lead to cries of pain and
massive denial.
While \fBxclate\fP can be used over the top of a shell, it is considered
"poor form" to leave one of those just lying around.
.PP
The "command mode" is more useful (most of the time) with input redirected
from \fB/dev/null\fP.
This is parallel to the bug where \fBssh\fP (or \fBrsh\fP) reads input it
never needs.
There is not a command line option to do that, because I think "), which must be quoted from the shell.
.PP
The name is a play on "xapply", "escalate" and "collate", and that's a
bug all by itself.
.PP
Some programmers are confused when \fI\*(PN\fP's \fB%s\fP is not in sync
with \fBxapply\fP's \fB%u\fP; this results from two root causes:
races between subtasks in an \fBxapply\fP with a large parallel factor,
and races between peer instances of \fI\*(PN\fP managed tasks unrelated to
those started by \fBxapply\fP.
The use of \fB%s\fP, in general, is just to let some applications re-sort
the collated output as a post-processing filter.
.SH AUTHOR
Kevin S Braunsdorf
.br
NPCGuild.org
.br
msrc at ksb.npcguild.org
.SH "SEE ALSO"
.hlm 0
sh(1), csh(1), xapply(1l)'s \fB\-m\fP, \fB\-s\fP and \fB\-u\fP options,
environ(7), mkcmd(1l), cat(1), hxmd(8l), ptbw(1l), wrapw(1l), dicer(5l), nc(1)