JOH
Jabber Over HTTP Tunnel
Sergey Poznyakoff
There are two common scenarios for configuring the Jabber Over HTTP proxy.
In the first scenario, you have a dedicated Jabber server and port 80 (HTTP) is not used on that server. In this case you will use standalone mode. In this mode johd is configured to listen on port 80 and to proxy incoming requests to your Jabber server and vice versa.
In the second scenario, port 80 is already in use by an HTTP server running on the same box as your Jabber server. For such cases, JOH provides a CGI mode. In this mode, you start johd to listen on an auxiliary port, and configure your HTTP server to run a CGI program, joh.cgi, which is included in the package. The system then works as follows. HTTP polling requests are received by your HTTP server, which invokes joh.cgi to handle them. In turn, joh.cgi extracts the necessary data from each request, reformats it and sends it to the johd daemon over the auxiliary port. When a subsequent request arrives, joh.cgi receives the reply from johd, formats it as an HTTP response and sends it back to the HTTP server, which sends it to the requesting client.
The CGI mode works only with HTTP Polling.
Of course, there may be combined cases, e.g.:
joh.cgi would be installed on the HTTP server and johd on the Jabber server.
johd in standalone mode on this machine and configure it to communicate with your main Jabber server.
johd
Sockets
Johd reads its configuration from the command line. Only the traditional short options are used. The order in which you place options is important: some of them affect others that appear further in the command line.
The ‘-l’ option configures a socket to listen on (hence its mnemonic: listen). Its argument is a URL or address specification for the socket. Normally, this specification is the desired IP address and port number, separated by a colon. For example, to have johd listen on IP address 127.0.0.1, port 1111, you would write:
johd -l 127.0.0.1:1111
If you wish it to listen on a given port on all configured network interfaces, just specify that port alone, without a specific IP address, as in:
johd -l 1111
In fact, johd is able to work with three distinct socket families: UNIX sockets, IPv4 and IPv6 inet addresses. There are various ways to specify these. For a detailed discussion of them, see URLs.
Any number of ‘-l’ options can be given: johd will open all required sockets and listen for connections on any of them.
The important point is the class of the socket to open. As you already know, johd works with two distinct socket classes: HTTP sockets, which are supposed to receive data formatted in accordance with the HTTP protocol, and auxiliary CGI sockets, which are designed to communicate with joh.cgi. By default, the latter is assumed(2). The class of the socket to open is changed by the ‘-c’ command line option: ‘-c HTTP’ tells johd to open all subsequent sockets for listening for HTTP requests, and ‘-c CGI’ instructs it to open them for handling internal CGI protocol data. The ‘-c’ option affects all ‘-l’ options that appear to the right of it in the command line, until another ‘-c’ option is encountered, which changes the default. To illustrate this, consider the following invocation:
johd -l 127.0.0.1:1111 \
     -c HTTP -l 10.10.0.1 -l 192.168.0.2 \
     -c CGI -l 10.10.0.1
It opens two sockets for auxiliary CGI: one at 127.0.0.1:1111 (it appeared before the first ‘-c’ option and therefore belongs to the default class, which is ‘CGI’) and the other at 10.10.0.1, which appears after an explicit ‘-c CGI’. Notice that the latter has no port specification. If the port is missing, johd will select the default port for this class. The default port for ‘CGI’ is 1100(3), and the default for ‘HTTP’ is, of course, 80. Therefore, the command above will listen for HTTP requests on 10.10.0.1:80 and 192.168.0.2:80.
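The left-to-right rule can be made concrete with a small shell sketch. The function below is a hypothetical helper, not part of JOH: it only mimics the rules described in this section (default class ‘CGI’, default ports 1100 and 80). It assumes that the argument list consists solely of ‘-c’ and ‘-l’ options, each followed by its argument, and that addresses are plain IPv4 `address[:port]` or a bare port; UNIX and IPv6 sockets are ignored.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with JOH): print the class and the
# effective address:port that each '-l' option would get, following the
# left-to-right rule described in the text.
joh_resolve_sockets() {
    class=CGI                       # default class, per the manual
    while [ $# -ge 2 ]; do
        case $1 in
        -c) class=$2 ;;             # affects every later '-l'
        -l) addr=$2
            # Default port depends on the class in effect right now.
            if [ "$class" = HTTP ]; then port=80; else port=1100; fi
            case $addr in
            *:*)      ;;                        # explicit address:port
            *[!0-9]*) addr=$addr:$port ;;       # address only: add default port
            *)        addr="*:$addr" ;;         # bare port: all interfaces
            esac
            echo "$class $addr" ;;
        esac
        shift 2                     # every option here takes one argument
    done
}
```

Calling `joh_resolve_sockets -l 127.0.0.1:1111 -c HTTP -l 10.10.0.1 -l 192.168.0.2 -c CGI -l 10.10.0.1` prints one line per socket, which makes it easy to check an intended command line before passing the same options to johd.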
Each incoming connection is validated via TCP wrappers(4). The default daemon (or service) name for validation coincides with the name johd was invoked with (i.e. it is ‘johd’, unless you renamed the program or started it via a symlink). However, the validation rules will most probably depend on the class of socket that received the connection: internal ‘CGI’ sockets in most cases should not be visible outside your host, whereas ‘HTTP’ ones should be accessible to everybody. Therefore, a special option is provided, which changes the TCP wrapper service name for subsequent sockets. This is the ‘-S’ option (mnemonic: Service name).
Similarly to ‘-c’, the ‘-S’ option affects all ‘-l’ options to the right of it, until another ‘-S’ option or the end of the command line is encountered, whichever occurs first.
Now, let's illustrate this by an improved version of the example above:
johd -l 127.0.0.1:1111 \
     -S johd-http -c HTTP -l 10.10.0.1 -l 192.168.0.2 \
     -S johd-cgi -c CGI -l 10.10.0.1
In this configuration, the 127.0.0.1:1111 socket will be protected by the TCP service name ‘johd’, the two ‘HTTP’ sockets by service name ‘johd-http’, and the ‘CGI’ socket 10.10.0.1 by service name ‘johd-cgi’.
Connections to remote Jabber servers are also validated using TCP wrappers. However, they use a different service name. The service name for validating a requested Jabber connection is created using the following pattern:
srvname/jabber@ipaddr
where srvname is the TCP service name, as described above, and ipaddr is the IP address of the server.
Configuring johd to work in standalone mode is pretty straightforward: all you have to do is give it an address (or addresses) to listen on and instruct it to open these addresses in ‘HTTP’ class. In the simplest case, the following command will do:
johd -c HTTP
It will instruct johd to listen on port 80 on all configured network interfaces. To select a particular address or addresses to listen on, use the ‘-l’ option, as described in the previous section.
It is important to configure your ‘/etc/hosts.allow’ to control access to the incoming HTTP port and outgoing Jabber connections. For example, the two lines below allow access to HTTP from anywhere and grant anybody the right to request any Jabber server:
johd: ALL
johd/jabber@ALL: ALL
As a more complex example, the entries below allow access to HTTP from anywhere and limit the use of Jabber servers to 208.68.163.220 and 192.168.10.1. The use of 208.68.163.220 is granted to anybody, and the use of 192.168.10.1 is allowed only for clients coming from IP addresses in the range 192.168.10.1 to 192.168.10.254.
johd: ALL
johd/jabber@208.68.163.220: ALL
johd/jabber@192.168.10.1: 192.168.10.0/24
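When several Jabber servers must be permitted, the ‘srvname/jabber@ipaddr’ entries can be generated rather than typed by hand. The following is a minimal sketch; the function name `joh_allow_jabber` and its argument order are illustrative assumptions, not something provided by JOH. It only prints lines in the pattern described above and leaves it to you to append them to ‘/etc/hosts.allow’.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with JOH).
# usage: joh_allow_jabber SRVNAME CLIENTS SERVER_IP...
# Prints one hosts.allow entry per Jabber server, using the
# srvname/jabber@ipaddr pattern from the manual.
joh_allow_jabber() {
    srvname=$1      # TCP wrapper service name, e.g. johd or johd-http
    clients=$2      # client list as understood by hosts_access(5)
    shift 2
    for ip in "$@"; do
        printf '%s/jabber@%s: %s\n' "$srvname" "$ip" "$clients"
    done
}
```

For instance, `joh_allow_jabber johd ALL 208.68.163.220` prints the second line of the example above. Review the output before installing it, since an overly broad client list opens the proxy to everybody.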
The ‘CGI’ mode is a bit more complicated, because it involves configuring two components. However, the default settings are chosen so as to simplify the configuration. First, select the socket to use for interprocess communication between johd and joh.cgi. If both processes run on the same box, then ‘localhost’ or some UNIX socket is a natural choice. Now, start the daemon:
johd -l 127.0.0.1
Make sure the socket 127.0.0.1:1100 is accessible from localhost. In particular, if your ‘/etc/hosts.deny’ contains the line ‘ALL: ALL’, place this in your ‘/etc/hosts.allow’:
johd: 127.0.0.1
Similarly, make sure outgoing connections to selected Jabber servers are allowed for localhost:
johd/jabber@213.130.31.41: 127.0.0.1
Then copy joh.cgi to your ‘cgi-bin’ directory and you're done. You might also wish to configure your HTTP server to use some good-looking alias for that. For example, in my Apache configuration I use:
Alias /http-poll /var/www/cgi-bin/joh.cgi
If your HTTP server and johd are running on different machines, you will need to inform joh.cgi about the address johd is listening on. Suppose, for example, that johd is running on machine ‘A’ and is listening on IP address 192.168.0.1, port 1100. The HTTP server is running on machine ‘B’, which has IP address 192.168.0.2. To tell joh.cgi that it must connect to ‘192.168.0.1:1100’, set the environment variable JOH_SERVER_URL. For example, if ‘B’ is running Apache, then in your ‘httpd.conf’ you would set:
SetEnv JOH_SERVER_URL 192.168.0.1:1100
Notice also that you need to ensure that this socket on box ‘A’ is accessible only to 192.168.0.2. For example, with the first line below in ‘/etc/hosts.deny’ and the second in ‘/etc/hosts.allow’:
johd: ALL
johd: 192.168.0.2
johd
One of the basic assumptions made when designing johd was that it would be run as a transport within a Jabber configuration. Therefore, after startup, johd remains in the foreground and does not disconnect from the controlling terminal. It also normally sends all its diagnostic messages to the standard error output (but see section Logging and Debugging, below).
To start jabberd2 components we recommend using GNU Pies instead of the default simple program manager shipped with Jabberd2. Pies offers considerable flexibility in handling Jabber components. For a detailed description of Pies, see the GNU Pies Manual. For an example of a Jabberd2 configuration with Pies, refer to http://www.gnu.org.ua/software/pies/example.php?what=jabberd2.
To configure Pies to start johd, add the following component statement to your configuration file:
component johd {
    command "johd options";
    stderr syslog err;
};
Replace johd with the full pathname of the johd binary, and options with the desired command line options. For example:
component johd {
    command "/usr/sbin/johd -c HTTP";
    stderr syslog err;
};
Another way to start johd is independently of the Jabber server. To do so, give it the ‘-D’ command line option. This option instructs johd to disconnect from the controlling terminal and run in the background as a daemon. Diagnostic messages are then sent to the syslog, using the ‘daemon’ facility (this can be changed using the ‘-F’ option; see section Logging and Debugging).
Normally, johd continues its operation with the privileges of the user who started it. If this user is root, you may wish johd to run as some other user. To do so, use the ‘-u’ option, e.g.:
johd -cHTTP -D -u nobody
The daemon switches to the new user after completing operations that require root privileges, such as creating sockets that listen on ports below 1024.
When starting johd in daemon mode, it is also common to give it the ‘-p’ option. This option takes a file name as argument and causes the program to write its PID to that file after switching to the background. If this file already exists, johd will read the PID from it and check whether a process with that PID is actually running. If so, johd refuses to start up and outputs an appropriate diagnostic message. Otherwise, it will overwrite the file with the new PID value.
If both ‘-u’ and ‘-p’ are used, the pidfile is opened after switching to the user privileges. In this case, you should make sure the directory component of the pidfile is writable for the user supplied with the ‘-u’ option.
Following is an example startup command:
johd -D -p /var/run/johd.pid
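The stale-pidfile check described above can be reproduced from a shell script, which is handy for monitoring or for deciding whether a restart is needed. The function below is only a sketch of the same idea, not johd's actual implementation; the name `pidfile_is_live` is an assumption made for illustration.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with JOH).
# Return 0 if FILE exists and holds the PID of a running process,
# non-zero otherwise.
pidfile_is_live() {
    [ -f "$1" ] || return 1            # no pidfile at all
    pid=$(cat "$1") || return 1
    case $pid in
    ''|*[!0-9]*) return 1 ;;           # empty or not a number
    esac
    # Signal 0 delivers nothing; it only tests whether the process can
    # be signalled. It also fails for processes owned by another user
    # unless the caller is root.
    kill -0 "$pid" 2>/dev/null
}
```

A typical use is `pidfile_is_live /var/run/johd.pid || echo "johd is not running"`. Note that the check cannot tell whether the PID belongs to johd or to an unrelated process that happened to reuse the number.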
To automate startup and shutdown of johd in daemon mode, use the following shell script:
#! /bin/sh
# Start, stop or restart johd in daemon mode.
PIDFILE=/var/run/johd.pid

case $1 in
start)
    /usr/bin/johd -D -p "$PIDFILE";;
stop)
    # Send SIGTERM only if a pidfile exists.
    test -f "$PIDFILE" && kill -TERM `cat "$PIDFILE"`;;
restart)
    $0 stop
    $0 start;;
*)
    echo >&2 "usage: $0 {start|stop|restart}"
esac
Proxying of Jabber connections is requested by HTTP requests with either the ‘POST’ or ‘CONNECT’ method. Any other requests received by johd are normally dropped. However, ‘GET’ requests are handled separately. Normally, an incoming ‘GET’ request means that someone has pointed a web browser to the URL served by johd. When such a request arrives, johd replies with a 404 response code. A compiled-in error page is sent back in the response. This behavior can be customized in two ways.
First, you can supply a custom error page using the ‘-E’ command line option. The argument to this option must specify an absolute pathname to a valid HTML file. The contents of this file will be sent back in 404 responses.
Similarly to ‘-c’ and ‘-S’ options, the ‘-E’ option applies to all HTTP sockets created by subsequent ‘-l’ options which appear to the right of it, until another ‘-E’ or ‘-R’ option (see below) is encountered.
An example usage follows:
johd -c HTTP -E /etc/joh/404.html -l 10.10.10.1
Another way to handle ‘GET’ requests is to return a 303 response, redirecting the requester to another HTTP resource. This is achieved via the ‘-R’ option. Its argument is a valid URL, beginning with ‘http://’. For example:
johd -c HTTP -R http://www.example.net/jabber
Notice that the ‘-E’ and ‘-R’ options are mutually exclusive. For example, the following invocation will reply to ‘GET’ requests arriving at ‘10.10.10.1’ with the error page read from ‘/etc/joh/404.html’, and will redirect any ‘GET’ request arriving at ‘10.10.10.2’ to <http://www.example.net/jabber>:
johd -c HTTP -E /etc/joh/404.html -l 10.10.10.1 \
     -R http://www.example.net/jabber -l 10.10.10.2
The ‘joh.cgi’ utility provides similar features, except that it cannot send back a ‘404’ response.
If any request other than ‘POST’ arrives, ‘joh.cgi’ replies with the compiled-in error page, just as johd does. If the JOH_ERROR_PAGE environment variable is set, and its value points to a readable file, the contents of this file are sent back instead. If the JOH_ERROR_REDIRECT variable is set and its value is a URL which begins with ‘http://’, joh.cgi responds with a redirection to that URL.
Normally, johd prints any errors, warnings and other diagnostic messages on standard error. You can, however, change this default and direct all diagnostic messages to syslog. To do so, specify the desired syslog facility with the ‘-F’ command line option. For example:
johd -F daemon
Allowed facility names for use with this option are: ‘user’, ‘daemon’, ‘auth’, ‘authpriv’, ‘mail’, ‘cron’, ‘local0’ through ‘local7’. All names are case-insensitive.
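A wrapper script that takes the facility from its own configuration may want to validate the name before handing it to ‘-F’. The sketch below encodes only the list given above; the function name `joh_valid_facility` is hypothetical and is not part of JOH.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with JOH).
# Return 0 if NAME is one of the facility names listed in the manual,
# compared case-insensitively; return 1 otherwise.
joh_valid_facility() {
    # Fold to lower case so that DAEMON, Daemon and daemon all match.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $name in
    user|daemon|auth|authpriv|mail|cron|local[0-7]) return 0 ;;
    *) return 1 ;;
    esac
}
```

For example, `joh_valid_facility "$FACILITY" && exec johd -D -F "$FACILITY"` starts the daemon only when the name, held here in an assumed variable `FACILITY`, is acceptable.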
Notice that when given the ‘-D’ option (see daemon), johd automatically assumes ‘-F daemon’, so you need not use the ‘-F’ option unless you want to change the default facility.
Messages sent to syslog are prefixed by the program name. To change this prefix use the ‘-L’ option. Its argument is used as the log tag that prefixes each message.
Each diagnostic message has a severity level associated with it. Severity levels are (in order of increasing severity): ‘debug’, ‘info’, ‘warning’, ‘error’, and ‘crit’. The last of these is assigned to conditions which cause immediate termination of the program.
Normally, severity levels are not printed. To instruct johd to precede each message with its severity level, use the ‘-P’ option.
Debugging diagnostics are useful when you trace some difficult configuration problem or investigate a bug in johd itself. These diagnostics are printed only when the ‘-d’ option is given. The argument to the ‘-d’ option is the debugging level, an integer ranging from 0 to 100. Level 0 effectively disables all debugging and is equivalent to not specifying the ‘-d’ option at all. Subsequent levels produce increasing amounts of debugging information. Finally, level 100 prints dumps of network packets received and sent by johd.
Notice that using the ‘-d’ option with levels greater than 10 requires good knowledge of johd internals and slows down its operation, so use it sparingly, if at all.
When debugging johd it may be helpful to know where precisely in the source code each debugging message was generated. This information can be obtained using the ‘-i’ (source-info) option. When it is given, each debug message is additionally prefixed with the name of the source file and the line number in it.
(2) The decision which class to take as the default is somewhat arbitrary: we might as well have chosen HTTP, but historically it happened to be CGI.
(3) Again, the choice was somewhat arbitrary, but we know of no other service using this number.
(4) See hosts_access(5) for a detailed description of TCP wrapper access control files. Note also that this feature can be disabled at compile time by the ‘--without-tcp-wrappers’ option to configure, although this is strongly discouraged.
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.