xorl %eax, %eax

CVE-2009-4075: Solaris sshd(1M) Dangling Thread DoS

leave a comment »

This vulnerability was disclosed by Sun Microsystems on 23 November 2009 and it affects Solaris 10 as well as OpenSolaris based upon builds snv_99 through snv_123. The buggy code resides in usr/src/cmd/ssh/sshd/sshd.c source code file. So, as you can see below…

/*
 * Main program for the daemon.
 */
int
main(int ac, char **av)
{
	extern char *optarg;
    ...
	/*
	 * We don\'t want to listen forever unless the other side
	 * successfully authenticates itself.  So we set up an alarm which is
	 * cleared after successful authentication.  A limit of zero
	 * indicates no limit. Note that we don\'t set the alarm in debugging
	 * mode; it is just annoying to have the server exit just when you
	 * are about to discover the bug.
	 */
	(void) signal(SIGALRM, grace_alarm_handler);
	if (!debug_flag)
		(void) alarm(options.login_grace_time);
    ...
	 * The child is about to start the first key exchange while the monitor
	 * stays in altprivsep_start_and_do_monitor() function.
    ...
	/*
	 * Start the monitor. That way both processes will have their own
	 * PKCS#11 sessions. See the PKCS#11 standard for more information on
	 * fork safety and packet.c for information about forking with the
	 * engine.
	 */
	altprivsep_start_and_do_monitor(options.use_openssl_engine,
	    inetd_flag, newsock, startup_pipe);
    ...
authenticated:
	
	/* Authentication complete */
	(void) alarm(0);
	/* we no longer need an alarm handler */
	(void) signal(SIGALRM, SIG_DFL);

	if (startup_pipe != -1) {
		(void) close(startup_pipe);
		startup_pipe = -1;
	}

	/* ALTPRIVSEP Child */

This is part of the main code of the SSH daemon, in the above snippet you can read that it initializes an alarm signal handler with grace_alarm_handler() routine which is basically just this:

/*
 * Signal handler for the alarm after the login grace period has expired.
 */
static void
grace_alarm_handler(int sig)
{
	/* XXX no idea how fix this signal handler */

	/* Log error and exit. */
	fatal("Timeout before authentication for %s", get_remote_ipaddr());
}

However, this signal handler affects the unprivileged child but the monitor that is being later initialized also uses an event on that communication pipe. This could result in a DoS situation because of the dangling child thread in case of a time-out and consequently, exit because of the monitor.
To fix this, the patch updated various routines, first of all the above signal handler was updated to include more useful comments as you can read below.

/*
-* Signal handler for the alarm after the login grace period has expired.
+* Signal handler for the alarm after the login grace period has expired. This
+* is for the (soon-to-be) unprivileged child only. The monitor gets an event on
+* the communication pipe and exits as well.
 */
static void
grace_alarm_handler(int sig)
{
-	/* XXX no idea how fix this signal handler */

	/* Log error and exit. */
-	fatal("Timeout before authentication for %s", get_remote_ipaddr());
+	fatal("Timeout before authentication for %.200s", get_remote_ipaddr());
}

Next, the initialization of the alarm handler was moved a few lines after in order to be able to handle monitor too like this:

-	/*
-	 * We don\'t want to listen forever unless the other side
-	 * successfully authenticates itself.  So we set up an alarm which is
-	 * cleared after successful authentication.  A limit of zero
-	 * indicates no limit. Note that we don\'t set the alarm in debugging
-	 * mode; it is just annoying to have the server exit just when you
-	 * are about to discover the bug.
-	 */
-	(void) signal(SIGALRM, grace_alarm_handler);
-	if (!debug_flag)
-		(void) alarm(options.login_grace_time);

Which was added here:

	packet_set_nonblocking();

	/*
	 * Start the monitor. That way both processes will have their own
	 * PKCS#11 sessions. See the PKCS#11 standard for more information on
	 * fork safety and packet.c for information about forking with the
	 * engine.
+	 *
+	 * Note that the monitor stays in the function while the child is the
+	 * only one that returns.
	 */
	altprivsep_start_and_do_monitor(options.use_openssl_engine,
	    inetd_flag, newsock, startup_pipe);

	/*
+	 * We don't want to listen forever unless the other side successfully
+	 * authenticates itself. So we set up an alarm which is cleared after
+	 * successful authentication. A limit of zero indicates no limit. Note
+	 * that we don't set the alarm in debugging mode; it is just annoying to
+	 * have the server exit just when you are about to discover the bug.
+	 */
+	(void) signal(SIGALRM, grace_alarm_handler);
+	if (!debug_flag)
+		(void) alarm(options.login_grace_time);
+
+	/*
	 * The child is about to start the first key exchange while the monitor
	 * stays in altprivsep_start_and_do_monitor() function.
	 */
	(void) pkcs11_engine_load(options.use_openssl_engine);

Also, the ‘authenticated’ label was removed but this is part of bug ID 6875551 which non-security related.

		do_ssh1_kex();
		authctxt = do_authentication();
	}
-authenticated:
	/* Authentication complete */
	(void) alarm(0);
	/* we no longer need an alarm handler */
	(void) signal(SIGALRM, SIG_DFL);

In addition to this, src/cmd/ssh/libssh/common/packet.c was also changed because of this patch, first to include the following comments…

	DBG(debug("packet_send done"));
}

/*
 * Waits until a packet has been received, and returns its type.  Note that
 * no other data is processed until this returns, so this function should not
 * be used during the interactive session.
+*
+* The function is also used in the monitor to read the authentication context
+* in aps_read_auth_context() via packet_read_seqnr(), before the monitor enters
+* aps_monitor_loop() and starts using the process_input() function.
*/

int
packet_read_seqnr(u_int32_t *seqnr_p)
{

and secondly, to update packet_read_seqnr() function to perform more reliable handling of a possible closed or errorneous read operation on the connection’s socket.

		/* Read data from the socket. */
		len = read(connection_in, buf, sizeof(buf));
		if (len == 0) {
-			log("Connection closed by %.200s", get_remote_ipaddr());
+			if (packet_connection_is_on_socket())
+				log("Connection closed by %.200s",
+				    get_remote_ipaddr());
+			else
+				debug("child closed the communication pipe "
+				    "before user auth was finished");
			fatal_cleanup();
		}
-		if (len < 0)
-			fatal("Read from socket failed: %.100s", strerror(errno));
+		if (len < 0) {
+			if (packet_connection_is_on_socket())
+				fatal("Read from socket failed: %.100s",
+				    strerror(errno));
+			else
+				fatal("Read from communication pipe failed: "
+				    "%.100s", strerror(errno));
+		}
		/* Append it to the buffer. */
		packet_process_incoming(buf, len);
	}
	/* NOTREACHED */
}

Instead of the simple log() and fatal() calls, the daemon will now check the connection using packet_connection_is_on_socket() and only if this returns true, it’ll print the log() error message that the connection was closed, otherwise it’ll print a debugging message using debug() and move the execution flow to fatal_cleanup().
In a similar manner, in case of a read() that returns a negative value it will check the return value of packet_connection_is_on_socket() and it’ll either invoke fatal() because it failed to read from the socket or because of failure in reading from the communication pipe. For better understanding of this patch you’ll need to know exactly what packet_connection_is_on_socket() does. So, this is a function located at src/cmd/ssh/libssh/common/packet.c that returns true if the remote host is connected via socket or false in any other case.
Back to the patch we should now move to src/cmd/ssh/sshd/altprivsep.c which as it is implied by its name is the source code file that includes part of the privilege separation of SSH protocol. First of all, altprivsep_start_and_do_monitor() was changed to include some comments as you can read here:


+/*
+ * Start and execute the code for the monitor which never returns from this
+ * function. The child will return and continue in the caller.
+ */
void
altprivsep_start_and_do_monitor(int use_engine, int inetd, int newsock,
	int statup_pipe)
{
	pid_t aps_child;
	Authctxt *authctxt;

And inside this routine, the alarm signal handler was removed since it is now placed in main() routine of the SSHd as it was shown above. So…

		 *  - PAM cleanup
		 */

-		/*
-		 * Alarm signal handler is for our child only since that's the
-		 * one that does the authentication.
-		 */
-		(void) alarm(0);
-		(void) signal(SIGALRM, SIG_DFL);
		/* this is for MaxStartups and the child takes care of that */
		(void) close(statup_pipe);

Written by xorl

December 6, 2009 at 14:59

Posted in bugs, solaris

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s