Discussion:
SSL_accept hangs
Md Lazreg
2008-03-20 00:04:53 UTC
Permalink
Hi,

I have set up an SSL server that works fine with up to 400 connected clients.

When I try to have more than 400 clients, my server hangs in the
SSL_accept call. This happens very randomly, sometimes not until there are
more than 1000 connected clients.

The server is dead once this happens, and no other client can connect.

Please note that I am using non-blocking sockets, so SSL_accept _should_
return, but for whatever reason it does not.

I am using openssl-0.9.8e


Any suggestions please?

Thanks
Md Lazreg
2008-03-20 14:55:43 UTC
Permalink
Thanks Steve.

In case this helps anyone fix the issue, here is the backtrace once SSL_accept
hangs:

SSL_accept
ssl23_accept
ssl23_get_client_hello
ssl23_read_bytes
BIO_read
sock_read
__read_nocancel



Thanks
Post by Steve
We experienced a similar problem and had to back rev to 0.9.8d.
David Schwartz
2008-03-20 16:44:31 UTC
Permalink
Post by Md Lazreg
Hi,
I have set up an SSL server that works fine with up to
400 connected clients.
When I try to have more than 400 clients, my server hangs in the
SSL_accept call. This happens very randomly, sometimes not until there are
more than 1000 connected clients.
The server is dead once this happens, and no other client can connect.
Please note that I am using non-blocking sockets, so SSL_accept _should_
return, but for whatever reason it does not.
What is your code *supposed* to do if SSL_accept bails out of accept
immediately with EMFILE? If you keep looping and calling SSL_accept forever,
then your code is going to loop forever.

ret = accept(sock, (struct sockaddr *)&from, (void *)&len);
if (ret == INVALID_SOCKET)
{
    if (BIO_sock_should_retry(ret)) return -2;
    SYSerr(SYS_F_ACCEPT, get_last_socket_error());
    BIOerr(BIO_F_BIO_ACCEPT, BIO_R_ACCEPT_ERROR);
    goto end;
}

DS


______________________________________________________________________
OpenSSL Project http://www.openssl.org
User Support Mailing List openssl-users-MCmKBN63+***@public.gmane.org
Automated List Manager majordomo-MCmKBN63+***@public.gmane.org
Md Lazreg
2008-03-20 17:06:40 UTC
Permalink
Hi David,

My code looks like this:

 1  while(1)
 2  {
 3      r = SSL_accept(m_ssl);
 4      if (r > 0)
 5      {
 6          break;
 7      }
 8      r = ssl_retry(r);
 9      if (r <= 0)
10      {
11          break;
12      }
13  }


The issue is not that it is going into an infinite while loop. The issue is
that SSL_accept on line 3 never returns! My socket is non-blocking, so
as far as I know SSL_accept should return.

A backtrace shows that when this happens the server gets stuck in:

SSL_accept
ssl23_accept
ssl23_get_client_hello
ssl23_read_bytes
BIO_read
sock_read
__read_nocancel

after calling SSL_accept.
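
A side note on reading that backtrace: the last frame is a plain read() that never comes back. On a descriptor that really has O_NONBLOCK set, read() with nothing to read fails right away with EAGAIN/EWOULDBLOCK, so a hang there suggests the flag is not actually set on the descriptor SSL_accept is using. Below is a minimal sketch, using only POSIX calls, of how one might check that just before calling SSL_accept. The names fd_is_nonblocking and nonblocking_read_demo are made up for illustration; they are not part of OpenSSL or of the server code in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: 1 if O_NONBLOCK is set on fd, 0 if not, -1 on error. */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/*
 * Demo on a local socket pair with nothing to read.
 * Returns 1 if read() on the non-blocking end failed at once with
 * EAGAIN/EWOULDBLOCK instead of waiting, 0 otherwise.
 */
int nonblocking_read_demo(void)
{
    int sv[2];
    char byte;
    int flags, result = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 0;

    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags != -1 && fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != -1 &&
        fd_is_nonblocking(sv[0]) == 1) {
        ssize_t n = read(sv[0], &byte, 1);
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            result = 1;
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Logging the result of a check like fd_is_nonblocking() on the accepted descriptor for every new client would show whether the hanging connections are the ones where the flag is missing.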

Thanks
David Schwartz
2008-03-20 17:38:33 UTC
Permalink
Post by Md Lazreg
Hi David,
 1  while(1)
 2  {
 3      r = SSL_accept(m_ssl);
 4      if (r > 0)
 5      {
 6          break;
 7      }
 8      r = ssl_retry(r);
 9      if (r <= 0)
10      {
11          break;
12      }
13  }

Well, that's obviously badly broken. It's probably not precisely your issue,
but it's related. Since the socket is non blocking, there is no place for
this code to block waiting for the connection!
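
To make that concrete: with a non-blocking socket, something has to wait for the descriptor to become ready between attempts, otherwise the loop either spins or gives up early. Here is a minimal sketch of such a wait using only POSIX poll(). The name wait_for_io and its parameters are invented for illustration, and since ssl_retry from the quoted loop is not shown anywhere in this thread, this is not a reconstruction of it.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable (want_write == 0) or
 * writable (want_write != 0), for at most timeout_ms milliseconds.
 * Returns 1 if ready, 0 on timeout, -1 on error.
 */
int wait_for_io(int fd, int want_write, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = want_write ? POLLOUT : POLLIN;
    pfd.revents = 0;

    do {
        rc = poll(&pfd, 1, timeout_ms);   /* the timeout starts over on EINTR */
    } while (rc == -1 && errno == EINTR);

    if (rc <= 0)
        return rc;   /* 0 = timed out, -1 = poll failed */
    return 1;        /* ready, or an error/hangup the next I/O call will report */
}

/*
 * Demo on a local socket pair. Returns 1 if the helper reports a timeout
 * while nothing is pending and "ready" once a byte has been written.
 */
int wait_for_io_demo(void)
{
    int sv[2];
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 0;

    ok = wait_for_io(sv[0], 0, 0) == 0;            /* nothing to read yet */
    ok = ok && write(sv[1], "x", 1) == 1;
    ok = ok && wait_for_io(sv[0], 0, 1000) == 1;   /* now readable */

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

In a handshake loop, the usual pattern is to call SSL_get_error() when SSL_accept() returns 0 or less, wait for readability on SSL_ERROR_WANT_READ or for writability on SSL_ERROR_WANT_WRITE, call SSL_accept() again, and give up on any other error. In a server built around select, that wait would normally happen in the main loop rather than in a per-connection helper like this one.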
Post by Md Lazreg
The issue is not that it is going into an infinite while loop.
That's just pure luck.
Post by Md Lazreg
The issue is that SSL_accept on line 3 never returns!
My socket is non-blocking, so as far as I know
SSL_accept should return.
How did you make it non blocking exactly? And is the BIO non-blocking too?
Post by Md Lazreg
A backtrace shows that when this happens the server gets stuck in
SSL_accept [...] __read_nocancel
after calling SSL_accept.
Sounds like you're lucky. The BIO is actually blocking and that's saving
your code from looping. At least you're not burning the CPU. ;)

What is your design intention if 'accept' returns EMFILE or ENFILE? If your
answer is "I have no idea" or "I never really thought about it", then it's
no surprise your code mishandles this case.

DS


Md Lazreg
2008-03-20 18:49:46 UTC
Permalink
Hi David,
Post by David Schwartz
Well, that's obviously badly broken. It's probably not precisely your issue,
but it's related. Since the socket is non blocking, there is no place for
this code to block waiting for the connection!
Well, that is not quite the case. I am sorry I did not give you the full code,
as it is quite complicated, but the snippet you see above is called after a new
connection has already been accepted. I have an outer loop that does a select,
and once a new connection is detected and accepted without errors, I go
ahead and establish the SSL part. Something like:


ready_sockets = ::select(m_max_socket + 1, rfds, 0, 0, &tv);
if (ready_sockets > 0)
{
    if (FD_ISSET(s->get_sock(), p->get_rfds()))
    {
        new_s->set_non_blocking(true);
        if (s->accept(new_s))
        {
            /* call the code above, which will call SSL_accept */
        }
        else
        {
            /* error handling */
        }
    }
}


So when SSL_accept is called, I already know that accept succeeded and that no
EMFILE or ENFILE was generated.


I am setting the socket as non blocking by simply calling:

if (fcntl(m_sock_fd, F_SETFL, O_NONBLOCK) == -1)
{
    return false;
}
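
One detail about that fcntl call, probably unrelated to the hang: F_SETFL replaces the whole set of file status flags, so passing O_NONBLOCK alone clears any other flag (O_APPEND, for example) that happened to be set. The usual idiom reads the current flags first. A minimal sketch, with the helper name set_nonblocking chosen purely for illustration:

```c
#include <fcntl.h>

/*
 * Turn O_NONBLOCK on or off without disturbing the other status flags.
 * Returns 0 on success, -1 on error (errno is set by fcntl).
 */
int set_nonblocking(int fd, int on)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;

    flags = on ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

Both fcntl calls fail with EBADF when fd is not an open descriptor, so the return value is worth checking and logging at every call site rather than ignoring.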

I am confused by your question about whether my BIO is non-blocking too. I
thought that it was non-blocking since the underlying socket is non-blocking.
Is this a wrong assumption? If so, how can I make the BIO non-blocking
[BIO_set_nbio?]

Thank you for your help.
David Schwartz
2008-03-20 23:51:22 UTC
Permalink
Post by Md Lazreg
Well, that is not quite the case. I am sorry I did not give
you the full code, as it is quite complicated, but the
snippet you see above is called after a new connection
has already been accepted. I have an outer loop that does
a select, and once a new connection is detected and accepted [...]
ready_sockets = ::select(m_max_socket + 1, rfds, 0, 0, &tv);
if (ready_sockets > 0)
{
    if (FD_ISSET(s->get_sock(), p->get_rfds()))
    {
        new_s->set_non_blocking(true);
        if (s->accept(new_s))
        {
            /* call the code above, which will call SSL_accept */
        }
        else
        {
            /* error handling */
        }
    }
}

Where is the call to 'accept' (the system's 'accept')? Did you cut out a
line before 'new_s->set_non_blocking'? Is 's->accept(new_s)' a wrapper
around 'accept'? Can you paste the code to this wrapper?
Post by Md Lazreg
if (fcntl(m_sock_fd, F_SETFL, O_NONBLOCK) == -1)
{
    return false;
}

This does not make the BIO non-blocking. That may or may not matter, but to
tell I need to see where the actual call to the system's 'accept' function
is taking place. And you still haven't pasted that code.
Post by Md Lazreg
I am confused when you say if my BIO is non-blocking too.
I thought that it is non blocking since the underlying socket
is non blocking. Is this a wrong assumption? if so how can I make
the BIO non blocking [BIO_set_nbio?]
Right. A blocking BIO with a non-blocking socket can cause serious problems.

Where is the actual call to 'accept' to accept the connection? What happens
if 'accept' returns EMFILE or ENFILE?

DS


Md Lazreg
2008-03-21 00:57:04 UTC
Permalink
Post by David Schwartz
Where is the call to 'accept' (the system's 'accept')? Did you cut out a
line before 'new_s->set_non_blocking'? Is 's->accept(new_s)' a wrapper
around 'accept'? Can you paste the code to this wrapper?
Yes, 's->accept(new_s)' is a wrapper around the system 'accept'. Here is
the code for it:


bool csocket::accept ( csocket * new_socket ) const
{
    int addr_length = sizeof ( m_addr );
    new_socket->m_sock_fd = ::accept ( m_sock_fd, ( sockaddr * ) &m_addr,
                                       ( socklen_t * ) &addr_length );
    if ( new_socket->m_sock_fd <= 0 )
    {
        return false;
    }
    else
    {
        return true;
    }
}

So as you can see if accept returns EMFILE or ENFILE, I go immediately to
the "error handling" section.
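
Two details in a wrapper like this are worth a second look. First, accept() reports failure with -1, and 0 is a valid descriptor, so the check is safer written as `< 0` than `<= 0`. Second, a bare true/false cannot tell the caller whether there was simply nothing to accept (EAGAIN/EWOULDBLOCK on a non-blocking listener), whether the process or the system ran out of descriptors (EMFILE/ENFILE), or whether something else went wrong. The following is a minimal sketch in plain C of one way to keep that information; the enum and function names are invented for illustration and are not taken from the server code in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

enum accept_status {
    ACCEPT_OK,       /* *out_fd holds the new connection */
    ACCEPT_RETRY,    /* nothing pending, interrupted, or the peer already gave up */
    ACCEPT_NO_FDS,   /* EMFILE or ENFILE: out of descriptors */
    ACCEPT_FAILED    /* any other error; errno says which */
};

/* Map an errno value left behind by accept() onto the categories above. */
enum accept_status classify_accept_errno(int err)
{
    switch (err) {
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
    case EINTR:
    case ECONNABORTED:
        return ACCEPT_RETRY;
    case EMFILE:
    case ENFILE:
        return ACCEPT_NO_FDS;
    default:
        return ACCEPT_FAILED;
    }
}

/* Accept one connection; *out_fd is written only on ACCEPT_OK. */
enum accept_status accept_connection(int listen_fd, int *out_fd)
{
    int fd = accept(listen_fd, NULL, NULL);
    if (fd < 0)                      /* -1 is the only failure value */
        return classify_accept_errno(errno);
    *out_fd = fd;
    return ACCEPT_OK;
}
```

What to do on ACCEPT_NO_FDS is a design decision. The pending connection stays in the queue, so select will keep reporting the listener as readable; a server that does nothing but retry will spin. Common choices are to stop watching the listener for a short while or to close idle connections first.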


I have added a BIO_set_nbio call to my code, following your advice:

m_sbio = BIO_new_socket(m_sock_fd, BIO_NOCLOSE);
BIO_set_nbio(m_sbio,1);
SSL_set_bio(m_ssl, m_sbio, m_sbio);


Unfortunately, this did not make a difference: SSL_accept still hangs,
sometimes only after processing more than 1000 clients.

Thanks again.
David Schwartz
2008-03-21 02:29:46 UTC
Permalink
To Md Lazreg:

I think I found it.

ready_sockets = ::select(m_max_socket + 1, rfds, 0, 0, &tv);
if (ready_sockets > 0)
{
    if (FD_ISSET(s->get_sock(), p->get_rfds()))
    {
        new_s->set_non_blocking(true); /* GAK!!!! */
        if (s->accept(new_s))
        { /* HERE */
            /* call the code above, which will call SSL_accept */
        }
        else
        {
            /* error handling */
        }
    }
}


The line marked with the 'GAK' should be:

s->set_non_blocking(true);

You don't want the listening socket to block when you call 'accept' on it.
You can't make the newly-accepted socket non-blocking until after it exists.

At the 'HERE' tag, you should probably have a:
new_s->set_non_blocking(true);

Because you don't want the newly-accepted connection to block either.
(Though you may already cover that by setting the BIO non-blocking.)
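
The ordering described above (listener non-blocking first, then accept, then the new socket non-blocking once it exists) can be sketched with plain POSIX calls. This is only an illustration under stated assumptions: accept_nonblocking, make_nonblocking and accept_order_demo are made-up names, the demo talks to itself over loopback TCP, and nothing here is taken from the csocket class or from OpenSSL.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set O_NONBLOCK on an open descriptor, keeping its other flags. */
static int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Accept one connection from a listener that is already non-blocking.
 * Returns the new descriptor, made non-blocking after it exists, or -1
 * with errno set; EAGAIN/EWOULDBLOCK means nothing was pending.
 */
int accept_nonblocking(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);
    if (fd < 0)
        return -1;
    if (make_nonblocking(fd) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Demo over loopback TCP. Returns 1 if every step behaved as described. */
int accept_order_demo(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    struct pollfd pfd;
    int lfd, cfd, sfd, ok;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return 0;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                              /* let the kernel pick a port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(lfd, 8) == -1 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) == -1 ||
        make_nonblocking(lfd) == -1) {              /* the listener comes first */
        close(lfd);
        return 0;
    }

    /* No client yet: a non-blocking listener returns at once. */
    ok = accept_nonblocking(lfd) == -1 &&
         (errno == EAGAIN || errno == EWOULDBLOCK);

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    ok = ok && cfd >= 0 &&
         connect(cfd, (struct sockaddr *)&addr, sizeof addr) == 0;

    pfd.fd = lfd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    ok = ok && poll(&pfd, 1, 1000) == 1;            /* wait for the pending connection */

    sfd = ok ? accept_nonblocking(lfd) : -1;
    ok = ok && sfd >= 0 && (fcntl(sfd, F_GETFL, 0) & O_NONBLOCK) != 0;

    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    close(lfd);
    return ok;
}
```

The point of the ordering is that the flag can only be set on a descriptor that exists, so the accepted socket is handled inside accept_nonblocking() after accept() has returned it, and the listener is handled once, right after listen().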

DS


Md Lazreg
2008-03-22 01:12:44 UTC
Permalink
Post by David Schwartz
I think I found it.
I think you did find it.

Now I am able to process more than 1000 clients without hanging.

This is great. Thanks a lot for your expertise.
