SSL_peek vs. SSL_pending...

Post by Thomas J. Hruska
What I want to know is how do I tell OpenSSL that it is okay to do some
processing of socket data but not block even with blocking sockets?

You are asking for the impossible. There is no way to be sure a socket
operation will not block other than to set the socket non-blocking. Much
code has broken horribly due to not understanding this simple fact.
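
A minimal sketch of what "set the socket non-blocking" means on a POSIX system, for illustration only. The helper names set_nonblocking and empty_read_returns_at_once are hypothetical, and the demo uses a local socketpair() instead of a real network connection. On Windows, the platform under discussion, the rough equivalent is ioctlsocket() with FIONBIO.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical demo: returns 1 if reading an empty non-blocking socket
   fails at once with EAGAIN/EWOULDBLOCK instead of blocking. */
int empty_read_returns_at_once(void)
{
    int sv[2];
    char byte;
    int ok = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 0;
    if (set_nonblocking(sv[0]) == 0) {
        ssize_t n = read(sv[0], &byte, 1);   /* nothing was ever written */
        ok = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    }
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

With the flag set, "no data yet" becomes an error return the caller can handle, which is the only form of the guarantee asked for above.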

DS

______________________________________________________________________
OpenSSL Project http://www.openssl.org
User Support Mailing List openssl-users-MCmKBN63+***@public.gmane.org
Automated List Manager majordomo-MCmKBN63+***@public.gmane.org

Thomas J. Hruska

2007-08-24 23:38:13 UTC

Post by Thomas J. Hruska
What I want to know is how do I tell OpenSSL that it is okay to do some
processing of socket data but not block even with blocking sockets?

Platform: Windows.

MSDN Library documents select() as being exactly as I describe:

http://msdn2.microsoft.com/en-us/library/ms740141.aspx

(See the description of when readfds returns).

So now that the matter you describe has been cleared up, answer the
question.

David Schwartz

2007-08-24 23:47:09 UTC

Post by Thomas J. Hruska
http://msdn2.microsoft.com/en-us/library/ms740141.aspx
(See the description of when readfds returns).
So now that the matter you describe has been cleared up, answer the
question.

You misunderstand the documentation. Nowhere does it say that a future
operation *will* complete without blocking. (If you think it does, please
reproduce the exact section that provides this guarantee.)

DS


Thomas J. Hruska

2007-08-24 23:56:16 UTC

Which part of "For other sockets, readability means that queued data is
available for reading such that a call to recv, WSARecv, WSARecvFrom, or
recvfrom is _guaranteed not to block_." do you not understand?

David Schwartz

2007-08-25 00:24:14 UTC

Post by Thomas J. Hruska
Which part of "For other sockets, readability means that queued data is
available for reading such that a call to recv, WSARecv, WSARecvFrom, or
recvfrom is _guaranteed not to block_." do you not understand?

It means a hypothetical concurrent call, not a future actual call.

There is simply no way the implementation can assure that data will be
available in the future, and in practice, it does not.

Imagine if there was a call that waited until the size of a file was at
least X bytes. It might be reasonable to say that the function returns when
a call to 'stat' is guaranteed to return a size larger than X for the file.
This doesn't mean that a *subsequent* call to 'stat' will find the size to
be larger than X. It means that at the moment the decision was made to
return, a 'stat' call would have returned a size larger than 'X'.

Note that actual real-world applications have broken due to this
misunderstanding. I agree that your interpretation of the documentation is
quite natural; however, it is erroneous.

Like every other status-reporting function, Select reports on the status at
some hypothetical point between when you called it and when it returns. It
cannot guarantee what will happen in the future.

DS


Thomas J. Hruska

2007-08-25 02:22:06 UTC

It means a hypothetical concurrent call, not a future actual call.
There is simply no way the implementation can assure that data will be
available in the future, and in practice, it does not.

Then why in the world does select() even exist?

Post by David Schwartz
Imagine if there was a call that waited until the size of a file was at
least X bytes. It might be reasonable to say that the function returns when
a call to 'stat' is guaranteed to return a size larger than X for the file.
This doesn't mean that a *subsequent* call to 'stat' will find the size to
be larger than X. It means that at the moment the decision was made to
return, a 'stat' call would have returned a size larger than 'X'.

Sockets and files are two different things.

Post by David Schwartz
Note that actual real-world applications have broken due to this
misunderstanding. I agree that your interpretation of the documentation is
quite natural; however, it is erroneous.
Like every other status-reporting function, Select reports on the status at
some hypothetical point between when you called it and when it returns. It
cannot guarantee what will happen in the future.
DS

Um...if only one thread in one process has access to the socket handle,
I don't see how you could say that. The OS is going to go back on its
_guarantee_ that there will be data available in the next recv() call
and thus block?

And you still aren't answering my original question.

David Schwartz

2007-08-25 03:09:15 UTC

Post by Thomas J. Hruska
Which part of "For other sockets, readability means that queued data is
available for reading such that a call to recv, WSARecv,
WSARecvFrom, or
recvfrom is _guaranteed not to block_." do you not understand?

It means a hypothetical concurrent call, not a future actual call.
There is simply no way the implementation can assure that data will be
available in the future, and in practice, it does not.

Then why in the world does select() even exist?

The same reason every status-reporting function exists. Sometimes
information about the past is very useful. It just doesn't guarantee the
future.

Sockets and files are two different things.

I agree. I'm just talking about what the wording in the documentation
means. It is not guaranteeing future behavior but explaining what
past/present behavior happened.

Post by David Schwartz
Note that actual real-world applications have broken due to this
misunderstanding. I agree that your interpretation of the
documentation is
quite natural; however, it is erroneous.
Like every other status-reporting function, Select reports on the status at
some hypothetical point between when you called it and when it returns. It
cannot guarantee what will happen in the future.

Yep. Remember, 'select' is protocol-neutral. With UDP, a packet might be
dropped after 'select' returned but before you get a chance to call
'recv'. Some other protocol might allow unreceived data to be cancelled by
the other end.

Post by Thomas J. Hruska
And you still aren't answering my original question.

Which question? If the question is how you can guarantee that an operation
on a blocking socket will not block, the answer is that you cannot do so.
Many have tried, failed, and caused real problems as a result.

DS


Thomas J. Hruska

2007-08-25 03:55:46 UTC

Post by Thomas J. Hruska
Which part of "For other sockets, readability means that queued data is
available for reading such that a call to recv, WSARecv,
WSARecvFrom, or
recvfrom is _guaranteed not to block_." do you not understand?

It means a hypothetical concurrent call, not a future actual call.
There is simply no way the implementation can assure that data will be
available in the future, and in practice, it does not.

Then why in the world does select() even exist?

The same reason every status-reporting function exists. Sometimes
information about the past is very useful. It just doesn't guarantee the
future.

Sockets and files are two different things.

I agree. I'm just talking about what the wording in the documentation
means. It is not guaranteeing future behavior but explaining what
past/present behavior happened.

Post by David Schwartz
Note that actual real-world applications have broken due to this
misunderstanding. I agree that your interpretation of the
documentation is
quite natural; however, it is erroneous.
Like every other status-reporting function, Select reports on the status at
some hypothetical point between when you called it and when it returns. It
cannot guarantee what will happen in the future.

Post by Thomas J. Hruska
And you still aren't answering my original question.

Hmm...interesting. Essentially what you are saying is "If one thinks
they need to use select() on a blocking socket, use non-blocking sockets
instead. And only when non-blocking sockets are insufficient, use
select() (i.e. to avoid a CPU-eating polling type of situation without
sacrificing I/O performance that would be associated with sleep())."

Yes? If so, the above paragraph or something similar should be
documented somewhere important (e.g. the manpages). The "many have
tried, failed, and caused real problems" issue you state indicates
public documentation is not clear. The discussion of a SSL_select()
comes up every so often on this list and I believe this very issue is
the root cause.

Not every day I learn something new. Though I'm slightly horrified
because my code base is extensive and I do quite a bit of socket
programming...and I've been doing it wrong for about 7 years. This is
apparently going to be a LONG weekend.

David Schwartz

2007-08-25 07:47:57 UTC

Post by Thomas J. Hruska
Hmm...interesting. Essentially what you are saying is "If one thinks
they need to use select() on a blocking socket, use non-blocking sockets
instead. And only when non-blocking sockets are insufficient, use
select() (i.e. to avoid a CPU-eating polling type of situation without
sacrificing I/O performance that would be associated with sleep())."
Yes? If so, the above paragraph or something similar should be
documented somewhere important (e.g. the manpages).

That's one way to put what I'm saying. I agree it needs to be repeated more
often; that's one of the reasons I repeat it as often as I can.
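
A rough sketch of the pattern summarised above: the socket is non-blocking, and select() is used only to avoid spinning. The names make_nonblocking and recv_with_timeout and the -2 timeout code are made up for this illustration, and a POSIX system is assumed.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode (0 on success). */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) ? -1 : 0;
}

/* Wait up to timeout_ms for data on a NON-BLOCKING socket.
   Returns bytes read (> 0), 0 on orderly close, -1 on error, -2 on timeout.
   Simplification: the timeout starts over after each stale readiness hint. */
ssize_t recv_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;

        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (ready == 0)
            return -2;                 /* nothing arrived in time */

        ssize_t n = recv(fd, buf, len, 0);
        if (n >= 0)
            return n;
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            continue;                  /* the hint was stale: wait again */
        return -1;
    }
}
```

The point of the design is that a wrong or outdated answer from select() costs one extra loop iteration instead of a hung thread.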

Post by Thomas J. Hruska
The "many have
tried, failed, and caused real problems" issue you state indicates
public documentation is not clear. The discussion of a SSL_select()
comes up every so often on this list and I believe this very issue is
the root cause.
Not every day I learn something new. Though I'm slightly horrified
because my code base is extensive and I do quite a bit of socket
programming...and I've been doing it wrong for about 7 years. This is
apparently going to be a LONG weekend.

Sorry. I can tell you a few stories about people who assumed that they could
write code that never blocked and then when some strange combination of
events violated their assumptions, their code blocked, and their program
failed.

The Linux UDP inetd denial-of-service attack was one such failure.
Denial-of-service attacks on 'accept' are similar.

Now, as far as I know, there has never been a problem with a TCP read
blocking after a read hit on select (unless another process/thread grabs the
data, obviously). It's hard to imagine how that could ever be a problem in
the future either. But both of the examples I mention started out with the
same type of thinking. At the time that code was written, there was no way
it could fail either.
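
The "another process/thread grabs the data" case can be reproduced deterministically. The sketch below is only an illustration under assumptions: it uses a local socketpair() instead of TCP, a dup()'d descriptor stands in for the second thread, and MSG_DONTWAIT (available on Linux and the BSDs) keeps the demo itself from hanging. The function name readable_then_empty is hypothetical.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if select() reported the socket readable and a later recv()
   nevertheless found nothing to read. */
int readable_then_empty(void)
{
    int sv[2];
    char byte = 'x';
    int result = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 0;

    /* Second handle to the same socket: stands in for another thread/process. */
    int other = dup(sv[0]);

    if (other != -1 && write(sv[1], &byte, 1) == 1) {
        fd_set readfds;
        struct timeval tv = { 1, 0 };

        FD_ZERO(&readfds);
        FD_SET(sv[0], &readfds);
        int ready = select(sv[0] + 1, &readfds, NULL, NULL, &tv);

        if (ready == 1 && FD_ISSET(sv[0], &readfds)) {
            /* The other reader takes the byte after select() has returned. */
            ssize_t stolen = read(other, &byte, 1);

            /* A plain blocking recv() would now wait indefinitely. */
            ssize_t n = recv(sv[0], &byte, 1, MSG_DONTWAIT);
            result = (stolen == 1 && n == -1 &&
                      (errno == EAGAIN || errno == EWOULDBLOCK));
        }
    }
    if (other != -1)
        close(other);
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

The select() answer was correct when it was given; it was simply out of date by the time recv() ran.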

A TCP write is *always* a problem. Even setting the low water mark won't
save you because the low water mark is not set in *application* bytes, it's
set in *queue* bytes. And no law requires a one-to-one relationship between
them. I can't imagine a way a one-byte TCP write could block after a
'select' hit, but who knows, maybe someone will find a way to shrink the
window or something.

DS


Yves Rutschle

2007-08-27 14:10:13 UTC

Post by Thomas J. Hruska
Yes? If so, the above paragraph or something similar should be
documented somewhere important (e.g. the manpages).

That's one way to put what I'm saying. I agree it needs to be repeated more
often; that's one of the reasons I repeat it as often as I can.

I'm afraid it's worse than just the man pages.

After thinking for a fleeting moment that what you were
saying made no sense, then thinking some more and thinking
it did make sense, I went back to my standard bible
"Advanced Programming in the UNIX Environment" (Stevens &
Rago), which actually explicitly states ("select and
pselect functions", p475):

"With [select's] return information, we can call the
appropriate I/O function (usually read or write) and know
that the function won't block."

I guess this should read "call the I/O function and know it
wouldn't have blocked if we'd called it instead of calling
select".

Post by Thomas J. Hruska
Not every day I learn something new. Though I'm slightly horrified
because my code base is extensive and I do quite a bit of socket
programming...and I've been doing it wrong for about 7 years. This is
apparently going to be a LONG weekend.

If even the reference literature gets it wrong...

Y. -- who'll be writing better code from now on


Steffen DETTMER

2007-08-29 09:32:28 UTC

Hi!

I think it is important to note that a blocking read usually
should return if one single byte is available (even if more had
been requested)

Correct.

and a blocking write should return as soon as at
least one byte has been written.

No. A blocking write should block until all the requested data can be
written.

ahh, interesting. Why should it? here, some (ancient) man 2 write tells:

write writes up to count bytes to the file referenced by the
file descriptor fd from the buffer starting at buf.

also, there is:

POSIX requires that a read() which can be proved to occur
after a write() has returned returns the new data. Note that
not all file systems are POSIX conforming.

but I think this does not require write to block that long,
because an efficient implementation could queue the rest of the
data (e.g. if an NFS server is slow or a serial line is not ready),
return from write, `remember' this when read is called on this fd,
and empty the queue before internally attempting the read. I
consider this convenient for the calling code (I have some simple
buffering protocol implementations doing something like this).

So wouldn't it be correct to implement a write that
systematically writes one single byte per call only (correct even
if inefficient, of course)?

Are signals and EINTR also of concern? I think in practice they
do not make things any easier...

For buffered communications, I assume, this should always be
possible. If there is data stored in some internal buffer (making
select returning `ready for reading'), this data must be readable
or an error must be returned. I would consider the loss of the
data from such an internal buffer an error which should be
reported (instead of blocking). `ready for writing' in such a
subsystem would mean that some write buffer has at least one byte
free and thus the next write call can return (without blocking).

You are thinking 'TCP' while we are talking about the semantics
for 'select'. Think about UDP. Is there anything in the
standard that prohibits an implementation under memory pressure
from just discarding a UDP receive queue?

Yes, there are the select usage semantics :-)
Discarding a UDP receive queue after select has reported that it
holds data could lead to a blocking read, I see. I don't think it
matters much here that UDP `allows' queue discarding because it is
unreliable, since, as you said, select is protocol independent. So I
think that if select exists (according to my ancient man page) `to
see if a read will not block' (<-- citation), a correct
implementation needs to guarantee that. If queue discarding is
possible, a flag must be stored (or so) to make read return EAGAIN
or whatever (probably causing most code to break anyway, because
no one expects EAGAIN after select > 0 :-)).

Of course, it probably isn't the best idea to rely on that, maybe
some embedded highly size optimised lib makes the one or other
compromise or so... :)

The statement `to see if a read will not block' does not sound
very concrete or formal. For instance, only the next read can be in
scope and probably only if no other call (recv, write, don't
know) is performed on this fd - I guess.

Now, think TCP for a second. Suppose a system received some TCP
data but delayed the ACK. It unblocks 'select', but later comes
under extreme memory pressure, so it discards the
unacknowledged data. Now, as far as I know, no implementation
does this. But nothing in the standard says it cannot do this
if it wants to.

Yes, and additionally, there may be implementations supporting a
select function but at the same time not even conforming to the
standard, I think such `TCP stacks' exist.
BTW, which standard would it be, `4.4BSD'?

If it cannot be written for any reason I think a nice
implementation should return an error in this case.

Unfortunately, that's just not possible. The problem is that this would
require figuring out which socket operations are the subsequent operations
that you think should not block. This can't be done reliably, and will
occasionally break code that works fine now.

mmm... If the man page says that `a read will not block' (which,
BTW, is also stated for write, but not for accept), I think the
possible subsequent operations are defined: read or write.

I understand select as a call to (more or less) report the state
of some internal communication buffers and expect to make
guarantees about those buffers (no concurrent threads etc of
course).

The implementation simply cannot predict the future.

Do we speak about future `broken' implementations? If taking
implementation bugs into account, of course /nothing/ could ever
be guaranteed. Maybe even the select blocks in kernel 2.8.1241
because of a bug, sure :)

What I still have not understood is whether this really is a bug (as
my man pages suggest) or whether it is an acceptable (although less
performant and comfortable) implementation.

In the file size example, I expect read to return 0. I made a
small test program and on Linux (accidentally?) it does not block
when reading a truncated file (actually, select even returns
`ready for read' on an empty file).

A file is always ready. There is never anything to wait for.

I disagree here. Files may not be ready, because an NFS server may
not be responding, a USB stick may be slow, an FTP file system may
need to dial up an ISDN line, the FIFO or STDIN could be empty,
the hard disk may be busy and need some milliseconds to read in
the requested blocks --- or the file could be a device or a
socket :)

David, do you mean `it cannot be guaranteed because no
implementation can be guaranteed to be 100% correct and may fail
in a complex situation' or do you mean `it cannot be guaranteed
logically/theoretically, even if the implementation is assumed to
be 100% correct, because of a logical dilemma'?

One could theoretically make an implementation that did 100%
guarantee it. But it would only make code that's broken appear
to work and it would break some code that currently works.
Consider any current program that calls 'select' just as a
status-reporting function and then expects a subsequent send to
block until all the data can be sent.

mmm... apart from the fact that I'd already consider a program
broken if it expects write to block until all data has been written,
I see your point, ok... So you say that theoretically it is
guaranteed that read won't block after select but practically there
will always be a risk of an implementation not fulfilling this
guarantee, right?

oki,

Steffen



David Schwartz

2007-08-29 15:07:04 UTC

Post by Steffen DETTMER
Hi!

I think it is important to note that a blocking read usually
should return if one single byte is available (even if more had
been requested)

Correct.

and a blocking write should return as soon as at
least one byte has been written.

No. A blocking write should block until all the requested data can be
written.

ahh, interesting. Why should it?

Because this is what most people want it to do. If you tried to write two
bytes, why would you want to wait until the first one could be written but
not wait until the second one could be written? It just doesn't make much
sense.

Post by Steffen DETTMER
write writes up to count bytes to the file referenced by the
file descriptor fd from the buffer starting at buf.
POSIX requires that a read() which can be proved to occur
after a write() has returned returns the new data. Note that
not all file systems are POSIX conforming.
but I think this does not require write to block that long,
because an efficient implementation could queue the rest of the
data (e.g. if an NFS server is slow or a serial line not ready),
return in write, `remembering' this when read on this fd is
called and before internally attempting the read functionality,
it could empty the queue. I consider this comfortable for the
calling code (I have some simple buffering protocol
implementations doing something like this).
So wouldn't it be correct to implement a write that
systematically writes one single byte per call only (correct even
if inefficient, of course)?

Yes, it would be correct. It just would be sub-optimal.

There is some code out there that assumes that a blocking write will fully
complete unless there is an error. This code is, as you point out, broken.

However, a 'read' that blocked after some data was received would be broken.

Post by Steffen DETTMER
Are signals and EINTR also of concern? I think in practice they
do not make things any easier...

Most implementations will return a short write under some circumstances.
Interruption is definitely one of them.
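
Code that wants "all or error" behaviour from a blocking descriptor therefore has to loop. A minimal sketch, assuming a POSIX system; the helper name write_all is hypothetical and not part of any library discussed here.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes to a blocking fd, tolerating short writes and EINTR.
   Returns 0 once everything was written, -1 on error (errno set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;          /* interrupted before any byte was written */
            return -1;
        }
        p += n;                    /* short write: advance and try the rest */
        len -= (size_t)n;
    }
    return 0;
}
```

This works whether the implementation writes everything at once or one byte per call, so it does not depend on either reading of the man page.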

That's simply impossible to do. The problem is that there is no unambiguous
way to figure out whether an operation is the one that's not supposed to
block. Consider:

1) A thread calls 'select'.

2) That thread later calls 'read'.

If the 'select' changes the semantics of the 'read', then if the thread
didn't know that some other code called 'select' earlier, the later
read-calling code can break. Sometimes people call 'select' just for
statistical purposes and don't check the descriptor just because of what
'select' returned.

Post by Steffen DETTMER
Of course, it probably isn't the best idea to rely on that, maybe
some embedded highly size optimised lib makes the one or other
compromise or so... :)

There are known implementations that perform some data integrity checks at
'recv' time, so a 'select' hit that results in the data being dropped later
can lead to 'recvmsg' blocking. You can argue that these implementations are
deficient, but I think that argument would be inconsistent. This behavior is
accepted with 'accept', and it's precisely the same issue.

Post by Steffen DETTMER
The statement `to see if a read will not block' does not sound
very concrete or formal. For instance, only the next read can be in
scope and probably only if no other call (recv, write, don't
know) is performed on this fd - I guess.

The problem is that it becomes very hard to figure out what an "other call"
is. What about 'setsockopt'? If you don't even know what you're asking for,
you wouldn't even know if you had it. ;)

I'm talking about The Single Unix Specification or The Open Group Base
Specification.
http://www.opengroup.org/onlinepubs/009695399/functions/FD_SET.html
This is reasonably clear that 'select' reports current status, just as
functions like 'stat' do. They provide no more of a future guarantee than
functions like 'stat' do.

If it cannot be written for any reason I think a nice
implementation should return an error in this case.

mmm... If the man page says that `a read will not block' (which,
BTW, is also stated for write, but not for accept), I think the
possible subsequent operations are defined: read or write.

Why include "write" in the claim that a read will not block? What about
"setsockopt"? In any event, only Windows has documentation that uses that
kind of language. I think it was just sloppiness, as the select function on
Windows was a hack job just to support compatibility with existing Berkeley
sockets implementations.

I understand select as a call to (more or less) report the state
of some internal communication buffers and expect to make
guarantees about those buffers (no concurrent threads etc of
course).

The implementation simply cannot predict the future.

These are not implementation bugs. These are implementation quirks. They
don't violate the relevant standards, so your code has to tolerate them if
it claims compliance with those standards.

A file is always ready. There is never anything to wait for.

Nevertheless, it's ready and cannot be waited for. With slow file systems,
they generally try to perfectly mimic the semantics of fast file systems. An
empty file is still ready to be read right now, to report correctly that it
is empty.
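
A small sketch of the empty-file case, assuming a POSIX system (where regular files always select as ready). The function name empty_file_is_ready is hypothetical, and tmpfile() is used only to obtain an empty regular file.

```c
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if select() marks an empty regular file as readable and the
   following read() reports end-of-file (0 bytes) without blocking. */
int empty_file_is_ready(void)
{
    FILE *tmp = tmpfile();             /* empty temporary regular file */
    if (tmp == NULL)
        return 0;
    int fd = fileno(tmp);

    fd_set readfds;
    struct timeval tv = { 0, 0 };      /* poll: do not wait at all */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);

    char byte;
    ssize_t n = read(fd, &byte, 1);    /* "ready" here means: EOF is reportable */

    fclose(tmp);
    return ready == 1 && n == 0;
}
```

"Ready" does not mean "has data"; it means the call can give an answer now, and for an empty file that answer is end-of-file.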

No, it's not guaranteed even theoretically. The 'select' function is simply a
status-reporting function that does not and cannot guarantee what will
happen in the future. Assuming it will is as serious a bug as checking
permissions with something like 'access' and then assuming the information
must still be valid in the future.
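
The 'access' comparison can be sketched as follows, assuming a POSIX system. Both function names are hypothetical; the first shows the check-then-use pattern being criticised, the second the usual alternative of attempting the operation and handling its failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Racy pattern: the answer from access() describes the past. The file may be
   removed or have its permissions changed before open() runs. */
int open_checked_first(const char *path)
{
    if (access(path, R_OK) != 0)
        return -1;
    return open(path, O_RDONLY);   /* can still fail despite the check */
}

/* Sturdier pattern: just try, and treat failure as a normal outcome.
   Returns a descriptor, or -1 with errno set by open(). */
int open_directly(const char *path)
{
    return open(path, O_RDONLY);
}
```

The caller of open_checked_first() must handle a failing open() anyway, so the earlier check buys nothing; the same reasoning applies to calling 'select' before a blocking 'read'.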

DS


Yves Rutschle

2007-08-29 16:40:18 UTC

Post by Steffen DETTMER
Yes, and additionally, there may be implementations supporting a
select function but at the same time not even conforming to the
standard, I think such `TCP stacks' exist.
BTW, which standard would it be, `4.4BSD'?

Actually, this page says:

"A descriptor shall be considered ready for reading when a
call to an input function with O_NONBLOCK clear would not
block, whether or not the function would transfer data
successfully."

Is that not to say that if select() says it's ready to read,
I'm guaranteed my next read() won't block?

I notice the 'would not block' instead of 'will not block'.
That makes me uneasy.

Y.

David Schwartz

2007-08-29 19:16:00 UTC

Post by Yves Rutschle
"A descriptor shall be considered ready for reading when a
call to an input function with O_NONBLOCK clear would not
block, whether or not the function would transfer data
successfully."

Right, that is a hypothetical concurrent read.

Post by Yves Rutschle
Is that not to say that if select() says it's ready to read,
I'm guaranteed my next read() won't block?

Nope. To say that, it would have to say "will not block". Instead it says
"would not block". This refers to a hypothetical concurrent call not
blocking at some instant in-between when you called 'select' and when it
returned.

For example, if you call 'stat' and get the size of a file, that size is the
size at some instant in-between when you called 'stat' and when it returned.
This is the same for all status-reporting functions.
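
The 'stat' analogy in runnable form, as a sketch under assumptions: it uses fstat() on a temporary file instead of stat() on a path, and the function name stat_size_goes_stale is hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if a size reported by fstat() no longer held by the time a
   later call looked at the same file. */
int stat_size_goes_stale(void)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return 0;
    int fd = fileno(tmp);
    struct stat before, after;
    int stale = 0;

    if (write(fd, "0123456789", 10) == 10 && fstat(fd, &before) == 0) {
        /* Something shrinks the file after the status call has returned. */
        if (ftruncate(fd, 0) == 0 && fstat(fd, &after) == 0)
            stale = (before.st_size == 10 && after.st_size == 0);
    }
    fclose(tmp);
    return stale;
}
```

The first fstat() answer was true at the instant it was taken and says nothing about what a later call will see.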

Post by Yves Rutschle
I notice the 'would not block' instead of 'will not block'.
That makes me uneasy.

Exactly. It is not referring to any actual call.

DS


Steffen DETTMER

2007-08-30 10:57:32 UTC

and a blocking write should return as soon as at
least one byte has been written.

No. A blocking write should block until all the requested data can be
written.

ahh, interesting. Why should it?

Because this is what most people want it to do.

This is acceptable for Perl, but not for C :-) Even if most
people would want a write contradicting its man page, I'd still
consider it wrong :)

Post by David Schwartz
If you tried to write two bytes, why would you want to wait
until the first one could be written but not wait until the
second one could be written? It just doesn't make much sense.

If the first byte (or any part of the buffer) could be written
instantly or (e.g. if no select returned ready before :)) after
some amount of time waited, write should return to give the
calling application the control.

I looked up write(2) on opengroup.org and found a page that
surprised me :)

The information on opengroup.org says `The write() function shall
attempt to write nbyte bytes from the buffer...'. My man page
says `write writes up to count bytes to the file...'. My man
page claims to conform to `SVr4, SVID, POSIX, X/OPEN,
4.3BSD'. The opengroup.org page distinguishes write semantics
based on what kind of object the fd refers to (file, FIFO, STREAM,
...), which IMHO cannot be correct because it destroys the
abstraction.

It would be a pity if some implementations would follow the
opengroup.org description and change the write semantics,
wouldn't it?

On this page there are more `violations of the file abstraction';
someone could get the impression that `regular files on an ext2 file
system' or the like were the only true files in the world, for
instance `The read() function reads data previously written to a
file.'. This is simply not true:

***@trinida:~ # md5sum tmp
fcbb60277dee9a96ec9c2fbfa47a478d tmp
***@trinida:~ # dd of=/dev/urandom if=tmp count=100 bs=1
100+0 records in
100+0 records out
***@trinida:~ # dd if=/dev/urandom of=tmp count=100 bs=1
100+0 records in
100+0 records out
***@trinida:~ # md5sum tmp
5cdceb17716e1dd4441f9bd4027fd75e tmp

:-)

When the file is /dev/urandom (a random number generator device,
at least on Linux), it shall NOT return the data previously
written. For devices (and sockets :)), I think this is obvious
anyway.

Post by David Schwartz
There is some code out there that assumes that a blocking write
will fully complete unless there is an error. This code is, as
you point out, broken.
However, a 'read' that blocked after some data was received
would be broken.

ahh, ok, yes. I think this is somewhat `symmetric' for read and
write.
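
A minimal sketch of code that does not assume a blocking write
completes fully, assuming POSIX write(). The helper name write_all is
hypothetical and not from this thread; it simply loops until every byte
has been accepted or a real error occurs:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all len bytes have
 * been written or a real error occurs.
 * Returns 0 on success, -1 on error (errno is left as set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before anything was written: retry */
            return -1;      /* real error; on a non-blocking fd this includes EAGAIN */
        }
        p += n;             /* short write: skip what was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

With a loop like this, it no longer matters whether a particular
write() returns after the first byte or only after the last one.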

Post by Steffen DETTMER
correct implementation needs to guarantee that. If queue
discarding is possible, a flag must be stored (or so) to make
read return EAGAIN or whatever (probably causing most code to
break anyway, because no one expects EAGAIN after select > 0 :-)).

That's simply impossible to do. The problem is that there is no
unambiguous way to figure out whether an operation is the one
1) A thread calls 'select'.
2) That thread later calls 'read'.
If the 'select' changes the semantics of the 'read', then if the thread
didn't know that some other code called 'select' earlier, the later
read-calling code can break.

If some other thread called read (or another function), of course
select must be called again before the next read. Guarantees can only
be made for the next read, not for the 103rd read called
after the second reboot :)

I think the API must behave in the same way regardless of
whether it is used by multiple threads or a single one, if possible.
Of course, care must be taken when e.g. two threads call select
on the same file descriptor and so on. Many ways to get it very
complex, hum...

Post by Steffen DETTMER
Of course, it probably isn't the best idea to rely on that,
maybe some embedded highly size optimised lib makes the one
or other compromise or so... :)

There are known implementations that perform some data
integrity checks at 'recv' time, so a 'select' hit that results
in the data being dropped later can lead to 'recvmsg' blocking.

I think, select simply cannot work on the low layer that may have
data that may not be valid because of pending integrity checks.
select must work on the same buffer as read (which gets
integrity checked data only).

Post by David Schwartz
You can argue that these implementations are deficient, but I
think that argument would be inconsistent. This behavior is
accepted with 'accept', and it's precisely the same issue.

mmm... my manpage talks about select and read (and select and
write), but not about accept. So I think this means for accept no
guarantees are made. My accept man page states clearly `To ensure
that accept never blocks, the passed socket s needs to have the
O_NONBLOCK flag set'.

I think, read and accept in conjunction with select simply have
different semantics. Is that right?
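
A minimal sketch of what the accept man page sentence means in
practice, assuming a POSIX system and a listening socket that the
caller has already put into non-blocking mode with O_NONBLOCK. The
helper name try_accept and its would_block out-parameter are
hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: call accept() once on a listening socket that
 * is assumed to be in non-blocking mode already.
 * Returns the new connection's fd, or -1. On -1, *would_block is 1 if
 * there was simply nothing to accept right now (try again after the
 * next select) and 0 for a real error. */
int try_accept(int listen_fd, int *would_block)
{
    int fd;

    assert(would_block != NULL);
    *would_block = 0;
    fd = accept(listen_fd, NULL, NULL);
    if (fd < 0 && (errno == EAGAIN || errno == EWOULDBLOCK ||
                   errno == ECONNABORTED || errno == EINTR))
        *would_block = 1;   /* no connection yet, or it vanished after select */
    return fd;
}
```

Called after select reports the listening socket readable, this
either returns a connection or reports "nothing there", but it does
not hang if the pending connection was dropped in between.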

Post by Steffen DETTMER
The statement `to see if a read will not block' does not
sound very concrete or formal. For instance, only the next
read can be in scope and probably only if no other call
(recv, write, don't know) is performed on this fd - I guess.

The problem is that it becomes very hard to figure out what an
"other call" is. What about 'setsockopt'? If you don't even
know what you're asking for, you wouldn't even know if you had
it. ;)

setsockopt gets a file descriptor (socket) as parameter, so it
counts as another call. Maybe there are problematic calls, right,
or sockets/files shared through fork(), where processes may
influence each other, so that to one process the
behavior might look strange, but maybe in fact it is as expected,
just confusing because of the other processes' actions.

I think best is to avoid such constructions, seems to be very
difficult to handle...

Thanks for the link. My man page and my Linux boxes here do not
comply with it (fortunately :)). But what a pity that there are
multiple selects. I thought, for sockets or select some agreement
(as 4.4BSD) would be commonly accepted. Now there is some Single
Unix doing it differently and POSIX got a pselect... What a
pity.

(If `Portable Operating System Interface for Unix' and `Single
Unix' contradict, maybe it had better been called `Yet another
Unix' or so grpmf...)

So a BSD/Linux/POSIX compliant program working on a
BSD/Linux/POSIX compliant select won't work on a Single Unix
compliant OS.

That would mean, as I understand it, that a program needs to know
whether it runs on e.g. Solaris (Single Unix) or Linux (POSIX) to
know how to use select? Is this really true? Or did I just
misunderstand?

This just shows that I have no clue about standards (I thought SUS
would be `POSIX compliant' and BSD and Linux would comply in
large parts).

Post by Steffen DETTMER
mmm... If the man page tells that `a read will not block' (which,
BTW, is also told for write, but not for accept), I think the
possible subsequent operations are defined: read or write.

Why include "write" in the claim that a read will not block?

Sorry, I meant the `a read will not block' when the fd is
listed in readfds - similarly this is stated for write: `write will
not block' when the fd is listed in writefds.

Post by David Schwartz
What about "setsockopt"?

I don't know if it could block, I think theoretically such an option
may exist. Usually, I would expect it not to block (regardless of
whether select was called or not). I don't know when someone
would call select before setsockopt (as you surely noted I'm not
so experienced in all those details).

I think, after setsockopt usually someone could simply reinvoke
select to check if read should be called (or whatever). I would
not even call select before setsockopt, only before read.
Actually, my code calls setsockopt only `at the beginning' after
accept without any select. Is this wrong?

Post by David Schwartz
In any event, only Windows has documentation that uses that
kind of language. I think it was just sloppiness, as the select
function on Windows was a hack job just to support
compatibility with existing Berkeley sockets implementations.

:-)

Similarly for other TCP/IP stacks, maybe for embedded devices.
Or for Linux vs. Single Unix as it turned out...

A file is always ready. There is never anything to wait
for.

I disagree here. Files may not be ready, because an NFS server
may not be responding, a USB stick may be slow, an FTP file
system may need to dial up an ISDN line, the FIFO or STDIN
could be empty, the harddisk may be busy and need some
milliseconds to read in the requested blocks --- or the file
could be a device or a socket :)

Nevertheless, it's ready and cannot be waited for. With slow
file systems, they generally try to perfectly mimic the
semantics of fast file systems. An empty file is still ready to
be read right now, to report correctly that it is empty.

But select on STDIN usually works (on Linux) - is this Linux
specific? I would consider it almost useless, if select couldn't
be called on STDIN because STDIN could be a `fast file system'!
Does this also mean that select wouldn't work as I expect it when
reading e.g. an ext2 filesystem from slow media, let's say an NFS
loopback, USB stick or floppy disk? I assumed it would work, but
now that you point it out I find no statement about it in the man
pages... :(

Post by Steffen DETTMER
mmm... besides that I'd consider a program expecting write to block
until all data has been written already broken, I see your point,
ok... So you say that theoretically it is guaranteed that read
won't block after select but practically there will always be a
risk to have an implementation not fulfilling this guarantee,
right?

No, it's not guaranteed even theoretically. The 'select'
function is simply a status-reporting function that does not
and cannot guarantee what will happen in the future.

sorry, I don't get this. I'm afraid that this is becoming a kind of
circle :)

For me, select technically reports the state of a buffer. Of
course it could be implemented with a one byte buffer or maybe
even without any buffer but a flag (or set of flags). Anyway. So
lets say it is a buffer.

Because the access to the buffer is limited by the same API, I
think it should be able to guarantee this. If the buffer is emptied,
by a file truncation or so, either this could go unnoticed
(meaning that the read returns the data, as happens when
using fread buffers) or EOF would be returned - nonblocking in
any case. The status of the buffers encapsulated and hidden in
the implementation won't change in the future except through the
implementation - and this could be caught to make read at least
return an error instead of blocking. Actually, what happens with the
`source' of the data for this buffer (file, tcp, udp) does not
matter at all.

Without any buffers (or flags), select cannot be implemented of
course, because it is required to limit/control the access to it
(or at least to get reliable information/notification about to
set a flag).

So I would conclude that it is a status-reporting function but
also could guarantee. What do I miss?

Post by David Schwartz
Assuming it will is as serious a bug as checking permissions
with something like 'access' and then assuming the information
must still be valid in the future.

but access makes no statement at all about blocking/nonblocking
future calls? Also, I would assume that the information from access
will be valid in the future as long as no other call is made to
this resource (e.g. a chown via NFS or simply that the file would
be removed by another process - which requires calls to this
resource, e.g. a remote unlink or so). Maybe access is a slightly
different topic?

Is that not to say that if select() says it's ready to
read, I'm guaranteed my next read() won't block?

I think `will not block' isn't quite the right wording, because it is
not required to even call read at all, and if it is not
called it can't be said whether it blocks or does not block,
but my English is far from good enough to understand
such specifics correctly.

However, the `hypothetical concurrent call' IMHO is hypothetical
:) - maybe some `hypothetical sequential call' was meant, could
this be the case? I'm not sure if the word sequential is right,
maybe proximate, successive or even `in direct succession'?

At least the discussion IMHO shows that specs are not clear and
using APIs correctly is a challenge, that because of those doubts the
best practice is to explicitly use non-blocking fds and that the
best documentation is no replacement for deep testing :-)
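
A minimal sketch of that practice, assuming POSIX fcntl(), select()
and read(). The helper names set_nonblocking and read_when_ready, and
the -2 "nothing to read right now" return value, are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: wait up to timeout_ms for select() to report fd
 * readable, then attempt a single read(). The fd is assumed to be
 * non-blocking already, so a stale readiness report cannot hang us.
 * Returns bytes read (0 means EOF), -1 on error, or -2 if nothing
 * could be read (timeout, or the data was gone by the time we read). */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    ssize_t n;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return -2;          /* timed out, nothing reported readable */

    n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;          /* select said ready, but read would have blocked */
    return n;
}
```

Here select is treated only as a hint about when to try; the
non-blocking flag is what actually keeps read from blocking.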

oki,

Steffen

About Ingenico Throughout the world businesses rely on Ingenico for secure and expedient electronic transaction acceptance. Ingenico products leverage proven technology, established standards and unparalleled ergonomics to provide optimal reliability, versatility and usability. This comprehensive range of products is complemented by a global array of services and partnerships, enabling businesses in a number of vertical sectors to accept transactions anywhere their business takes them.
www.ingenico.com This message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation.

David Schwartz

2007-08-30 20:44:46 UTC

Post by Steffen DETTMER
This is acceptable for Perl, but not for C :-) Even if most
people would want a write contradicting its man page, I'd still
consider it wrong :)

I don't follow you.

I can think of no situation where you'd want to wait forever for the first
byte to be sent but only for a certain amount of time for the second byte to
be sent. That's one of the strangest suggestions I've ever heard.

Post by Steffen DETTMER
I looked up write(2) on opengroup.org and found a page that
surprised me :)
The information on opengroup.org says `The write() function shall
attempt to write nbyte bytes from the buffer...'. My man page
says `write writes up to count bytes to the file...'. My man
page claims to be conforming to `SVr4, SVID, POSIX, X/OPEN,
4.3BSD'. The opengroup.org page distinguishes write semantics
based on what kind of fd it is (file, FIFO, STREAM, ...) which
IMHO cannot be correct because it destroys the abstraction.

You don't really have a choice. If they published only the semantics for
'write' that applied to every possible thing you could ever want to write
to, they wouldn't be enough to allow you to write sane programs to deal with
sockets, files, or anything for that matter.

For example, suppose they only documented the semantics for 'select' that
applied to everything you could ever 'select' on. That would mean they
couldn't tell you that a listening TCP socket that had a new connection
would be marked ready for reading, because that applies only to sockets. So
then how would you know how to use 'select' for that?

Post by Steffen DETTMER
It would be a pity if some implementations would follow the
opengroup.org description and change the write semantics,
wouldn't it?

Huh?

Post by Steffen DETTMER
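On this page there are more `violations of file abstractions';
someone could get the impression that `regular files on an
ext2 file system' or the like are the only true files in the world, for
instance `The read() function reads data previously written to a
file.'. This is simply not true:
***@trinida:~ # md5sum tmp
fcbb60277dee9a96ec9c2fbfa47a478d tmp
***@trinida:~ # dd of=/dev/urandom if=tmp count=100 bs=1
100+0 records in
100+0 records out
***@trinida:~ # dd if=/dev/urandom of=tmp count=100 bs=1
100+0 records in
100+0 records out
***@trinida:~ # md5sum tmp
5cdceb17716e1dd4441f9bd4027fd75e tmp
:-)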

/dev/urandom isn't a file, it's a device.

Post by Steffen DETTMER
When the file is /dev/urandom (a random number generator device,
at least on Linux), it shall NOT return the data previously
written. For devices (and sockets :)), I think this is obvious
anyway.

Again, /dev/urandom is not a file. If it helps, where you see the word
"file" replace the definition of "file" from the standard. (Or the words
"regular file".)

No, you missed my entire point. Please read it again. I was talking about
*THAT* read, not another read after that.

Post by Steffen DETTMER
I think the API must behave in the same way regardless of
whether it is used by multiple threads or a single one, if possible.
Of course, care must be taken when e.g. two threads call select
on the same file descriptor and so on. Many ways to get it very
complex, hum...

Exactly, and because of that, it's impossible for 'select' to change the
semantics of a following 'read' or 'write'. There is no way to know whether
an application considers a particular write to be a "following" operation,
and it may be relying on the current semantics.

Post by Steffen DETTMER
Of course, it probably isn't the best idea to rely on that,
maybe some embedded highly size optimised lib makes the one
or other compromise or so... :)

There are known implementations that perform some data
integrity checks at 'recv' time, so a 'select' hit that results
in the data being dropped later can lead to 'recvmsg' blocking.

In other words, 'select' must predict the future. Sorry, that's not
possible. There is no way for 'select' to know what integrity checks will be
performed at read time.

Consider the following:

1) An application disables UDP checksums.

2) An application calls 'select' and gets a 'read' hit on a packet with a
bad checksum.

3) An application performs a socket option call asking for checksum checking
to be enabled.

4) An application calls 'recvmsg'.

Should it get the packet with the bad checksum? In other words, are you
really sure you want 'select' to *change* the semantics of the socket?

mmm... my manpage talks about select and read (and select and
write), but not about accept. So I think this means for accept no
guarantees are made. My accept man page states clearly `To ensure
that accept never blocks, the passed socket s needs to have the
O_NONBLOCK flag set'.
I think, read and accept in conjunction with select simply have
different semantics. Is that right?

They have different semantics for those devices that specify different
semantics. For 'select' overall, they have precisely the same semantics. One
checks for device readability, one checks for device writability, and what
that means depends on the device.

Post by Steffen DETTMER
The statement `to see if a read will not block' does not
sound very concrete or formal. For instance, only the next
read can be in scope and probably only if no other call
(recv, write, don't know) is performed on this fd - I guess.

The problem is that it becomes very hard to figure out what an
"other call" is. What about 'setsockopt'? If you don't even
know what you're asking for, you wouldn't even know if you had
it. ;)

setsockopt gets a file descriptor (socket) as parameter, so it
counts as another call. Maybe there are problematic calls, right,
or sockets/files shared through fork(), where processes may
influence each other, so that to one process the
behavior might look strange, but maybe in fact it is as expected,
just confusing because of the other processes' actions.
I think best is to avoid such constructions, seems to be very
difficult to handle...

I bring them up to address the fundamental point -- you don't want 'select'
to change the semantics of future socket operations. As a result, you can't
ask for 'select' to make future guarantees. You really can't have one
without the other.

Post by Steffen DETTMER
So a BSD/Linux/POSIX compliant program working on a
BSD/Linux/POSIX compliant select won't work on a Single Unix
compliant OS.
That would mean, as I understand it, that a program needs to know
whether it runs on e.g. Solaris (Single Unix) or Linux (POSIX) to
know how to use select? Is this really true? Or did I just
misunderstand?

No. The standards are compatible.

Post by David Schwartz
Nevertheless, it's ready and cannot be waited for. With slow
file systems, they generally try to perfectly mimic the
semantics of fast file systems. An empty file is still ready to
be read right now, to report correctly that it is empty.

In general, 'select' is useless on ordinary files. How would it work with an
NFS file? If you 'select' on a file for readability, what are you waiting
for? What's going to change? Since no operation ever blocks on a regular
file, what would "readiness" mean in that context?
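
A minimal sketch that makes this visible, assuming POSIX select() (which
specifies that regular files always select true for reading). The helper
names readable_now and file_readable_now are hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: poll fd once with a zero timeout.
 * Returns 1 if select() reports it readable, 0 if not, -1 on error. */
int readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};     /* zero timeout: poll, do not wait */
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}

/* Hypothetical helper: open path read-only and poll it once.
 * Returns what readable_now() returns, or -1 if the open fails. */
int file_readable_now(const char *path)
{
    int fd, rc;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = readable_now(fd);
    close(fd);
    return rc;
}
```

For a regular file, even an empty one, file_readable_now is expected
to report 1, while readable_now on the read end of an empty pipe
reports 0. So for regular files the answer carries no information
about how long the read itself will take.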

Post by Steffen DETTMER
Because the access to the buffer is limited by the same API, I
think it should be able to guarantee this. If the buffer is emptied,
by a file truncation or so, either this could go unnoticed
(meaning that the read returns the data, as happens when
using fread buffers) or EOF would be returned - nonblocking in
any case. The status of the buffers encapsulated and hidden in
the implementation won't change in the future except through the
implementation - and this could be caught to make read at least
return an error instead of blocking. Actually, what happens with the
`source' of the data for this buffer (file, tcp, udp) does not
matter at all.

But it can't be trapped to make read return an error instead of blocking.
That would require some unambiguous way to tell *which* 'read' the
application considered to be subsequent to the 'select'. As I've already
explained, that's impossible.

Post by Steffen DETTMER
So I would conclude that it is a status-reporting function but
also could guarantee. What do I miss?

That it cannot guarantee. If something after the 'select' returns success
causes the condition to change, the guarantee can only be sustained by an
unambiguous way to identify the "subsequent" operation, and as I've already
explained, that's impossible.

Post by David Schwartz
Assuming it will is as serious a bug as checking permissions
with something like 'access' and then assuming the information
must still be valid in the future.

No, that's exactly right. It's valid in the future so long as nothing
changes. The problem is, a network connection has another end and that can
change things. It also has timers associated with it, and that can change
things. The system can also be under memory pressure, and that can change
things. The information is valid until something changes it.

Post by Steffen DETTMER
I think `will not block' isn't quite the right wording, because it is
not required to even call read at all, and if it is not
called it can't be said whether it blocks or does not block,
but my English is far from good enough to understand
such specifics correctly.

You have a point there.

Post by Steffen DETTMER
However, the `hypothetical concurrent call' IMHO is hypothetical
:) - maybe some `hypothetical sequential call' was meant, could
this be the case? I'm not sure if the word sequential is right,
maybe proximate, successive or even `in direct succession'?

That's the problem. The call would have to have nothing intervening that
could change the status. With network connections, the other end, timers,
and even system memory pressure can change the status.

Post by Steffen DETTMER
At least the discussion IMHO shows that specs are not clear and
using APIs correctly is a challenge, that because of those doubts the
best practice is to explicitly use non-blocking fds and that the
best documentation is no replacement for deep testing :-)

I agree with that. Don't assume you have a guarantee that isn't
explicit in the standard and that you don't even need. Especially when
precisely that has caused code to break in the past.

DS


Steffen DETTMER

2007-09-03 20:05:33 UTC

Post by Steffen DETTMER
If the first byte (or any part of the buffer) could be
written instantly or (e.g. if no select returned ready before
:)) after some amount of time waited, write should return to
give the calling application the control.

I can think of no situation where you'd want to wait forever
for the first byte to be sent but only for a certain amount of
time for the second byte to be sent. That's one of the
strangest suggestions I've ever heard.

I cannot imagine any situation where one would wait forever, right, but
for some small command line tools (interruptible by ^C or so)
sometimes it makes sense to program in such a way. Some small
helper tool may wait for a request to arrive (forever or ^C,
whatever happens first :)), but `during' communication, i.e.
after the first write worked, some different error handling after
some reasonable timeout is needed (however, this may not match
exactly here, because there it is on top of some protocol on top
of a serial link).

Maybe infinite timeouts or blocking I/O makes sense only in
more or less interactive things.

Post by Steffen DETTMER
I looked up write(2) on opengroup.org and found a page that
surprised me :)
The information on opengroup.org says `The write() function
shall attempt to write nbyte bytes from the buffer...'. My
man page says `write writes up to count bytes to the
file...'. My man page claims to be conforming to `SVr4, SVID,
POSIX, X/OPEN, 4.3BSD'. The opengroup.org page distinguishes
write semantics based on what kind of fd it is (file, FIFO,
STREAM, ...) which IMHO cannot be correct because it destroys
the abstraction.

mmm... this would be a pity... I expect that write does
the same reasonable `something' on a socket as on a serial line.
I mean, I don't want a WaitForMultipleObject in case the object
accidentally is something selectable :) but of course there are
always specifics.

Post by David Schwartz
For example, suppose they only documented the semantics for
'select' that applied to everything you could ever 'select' on.
That would mean they couldn't tell you that a listening TCP
socket that had a new connection would be marked ready for
reading, because that applies only to sockets. So then how
would you know how to use 'select' for that?

yeah, but using select before accept has a little taste of a
`workaround' in absence of another call, hasn't it? In this case
probably someone uses select just because its description is
closest to what is desired; according to my man page, technically
select should not be influenced by accept situations, but that would be
a pity. So yes, how would I know how to use 'select' for that?
Now that you raised it, nice question.

Post by David Schwartz
/dev/urandom isn't a file, it's a device.

For me, it is a file. Wikipedia mentions /dev/null as a file
(clarifying that names are something associated with the file
itself)

Post by Steffen DETTMER
When the file is /dev/urandom (a random number generator
device, at least on Linux), it shall NOT return the data
previously written. For devices (and sockets :)), I think
this is obvious anyway.

Again, /dev/urandom is not a file. If it helps, where you see
the word "file" replace the definition of "file" from the
standard. (Or the words "regular file".)

That would be horrible, I think; such a limitation would reduce the
flexibility a lot. I think device files and many other
files (including sockets) are a great idea. It's something
you can read from and/or write to, why should it matter if it is
on a local hard disk?

No, you missed my entire point. Please read it again. I was
talking about *THAT* read, not another read after that.

sorry, seems I'm unable to get it (I read it several times :)). I
think the select could (if needed) store some flag (associated
with some fd) to remember that it reported that read is guaranteed
not to block. Maybe some list including all fds where
select returned this. Any OS function (or, if possible, any OS
function that may influence this fd) resets the flag (no
guarantee anymore). But if read is called and would block because
of some changed situation it could decide to return right before
resetting the flag, maybe setting errno to EAGAIN. So I think the
guarantee itself could be given (not claiming that this would be
a good idea).

Post by Steffen DETTMER
I think the API must behave in the same way regardless of
whether it is used by multiple threads or a single one, if
possible. Of course, care must be taken when e.g. two
threads call select on the same file descriptor and so on.
Many ways to get it very complex, hum...

Exactly, and because of that, it's impossible for 'select' to
change the semantics of a following 'read' or 'write'. There is
no way to know whether an application considers a particular
write to be a "following" operation, and it may be relying on
the current semantics.

Ahh, yes, interesting point! The concurrency may look different,
yeah. But it could be clarified, for instance, if
(semaphore-controlled / mutex protected) no read is done at the
(potentially) same time as select and vice versa. Don't know if
this would help much because it would mean that select and
reading cannot be done concurrently so someone may wonder why to
use threads. But could make sense if defined per fd.

Post by David Schwartz
There are known implementations that perform some data
integrity checks at 'recv' time, so a 'select' hit that results
in the data being dropped later can lead to 'recvmsg' blocking.

In other words, 'select' must predict the future. Sorry, that's
not possible. There is no way for 'select' to know what
integrity checks will be performed at read time.

Why predict future? If data was put to the read buffer (whether
verified or not), select and read won't block. If data is in the
buffer and by contract can be only removed by read (or close
maybe, doesn't matter), read won't block.
Wouldn't this work? I mean, at least theoretically?

Post by David Schwartz
1) An application disables UDP checksums.
2) An application calls 'select' and gets a 'read' hit on a
packet with a bad checksum.

So this would mean the bad checksum would not be
detected/evaluated and the data would be stored to the buffer,
right?

Post by David Schwartz
3) An application performs a socket option call asking for
checksum checking to be enabled.

ok, so from now on newly arriving data would not be stored in the
input buffer unless the checksum was verified (not applied retroactively
of course - wouldn't be possible, because the checksums were not
even stored).

Post by David Schwartz
4) An application calls 'recvmsg'.
Should it get the packet with the bad checksum? In other words,
are you really sure you want 'select' to *change* the semantics
of the socket?

It gets the data arrived in the packet where the checksum was not
evaluated at all, because this was configured, yes, that would be
what I expect. select should not influence the mechanism at all
(checksum verification).

I'm sure you gave this example because my intuitive
interpretation is wrong, right? So where is the mistake?

Post by Steffen DETTMER
I think, read and accept in conjunction with select simply have
different semantics. Is that right?

They have different semantics for those devices that specify
different semantics. For 'select' overall, they have precisely
the same semantics. One checks for device readability, one
checks for device writability, and what that means depends on
the device.

yes, of course, the device may not even know about accept at all.
But I mean, for read non-blocking is guaranteed according to some
understandings but for accept no one claims this (but expects it
because the working examples show it :)).

Post by Steffen DETTMER
setsockopt gets a file descriptor (socket) as parameter, so it
counts as another call. Maybe there are problematic calls, right,
or sockets/files shared through fork(), where processes may
influence each other, so that to one process the
behavior might look strange, but maybe in fact it is as expected,
just confusing because of the other processes' actions.
I think best is to avoid such constructions, seems to be very
difficult to handle...

I bring them up to address the fundamental point -- you don't
want 'select' to change the semantics of future socket
operations. As a result, you can't ask for 'select' to make
future guarantees. You really can't have one without the other.

Ahh, this one I understand! My `flag setting' example surely
would change the semantics of future socket operation. But in a
case the application cannot distinguish (because it has no way to
decide whether this was changed or original behavior).

So you say that it is not only that select does not give this
non-block guarantee, but also that it is not desired to get such
a guarantee because of the price of changed semantics of future
socket operations, right?

What would be bad to change the semantics of future socket
operation, such as guaranteeing read won't block?

Post by Steffen DETTMER
But select on STDIN usually works (on linux) - is this linux
specific? I would consider it almost useless, if select couldn't
be called on STDIN because STDIN could be a `fast file system'!
Does this also mean that select wouldn't work as I expect it when
reading a e.g. ext2 filesystem from a slow media, let's say NFS
loop back, USB Stick or floppy disk? I assumed it would work, but
now as you pointed to it I find no statement about in the man
pages... :(

In general, 'select' is useless on ordinary files.

why that? hard disks are slow, just a few hundred MB per second,
shared among the processes.

Ohh, actually here I assume that select does not only guarantee
that a read will not block, but also that a read will return
quickly. Don't know what the exact definition of `blocking' is.
In what I understand a 30 sec or more (undetermined/unknown in
advance) call would be blocking.

Post by David Schwartz
How would it work with an NFS file?

for read, it would return ready for read as soon as some data
arrived in some local buffer I assume? Of course this requires
the usage of such intermediate buffers (which is not required by
the read API which is unlimited, so a performant implementation
would do things differently).

Post by David Schwartz
If you 'select' on a file for readability, what are you waiting
for? What's going to change?

the block device's data being read into a kernel buffer? That could
take quite some time; for instance, on IDE CD-ROMs a read error may be
reported after 30 or even 120 seconds. It would be a mess if a GUI
blocked for that long. I mean, it would be like explorer.exe when
inserting a CD: stalling. So in this case, the GUI could
select with some short timeout to be able to update the GUI in
between - or so.

Post by David Schwartz
Since no operation ever blocks on a regular file, what would
"readiness" mean in that context?

mmm... in the same way as for sockets of course: your data is
ready to be retrieved right now, instantly so to say.

Post by Steffen DETTMER
So I would conclude that it is a status-reporting function but
also could guarantee. What do I miss?

That it cannot guarantee. If something after the 'select'
returns success causes the condition to change, the guarantee
can only be sustained by an unambiguous way to identify the
"subsequent" operation, and as I've already explained, that's
impossible.

mmm... a file (descriptor) is process-local. Let's require
single-threading. Now we could say: a subsequent operation is the
one that starts (is called) after the first operation has
finished (returned). Of course this would mean that calling
getpid() would expire the guarantee of select (so in practice
this simple approach would make no sense).
Wouldn't this work?

Post by Steffen DETTMER
At least the discussion IMHO shows that specs are not clear
and using APIs correctly is a challenge, because of those
doubts the best practice is to explicitly use non-blocking
fds and that the best documentation is no replacement for
deep testing :-)

I agree with that. Don't assume you have a guarantee that
isn't explicit in the standard and that you don't even need.
Especially when precisely that has caused code to break in the
past.

yeah, I can imagine... well, now I'll try to find out why close
sometimes blocks on my serial line (maybe kind of related, at
least it is causing some kind of defect in some system...).

I still failed to understand all the aspects (unfortunately
including the core aspect), I'll read it all again tomorrow, but
thanks a lot for your patient explanations!

oki,

Steffen

About Ingenico Throughout the world businesses rely on Ingenico for secure and expedient electronic transaction acceptance. Ingenico products leverage proven technology, established standards and unparalleled ergonomics to provide optimal reliability, versatility and usability. This comprehensive range of products is complemented by a global array of services and partnerships, enabling businesses in a number of vertical sectors to accept transactions anywhere their business takes them.
www.ingenico.com This message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation.


David Schwartz

2007-09-04 00:36:18 UTC

Post by Steffen DETTMER
sorry, seems I'm unable to get it (I read it several times :)). I
think the select could (if needed) store some flag (associated
with some fd) to remember that it returned that read must not
block by guarantee. Maybe some list including all fds where
select returned this. Any OS function (or, if possible, any OS
function that may influence this fd) resets the flag (no
guarantee anymore). But if read is called and would block because
of some changed situation it could decide to return right before
resetting the flag, maybe setting errno to EAGAIN. So I think the
guarantee itself could be given (not claiming that this would be
a good idea).

As the examples show, there is no way to figure out *which* 'read' must not
block. There is no unambiguous way to figure out which 'read' the
application thinks of as being the one that should not block. It's hard to
show with 'read', but I can show a simple example with 'write'.

Imagine an implementation that tried to ensure that a 'write' after 'select'
did not block. Consider:

1) An application calls 'select'. But this comes from denial-of-service
attack detection code that's checking to see if the kernel buffers stay
full for too long. It has nothing to do with the I/O code.

2) The application calls 'write', expecting it to block until all the data
can be written.

Your change would break this application, as the 'select' would change the
semantics of the 'write'.

Now, consider this, there's a 'select' from one thread followed by a 'write'
from another thread. Are these two events unrelated, and the application
expects blocking semantics? Or is this the "subsequent write" that the
application expects not to block?

There's no way to tell.

Post by David Schwartz
In other words, 'select' must predict the future. Sorry, that's
not possible. There is no way for 'select' to know what
integrity checks will be performed at read time.

Why predict future?

Because whether or not the subsequent operation blocks depends upon the
condition of the network connection at that time.
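
A minimal sketch of why a readiness report is only a snapshot. It is
written in Python for brevity (its select and socket modules wrap the
same system calls); the function names are made up for illustration,
and a local socketpair stands in for a network connection. Two code
paths act on the same report, but only the first one finds data.

```python
import select
import socket


def readable_now(sock):
    """Return True if select() currently reports sock as readable (no waiting)."""
    ready, _, _ = select.select([sock], [], [], 0)
    return bool(ready)


def demo_stale_readiness():
    a, b = socket.socketpair()
    try:
        b.setblocking(False)          # ask for non-blocking behaviour explicitly
        a.send(b"x")                  # one byte becomes available on b

        saw_ready = readable_now(b)   # the readiness report
        first = b.recv(16)            # some other code path drains the data

        # A second reader acting on the earlier report. On a blocking socket
        # this recv() would now wait; on a non-blocking one it fails fast.
        try:
            b.recv(16)
            second_would_block = False
        except BlockingIOError:
            second_would_block = True
        return saw_ready, first, second_would_block
    finally:
        a.close()
        b.close()
```

Nothing in the 'select' call can tell which of the two recv() calls was
"the" subsequent one; the only thing that makes the second call safe
here is that the socket was set non-blocking.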

Post by Steffen DETTMER
If data was put to the read buffer (whether
verified or not), select and read won't block. If data is in the
buffer and by contract can be only removed by read (or close
maybe, doesn't matter), read won't block.
Wouldn't this work? I mean, at least theoretically?

No, because 'select' has to work on protocols with all different kinds of
semantics. It is not theoretically possible to ensure that these semantics
will make sense with every protocol 'select' might be used with.

Post by David Schwartz
1) An application disables UDP checksums.
2) An application calls 'select' and gets a 'read' hit on a
packet with a bad checksum.

So this would mean the bad checksum would not be
detected/evaluated and the data would be stored to the buffer,
right?

Not likely. That would mean the kernel has to verify the checksum in a
separate operation, which is a waste of memory bandwidth.

Post by David Schwartz
3) An application performs a socket option call asking for
checksum checking to be enabled.

Suppose the input buffer *is* the packet buffer.

But you are demanding that 'select' influence the mechanism, because you are
saying that because there was a 'select' hit, the packet must be returned,
even if a subsequent change asks for it to be discarded.

If not for the 'select' hit, it would be perfectly reasonable to discard the
packet later.

Post by Steffen DETTMER
yes, of course, the device may not even know about accept at all.
But I mean, for read non-blocking is guaranteed according to some
understandings but for accept no one claims this (but expects it
because the working examples show it :)).

This is the problem. At one time, people expected it for 'accept', and their
code broke. Why tell people to repeat that mistake?
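
The 'accept' case can be sketched as follows, again in Python and under
stated assumptions: make_listener and accept_if_ready are hypothetical
helper names, and the listener is a loopback TCP socket. The point is
only that the listening socket is non-blocking, so a connection that
disappears between 'select' and 'accept' yields an error instead of a
hang.

```python
import select
import socket


def make_listener():
    """Hypothetical helper: non-blocking TCP listener on an ephemeral loopback port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(8)
    listener.setblocking(False)
    return listener


def accept_if_ready(listener, timeout):
    """Wait up to `timeout` seconds, then try to accept without ever blocking.

    `listener` must already be non-blocking. Returns a connected socket or None.
    """
    ready, _, _ = select.select([listener], [], [], timeout)
    if not ready:
        return None
    try:
        conn, _addr = listener.accept()
    except (BlockingIOError, ConnectionAbortedError):
        # The pending connection vanished between select() and accept().
        return None
    return conn
```

A caller would simply go back to its 'select' loop when None comes
back, which is the same thing it does when nothing was ready at all.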

Post by David Schwartz
I bring them up to address the fundamental point -- you don't
want 'select' to change the semantics of future socket
operations. As a result, you can't ask for 'select' to make
future guarantees. You really can't have one without the other.

Ahh, this one I understand! My `flag setting' example surely
would change the semantics of future socket operation. But in a
case the application cannot distinguish (because it has no way to
decide whether this was changed or original behavior).
So you say that it is not only that select does not give this
non-block guarantee, but also that it is not desired to get such
a guarantee because of the price of changed semantics of future
socket operations, right?

That's just one of the many reasons.

Post by Steffen DETTMER
What would be bad to change the semantics of future socket
operation, such as guaranteeing read won't block?

There is no unambiguous way to figure out *which* read shouldn't block. And
if you change the semantics for the wrong read, you would break an
application that relies on blocking semantics.

Consider a 'select' followed by a 'read' in another thread. Is that the
operation that shouldn't block or are the 'select' and the 'read' unrelated?

Post by David Schwartz
In general, 'select' is useless on ordinary files.

why that? hard disks are slow, just a few hundred MB per second,
shared among the processes.

Because 'select' is not asynchronous I/O. It's just a way to wait for
something to be ready so you can *start* an operation. And with files, the
sooner you start the better. It's just not the right tool for the job.
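
For what it is worth, POSIX specifies that descriptors for regular
files always select true for reading and writing, so the call carries
no information there. A small sketch that shows this on a POSIX system
(Python's select does not accept files on Windows); the helper names
and the temporary file are illustrative only.

```python
import os
import select
import tempfile


def file_select_state(path):
    """Report what select() says about a regular file before and after EOF."""
    with open(path, "rb") as f:
        before, _, _ = select.select([f], [], [], 0)
        f.read()                                   # consume everything
        at_eof, _, _ = select.select([f], [], [], 0)
        return bool(before), bool(at_eof)


def demo_regular_file():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        os.close(fd)
        return file_select_state(path)
    finally:
        os.unlink(path)
```

The file is reported readable even at end of file, where a read
returns nothing, so "ready" says nothing about how long the disk or
the NFS server will take.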

Post by Steffen DETTMER
Ohh, actually here I assume that select does not only guarantee
that a read will not block, but also that a read will return
quickly. Don't know what the exact definition of `blocking' is.
In what I understand a 30 sec or more (undetermined/unknown in
advance) call would be blocking.

There's no precise definition. There's a vague notion of 'fast' (like
waiting for another process to release a kernel mutex because it's accessing
the filesystem) and 'slow' (like waiting for network activity). But
sometimes the borders get blurred. For example, NFS operations are
really slow, but NFS pretends they're fast to give the same semantics
as local filesystems.

Post by David Schwartz
How would it work with an NFS file?

Data won't arrive until you call 'read'. And 'read' is blocking on NFS files
normally. You'd need a non-blocking read, followed by 'select', followed
perhaps by repeating the 'read'. It's hard to see how you would know, from
'select's mere indication of readability, *what* 'read' isn't going to
block.

Again, just not the right tool for the job.

Post by Steffen DETTMER
So I would conclude that it is a status-reporting function but
also could guarantee. What do I miss?

That it cannot guarantee. If something after the 'select'
returns success causes the condition to change, the guarantee
can only be sustained by an unambiguous way to identify the
"subsequent" operation, and as I've already explained, that's
impossible.

You might be able to define a narrow subset of cases for which the guarantee
could be made, but then you'd just be repeating the mistakes of the past.
People did this same thing with 'accept', and it bit them on the butt.

What happens when a system library catches an internal signal because an
asynchronous I/O event just completed? Does that expire the guarantee? What
if you have no idea what system libraries do behind your back?

Again, we can learn from mistakes or we can repeat them. If you want
non-blocking behavior, just ask for it. Then you are assured to get it.
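
"Just ask for it" might look like the sketch below (Python; the
function name read_available and the 4096-byte chunk size are
arbitrary choices for illustration). The socket is assumed to have
been set non-blocking by the caller, and 'select' is used only as a
hint about when to try.

```python
import select
import socket


def read_available(sock, timeout):
    """Collect whatever arrives on a non-blocking socket within one select() wait.

    Returns the bytes read (possibly empty). Never blocks inside recv().
    """
    ready, _, _ = select.select([sock], [], [], timeout)
    if not ready:
        return b""
    chunks = []
    while True:
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            break            # nothing left right now; go back to select()
        if not data:
            break            # peer closed the connection
        chunks.append(data)
    return b"".join(chunks)
```

If the readiness hint turns out to be stale, recv() raises
BlockingIOError and the loop ends; no guarantee from 'select' is
needed.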

DS


Steffen DETTMER

2007-09-04 10:25:54 UTC

Hi again :)

Post by Steffen DETTMER
sorry, seems I'm unable to get it (I read it several times :)).

2) The application calls 'write', expecting it to block until
all the data can be written.

yes, we already talked about this. I still think that this application
is simply wrong (`write writes up to count bytes', so maybe less
than count, for example zero or one). I remember that you considered
a related assumption as strange :)
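
The "up to count bytes" behaviour is easy to observe on a non-blocking
socket. A sketch in Python, with hypothetical helper names and a local
socketpair whose peer never reads; the 16 MB payload is assumed to be
larger than the kernel's socket buffer, which holds for usual default
settings.

```python
import select
import socket


def send_once(sock, payload):
    """Attempt a single non-blocking send; return how many bytes were accepted."""
    _, writable, _ = select.select([], [sock], [], 0)
    if not writable:
        return 0
    try:
        return sock.send(payload)
    except BlockingIOError:
        return 0


def demo_partial_write(size=16 * 1024 * 1024):
    a, b = socket.socketpair()
    try:
        a.setblocking(False)
        payload = b"\0" * size
        return send_once(a, payload), len(payload)
    finally:
        a.close()
        b.close()
```

'select' reports the socket writable, yet only part of the payload is
accepted. The same call on a blocking socket would wait until the peer
had drained enough data, which is the behaviour the application in
David's example relies on.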

No, because 'select' has to work on protocols with all
different kinds of semantics. It is not theoretically possible
to ensure that these semantics will make sense with every
protocol 'select' might be used with.

oki, maybe this is how it is implemented. But if ANY
communication would be implemented using some `kernel I/O
buffer', select could work. A price for this could be to lose
some functionality, such as retroactive CRC check frame dropping
(for whomever this would make sense).

Post by David Schwartz
1) An application disables UDP checksums.
2) An application calls 'select' and gets a 'read' hit on a
packet with a bad checksum.

So this would mean the bad checksum would not be
detected/evaluated and the data would be stored to the
buffer, right?

Not likely. That would mean the kernel has to verify the
checksum in a separate operation, which is a waste of memory
bandwidth.

Is it a big difference whether the CRC is calculated during select
instead of inside read?

Post by David Schwartz
3) An application performs a socket option call asking for
checksum checking to be enabled.

ok, so from now on newly arriving data would no longer be stored in
the input buffer unless the checksum was verified (not applied
retroactively of course; that wouldn't be possible, because the
checksums were not even stored).

Suppose the input buffer *is* the packet buffer.

Such an implementation would be wrong, because it could never
give the read won't block-guarantee :-)

Post by David Schwartz
4) An application calls 'recvmsg'.
Should it get the packet with the bad checksum? In other
words, are you really sure you want 'select' to *change*
the semantics of the socket?

This subsequent change IMHO does not have to be able to discard an
already received packet (it affects the packets to be retrieved
from this time on). This is what I meant by saying this option does
not need to be retroactive (`with retrospective effect`?).

Post by David Schwartz
If not for the 'select' hit, it would be perfectly reasonable
to discard the packet later.

But not required, so why waste the additional resources? The
framing and headers (at least the CRC) would need to be stored
(instead of just the payload data).

Post by Steffen DETTMER
yes, of course, the device may not even know about accept at
all. But I mean, for read non-blocking is guaranteed
according to some understandings but for accept no one claims
this (but expects it because the working examples show it
:)).

This is the problem. At one time, people expected it for
'accept', and their code broke. Why tell people to repeat that
mistake?

ohh, no, this was not what I meant at all!

Of course I completely agree with telling people to use non-blocking
I/O when they want non-blocking I/O! Since it is not necessary to rely
on the guarantee that some read after select would not block, they
should not rely on it (this works whether the guarantee does not
exist logically, is not implemented correctly, or does not hold for
all types, or whatever; it works in all cases, so it is the safe
way).

I also understood (and agree with) your summary `select does not
guarantee that read will not block'.

(it is just about the reason for it, or `theoretical' vs. `practical'.
I think it could `theoretically' be made this way; actually, I
assume [:)] it is even `practically' the case sometimes/often).

Post by David Schwartz
Consider a 'select' followed by a 'read' in another thread. Is
that the operation that shouldn't block or are the 'select' and
the 'read' unrelated?

If the read was started (called) after the select finished
(returned), then this read (and only this read) is the subsequent
operation. If two threads invoke reads concurrently, there is no
way for the threads to determine which read will block and which
will not block, so the program should not do so.

Post by Steffen DETTMER
for read, it would return ready for read as soon as some data
arrived in some local buffer I assume? Of course this requires
the usage of such intermediate buffers (which is not required by
the read API which is unlimited, so a performant implementation
would do things differently).

Data won't arrive until you call 'read'. And 'read' is blocking
on NFS files normally. You'd need a non-blocking read, followed
by 'select', followed perhaps by repeating the 'read'.

Wouldn't this mean that select would not work as specified, and thus
would be a bug? Maybe I'll play with select on an NFS file when
I get some time. The result could be surprising after what I have
learned now... :)

Post by David Schwartz
You might be able to define a narrow subset of cases for which
the guarantee could be made, but then you'd just be repeating
the mistakes of the past. People did this same thing with
'accept', and it bit them on the butt.
Again, we can learn from mistakes or we can repeat them. If you
want non-blocking behavior, just ask for it. Then you are
assured to get it.

Yes, this is the straightforward, reliable, simple to understand
and pragmatic way, as you already said, right. This one I
understood. When the fd will never block, openssl also will never
block, so this simply is the easy and safe way of working.
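
On the OpenSSL side, "never block" shows up as a want-read or
want-write indication (SSL_ERROR_WANT_READ / SSL_ERROR_WANT_WRITE)
instead of a wait. A small sketch using Python's ssl module, which
wraps OpenSSL and raises SSLWantReadError for the former; the memory
BIOs stand in for a non-blocking socket with no peer, and the host
name example.invalid is a placeholder.

```python
import ssl


def start_client_handshake():
    """Begin a TLS client handshake against in-memory BIOs with no peer attached."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()   # bytes "from the network" (empty here)
    outgoing = ssl.MemoryBIO()   # bytes the TLS layer wants sent to the peer
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.invalid")
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # The TLS layer cannot proceed without peer data; it says so
        # instead of waiting.
        return True, outgoing.pending
    return False, outgoing.pending
```

The caller is expected to send the pending output, wait for input with
'select' or similar, and then retry the same call; the retry may again
report want-read, which is harmless because nothing ever blocks.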

Thanks for your explanation. Hope my questions weren't too
annoying for the list.

oki,

Steffen


David Schwartz

2007-09-04 11:08:53 UTC

Post by David Schwartz
Consider a 'select' followed by a 'read' in another thread. Is
that the operation that shouldn't block or are the 'select' and
the 'read' unrelated?

I've explained why this doesn't work at least three times now and you still
don't understand it. I'm not sure what else I can do. Please read over my
example again where you try to do the same thing with 'write' and the
implementation can't tell whether a particular 'write' is supposed to have
normal blocking semantics or special "subsequent operation" non-blocking
semantics.