Overlapped serial i/o is broken?

I'm trying to do overlapped serial i/o under Win95, and having trouble. This is where I'm gathering info on the subject.

Note the message from Mike Lavey, who used GetCommState/SetCommState to work around a related problem.
- Dan

Overlapped Serial I/O Examples

Related Pages

Pleas for Help


Subject:      Win95, overlapped i/o, and comm port timeout problem
From:         dank@alumni.caltech.edu
Date:         1998/02/16
Newsgroups:   microsoft.public.win32.programmer.kernel,comp.os.ms-windows.programmer.win32


Under Windows 95, it seems that if I use SetCommTimeouts
to specify any blocking at all, overlapped ReadFiles
will work up to a point, then hang, i.e. the last one issued
*never* completes.  I never get another byte from the port.
This usually happens only on one side; the other side receives ok.

The problem happens as soon as a "CONNECT 19200" message comes back
from the modem and my code starts sending packets.
(I am opening the serial port and sending AT commands myself.)

Things that make the problem go away include:
1. Run under Windows NT.
2. Run on lucky machines.
3. Run with debugging libraries.
4. Set timeouts to specify "always return immediately".

The serial chips on all my machines are the usual 16550AF.
The machines that work OK are P90's running original Win95;
the ones that don't are P133's running Win95 OSR2.
I haven't tried mixing OS's.

I was able to get around this problem on my original Win95
machines by switching from {10,0,100} timeouts to {0,0,50}
timeouts, but on my OSR2 machines, the only timeouts that work are
{MAXDWORD,0,0}.

Has anyone ever seen this problem?  Can anyone point out my error?

A zipfile containing my serial i/o module is at
http://www.alumni.caltech.edu/~dank/serio/serio.zip
(I'd attach it here, but I don't think Deja News can do that.)
This is the layer that does all the win32 calls, not the layer above
it that dials or sends packets.
Here's an excerpt from serio_open() showing how I set the timeouts:

    // set the time-out parameters for all read and write operations
#if 1
    // Cause ReadFile to never wait.
    // Works OK in Win95.
    CommTimeOuts.ReadIntervalTimeout = MAXDWORD ;
    CommTimeOuts.ReadTotalTimeoutMultiplier = 0 ;
    CommTimeOuts.ReadTotalTimeoutConstant = 0 ;
#elif 0
    // Cause ReadFile to wait 50 milliseconds, then time out unconditionally.
    // Always fails on some Win95 machines; the overlapped read *never* completes.
    // Works on more machines than does {10,0,100} (see below).
    CommTimeOuts.ReadIntervalTimeout = 0 ;
    CommTimeOuts.ReadTotalTimeoutMultiplier = 0 ;
    CommTimeOuts.ReadTotalTimeoutConstant = 50 ;
#elif 0
    // Cause ReadFile to wait 100 milliseconds for traffic to start, but
    // wait only 10 milliseconds for traffic to resume if it goes silent
    // after starting.
    // Always fails on some Win95 machines; the overlapped read *never* completes.
    CommTimeOuts.ReadIntervalTimeout = 10 ;
    CommTimeOuts.ReadTotalTimeoutMultiplier = 0 ;
    CommTimeOuts.ReadTotalTimeoutConstant = 100 ;
#endif

I don't have time right now to package up a test case showing how I
dial and where the problem occurs, darn it, so I thought I'd ask
now rather than waiting until I had the perfect test case to post.
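
For readers comparing the timeout triples above, here is a minimal sketch of how the Win32 documentation describes the three read fields of COMMTIMEOUTS. It uses a stand-in struct (read_timeouts, a hypothetical name) instead of the real COMMTIMEOUTS so that it compiles without windows.h; the helper names are illustrative and are not part of the serio module.

```c
#include <assert.h>

#define MAXDWORD_STANDIN 0xFFFFFFFFul  /* same value as Win32 MAXDWORD */

/* Stand-in for the three read fields of COMMTIMEOUTS, in the
 * {interval, multiplier, constant} order used in the post. */
struct read_timeouts {
    unsigned long interval;    /* ReadIntervalTimeout, ms between bytes */
    unsigned long multiplier;  /* ReadTotalTimeoutMultiplier, ms per byte */
    unsigned long constant;    /* ReadTotalTimeoutConstant, ms */
};

/* Documented special case: {MAXDWORD,0,0} makes ReadFile return at once
 * with whatever is already buffered, even if that is nothing. */
int returns_immediately(const struct read_timeouts *t)
{
    assert(t != 0);
    return t->interval == MAXDWORD_STANDIN
        && t->multiplier == 0
        && t->constant == 0;
}

/* Total timeout for a read of nbytes: multiplier * nbytes + constant.
 * A result of 0 means no total timeout is applied. */
unsigned long total_timeout_ms(const struct read_timeouts *t,
                               unsigned long nbytes)
{
    assert(t != 0);
    return t->multiplier * nbytes + t->constant;
}
```

Under this reading, {10,0,100} gives each read a 100 ms total budget plus a 10 ms limit between bytes, {0,0,50} gives a flat 50 ms budget, and {MAXDWORD,0,0} never waits.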

Subject:      WriteFile overlapped I/O does not work under Windows 95 or 98b3 !
From:         "Lars Ericsson" lae-at-ndc.se
Date:         1998/02/18
Newsgroups:   comp.os.ms-windows.programmer.win32,microsoft.public.win32.programmer.kernel

Hi,

I have been developing a multithreaded serial communication application
using VC 4.2. It is a multithreaded application where one thread is used
for communication with one COMx port. The threads use OVERLAPPED i/o for
both reading and writing. 
The reading part arms a read I/O with both a character time-out and a total
read time-out. 
The write part just writes all bytes and then uses the
GetOverlappedResult() to figure out when the write has completed.

The application is running on a large number of Windows NT systems (3.51 &
4.0) without any problems. We then decided to approve it on Win95.

Since the COMM API is almost identical (at least regarding asynchronous
communication) we thought it would be a simple case.

At the first try the application seemed to run fine. Then for some reason it
just froze. The read part is OK. After some debugging we figured out that
the problem is that the WriteFile() overlapped event is lost in some
situations. We have confirmed that all data bytes in the WriteFile() call have
been properly transmitted on the COMx port, but the overlapped event is
never signalled.

The following has been tested:
- Fast and slow hardware. Faster PC -> faster it hangs.
- 9600 - 115200 baud. Higher baud -> faster it hangs.
- Stressing the system with other applications (WORD). More stress ->
faster it hangs.
- With and without the FIFO enabled,  hangs both with and without.

Actions:
- I have tested with Win95, it still hangs.
- I have loaded some new drivers from the ftp.microsoft.com , it still
hangs.
- I have tested with Win95 OSR2, it still hangs.
- I have tested with Win98 beta 3, it still hangs.

I have noticed some issues regarding Win95 and its overlapped IO handling
for asynchronous communication. I doubt that many people have made the same
error. 

THERE MUST BE SOME PROBLEMS in the I/O system in Windows 95.


Is there anyone who knows of any documented problems or patches available
that address problems in the Windows 95 asynchronous i/o system?

Please help,
Lars Ericsson, lae-at-ndc.se

Subject:      re: WriteFile overlapped I/O 
From:         "Lars Ericsson" lae-at-ndc.se
Date:         1998/03/25

Hi Dan,

I have been very busy the last weeks and have not been able to reply and 
update you on my latest progress. The original problem with the hanging 
WriteFile() call is still a mystery.

My software can operate in two different modes. One uses read time-outs to 
terminate the read operations and the other one reads as many characters as 
are available. If no character is available, the read call will request 1 
character, which will cause the read to terminate as soon as the first 
character arrives. This method is also known as the 'caveat'.
The code was originally written in such a way that the read time-outs were set 
identically regardless of the mode (time-out or caveat). This works great 
in WinNT.
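
The "read what is available, else ask for one byte" rule described above can be sketched as a small helper. The function name and the buffer-capacity parameter are assumptions made for illustration; in a real program the queued-byte count would come from the cbInQue field that ClearCommError fills in.

```c
#include <assert.h>

/* How many bytes to ask ReadFile for in the "caveat" mode described above.
 * queued:   bytes already waiting in the driver's receive queue
 * capacity: size of the caller's buffer (must be at least 1) */
unsigned long bytes_to_request(unsigned long queued, unsigned long capacity)
{
    assert(capacity >= 1);
    if (queued == 0)
        return 1;  /* nothing waiting: complete on the first byte to arrive */
    return queued < capacity ? queued : capacity;
}
```

Asking for exactly one byte when the queue is empty is what lets the pending read complete as soon as traffic starts, without relying on a read time-out.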

My code originally used the following TIMEOUT settings:
 ------------------------------------------------------------------------
CommTimeOuts.ReadIntervalTimeout         = (20000 / dcb.BaudRate) ? (20000 / dcb.BaudRate) : 1 ;
CommTimeOuts.ReadTotalTimeoutMultiplier  = 0 ;
CommTimeOuts.ReadTotalTimeoutConstant    = TOTAL_READ_TIMEOUT;
CommTimeOuts.WriteTotalTimeoutMultiplier = (20000 / dcb.BaudRate) ? (20000 / dcb.BaudRate) : 1 ;
CommTimeOuts.WriteTotalTimeoutConstant   = TOTAL_WRITE_TIMEOUT ;


After reading your information (HTML) and the manual a couple of times I 
found the following in the 'Serial Communications in Win32' article:
timeouts.ReadIntervalTimeout             = MAXDWORD;
timeouts.ReadTotalTimeoutMultiplier      = 0;
timeouts.ReadTotalTimeoutConstant        = TOTAL_READ_TIMEOUT;
timeouts.WriteTotalTimeoutMultiplier     = (20000 / dcb.BaudRate) ? (20000 / dcb.BaudRate) : 1;
timeouts.WriteTotalTimeoutConstant       = TOTAL_WRITE_TIMEOUT;

[note: above paragraph corrected 98/3/30 drk]

These settings are necessary when used with an event-based read described in 
the "Caveat" section earlier. In order for ReadFile to return 0 bytes read, 
the ReadIntervalTimeout member of the COMMTIMEOUTS structure is set to 
MAXDWORD, and the ReadTimeoutMultiplier and ReadTimeoutConstant are both set 
to zero.

Since my problem was related to the Write operation I did not really think 
this could be the problem but I gave it a chance. I modified my code in such 
a way that it uses different time-out settings depending on the read mode 
used. IT WORKS ON WinNT, Win95 & Win98 !!!!
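
A minimal sketch of what "different time-out settings depending on the read mode" might look like, based only on the two listings above. The comm_timeouts struct is a stand-in for the Win32 COMMTIMEOUTS, and the mode enum, the function names, and the 1000 ms constants are hypothetical; the post does not give the values of TOTAL_READ_TIMEOUT or TOTAL_WRITE_TIMEOUT.

```c
#include <assert.h>

#define MAXDWORD_STANDIN    0xFFFFFFFFul  /* same value as Win32 MAXDWORD */
#define TOTAL_READ_TIMEOUT  1000ul        /* assumed value, not from the post */
#define TOTAL_WRITE_TIMEOUT 1000ul        /* assumed value, not from the post */

/* Stand-in for COMMTIMEOUTS with the same field names and order. */
struct comm_timeouts {
    unsigned long ReadIntervalTimeout;
    unsigned long ReadTotalTimeoutMultiplier;
    unsigned long ReadTotalTimeoutConstant;
    unsigned long WriteTotalTimeoutMultiplier;
    unsigned long WriteTotalTimeoutConstant;
};

enum read_mode { MODE_TIMEOUT, MODE_CAVEAT };

/* The (20000 / baud) ? (20000 / baud) : 1 expression from the listings:
 * integer division, clamped so that it never yields 0. */
unsigned long per_char_ms(unsigned long baud)
{
    unsigned long ms;
    assert(baud > 0);
    ms = 20000ul / baud;
    return ms ? ms : 1;
}

/* Only ReadIntervalTimeout differs between the two listings. */
void fill_timeouts(struct comm_timeouts *t, enum read_mode mode,
                   unsigned long baud)
{
    t->ReadIntervalTimeout = (mode == MODE_CAVEAT) ? MAXDWORD_STANDIN
                                                   : per_char_ms(baud);
    t->ReadTotalTimeoutMultiplier  = 0;
    t->ReadTotalTimeoutConstant    = TOTAL_READ_TIMEOUT;
    t->WriteTotalTimeoutMultiplier = per_char_ms(baud);
    t->WriteTotalTimeoutConstant   = TOTAL_WRITE_TIMEOUT;
}
```

At 9600 baud the integer division gives 2 ms per character; at 19200 it gives 1 ms, and at 115200 it would give 0, which is why the expression clamps to 1.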

Summary:
========
 - One could ask how a different read timeout setting could cause the write 
operation to hang ?
 - Even with the old read time-out settings, the write worked most of the 
time !
 - My software product has now been used for a while on different platforms 
and has so far proven to work OK on both WinNT and Win95.
 - Regarding the Event type (manual or automatic reset) used in an overlapped 
I/O operation. I have been in that trap as well. The first (before Win95) 
versions of  Win32 documentation did not document this, and it works with an 
automatic event on WinNT. But when Win95 was released, the document was 
updated !!!


Keep in touch,
Lars Ericsson


From: Steve Rosenberry rosey-at-voicenet.com
Date: Tue, 24 Feb 1998 18:06:58 -0500
Subject: Re: WriteFile overlapped I/O does not work under Windows 95 or 98b3 !

Lars,

I have the exact same problem and, in fact, have an open support
incident with Microsoft Developer Network Support on this.  I've pointed
them towards your newsgroup post and also to Dan Kegel's post and web
page.

We did find that everything worked on a 166MHz Pentium clone that we
have, but we saw failures on a Dell 90MHz Pentium and on a 133MHz Pentium
clone by the same folks who put together the 166.

You apparently have investigated this further than I have.  If you have any
additional information that I can pass to MSDN, I will and I'll let you
know when (ever the optimistic one) they provide a solution.

My workaround is:

  // pCom is a pointer to our comm port control structure.
  // CommOpen and CommClose are our functions that wrap CreateFile() and
  // CloseHandle() using our control structure.

  OVERLAPPED OverlapInfo;
  OverlapInfo.Internal = 0;
  OverlapInfo.InternalHigh = 0;
  OverlapInfo.Offset = 0;
  OverlapInfo.OffsetHigh = 0;
  OverlapInfo.hEvent = pCom->WriteEvent;
  ResetEvent( pCom->WriteEvent );

  fWriteStat = WriteFile( pCom->hFile, char_buffer, dwLength, 
                         &dwLength, &OverlapInfo ); 

  if( !fWriteStat )
  {
    COMSTAT  ComStat;
    DWORD    dwErrorFlags;
    DWORD    rc = GetLastError();
    if( rc == ERROR_IO_PENDING )
    {

      ClearCommError( pCom->hFile, &dwErrorFlags, &ComStat );

      // wait for half second for this transmission to complete
      while( WAIT_OBJECT_0 != WaitForSingleObject( pCom->WriteEvent, 500) )
      {

        // every half second we check to see if transmit queue is empty

        ClearCommError( pCom->hFile, &dwErrorFlags, &ComStat );
        if( ComStat.cbOutQue == 0 )
        {

          // if it's empty, check one last time for the signal from the 
          // serial operation
          if( WAIT_OBJECT_0 == WaitForSingleObject( pCom->WriteEvent, 0) )
          {
            break;
          }

          // if we lost the signal, the only recovery we found was to 
          // close the port and open it again.
          CommClose( pCom );
          CommOpen( pCom );
          return( -1 );

        }

      }  /**** while( WAIT_OBJECT_0 != WaitForSingleObject(pCom->WriteEvent, 500 ) ) ****/

      if( !GetOverlappedResult( pCom->hFile, &OverlapInfo, &dwLength, FALSE ) )
      {
        rc = GetLastError();
        ClearCommError( pCom->hFile, &dwErrorFlags, &ComStat );
        dwLength = 0;
      }
      else if( dwLength == 0 )
      {
        ClearCommError( pCom->hFile, &dwErrorFlags, &ComStat );
      }

      return( dwLength );

    }
    else
    {
      ClearCommError( pCom->hFile, &dwErrorFlags, &ComStat );
      dwLength = 0;
    }
  }

  return( dwLength );

}
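
The recovery logic in the loop above comes down to one decision each time the 500 ms wait times out. The sketch below restates that decision as a pure function so it can be read apart from the Win32 calls; the enum and function names are made up for illustration and are not part of Steve's code.

```c
enum write_wait_action {
    KEEP_WAITING,  /* bytes still queued: wait another interval */
    WRITE_DONE,    /* the event arrived between the two checks */
    EVENT_LOST     /* queue empty but no event: close and reopen the port */
};

/* out_queue_bytes:     ComStat.cbOutQue as reported by ClearCommError
 * signaled_on_recheck: nonzero if a zero-timeout WaitForSingleObject on
 *                      the write event returned WAIT_OBJECT_0 */
enum write_wait_action after_wait_timeout(unsigned long out_queue_bytes,
                                          int signaled_on_recheck)
{
    if (out_queue_bytes != 0)
        return KEEP_WAITING;
    return signaled_on_recheck ? WRITE_DONE : EVENT_LOST;
}
```

The second, zero-timeout check matters because the event can be signalled in the gap between the timed-out wait and the ClearCommError call; without it a healthy write could be misread as a lost event.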

From: "Javier Andres" jandres-at-vtools.es
Date: Wed, 25 Feb 1998 16:32:45 +0100
Subject:  Re: WriteFile overlapped I/O does not work under Windows 95 or 98b3 !

Lars,

The problem you describe is very, very similar to one we have been fighting
with for months. One difference from your setup is that we use two
threads, one for reading and another for writing.

Our conclusion is that there is some error in the Win32 Com API that hangs
up the COM when you are sending and reading data without emptying the 
receiving buffer.

Finally we found a solution to the problem by making sure that whenever we
call a read to get data from the COM port we take all the bytes that are in
the reception buffer of the Win32 API.

I hope this helps, but let me know.

Javier

From:  "Johan Nilsson" jni-at-esrange.ssc.se.---
Subject:  Overlapped i/o and ODBC

I myself think this whole thing sounds a bit unlikely and it might be a
stupid question, but this is the only reason I've found out why my
overlapped i/o sometimes fails.

I use the VC++ 5.0 SP3, NT4.0 SP3 and the MFC ODBC classes.

The problem is; sometimes, if I execute an overlapped i/o write (on a
client or server named pipe) immediately after opening (via
CDatabase::Open) an ODBC datasource, and then wait for the i/o to complete
it doesn't within normal time (I use 5 !seconds!, not milli-). Last error
is of course ERROR_IO_PENDING.

Does anyone know, if the loading of the ODBC dlls takes place
asynchronously on a high priority, so that this could prevent my i/o from
being completed within a reasonable time?

The same problem also occurs when I leave the ODBC connection open for a
long time w/o using it (dlls swapped out?); if I then write a record to
the database and immediately afterwards perform an overlapped write, this will
also not complete in normal time.


Subject: Serial Port problems Tx paused due to CTS low but CTS is high!
Date: Mon, 23 Mar 1998 19:26:51 GMT
From: 259-5980-at-mcimail.com (Mercedes)
Newsgroups: comp.os.ms-windows.programmer.win32

My serial port routines send anywhere from 10 to 80 megabytes of data
before all transmission stops.  I have a 10 second timeout on a write
thread.  When this times out, I call ClearCommError and
GetCommModemStatus.  The COMSTAT structure indicates that transmission
was paused due to the CTS line being low but the GetCommModemStatus
call indicates that CTS is high.  I had to deal with this issue under
DOS ages ago where the THRE and MSR interrupts would get lost due to
higher priority interrupts.  (RX and LSR).  Is Win32 having the same
problem of missing the MSR interrupt?  If so, is there any way to tell
it that CTS is high?  (Short of closing and reopening the port again)
I'm not sure I want to rely on software flow control alone, especially
since this isn't really a hardware problem.  It just seems to be the
Windows drivers that are missing the interrupt.

Of course, if ClearCommError and GetCommModemStatus aren't returning
correct statuses, then all of this means nothing and the problem is
elsewhere.  I would hope that isn't the case.  What would be the point
of returning the status if it wasn't consistent.

Any help would be greatly appreciated.


Subject: Overlapped serial I/O problems
Date: Wed, 20 Jun 2001
From: Mike Lavey 

Dan,

I visited your area of the www.alumni.caltech.edu site today whilst
searching the net for answers to my overlapped comms problem.

I must say putting the "overlap.htm" page on the net is a great idea,
and I thank you for your work here.  It has been a great help to me,
just knowing other people had the same problems stopped me from going
out of my mind wondering what I had done wrong in my code and caused me
to rethink what I was doing.

I tried some of the suggestions mentioned on the page but none of them
worked in my particular situation, but it did put me on the right track.

My problem was that whenever an overlapped write completed with zero
bytes sent, it would never recover and every subsequent write would fail
in the same way.

I took a copy of the DCB, using GetCommState, before starting the data
transfer and when a write failed in this way I reasserted the DCB,
using SetCommState.

This fixed the problem. Again this was another situation where the
code was working great under WinNT/2000 but encountered problems under
Win95/98.  I think there is a bug in the Win95/98 Kernel that causes
the DCB to become corrupt on occasion.

Thanks again for your help.
Mike
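
A rough sketch of the workaround Mike describes, under stated assumptions: the function names are hypothetical, and treating "write completed with zero bytes sent" as the trigger is taken directly from his description rather than from his code, which was not posted. The Win32 part is compiled only on Windows; the small predicate above it is portable.

```c
/* Mike's failure signature: the overlapped write completes but reports
 * zero bytes sent for a non-empty request. */
int should_reassert_dcb(int write_completed, unsigned long written,
                        unsigned long requested)
{
    return write_completed && requested > 0 && written == 0;
}

#ifdef _WIN32
#include <windows.h>

/* Take the snapshot once, before the data transfer starts.
 * Returns nonzero on success. */
int save_dcb(HANDLE port, DCB *saved)
{
    saved->DCBlength = sizeof *saved;
    return GetCommState(port, saved) != 0;
}

/* Call after each overlapped write completes. Returns nonzero if the
 * saved DCB was reasserted, in which case the caller should retry. */
int recover_after_write(HANDLE port, DCB *saved, int write_completed,
                        DWORD written, DWORD requested)
{
    if (!should_reassert_dcb(write_completed, written, requested))
        return 0;
    return SetCommState(port, saved) != 0;
}
#endif
```

Keeping the snapshot from before the transfer, rather than calling GetCommState again at failure time, matters if the driver's copy of the DCB is what has gone bad.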

Other Helpful Info

Subject:      Re: Serial Communication Trouble with Overlapped I/O, Win95
From:         Duke Robillard duke-at-io.com
Date:         1997/09/20
Newsgroups:   comp.os.ms-windows.programmer.win32,comp.os.ms-windows.programmer.misc

Geoffrey Levand wrote:
> Duke Robillard :
> 
> >I'm having trouble with overlapped I/O on a serial port under Windows
> >95.  When I call WriteFile(), it returns 0, and sets ERROR_IO_PENDING,
> >as expected.  So I call GetOverlappedResult(..., TRUE) to wait for the
> >operation to finish.  It returns 1, which means success, according to
> >the docs (actually, non-zero means success), but the number of bytes
> >written (the 3rd parameter) is 0.  Anyone know what that might mean?
>
> Sounds like you have Win32 hardware handshaking enabled, but your
> device isn't setting the RTS, DSR lines. 

Thanks, Geoffrey, this was very, very close to correct.  It turned out
that I didn't have the Flow Control flags right.  I needed

   PCF_DTRDSR | PCF_RTSCTS

But I only had the first one.

Thanks!

Duke Robillard, duke-at-io.com

Subject:      Re: Good primer/reference for overlapped IO?
From:         "PC Netwrok Group" gglass-at-cerner.com
Date:         1997/11/11
Newsgroups:   microsoft.public.win32.programmer.tools


I'd recommend "Communications Programming for Windows95" by Charles A.
Mirho & Andre Terrisse (Microsoft Press)

The cover says "The developers guide to TAPI and MAPI in Windows 95" but I
found everything I needed to do async in Win95.  It is a LOT more work than
the old "hanging out on the port and polling for bytes" but the book
helped.


Subject:      fyi: OVERLAPPED WriteFile blocks in Win95 if no event handle
From:         Dan Kegel 
Date:         1998/01/28
Message-ID:   <34CF129F.B960DDF3@alumni.caltech.edu>
Newsgroups:   microsoft.public.win32.programmer.kernel,comp.os.ms-windows.programmer.win32

FYI- here's one more way Win32 differs between Win95 and WinNT:

To do overlapped serial comm i/o in WinNT, all you have to do is
create an overlapped serial i/o handle with CreateFile 
and FILE_FLAG_OVERLAPPED, then supply a zeroed-out OVERLAPPED
structure when calling WriteFile.  WriteFile will then happily
return immediately, having queued up your overlapped write.

However, WriteFile will block (i.e. perform the i/o immediately,
and not queue up an overlapped write) in Win95 *unless you put an
event handle into the OVERLAPPED structure*.
Same thing for ReadFile. 

This isn't documented anywhere that I could see.
I'm mentioning it here to maybe save some DejaNews user half a 
day's sleuthing to see why their app works on WinNT and not Win95.
- Dan
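
A minimal sketch of the point above: zero the OVERLAPPED structure, then attach an event handle before calling WriteFile. The function names are hypothetical. Off Windows the sketch falls back to stand-in typedefs (not the real definitions) purely so that the portable part compiles; the WriteFile part is built only on Windows.

```c
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch compiles off Windows; not the real definitions. */
typedef void *HANDLE;
typedef struct {
    unsigned long Internal, InternalHigh, Offset, OffsetHigh;
    HANDLE hEvent;
} OVERLAPPED;
#endif

/* Zero the structure, then attach the event: the step Win95 needs. */
void prepare_overlapped(OVERLAPPED *ov, HANDLE event)
{
    memset(ov, 0, sizeof *ov);
    ov->hEvent = event;
}

#ifdef _WIN32
/* Queue one overlapped write on a port opened with FILE_FLAG_OVERLAPPED.
 * Returns 1 if the write was queued or completed at once, 0 on error. */
int start_write(HANDLE port, const void *buf, DWORD len,
                OVERLAPPED *ov, HANDLE event)
{
    DWORD written = 0;
    prepare_overlapped(ov, event);
    ResetEvent(event);
    if (WriteFile(port, buf, len, &written, ov))
        return 1;
    return GetLastError() == ERROR_IO_PENDING;
}
#endif
```

The event here is assumed to come from CreateEvent(NULL, TRUE, FALSE, NULL), i.e. a manual-reset event that starts unsignalled; other posts on this page report that auto-reset events misbehave on Win95. The caller would then wait on the event and call GetOverlappedResult to collect the byte count.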

Subject:      Re: WaitCommEvent and WriteFile deadlock ?!
From:         Dan Kegel
Date:         1998/02/17
Trialog trialog-at-compuserve.com wrote:
> Here a problem I'm experiencing now with Comm win32 API and non
> overlapped I/O.
> I have a program to read and send data over the serial port, written with
> VC5.0 and that correctly works under win95. It is a multithreaded
> program, one thread to listen serial port, and another thread to send
> data over the serial port.
> 
> My problem is when I run this program under NT4.0, it fails. 
> ... I do not understand why does it stop since WriteFile and WaitCommEvent
> may not interact directly (one is an i/o function, the other is a comm 
> function). Is the handle of the
> corresponding resource (the serial port) locked after a WaitCommEvent ?
> I have tried (just to test) to remove the WaitCommEvent call : WriteFile
> doesn't stop...

I just did a DejaNews search for win95 & overlapped & serial & thread
while gathering grist for my "Overlapped serial I/O is broken?" page
at http://www.alumni.caltech.edu/~dank/overlap.htm , and found 
an article that might answer your question.  In short, you can't
do what you're doing without overlapped I/O, at least on NT.
Article follows.
- Dan

-- Begin --
Subject:      Re: Two Threads of Equal Priority Sharing Device Handle (and com port read+write)
From:         "Dave Steckler" davidst-at-nobeltec.com
-- End --

Subject: Re: Win32 Serial communication for 95
From: Jamie Robb

You must be using auto reset events for the overlapped IO.  
Try using a manual reset instead. Don't forget to ResetEvent() 
before calling ReadFile or WriteFile.

Don't ask me why the autos don't work, because I don't know!

Jamie


Subject:      Serial comm overlap etc. GOTCHAS
From:         yourfriendNOSPEW-at-worldpost.com
Date:         1998/03/21
Newsgroups:   comp.os.ms-windows.programmer.win32

I'd like to post a bit of what I learned in the past few days while
developing a serial i/o module for a commercial win95 program.

The program now does better than 1,000 characters a second in and out of
the same port through a loopback plug. No errors. I'm happy.

Here's my 'gotcha' list, including the ones you already know:

--in and out through the same port requires overlapped operation.

--using 'overlap' in an open, read, or write call requires 'overlap'  in
all 3.

--you need separate OVERLAPPED structures for read and write.

--each OVERLAPPED structure needs its own event handle.

--to pump out characters as fast as possible, you have to test for
'xmit-empty' else you'll outrun MS's drivers and cause errors, so you need
a waitcommevent to test for xmit-empty. This has to be in a separate thread
(which spends most of its time blocked).

--To receive characters as fast as possible while doing other things, it's
not enough to issue a read with 'nowait', because this call has a bug and
it returns characters out of order. So you need another thread to wrap a
getoverlappedresult call.

--these two additional threads should run at higher priority than your main
thread.

--all code that accesses data which is shared between threads must be in
a critical section.
--------------------------------------

I'm not finished because I have to polish my error detection, but that's it
for now. Hope I helped someone...
   Dean


Subject:      Re: Serial comm overlap etc. GOTCHAS
From:         p.neutelings-at-computer.org (Peter J. Neutelings)

Dean wrote:
>--in and out through the same port requires overlapped operation.
Or just duplicate the file handle.
Using a duplicated handle is probably also more portable to other operating
systems (if this is something you are looking for).