Hello,
I'm currently tracking down an ongoing problem we are having with an IIS
6.0 web application deployment. The environment is a small web cluster with
two load-balanced web servers (win2k3 std, IIS6, ASP.NET 1.1, C# application
code) and a single database server (win2k3 std., SQL 2000 std). On each web
server, we have two distinct web applications running, for a total of 4 app
instances across the two web servers.

The problem we're seeing is that after standing idle for approximately
24-48 hours, where the only traffic seen is regular heartbeat tests from our
load balancer, IIS becomes unresponsive and begins logging 503 errors to the
IIS log. I've examined the following available logs for relevant entries:
IIS log, perfmon (various monitors), Eventlog System, Eventlog Security,
Eventlog Application, HTTP Error and custom application logs of our own
design.

The IIS log shows the app responding normally (HTTP 200, 302, etc.) and then
all traffic stops for several hours. Typically the next entry shows an app
restart, as expected when the application pool recycles.
Shortly thereafter, with no serviced HTTP requests, the IIS logs show the app
recycling again, whereupon it begins returning 503 errors until we manually
restart the w3svc service.
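
To pin down exactly when the traffic stops, the silent periods in the IIS log can be found mechanically. This is only a rough sketch in Python: it assumes the log is in W3C extended format with a "#Fields:" directive that includes "date" and "time", and the sample lines and the one-hour threshold are made up for illustration, not taken from our servers.

```python
from datetime import datetime, timedelta

def find_gaps(lines, threshold=timedelta(hours=1)):
    """Return (previous, current) timestamp pairs that are further apart than threshold."""
    fields = []
    previous = None
    gaps = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns used by the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if "date" not in row or "time" not in row:
            continue
        current = datetime.strptime(row["date"] + " " + row["time"],
                                    "%Y-%m-%d %H:%M:%S")
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps

# Hypothetical log lines, for illustration only.
sample_log = [
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2006-02-08 20:00:05 GET / 200",
    "2006-02-08 20:00:35 GET / 200",
    "2006-02-09 06:30:00 GET / 200",
]

for start, end in find_gaps(sample_log):
    print("no entries between", start, "and", end)
```

The start of each gap is the timestamp worth comparing against the HTTP Error log and the event logs.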

The HTTP Error log shows similar entries but with a large number of
Connection_Abandoned_By_AppPool entries immediately prior to the 503 errors starting.
Here's a sample of the HTTP errors as the 503 errors start:

2006-02-09 06:37:13 192.168.1.1 1201 192.168.1.18 80 HTTP/1.0 GET / - 789700071 Connection_Abandoned_By_AppPool DefaultAppPool
2006-02-09 06:37:13 192.168.1.1 1205 192.168.1.16 80 HTTP/1.0 GET / - 680368608 Connection_Abandoned_By_AppPool DefaultAppPool
2006-02-09 06:37:13 192.168.1.1 1208 192.168.1.14 80 HTTP/1.0 GET / 503 1 N/A DefaultAppPool
2006-02-09 06:37:18 192.168.1.1 1215 192.168.1.18 80 HTTP/1.0 GET / 503 789700071 N/A DefaultAppPool
2006-02-09 06:37:18 192.168.1.1 1217 192.168.1.14 80 HTTP/1.0 GET / 503 1 N/A DefaultAppPool
2006-02-09 06:37:19 192.168.1.1 1219 192.168.1.16 80 HTTP/1.0 GET / 503 680368608 N/A DefaultAppPool
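
For anyone who wants to summarize a larger HTTP Error log, here is a minimal Python sketch that tallies entries by status and reason. It assumes the default HTTP.sys error log column order (date, time, client IP and port, server IP and port, protocol version, verb, URL, status, site ID, reason, queue name) and that every entry has exactly 13 whitespace-separated fields; the field names in the code are my own labels. The input is the sample shown above.

```python
from collections import Counter

# Labels for the 13 columns; the names are illustrative, the order is assumed.
HTTPERR_FIELDS = ["date", "time", "c_ip", "c_port", "s_ip", "s_port",
                  "version", "verb", "url", "status", "site_id",
                  "reason", "queue"]

def parse_httperr(text):
    """Split log text into entries, tolerating entries wrapped across lines."""
    tokens = []
    for line in text.splitlines():
        if not line.startswith("#"):  # skip directive lines such as #Fields:
            tokens.extend(line.split())
    width = len(HTTPERR_FIELDS)
    return [dict(zip(HTTPERR_FIELDS, tokens[i:i + width]))
            for i in range(0, len(tokens) - width + 1, width)]

SAMPLE = """\
2006-02-09 06:37:13 192.168.1.1 1201 192.168.1.18 80 HTTP/1.0 GET / -
789700071 Connection_Abandoned_By_AppPool DefaultAppPool
2006-02-09 06:37:13 192.168.1.1 1205 192.168.1.16 80 HTTP/1.0 GET / -
680368608 Connection_Abandoned_By_AppPool DefaultAppPool
2006-02-09 06:37:13 192.168.1.1 1208 192.168.1.14 80 HTTP/1.0 GET / 503 1
N/A DefaultAppPool
2006-02-09 06:37:18 192.168.1.1 1215 192.168.1.18 80 HTTP/1.0 GET / 503
789700071 N/A DefaultAppPool
2006-02-09 06:37:18 192.168.1.1 1217 192.168.1.14 80 HTTP/1.0 GET / 503 1
N/A DefaultAppPool
2006-02-09 06:37:19 192.168.1.1 1219 192.168.1.16 80 HTTP/1.0 GET / 503
680368608 N/A DefaultAppPool
"""

entries = parse_httperr(SAMPLE)
by_status_reason = Counter((e["status"], e["reason"]) for e in entries)
for (status, reason), count in by_status_reason.items():
    print(status, reason, count)
```

Grouping by site ID or server IP instead of reason works the same way and shows whether every site on the box is affected at once.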

-------------------------------
The Security EventLog shows the following failed login for our non-default
anonymous user account:

Event Type: Failure Audit
Event Source: Security
Event Category: Logon/Logoff
Event ID: 534
Date: 2/8/2006
Time: 10:37:38 PM
User: NT AUTHORITY\SYSTEM
Computer: WEB4
Description:
Logon Failure:
Reason: The user has not been granted the requested
logon type at this machine
User Name: IISUSER_ANON
Domain: WEB4
Logon Type: 4
Logon Process: Advapi
Authentication Package: Negotiate
Workstation Name: WEB4
Caller User Name: NETWORK SERVICE
Caller Domain: NT AUTHORITY
Caller Logon ID: (0x0,0x3E4)
Caller Process ID: 2496
Transited Services: -
Source Network Address: -
Source Port: -
-----------------------
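
A note on reading this event: Logon Type 4 is a batch logon, and Event 534 means the account does not hold the right needed for the requested logon type, which for a batch logon is "Log on as a batch job". Whether that is the cause of the 503s is not established. Also, the HTTP Error log is written in UTC, so if WEB4's local time zone is eight hours behind UTC (an assumption on my part, the time zone is not stated here), 10:37:38 PM on 2/8 is 06:37:38 UTC on 2/9, which lines up with the first 503 entries. The Python sketch below just decodes the logon type from an excerpt of the event text; the function names are made up for illustration.

```python
# Windows logon type codes as they appear in Security event log entries.
LOGON_TYPES = {
    2: "Interactive",
    3: "Network",
    4: "Batch",
    5: "Service",
    7: "Unlock",
    8: "NetworkCleartext",
    9: "NewCredentials",
    10: "RemoteInteractive",
    11: "CachedInteractive",
}

def parse_event(text):
    """Turn 'Key: value' lines into a dict; lines without a value are skipped."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields

def describe(fields):
    """Summarize which account was refused which kind of logon."""
    code = int(fields["Logon Type"])
    name = LOGON_TYPES.get(code, "Unknown")
    return f"{fields['User Name']} was denied a {name} logon (type {code})"

# Excerpt of the event shown above.
EVENT = """\
Event ID: 534
Time: 10:37:38 PM
User Name: IISUSER_ANON
Domain: WEB4
Logon Type: 4
Logon Process: Advapi
Caller User Name: NETWORK SERVICE
"""

print(describe(parse_event(EVENT)))
```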

My assumption is that we are doing something in our application as a result
of the regular heartbeat requests from the load balancer that ultimately
starves the server of needed resources. However, the perfmon logs show plenty
of available memory, CPU and drive space and no obvious problems in the .NET
CLR and ASP.NET counters. I've also checked for excessive session objects
not getting recycled and that does not appear to be a factor either.

Another symptom we've seen that we believe may be related is that frequently
IIS requires *two* restarts to completely come back to life. We are using
"net stop w3svc" to stop IIS and "net start w3svc" to restart.
Occasionally, the first time we issue a "net stop w3svc" the IIS service
will respond immediately indicating that it has stopped. We then restart the
service which also responds immediately. However, upon restart of IIS, the
problem frequently seems to persist.

The second stop command typically takes several minutes to complete before
reporting that it has stopped successfully. Restarting the service at
this point will then allow our web application to function normally until the
problem starts again.

Any ideas on how to troubleshoot further? I'm stumped.

RE: Idle IIS 6.0 Server Eventually Stops Responding, Logs 503 errors by basin

basin
Wed Feb 22 09:10:27 CST 2006

Try using IISState in hang mode and see if it comes up with anything.

"Matt Towers" wrote:

> Hello,
> I'm currently tracking down an ongoing problem we are having with an IIS
> 6.0 web application deployment. The environment is a small web cluster with
> two load-balanced web servers (win2k3 std, IIS6, ASP.NET 1.1, C# application
> code) and a single database server (win2k3 std., SQL 2000 std). On each web
> server, we have two distinct web applications running, for a total of 4 app
> instances across the two web servers.
>
> The problem we're seeing is that after standing idle for approximately
> 24-48 hours, where the only traffic seen is regular heartbeat tests from our
> load balancer, IIS becomes unresponsive and begins logging 503 errors to the
> IIS log. I've examined the following available logs for relevant entries:
> IIS log, perfmon (various monitors), Eventlog System, Eventlog Security,
> Eventlog Application, HTTP Error and custom application logs of our own
> design.
>
> The IIS log shows the app responding normally (HTTP 200, 302, etc.) and then
> all traffic stops for several hours. Typically the next entry shows an app
> restart, as expected when the application pool recycles.
> Shortly thereafter, with no serviced HTTP requests, the IIS logs show the app
> recycling again, whereupon it begins returning 503 errors until we manually
> restart the w3svc service.
>
> The HTTP Error log shows similar entries but with a large number of
> Connection_Abandoned_By_AppPool entries immediately prior to the 503 errors starting.
> Here's a sample of the HTTP errors as the 503 errors start:
>
> 2006-02-09 06:37:13 192.168.1.1 1201 192.168.1.18 80 HTTP/1.0 GET / - 789700071 Connection_Abandoned_By_AppPool DefaultAppPool
> 2006-02-09 06:37:13 192.168.1.1 1205 192.168.1.16 80 HTTP/1.0 GET / - 680368608 Connection_Abandoned_By_AppPool DefaultAppPool
> 2006-02-09 06:37:13 192.168.1.1 1208 192.168.1.14 80 HTTP/1.0 GET / 503 1 N/A DefaultAppPool
> 2006-02-09 06:37:18 192.168.1.1 1215 192.168.1.18 80 HTTP/1.0 GET / 503 789700071 N/A DefaultAppPool
> 2006-02-09 06:37:18 192.168.1.1 1217 192.168.1.14 80 HTTP/1.0 GET / 503 1 N/A DefaultAppPool
> 2006-02-09 06:37:19 192.168.1.1 1219 192.168.1.16 80 HTTP/1.0 GET / 503 680368608 N/A DefaultAppPool
>
> -------------------------------
> The Security EventLog shows the following failed login for our non-default
> anonymous user account:
>
> Event Type: Failure Audit
> Event Source: Security
> Event Category: Logon/Logoff
> Event ID: 534
> Date: 2/8/2006
> Time: 10:37:38 PM
> User: NT AUTHORITY\SYSTEM
> Computer: WEB4
> Description:
> Logon Failure:
> Reason: The user has not been granted the requested
> logon type at this machine
> User Name: IISUSER_ANON
> Domain: WEB4
> Logon Type: 4
> Logon Process: Advapi
> Authentication Package: Negotiate
> Workstation Name: WEB4
> Caller User Name: NETWORK SERVICE
> Caller Domain: NT AUTHORITY
> Caller Logon ID: (0x0,0x3E4)
> Caller Process ID: 2496
> Transited Services: -
> Source Network Address: -
> Source Port: -
> -----------------------
>
> My assumption is that we are doing something in our application as a result
> of the regular heartbeat requests from the load balancer that ultimately
> starves the server of needed resources. However, the perfmon logs show plenty
> of available memory, CPU and drive space and no obvious problems in the .NET
> CLR and ASP.NET counters. I've also checked for excessive session objects
> not getting recycled and that does not appear to be a factor either.
>
> Another symptom we've seen that we believe may be related is that frequently
> IIS requires *two* restarts to completely come back to life. We are using
> "net stop w3svc" to stop IIS and "net start w3svc" to restart.
> Occasionally, the first time we issue a "net stop w3svc" the IIS service
> will respond immediately indicating that it has stopped. We then restart the
> service which also responds immediately. However, upon restart of IIS, the
> problem frequently seems to persist.
>
> The second stop command typically takes several minutes to complete before
> reporting that it has stopped successfully. Restarting the service at
> this point will then allow our web application to function normally until the
> problem starts again.
>
> Any ideas on how to troubleshoot further? I'm stumped.