Hi there
We have been having a rather strange problem on one of our classic ASP
websites for a couple of weeks now. During office hours, when it is
under load, the site has 100% uptime. The problem is that after hours
(usually between 2 AM and 6 AM) and over weekends, when there are hardly
any visitors, the site becomes unresponsive at times. Requests simply
time out.
It runs on two Dell PowerEdge servers with Windows Server 2003
Enterprise Edition: a web server and a separate SQL server. The site is
not that busy, only about 16 000 unique visitors per day, so I doubt it
is a server performance issue, as we run decent hardware. The
technologies used are IIS 6.0 and SQL Server 2000, with load-balanced
Cisco switches and firewalls between the DB and web servers.
There are no errors in the System or Application event logs, and the
HTTP performance log shows nothing out of the ordinary.
The symptoms are pretty strange: the site just starts timing out at
night. There are no services scheduled at the times the site goes
down, and it does not go down at the same time every night or
weekend.
When I restart IIS, the site comes back online for a while and then
goes down again. If I do not restart IIS, the site keeps going
online and offline at random until it recovers by itself around 6 AM,
before business hours, and then stays online throughout the day. The
cycle repeats the following night.
Because it only goes down at night, that basically rules out the
majority of the causes I have looked at:
- Performance issues on the servers (would have failed under load)
- Anti-Virus/Firewall etc issues interfering with the server (would
have caused problems during the day)
- Website application issues (would have caused problems during the
day and we use the same CMS for a few sites and we don't have issues
on any other site)
- Backups on the server (Backup schedule does not correlate with
downtimes)
- No errors reported in eventlogs or http performance logs
- Windows updates (no updates were installed at the time that the
problems started)
- Server changes (there were no known changes to the servers at the
time the problems started)
What I am still looking at:
- DoS attacks at night. However, there is no evidence of this, and we
do not maintain the firewalls, so log access is almost impossible. The
ISP does not want to cooperate, which makes me a bit suspicious.
- General network issues at night in the datacentre (this would
however not explain why an IIS reset temporarily fixes the problem)
- The site was working perfectly until a certain date, then suddenly
it started going down. Trying to find out what changed.
- What I find odd is that certain connections to the server seem to
go through. For instance, on one internet connection the site is
available but on others it times out. It also times out on one
connection, only to work 10 seconds later after a refresh on the same
connection. So it feels like something is interfering with sessions.
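To pin down exactly when, and from which connections, the timeouts occur, I am thinking of running a small probe from a few different outside connections. This is only a rough sketch in Python: the URL, the 30-second interval and the log file name are placeholders I made up, not our real setup, and the function names are just for illustration.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url once and return (status_text, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            status = "HTTP %d" % resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 500 or 503).
        status = "HTTP %d" % exc.code
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer: timeout, connection refused, DNS failure, etc.
        status = "FAIL %s" % exc.__class__.__name__
    return status, time.monotonic() - start


def format_line(when, status, elapsed):
    """Build one tab-separated log line: timestamp, status, duration."""
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    return "%s\t%s\t%.3fs" % (stamp, status, elapsed)


def run(url, interval=30.0, count=None, log_path="probe.log"):
    """Probe url every `interval` seconds; count=None means run forever."""
    done = 0
    while count is None or done < count:
        status, elapsed = probe(url)
        with open(log_path, "a") as fh:
            fh.write(format_line(datetime.now(), status, elapsed) + "\n")
        done += 1
        time.sleep(interval)
```

Calling something like `run("http://www.example.com/default.asp")` overnight from two or three different connections should produce logs that can be lined up against each other and against the IIS logs, to see whether the failures hit all connections at once or only some of them.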
My main focus is, however, still on IIS 6.0. I feel that is where the
problem is, but I am pretty much stumped at the moment. Has anyone
experienced anything similar? Any idea what could be causing this?
Sorry for the long and rambling explanation. If you need any more
details, please let me know.
Thanks!