Intermittent problems with IIS


Sebastian
2004-10-06, 10:50 AM
Hello everyone out there.

We have a Windows 2003 server hosting around 50 websites, most of them very low traffic, except for 3 or 4. We also run an email server (MailEnable) and a stats server on the same machine. CPU usage is very low, bandwidth usage is also very low, and we never had any problems. Most of the sites we host use ASP scripting.

This changed a couple of months ago, when we started suffering what looked like web server outages. After a few of them, I realized that the server was not actually down: IIS was taking far too long (several seconds, sometimes up to 20 or 30) to return pages.

I even caught some of these "downs" while I was logged in via Terminal Services, and noticed that CPU usage stayed low. I then tried to analyze the log files, and added a performance log for IIS, one record per second. Analyzing the performance log files (not the IIS log files), I noticed that during these events IIS stops responding to further requests and they are queued. The number of queued requests reported in these logs increases for some seconds, and after 10-30 seconds it drops back to zero and everything works normally again. During these 10-30 seconds, the server is not serving pages.
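
For anyone who wants to pull these episodes out of a per-second counter log automatically, here is a minimal Python sketch. It assumes the performance log has already been read into (timestamp, queued requests) pairs; the function name, the 5-second cutoff, and the sample numbers are made up for illustration and do not come from the real server.

```python
from datetime import datetime, timedelta


def find_queue_episodes(samples, min_seconds=5):
    """Return (start, end, peak) for each run of non-zero queue depth.

    samples: iterable of (datetime, queued_requests), ordered by time.
    Runs shorter than min_seconds are ignored as ordinary blips.
    """
    episodes = []
    start = last = None
    peak = 0
    for ts, queued in samples:
        if queued > 0:
            if start is None:
                start, peak = ts, 0  # a new run begins here
            last = ts
            peak = max(peak, queued)
        elif start is not None:
            # The queue drained: close the current run.
            if (last - start).total_seconds() >= min_seconds:
                episodes.append((start, last, peak))
            start = None
    # Close a run that is still open when the log ends.
    if start is not None and (last - start).total_seconds() >= min_seconds:
        episodes.append((start, last, peak))
    return episodes


# Made-up data: one sample per second, as in the performance log described above.
base = datetime(2004, 10, 5, 21, 30, 0)
queue_depths = [0, 0, 3, 5, 9, 4, 2, 1, 0, 0, 1, 0]
samples = [(base + timedelta(seconds=i), q) for i, q in enumerate(queue_depths)]

episodes = find_queue_episodes(samples)
for start, end, peak in episodes:
    print(start, "to", end, "peak queue:", peak)
```

The start and end times of each episode can then be compared against the IIS log entries written around the same moment.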

I suspected the stats server and disabled it, but the problem still happened.

At one point, I saw that the IIS log files showed activity from the Google AdWords spider and Overture, sometimes requesting a script 2 or 3 times per second. I thought that they might be requesting too many pages at once, but I checked the URL they were requesting, and the page is processed very fast.

Any clue? This is happening more than 5 times a day, so my clients are getting kind of nervous... any help will be greatly appreciated!

Thanks!

Sebastian
2004-10-06, 11:52 AM
I also noticed that the following lines are written to the IIS log files most times this happens. It seems that IIS restarts after some problem?

#Software: Microsoft Internet Information Services 6.0
#Version: 1.0
#Date: 2004-10-05 21:31:21
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status
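
To line these header blocks up with the slow periods, one option is to collect every `#Date:` line from the log files. The Python sketch below does only that; it assumes, as suspected above, that a header block in the middle of a file marks a point where logging started again, which is not proof of a crash by itself. The function name and the shortened sample lines are illustrative.

```python
from datetime import datetime


def header_times(lines):
    """Return the timestamp of every '#Date:' directive found in the log lines."""
    times = []
    for line in lines:
        if line.startswith("#Date:"):
            stamp = line[len("#Date:"):].strip()
            # W3C extended logs normally record times in UTC.
            times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
    return times


# Header block in the shape quoted above (the #Fields line is shortened here).
sample = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Version: 1.0",
    "#Date: 2004-10-05 21:31:21",
    "#Fields: date time s-ip cs-method cs-uri-stem",
]

restarts = header_times(sample)
print(restarts)
```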

DXD
2004-10-06, 19:41 PM
It sounds to me like one of your ASP websites is causing your IIS server problems.

Do you have your sites running in High Isolation (the Application Protection setting)?

- Low causes websites to run in the IIS process space. If one site causes an error, ALL sites, including IIS itself, will die and restart.

- Medium causes websites to run in pools. If a site in a pool causes an error, it can kill itself and everyone else in the same pool.

- High Isolation causes websites to run in their own process space, so if they die they do no harm to anyone else.

Of course you would say High is the best way to go, and it is when hosting a lot of websites, except that there is a cost penalty.

Additional processing time is required to start up those separate processes.

My guess is your websites are running in Medium and one of the sites is not playing nice and causing the others to crash.

But hey, I'm no expert. Well, I might be: I know a lot about everything, but I would never call myself an expert at any one thing.

Hopefully this helps.

Sebastian
2004-10-06, 20:00 PM
Thanks Chris for your response!

Yes... my first thought was that some ASP script was behaving badly. Actually, I downloaded all the IIS log files, merged them into one, and sorted all the entries using a little program. Then I tried to see whether one script or site could be responsible for the problem, but I could not see any relationship between the last script or site requested and the appearance of each problem.
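
For reference, the merge-and-sort step could look roughly like the Python sketch below. It is not the little program mentioned above, just an illustration: it assumes W3C extended logs whose `#Fields:` line includes `date` and `time`, and the log names and request lines are invented.

```python
from datetime import datetime


def parse_w3c(lines, source=""):
    """Yield one dict per request line, keyed by the names in '#Fields:'."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split(" ")
            if len(values) != len(fields):
                continue  # skip truncated or malformed lines
            entry = dict(zip(fields, values))
            entry["when"] = datetime.strptime(
                entry["date"] + " " + entry["time"], "%Y-%m-%d %H:%M:%S"
            )
            entry["source"] = source  # remember which log the entry came from
            yield entry


def merge_logs(named_logs):
    """Merge several logs (name -> lines) into one list sorted by timestamp."""
    merged = []
    for name, lines in named_logs.items():
        merged.extend(parse_w3c(lines, name))
    merged.sort(key=lambda e: e["when"])
    return merged


FIELDS = ("#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port "
          "cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status")

# Invented sample entries for two hypothetical sites.
logs = {
    "site1.log": [
        FIELDS,
        "2004-10-05 21:31:25 10.0.0.1 GET /a.asp - 80 - 10.0.0.9 Mozilla/4.0 200 0 0",
        "2004-10-05 21:31:30 10.0.0.1 GET /c.asp - 80 - 10.0.0.9 Mozilla/4.0 200 0 0",
    ],
    "site2.log": [
        FIELDS,
        "2004-10-05 21:31:27 10.0.0.1 GET /b.asp - 80 - 10.0.0.8 Mozilla/4.0 200 0 0",
    ],
}

merged = merge_logs(logs)
for e in merged:
    print(e["when"], e["source"], e["cs-uri-stem"], e["sc-status"])
```

With everything on one timeline, the requests logged just before each stall can be listed side by side, whichever site they belong to.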

I just read that IIS 6 manages application isolation differently from IIS 5. Apparently, you can have multiple application worker processes running at once, and IIS 6 is smart enough to ping them, restart them if they are faulty or after a certain number of requests, limit bandwidth and CPU usage, etc.

I just made some changes to the server configuration: I allowed two worker processes to run for the standard application pool (which holds all the sites I am running now). From what I read, if one of these worker processes becomes unresponsive, the other one will take over. If this solves the problem, I will create another application pool, place some websites in one pool and the rest in the other, assign one worker process to each, and see which application pool stops responding (and whether the other keeps working). Repeating this a few times, I could track down which application is causing the problem.
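
The pool-splitting idea is essentially a binary search over the list of sites. A rough Python sketch of the bookkeeping follows. It assumes exactly one site is at fault and that the stall reliably shows up in whichever pool contains it; `pool_stalls` is a hypothetical stand-in for "run this pool for a while and watch whether it stops responding", and the site names are invented.

```python
def split_in_two(sites):
    """Split the suspects into two pools of (nearly) equal size."""
    middle = len(sites) // 2
    return sites[:middle], sites[middle:]


def find_faulty_site(sites, pool_stalls):
    """Narrow the suspects down by halving them each round.

    pool_stalls(pool) -> bool is supplied by the caller (hypothetical here).
    Only the first pool is checked; if it behaves, the fault is assumed to be
    in the second one. If neither pool stalls, that assumption is broken and
    the result means nothing.
    """
    suspects = list(sites)
    rounds = 0
    while len(suspects) > 1:
        pool_a, pool_b = split_in_two(suspects)
        suspects = pool_a if pool_stalls(pool_a) else pool_b
        rounds += 1
    return suspects[0], rounds


# Simulated run: 50 invented site names, one of them secretly at fault.
sites = ["site%02d" % n for n in range(1, 51)]
faulty = "site37"

culprit, rounds = find_faulty_site(sites, lambda pool: faulty in pool)
print(culprit, "found after", rounds, "rounds")
```

With around 50 sites this takes at most 6 rounds, but each round means waiting for the problem to appear again, so at 5 or more incidents a day it could still take a few days.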

Now, if it is not an application causing the problem, all this theory will be useless... :( ... well, I will cross my fingers and let you know!

Thanks again!