Friday, 30 December 2011

Explaining IIS


Introduction

Whenever I look at Bill Gates, I can't help but ask: how big is Microsoft again?  Don't give me market capitalization and market share: I'm feeling paranoid and anyone can doctor the numbers.  Happily, there's the Internet. 

If you don't trust Google, you can just point your browser at random websites, over and over again, and ask them which server they're running on.  As it turns out, one out of four runs Microsoft - specifically its Internet Information Services (IIS) application.  It's a bit behind Apache's HTTP server, but it's a respectable figure nonetheless, and not really surprising at all.
Until you look at it from a historical perspective, that is.  Considering how indispensable web servers are, it may surprise you to know that IIS began as a free add-on to Windows NT 3.51, and remains bundled with most editions of Microsoft's flagship OS to this day, even in such client versions as Vista (IIS 7.0) and 7 (IIS 7.5), albeit with limited functionality in the latter two. Though thoroughly revamped in recent editions and supporting all the standard protocols (FTP/S, HTTP/S, SMTP and NNTP), its powerful potential is nevertheless denied to most Windows users simply because it's an optional feature.
In this article, we'll walk down that electronic memory lane, and see how an innocuous service quartered the World Wide Web.
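For the curious, "asking a website which server it's running on" boils down to reading the Server header of its HTTP response.  Here is a minimal Python sketch, assuming the raw response has already been fetched somehow; the function names and the sample response are made up for illustration, and a server is free to omit or fake the header.

```python
from typing import Optional


def server_header(raw_response: str) -> Optional[str]:
    """Return the value of the Server header in a raw HTTP response, if any."""
    # Headers end at the first blank line; everything after it is the body.
    head = raw_response.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return value.strip()
    return None


def is_iis(raw_response: str) -> bool:
    """True if the response claims to come from IIS."""
    value = server_header(raw_response)
    return value is not None and value.lower().startswith("microsoft-iis")


# Hypothetical response, written by hand rather than captured from a real site.
SAMPLE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Server: Microsoft-IIS/7.5\r\n"
    "\r\n"
    "<html></html>"
)

print(server_header(SAMPLE), is_iis(SAMPLE))
```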

History

Microsoft took its first stab at web servers almost two decades ago. Since the World Wide Web was anything but world-wide back then, it didn't approach the project with a lot of enthusiasm.  Its research arm at the University of Edinburgh, the European Microsoft Windows NT Academic Centre (EMWAC), duly came up with a server.  It was so obviously unsellable (in the short term, anyway) that they gave it away as freeware.
They had a point: the EMWAC server was fatally flawed because it had no way of scaling up.  As internet traffic grew at breakneck speed, Microsoft found its own website creaking under unrelenting pressure.  For those who were yet to be convinced by the dreamy potential of MS products, as well as that of the internet at large, a 404 error was a tad underwhelming.
So it had to bite the bullet and develop a new web server in-house.  Since nobody back then could fathom an age in which individuals would want to host their own websites, the team quite sensibly tied the first IIS to the Windows NT 3.51 platform. The second version stuck to the strategy, supporting NT 4.0. ASP came next (IIS 3.0), and Gopher bade us farewell soon thereafter (IIS 4.0).
By the time XP arrived on the scene, 5.1 was ready to go; FastCGI support followed later as an add-on.  Since web hosting was no longer seen as the prerogative of servers, a limited version was released for personal editions of the Windows OS for the first time.  In XP Pro, a maximum of ten simultaneous connections were allowed.  Later versions limited concurrent requests instead, so extra requests were queued rather than dropped.  The myriad distinctions between professional and home users were finally dropped in Windows 7.

Security

The moment you realize IIS was designed to fend off a looming disaster, you will come to forgive its patchwork and haphazard design.  There was simply no other way than to piggyback on Windows NT's vast infrastructure.
Of course, there were heavy drawbacks.  One spectacular casualty was security: IIS's vulnerabilities were epic. There were three chief back-doors: a wide attack surface due to the automatic installation of unnecessary components, an Operating System which was generous with rights and privileges, and the explosion of dynamic scripts.
The loopholes culminated in the Code Red worm, which was a wake-up call for the corporation.  While IIS was provided ex gratia, it is nonetheless an integral component of many Windows servers.  Just as Internet Explorer became the browser of choice for the majority of web surfers, IIS became the mainstream web hosting implementation without even trying.  Why would you bother to revamp a tied-in product unless your reputation was being dragged through the mud?
So redemption came in due course.  Holes were plugged incrementally: shiny security features rolled out in versions 4 through 6, even though the majority of them weren't features at all.  Web Service Extensions, for instance, was actually a ban on program execution unless expressly authorized by the system administrator.
Another glaring design fault that was eventually remedied in 6.0 was the user policy.  IIS used to run websites in-process under a super-user account, which is about as sane as handing your shop's keys to every visitor that happens to pop by. In 6.0 a new account was assigned to the task, and since it was significantly less privileged, any successful attack would be contained.  The kernel HTTP stack was revamped, with a stricter request parser and cache.
These innovations (or rather grudging admissions of negligence, if you've fallen foul of them before) didn't reassure too many people.  There is still a variety of third-party security suites for the paranoid.  On the application layer, the user can access a more comprehensive list of elements (even a shiny new GUI isn't out of the question).

Dynamic scripts

Another headache came in the form of dynamic scripts. When IIS was born, nobody expected a web server to do anything more than feed the browser something static to read.  It's like selling a newspaper.  Now we have very sophisticated, demanding scripts linked or even embedded in a page.  The web server becomes not so much a bookshop as an in-house copywriter.
When dynamic scripts first came onto the scene, there was a cacophony of program snippets.  Since it was downright insane to write separate versions of the code for different OSs and different web servers, the industry sat down together and agreed on a standard API: the Common Gateway Interface (CGI), which went on to be widely supported.
So you write up the application in whichever language you fancy - Perl, C++, Pig Latin - and it'll take text inputs and write the page. So far so good for the IIS user!  There are only a few minor caveats: you'd better know what you're doing, because Microsoft isn't going to provide any tools.  It's not running a charity, is it?
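To make "take text inputs and write the page" concrete: under CGI the server hands the request to your program through environment variables such as QUERY_STRING, and the program writes headers, a blank line, and the page to standard output.  A minimal Python sketch follows; the `name` parameter and the `render` helper are invented for the example, and it says nothing about how any particular server is configured to launch it.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def render(environ) -> str:
    """Build a full CGI response (headers, blank line, body) from the environment."""
    # The server passes the part of the URL after "?" in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]  # "name" is a made-up parameter
    # Escape user input before putting it into HTML.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # When run by a web server as a CGI program, stdout goes back to the browser.
    sys.stdout.write(render(os.environ))
```

Every request means a fresh run of the program, which is exactly where the overhead comes from.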
By the time you've mastered it all, you'll realize in agony that IIS isn't being very supportive after all.  Unlike UNIX, whose programming philosophy is to do one thing and do it well, Windows prefers to do everything under a single process.  It will only spawn new ones reluctantly, and the time and resource overhead add up inexorably when you have multiple external executables.
So Microsoft came up with another approach. Executables are running your server down?  Replace them with Windows DLLs!  The Internet Server API (ISAPI) was confined to the Windows platform thereafter: if need be, you could always run Apache.  Thankfully, migration is eased by a free Perl implementation from ActiveState.
It strikes one as a boatload of poetic justice, but now that everything is running under a single process, IIS has simply replaced one problem with another.  With all the eggs in one basket, IIS is now forced to share its memory address space with the DLLs that run the scripts.  There's no guarantee that the scripts are always immaculately written, so a memory leak can drag down the whole ship.  If one of them crashes, everything goes down with it.  It took Microsoft six generations to introduce a measure of isolation for processes.
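The idea behind isolation can be shown in a few lines of Python.  This is only an illustration of the principle, not of how IIS implements it: the "scripts" are throwaway one-liners and `run_isolated` is a made-up helper.  Each script runs in its own child process, so when one dies abruptly the parent merely gets an exit code back and carries on.

```python
import subprocess
import sys


def run_isolated(source: str, timeout: float = 30.0) -> int:
    """Run a script in a separate interpreter process and return its exit code."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,  # keep the child's output away from our own
        timeout=timeout,
    )
    return completed.returncode


# Stand-ins for real scripts: one dies without cleanup, one behaves.
CRASHING = "import os; os._exit(3)"
HEALTHY = "print('page rendered')"

codes = [run_isolated(CRASHING), run_isolated(HEALTHY)]
# The parent ("the server") is still alive to report what happened.
print("server still running; exit codes:", codes)
```

The price is the very process-spawning overhead that pushed Microsoft towards DLLs in the first place, which is why the trade-off took so long to settle.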
But at least Microsoft has carved out a turf to call its own.  Responding to calls from developers, Active Server Pages was built on top of ISAPI in IIS 3.0.  It is essentially the Visual Basic of web applications: you put together some off-the-shelf COM objects, such as the Session one (essentially a way of remembering each visitor's preferences as they walk around), and they spew out HTML that can go directly from the server to the browser.  Visual InterDev became the developer's tool of choice for building such applications, since it provided a visual preview of the interaction between HTML and dynamic script.  Of course, developers then proceeded to complain about how impossibly difficult it was to actually construct COM objects.
And of course, until they were isolated with the rest of ISAPI in 6.0, badly built COM objects could and did bring down the whole server with them. The competent coders, on the other hand, frowned upon the pared-down approach.  Code became more difficult to organize or debug. As the size and complexity of the web application increased, so did the length of the script, and the advantages of a simple scripting language dissipated.  Meanwhile, genuine VB developers rejected the moniker altogether, citing the lack of high-level components in the Visual InterDev workflow.
In the end, they gave up on it all.  ASP.NET duly succeeded it as the web application platform of choice, but that's another story.
In its latest incarnation, IIS had everything rebuilt from scratch. There's only so much you can plaster over before the walls come crumbling down.  Besides, internet demands have grown so drastically that it is unthinkable that an approach which began in the 90's could scale up sufficiently.
The biggest change is the architecture. Everything is packaged in modules and installed according to your needs.  Apart from a more rational redistribution of tasks and responsibilities, it avoids the dangerous scenario where you install an extension by accident, only to leave it idle and unattended until a hacker comes around.
The core module handles all tasks related to HTTP in the pipeline.  It does all the old-fashioned work of responding to requests, disappointing people with 404s, and the occasional redirect. All security is handled in a separate module, which is responsible for such things as authentication, URL authorization, and filtering as specified by the administrator. 
Of course, there is still the small matter of actually giving out content, and as far as static content goes, it is handled in the same old way. But of course there are differences. For example, a typical website in 2010 might hold as many graphics as the entire GeoCities 15 years ago. Compression has become a complex and urgent task, and it has its own module. Dynamic responses get compressed and encoded on the fly, and static content is pre-compressed.
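The split between the two kinds of compression is easy to picture in code.  Below is a rough Python sketch using gzip from the standard library; the cache, the function names and the compression levels are my own choices for illustration and are not taken from the IIS module.  Static files are compressed once and reused, while generated pages are compressed per response at a cheaper setting.

```python
import gzip

# Compressed copies of static files, keyed by path (an in-memory stand-in
# for whatever store a real server would use).
_static_cache = {}


def compress_static(path: str, content: bytes) -> bytes:
    """Compress a static file once, thoroughly, and reuse the result."""
    if path not in _static_cache:
        _static_cache[path] = gzip.compress(content, compresslevel=9)
    return _static_cache[path]


def compress_dynamic(body: bytes) -> bytes:
    """Compress a generated page on the fly, trading ratio for speed."""
    return gzip.compress(body, compresslevel=1)


def respond(body: bytes, accept_encoding: str, path=None):
    """Return (extra headers, payload); compress only if the client allows gzip."""
    # Simplification: a plain substring check that ignores q-values.
    if "gzip" not in accept_encoding.lower():
        return {}, body
    payload = compress_static(path, body) if path else compress_dynamic(body)
    return {"Content-Encoding": "gzip"}, payload


page = b"<p>hello</p>" * 200
headers, payload = respond(page, "gzip, deflate", path="/index.html")
print(headers, len(page), "->", len(payload))
```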
Even such minutiae as caching and logs get their separate compartments, since a well-managed cache can cut down loading time and miscellaneous resources considerably. Especially if we are talking about twenty million clicks on a single internet meme. 
Of course, if you insist upon every single new functionality under the sun, there are quite a few in IIS 7.  FTP publishing, for example, has become incredibly sophisticated.  Your website can now accept SSL-based FTP connections, which are just as secure as any other SSL connection but carry no right to do anything other than publish.
If you are a system administrator, there is a pack just for you: it provides user interface support for a multitude of management features under IIS 7.0. You can set ASP.NET authorization, configure your FastCGI or filter requests to your heart's content without breaking out that 900-page manual. Fret not if you have been running multiple servers across oceans and continents: there is more help yet. Load balancing has been made a lot easier through the Application Request Routing module, which acts as a proxy and forwards requests to content servers according to such factors as HTTP headers, server conditions, and a finely designed load balancing algorithm.  If media is your business, there's another module dedicated to delivering it.  And if you work with databases instead of content, there's a manager for you too.
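The routing decision described above can be sketched in Python as follows.  This is a toy model under my own assumptions, not the Application Request Routing algorithm: the `Backend` class, the farm names and the "fewest active requests" rule are all invented for the example.  The Host header picks a group of content servers, unhealthy servers are skipped, and the least busy of the rest gets the request.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A content server as the proxy sees it (hypothetical model)."""
    name: str
    healthy: bool = True
    active_requests: int = 0


def pick_backend(headers, farms, default_farm):
    """Choose a content server from HTTP headers and current server conditions."""
    # 1. HTTP headers: the Host header selects a farm, else the default one.
    host = headers.get("Host", "").lower()
    farm = farms.get(host, farms[default_farm])
    # 2. Server conditions: ignore servers that are marked unhealthy.
    candidates = [b for b in farm if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy content server available")
    # 3. Load balancing: send the request to the least busy candidate.
    chosen = min(candidates, key=lambda b: b.active_requests)
    chosen.active_requests += 1
    return chosen


farms = {
    "www": [Backend("web1", active_requests=2), Backend("web2", healthy=False)],
    "media.example.com": [Backend("media1")],
}
print(pick_backend({"Host": "media.example.com"}, farms, "www").name)
```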
The cherry on the cake is the Web Deployment Tool: it synchronizes servers and migrates sites from 6.0 to 7.0, and handles web applications to boot.

Conclusion

Just migrate already. If you are thinking of hosting a web server, you might save yourself yet. But for those of us who're already in, there's no exit.
IIS has undergone nothing short of a metamorphosis, from a makeshift feature to a full-blown service.  But make no mistake: it's only there to make sure you'll buy the Operating System.  But then chances are you've got one already, haven't you?
