Webhosting bottlenecks (16)

1 Name: CyB3r h4xX0r g33k 2006-01-17 23:30 ID:yaH2BcdZ

I was thinking about the current trend of webhosts to offer huge amounts of bandwidth and disk space. It seems to me that this is similar to the shift in bottlenecks of computers themselves. As the CPU grew exponentially faster (and more costly) relative to volatile memory, which itself grew exponentially faster (and more costly) relative to the hard drive, people had to adjust their techniques to account for this, introducing caching and so on.

So if this trend continues, will we have to start writing incredibly wasteful applications just to avoid using a few cycles? I always liked how Wakaba wrote once to disk instead of regenerating everything on every view. It's a huge problem with my MediaWiki installation, especially since I can't install a PHP caching system without root access.

2 Name: dmpk2k!hinhT6kz2E 2006-01-18 06:30 ID:Heaven

> will we have to start writing incredibly wasteful applications just to avoid using a few cycles?

I'm not certain I follow.

3 Name: CyB3r h4xX0r g33k 2006-01-18 20:17 ID:pY1kgyBg

This seems like a veiled rant against Dreamhost, who offer about 20 gigs of disk space and a terabyte of bandwidth on their entry-level shared hosting plan, but limit you to 60 minutes of CPU time per day.

4 Name: dmpk2k!hinhT6kz2E 2006-01-19 04:43 ID:Heaven

It would seem to me that 60 min of CPU a day is a lot. As long as you avoid too many CGI calls, you could probably get close to that 1TB limit.

(Not that I'd know)

5 Name: CyB3r h4xX0r g33k 2006-01-19 22:27 ID:yaH2BcdZ

It is with static pages. But dynamic pages run into problems without proper caching.

6 Name: dmpk2k!hinhT6kz2E 2006-01-19 22:36 ID:Heaven

> But dynamic pages run into problems without proper caching.

Which is a problem with ~90% of dynamic sites I've seen. They could take advantage of caching, but they don't.

I don't know what's wrong with those developers.

7 Name: !WAHa.06x36 2006-01-20 12:59 ID:4k/vbLWr

People start programming in an interpreted language, and suddenly they think there's no need to consider performance at all any more.

I've seriously seen people write loops like for(i=0; (a = document.getElementsByTagName("link")[i]); i++) in JavaScript. This is in widely-used (widely-copied, more like) code: http://manganews.net/styleswitcher.js
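The problem with the quoted loop is that getElementsByTagName is called again on every iteration, so the document gets re-scanned once per element. A small sketch of the cost difference, using an invented stand-in object for the DOM (the names fakeDocument and nodes are hypothetical; only the loop shapes matter):

```javascript
// Stand-in for the DOM: counts how often the "tree" is walked.
var scans = 0;
var fakeDocument = {
  nodes: ["link", "link", "script", "link"],
  getElementsByTagName: function (tag) {
    scans++;  // one full scan per call
    return this.nodes.filter(function (n) { return n === tag; });
  }
};

// The pattern from the quoted code: one full scan per iteration,
// plus one more for the check that terminates the loop.
var a, i;
for (i = 0; (a = fakeDocument.getElementsByTagName("link")[i]); i++) {}
var scansInLoop = scans;  // 4 scans for 3 matching elements

// Hoisted version: query once, then iterate over the result.
scans = 0;
var links = fakeDocument.getElementsByTagName("link");
for (i = 0; i < links.length; i++) { a = links[i]; }
var scansHoisted = scans;  // 1 scan total
```

In other words, the quoted code turns a linear pass into a quadratic one for no benefit.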

8 Name: CyB3r h4xX0r g33k 2006-01-20 15:57 ID:c0m6uToT

I was just thinking the other day, actually, about a way to treat dynamic websites as static ones. Basically, there are three programs: LazyServer, the web server; and ControllerGET and ControllerPOST, which implement the application. A client requests the "/foo" resource, and LazyServer checks to see if it has /foo cached. If not, it asks ControllerGET to supply the "/foo" resource. The response from ControllerGET is guaranteed to be good until further notice, so LazyServer can cache it, and just treat it like a static file.

When someone makes a POST request, LazyServer calls ControllerPOST, which in addition to changing some data internal to ControllerGET and ControllerPOST and returning a page as a response, also gives LazyServer a list of GET resources that may have changed as a result of the POST (including regular expressions, e.g. "/cafe/kareha/1147429007*" if someone posted to that thread). LazyServer would then purge these resources from the cache.

It seems like a simple way to reduce the CPU consumption of dynamic websites. But it won't work in general for GET requests with side effects (e.g. download counters) or where sessions are required. Thoughts?
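The scheme in >>8 can be sketched in a few lines. This is a minimal illustration, not a real server: LazyServer, controllerGet, and controllerPost are the invented names from the post, the cache is an in-memory map, and the POST handler's invalidation list is a set of regexes over resource paths, as described.

```javascript
// Sketch of >>8: a path-keyed cache filled on GET misses and purged
// by the invalidation list returned from the POST handler.
function LazyServer(controllerGet, controllerPost) {
  var cache = {};

  this.get = function (path) {
    if (!(path in cache)) {
      cache[path] = controllerGet(path);  // miss: build once, keep until purged
    }
    return cache[path];                   // hit: served like a static file
  };

  this.post = function (path, body) {
    var result = controllerPost(path, body);
    // result.invalidate is a list of regexes over resource paths;
    // purge every cached entry that may have changed.
    for (var key in cache) {
      for (var i = 0; i < result.invalidate.length; i++) {
        if (result.invalidate[i].test(key)) {
          delete cache[key];
          break;
        }
      }
    }
    return result.page;
  };
}

// Demo: two GETs hit the generator once; a POST whose invalidation
// pattern matches the thread's path purges it, so the next GET rebuilds.
var hits = 0;
var server = new LazyServer(
  function (path) { hits++; return "page for " + path; },
  function (path, body) {
    return { page: "posted", invalidate: [/^\/cafe\/kareha\/1147429007/] };
  }
);
server.get("/cafe/kareha/1147429007");
server.get("/cafe/kareha/1147429007");   // served from cache
var hitsBeforePost = hits;               // 1
server.post("/cafe/kareha/1147429007", "sage");
server.get("/cafe/kareha/1147429007");   // rebuilt after the purge
var hitsAfterPost = hits;                // 2
```

The side-effect and session cases >>8 mentions are exactly the requests this sketch can't serve from cache, since a cached GET never reaches the controller at all.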

9 Name: CyB3r h4xX0r g33k 2006-01-20 16:02 ID:Heaven

By the way: Holy fuck, >>3. I knew Dreamhost was overselling, but 1 TB a month for $8 is insane.

10 Name: Albright!LC/IWhc3yc 2006-01-20 16:03 ID:WonSepyJ

>>8: That's basically what http://smarty.php.net is doing for me in Thorn. It's generating and caching pages as they are requested. Then, when a post is made (or something else dynamic-y is done), I have it frag all the pages that were changed from the cache, to be regenerated the next time they're requested. I haven't done any serious in-depth benchmarking or anything, but it does the job.

11 Name: CyB3r h4xX0r g33k 2006-01-20 16:07 ID:c0m6uToT

>>10
But this just caches the pages in a PHP script, doesn't it? I'm talking about pages actually being static, so the web server can cache them (and also cache the gzipped versions or whatever, as necessary). I don't think a PHP script should do the job of a server (cache pages).

12 Name: dmpk2k!hinhT6kz2E 2006-01-21 00:27 ID:Heaven

A number of sites do what >>8 mentions with caching proxies.

A mechanism to invalidate pages based on side-effects is new to me though. It would be a nice thing to have, but I'm not aware of any system that does this. Anyone?

13 Name: CyB3r h4xX0r g33k 2006-01-21 00:28 ID:yaH2BcdZ

>>8
Squid servers do that. See http://meta.wikimedia.org/wiki/Cache_strategy for what Wikipedia does.

>>9
Everyone is doing the same thing, actually. Dreamhost is just pushing ahead a little bit.

14 Name: CyB3r h4xX0r g33k 2006-01-21 16:07 ID:Heaven

>>13
They're pushing ahead more than "a little bit." My last host's offering was an order of magnitude less: 75 GB a month for $10, and that was about the best I could find (as of late 2004) from any place that wasn't obviously fly-by-night.

Of course, in the terms of service for these overselling hosts, they always make sure you can't actually use all that bandwidth, by forbidding "download" sites, using the account for storage, combining several accounts and using them for one site, etc. Some of them even forbid you to store porn. If the bandwidth was really yours, it wouldn't matter how you used it.

15 Name: dmpk2k!hinhT6kz2E 2006-01-22 09:27 ID:Heaven

"Overselling" is just an economic optimization. The telephone system is oversold, as is the electricity grid. Aircraft too.

Imagine how expensive it would be if they had to have a system that could support everyone calling at once. Add a little bit of statistical multiplexing and you can drop the price dramatically, while retaining acceptable quality of service 99.9% of the time.
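A quick back-of-the-envelope version of that argument, with invented numbers (1000 customers, each independently active 2% of the time): capacity for the worst case is 1000 units, but the capacity that suffices 99.9% of the time comes out of the binomial distribution and is far smaller.

```javascript
// How much capacity covers 99.9% of moments if 1000 independent
// customers are each active with probability 0.02?
var N = 1000, p = 0.02, target = 0.999;

// Walk the Binomial(N, p) distribution until the CDF reaches the target,
// using the recurrence pmf(k+1) = pmf(k) * (N-k)/(k+1) * p/(1-p).
var pmf = Math.pow(1 - p, N);  // P(0 customers active)
var cdf = pmf;
var k = 0;
while (cdf < target) {
  pmf *= ((N - k) / (k + 1)) * (p / (1 - p));
  k++;
  cdf += pmf;
}
var capacity = k;  // around 35 units, versus 1000 for the worst case
```

So under these made-up assumptions the provider can sell roughly 30x the capacity it builds and still disappoint almost nobody, which is exactly the phone-network trick.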

But I have to wonder about the amount Dreamhost is offering. I'm sure they've done their math, but it still seems hairy.

16 Name: CyB3r h4xX0r g33k 2006-01-22 17:56 ID:Heaven

But like I said in >>14,

> if the bandwidth was really yours, it wouldn't matter how you used it.

Suppose I want to run a download site, or store files that I link to from elsewhere. Then I need a host that expects me to use the bandwidth they sell me -- therefore, one that doesn't oversell.

This thread has been closed. You cannot post in this thread any longer.