Today I fight, to speed the site

Admin

Today I have been working on a few optimisations to try to speed up the site, by tweaking PHP, Drupal, Apache, and MySQL.

Here is what I've done:

Drupal:
* Installed a JavaScript aggregation system.
* Removed the "Next thread"/"Last thread" links on forum pages, which wasted a huge amount of time generating links that, so far as I can tell, never appeared on the page.
* Removed the check for URL aliases for anything beginning /user or /comment - we aren't going to alias those so there's no point checking hundreds of comments for them.
* Moved JavaScript to the end of the file rather than the head.
* Set JavaScript to have the defer="defer" property.
(The last two broke IE, so I rolled back. Bah.)

Apache:
* Enabled Apache Gzip compression for browsers that support it.
* Increased the Apache expiry time for caching for some filetypes from one day to three.
* Enabled ETags in Apache, though since we are a single-server thing, this will have minimal effect. ETags would be *so* cool if multiple sites could use them to share and cache the same resource...

MySQL:
* Increased MySQL cache sizes.
* Disabled MySQL system-level file locking.
* Reduced a number of timeouts.

PHP:
* Disabled output buffering.
* Reduced a couple of timeouts.
* Increased the number of bits per character for the session ID.

Most of these will individually have negligible or no impact on speed, but taken together, I hope you find it at least a little noticeable. Please let me know if anything breaks.

[Edit: Stuff we could still improve:
Page load times for me have dropped from 12-18 seconds per page to 5-7 seconds, or only 1-2 seconds if I'm not logged in!

While I have it down to a single JavaScript file, this page still has 15 CSS files and 7 CSS background images. That's a lot of spurious HTTP requests that could probably be reduced, with work. But since they are all cached, these aren't where the 6 second page load time is coming from.

Could save a few KB by obfuscating the JavaScript, but again JavaScript is cached, so this wouldn't reduce that 6 seconds.

Disabling Google Analytics and the WebSnapr images (the two things that I thought might be slowing down the page) didn't change load times.
Disabling the JavaScript file from being loaded, however, did: load times dropped to about three seconds.
So it looks like it takes three seconds to load, parse and run the JavaScript.
I'm working on getting that to load dynamically now...
]

--Yet another geek.

Site upgrade

Re: Today I fight, to speed the site

Druid

In the spirit of updating and making things better: I have slightly rearranged the forums to make it much easier and quicker to find your way around :) If you spot a schoolboy error that I have made, please PM me.

Regards,
Ponder


--

+++divide by cucumber error+++please reinstall universe and reboot+++


Re: Today I fight, to speed the site

Admin

So I've been trying all day to get JavaScript to load dynamically, and failed.
I have a file /files/js/dewitest.js which, at the end, calls the function dewi_loaded (as well as including all the other relevant JavaScript code for the page).

This is the dewi_loaded function:

<script type="text/javascript">
function dewi_loaded() {
  alert("loaded");
  Drupal.extend({
    settings: {
      "websnapr": {
        "previewBubbleSelector": "div.node a",
        "previewBubbleBG": "/modules/websnapr/images/bg.png"
      },
      "communityTags": {
        "tags": [  ],
        "url": "/community-tags/1747",
        "add": "Add"
      }
    }
  });
}
</script>

Now, this works:

<script type="text/javascript" src="/files/js/dewitest.js"></script>

This does not. The alert is fired, so this code is loading and running the js file, which then calls dewi_loaded.
But the initialisation done by Drupal.extend is not fully working: websnapr mouseover does not work.

<script type="text/javascript">
  var fileref=document.createElement('script')
  fileref.setAttribute("type","text/javascript")
  fileref.setAttribute("src", '/files/js/dewitest.js')
  document.getElementsByTagName("head")[0].appendChild(fileref)
</script>

The following is worse. Again, the alert dialog shows, but the community tag input field does not appear, the websnapr mouseover does not work, and the dynamic menus don't work.
How? Why? I don't know. There are no errors, it's just... not working.

<script type="text/javascript">
// Note: filetype is accepted but unused; only scripts are loaded here.
function loadjscssfile(filename, filetype) {
  var fileref = document.createElement('script');
  fileref.setAttribute("type", "text/javascript");
  fileref.setAttribute("src", filename);
  document.getElementsByTagName("head")[0].appendChild(fileref);
}
// Pass a function rather than a string so setTimeout does not have to eval it.
setTimeout(function () { loadjscssfile('/files/js/dewitest.js', 'js'); }, 2000);
</script>
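
One thing none of these attempts does is wait for the script's own load event before relying on it. Here is a minimal sketch of a loader that takes a callback. The name loadScript and its doc parameter are made up for illustration (passing the document in means it can be tried with a stub outside a browser); nothing here is part of Drupal or jQuery. It would not by itself explain the missing initialisation: if code inside the file waits for the page's ready or load event, and that event has already fired by the time the file arrives, that code may never run. That is a guess, not something confirmed here.

```javascript
// Hypothetical helper: loadScript and its parameters are illustrative names,
// not part of Drupal or jQuery.
function loadScript(doc, src, onLoad) {
  var script = doc.createElement('script');
  var done = false;

  function finish() {
    if (done) { return; }   // guard so the callback runs only once
    done = true;
    script.onload = script.onreadystatechange = null;
    if (onLoad) { onLoad(); }
  }

  script.type = 'text/javascript';
  script.src = src;
  // Most browsers fire onload once the file has been fetched and run.
  script.onload = finish;
  // Older versions of IE report progress through readyState instead.
  script.onreadystatechange = function () {
    if (script.readyState === 'loaded' || script.readyState === 'complete') {
      finish();
    }
  };
  doc.getElementsByTagName('head')[0].appendChild(script);
  return script;
}

// In a page this would be called as:
//   loadScript(document, '/files/js/dewitest.js', function () { /* initialise here */ });
```

Anything that needs the file's functions to exist would go inside the callback rather than straight after the call.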

--

Yet another geek.


Re: Today I fight, to speed the site

Librarian

i am learning javascript, any lessons????


--

So many penguins, so few recipes!!!


Re: Today I fight, to speed the site

Druid
heartofgold wrote:

i am learning javascript, any lessons????

Java and JavaScript are different :) There are some good websites out there; Google is your friend.


--

+++divide by cucumber error+++please reinstall universe and reboot+++


Re: Today I fight, to speed the site

Admin

Not much I can suggest for learning JavaScript really, other than to practice lots. w3schools.com is the best site I've found for reference.

I've made some more changes over the last few days to help speed the site further.

1) We have been subjected to a barrage of bots hammering all our scripts, trying to exploit them to attack other sites (and failing, it seems). This has been gobbling bandwidth, so I have made a "honeypot": you will see that the copyright symbol at the bottom of every page is now a link. This goes to a very simple HTML page, which should under no circumstances be viewed by any bot, and which in turn has two links on it and a very stern warning. Bots that ignore the warning and follow the links anyway get themselves banned temporarily. Continuing to try gets them permanently banned. Users clicking it by accident will need to email us to get unbanned. Users linking to the "ban" URLs in order to get other people banned will be permanently banned themselves when found out.

2) 30% of our traffic was bots (including perfectly legitimate bots) following the login/register links below every single post and comment. Some were also from malicious bots that had found our login/register pages from MSN's search. The login/register links have now been blocked in robots.txt so that bots cannot follow them, and people can't find them with searches.

Between the two, these measures should speed the site further, and mean we get "overloaded" a lot less (when the server goes down, it's not really going down completely, it's just getting completely clogged up like treacle, with all the requests from many bots).

3) I'm trawling the logs and just banning any badly behaved bots manually too. Because they suck.

4) I've also blocked any known bots from visiting /user/, just in case they are slow to read robots.txt.

5) Added /myuserpoints/ and /event/ alongside /user/ in the blocklist, and blocked all bots from getting them. There's one particularly badly-behaved bot called Yandex, a Russian bot, which is, as I type, trying to retrieve thousands of pages from those two, one every 2 seconds, not even respecting the "crawl-delay" set in robots.txt. But even Yahoo, Google and MSN retrieve these pages sometimes.

6) As a counterpoint to all those, dropped the "crawl-delay" in robots.txt from 10 seconds to 3. Well-behaved bots are welcome and should be encouraged. At 10 seconds/request it would take them a really long time to index the site.
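
To put rough numbers on point 6: a bot that honours crawl-delay fetches at most one page per delay interval, so the time to index the site scales linearly with the delay. A quick sketch of the arithmetic follows. The 20,000-page figure is an assumption purely for illustration, not a count of this site's pages, and daysToCrawl is just a name I've picked.

```javascript
// Days needed for one polite bot to fetch every page once,
// making one request per crawl-delay interval.
function daysToCrawl(pageCount, crawlDelaySeconds) {
  var secondsPerDay = 24 * 60 * 60;
  return (pageCount * crawlDelaySeconds) / secondsPerDay;
}

var assumedPages = 20000; // assumed figure, purely illustrative
console.log(daysToCrawl(assumedPages, 10).toFixed(1) + ' days at 10 seconds');
console.log(daysToCrawl(assumedPages, 3).toFixed(1) + ' days at 3 seconds');
```

With those assumed numbers that works out at roughly 2.3 days versus 0.7 days per full pass.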


--

Yet another geek.


Re: Today I fight, to speed the site

Admin

I'm working on improving security now.

1) To begin with, I've enabled the centos-plus yum repository. In human terms, this means I've changed the server from being very conservative indeed about what fixes it installed, only installing critical updates, to making it "reasonable", where we run fairly recent versions of software without me needing to mess about too much.

So, today, I installed 37M of updates across 18 updated packages and 5 new ones.

This means that we are now running newer versions of most stuff. THINGS MIGHT BREAK. Please let us know if they do.

2) I'm also planning to install a really rather cool taint module for PHP.

By default, PHP (which we use for almost all server-side stuff) is not "taint-aware". If you give it some information, it's happy to do whatever you tell it to, like maybe display it back to you in a web page. This seems like the right thing to do, but actually it's Really Bad. Because if, instead of something benign like a comment to a forum post, you sent some JavaScript, you could do nasty things to people.

So what tainting does is only let you show people information that has been "cleaned" first.

I expect this to take most of my day just hunting down and fixing all taints. I am hoping that most of my code is not going to be tainted: I am in the habit of putting "clean" stuff in variables beginning "$e_" and only printing those. But I'm sure I'll have slipped up in places.

It's the third party code that I'm expecting to take a long time. Drupal itself is really good in this regard, for a PHP program this size, because it uses a similar but more formal mechanism to my "$e_" system. But the non-Drupal stuff, of which there's a fair bit, could be awful.
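
As a rough illustration of what "cleaning" means here, this is a minimal sketch of the escape-before-print habit. It is written in JavaScript to match the other code in this thread rather than in PHP, and escapeHtml and the e_ variable are illustrative names only. It is not Drupal's mechanism, nor how the taint module works internally.

```javascript
// Replace the characters that let text break out into markup or attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')   // must come first, or it would re-escape the others
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Untrusted input, as a malicious commenter might submit it.
var comment = '<script>alert("gotcha")</script>';

// Following the convention: only e_-prefixed (escaped) values get printed.
var e_comment = escapeHtml(comment);
console.log(e_comment);
```

The printed value contains no raw angle brackets, so a browser would show it as text instead of running it. A taint-aware PHP would, in effect, refuse to print comment and only allow e_comment.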


--

Yet another geek.


Re: Today I fight, to speed the site

Admin

OK, things did break, and I am very sorry.

Won't do that again :(


--

Yet another geek.


Re: Today I fight, to speed the site

Admin

I guess I should fess up and explain what the outage today was all about. But I'll hide it here in the knowledge that people never read these threads so I shan't be embarrassed.

Well, see, I was trying to compile this version of PHP with taint-checking. And it compiled fine, but the test suite that came with it kept failing. So I thought, OK, I'll try compiling a copy of PHP without the taint-checking. But it still failed, in the same way.

Looking at the error messages, I felt it was because I was compiling a new version of PHP, but with old libraries. I decided that if I installed the latest version of PHP on the server, that would come with all the updated libraries, and I'd be able to compile it.

Mistake 1 - doing development work on the live server. This is the kind of thing that needs to be done on a test machine.

So I found a site which had that version, a "yum repository", which is a server with copies of various applications that you can download and install automatically.

Mistake 2 - So I updated. To do this, I typed "yum update -y". The "-y" bit means "don't ask me about things you aren't sure about - just do it". This prevented it asking me if I was really sure I wanted it to do what I'd just told it to do...

Mistake 3 - I had typoed the line in the config for the repository, that said "only update PHP". So before my horrified eyes, it also updated any other applications that the new repository had a newer copy of. Fortunately this was only one other program. Unfortunately, that program was Apache, our web server. It upgraded it to a version which didn't work.

"Aagh!" I panicked, and hunted through the yum manual looking for a command that would let me downgrade again. I couldn't find one.

Mistake 4 - So, I uninstalled Apache and PHP ("httpd" and "php" as they are known to Yum). But this in turn uninstalled anything that depended on them, taking out a whole bunch of other stuff. My eyes, growing ever more horrified, watched as more and more things uninstalled, a cascading tree of dependencies. (httpd, php, httpd-devel, mod_ssl, php-devel, php-pear, webalizer, php-gd, php-imap, php-mysql, php-odbc, php-pdo).

So I carefully copied down all the names, and went to reinstall them... but they wouldn't! There were "missing dependencies"! Files that, if they weren't there, the applications couldn't install, which had been removed when I uninstalled them, but were not part of the packages that had been uninstalled. So I ran the command to find what application package these missing dependencies should have come with. There were three applications, "apr", "php-cli" and "php-common" with missing dependencies in.

But those were already installed, and telling yum to reinstall them didn't replace the missing files. I wrestled with this for a time.

Mistake 5 - Eventually finding no other way, I removed them, too, holding my breath in case this just made the problem worse.

Then I reinstalled the above list, along with all the new ones I'd deleted, and the tree of dependencies which had been deleted from them (apr, apr-devel, apr-util, apr-util-devel, php-cli, php-common). I breathed a sigh of relief as they all installed back safely, with no errors.

Then I fired up Apache... and none of the websites worked! It looked like Apache was ignoring its configuration file.

Mistake 6 - It must, I reasoned, be looking in the wrong place for it. For long frantic minutes I wasted time trying to figure out where it was looking.

And then I realised that the config file had been replaced when it was being installed, and in fact I needed to restore from backups.

Mistake 7 - What backups? User data, websites, stuff like that, yeah, I back those up regularly. But who backs up config files?

But after a moment of panic it turned out that I back up config files, and I had backups from about five minutes before I started this nonsense. Restoring those files fixed the site.

A deep breath of relief, and I lie back to contemplate how I can avoid these mistakes in future. In retrospect, only mistake 1 was serious, the others just compounded the terror. 2 was lack of proper caution, which is also a problem, but 3 was a typo and they can't really be avoided, and 4&5 I can't see any alternative to, really; 6 was a wrong guess, and 7 turned out not to be true after all.

So, time to invest in a development server, I think.


--

Yet another geek.


Re: Today I fight, to speed the site

Founding Patron, Librarian, Druid, Thudmeister

Wow... I thought it was only me that tried experimenting on the live server ;) I think we need a big poster over our desks:

Poster wrote:

NO

You know what I'm talking about...

Just No.

MS


--

"LOOKS PERFECTLY LOGICAL TO ME"


Re: Today I fight, to speed the site

my my my it seems since i've been gone you guys have just gotten smarter.

oh dewi ms lee and all you awe me with your awesome awesome knowledge!!!


--

_O_
ll( )ll
_] [_

