Here I will mainly talk about PHP and MySQL related things since I deal with the two technologies all day long. I code for fun, and for food, so if you have PHP work that has to be done, feel free to contact me, I'm currently looking for freelance projects.
posted by kevin on October 24, 2009

Video - Installing Yii Framework (OSX)

I'm going to start doing video tutorials, I know I enjoy watching and listening to people explain things, I think I'll do some more things in this format.

posted by kevin on October 21, 2009

Microsoft Smart Quotes & PHP

It seems like I run into this issue for over half of the websites that I work on. The user wants to copy-and-paste their article, document, or whatever from MS Word, into a textarea and save it. The problem is word uses funky quotes, dashes, and other characters. Once it's submitted, PHP gets it and the characters are encoded differently and they display weird.

We've tried several different methods to try to eliminate the problem, but every time I googled for a fix, I never really found anything that worked well.

My boss mentioned that maybe we could fix it on the client-level, so I dove in and found what seems to be a promising fix. It's clean and simple. This function currently only replaces the single and double smart quotes, and then the strange dash character that MS word uses. Feel free to submit for character conversion codes and I'll add it to the function, I'll also add to this once I come across more problems.

Javascript function to replace Microsoft Smart Quotes with regular quotes.

 
function removeMSWordChars(str) {
    var myReplacements = new Array();
    var myCode, intReplacement;
    myReplacements[8216] = 39;
    myReplacements[8217] = 39;
    myReplacements[8220] = 34;
    myReplacements[8221] = 34;
    myReplacements[8212] = 45;
    for(c=0; c<str.length; c++) {
        var myCode = str.charCodeAt(c);
        if(myReplacements[myCode] != undefined) {
            intReplacement = myReplacements[myCode];
            str = str.substr(0,c) + String.fromCharCode(intReplacement) + str.substr(c+1);
        }
    }
    return str;
}
 

This is the jQuery that will run the filter on all textareas on your page when you tab away from the textarea (Assumes you have jQuery installed and running on the page.)

 
$(function(){
    $("textarea").blur(function(){
        $(this).val(removeMSWordChars(this.value));
    });
});
 

Removing smart quotes javascript example



Or if you don't use jQuery and you're a little green to javascript you can do this:

 
<textarea onBlur="this.value=removeMSWordChars(this.value);" name="a" rows=5 cols=10></textarea>
 
posted by kevin on October 18, 2009

Recently I tackled a query optimization problem, and I'm going to quickly go over the things we tried and the final result we went with.

I'll start with a little bit of background on the setup. Our database contains just under 700 tables and totals just over 100,000,000 records. We noticed that one part of our application was taking a little longer than we wanted it to and after some profiling we realized that we had a query that was taking approximately .83 seconds to complete.

The problem query:

 
SELECT rt FROM myTable
    WHERE origin_zip_low <= ?
    AND origin_zip_high >= ?
    AND dest_zip_low <= ?
    AND dest_zip_high >= ?
    LIMIT 1";

Now this table contains roughly 842,xxx rows and this zips are all integers up to 5 digits. I tried using a single index on:

 
ALTER TABLE `myTable` ADD INDEX ( `origin_zip_low` , `origin_zip_high` , `dest_zip_low` , `dest_zip_high` ) ;
 

Now using an explain on this query and index, it never seemed to want to use the index.

So then we indexed each separately and had 4 separate indexes.

 
ALTER TABLE `myTable` ADD INDEX ( `origin_zip_low` );
ALTER TABLE `myTable` ADD INDEX ( `origin_zip_high` );
ALTER TABLE `myTable` ADD INDEX ( `dest_zip_low` );
ALTER TABLE `myTable` ADD INDEX ( `dest_zip_high` );
 

We then ran the queries with many different zip code pairs and found that it sometimes used one of the indexes, sometimes it was dest_zip_low, sometimes origin_zip_low, etc, but most of the time it still didn't use the index at all. So without the index the queries took about .83 seconds and when it did use one of the indexes it ran in about .3 seconds.

Now you would think that MySQL would pick an index and use it, but sometimes MySQL will not use an index, even if one is available. The optimizer estimates that using the index would require MySQL to access a large percentage of the rows in the table.

After doing some reading online I found that doing queries with 'greater than or equal to' or 'less than or equal to' are best done using HASH table indexes. InnoDB does not support HASH table indexes and neither does MyISAM, however the MEMORY storage engine does.

When you initially look at the MEMORY storage engine you might be turned away due to the fact that your data lives in memory, and then once MySQL is restarted, your table will be empty. The good thing for us in this case though is our data essentially is static and it doesn't change, it's simply a lookup table so we can have our table myTable which is static and always contains our data, and then myTable_memory which we'll load the data from myTable.

Another thing with using the MEMORY storage engine is the data that goes into the table obviously will be loaded into your server memory, you must have the memory available to use. Our server we're using has 10 Gigs of memory currently, and the server is dedicated to this one application so after loading 'myTable' into memory it only takes up 80 Megs.

Configuration Changes

MySQL stores the tables using the MEMORY storage engine in a heap table, and since MySQL gives you a 16M 'max_heap_table_size', and since our table is larger than that we'll increase this value in our my.cnf. Also since every time our server is restarted that table will be empty, we want to populate the table when mysql starts up, so we'll set an init script to do just that:

 
#my.cnf
max_heap_table_size = 100M
init-file = /usr/local/www/vhosts/mysql_init.sql
 

Now in the mysql_init.sql we simply have this:

 
#mysql_init.sql
USE myDatabase;
INSERT INTO myTable_memory SELECT * FROM myTable;
 

The result, after converting the table to the MEMORY storage engine the queries never took longer than .09 seconds.

Conclusion

In this case using the MEMORY storage engine was huge since we were able to take advantage of the HASH index which aided in performing the query very quickly, however using MEMORY tables will not always save you time, we tried this approach on other queries and found absolutely no performance gain, but if you're doing searches using greater than or equal to is where it really shines.

By no means am I a MySQL DBA, so if you are, and you have more insight into this, please leave a comment!