Mysql partitioning & subpartitioning for MySQL scalability

Today in my article I want to describe how to scale MySQL database locally by using different features provided by MySQL. I will show how to define data and index locations for MySQL tables, how to create partitions and subpartitions and to define data and index location for them too, bot partitions and subpartitions. If you are interested to know more abouth other options to csale MySQL installation, you may be interested to read my earlier article about MySQL master-slave and MySQL master-master replication. I am not goingto touch MySQL database sharding here, as this is quite big topic itself and I am going to talk about it in one of my next articles. Just want to mention that for now database sharding is not supported natively by MySQL because of nature of relational databases.

Lets start with data and index locations. You may already know that this location is defined in /etc/my.cnf like:

datadir=/var/lib/mysql

MySQL database table location

So, why one may want to define separate location? There may be lot of reasons, but main one is that huge tables should be stored on separate drives due to OS file size limit or just to improve database speed. Here are some OS specific limitations on a file size:

Operating System File-size Limit
Win32 w/ FAT/FAT32 2GB/4GB
Win32 w/ NTFS 2TB (possibly larger)
Linux 2.2-Intel 32-bit 2GB (LFS: 4GB)
Linux 2.4+ (using ext3 file system) 4TB
Solaris 9/10 16TB
MacOS X w/ HFS+ 2TB
NetWare w/NSS file system 8TB

To be able to define index and data directory locations for the particular table or partition you should check that symlinks are enabled in your MySQL installation. To check either this feature is enabled or not you should run the following command:

mysql> SHOW VARIABLES LIKE 'have_symlink';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| have_symlink  | YES   |
+---------------+-------+
1 row in set (0.00 sec)

If it is not enabled, you should open /etc/my.cnf, add the following line to your comfiguration file and restart MySQL server.

[mysqld]
symbolic-links=1

Now you can create the table with custom data and index locations. Here is the table definition which defines the location for database index and data:

CREATE TABLE test1 (
id INT AUTO_INCREMENT,
note VARCHAR(500),
INDEX idx (id)
) ENGINE = MYISAM INDEX DIRECTORY = '/var/lib/mysql1' DATA DIRECTORY  = '/var/lib/mysql1';

As you can see syntax is quite simple and flexible. With simple changes you can improve your database structure getting better database speed and capacity. But sometimes it is not enough.

MySQL database table partitioning & subpartitioning

Reasons for database table partitioning are pretty mych the same as for defining separate location for different tables - speed and capacity. You should ensure that your MySQL installation supports partitioning:

mysql> SHOW VARIABLES LIKE '%partition%';

+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| have_partitioning | YES   |
+-------------------+-------+
1 row in set (0.00 sec)

You may also check that partition module is activated.

mysql> SHOW PLUGINS;
+------------+----------+----------------+---------+---------+
| Name       | Status   | Type           | Library | License |
+------------+----------+----------------+---------+---------+
| binlog     | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| partition  | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| ARCHIVE    | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| BLACKHOLE  | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| CSV        | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| FEDERATED  | DISABLED | STORAGE ENGINE | NULL    | GPL     |
| MEMORY     | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| InnoDB     | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| MRG_MYISAM | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| MyISAM     | ACTIVE   | STORAGE ENGINE | NULL    | GPL     |
| ndbcluster | DISABLED | STORAGE ENGINE | NULL    | GPL     |
+------------+----------+----------------+---------+---------+
11 rows in set (0.00 sec)

If you do not see have_partitioning set to YES or partition module is not ACTIVE, than your MySQL installation was compiles without partitioning support. You may read MySQL manuals to find out how to compile it properly or download latest version of MySQL packages from official website. But fortunately most of Linux distributions contain MySQL with enabled partitioning and we can move forward.

Here is an example of partitioned table creation:

CREATE TABLE test (
id INT AUTO_INCREMENT,
note VARCHAR(500),
INDEX idx (id)
) ENGINE = MYISAM
PARTITION BY KEY(id) (
PARTITION p0
 INDEX DIRECTORY = '/home/serg/mysql1'
 DATA DIRECTORY  = '/home/serg/mysql1',
PARTITION p1
 INDEX DIRECTORY = '/home/serg/mysql2'
 DATA DIRECTORY  = '/home/serg/mysql2'
);

As you can see, you can define locations for index and data files. If you will not define they partition files will be stored into the default MyDQL database folder. Each partition will have its own set of files, so file-related table size limitations may be overpassed tis way.

Attention! You should remember that several tables may share the same partition or partitions. So you should use different names for different partitions or be aware of this when you are designing the database. You may even find this useful in some cases, so you just have to remember that particularity.

Here is a fimle example of table with partitions and subpartitions:

CREATE TABLE ts (id INT, purchased DATE)
    PARTITION BY RANGE( YEAR(purchased) )
    SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
        PARTITION p0 VALUES LESS THAN (1990) (
            SUBPARTITION s0,
            SUBPARTITION s1
        ),
        PARTITION p1 VALUES LESS THAN (2000) (
            SUBPARTITION s2,
            SUBPARTITION s3
        ),
        PARTITION p2 VALUES LESS THAN MAXVALUE (
            SUBPARTITION s4,
            SUBPARTITION s5
        )
    );

Attention! You should remember that each partition must have the same number of subpartitions, so if you defined subpartitions for one of the partitions, you should define them for all partitions. You should at laest define the names of subpartitions, in this case they will be created with the default settings.

Here is an example of the table with partitions and subpartitions, where some subpartitions have separate locations and some have default ones:

CREATE TABLE ts (id INT, purchased DATE)
    PARTITION BY RANGE(YEAR(purchased))
    SUBPARTITION BY HASH( TO_DAYS(purchased) ) (
        PARTITION p0 VALUES LESS THAN (1990) (
            SUBPARTITION s0a
                DATA DIRECTORY = '/disk0'
                INDEX DIRECTORY = '/disk1',
            SUBPARTITION s0b
                DATA DIRECTORY = '/disk2'
                INDEX DIRECTORY = '/disk3'
        ),
        PARTITION p1 VALUES LESS THAN (2000) (
            SUBPARTITION s1a
                DATA DIRECTORY = '/disk4/data'
                INDEX DIRECTORY = '/disk4/idx',
            SUBPARTITION s1b
                DATA DIRECTORY = '/disk5/data'
                INDEX DIRECTORY = '/disk5/idx'
        ),
        PARTITION p2 VALUES LESS THAN MAXVALUE (
            SUBPARTITION s2a,
            SUBPARTITION s2b
        )
    );

I hope I have covered main aspects of the problem, so this article will be useful. Sure some aspects may be missed, so let me know if I have missed something important and I will update this article or will write new one, if the subject will be out of the scope of this article. Adn follow my blog for more articles...

Posted by:
Enjoyed this post? Share and Leave a comment below, thanks! :)