Quite common task during schema review is to find the optimal data type for the column value - for example column is defined as INT but is it really needed or may be SMALLINT or even TINYINT will do instead. Does it contain any NULLs or it can be defined NOT NULL which reduces space needed and speeds up processing in most cases ?
These and similar tasks are often done by bunch of SQL queries while really MySQL has a native feature to perform this task - PROCEDURE ANALYSE
Here is sample run of PROCEDURE ANALYSE on drupal schema of Percona web site. The output contains row for each column in the table but I’ve omitted everything by few rows.
[sql]
mysql> select * from node procedure analyse() \G
*************************** 1. row ***************************
Field_name: percona_drupal.node.nid
Min_value: 1
Max_value: 194
Min_length: 1
Max_length: 3
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 115.8345
Std: 45.5192
Optimal_fieldtype: TINYINT(3) UNSIGNED NOT NULL
*************************** 2. row ***************************
Field_name: percona_drupal.node.vid
Min_value: 1
Max_value: 194
Min_length: 1
Max_length: 3
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 115.8345
Std: 45.5192
Optimal_fieldtype: TINYINT(3) UNSIGNED NOT NULL
*************************** 3. row ***************************
Field_name: percona_drupal.node.type
Min_value: blog
Max_value: webform
Min_length: 4
Max_length: 13
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 8.7482
Std: NULL
Optimal_fieldtype: ENUM(’blog’,'contact’,'customer’,'image’,'lab_project’,'news’,'page’,'percona_faq’,'presentation’,'press_release’,'project’,'random_phrase’,’service’,'team_member’,'webform’) NOT NULL
[\sql]
Here is how this table was originally defined:
[sql]
mysql> desc node;
+———-+——————+——+—–+———+—————-+
| Field | Type | Null | Key | Default | Extra |
+———-+——————+——+—–+———+—————-+
| nid | int(10) unsigned | NO | PRI | NULL | auto_increment |
| vid | int(10) unsigned | NO | PRI | 0 | |
| type | varchar(32) | NO | MUL | | |
| title | varchar(128) | NO | MUL | | |
| uid | int(11) | NO | MUL | 0 | |
| status | int(11) | NO | MUL | 1 | |
| created | int(11) | NO | MUL | 0 | |
| changed | int(11) | NO | MUL | 0 | |
| comment | int(11) | NO | | 0 | |
| promote | int(11) | NO | MUL | 0 | |
| moderate | int(11) | NO | MUL | 0 | |
| sticky | int(11) | NO | | 0 | |
+———-+——————+——+—–+———+—————-+
12 rows in set (0.00 sec)
[\sql]
This example shows very well the output of PROCEDURE ANALYSE or any similar tool should not be taken for an instant action but rather used as a base to make your own decision.
For example we can see nid and vid which are defined as INT in original schema can fit to TINYINT. Should we change them ? Probably not - the site will continue to growth and we expect to have more than 255 items this type would allow.
We probably could have changed it to SMALLINT or MEDIUMINT but on the small database like this it does not really yields any significant benefits so it is better to stick to default.
The type field is suggested for conversion to ENUM though as we know new types can appear by inserts we can’t do it. In other drupal installation though when types never change and savings could be significant the change may make sense.
Another important piece of information for type column is maximum length which is really helpful in more cases than ENUM data type - maximum length of data actually stored often can tell us if we defined varchar field too wide. Comparing maximum field length to average length we can see if CHAR or VARCHAR type fits better - for example if column always have same width like hashes CHAR can be more efficient storage.
In general even though I would not just use this tool suggested data types to do an ALTER TABLE it provides very helpful information for performing schema audit.
Entry posted by peter | No comment
Dad: Well, Billy, another school year is coming to a close. No more college parties, just another summer here at home. What will you do all day?
Billy: Oh, I dunno. I’ll probably work on my blog or something.
Dad: You need more direction! That blog is just your generation’s answer to comic books.
Billy: On the contrary, Dad, working on my blog utilizes my skills in programming, design, writing, critical thinking, and all sorts of other liberal artsy things that you’re paying those professors to teach me.
Dad: If only there were a more practical application for those skills, one that could lead you to fame and fortune!
Billy: Where’ve you been living, Dad? My skills are totally in demand in today’s questionable economy. An awesome WordPress developer is worth his/her weight in gold. Lead, even.
Dad: What is this WordPress?
Billy: Only the greatest open source publishing platform ever created. It’s what runs my blog. I like to fiddle around with the code and come up with cool hacks that make mine better than the average College Joe’s.
Dad: I had no idea you were that capable.
Billy: Duh, Dad. I’ve been using WordPress for a couple of years now. I could practically teach a course on it, though there are definitely things I could learn from the lead devs. They are like kings.
Dad: Hm. That kind of ability ought to be worth something. Seems like there would be programs in place for kids like you to utilize your skills while being nurtured by people like these lead kings.
Billy: Lead devs, Dad. Not lead kings.
Dad: Maybe you could apply for an internship or something, earn a little money this summer instead of just spending mine.
Billy: Well, there is this one thing like that.
Dad perks up.
Billy: The Google Summer of Code lets college students work with lead developers on a bunch of open source projects, you can get college credit for it, and if you do a good job, you can earn up to forty-five hundred bucks over the summer. And WordPress is one of the participating projects.
Dad: !!
Billy: But it’s pretty competitive. My friend Joe applied last year and didn’t make the cut. I can improve my skills just by fiddling around on my own this summer without the rejection, thanks.
Dad: Don’t be lame! You said yourself you’re awesome. And that you could learn from the kings. And that you could earn over FOUR THOUSAND DOLLARS. Life is full of rejection, kid. Best way to get over that is to make yourself so awesome that no one wants to reject you. And know that even if they do reject you, there’s always next time.
Billy: I dunno, Dad.
Dad: Tell you what, if you apply, I’ll give you $500 toward that car you’ve been wanting, whether you’re accepted into the program or not. And if you get in and complete it successfully, I’ll match that $4500. I’d be so proud of you. And the bragging rights at work! My kid, a Google engineer!
Billy: I wouldn’t actually be a Google engineer, Dad.
Dad: Oh be quiet. Do you think Harold in shipping knows the difference?
Billy: Okay, Dad, I’ll do it!
Dad: That’s my boy.
…………………….
College students! Don’t wait for your parents to bribe you; apply to the Google Summer of Code program now! For the third year in a row, WordPress is participating, and this year we’ve got project suggestions ranging from core functionality to plugins and BuddyPress development. You name it, we want you to propose it. It’s true, competition is fierce, but hey, if you’re already hacking WordPress, you’re ahead of the pack as far as we’re concerned. Applications are being accepted as of today, and the deadline is on April 3, 2009. For more information, check out the WordPress Codex GSoC2009 page, where we suggest some projects and let you know who our kingly mentors will be this year. The GSoC FAQ is also a good place to get an overview of the program. To apply, head to the Google Summer of Code application site. Remember, we want *you* to work on WordPress this summer! And College Janes, this isn’t just for College Joes. Female applicants encouraged to apply!
三月 2009 | ||||||
一 | 二 | 三 | 四 | 五 | 六 | 日 |
1 | ||||||
2 | 3 | 4 | 5 | 6 | 7 | 8 |
9 | 10 | 11 | 12 | 13 | 14 | 15 |
16 | 17 | 18 | 19 | 20 | 21 | 22 |
23 | 24 | 25 | 26 | 27 | 28 | 29 |
30 | 31 |