求知欲也需要克制？Pure Pleasure - Reborn | 23 Dec 2008

FreeBSD 7.1delphij's Chaos » 车东's shared items in Google Reader

我这次不铁口直断了，估计明年见了，如果这周能够RC2的话，最终RELEASE估计要到明年的1月5日左右。

这次FreeBSD 7.1在软件工程角度是相当失败的案例，我想我们应该从里面总结一些教训出来。

首先，我们的目标是什么？

我想，用户对于一个发行版本的期待是：一个经过了大量测试的、阶段性的稳定版本。而开发人员对于发行版本的期待则是尽可能地将可用的功能交付给用户。作为OS，我认为这应该包括自动化的回归测试、性能改进、更新的驱动程序、文档的修订，以及更新的第三方软件等等。

FreeBSD目前的开发模式，是将分别开发的三大模块，即内核与基本系统(src/)、文档(doc/)、第三方软件(ports/)的人在一定的时候聚集在一起，通过代码冻结的方法来使他们从增加新功能转移到集中去修bug，最后发布一个版本。这个模式在过去运转的相当好，以至于我们没有发现其中存在的问题。

这次FreeBSD 7.1暴露出来了这个模式存在的很多问题。例如，由于安全小组发现了很多安全漏洞，而另一方面，安全小组的人手不够，限制了修正这些问题的速度。我本人撰写的一个安全公告等了一个月才在今天最终公之于众，而另一方面，安全小组对于发行版本拥有一票否决的绝对权力，导致6.4和7.1的发布都一再推迟。

而作为非常快节奏的开发的 ports/ 维护者，则对不断的推迟感到相当不满。FreeBSD目前只维护 ports/ 的 -HEAD，也就是说，在正式发行 -RELEASE 之前，ports/ 不能进行大量的、破坏性的修改。例如，我本人维护的 OpenLDAP 现在就必须等待 -RELEASE 之后才可以进行升级。不断地推迟新的发行版本，会导致 ports/ 的开发继续延迟。

多种因素作用在一起的结果是，这次7.1-RELEASE的发布让所有的人都不太满意。

我想，想要解决这些问题，需要从几个方面入手。

首先是功能冻结和代码冻结的时间。目前，FreeBSD的做法是src/和ports/几乎同时开始冻结，但是实际上 src/ 需要花费的测试周期要比 ports/ 多。举例来说，一次正常的 -STABLE 发行版本所需要的测试和除错周期大约是 8 - 10 周，而完成一次 ports/ 联编所需要的时间则只需要 48 － 160 小时，并且这个过程是持续进行的。我认为我们必须定义一个功能冻结点，这个点之后，不再允许在 -STABLE 分支中增加任何新的API/ABI，然后以这个版本（例如-RC1）为基础进行 ports/ 的联编。另一方面， ports/ 是否有必要随 src/ 冻结？我认为意义不大，因为多数用户并不会使用光盘附带的 package，甚至于使用 -STABLE 的 package 会成为阻碍在 -STABLE 分支中增加新的 API 的障碍，这样一来，在人手有限的情况下，人力不如去维护一个每季度创建一次的稳定 ports/ 分支（只升级其中的安全更新；只维护最多两个分支；确定交付时间），而 -RELEASE 则随同最近的一个这样的分支去发布。

在升级支持方面，目前FreeBSD发布新版本的周期显得过长。这会带来两个问题：其一，在无法预测何时会发布发行版本的前提下，用户如何安排升级？其二，在这样一段较长的时间内，如果出现新的硬件，发行版本光盘无法及时更新，而普通用户更新光盘中的驱动程序则比较困难。

为了因应这种情况，我认为我们应该将现有的 X.Y 发行版模式改为 X.Y.Z 发行版模式。其中，X每18个月发布一个新版本，Y每6个月发布一个新版本，而Z的发行周期则视需要而定，每一个新的Z版本均发布对应的ISO文件，而其中除了重要的安全和可靠性更新之外，也包含经过较长测试的、对于新硬件的支持改进，作为对 -STABLE 快照的补充。

最后的问题其实还是，我们的发行版本的目标用户人群是什么？我希望能够得到一些来自用户的反馈。还有一个好消息是，这个礼拜我们总算有希望能看到RC2了......

12:38 Welcome Back and Happy Holidays! » I got Spam?!

Photo courtesy of iStockphoto, Image# 6908589Hello everyone! I apologize for the sudden silence this...

11:48 High-Performance Click Analysis with MySQL » MySQL Performance Blog

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work. The first thing these have in common is that they're generally some kind of loggable event.

The next characteristic of a lot of these systems (real or planned) is the desire for "real-time" analysis. Our customers often want their systems to provide the freshest data to their own clients, with no delays.

Finally, the analysis is usually multi-dimensional. The typical user wants to be able to generate summaries and reports in many different ways on demand, often to support the functionality of the application as well as to provide reports to their clients. Clicks by day, by customer, top ads by clicks, top ads by click-through ratio, and so on for dozens of different types of slicing and dicing.

And as a result, one of the most common questions we hear is how to build high-performance systems to do this work. Let's see some ways you can build the functionality you need and get the performance you need. Because I've built two such systems to manage online ads through Google Adwords, Yahoo, MSN and others, it's easy and familiar for me to use the example of search engine marketing. I'll do that throughout this article.

Requirements

The words "need" and "want" are different. Do you really need atomic-level data? Do you really need real-time reporting? If you do, the problem is much more expensive to solve.

Start with the granularity of your data. What data do you need to make your business run? If you can't get access to the time of day of every click on every ad, will it hamper your ability to measure the ad's value? Is it enough to know how many times the ad was clicked each day? If so, you can roll all those events up into a per-day table.

Next, let's look at "real-time." None of the big three (Google, Yahoo, MSN) provides real-time reporting last time I was involved with them (and I suspect this is still true). It's too expensive. Consider your user expectations. For most applications I've been involved with, having day-old data is adequate, and users don't expect realtime. The trick here is that when you start out, realtime is possible because your data is small. "Hey, we do realtime reporting. Google doesn't even do that! We're better!" Then you get popular :) And if you've promoted your better-ness in the meantime, you might have to do some awkward backpedaling with customers, who now expect realtime data. The database giveth, and the database taketh away.

Finally, you should think a lot about how you need to query the data. It is a hard question to answer, and sometimes I've seen it evolve over time, especially as the growing data size forces it to. This goes back to what data you really need to make your business run. Anything else is gravy. If there are nice-to-haves, consider not building them in. Listen to some talks by 37Signals if you need inspiration to toss things out. Define the types of queries you absolutely have to have, if possible, and note the ways and types of aggregation (by-ad by-day, for example).

Sometimes I ask a customer "what kinds of queries do you have to run?" and they say "we can't decide, so we want to just store everything." If you can't decide yet, then don't store everything in the database. Instead, store the source data in some fashion that you can reload later, such as flat files, and build support in the database for one or two capabilities you absolutely need now; then add the rest later, reloading the data if needed.

Aggregate

Aggregation is absolutely key for most people. There are special cases, and there are ways to do general-purpose work without aggregating (see the section below on technologies), but if you're doing this with vanilla MySQL, you will need to aggregate your data.

What you want to do is aggregate in ways that optimize the most expensive things you'll do. And then, you might super-aggregate too. For example, if you aggregate by day and then you do a lot of queries over 365-day ranges for year-over-year analysis, aggregate again by month. Then write your queries to use the most aggregated data possible to save work.

Avoid operations that update huge chunks of aggregated data at once. Among other things, you'll make replication lag badly. More about this later.

Another way to say "aggregate" is to say "pre-compute." If you have time-critical queries for your app to do its work, can you do the work ahead of time so it's ready to get when needed? This might or might not be aggregation.

Denormalize

Pre-computing and careful denormalization need to go together. Figure out what other types of data you'll need in those aggregate tables, and include columns to support these queries. But beware of denormalizing with character data; try to make your rows fixed-length.

One reason denormalization is important is that nested-loop joins on large data sets are very expensive. If MySQL supported sort-merge or hash joins, you'd have other possibilities, but it doesn't, so you want to build your aggregate tables to avoid joins.

Watch Data Types

Does your ad ID look like "8a4dabde-1c82-102c-ab13-0019b984eacd" and is it stored in a VARCHAR(36)? When tables get big, every byte matters a lot. Use the smallest data types you can, the simplest character sets you can, and watch out for NULLable columns. Use smallint unsigned or tinyint unsigned if you can. You can save very large amounts of space. Choose primary keys very carefully, especially with InnoDB tables -- don't use GUIDs. Which brings me to my next point:

Use InnoDB

Assuming that you will use the stock MySQL server, InnoDB is usually your best bet. (Actually, XtraDB might be very interesting for you, but I digress). Due to the cost of repairing huge MyISAM tables and taking downtime, I would not use MyISAM for anything but read-only tables when things get big. And even if it's read-only, there's still another reason to use InnoDB/XtraDB tables...

Optimize For I/O

It is pretty much inevitable: if you do this kind of data processing in MySQL, you're going to end up heavily I/O bound. Listen to any of the talks at past MySQL conferences from people who have built systems like yours, and there's a fair chance they will talk about how hard they have to work on I/O capacity.

What does this have to do with InnoDB? Data clustering. InnoDB's primary keys define the physical order rows are stored in. That lets you choose which rows are stored close to each other, which is very beneficial in many cases. Especially on huge tables, it lets you scan portions of a table instead of the whole table if you a) choose your aggregation to match the order of your common queries and b) choose your primary key correctly.

Let's go back to the ad-by-day table. If you query date ranges most of the time, you should define the primary key as (day, ad). Don't use an auto-increment primary key, and don't put ad first. If you put ad first, then you're going to scan the whole table to query for information about yesterday. If you put day first, then yesterday will all be stored physically together (within the page -- the pages themselves may be widely separated, but that's another matter).

Don't Store Non-Aggregated Data

I've been talking a lot about aggregated data. What do you do with the non-aggregated data? My answer is usually simple: just don't store it in the database. Instead, pre-aggregate. Suppose your data is coming from some Apache log or similar source. Write a script to rip through the file and parse it 10k lines at a time, aggregating as it goes. When each chunk is done, make it write out a CSV file and import that with LOAD DATA INFILE. Keep those big fat log files out of the database. The database is usually the most expensive and hardest-to-scale component in your system -- don't waste resources.

Another benefit of this is the chance to parallelize. As you know, MySQL doesn't do intra-query parallelization, so ETL jobs written to rely on SQL tend to get really bogged down. In contrast, moving the processing outside the database lets you parallelize trivially.

If you need to analyze the non-aggregated data, you can store it on the filesystem and write custom scripts to do special-purpose tasks on it. Storing a little meta-data about each file can help a lot. Store the ranges of values for various attributes, for example; or the presence or absence of values. You can put these into the database in a little meta-table. Then your script can figure out which files it can ignore. What we're doing here starts to look like a hillbilly version of Infobright, which I'll talk about later.

Alternately, you can store the atomic data as CSV files and use the CSV engine so you have an SQL interface to it (the meta-tables are still a valid approach here!). This is an easy way to bypass the hard-to-scale database server for the initial insertion, because you can write CSV files with any programming language. Naturally, CSV files don't store as compactly on disk as [Compressed] MyISAM or Archive.

These are just some ideas I'm throwing around -- the point is to think outside the box, even to think of things that seem "less advanced" than using a database.

Sharding and Partitioning

Sharding is inevitable if your write workload exceeds the capacity of a single server (or if you're using replication, the capacity of a single slave). Sharding can also help you avoid massive tables that are too big to maintain. If you know you'll get there, it can change the lifecycle of your application in advance.

What about partitioning in MySQL 5.1? I know there are some cases when it can help a lot, and we've proven that with our customers. But you still have to think about how to avoid enormous tables that are hard to maintain, back up, and restore. And the partitioning functionality is not done yet and not fully integrated into the server, so I expect to find a lot more bugs and annoyances. There are already inconvenient limitations on some key parts of partitioning, such as maintenance and repair commands, that essentially negate the benefits of partitioning for those operations. An finally, it doesn't save you from the downtime caused by ALTER TABLE -- a typical reason to think about master-master with failover and failback for maintenance. As with anything, it's a cost-benefit equation. What are your priorities? Choose the solution that meets them.

Be Careful With Data Integrity

When you're storing several levels of aggregation, and there's denormalization, you need to be scrupulous about data cleanliness, because it's really hard to fix things up later. If your data is coming from a partner site, and you upload bad data there, you'll be getting bad data back for a long time. And every time you have some incremental job to update the aggregates, you're exposed to that bad data again.

Any inconsistencies in the atomic data tend to get magnified as it gets aggregated, because you suddenly have a single row created from many rows, and if the many rows don't match completely, the single one doesn't know what data should live in it. And this only gets harder to resolve as you get more levels of aggregations.

Watch Out For The Long Tail

People talk about the long tail and how you can focus on optimizing the short head. It's the classic 80-20 rule. Maybe 80% of your ad impressions are on 20% of your ads! Hooray! But don't forget that if you're aggregating per-day, an ad that gets a million impressions takes one row, and an ad that gets one impression takes exactly the same: one row. An impression per day becomes a fixed overhead of storage size. So, you actually have as many rows as you have unique ads per day. Viewed this way, suddenly you start to hate the ads that occasionally get an impression. They're so wasteful!

It's easy to flip back and forth between viewpoints on this and get distracted into making a mistake. Watch out when you do your capacity planning. Don't get fooled into calculating the wrong thing.

Be Creative With Table Structures

Suppose you have some yes/no fact about an ad impression, such as whether it was a blue ad (whatever that means.) You start out with this:

PLAIN TEXT

SQL:

CREATE TABLE ads_by_day_by_blueness (
day date NOT NULL,
ad int UNSIGNED NOT NULL,
is_blue tinyint UNSIGNED NOT NULL,
clicks int UNSIGNED NOT NULL,
impressions int UNSIGNED NOT NULL,
....
PRIMARY KEY(day, ad, is_blue)
);

What can we improve here? Especially assuming that there are indexes other than the primary key, we can shrink the primary key's width:

PLAIN TEXT

SQL:

CREATE TABLE ads_by_day_by_blueness (
day date NOT NULL,
ad int UNSIGNED NOT NULL,
clicks int UNSIGNED NOT NULL,
impressions int UNSIGNED NOT NULL,
blue_clicks int UNSIGNED NOT NULL,
blue_impressions int UNSIGNED NOT NULL,
....
PRIMARY KEY(day, ad)
);

There are a couple of ways to handle this now. You can have the clicks column record the total, and the blue_clicks column record only blue clicks; to find out non-blue clicks you subtract one from the other. Or you can have the blue clicks and non-blue clicks stored, and to get the totals you add them.

Did this gain us anything? We dropped one column, and we just moved those other values around to store them "next, to in the same row" instead of "below, in the next row." So we're storing all the same data, right?

Logically, yes; physically, no. Those values that we pivoted up beside their neighbors will share a set of primary key columns. And not only will every index be a little narrower, the table will now contain only half as many rows. That will make the indexes less than half the size. In real life this technique often makes the table+index much less than half the size. You have to write a little more complex queries, but that's often justified by a large reduction in table size.

I sort of stumbled upon this idea one day. I have no idea what this technique might be called, so I call it dog-earing the table (somehow the image of putting columns next to each other makes me think of putting cards next to each other and shoving).

Archive

If you don't need data anymore, move it away or get rid of it. I wrote a three-part article on data archiving on my own blog a while back. The benefits of purging and archiving data can be dramatic.

Take It Easy On Replication

Building aggregated tables is hard work for the database server. If you do it on the master with INSERT..SELECT queries, it will propagate to the slaves and it'll be hard work there too, assuming you use statement-based replication.

You can save that work by either using MySQL 5.1's row-based replication, or in MySQL 5.0 and earlier, doing the work on a slave, then piping the results back up to the master with LOAD DATA INFILE, which kind of emulates row-based replication in a way.

When you're updating big aggregate tables, don't work with giant chunks of them at once. If there's any possible way, do it in manageable bits. A day at a time, for example.

There are a lot of other ways you can make replication faster. I wrote a lot about this in our book, which is linked from the sidebar above.

Don't Assume Traditional Methods Will Save You

What you're really doing here is building a data warehouse. So you may think you should use traditional DW methods, like star schemas. The problem is that MySQL doesn't tend to perform well on a data warehousing workload. The nested-loop joins are not all that fast on big joins; the query optimizer can sometimes pick bad plans when you have a lot of joins between fact and dimension tables, and so on. With careful tweaking, many of these things can be overcome, but how much time do you have? And the gains are simply limited by some of MySQL's weaknesses in some cases.

Not only that, but star schemas are not intended to be fast. The star schema is essentially "I admit defeat and accept table scans as a fact of life." Table scans can be better than the alternative, if the alternatives are limited, but they're still not what you need unless you're okay with long queries that read a lot of rows -- MySQL can't handle too many of those at once.

Aside from star schemas, another tactic I see people try a lot is to build "flexible schemas" with tables that contain name-value pairs or something similar. The thought is that you can make the application believe it has a custom table, which is really constructed behind the scenes from the name-value tables in a complex query with many joins. I have never seen this approach scale well.

Use The Best Technologies You Can

MySQL is not the end-all and be-all. If you're familiar with it and it can serve you reasonably well, it's fine to use it for things that it's not 100% optimal for. But if the costs of doing that are going to outweigh the costs of using another solution, then look at other solutions.

One that holds promise is Infobright. While I have not evaluated their technology in depth, I think it merits a good look. I had the chance at OpenSQL Camp to talk to Alex Esterkin and see him present on it, and based on that exposure, I think they are doing a lot of things right. When I know enough to have a real opinion (or when other Percona people get to it before I do!) you'll see results on this blog.

Another is Kickfire -- also something I have not had a chance to properly evaluate. And there are others, and there will continue to be more. Finally, PostgreSQL is clearly better for some workloads out-of-the-box than MySQL is, especially for more complex queries. Percona is not tied to MySQL, although we're most famous for our knowledge about it. When another tool is the right one, we use it.

Have you thought about using something besides a database? You have your choice of buzzwords these days. Hadoop is a big one. But beware of falling into the trap of brute-forcing a solution that really needs to be solved with intelligent engineering, instead of massive resources.

Conclusion

This article has been an overview of some of the tactics I've used to successfully scale large click-processing and other types of event-analysis databases. In some cases I've been able to avoid sharding for a long time and run on many fewer disk drives with much less memory, or even with 10-15x fewer servers. Clever application design, and a holistic approach, are absolutely necessary. You can't look to the database to solve everything -- you have to give it all the help you can. Hopefully it's useful to you, too!

Entry posted by Baron Schwartz | No comment

Add to: | | | |

10:23 Goal driven performance optimization » MySQL Performance Blog

When your goal is to optimize application performance it is very important to understand what goal do you really have. If you do not have a good understanding of the goal your performance optimization effort may well still bring its results but you may waste a lot of time before you reach same results as you would reach much sooner with focused approach.

The time is critical for many performance optimization tasks not only because of labor associated expenses but also because of the suffering - slow web site means your marketing budget is wasted, customer not completing purchases, users are leaving to competitors, all of this making the time truly critical matter.

So what can be the goal ? Generally I see there are 2 types of goals seen in practice. One is capacity goal this is when the system is generally overloaded so everything is slow, when you're just looking to see how you can get most out of your existing system, looking for consolidation or saving on infrastructure cost. If this is the goal you can perform general system performance evaluation and just fix the stuff which causes the most load on the system. MySQL Log analyzes with Mk-Log-Parser is a very good start for a ways to generally optimize MySQL load on the system.

Latency Goal is another breed. The system may not look loaded but some pages still may want to be loading much slower than you like. These goals are not system wise but they are much more specific to the different user interactions or even types of users. For example you may define goal also "Search pages have to have response time below 1 second in 95% cases and below 3 seconds in 99% cases". Note We're specific to the user interaction - people are used to Search taking longer time than other interactions for many applications, and also we speak about percentile response time rather than "all queries". It is surely good all search queries complete in one seconds but it is too not practical. The goal description may be more specific too - for example you may have different response time guidelines for pages which are requested for real humans vs search engine bots (which are often quite different in their access pattern) or you may define "large users" as users having more than 100.000 images uploaded and measure the response time for them specifically because this group has its own performance challenges.

Looking at Latency it is also much more practical to look from the top of the stack. If you look at MySQL log you may find some queries which are slow but it is hard to go back from them to what is really important for the user and so the business - the page response times. Furthermore. It is not enough in many cases to focus only on Server Side optimization - the Client Side Optimization is also quite important in particular for aggressive performance goals and fast back-end. This is why we added this service to Percona offerings.

If Server side or Client Side performance optimization is going to be more important for your application depends on the application performance a lot. The better your application is the more Client Side optimization you will need. For example if it takes you 30 seconds to generate the search results and 3 more seconds to load all style sheets images and render the page server side optimization is more important. If you have optimized things and now HTML takes 0.5 seconds to generates an extra 3 seconds become the main response time contributer which has the highest performance optimization potential.

But let us get back to the Server Side Optimization. Lets assume our performance goal applies to the HTML generation rather than full page load on the client. So meet our goal we should look at the pages which do not meet our goal, which is pages which take more than 1 second to generate in given example.

For goal driven performance optimization it is important there is enough instrumentation and production performance logging in place so you really can focus on hard data in your work. For small and medium size applications you can log all requests to MySQL table for larger ones you can log only small portion of them. I usually keep one table per day so it is easy to copy the data to a different box for data crunching and remove the old ones.

The log table should contain URL, IP and all the data you need to be able to repeat request if you need to. It may include cookie data, post data, logged in user information etc. But the real thing is number of times which are stored for request. wall clock time - is the real time it took to generate the page by server backend. CPU Time This is the CPU time needed to generate request (you can split it to user and system if you want) and when there come various wait times - mysql, memcache, sphinx, web services etc.

For web applications doing processing in a single thread the following simple formula applies wall_time=cpu_time+sum(wait_time)+lost_time The lost time is the time which was lost for some reason - some waits we did not profile or waits we do not have control of, for example when processing had to wait for CPU available to do processing. For multi-thread application it is a bit more complicated but you still can analyze critical path.

If you have such profiling in place all you have to do is to run the query to see what are contributing factors to the response time of the problematic pages:

PLAIN TEXT

SQL:

mysql> SELECT count(*),avg(wtime),avg(utime/wtime) cpu_ratio, avg(mysql_time/wtime) mysql_ratio ,avg(sphinx_time/wtime) sphinx_ratio, avg((wtime-mysql_time-sphinx_time-utime)/wtime) lost_ratio FROM performance_log_081221 WHERE page_type='search' AND wtime>1;
+----------+-----------------+------------------+------------------+------------------+------------------+
| count(*) | avg(wtime) | cpu_ratio | mysql_ratio | sphinx_ratio | lost_ratio |
+----------+-----------------+------------------+------------------+------------------+------------------+
| 112376 | 6.0645327150223 | 0.11126040714778 | 0.17609498370795 | 0.54612972549309 | 0.16651488365119 |
+----------+-----------------+------------------+------------------+------------------+------------------+
1 row IN SET (2.29 sec)

Why looking only at such pages is important ? This is because if you look at all pages rather than problematic subset it may lead you away from your goal. For example it is very possible among all pages we would see CPU usage as the main factor because sphinx and MySQL respond from cache.

We however see for pages which have the problem it is Sphinx which accounts for most of the time.

Looking at the data such way we have two great benefits. First we really understand what is the bottleneck. Second we know what performance gain potential is. For example in this case we could spend a lot of time optimizing PHP code but because it takes only 10% of response time in average even speeding it up 10 times we would not get more than 10% response time reduction. At the same time if we find a way to speed up Sphinx we can reduce response time to its half.

Note in this case there is some 16% of response time which is not accounted for. Large portion probably comes from memcache accesses which are not instrumented for this application. In this case this portion is not the biggest part yet but if we'd speed up Sphinx and MySQL dramatically we would have to go and look into better instrumentation so we can look inside this black box.

Once we know it is Sphinx which causes the problem we have to go and find what queries exactly are causing it - this can be done by adding request ID as comment to Sphinx log so you can profile it carefully or you can add tracing functionality to the application. All the same. Once you found the queries causing the problem you see the ones which cause the most impact and focus on optimizing them.

There are multiple ways to optimize something, my checklist is usually get rid of it, cache it, tune it, get more hardware in this order. It is often it is possible to get rid of some queries, cache them, tune them so they are faster (often at the same time changing semantics a bit) and if nothing helps or can be done quickly we can buy more hardware, assuming application can use it.

Once you've performed optimizations you can repeat analyzes again to see if performance goals are met and where is the bottleneck this time.

As a side note I should mention looking at performance statistics for the day overall is often not enough. Application performs as good as it performs during its worst times so it is very good to plot some graph over time. Sometimes an hour base may be enough but for large scale application I'd recommend to looking down to 5 minutes or even 1 minute intervals and making sure there are no hiccups.

Check the stats from the application above for example:

PLAIN TEXT

SQL:

mysql> SELECT date_format(logged,'%H') h,count(*),avg(wtime),avg(sphinx_time/wtime) sphinx_ratio FROM performance_log_081221 WHERE page_type='search' AND wtime>1 GROUP BY h;
+------+----------+-----------------+------------------+
| h | count(*) | avg(wtime) | sphinx_ratio |
+------+----------+-----------------+------------------+
| 00 | 5851 | 3.0608555987602 | 0.49142908242509 |
| 01 | 6639 | 2.9099249532198 | 0.48133478800683 |
| 02 | 5406 | 3.3770073273647 | 0.49140835595675 |
| 03 | 5397 | 2.9834221059666 | 0.53178056214228 |
| 04 | 4820 | 3.8182240369409 | 0.53530183347988 |
| 05 | 3720 | 13.025273085185 | 0.61126549080115 |
| 06 | 1606 | 60.624889697559 | 0.89123114911947 |
| 07 | 2699 | 38.821067012253 | 0.90885394709571 |
| 08 | 2419 | 45.388828675971 | 0.9226436892381 |
| 09 | 4810 | 6.330725168364 | 0.60329631087965 |
| 10 | 5445 | 3.8355732669953 | 0.53918653169648 |
| 11 | 5283 | 3.0498331333457 | 0.5512679788082 |
| 12 | 4147 | 2.9050685487542 | 0.52802563348716 |
| 13 | 2313 | 3.1297905412629 | 0.47887915792732 |
| 14 | 4155 | 2.9788750504185 | 0.53700871350403 |
| 15 | 4081 | 4.4940078389087 | 0.67605124513469 |
| 16 | 3720 | 3.1698921914062 | 0.54566719123393 |
| 17 | 4210 | 2.7616731525034 | 0.47537024159769 |
| 18 | 6735 | 2.639767089152 | 0.5204920072653 |
| 19 | 5581 | 2.6058266677645 | 0.42959908812738 |
| 20 | 4990 | 2.4441354725308 | 0.44270882435635 |
| 21 | 6305 | 2.6316682707403 | 0.5236776389174 |
| 22 | 6774 | 2.4394227009732 | 0.53342757714496 |
| 23 | 5270 | 2.3949674527604 | 0.51381316608346 |
+------+----------+-----------------+------------------+
24 rows IN SET (2.37 sec)

As you can see in this case during certain hours the average type of bad queries skyrockets and it becomes 90% or so driven by Sphinx. This tells us there is some irregular activity (cron jobs?) is happening and it affects Sphinx layer significantly.

Such goal based from top to bottom approach is especially helpful for complex applications using mutliple components (like sphinx and MySQL) or multiple MySQL Servers because in these cases you often can't easily guess the component which needs attention. Though even for less complicated single MySQL server application there is often the question if it is MySQL server causing the problem or if application code needs to be optimized.

Entry posted by peter | No comment

Add to: | | | |

10:11 西联汇款最新更新 » Inside AdSense-中文

作者 AdSense 支持小组

西联汇款开始以后，就不少发布商因为出差等原因错过了取款有效期，而不得不再等一个月才能取款。现在我们有一个好消息告诉各位！

最近，我们对西联汇款支付系统进行了更新，发布商取款的有效期由35天延长到了60天。这样您有更灵活的时间领取付款，希望这一更新能够给各位发布商带来方便：）

同样，如果您不能在有效期内取款，您的收入将会自动退回帐户，并且为了安全帐户会自动付款保留。这时您就会在帐户里看到一个粉红色的提示框。如果您希望在第二月继续支付，就需要按照提示在15号之前解除付款保留。

求知欲也需要克制？Pure Pleasure - Reborn » 车东's shared items in Google Reader

TopLanguage (Google Group)上的一封信中提到：

人对于未知的事物总是充满了好奇心，然而在知识爆炸的今天，一个人由于其精力的有限性，必须对想要学习的知识进行一定的取舍，分清主次而避免在一些琐碎的，无关的主题上消耗太多精力。

但是现在的信息传播方式也是具有一定侵入性的，比如你订阅的RSS, 邮件列表等，它们会常常通知你："嘿，伙计，我这里又有新东西了"，于是很有可能某个新鲜的主题又勾引起了你求知的欲望。即是这个主题与你的学习，工作，甚至兴趣一点关系也没有，仅仅是因为"这个东西还蛮有意思，我居然今天才知道，嗯，要学习学习"这个念头就有可能花掉大量的时间和精力。人即使在求知上也是很贪婪的，一个问题未能完全弄懂总是未免有点意兴阑珊的味道。我的经验是，如果思考一个问题而没有得到结果，那么在随后的几天时间里大脑都会有意识或无意识地给这个问题分一定的时间片。毫无疑问，这样会干扰正常的工作和学习，因为只有专注才能获得最高的效率。如何克制住自己对某些知识主题的求知欲？我想听听各位牛人们的意见。

以下是我的看法：

WHAT:

所谓“兴趣广泛”有的时候并不见得是好事儿。

所谓“学习”是“学”之后再加上大量的“习”（练习）。而常见的学习曲线如下图所示：

一般来讲，任何人在任何时候学习任何知识的时候，学习曲线不大可能是一个完美光滑的曲线——就好像第一排第一幅那样；真实的学习曲线可能更像第二排的那一幅。

WHY:

对学习曲线的最重要理解是：“迈出第一步之时获得的进步最大”——（t2-t1）vs.(p2-p1）。而随后，进步的难度越来越高——并中途伴随着“停滞”、“退步”等常见现象。(t3-t2)远远大于(t2-t1)，但是，(p3-p2)却远远小于(p2-p1)。

这就很容易解释为什么有些人表现得“聪明”，在不停地“学”新东西，而最终竟然一无所成。

因为他们对每一样东西都是“浅尝即止”，缺乏深入，所以一无所成。但，自己却并不知道，因为他们每次都（自以为）获得最大的“成就感”、“征服感”，每次都（自以为）“很有收获”……他们已经不知不觉开始迷恋“最初的那个巨大的成功”（实际上只有继续走下去很久之后才会明白那只不过是起点而已）。

其实，这也是“逃避困难”的另外一种表现，只不过，更加隐蔽。但也因为更加隐蔽而危害更大。

HOW:

两个小建议能够用来解决问题：

必须至少精通一项技能。任何一项技能的高度精通，会使得拥有这项技能的人对关于学习的方方面面拥有深刻的认识与体会。进而对自己的有关学习的决定能够真正做到“深思熟虑”
拒绝相信“速成”，相信“精通任何一项技能至少需要十年时间”。这种常识（当然是很多人不愿意接受的常识）会让相信它的人沉着、冷静，不易受到外界影响——至少在学习方面。

貌似不相干的补充：

如果有机会，可以仔细观察那些工作没多久就被解雇了的人。他们几乎会无一例外地如此安慰自己：“毕竟还是学到了很多东西……”可事实上，那只是因“初始状态下进步最容易获得”而造成的错觉，那个感觉上“很大”的进步实际上在终点上观望的话只不过是“起点”而已——只不过是一个“点”而已。

只是，解雇他们的人抑或没必要做出如此清楚的解释，抑或也不懂得如何做出如此清楚的解释，抑或就算做出了如此清楚的解释对方也无能力理解或者不愿意理解而已。

	十二月 2008
一	二	三	四	五	六	日
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31