10:56 How to find wrong indexing with glance view » MySQL Performance Blog

A quite common beginner’s mistake is not understanding how indexing works and so indexing every column used in the queries…. separately. You end up with a table which has, say, 20 indexes, all of them single-column. This can be spotted at a glance. If you have queries with restrictions on multiple columns in the WHERE clause, you will most likely need multi-column indexes for optimal performance. But wait. Do not go ahead and index all combinations. That would likely be a poor choice too :)
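
As a rough sketch of what such a glance check looks for, the snippet below flags a table whose indexes are numerous and all single-column. The function name, the threshold of 5 and the sample index names are assumptions made up for illustration; none of them come from MySQL itself.

```python
def looks_over_indexed(indexes, min_indexes=5):
    """Return True if a table has many indexes and every one is single-column.

    `indexes` maps index name -> list of column names, for one table.
    `min_indexes` is an arbitrary illustrative threshold, not a MySQL rule.
    """
    if len(indexes) < min_indexes:
        return False
    return all(len(columns) == 1 for columns in indexes.values())


# Hypothetical table: every queried column got its own index.
suspicious = {
    "idx_user": ["user_id"],
    "idx_status": ["status"],
    "idx_created": ["created_at"],
    "idx_country": ["country"],
    "idx_type": ["type"],
}

# Same table with one multi-column index serving a two-column WHERE clause.
healthier = {
    "idx_user_status": ["user_id", "status"],
    "idx_created": ["created_at"],
}

print(looks_over_indexed(suspicious))  # True
print(looks_over_indexed(healthier))   # False
```

In MySQL the input could be assembled from `SHOW INDEX FROM <table>`, which returns one row per indexed column. A flagged table is only a hint to go and look at the actual queries, not proof that the indexing is wrong.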


Entry posted by peter | 8 comments


09:35 New Publisher Feature Day: AdSense and AdWords Complement Each Other » Inside AdSense-中文


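We believe that every publisher, whether highly experienced or newly joined, should understand the role they play in the “ecosystem” of Google’s advertising network. On one hand, AdSense lets publishers earn revenue by showing Google ads on their own sites; on the other, it offers AdWords advertisers ad inventory that performs comparably to Google search results pages, helping them extend the reach of their ads.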

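If advertisers are satisfied with how their ads perform on your site, they may run more ads. That means we have more ads to serve, so you may see ads on your site that are more relevant to your content and better matched to your users’ interests. In that case your users may click on ads more often, more placement-targeted campaigns may target your site, and advertisers’ bids may rise. In short, if advertisers believe their ads perform very well on your site, you are likely to earn more revenue.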

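Conversely, if advertisers believe their ads perform very poorly on your site, they will be less willing to advertise there. That means the ads our system serves to your site may become less relevant to your content and your users’ interests, which in turn leads to fewer clicks and lower advertiser bids. In other words, if advertisers believe their ads perform very poorly on your site, your revenue may fall.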

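If you want to earn the most from the advertisers whose ads appear on your site, we recommend putting your full effort into building a high-quality site with rich original content that brings real benefit to your users. For more information, see our program policies and webmaster quality guidelines.

04:30 Redundant Array of Inexpensive Servers » MySQL Performance Blog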

So you need to design a highly available MySQL-powered system… how do you approach that?
Too often I see the question approached by focusing on expensive hardware which in theory should be reliable. And this really can work quite well for small systems. In my experience, with quality commodity hardware (Dell, HP, IBM etc.) you would see a box fail once per couple of years of uptime, which is enough to maintain the level of availability many small systems need. In fact they typically have an order of magnitude more availability issues caused by their own software bugs, DoS attacks and other issues.

However, as your system grows, reliability goes down. If you have 100 servers, each failing every 2 years, that is about a server a week, which is bad, and if you’re into thousands and tens of thousands of servers, server failures become commonplace. So it is important to make sure a failing server does not affect your system, and also that you can recover from a server failure easily.
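
The arithmetic behind “about a server a week” can be checked with a short sketch. It assumes failures are independent and spread evenly over time, which is a simplification; the function name and the fleet sizes in the loop are illustrative.

```python
def failures_per_week(servers, years_between_failures):
    """Expected server failures per week for a fleet.

    Assumes each server fails once every `years_between_failures` years
    on average and that failures are spread evenly (a simplification).
    """
    weeks_between_failures = years_between_failures * 52
    return servers / weeks_between_failures


for fleet in (1, 100, 1000, 10000):
    rate = failures_per_week(fleet, 2)
    print(f"{fleet:>6} servers: {rate:.2f} failures per week")
```

With 100 servers this gives roughly 0.96 failures per week; at 10,000 servers the same failure rate means well over ten failures a day.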

So you should assume every component in the system can fail (be it a server, switch, router, cable, SAN etc.) and be ready to deal with this. It does not mean you always have to stay fully operational after any failure, but at least you should understand the risks. For example, you may choose to keep a single Cisco router because it has its own internal high availability on the component level which makes it extremely unlikely to fail, because you have a 4 hour onsite repair agreement and because it is just freaking expensive. Though maybe redundant, less expensive systems could be a better choice.

I would highlight again: every component can fail, no matter how redundant it is inside. The SAN is a very good example - I’ve seen firmware glitches cause failures in a SAN which was fully redundant on the component level. It is not only hardware components; any code may fail as well. This is actually what often makes your own code the weakest link in availability.

Depending on the failure rate you should also be thinking about automation - for frequent failures you want recovery (like getting a spare web server and putting it online) to be automatic or done with a simple manual command. For complex and rare failures you may have less automation - if a certain type of failure happens once per couple of years, for many evolving systems there is a very high chance the old automation tools will not work well (unless, of course, you regularly test all automated failure scenarios).

So if we’re designing the system so it can tolerate hardware failures, should we bother about hardware quality at all? The answer is yes, in particular for classic database/storage systems. Few systems are designed with as much error detection and automated handling in mind as Google File System.

In particular you want to make sure error detection is at a good level. For example, if you’re running the system without ECC memory, chances are your data will become corrupted and you will not notice it for a long time (in particular if you’re using MyISAM tables), which can cause the error to propagate further in the system and make recovery much more complicated than simply swapping the server. This is exactly one of the reasons many high-scale installations prefer Innodb - it is paranoid, and this is how you want your data storage to be. This is also why Sun is so proud of checksums on the file system level in ZFS.

What about RAID then? As strange as it may sound, you should not rely on RAID for your data safety. There are many ways to lose data on a RAID system even if you’re running RAID6 with a couple of hot spares. RAID just dramatically reduces the chance of data loss in case of a hard drive failure, and this is good because recovering database servers is not fully automated in most cases. Plus, there may be a system performance impact, and (in particular if you use MySQL Replication for HA) the switch to the new server may not be 100% clean, with a few updates lost. RAID, especially with a BBU, also makes good sense for getting extra performance out of the box.

Some installations use RAID0 for slaves - in these cases there are typically many slaves, and recovering a slave is very easy and causes no business impact. This is fine assuming you do the math and the performance gains or cost savings are worth it.

Another good RAID question is whether a hot spare should be used. I normally do not use one because it is a large waste, especially as most systems have an even number of drives, so if you’re looking at RAID10, setting up a hot spare costs you 2 drives. A hot spare does not add a lot to high availability - if you have proper RAID monitoring in place and keep a couple of spare hard drives on the shelf in the data center, we’re speaking about a couple of extra hours running in degraded mode. Even if you do not have a spare hard drive you can often pull one from a spare server and have the “warranty man” replace it instead.

It is also a good question whether you need redundant power supplies. In my experience they rarely fail, so redundant power supplies do not increase availability that much when it comes to hardware failures, and if you look only from this angle they may be justified only for the most critical servers. Do not forget redundant power supplies also increase server power usage a bit. Redundant power supplies are helpful, however, if you have multiple power feeds, so the server can stay up if one of the phases loses power. Another benefit is that a redundant power supply will often allow you to do some power work (like moving a server to a different circuit) without downtime, which may or may not be important for you.

Finally I should mention spare components. These are paramount if you’re designing a highly available system: spare drives on the shelf, spare switches, spare servers (which are the same as or better than the servers in production). It is important that promotion happens easily and there are no performance gotchas (e.g. an 8 core server can be slower than a 4 core one with MySQL). It is best if you just include a couple of spare servers in each purchase batch so they have absolutely the same configuration, but I know it is not always possible. Dealing with spares is yet another reason to avoid the “server zoo” and have a limited set of purchased configurations which are reviewed yearly (or at another regular interval) rather than finding a different best configuration each week.

Having spare servers also means you often do not need the most expensive support agreements, and Business Hours Monday-Friday is good enough for you - you’re not waiting for support for production anyway; you just fall back to another server and use it. Of course you can imagine cases when a problem could affect all servers of the same type, but it is not that frequently seen in practice.

To avoid multiple servers failing at the same time it is of course important to QA/stress test servers before you put load on them. I’ve seen multiple cases when something went wrong and all servers of the same configuration experienced the same problem. Proper QA/stress testing reduces the chance of this, but you had better test with a load similar to what you expect in production.

The requirement to have spare hardware is also the reason why inexpensive commodity hardware is often the better choice. If you have a couple of $1M servers in production you need another $1M server as a spare, and this is expensive. If instead you have 10 pairs of $10K boxes, having a couple of spares would cost you only $20K. Plus, I have found it in many cases much easier to convince “finance” people to buy something cheap which is not used most of the time than to spend a lot of money on a server which will be sitting there idle.
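
To make the comparison concrete, here is a small sketch of the spare overhead in both scenarios. It reads “a couple” as two production servers and “10 pairs” as 20 boxes; those readings and the function name are assumptions for illustration.

```python
def spare_overhead(production_count, spare_count, unit_cost):
    """Return (spare cost, spare cost as a fraction of production cost)."""
    spare_cost = spare_count * unit_cost
    production_cost = production_count * unit_cost
    return spare_cost, spare_cost / production_cost


# Assumed reading: two $1M servers in production, one $1M spare.
big_cost, big_ratio = spare_overhead(2, 1, 1_000_000)

# Assumed reading: 10 pairs = 20 boxes at $10K each, two spares.
small_cost, small_ratio = spare_overhead(20, 2, 10_000)

print(f"big iron:  ${big_cost:,} in spares ({big_ratio:.0%} of production spend)")
print(f"commodity: ${small_cost:,} in spares ({small_ratio:.0%} of production spend)")
```

Under these assumptions the spare adds 50% to the big-server purchase but only 10% to the commodity one, and the absolute amount sitting idle is 50 times smaller.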

How many spare servers do you need? You will see it in practice. As I mentioned, at least one for each hardware class you have. If you have many failures you need more, of course. You may also decide to keep more spare systems when you can use them to help capacity management, especially if you have multiple applications which do not share hardware but share the data center. You can quite easily use “spares” to provide extra on-demand capacity for web servers or memcached, or, say, increase the number of slaves if users launch an unexpectedly high number of reports.


Entry posted by peter | 5 comments


01:58 Life on Douban: Open Your Wardrobe » 豆瓣blog

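In Douban’s groups, discussions about “clothing, food, housing and transport” go on every day, and they make the small things in life vivid and fun. Although many people on Douban are connoisseurs of books, film and music, do not think they only linger over critiquing works of art; in every small detail of daily life they are at expert level too! So we will be presenting these groups category by category, hoping to help you discover more on Douban that interests you or might interest you, and to add a little more quality to simple everyday life. In any case, we hope they are truly useful in your life.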

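So today, let’s start by opening the experts’ wardrobes and seeing what the current fashion indicators are!

Life on Douban (1): Open Your Wardrobe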

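The experts say: on Douban you need to own a pair of Converse. More than 9,000 people have already settled in here. Of course, joining is only the first step; next you need to know the origins and features of each series, and even carry your love for C forward without limit, for example by knowing which films Converse appears in. Because “Converse is more than just a pair of canvas shoes”.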

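The experts also say: don’t ask us why more than 5,000 people gather here. The answer is that we simply, for no particular reason, love “big and small I.T”. We know their brands inside out, and we know the store locations and sale information in every big city like the back of our hand.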

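Comme des Garçons: how many people know this brand? First learn the pronunciation from the experts. A CDG exhibition is currently running at 798; if you are interested, go and experience Rei Kawakubo’s distinctive creative style, imbued with Eastern Zen and thought.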

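If you ask the H&M experts “When the big sales are on, which brand’s clothes will make you shop till you drop no matter what?”, they will blurt out without a second thought: “HM”. The reason they like it is simple: cheap and chic.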

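The experts’ wardrobes are colorful and their tastes vary widely. People who like 三叶草 (the Trefoil, i.e. adidas Originals) pair it with Levi’s and BAPE; ZARA fans also love UNIQLO and MUJI; 回力 (Warrior) sneakers and sailor-striped shirts are the most classic home-grown look; and Paul Frank (the “big-mouth monkey”) paired with ROXY flip-flops is simply unbeatably cute…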

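In short, there is always something fresh coming out of the experts’ wardrobes. Join in and get a first look:
(The groups below are ordered by number of participants among the clothing brand groups on Douban)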

Converse group (9168)

I.T group (5415)

三叶草 (adidas Originals) group (4117)

UNIQLO group (3163)

Levi’s group (2581)

>>>> See more brand expert groups


^==Back Home: www.chedong.com

^==Back Digest Home: www.chedong.com/digest/

<== 2008-08-21
==> 2008-08-23