Strategy: Facebook Tweaks to Handle 6 Times as Many Memcached Requests | High Scalability - Building bigger, faster, more reliable websites.

22:38 Online Advertising ARPU (US, France, China) » laolu's blog: Blog

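Online advertising ARPU for the US, France, and China, 2008
	Online ad market size	Internet users	ARPU
US*	USD 25.9 billion	220 million	USD 117.7 per user
France**	USD 1.36 billion	36 million	USD 37.7 per user
China***	RMB 11.17 billion	250 million	RMB 46.8 per user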

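Notes:
* US online advertising figures: see "Will US online advertising hit a low next year and then accelerate again?" (see also here); internet user figures: North America Internet Usage Stats - Population Statistics.
** French online advertising figures: see "Online advertising in France is still growing"; internet user figures: Europe Internet Usage Stats and Population Statistics.
*** Chinese online advertising figures are from Analysys (易观); see "Report: China's online advertising market reached RMB 11.12 billion in 2008" (related data can also be found on the Analysys website); internet user figures are from the CNNIC survey.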
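
The ARPU column in the table above is the market size divided by the number of internet users. A minimal check of that arithmetic, using only the rounded inputs listed in the table (the dictionary layout and the function name are illustrative assumptions):

```python
# ARPU = online ad market size / number of internet users.
# Inputs are the rounded figures from the table, in each country's own currency.
markets = {
    "US": (25.9e9, 220e6, "USD"),
    "France": (1.36e9, 36e6, "USD"),
    "China": (11.17e9, 250e6, "RMB"),
}

def arpu(market_size, users):
    """Average revenue per user for one market."""
    return market_size / users

for country, (size, users, currency) in markets.items():
    print(f"{country}: {arpu(size, users):.1f} {currency} per user")
```

With these rounded inputs the US and France values agree with the table to within rounding. China comes out nearer RMB 44.7 than the listed 46.8, so the listed figure may have been computed from a different user count; that is only a guess.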

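Also:
In 2008, online advertising accounted for 8.8% of all advertising in the US (source) and roughly 3% in China (the mainland advertising market reached RMB 260.4 billion in the first three quarters, so a rough estimate for the full year is about RMB 350 billion).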

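Update:
Spreading advertising revenue from all media over the total population, the US comes to USD 977 per person (for ad revenue see here; for population, see search), and China to about RMB 365 per person (for population, see search).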

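21:37 Sixth Anniversary | Blogbus » Che Dong@博识传播 | Interactive · Entertainment · Media

Events on the 12th:

Crossover · Conspiracy: New Media Marketing Forum for the YOU Era

Events on the 13th:

What I picked up at the afternoon creative market:

《城客》 magazine;
the stage play "Alpha Girl" (《阿尔法女郎》);
6th-anniversary postcards;
a number of card stickers;

More photos, and video to be released later...

A piece shot at the evening party: the red Ankara (I had always thought the capital of Turkey was Istanbul)

Original author: 朝天啸

Today's preview screening of "Ip Man" (《叶问》):

I bet that Jin Shanzhao (金山找) will turn good in the sequel;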

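Random posts:

This old monk, 《城客》 in hand, was here, December 10, 2008

From Beijing to Shanghai, food edition: first, say Bye Bye to disposable chopsticks, November 20, 2008

Bolin (博邻) feature in public beta + Blogbus post number 30,000,000, October 10, 2008

New Blogbus admin backend goes live: Blogbus V5, September 11, 2008

Blogbus team outdoor team-building: Wuxie edition, September 3, 2008

Kiwi Shoe Polish "Shine Your Way to Success" blog contest: at key moments in your career, how do successful people shape their image and shine?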


14:16 MySQL QA Team Benchmarks for MySQL 5.1.30 » MySQL Performance Blog

As you might have seen, the MySQL QA Team has published its benchmarks for MySQL 5.0.72 and 5.1.30.

It is interesting to compare them with the results I posted previously.

A quote from the results mentioned above:

“Maybe you’ve seen some claims by others in the MySQL community that MySQL 5.1 runs slower than MySQL 5.0. Maybe you’ve also seen some claims by others in the MySQL community that MySQL 5.1 runs faster than MySQL 5.0.

Guess what? They’re both right.”

But is that really what the results are telling us?

I do not think so. When you’re doing benchmarks you should compare the best-performing settings for the given application and conditions. For example, it is unfair to compare results obtained with different innodb_buffer_pool_size or innodb_flush_log_at_trx_commit values, but for innodb_thread_concurrency you should pick the value which makes your workload run fastest.

Let's look at the graphs provided in these benchmarks and see which value is best for MySQL 5.1 and for MySQL 5.0 respectively.

Both versions do their best with innodb_thread_concurrency=0 and 5.0 is slightly but consistently faster. Same as in my results.

So I would interpret these results differently.

MySQL 5.0 is faster if you configure it right. If you configure it wrong the regression is going to be worse than for MySQL 5.1.
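
The comparison rule argued for above (tune innodb_thread_concurrency separately for each version, then compare the best runs) can be sketched in a few lines. The throughput numbers below are made up for illustration and are not taken from the MySQL QA Team report; the function names are likewise assumptions.

```python
# Hypothetical throughput (transactions/sec) per version, keyed by the
# innodb_thread_concurrency value used for that run. NOT real benchmark data.
results = {
    "5.0": {0: 1000.0, 8: 900.0, 16: 850.0},
    "5.1": {0: 980.0, 8: 940.0, 16: 900.0},
}

def best_setting(by_concurrency):
    """Return (concurrency, throughput) for the fastest run of one version."""
    return max(by_concurrency.items(), key=lambda kv: kv[1])

def compare_at_best(all_results):
    """Pick each version's own best setting instead of a shared fixed one."""
    return {version: best_setting(runs) for version, runs in all_results.items()}

best = compare_at_best(results)
for version, (conc, tps) in sorted(best.items()):
    print(f"MySQL {version}: best at innodb_thread_concurrency={conc}, {tps:.0f} tps")
```

With this made-up data, comparing at a fixed value of 8 or 16 would favor 5.1, while comparing each version at its own best value favors 5.0, which is the distinction the post is drawing.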

You can’t really use these results to say that MySQL 5.1 will be the winner in cases where small values of innodb_thread_concurrency give the best performance. Things can be completely different in that case. They may or may not be; there is simply no data at all.

It is also very interesting to see a benchmark run with innodb_thread_concurrency=1000. This is exactly the value you should never use. A limit of 1000 threads inside the InnoDB kernel is far too large (so it is effectively the same as 0, unlimited), but it adds another mutex to deal with for the queue implementation.

What would be really interesting to learn, though, is why MySQL 5.1 does so much better when threads get queued up, and what kind of changes in MySQL 5.0 result in this behavior. It would also be good to profile these runs to see where these few percent are lost for MySQL 5.0 and whether they can be reclaimed.

Finally, if I were doing QA with benchmarks as part of it, I would try to use options close to what people use in production, or at least explain why they are set the way they are. Is it because 5.1 does not show such good results with standard settings, or is it just an omission? Again, quite possibly nothing would change, but maybe it would.

In particular, innodb_support_xa=0 and innodb_doublewrite=0 are not normally used in production, and leaving these features enabled does add some overhead.

When looking for results more relevant for production I would also keep the binary log enabled: most big MySQL installations use replication, or at least the binary log, to get point-in-time recovery from backup. Also, InnoDB log files of 2*650M are larger than practical for most applications because recovery takes too long.
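
One way to make this kind of disclosure review mechanical is to diff the benchmark configuration against a production-like baseline. The sketch below covers only the three on/off settings named above; the baseline dictionary, the function name, and the use of plain booleans (rather than real my.cnf syntax) are assumptions for illustration.

```python
# Production-like values implied by the discussion above, modelled as booleans.
PRODUCTION_LIKE = {
    "innodb_support_xa": True,
    "innodb_doublewrite": True,
    "log_bin": True,
}

def non_production_settings(benchmark_cfg):
    """Names of settings whose benchmark value differs from the baseline."""
    return sorted(
        name
        for name, expected in PRODUCTION_LIKE.items()
        if benchmark_cfg.get(name, expected) != expected
    )

# Benchmark settings as the post describes them: XA support and doublewrite
# switched off, and (by implication) no binary log.
benchmark_cfg = {
    "innodb_support_xa": False,
    "innodb_doublewrite": False,
    "log_bin": False,
}
print(non_production_settings(benchmark_cfg))
```

A report that lists these deviations next to the numbers lets readers judge how far the results carry over to their own setup.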

Anyway, it is great to see that the MySQL QA Team has published some benchmarks, and I can’t wait to see more. If we have the data and good disclosure (settings, versions, hardware), as we have in this case, we can make up our own minds about the results and draw our own conclusions.

Also, it is indeed a good time to try MySQL 5.1 in your environment. If you spot any regressions from MySQL 5.0 it will likely take some time to get them fixed.

Entry posted by peter | No comment


Feng, Full of Confidence | Make Difference » Che Dong's shared items in Google Reader

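I got up this morning, checked the latest update on To be continued, and saw Feng's InfoThinker startup manifesto. Feng and I used to be colleagues. We were not close back then; it was only after we stopped working together that we gradually got to know each other.

In my impression, Feng is a quiet man, not much of a talker; you could say he is neither loud nor striking in appearance. I knew he was tinkering with some things, but I never looked into them much. It was only when I realized one day that the FIT input method I use was developed by him that I decided to get to know him a little better.

Some people may not care much for FIT, but I know very well how much convenience it has brought me. I began to admire him somewhat, and when I had time I would browse his blog. What struck me most in his writing was this: how can someone who is not good with words be so quick-thinking, so well organized, so substantive, and so consistent between what he knows and what he does?

Later I heard two small stories about him. One is that when he worked at Apple in Guangzhou he applied to join Apple's development team in Beijing and was turned down. The other is that he phoned Ding Lei (丁磊) hoping Ding would develop a Mac version of POPO, and was also turned down, learning that Ding Lei at the time regarded Apple fans as a bunch of people who could only talk and not deliver.

I think these two experiences helped a great deal with Feng's later modest successes. His announcement today that he is founding his own company may also have something to do with Ding Lei's criticism of Apple fans. The idea of starting a company must have been brewing for a long time, but I only learned of it from his blog, which surprised me a little, just as his writing does.

The InfoThinker startup manifesto flows smoothly and reads as if written in a single breath. In it I see a group of people with ideals and ambitions who have found the way to realize them, summed up in one sentence from the text:

"We made many mistakes and paid a lot of tuition before we learned that this world has no myths, only some very plain truths: the cheap beats the expensive, good quality beats poor quality, the careful beats the careless, the patient beats the impatient, the diligent beats the lazy, the trustworthy beats the untrustworthy..."

How much confidence it takes for words like these to come out so naturally!

And he does have that confidence, because he is an "idealistic doer" who has done several things that surprised people; iXpenseIt's ranking in the App Store is one example. Many people may shrug this off, and I will tell them in the manner of an angry young man: what China lacks least is people with lofty ambitions and little ability. I have read the reviews of iXpenseIt by users abroad. These reviews come not from government experts and scholars, not from so-called authorities, and certainly not from self-important editors and reporters, but from ordinary users giving their sincere opinion of the software. It is the support of these users, paid in real money, that earned it its ranking.

Feng has proven with his effort the plain truths he speaks of. The road of a startup is full of hardship. I wish InfoThinker success, and I hope Feng keeps to these plain truths and builds more things that surprise people.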

Strategy: Facebook Tweaks to Handle 6 Times as Many Memcached Requests | High Scalability - Building bigger, faster, more reliable websites. » Che Dong's shared items in Google Reader

Our latest strategy is taken from a great post by Paul Saab of Facebook, detailing how with changes Facebook has made to memcached they have:

...been able to scale memcached to handle 200,000 UDP requests per second with an average latency of 173 microseconds. The total throughput achieved is 300,000 UDP requests/s, but the latency at that request rate is too high to be useful in our system. This is an amazing increase from 50,000 UDP requests/s using the stock version of Linux and memcached.

To scale, Facebook has hundreds of thousands of TCP connections open to their memcached processes. First, this is still amazing. Not so long ago you could never have done this. Optimizing connection use was always a priority because the OS simply couldn't handle large numbers of connections or large numbers of threads or large numbers of CPUs. To get to this point is a big accomplishment. Still, at that scale there are problems that must be solved.

Some of the problems Facebook faced and fixed:

Per connection consumption of resources. What works well with a small number of inputs can totally kill a system as inputs grow. Memcached uses a per-connection buffer, which adds up to a lot of memory that could be used to store data. Nothing wrong with this design choice, but Facebook made changes to use a per-thread shared connection buffer and reclaimed gigabytes of RAM on each server.
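
A back-of-the-envelope calculation shows why tying buffers to threads rather than connections matters at this scale. Every number below is hypothetical (the connection count, thread count, and buffer size are not from Paul Saab's post); only the shape of the comparison is the point.

```python
def buffer_memory_bytes(owners, buffer_bytes):
    """Total memory used when each owner holds one buffer."""
    return owners * buffer_bytes

# Hypothetical figures, chosen only to show how the two designs scale.
connections = 200_000      # open connections on one server
threads = 8                # worker threads on that server
buffer_bytes = 16 * 1024   # size of one buffer

per_connection = buffer_memory_bytes(connections, buffer_bytes)
per_thread = buffer_memory_bytes(threads, buffer_bytes)

print(f"per-connection buffers: {per_connection / 2**30:.2f} GiB")
print(f"per-thread buffers:     {per_thread / 2**10:.0f} KiB")
```

Memory for per-connection buffers grows with the connection count, while per-thread buffers grow only with the number of threads, which stays small and fixed.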

Kernel lock contention. Facebook discovered under load there was lock contention when transmitting through a single UDP socket from multiple threads. Sockets are data structures too and they are subject to the usual lock contention issues. Facebook got around this issue by maintaining separate reply sockets in different threads so they would not contend with the receive sockets. They found another bottleneck in Linux’s “netdevice” layer that sits in-between IP and device drivers. They changed the dequeue algorithm to batch dequeues so more work was done when they had the CPU.

Application lock contention. Nothing brings out lock issues like moving to more cores. Facebook found when they moved to 8 core machines that a global lock protecting stats collection consumed 20-30% of CPU. In applications that require little processing per request, as memcached does, this is not unexpected, but doing real work with your CPU is a better idea. So they collected stats on a per thread basis and then calculated a global view on demand.
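
The per-thread stats idea can be sketched as follows. This is not memcached's code (which is C); the class and counter names are invented for illustration, and Python's global interpreter lock means the sketch shows the structure of the technique rather than any speedup.

```python
import threading

class PerThreadStats:
    """Each thread updates its own counters; totals are summed on demand."""

    def __init__(self):
        self._local = threading.local()
        self._all = []                           # one counter dict per thread
        self._register_lock = threading.Lock()   # taken once per thread, not per request

    def _counters(self):
        counters = getattr(self._local, "counters", None)
        if counters is None:
            counters = {}
            self._local.counters = counters
            with self._register_lock:
                self._all.append(counters)
        return counters

    def incr(self, name, amount=1):
        # Hot path: touches only this thread's dict, so no shared lock is needed.
        counters = self._counters()
        counters[name] = counters.get(name, 0) + amount

    def snapshot(self):
        """Build the global view by summing every thread's counters."""
        with self._register_lock:
            per_thread = list(self._all)
        total = {}
        for counters in per_thread:
            for name, value in list(counters.items()):
                total[name] = total.get(name, 0) + value
        return total


stats = PerThreadStats()

def worker(requests):
    for _ in range(requests):
        stats.incr("get_hits")

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stats.snapshot())
```

The cost moves from every request (a global lock per increment) to the rare reader, who pays for the summation only when a global view is actually requested.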

Interrupt floods and starvation. With so much traffic directed at a single server the hardware can flood the CPU(s) with interrupts and keep the CPU from doing "real" work. To get around this problem Facebook implements some complicated strategies to load balance IO across all the cores. As I am less clever I might try more network cards with a TCP Offload engine.

When you read Paul's article keep in mind the incredible number of man-hours that went into profiling the system, not just their application, but the entire software and hardware stack. Then add in the research, planning, and trying different solutions to see if anything changed for the better. It's a lot of work. Notice that using a nifty new parallel language or moving to a cloud wouldn't have made a bit of difference. It's complete mastery of their system that made the difference.

A summary of potential strategies:

Profile everything. Problems are always specific. The understanding of the problem must be specific. The fix must be specific.

Burn profiling into your regression tests. Detect when and where performance tanks as a regular part of your build.

Use resources in proportion to what grows slowest. This requires multiplexing, but at least your resource usage is more predictable and bounded.

Batch work. When you have the CPU, do all the work you possibly can in the quantum, or the whole system grinds to a halt in processing overhead.

Do work and maintain resources per task. Otherwise locking for shared resources takes more and more time when there's less and less time to do the work that needs to be done.

Change algorithms. Sometimes you simply need to do things differently. Tweaking will only get you so far.
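
As an illustration of the "Batch work" item, here is a user-space sketch of a queue whose consumer removes many items per lock acquisition instead of one. It is only an analogy for the batched dequeue described earlier, not Facebook's kernel change, and the class and method names are invented.

```python
import threading
from collections import deque

class BatchQueue:
    """A queue whose consumer takes many items per lock acquisition."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def put(self, item):
        with self._lock:
            self._items.append(item)

    def get_batch(self, max_items):
        """Remove and return up to max_items items under a single lock hold."""
        batch = []
        with self._lock:
            while self._items and len(batch) < max_items:
                batch.append(self._items.popleft())
        return batch


q = BatchQueue()
for packet in range(10):
    q.put(packet)

first = q.get_batch(4)     # one lock acquisition, four items
rest = q.get_batch(100)    # drains whatever is left
print(first, rest)
```

The cap on batch size keeps one consumer from holding the lock for an unbounded time, which is the usual trade-off when batching.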

You can find their changes on github, the hub that says "git."
