Tim O'Reilly
2008-08-08
As I wrote last month in What Good is Collective Intelligence if it Doesn't Make Us Smarter?, at this year's Web 2.0 Summit we're focusing on how what we've learned from the web over the past decade can be applied to solve the world's hard problems. That's why I'm really excited to see that John Battelle has persuaded Al Gore to join us.

One of those hard problems that requires all the intelligence we can throw at it is global warming. And there's no one who deserves as much credit as Al Gore for getting it on our collective radar. Through persistence, vision, hard work, and a real mastery of the new tools of global media, he made all of us pay attention. His work has been a textbook demonstration of the power of media to change the way people think. That's Gore's continuing focus, with his role at Current TV. He's also joined Kleiner Perkins as a partner involved in cleantech investing.

When I first saw Gore talk about climate change at the TED conference in early 2006, everyone wanted to know what we could do about it. People are still struggling to answer that question, but it's clear that technology can play a large role: helping us to monitor and measure the rate of change in crucial environmental variables; creating feedback loops that change behavior at both macro levels (like carbon markets) and personal levels (like home energy monitoring); creating green data centers and low-power devices; creating new forms of renewable energy generation and storage, and new materials that require less energy to produce; and developing alternative fuels and vehicles. The list goes on and on. (Reminder: we're looking for innovative "web meets world" startups for the Web 2.0 Summit Launchpad.)

Of course, global warming is far from the only "web meets world" theme that we're exploring. The conference will cover everything from the latest trends on the web (the rediscovery of e-commerce as a business model, cloud computing, social networking, mobile applications, and the inevitable platform wars) to politics, global disease detection, personal genomics, the private space industry, and even military infotech.

Speakers I'm particularly excited to see, in addition to Vice President Gore, include Tony Hsieh (@zappos, for those of you who see him continually on Twitter), Elon Musk (who's got to have the coolest portfolio of investments since retiring from PayPal, with SpaceX, SolarCity, and Tesla Motors all under his wing), and Michael Pollan, who's completely changed the way many of us think about food. Check out the confirmed speaker list, but keep in mind that there are more yet to come as John and I firm up the program.
One common question is how much of a performance improvement to expect from increasing memory, say from 16GB to 32GB. The benefit can be very application dependent. If you have a working set of, say, 30GB with uniform data access, raising memory from 16GB to 32GB can improve performance by an order of magnitude, converting a very IO bound load into a CPU bound one. It is also quite possible to see limited gains: if your working set already fits in 16GB, you may not see any significant improvement from upgrading to 32GB. Interestingly enough, something similar can happen with a very large working set. For example, if your main queries do full table scans of a 100GB table, it does not matter whether you have 16GB or 32GB; the load is going to be far too IO bound either way.
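As a rough first check, you can compare how often InnoDB misses its buffer pool against the total data size. A minimal sketch using standard MySQL status counters (the data directory path is an assumption):

    # Innodb_buffer_pool_reads counts logical reads that missed the buffer
    # pool and went to disk; Innodb_buffer_pool_read_requests counts all
    # logical read requests. A high miss ratio suggests more memory may help.
    mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'"

    # Compare the configured buffer pool size to the on-disk data size to
    # judge how much more of the working set extra memory could cache.
    mysql -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size'"
    du -sh /var/lib/mysql    # assumes the default data directory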
Interestingly enough, because of MySQL scaling issues it is also possible to see performance go down as you increase the buffer pool size. Threads which would otherwise be safely sleeping, waiting on IO completion, now find their data in the buffer pool, so they start to compete on hot latches, and performance drops.
Now back to the original question: how do we predict the benefit of increasing memory, and therefore cache sizes? I typically start by looking at the type of load we're dealing with. If it is CPU bound and there is little IO wait, we typically do not expect to gain much by adding memory. A quick way to classify the load is sketched below.
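Standard Linux tools are usually enough for this first classification (the sampling intervals here are arbitrary):

    # Sample system activity every 5 seconds. In vmstat output, a high 'wa'
    # (IO wait) column with low 'us'/'sy' (user/system CPU) suggests an IO
    # bound load; the reverse suggests a CPU bound one.
    vmstat 5

    # iostat -x reports per-device utilization; %util near 100 means the
    # IO subsystem is saturated.
    iostat -x 5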
This, however, has to be watched carefully. Performance does not always stay the same, and the goal may not be optimizing average performance. You may have a heavy, IO bound batch job that runs too slowly (and affects other transactions), and increasing memory may be what solves that problem.
If the load is IO bound (high IO subsystem utilization, low CPU usage), you should think about how much CPU capacity is available. If your CPU is 25% busy, you are unlikely to get more than a 4x improvement even if you eliminate all IO completely (unlikely in practice, because the IO path carries CPU overhead of its own), so account for that.
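The arithmetic behind that ceiling is simple (the numbers are illustrative):

    # Best-case speedup if all IO wait disappeared and CPU became the
    # bottleneck: total capacity divided by current CPU utilization.
    echo "scale=1; 100 / 25" | bc    # 25% busy -> at most ~4.0x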
Besides pure CPU-based computation, you should account for locking. Consider, for example, a bunch of transactions all updating a single row in a table. With such a workload you would likely see no IO and a lot of idle CPU, not because of internal InnoDB limits but because of logical serialization problems in your application, as the sketch below illustrates.
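A minimal illustration of such a hot spot (the database, table, and column names are hypothetical):

    # Every transaction increments the same row, so transactions queue on
    # that row's lock. Adding memory cannot relieve this serialization.
    mysql -e "UPDATE counters SET hits = hits + 1 WHERE id = 1" mydb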
So what if we have a very IO bound application without serialization issues (say, a reporting slave) showing 100% IO subsystem utilization at 5% CPU usage? This is where the true challenge begins, because MySQL has no tools to analyze the working set (we have per-query working set statistics in our patches, but that is not enough). We have a couple of ideas on how to do global working set profiling, but that will have to wait for now.
At this point I typically use my intuition to guess how much data the application actually touches, to get some ballpark figure, and often that is enough.
If you would like to be more scientific, there are a couple of other things you can do. First, you can test by scaling down the data. If you have data for, say, 500,000 users on a 16GB server, cut it down to half of that, and you will often come close to seeing the performance a 32GB server would have. You do, however, have to be careful and understand how the data is used in your application. If you have data for 10 years and load data for only 5 years to compare performance, you may get misleading results if reports are typically run over the last few months. Basically, in such an exercise your goal is to load data such that the working set is half the original, so the cache fit is similar to that of the larger system you're trying to compare against. One way to build such a half-sized copy is sketched below. Using this approach you should also be careful with your estimates and take the IO subsystem into account: even with the same cache hit ratio, more data and more load mean higher demands on IO subsystem performance.
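A sketch of building a half-sized data set (the database, table, and column names are assumptions, and you must apply a consistent predicate to every related table yourself):

    # Dump roughly half the users by sampling on user_id, then load the
    # result into a test instance. mysqldump applies the same --where
    # predicate to each table listed, so name related tables explicitly.
    mysqldump --where="user_id % 2 = 0" mydb users orders > half.sql
    mysql -h test-host mydb < half.sql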
By far the best method, if you can afford it, is simply to try: get the memory upgrade and see how it affects performance. Many vendors will let you take a memory upgrade, or a whole system, on trial and return it if it does not fit your needs. This approach works especially well if you have many slaves (or many shards), in which case you can see the performance or capacity improvements quite easily.
Entry posted by peter
Jesse Robbins
2008-08-07
Dan Kaminsky has posted the details of the widespread DNS vulnerability. Clarified Networks created a visualization of DNS patch deployment over the past month (legend: red = unpatched).
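If you want to check whether a resolver you rely on has picked up the patch, DNS-OARC's source port randomness tester is one option (the resolver address below is a placeholder):

    # Query DNS-OARC's port test through the resolver under test; the TXT
    # answer rates its source port randomness (GOOD / FAIR / POOR).
    dig +short porttest.dns-oarc.net TXT @192.0.2.53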