This article on scaling cookie baking recipes showed up in one of my keyword alerts. Lots of weird things show up in alerts, but I really like cookies and the parallels were just so delicious. Scaling in the cookie baking world is the process of multiplying your recipe many times over to produce much more dough for many more cookies. It’s the difference between making enough dough in one batch for two dozen cookies and making enough for 2000 cookies.
Hey, that's pretty close to the website notion. Yet, as any good cook knows, any scaled-up recipe must be tweaked a little, because things change at scale. Let's see what else we're supposed to do (quoted from the article):
With a little creativity you can make all sorts of interesting parallels between scaling websites and scaling cookies. I'll leave that to your ample imagination as mine has been crushed by a virtual sugar buzz. But my afternoon snack sized thought for the day is: Relax. Eat more cookies.
CloudCamp is an interesting unconference where early adopters of Cloud Computing technologies exchange ideas. With the rapid change occurring in the industry, we need a place where we can meet to share our experiences, challenges and solutions. At CloudCamp, you are encouraged to share your thoughts in several open discussions, as we strive for the advancement of Cloud Computing. End users, IT professionals and vendors are all encouraged to participate.
CloudCamp Silicon Valley 08 is scheduled for Tuesday, September 30, 2008 from 06:00 PM - 10:00 PM in Sun Microsystems' EBC Briefing Center
15 Network Circle
Menlo Park, CA 94025
CloudCamp follows an interactive, unscripted unconference format. You can propose your own session or you can attend a session proposed by someone else. Either way, you are encouraged to engage in the discussion and “Vote with your feet”, which means … “find another session if you don’t find the session helpful”. Pick and choose from the conversations; rant and rave, or sit back and watch.
At CloudCamp, we tend to discuss the following topics:
* Infrastructure as a service (Joyent, Amazon EC2, Nirvanix, etc.)
* Platform as a service (BungeeLabs, AppEngine, etc.)
* Software as a service (salesforce.com)
* Application / Data / Storage (development in the cloud)
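The seeds of change are being sown, but what model should Africa's "green revolution" pursue? Alan Beattie argues that poverty and diversity mean the continent needs a range of solutions.
Seeking change: on the world's poorest continent, a rapid transformation of one of its most complex problems is set to take place amid a food crisis.
The first "green revolution" transformed agriculture in Asia and Latin America, where new varieties and plentiful fertiliser lifted farmers out of the subsistence trap. Thirty years on, Africa is trying to follow suit.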
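Throughout the 1980s and 1990s, Africa's agricultural productivity failed to keep pace with population growth. Whatever growth there was came from more land being farmed rather than from higher yields. Farmers, agronomists and development experts say that new technology alone will not bring fundamental change, particularly in the short term. Improving markets and transport would help spread existing, underused technologies and so deliver gains more quickly.
However, opinion is divided over whether Africa should pursue an agribusiness model based on large commercial farms or concentrate on improving conditions for millions of smallholders. The difficulties of transforming African agriculture are also many. Some are geographical: Africa's soils and climates vary enormously, from the Mediterranean climate of the Maghreb to the tropical environments and temperate regions of South Africa. Crops grown and techniques used in one place often cannot simply be copied elsewhere.
So what are the prospects for improvement, and will it help feed nearly a billion people? The effort moved forward in 2006 with the founding of the Alliance for a Green Revolution in Africa (AGRA), a coalition of farmers, agribusinesses, scientists and research institutions, backed by $150m from the Rockefeller Foundation and the Gates Foundation. The Rockefeller Foundation also played a key role in funding the first green revolution.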
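Namanga Ngongi, president of the Nairobi-based AGRA, says that Asian farming systems are dominated by wheat and rice of similar varieties, whereas Africa's crops span a wider range, including cassava, sorghum, millet and maize. "One size will not fit all," Ngongi says.
Mpoko Bokanga, executive director of the African Agricultural Technology Foundation (AATF), a public-private research partnership also based in Nairobi, points out that conditions differ sharply even within a single country. "In western Kenya, in the northern part of the Rift Valley, there are very fertile areas where farms are highly productive and the commercial potential is well developed," Bokanga says. "Yet 50 km away there are forgotten areas where farms yield only a third or a quarter as much."
Although there are some large river systems and some regions have abundant rainfall, most places depend on the weather: less than 5 per cent of arable land in Africa is irrigated, compared with 40 per cent in South Asia.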
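Developing new technology takes time. In recent decades Africa's agricultural research capacity has received almost as little attention as its soils: government funding has been stretched thin and spending on basic science has been cut sharply. Because Africa's agronomic conditions differ from those elsewhere, it is hard to borrow scientific breakthroughs aimed at other markets.
For example, the AATF has a project to develop "water-efficient maize" that can withstand longer droughts, a trait that will become increasingly important if climate change makes rainfall erratic, as seems likely. The foundation will receive basic research provided free of charge by Monsanto, the US agribusiness group. The International Maize and Wheat Improvement Center (CIMMYT) in Mexico, a non-profit research institute that played a major role in the first green revolution, will then transfer it into high-yielding maize varieties that thrive in tropical environments. Those varieties will then be distributed to African seed companies without royalty payments. Even so, Bokanga says it will take five or six years before varieties are available for field trials.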
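Next, the green revolution will have to confront one of the most contentious issues in global agriculture: genetically modified (GM) crops. African countries have been slow to accept GM crops. South Africa is the only country to have approved a GM variety, although Burkina Faso may be about to approve a cotton variety that is already widely grown in India, and Egypt is considering GM maize.
In Africa, some of the aversion to GM among governments and activists is visceral. In 2002 Zambia, in the midst of a food crisis, refused GM grain offered as emergency aid for fear of contaminating local agriculture. The government even turned down help from the European Union, itself deeply sceptical of GM, to mill the grain before distribution so that it could not enter the farming system.
Bokanga says, however, that the opposition is overstated: farmers know little about the subject rather than being firmly against it. "Not all African governments are against accepting GM," he says. "There are a lot of opponents of biotechnology who make a lot of noise and capture the local media, and then the outside media conclude that farmers are opposed. Most farmers know nothing about GM."
To those concerned about environmental impact, he points out that herbicide-resistant GM maize allows soil-friendly "no-till farming", with no need to plough for weed control. But given the need to set up trial and safety protocols, a severe test of capacity for some African countries, wide acceptance of GM, especially for food crops, looks to be at least a decade away.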
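In the meantime, much can be done to extend the use of existing technology. In many African countries, particularly the poorer ones, fertiliser and better hybrid seeds exist but fail to reach farmers because of a combination of poverty, a weak private sector and poor marketing services.
In the 1970s, African governments supported agriculture through state marketing boards, which bought produce, provided all fertiliser and seed subsidies, held strategic grain reserves against food crises and set target prices through official intervention. Usually at the urging of the World Bank and other aid donors, most of these bodies, seen as wasteful, corruption-prone or downright harmful, were dismantled. (Similar institutions nonetheless survive in European and US agriculture.) Yet the vacuum left when the state withdrew from wholesaling was generally not filled by private enterprise, leaving farmers cut off from domestic and international markets.
AGRA, for example, will spend $40.5m to build a network of 10,000 dealers selling fertiliser and other farm inputs in rural areas. Some countries, such as Malawi in southern Africa, are experimenting with "market-smart" subsidies designed to complement and stimulate private enterprise rather than displace it.
But the effect of a green revolution, and of access to food, on poverty is not just a matter of raising yields. The form that growth takes, and the best way to share its benefits, are the subject of some debate.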
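Jon Maguire, a British investment fund manager, set up an "Invest in Africa" fund after visiting Malawi and finding villagers unable to bring in a harvest for lack of rain, even though they lived on the shores of Lake Malawi. "I asked them why they didn't invest in irrigation, and they told me the villages had had no money in circulation for three years," he says. Knowing nothing about farming himself, he raised $16m, hired local farm managers and bought $3.5m worth of sprinklers and other irrigation equipment.
He now runs a 2,500-acre farm, with a further 9,000 "outgrower" households contracted to supply it. They sell paprika and bird's-eye chillies to global markets, including Spain. "The Spanish were amazed at the quality of the paprika," he says. Next year his farm plans to buy what he claims will be Malawi's first combine harvester.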
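Maguire's solution is to build large, export-oriented farms with heavy investment in irrigation. "The whole basis of agricultural development used to be: how do we help smallholders?" he says. "In Africa, you will never solve that problem. You need small and medium-sized enterprises orbiting large farms that bring them into the global economy. Our outgrowers now share in the benefit of global prices and are actually gaining from the food price crisis."
Many agronomists disagree. Glenn Denning, director of the Millennium Development Goals Centre in Kenya, says that in every poor country like Malawi the first step is to raise yields and improve conditions for smallholders, who need to secure their own food supply before they diversify. "Smallholders have shown that, given the right inputs, they can compete," he says. "That is what the Asian green revolution was." Once reserves are built up, farmers can grow cash crops. Higher yields of staple grains from smallholders would also benefit the landless and the urban poor by increasing supply and moderating food prices.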
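Indeed, how new and existing technologies interact with African economies and societies is crucial, not only to whether a green revolution is technically feasible but also to whether it will bring broad benefits to Africa's poor. Andrew Dorward, an academic at the School of Oriental and African Studies (SOAS), University of London, says, for example, that adopting herbicide-resistant GM crops would be disastrous for many poor households: such crops need no hand weeding, and for many people weeding is a major source of income.
Left-wing critics of the green revolution idea do not doubt that Africa can raise productivity through new varieties and inputs, but they argue that the gains will go to big companies and wealthy farmers. Raj Patel, a researcher at the left-leaning US-based Institute for Food and Development Policy, recently told a congressional committee that projects such as AGRA, "though perhaps well-intentioned, are classic examples of irresponsible and unsustainable technology investment". He called instead for "further adoption and research of locally appropriate and democratically controlled agroecological methods".
In the green revolution debate, ask five different people and you will get seven different answers. Africa needs private agricultural suppliers, Africa needs water, Africa needs roads, Africa needs GM crops, Africa needs large farms, Africa needs small farms. The reality seems to be that, on a continent this diverse, Africa may need all of the above, and more besides.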
Source: www.ft.com
Copyright The Financial Times Limited 2008
Homepage image by World Resources Institute Staff
The problem of MySQL replication being unable to catch up is quite common in the MySQL world, and in fact I have already written about it. There are many aspects to managing MySQL replication lag, such as using proper hardware and configuring it properly. In this post I will look at just a couple of query design mistakes that are the low-hanging fruit when troubleshooting MySQL replication lag.
The first fact you absolutely need to remember is that MySQL replication is single threaded, which means that any long-running write query clogs the replication stream, and the small, fast updates that follow it in the MySQL binary log can't proceed. It is about more than just queries, either: if you're using explicit transactions, all updates from a transaction are buffered together and then dumped to the binary log as one big chunk, which can't be interleaved with any other query execution. So wrapping millions of simple updates in one transaction instead of running one large update is not going to help MySQL replication lag.
This brings us to rule number one: if you care about replication latency you must not have any long-running updates, whether single queries or transactions containing multiple update queries that add up to a long time. I would keep the maximum query length at about 1/5th of the maximum replication lag you're ready to tolerate. So if you want your replica to be no more than 1 minute behind, keep the longest update query to 10 seconds or so. This is of course a rule of thumb; depending on differences in master/slave configuration, load and concurrency, you may need a higher ratio or be able to allow slightly longer queries.
What should you do if you need to update a lot of rows? Use query chopping: run UPDATE/DELETE with LIMIT in a loop, cap the number of values per batch in multiple-row INSERT statements, or fetch the data you're planning to update/delete and issue multiple queries to modify it (see example below).
This brings us to yet another rule for smart replication: do not make the slave do more work than it needs to. It is already crippled by having to do everything in a single thread, so do not make it even harder. If considerable effort is needed to select rows for modification, split it out into separate SELECT and UPDATE queries. In that case the slave will only need to run the UPDATE.
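A minimal sketch of the LIMIT-in-a-loop form of query chopping, written in Python. The helper name `chopped_execute` and the `run_statement` callback are assumptions made for illustration; the callback is expected to run one LIMIT-ed statement, commit it, and return the number of affected rows.

```python
import time


def chopped_execute(run_statement, batch_size=1000, pause_seconds=0.0):
    """Run a LIMIT-ed write statement repeatedly until nothing is left.

    run_statement(batch_size) is a caller-supplied (hypothetical) function
    that executes one batch, commits it, and returns the affected row count.
    """
    total = 0
    while True:
        affected = run_statement(batch_size)
        total += affected
        if affected < batch_size:
            # A short batch means there are no more matching rows.
            break
        if pause_seconds:
            # Optional breathing room between batches.
            time.sleep(pause_seconds)
    return total
```

With a DB-API cursor, `run_statement` could execute something like `DELETE FROM old_log WHERE created < %s LIMIT %s` (a hypothetical table), commit, and return `cursor.rowcount`. Committing each batch separately matters, because a single transaction is written to the binary log as one chunk.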
Example:
This query will perform a full table scan in MySQL 5.0 (even if there are no spam posts), which will load the slave significantly. You can replace it with:
If many ids could be matched in the first place, you should also use query chopping and run the update in chunks if the application allows it.
In MySQL 5.1 with row-level replication the selection process will not run on the slave, but it will not do the chopping for you.
In general, this trick works well not only for full-table-scan updates but for any case where many more rows are examined than modified.
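The select-then-update pattern can be sketched as follows. This is an illustration under stated assumptions, not the original example: the `posts` table, its `body` and `spam` columns, and the helper name `mark_spam` are hypothetical, and Python's built-in `sqlite3` module is used only so the sketch runs anywhere. A MySQL driver would typically use `%s` placeholders instead of `?`.

```python
import sqlite3


def mark_spam(conn, pattern, chunk_size=500):
    """Find matching ids with a SELECT, then UPDATE by primary key in chunks."""
    # The expensive scan happens here, on the connection we query directly.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM posts WHERE body LIKE ?", (pattern,))]
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        # The only write statement is a cheap primary-key update.
        conn.execute(
            f"UPDATE posts SET spam = 1 WHERE id IN ({placeholders})", chunk)
        conn.commit()  # one short transaction per chunk
    return len(ids)
```

With statement-based replication, only the primary-key UPDATE statements would reach the slave, so the scan is paid once rather than once per server.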
The next common mistake is using INSERT ... SELECT, which is similar to what I just described but can be much worse, as the SELECT may end up being an extremely complicated query. It is best to avoid INSERT ... SELECT going through replication in 5.0 for many reasons (locking, long query time, wasted execution on the slave). Piping data through the application is the best solution in many cases and is quite easy: it is trivial to write a function that takes a SELECT query and the table in which to store its result set, and use it in your application whenever you need this functionality.
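One way such a function might look is sketched below. The name `copy_select_into` is hypothetical, `sqlite3` is used only to keep the sketch runnable, and the whole result set is read into memory first, which would need rethinking for very large results.

```python
import sqlite3


def copy_select_into(conn, select_sql, target_table, batch_size=1000):
    """Run select_sql, then insert its rows into target_table in batches.

    target_table is interpolated into the SQL text, so it must be a
    trusted identifier, never user input.
    """
    cursor = conn.execute(select_sql)
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()  # materialised in memory; fine for a sketch
    insert_sql = "INSERT INTO {} ({}) VALUES ({})".format(
        target_table,
        ", ".join(columns),
        ", ".join("?" for _ in columns),
    )
    for start in range(0, len(rows), batch_size):
        conn.executemany(insert_sql, rows[start:start + batch_size])
        conn.commit()  # committing per batch keeps each transaction short
    return len(rows)
```

The replicated statements are then plain INSERTs with literal values, so the complicated SELECT is never re-executed on the slave.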
Finally, you should not overload your replication. Quite typically I see replication lagging when batch jobs are running. These can load the master significantly during their run time and make it impossible for the slave to run the same load through a single thread. The solution in many cases is simply to space the work out and slow down your batch job (for example by adding sleep calls) to ensure there is enough breathing room for the replication thread.
You can also have controlled execution of batch jobs: they check slave lag every so often and pause if it becomes too large. This is a slightly more complicated approach, but it saves you from running around adjusting your sleep behavior to keep progress fast enough while keeping replication from lagging.
In many cases of bad replication lag I've seen, simply following these rules would have avoided a lot of problems and often saved massive hardware purchases or development efforts based on the assumption that MySQL replication can't possibly keep up any more.
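A rough sketch of such a lag-aware batch runner follows. The function name `run_throttled` and its parameters are assumptions; `get_lag_seconds` is a caller-supplied function, which on MySQL would typically read `Seconds_Behind_Master` from `SHOW SLAVE STATUS` on the slave.

```python
import time


def run_throttled(batches, get_lag_seconds, max_lag=30,
                  poll_interval=5, sleep=time.sleep):
    """Run each batch callable, pausing while slave lag exceeds max_lag.

    get_lag_seconds() returns the current lag in seconds, or None when
    the lag is unknown (for example, replication is not running).
    Returns how many times the job had to wait.
    """
    waits = 0
    for batch in batches:
        while True:
            lag = get_lag_seconds()
            # Unknown lag is treated as "too far behind" in this sketch.
            if lag is not None and lag <= max_lag:
                break
            sleep(poll_interval)
            waits += 1
        batch()
    return waits
```

Because unknown lag blocks forever here, a real job would probably also want a timeout or an alert when replication is stopped.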
Entry posted by peter | No comment
You want to have a scalable website. You want a website that can handle traffic spikes (think of landing on the front page of Digg, Slashdot, Reddit, TechCrunch or another very popular website).
Regular hosting companies (especially shared hosting) can offer only so much. Their servers usually get crushed under the load in a short time.
But there is hope. A new breed of hosting companies emerged recently. A new breed which can offer you the scalability you need at a fraction of the cost.
Welcome to the world of “cloud computing!” (or “grid computing” or “utility computing”, which are terms for the same thing).
Here's a website which compiled a list of cloud computing hosting companies (with short descriptions, prices and customer lists for each of them).
Read the entire article about Cloud computing, grid computing, utility computing list at MyTestBox.com - web software reviews, news, tips & tricks.
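Because the last event was so well received, I am pleased to announce that we will be holding another one (translator's note: the working language of this event is English). You can browse and reply to this post to raise the webmaster-related questions you care about most. Let's meet there!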
How do we scale datacenters? Should we build a few mammoth million-machine datacenters or many smaller micro datacenters? Intuitively we usually go with a bigger-is-better, economies-of-scale type of argument, but it may not be so. What works for Walmart may not work for White Box World. Mega datacenters may actually exhibit diseconomies of scale. It may be better to run applications over many distributed micro datacenters instead of one large one.
This paper by Ken Church, Albert Greenberg, and James Hamilton, all from Microsoft, takes a look at the different issues and concludes:
Putting it all together, the micro model offers a design point with attractive performance, reliability, scale and cost. Given how much the industry is currently investing in the mega model, the industry would do well to consider the micro alternative.