Can biofuels save the climate? | 05 Feb 2009

23:52 A New Google Sitemap Generator » Inside AdSense-中文

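We often hear questions like this from webmasters: how can I get Google to index my site faster and better, make it easier for users to find relevant content through search engines, and increase my site's traffic? In fact, making good use of Sitemap tools to create sitemap files that search engines can crawl easily achieves exactly this with far less effort. Below is a very useful introductory article; we hope you find it helpful.

Reposted from the Google Chinese Webmaster Blog

Posted by: John Mueller, Webmaster Trends Analyst

Original: A new Google Sitemap Generator for your website
Published: Tuesday, January 13, 2009, 5:12 AM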

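We launched the Python Sitemap generator in June 2005, more than three years ago. Since then, many enthusiasts have developed third-party Sitemap generators of their own, all of which help webmasters create better Sitemap files. Most existing Sitemap generators either crawl the website in question or list the files on a server. Ours is different: Google Sitemap Generator can discover the URLs on your website in several ways, and it lets you automatically create and maintain several different types of Sitemap files.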

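About Google Sitemap Generator

The new Google Sitemap Generator is fully open source. By analyzing your web server's traffic, its log files, and the files present on the server, Google Sitemap Generator can discover new URLs and recently changed URLs. By combining these methods, it can find these URLs quickly and gather the corresponding metadata, so that your sitemap files can take effect as soon as possible. Once Google Sitemap Generator has collected these URLs, it can generate the following Sitemap files for you: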

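XML Sitemaps for web search, based on the sitemaps.org standard

Mobile Sitemaps, designed for websites aimed at mobile devices

Code Search Sitemaps, designed for source code you already make available to users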

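In addition, Google Sitemap Generator can notify Google Blog Search that your website has new or updated URLs. You can also put the URL of your Sitemap file in your robots.txt file and notify other search engines that support the sitemaps.org standard.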
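For readers who have not seen one, here is a minimal sketch of what a sitemaps.org XML Sitemap and the matching robots.txt line look like. It is not how Google Sitemap Generator itself works; the `build_sitemap` and `robots_line` helpers and the example.com URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # lastmod is optional; W3C date format such as YYYY-MM-DD.
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


def robots_line(sitemap_url):
    """Return the robots.txt directive that advertises a Sitemap file."""
    return "Sitemap: " + sitemap_url


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2009-01-13"),
    ("http://www.example.com/about.html", None),
])
robots_txt_line = robots_line("http://www.example.com/sitemap.xml")

print(sitemap_xml)
print(robots_txt_line)
```

A real Sitemap file would normally also start with an XML declaration and may carry the optional changefreq and priority elements, which are left out here to keep the sketch short.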

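Sending the URLs to the right Sitemap is made possible by a web-based administration console, which lets you manage your website easily with a variety of features while maintaining a high level of security.

Get started now

Google Sitemap Generator is a server-side plug-in that can be installed on Linux/Apache servers as well as on Microsoft IIS Windows servers. As with other server-side plug-ins, you need administrative access to the server to install it. You can find more information about installation in the Google Sitemap Generator help documentation (in English).

We are excited to release the open source Google Sitemap Generator, and we hope it will encourage more hosting providers to include this tool, or similar Sitemap tools, in their hosting packages!

Do you have other related questions? Please visit our Google Sitemap Generator support forum (in English), or ask in our Webmaster Help Forum.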

22:10 Product: HAProxy - The Reliable, High Performance TCP/HTTP Load Balancer » High Scalability - Building bigger, faster, more reliable websites.

Update: Load Balancing in Amazon EC2 with HAProxy. Grig Gheorghiu writes a nice post on HAProxy functionality and configuration: Emulating virtual servers, Logging, SSL, Load balancing algorithms, Session persistence with cookies, Server health checks, etc.

Adapted From the website:

HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware. Its mode of operation makes its integration into existing architectures very easy and riskless, while still offering the possibility not to expose fragile web servers to the Net.

19:55 Disaster: LVM Performance in Snapshot Mode » MySQL Performance Blog

In many cases I speculate about how things should work based on what they do, and in a number of cases this has led me to form too good an impression of a technology and then run into a completely unanticipated bug or performance bottleneck. This is exactly the case with LVM.

A number of customers have reported that LVM incurs a very high penalty when snapshots are enabled (let alone if you try to run a backup at the same time), so I decided to look into it.

I used the sysbench fileio test, as our concern in this case is general IO performance; it is not something MySQL-specific.

I tested on RHEL5 with a RAID10 volume of 6 hard drives (BBU disabled), though the problem can be seen on a variety of other systems too (I just do not have comparable numbers for all of them).

O_DIRECT RUN

/tmp/sysbench --test=fileio --num-threads=1 --init-rng=on --max-time=60 --file-num=1 --file-total-size=8G --file-extra-flags=direct --file-test-mode=rndwr run

The performance without an LVM snapshot was 159 io/sec, which is about what you would expect for a single thread and no BBU. With an LVM snapshot enabled the performance was 25 io/sec, which is about 6 times lower!

I honestly do not understand what LVM could be doing to make things so slow. The COW should require 1 read and 2 writes, or maybe 3 writes (if we assume metadata is updated each time), but how could it ever reach 6 times?
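To make the expectation concrete, here is a back-of-the-envelope sketch. It assumes each physical IO costs about the same and that the copy-on-write IOs are fully serialized; both are simplifications rather than anything measured, and `expected_iops` is a made-up helper.

```python
# Figures quoted above for the O_DIRECT random-write run.
baseline_iops = 159.0   # no snapshot
snapshot_iops = 25.0    # snapshot enabled


def expected_iops(baseline, ios_per_write):
    """Throughput if every logical write costs `ios_per_write` serialized IOs.

    Simplifying assumption: all IOs are equally expensive.
    """
    return baseline / ios_per_write


observed_slowdown = baseline_iops / snapshot_iops

# 1 read + 2 writes = 3 IOs; 1 read + 3 writes = 4 IOs.
for label, ios in [("1 read + 2 writes", 3), ("1 read + 3 writes", 4)]:
    print(f"{label}: expect roughly {expected_iops(baseline_iops, ios):.1f} io/sec")
print(f"observed: {snapshot_iops:.1f} io/sec, a {observed_slowdown:.2f}x slowdown")
```

Under this simple model the snapshot run should land at a third or a quarter of the baseline throughput, well above the 25 io/sec that was measured.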

It looks like it is time to dig further into LVM internals. And, well, maybe I'm missing something here: I do not have good insight into what is really happening inside, just how it looks to the user.

Interestingly enough, vmstat confirms the 1 read and 2 writes theory:

0 1 0 24132256 73252 8248788 0 0 259 590 1271 531 0 0 92 8 0
0 1 0 24135976 73284 8244964 0 0 413 938 1427 761 0 0 87 12 0
0 1 0 24139572 73308 8241300 0 0 399 905 1412 736 0 0 87 12 0
0 1 0 24143416 73352 8237396 0 0 409 927 1416 739 0 0 87 12 0

As you can see there are about twice as many writes as reads.
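The ratio can be checked directly from the sample. The sketch below assumes the standard vmstat column order (the header row is not shown above), in which `bi` is blocks read in from the device and `bo` is blocks written out; `write_read_ratios` is a made-up helper.

```python
# The four vmstat lines shown above.
VMSTAT_SAMPLE = """\
0 1 0 24132256 73252 8248788 0 0 259 590 1271 531 0 0 92 8 0
0 1 0 24135976 73284 8244964 0 0 413 938 1427 761 0 0 87 12 0
0 1 0 24139572 73308 8241300 0 0 399 905 1412 736 0 0 87 12 0
0 1 0 24143416 73352 8237396 0 0 409 927 1416 739 0 0 87 12 0
"""

# Assumed column order: the usual 17-column vmstat layout.
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def write_read_ratios(text):
    """Return blocks-out / blocks-in for each vmstat sample line."""
    ratios = []
    for line in text.strip().splitlines():
        row = dict(zip(COLUMNS, map(int, line.split())))
        ratios.append(row["bo"] / row["bi"])
    return ratios


ratios = write_read_ratios(VMSTAT_SAMPLE)
for r in ratios:
    print(f"writes per read: {r:.2f}")
```

Every sample comes out a little above 2 writes per read, which fits 1 read and 2 writes per request plus some extra write traffic.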

SMALL FILE RUN

Then I decided to check how things improve when writes hit the same place over and over again. My assumption was that the overhead would gradually go to zero as all pages get copied, so that writes can proceed normally.

/tmp/sysbench --test=fileio --num-threads=1 --init-rng=on --max-time=60 --file-num=1 --file-total-size=64M --file-extra-flags=direct --file-test-mode=rndwr run

With this run I got approximately 200 io/sec without an LVM snapshot enabled, while with the snapshot I got:

33.20 Requests/sec executed
46.08 Requests/sec executed
70.79 Requests/sec executed
123.68 Requests/sec executed
157.66 Requests/sec executed
163.50 Requests/sec executed

(All were 60 second runs)

As you can see, performance does improve, though significant overhead remains. The progress is much slower than I would have anticipated. Before the last run, about 400MB in total had been written to the file (random writes), which is about 6x the file size, and yet we still saw some 20% regression compared to the run with no snapshot.
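Those figures can be reproduced from the per-run request rates. The sketch below assumes each request writes 16KB, which is sysbench's default file block size; the command line above does not set it, so treat that value as an assumption.

```python
FILE_SIZE_MB = 64
BLOCK_KB = 16        # assumption: sysbench default --file-block-size
RUN_SECONDS = 60
BASELINE = 200.0     # approximate io/sec without a snapshot

# Requests/sec for the six consecutive 60-second runs with the snapshot enabled.
runs = [33.20, 46.08, 70.79, 123.68, 157.66, 163.50]

# Data written by the first five runs, i.e. before the last run started.
written_before_last_mb = sum(runs[:-1]) * RUN_SECONDS * BLOCK_KB / 1024
rewrites = written_before_last_mb / FILE_SIZE_MB
regression = 1 - runs[-1] / BASELINE

print(f"written before last run: {written_before_last_mb:.0f} MB")
print(f"that is {rewrites:.1f}x the file size")
print(f"regression in last run: {regression:.1%}")
```

With that block size the first five runs add up to a little over 400MB, which matches the figure quoted above.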

NO O_DIRECT RUNS

As you might know, O_DIRECT often takes quite a special path in the Linux kernel, so I did a couple of other runs. The first run syncs after each request instead of using O_DIRECT:

/tmp/sysbench --test=fileio --num-threads=1 --init-rng=on --max-time=60 --file-num=1 --file-total-size=8G --file-fsync-freq=1 --file-test-mode=rndwr run

This run gave 162 io/sec without snapshot and 32 io/sec with snapshot. The numbers are a bit better than with O_DIRECT but the gap is still astonishing.

The final run I did emulates how InnoDB would do buffer pool flushes, calling fsync every 100 writes rather than after each request:

/tmp/sysbench --test=fileio --num-threads=1 --init-rng=on --max-time=60 --file-num=1 --file-total-size=8G --max-requests=100000000 --file-fsync-freq=100 --file-test-mode=rndwr run

This gets some 740 req/sec without the snapshot and 240 req/sec with the snapshot. In this case we get close to the expected 3x difference.

The numbers are much higher in this case because, even though we have one thread, the OS is able to submit multiple requests at the same time (and the drives can execute them). I expect that with a BBU in this system we would see similar results for the other runs.
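Putting the three 8G runs side by side, using only the numbers reported above (the run labels are just shorthand):

```python
# (without snapshot, with snapshot) in requests per second.
results = {
    "O_DIRECT": (159, 25),
    "fsync after every write": (162, 32),
    "fsync every 100 writes": (740, 240),
}

# Slowdown factor = throughput without snapshot / throughput with snapshot.
slowdowns = {name: base / snap for name, (base, snap) in results.items()}

for name, factor in slowdowns.items():
    print(f"{name:<26} {factor:.2f}x slower with snapshot")
```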

So creating an LVM snapshot can indeed cause tremendous overhead; in the benchmarks I've done it ranges from 3x to 6x. It is worth noting, however, that this is the worst-case scenario. Many workloads write to the same locations over and over again (e.g. InnoDB's circular logs), and in that case the overhead will quickly shrink. It still takes some time, though, and I would expect any system doing writes to experience a "performance shock" when an LVM snapshot is created: capacity is greatly reduced at first and then improves as fewer pages actually need to be copied on write.

Because of this behavior, you may consider not starting the backup immediately after the LVM snapshot is created, but letting it settle a bit before adding the further overhead of copying data.

The question I had was how LVM backups could work for so many users. The reality is that for many applications write speed is not so critical, so they can sustain this performance drop, in particular during slow times (which often have 2x-3x lower performance requirements).

So we'll do some research around LVM, and I hope to do more benchmarks. For example, I'm very curious how good read speed from a snapshot is (in particular for sequential file reads).

If you've done some LVM performance tests yourself, or will be repeating mine (parameters are posted above), please let me know.

Entry posted by peter | No comment

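18:23 A New Year's Promise » 妮妮

Starting last week

Mom and I agreed

that from now on we will go with her to the Summer Palace once a week

walk a full circle around Kunming Lake

breathe fresh air and get some exercise

watch light and shadow come in and go out again, by the lake, on the bridges, under the trees

and then, with two rosy patches on our cheeks and covered in sweat,

go home

The sun had already gone down and these are quick phone shots, so please make do with them. The ice at the Summer Palace has not melted yet, and many visitors from out of town were delightedly stomping around on it. A little farther off, a grandfather and grandchild were flying a kite on the ice, and farther still you could see the pagoda on Yuquan Hill, faint and mysterious.

Xiaobao picked a dead branch off the ice to use as a walking stick, and his childlike mischief came out in full. In his padded hooded coat he poked here and whacked there, and by the time I turned to look at him again he was already playing ice hockey on the ice. With Xiaobao as our driver, we can fall asleep as soon as we get in the car, which is very considerate.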

14:47 Announcing Percona Performance Conference 2009 on April 22 & 23 » MySQL Performance Blog

All of us here at Percona warmly invite you to Percona Performance Conference 2009 on April 22 & 23, 2009 in the Hyatt Regency in Santa Clara, California. The theme for the conference is Performance Is Everything. This conference is about application performance overall, not just databases. Attendance is free of charge for everyone. Experts in many types of technologies -- databases, search, cloud computing, massively parallel computing, client-side optimization -- will present their real-life experience.

In order to forestall speculations and prevent people from jumping to unwarranted negative conclusions, I'd like to take a moment and explain the story behind this event. Some of you have noticed that there were no sessions from Percona this year on the schedule for the 2009 MySQL Conference and Expo. This is not because we didn't propose any sessions this year; Peter, Vadim, and the rest of us at Percona submitted over a dozen session proposals, which were initially declined. As a result, we conceived and organized another conference. We reserved a hotel near the MySQL conference, planned the event, invited speakers, etc. Then O'Reilly and MySQL graciously and unexpectedly proposed that we bring our conference into the Hyatt and present it in the same venue with the MySQL conference (official announcement from Kaj). They also accepted three of our original session proposals into the MySQL conference. We appreciated this fair and generous gesture and accepted their offer. We will be participating in all aspects of the conference, including the community events such as the dot-org pavilion and the MySQL Camp that Sheeri Kritzer Cabral is organizing.

That's the back story -- now on to the Percona Performance Conference! This is not "another MySQL conference." It's a performance conference. It's true that we are among the world's foremost experts on MySQL, InnoDB, and XtraDB, and it's true that the database is usually harder to scale than other components in an application. But ultimately you don't only care that your database is fast -- you care that your application is fast as a whole. That's why we'll have sessions from experts on many aspects of high-performance applications, not just the database component. And we'll have experts on different databases too.

This is a technical event that's free to attend even if you're not attending the MySQL conference. We expect to see a lot of people who live locally, and we know that the easy accessibility for attendees of the MySQL Conference and Expo will add value to their trip to Santa Clara.

We will not be the only ones speaking at Percona Performance Conference; other experts will join us in making presentations too. You'll be able to hear sessions from such luminaries as Monty Widenius (creator of MySQL), Brian Aker (creator of Drizzle), Andrew Aksyonoff (creator of Sphinx), Mark Callaghan (database guru at Google), and many others. If you'd like to propose a session, please do so through the Percona Performance Conference website.

The Percona team looks forward to greeting you face to face this April in Santa Clara.

Entry posted by Baron Schwartz | One comment

13:18 The Canonical Cloud Architecture » High Scalability - Building bigger, faster, more reliable websites.

I'm writing this post as a sort of penance. My sin was getting involved in another multi-threaded mess of a program that was rife with strange pauses and unexpected errors. I really should have known better. But when APIs choose to make callbacks from some mystery thread pool it's hard to keep things straight. I eventually sobered up and posted all events to a queue so I could make sure the program would work correctly. Doh. I may never know why the .Net console output stopped working, but I'll live with it.

And that reminded me that I've been meaning to write a post on the standard Cloud Architecture. I've tried to hit all the common architectures at one time or another, but there have been some excellent sources lately on structuring programs in a cloud that people may "know" in the same way I knew what not to do, but when the code hits the editor those thoughts may have hidden like a kid next to a broken cookie jar.

The easiest way to create a scalable service is to compose the service from other scalable services. This is how Google AppEngine works and is largely how AWS works as well (EC2, S3, SQS, SimpleDB, etc), though AWS also functions as a blank canvas on which you can draw your own designs.

The canonical cloud architecture that has evolved revolves around dynamically scalable CPUs consuming asynchronous, persistently queued events. We talked about this idea already in Flickr - Do the Essential Work Up-front and Queue the Rest. The cloud is just another way of implementing the same idea.
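As a rough, single-process illustration of that pattern, here is a sketch in which a pool of workers pulls events off a shared queue. It is only an analogy: the in-memory `queue.Queue` stands in for a persistent queue service, the threads stand in for dynamically scaled machines, and `run_workers` is a made-up helper.

```python
import queue
import threading


def run_workers(events, handler, num_workers=3):
    """Process events with a pool of workers pulling from one shared queue."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            event = q.get()
            if event is None:          # sentinel: no more work for this worker
                q.task_done()
                return
            out = handler(event)
            with lock:                 # results list is shared between threads
                results.append(out)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for event in events:               # producers only enqueue, never call the handler
        q.put(event)
    for _ in threads:                  # one sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
    return results


processed = run_workers(range(10), lambda n: n * n)
print(sorted(processed))
```

Because producers only enqueue and never call the handler directly, changing `num_workers` changes throughput without touching the producer side, which is the property the architecture relies on.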

Amazon suggests a few applications of the Cloud Architecture as:

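09:14 Can biofuels save the climate? » 中外对话新鲜出炉

Biofuel enthusiasts, including the new US president, say that crops can provide cleaner, greener energy than fossil fuels. Yang Fangyi urges caution.

The ambitious Obama has put forward a series of plans to tackle climate change, including vigorous development of biofuels. But although biofuels can replace some fossil fuels, the key to solving global climate change remains improving energy efficiency and reducing fossil fuel use. Biofuels cannot play a major role, and they may even have negative effects on the climate and the world economy.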

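Biofuel technology is still immature

Put simply, biofuel is biomass converted into fuel. First-generation bioenergy has already entered the market on a large scale, while second-generation bioenergy technology is still under development.

First-generation technology converts food crops such as maize, sugarcane, and sorghum into ethanol, or processes oil crops such as soybeans, oil palm, and rapeseed into biodiesel. Jatropha (also known as physic nut) is now also widely cultivated for biofuel. Second-generation technology can convert lignocellulose directly into fuel and has better market prospects than first-generation biofuels, but it has not yet seen a breakthrough.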

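First-generation biofuels are not economical

In 2007, ethanol production reached 53 billion liters and biodiesel production reached 10 billion liters. But the cost of growing biofuel crops is too high, the vast majority of biofuels are not profitable, and they cannot compete with oil. In July 2008, when global crude oil prices hit a record high, the price of unleaded gasoline in the United States reached a record US$1.08 per liter, while corn ethanol cost close to US$1 per liter to produce, leaving it unable to compete with gasoline on price. After the financial crisis, gasoline prices fell to nearly US$0.44 per liter. Because biofuel costs are so high, countries have had to pour large subsidies into bioenergy, and at present only Brazil's sugarcane ethanol is profitable.

What worries people most is the challenge biofuels pose to world food security and agricultural trade. Large areas of farmland are being used for biomass energy, reducing the food supply and directly driving up world food prices. This is not the only factor behind rising global food prices, but it is a key one. Many less developed countries cannot afford to buy food and their poor face a serious threat to survival, so much so that some politicians have declared: "Turning food into fuel when there is not yet enough to meet humanity's need for food is the greatest crime."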

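First-generation biofuels do little to reduce carbon dioxide emissions

In theory, biofuels come from biomass, whose energy comes from photosynthesis; carbon cycles between the growing of the biomass and its release when the fuel is burned, which makes them carbon neutral. But the reality is clearly less rosy. Because growing, processing, and transporting biofuel crops all consume energy, biofuels cannot achieve a full carbon balance. A 2008 report from the Organisation for Economic Co-operation and Development showed that Brazil's sugarcane ethanol has the best climate effect, cutting greenhouse gas emissions by nearly 80% relative to fossil fuels, whereas US maize achieves a reduction of only 30%. The environmental benefits of first-generation bioenergy have been overestimated.

And even the assessment report of the UN Food and Agriculture Organization rests on an assumption: that growing these crops will not cause new greenhouse gas emissions. Disappointingly, this assumption does not hold. Tropical forests, an important carbon store for the planet, are suffering unprecedented damage because of bioenergy development. Brazil's Amazon rainforest is shrinking, in part because rainforest is being cleared to grow soybeans and sugarcane. Indonesia's rainforest is being replaced by oil palm. In new research published in the journal Nature, the author, 约瑟 (Joseph), used a "carbon debt" approach to analyze the greenhouse gas emissions from clearing forest to plant oil crops. The results show that the emissions caused by clearing forest for oil crops far exceed the emissions that biofuels save relative to fossil fuels. What worries environmental groups even more is the alarming rate of biodiversity loss caused by deforestation: the tropical rainforests of Brazil and Southeast Asia, the most biodiverse in the world, are suffering enormous damage.

Like food crops, biofuel crops still cause soil erosion and soil and water pollution. Land degradation from heavy fertilizer use is also a serious problem, and the nitrous oxide released by fertilizer is many times more powerful a greenhouse gas than carbon dioxide, so that biofuels have ended up becoming the cause of numerous environmental problems.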

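Bioenergy cannot save the climate

The environmental benefits of first-generation bioenergy are not as good as imagined, and the risks are high. After years of debate, the EU has begun to change its bioenergy policy: the European Parliament is trying to cut the target for bioenergy's share of energy consumption from 10% to 6%, and will set up a sustainability certification system for bioenergy to avoid deforestation and other environmental problems caused by biomass energy development. With Obama's election, the United States has begun to emphasize bioenergy development. The advanced bioenergy Obama refers to is second-generation bioenergy, and whether it can deliver benefits remains unclear. So we should not make biofuel development the focus of efforts to reduce greenhouse gas emissions. Genuinely improving energy efficiency and developing green energy such as solar and wind power are the key to solving the greenhouse gas problem.

Yang Fangyi was a 2008 master's student at the Institute of Tropical and International Forestry, University of Göttingen, Germany, and worked on biodiversity conservation from 2004 to 2008 with Conservation International's China Program (CI) and the Shan Shui Conservation Center.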
