Introduction to Google Ranking (The Official Google Blog) » Che, Dong's shared items in Google Reader
Posted by Amit Singhal, Google Fellow

In May, Udi Manber introduced our search quality group, the group responsible for the ranking of search results. He introduced various teams within "Quality" (as we like to call the group) including Core Ranking, International Search, User Interfaces, Evaluation, Webspam, and other teams. In this post, I want to tell you more about one of these: the Core Ranking team.

Let me introduce myself. My name is Amit Singhal. I'm a Google Fellow in charge of the ranking team at Google. I've worked in the field of search for the past eighteen years, having been introduced to search in 1990 as a graduate student in computer science. In the academic world, the field of search is known as Information Retrieval (or IR). After spending a decade as an IR researcher, I came to Google in 2000, and have worked on Google ranking ever since.

Google ranking is a collection of algorithms used to find the most relevant documents for a user query. We do this for hundreds of millions of queries a day, from a collection of billions and billions of pages. These algorithms are run for every query entered into most of Google's search services. While our web search is the most used Google search service and the most widely known, the same ranking algorithms are also used - with some modifications - for other Google search services, including Images, News, YouTube, Maps, Product Search, Book Search, and more.

The most common question I get asked about Google's ranking is "how do you do it?" Of course, there is a lot that goes into building a state-of-the-art ranking system like ours, and I will delve deeper into the technology behind it in a later post. Today, I would like to briefly share the philosophies behind Google ranking:
1) Best locally relevant results served globally.
2) Keep it simple.
3) No manual intervention.
The first one is obvious. Given our passion for search, we absolutely want to make sure that every user query gets the most relevant results. We often call this the "no query left behind" principle. Whenever we return less than ideal results for any query in any language in any country - and we do (search is by no means a solved problem) - we use that as an inspiration for future improvements.

The second principle seems obvious. Isn't it the desire of all system architects to keep their systems simple? Well, as search systems go, given the wide variety of user queries we have to respond to in multiple languages, it is easy to go down the path where more and more complexity creeps into the system to serve the next incremental fraction of the queries. We work very hard to keep our system simple without compromising on the quality of results. This is an ongoing effort, and a worthy one. We make about ten ranking changes every week and simplicity is a big consideration in launching every change. Our engineers understand exactly why a page was ranked the way it was for a given query. This simple, understandable system has allowed us to innovate quickly, and it shows. The "keep it simple" philosophy has served us well.

No discussion of Google's ranking would be complete without asking the common - but misguided! :) - question: "Does Google manually edit its results?" Let me just answer that with our third philosophy: no manual intervention. In our view, the web is built by people. You are the ones creating pages and linking to pages. We are using all this human contribution through our algorithms. The final ordering of the results is decided by our algorithms using the contributions of the greater Internet community, not manually by us. We believe that the subjective judgment of any individual is, well ... subjective, and information distilled by our algorithms from the vast amount of human knowledge encoded in the web pages and their links is better than individual subjectivity.

The second reason we have a principle against manually adjusting our results is that often a broken query is just a symptom of a potential improvement to be made to our ranking algorithm. Improving the underlying algorithm not only improves that one query, it improves an entire class of queries, and often for all languages. I should add, however, that there are clear written policies for websites recommended by Google, and we do take action on sites that are in violation of our policies or for a small number of other reasons (e.g. legal requirements, child porn, viruses/malware, etc).

Stay tuned for my followup post, where I will discuss in detail the technologies behind our ranking and show examples of several state-of-the-art ranking techniques in action. Let me just conclude this post by saying, our passion for search is stronger than ever - and as a search researcher, I have the best job in the world :-).
23:48 The EU and China in the Global Response to Climate Change » 中外对话 (chinadialogue): latest articles

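China-EU relations are changing as bilateral trade grows and the politics of global warming heats up. Here Wu Changhua discusses the challenges that lie ahead for the bilateral relationship.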

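As one of the most closely watched issues in the world, global warming is first of all a global environmental problem and secondly a development problem. Since 2007, climate change has drawn even more attention. The successive release of the UN climate change panel's fourth assessment reports, a series of political and economic consultations and discussions including the G8 summit and the annual Davos meeting, and the progress individual countries have made on energy and climate policy have all created a favorable atmosphere for international climate negotiations. Around these negotiations, which bear on the future of humankind, parties worldwide, including the EU and China, have taken active steps. The EU is an active driver of the negotiations and of the global movement to address climate change, while China, as the world's largest developing country, also plays a crucial role in controlling future global warming. The interaction between the two is without doubt one of the things to watch in humanity's response to climate change.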

Why has the EU become the vanguard?

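On climate change, the EU has always held a leading position in international negotiations. During the Kyoto Protocol negotiations, the EU actively advocated limiting greenhouse gas emissions and took on emission reduction obligations of its own accord, while pressing hard for the participation of developed countries, including the United States, and of developing countries. This helped bring into force the Kyoto Protocol, the first legally binding agreement with specific greenhouse gas reduction commitments. After negotiations on the Protocol's second commitment period opened, the EU continued to play the leading role on the international stage on this issue, setting ambitious and aggressive emission reduction targets and implementing them through each member state's energy, environmental and economic strategies. At the same time, the EU has continued to push the global climate agenda, using every diplomatic channel and means to communicate, lobby and win others over, so that its aggressive climate strategy can be accepted by the international community and elevated into an international regime for environmental protection.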

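The EU's stance is no accident. As mature industrial economies, the EU countries have seen their economic vitality and competitive advantage weaken in competition with developing countries, which in turn affects the EU's so-called soft power and diplomatic influence. Inside the EU, successive changes of leadership in the three big countries, Germany, France and the United Kingdom, have produced an entirely new collective leadership of the EU's major powers. Merkel, Sarkozy and Brown are more pragmatic than their predecessors, and all have solid backgrounds in economic affairs. They also have much in common on internal and external EU policy: they advocate social, market and economic reform across the EU to raise economic vitality, attach importance to global energy and climate issues, and pay attention to relations with developing countries. This policy convergence among the main member states helps the EU form a clear and forceful foreign policy.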

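Particularly on the important issue of energy and climate change, the EU, which has long depended on external energy supplies, urgently needs to secure a stable energy supply and raise energy efficiency. At the same time, the heavy negative impact of global warming on EU countries themselves, together with their effort to shape the future pattern of global economic competition, has made climate change a matter of close concern to every member state. The EU countries' aggressive stance on climate change not only responds to public worries about warming and energy security and provides new momentum for European integration; it is also an opportunity for the EU to expand its international political and diplomatic influence and to prevent a further slide in its overall competitiveness.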

Changes in the China-EU economic, trade and political landscape

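Since China joined the World Trade Organization, the rapid growth of its overall national strength has brought both sides new opportunities for cooperation, but has also exposed China-EU relations to more challenges. With the rapid development of China-EU trade, the EU is no longer willing to regard China as a developing country, or at least not as a typical one, and tends instead to see China as a newly industrialized country and competitor that is growing stronger and already poses a challenge to the EU. The EU's first China policy paper, in 1995, stressed China's economic miracle and its huge market potential. By 2006, its sixth China policy paper, "The EU and China: Closer Partners, Growing Responsibilities", came with a separately issued trade and investment paper, "Competition and Partnership". Over that period, economic and trade issues turned from the solid foundation of the bilateral relationship into "the single most important challenge" in the EU's China policy. Policies on the environment, social security, currency, natural resources, intellectual property protection and technology transfer have become the focus of EU accusations that China competes unfairly with Europe. The EU strongly demands that China and the EU establish reciprocal, free, open and fair markets, and stresses that a China of "steadily growing strength" should play a constructive and responsible role in the world economy and take on responsibilities beyond those a developing country is expected to bear under the WTO framework.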

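Beyond this, in politics and diplomacy the EU's negative perception of China is also hardening. It is eager for China to converge on the European values of "freedom, democracy and the rule of law", and its attention to China's human rights situation is growing. The EU has also voiced great concern about China's international development aid policy, accusing China of failing to deliver international aid according to reasonable and transparent standards. It regards China's resource- and energy-oriented foreign policy as contrary to the "good governance" the EU advocates and as an obstacle to the global spread of the EU's value system.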

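Against this background, climate change, as a global environmental and development issue, has quite naturally become a key factor in the EU's China policy. Faced with increasingly prominent energy security and global warming problems, the EU has on the one hand sharply raised their standing in its overall diplomacy, and on the other worked to integrate its energy policy with its climate policy. Having persuaded the United States, with the backing of EU common policy, to reopen climate negotiations within the United Nations framework, the EU will continue to spare no effort, through every channel, to get developing countries including China to accept the EU's energy and environmental standards and other related social, economic and trade standards.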

Where are China-EU relations heading?

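It is foreseeable that climate change will play an increasingly important role on the China-EU political and diplomatic stage. The EU's more conservative, more defensive and more protective economic and trade policy toward China may affect consultations and negotiations, including those on energy and climate change. The "environmental dumping" and "climate dumping" said to result from China "holding down energy and environmental prices" may become targets of EU criticism, and the resulting disputes over energy and environmental standards, along with the measures to resolve them, will become a new focus of bilateral trade issues. Greater conservatism on technology transfer and more determined use of anti-dumping measures and the WTO dispute settlement mechanism may all become trade barriers with which the EU responds to Chinese competition.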

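In addition, on global issues such as energy security, climate change, environmental protection and sustainable development, the EU's China policy will gradually elaborate and reinforce the "China responsibility" argument, demanding that China take on more responsibility. While the EU countries draw up aggressive greenhouse gas reduction plans, the "2-degree target" for limiting temperature rise and the target of cutting greenhouse gas emissions by 50% by 2050 (relative to 1990) confront developing countries, China included, with enormous challenges.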

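The EU is a pioneer in technologies for developing and using new and renewable energy, and currently holds a large share of the world's related technology. In the coordinated global response to climate change, the global targets and actions the EU champions sketch the basic blueprint for the future economy, technology and society. In the new global industrial revolution this sets off, namely the low-carbon technology revolution, the EU maintains its technological advantage over China and other emerging countries and gains huge business opportunities from new products and technology transfer. On the other hand, EU countries openly ask China not to put climate change behind economic growth. The climate and environmental standards and obligations they advocate will also raise the cost of economic development for developing countries, China included, which helps restrain those countries' economic growth and political and diplomatic progress and favors the global spread of EU values.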

Author: Wu Changhua, China region president of The Climate Group (climategroup).

Homepage image by jonsson

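13:09 A free browser integrating the IE5.5 through IE8 beta1 engines: My DebugBar | Main / HomePage » del.icio.us/chedong
A free browser that integrates the IE5.5 through IE8 beta1 engines. It can open different IE versions in separate tabs and runs on Vista. According to a colleague's feedback, however, the emulated rendering is worse than what the real browsers produce.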
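12:50 Serialization protocol: Protocol Buffers - by Google » del.icio.us/chedong
A Protocol Buffers text sample: person { name: "John Doe" email: "jdoe@example.com" } Far more concise than XML, an order of magnitude faster to parse, with many fewer <tags>, and the encoding uses only Unicode and ASCII.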
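To make the conciseness comparison concrete, here is a minimal Python sketch that renders the same record in a Protocol Buffers-style text form and as XML, then compares their lengths. It uses only the standard library, not the real Protocol Buffers library. The helper names `to_text_format` and `to_xml` are hypothetical, and the sketch assumes flat string fields with no escaping. It covers the human-readable text form only; the parsing-speed claim concerns the binary encoding, which this sketch does not measure.

```python
from xml.etree import ElementTree as ET


def to_text_format(message_name, fields):
    """Render flat string fields in a protobuf-like text form (illustrative only)."""
    body = " ".join(f'{key}: "{value}"' for key, value in fields.items())
    return f"{message_name} {{ {body} }}"


def to_xml(message_name, fields):
    """Render the same fields as a simple XML element tree."""
    root = ET.Element(message_name)
    for key, value in fields.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


person = {"name": "John Doe", "email": "jdoe@example.com"}
text_form = to_text_format("person", person)
xml_form = to_xml("person", person)

print(text_form)
print(xml_form)
print(len(text_form), "characters vs", len(xml_form), "characters")
```

For this record, most of the XML overhead comes from the closing tags, which is the point the bookmark note makes about having fewer tags.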
Optimizing your search box (Inside AdSense) » Che, Dong's shared items in Google Reader
Following on the five tips on AdSense for content optimization our Sydney team presented a couple weeks back, now let's turn to AdSense for search. As you may know, we recently integrated Custom Search Engine into AdSense for search to provide additional customization options and improved targeting. Whether you've already implemented an AdSense for search box on your site or you're just getting started with this feature, we recommend these five optimization tips:
  1. Place your search boxes in visible locations.

    Integrate your search boxes in easy-to-find locations, such as under the header or in your left navigation. Also, keep the placement of your search boxes consistent on all your pages, so users will know where to look if they need help finding something.

  2. Add two search boxes to content-rich pages.

    For pages with a lot of content or which require scrolling, try placing one search box at the top of the page and another at the bottom. A box at the top of the page will allow users to perform a search immediately, and a box at the bottom will provide a search option to users who've just finished reading your content. You can also track and compare the performance of each search box by creating custom channels.

  3. Host your search results on your own site.

    To keep users on your pages, you can host your search results and ads within your own pages. If your users don't find what they're looking for in the search results or ads, they'll still be able to navigate to other sections of your site using your site's template. In addition, you can further integrate your search results into your site by customizing the colors of the results page.

  4. Add a search box to your search results pages.

    Similar to #3, try placing a search box on your search results pages so users can perform additional searches from your site.

  5. Customize your ad locations.

    Place ads at the top and right sidebar of your search results pages. This layout offers added visibility, and our tests have shown that these ad locations can improve monetization.
After you've optimized where search boxes are placed on your site, don't forget to try new targeting options such as keyword refinements and vertical search. To generate AdSense for search code and take advantage of these features, sign in and visit your AdSense Setup tab. You can also find more information in our Help Center.

Posted by Sandra Tsui - AdSense Publisher Support
03:13 Logging your MySQL command line client sessions » MySQL Performance Blog

Baron recently wrote about the very helpful but often forgotten “pager” feature of the command line client. Another feature belongs on the same list: the --tee option.

By specifying --tee=/path/to/file.log you can have all session content (everything typed in and printed out) stored in the log file. This is quite handy, for example, for keeping track of changes made on production.

Moreover, you can put tee=/logs/mysql.log in the [mysql] section of my.cnf to have logging enabled automatically whenever you start the client.
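As a rough illustration of the two ways to enable session logging described above, this Python sketch builds the command line for a one-off logged session and renders the my.cnf fragment for always-on logging. The helper names `mysql_argv_with_tee` and `my_cnf_tee_fragment` are hypothetical. The paths are the ones used in the post, and the script only prints the results; it does not launch the client or modify any file.

```python
import configparser
import io


def mysql_argv_with_tee(log_path):
    """Build the argument list for a mysql client session logged via --tee."""
    return ["mysql", f"--tee={log_path}"]


def my_cnf_tee_fragment(log_path):
    """Render a [mysql] section that turns on tee logging for every session."""
    config = configparser.ConfigParser()
    config["mysql"] = {"tee": log_path}
    buffer = io.StringIO()
    # Write key=value without spaces, matching the form used in the post.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue().strip()


argv = mysql_argv_with_tee("/path/to/file.log")
fragment = my_cnf_tee_fragment("/logs/mysql.log")

print(" ".join(argv))  # command line for a one-off logged session
print(fragment)        # lines to place in my.cnf
```

The command-line form suits an occasional session on a production server, while the my.cnf form suits a machine where every session should leave a log.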

If you’re looking to log sessions beyond the MySQL command line client, you can check out the “script” tool.


Entry posted by peter | 9 comments


01:39 Velocity Presentation Slides Published » MySQL Performance Blog

I’ve now published the slides from my talk at the Velocity conference on the Percona web site. Enjoy.


Entry posted by peter | No comment


