You may not realize it, but the data analytics market is buzzing. New vendors are emerging, new products are popping up, new deals are being done, and several new strategies are being pursued. Vendors are predominantly chasing big data, with battle lines being drawn by solution providers that cater to data sets of roughly 100 TB to 10 PB. The battle was inevitable because the world is producing data at a phenomenal rate, and we have an increasing need to analyze it within shorter time frames. In this post we analyze one of these vendors, Kickfire.
Yet while the big names in town are capturing the headlines, in reality only a small percentage of businesses today need to be able to analyze petabytes of data. Today, the rest of us are more likely to deal with analytic data sets in the 50 GB to 3 TB range.
Kickfire is interesting because it has decided to let the other vendors fight it out for the massive data volumes. Instead, it has focused on a relatively untapped segment: the MySQL database market or, more correctly, the market that MySQL serves.
The bulk of MySQL installs are for Web 2.0 and web-related applications (i.e. applications based on the LAMP stack), and these applications usually aren't set up to manage industrial-sized data sets. Instead, they often have gigabytes or a few terabytes of data, but analyzing that data is just as important to their owners. However, like many transaction-oriented databases, MySQL doesn't perform very well when you run analytics-style queries, even on mid-sized data sets. Customers often find that running complex ad-hoc queries that aggregate data across many rows is very time-consuming, and the lack of certain features, such as query parallelism, diminishes MySQL's appeal.
Kickfire's solution is to use MySQL as the base, which makes it easy for existing MySQL customers to migrate, but to replace MySQL's storage engine with its own column-store engine. Under the covers, the column store organizes data by the columns in a table rather than, as is traditional, by the rows. This structure has been found to achieve better compression and better ad-hoc query performance because only the columns being queried, rather than every column of every row, need to be scanned. The column-store approach is also used by Vertica and was popularized by its founder, the well-known database researcher Michael Stonebraker.
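To make the row-versus-column distinction concrete, here is a minimal Python sketch. It is a toy illustration only, not Kickfire's engine: the click-event table, the function names, and the run-length encoding step are all assumptions chosen to show why an aggregate touches less data in a column layout and why a sorted column compresses well.

```python
from itertools import groupby

# Toy table of click events (illustrative data, not from Kickfire).
rows = [
    {"user_id": 1, "page": "/home", "duration_ms": 1200},
    {"user_id": 2, "page": "/cart", "duration_ms": 300},
    {"user_id": 1, "page": "/cart", "duration_ms": 4500},
]


def total_duration_row_store(table):
    """Row layout: every full record is visited to read one field."""
    return sum(record["duration_ms"] for record in table)


def to_column_store(table):
    """Pivot a list of records into one list per column."""
    columns = {name: [] for name in table[0]}
    for record in table:
        for name, value in record.items():
            columns[name].append(value)
    return columns


def total_duration_column_store(columns):
    """Column layout: only the queried column is scanned."""
    return sum(columns["duration_ms"])


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_column_store(rows)
print(total_duration_row_store(rows), total_duration_column_store(columns))
print(run_length_encode(sorted(columns["page"])))
```

Both totals agree, but the column version never reads `user_id` or `page`. The last line hints at the compression benefit: values of the same type stored together tend to repeat, so simple encodings shrink them.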
But Kickfire doesn't stop there. It goes one step further by adding a proprietary "SQL Chip" co-processor to further enhance its product's performance. Kickfire has replaced the MySQL query optimizer (the component that takes an SQL statement and splits it into a series of operators for processing) with one that produces operators that can be sent directly to its SQL Chip. So, rather than running these operators on a general-purpose CPU, which has to convert them into a series of regular CPU instructions and then muck around loading the data into registers from memory, the optimizer sends them to the SQL Chip, which natively understands them and processes them on data streamed directly from memory.
Kickfire's solution is bundled as a data warehouse "appliance," which is made up of two physical servers: one conventional server running MySQL 5.1, and the other connected via PCIe, which is used to offload processing to the SQL Chip. The underlying capabilities that Kickfire adds remain largely transparent in terms of the user's interaction via SQL code, because Kickfire hasn't changed the MySQL syntax that its customers are already familiar with.
Is the Performance Advantage Real?
Kickfire took a major step for a small vendor last year by passing the most credible benchmark in data warehousing: the TPC-H. For those not familiar with it, the TPC (Transaction Processing Performance Council) is a non-profit association whose benchmarks require all vendors to use the same workload, thereby producing comparable results. Of course, performance is a key measure, but perhaps the more important benchmark result is the price/performance ratio. While a vendor could throw large amounts of hardware at a workload to produce high-performance results, making it cost-effective has always been the biggest challenge. TPC-H rates the vendors' solutions using its own metric, called the "Composite Query-per-Hour Performance Metric" (QphH). The metric is a comparative measure of effective query throughput and processing power relative to data warehouse workloads.
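For readers who want to see how the composite is put together: as far as I understand the TPC-H specification, QphH at a given database size is the geometric mean of the single-stream power result and the multi-stream throughput result, and the price/performance figure divides the total priced system cost by that composite. Treat the notation below as a summary under that assumption rather than a quotation from the specification.

```latex
\mathrm{QphH@Size} = \sqrt{\mathrm{Power@Size} \times \mathrm{Throughput@Size}}
\qquad
\text{Price/Performance} = \frac{\text{Total system price}}{\mathrm{QphH@Size}}
```

Because the composite is a geometric mean, a system cannot score well by excelling at only one of the two tests.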
Kickfire's results are impressive. For a 300 GB workload, it currently ranks fourth on the performance list (and second on the non-clustered performance list). But what stands out is its price/performance ratio. Kickfire has the lowest cost per QphH of any vendor, at $0.89. Compare that to the fifth-placed product on the performance list, a SQL Server-based solution that costs $5.40 per QphH, and the sixth-placed product, an Oracle-based solution that costs $18.67 per QphH!
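The size of that gap is easy to check with a little arithmetic. The Python sketch below uses only the three cost-per-QphH figures quoted above; the dictionary keys and the `cost_ratio` helper are names made up for this illustration.

```python
# Cost per QphH (USD) as quoted in the article for the 300 GB results.
cost_per_qphh = {
    "Kickfire": 0.89,
    "SQL Server-based": 5.40,
    "Oracle-based": 18.67,
}


def cost_ratio(other, baseline="Kickfire"):
    """How many times more the other system costs per QphH than the baseline."""
    return cost_per_qphh[other] / cost_per_qphh[baseline]


for name, cost in cost_per_qphh.items():
    print(f"{name}: ${cost:.2f} per QphH, {cost_ratio(name):.1f}x Kickfire")
```

On these figures the SQL Server-based system costs roughly six times as much per QphH as Kickfire, and the Oracle-based system roughly twenty-one times as much.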
Kickfire announced last week on its blog that it had shipped its first appliance to a Web 2.0 customer. As many Web 2.0 businesses are finding out, killer features alone do not determine success or failure. Success also depends on how well the vendor understands its customers' evolving needs and how it generates revenue by addressing those needs. Kickfire's Web 2.0 customer is using the appliance to do click-stream analysis to better understand its own users' behavior, so that it can target relevant advertising offers. According to Kickfire, the customer has around 500 GB of imported data, but this is expected to grow at a rate of 1 GB per day.
The traditional enterprise space is less of a focus for Kickfire at the moment, partly because that space is already relatively well served by specialized offerings, but also because MySQL has less of a presence there. While Sun is pushing MySQL to break through those enterprise walls, most corporate data platforms remain largely the domain of Oracle, Microsoft, IBM, and niche vendors, such as Teradata. If Sun does manage to break through, then the road will be paved for Kickfire to follow.
Kickfire is a stand-alone solution and can of course be loaded with data from any data source, but the ease of adoption for existing MySQL customers, combined with its strong price/performance ratio, makes Kickfire a compelling option for Web 2.0 businesses looking to add a data warehouse platform to improve their analytics capabilities.
At the Oracle Coherence Special Interest Group meeting today in London, Tomas Nilsson, the product manager for JRockit RT and JRockit Mission Control, spoke about the future plans for JRockit, and especially plans for improved Coherence/JRockit integration.
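New economic stimulus has boosted hydropower development in southwest China, but it has also raised concerns about the ecological and social costs. Jiang Gaoming reports.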
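Since the four-trillion-yuan economic stimulus package was announced, hydropower development in the mountains of southwest China has heated up again. In counties and cities across Sichuan and Yunnan, hydropower projects that have been planned or are being planned have visibly accelerated construction. Some have even broken ground without passing the state environmental impact assessment (EIA) for construction projects, or without undergoing any EIA at all. Recently I traveled with a press delegation drawn from several media outlets and witnessed the following scenes on the ground in Yunnan.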
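At the construction sites of the Guanyinyan hydropower station in Huaping County and the Ludila hydropower station in Binchuan County, construction vehicles shuttling back and forth kicked up clouds of dust. Not only have these two projects not passed an EIA, they lack even basic construction safeguards, and no supervisory body is involved. Although the hydropower authorities tell the outside world that the work is preparation for preliminary project studies, what the workers are actually doing is substantive hydropower engineering: building construction roads, diversion tunnels and dam abutments. Because no preventive measures have been taken, construction spoil is dumped straight into the Jinsha River, damaging the dry-hot valley ecosystem along its banks and adding large quantities of silt and sediment to the river. More seriously, the station's flood-regulation reservoir is being built on the Chenghai-Binchuan fault zone. The site sits on fragile slopes with poor geological structure, is prone to landslides and debris flows, and carries earthquake risk. In August 2008, a debris flow near the Ludila hydropower station killed eight people. Where geological disasters are this frequent, the environmental impact assessment for dam construction cannot be skipped under any circumstances.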
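At the site of the Ahai hydropower station in Yongsheng County we saw no reckless construction this time (although reporters did see reckless construction there in April 2008), but this project too was a case of "act first, report later". Construction units including Sinohydro Bureau 3, Bureau 8 and Bureau 14 began preliminary work as early as three years ago. The construction roads are complete, the diversion tunnels and dam abutments are essentially finished, and the conditions for impounding water are in place. On the second day of the press delegation's visit, we heard that officials from the Ministry of Environmental Protection were coming to inspect the site and consider whether to approve the project. In reality it is a fait accompli: approved or not, the dam will be built. Because the project has not passed an EIA, construction is illegal and has proceeded in secrecy. To add to the mystery, the site even displays a "military administration zone" sign to keep visitors out. The Jin'anqiao hydropower station in Yulong County simply ignores the ministry's EIA altogether: it has not only secretly completed its diversion tunnels, dam abutments and river closure, but has also begun trial runs of its generating units.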
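The Tiger Leaping Gorge hydropower station was the first to attract serious media attention, and the parties involved once abandoned the "one reservoir, eight dams" plan. But not long after the media's cheers died down, the shelved project returned to the stage under the new economic circumstances. To sidestep public scrutiny, the "Tiger Leaping Gorge hydropower station" was renamed the "Longpan hydropower station". The substance of the project is unchanged, and exploratory tunnels and "three connections and one leveling" works (electricity, roads, water, and site leveling) are now under way. If the dam is sited at Longpan, 100,000 people on the upper Jinsha River will be forced to relocate and 200,000 mu of farmland will be flooded. This giant project, with a static investment of 40 billion yuan, directly threatens the central government's red line of 1.8 billion mu of arable land. How its environmental impact is assessed is of great concern to us.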
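At the Saige dam site on the Nu River, reporters saw workers building access tunnels and dam-axis tunnels. The construction camps and other facilities were already complete. The Nu River's tributaries have all been contracted out to developers, and preliminary construction has begun. The Nu River projects have likewise not passed an EIA, yet construction started piecemeal several years ago. The Nu River raises the most problems: biodiversity, soil loss, geological hazards, resettlement, and the "Three Parallel Rivers" World Heritage site have all drawn wide media attention. The media and the public are trying to protect the last undeveloped river within China's borders, but the task is extremely difficult. The Yabiluo dam site is 5.54 km from the World Heritage site, and the tail of its reservoir 2.72 km. The Maji reservoir is closer still, with the dam site 2.21 km away and the reservoir tail only 810 meters away. Given a choice between World Heritage and hydropower, local governments evidently still favor the latter and keep finding reasons to continue development.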
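The most vulnerable parties in hydropower development are the soil, the vegetation and the rushing rivers, followed by the ethnic minorities who have lived here for generations, such as the Yi, Lisu, Nu, Pumi, Tibetan and Naxi peoples. Although they say they are willing to sacrifice personal interests for national development, their one demand is to be able to survive. In Shigu Town in Lijiang, Yunnan (an important river crossing on the Red Army's Long March) and in Chezhou Village in Shangri-La County, the housing and actual living standards of Naxi farmers show that they have already reached a "moderately prosperous" standard of living ahead of schedule. Hydropower development may push them into poverty. From our direct conversations with farmers, most say they do not want to move. Our question is this: if these projects truly benefit the country and the people, why not put them through an EIA and carry them out in the open?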
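Hydropower development still carries heavy environmental and social costs. Turning a river into a lake floods large areas of farmland and natural ecosystems. Large volumes of sediment dumped straight into the river during construction endanger downstream water infrastructure. Relocating displaced residents further up the slopes intensifies the pressure of population on scarce land. Both building and dismantling dams harm the local environment and the environment upstream and downstream. Flooded natural vegetation, farmland and soil release more greenhouse gas (methane) into the environment. So even leaving aside "soft" factors such as World Heritage, culture and landscape, the environmental damage caused by hydropower development greatly undercuts the claim that "hydropower is clean energy". Yet the current situation is grim: in Yunnan, everything seems to have to give way to hydropower. Under these conditions the EIA has become a thoroughly marginalized formality. Local governments and project owners treat the EIA as an unavoidable cost of hydropower development. In their minds it is merely one part of the works, and they know full well that no hydropower project has yet been cancelled because of an EIA.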
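Once we have destroyed the ecology of all our rivers and made enough money, we will find that a beautiful natural environment can no longer be bought back. We therefore call for the costs and benefits of hydropower development in the southwest to be weighed carefully, with full consideration of the effects of dam building and dam removal on aquatic and terrestrial ecosystems and biodiversity, of the social and cultural costs, and of the various geological risks, so that hydropower resources are developed in a rational and orderly way. If economic development is to be guided by the Scientific Outlook on Development, hydropower EIAs cannot be mere window dressing.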
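Jiang Gaoming is a chief researcher and doctoral supervisor at the Institute of Botany of the Chinese Academy of Sciences, deputy secretary-general of the Chinese National Committee for UNESCO's Man and the Biosphere Programme, and a council member of the China Environmental Culture Promotion Association. His concept of "urban vegetation" and his proposal to "restore China's degraded ecosystems through natural forces" have won wide recognition.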
Homepage image by My Hobo Soul
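After reading ZAC's article "Email Marketing - 电子邮件营销终极手册" ("The Ultimate Handbook of Email Marketing"), quite a few webmasters are preparing to rework the newsletter programs on their own sites. For sending to a mailing list, the safest approach is to use a third-party mailing list service. If you send from your own domain, then even if your content is legitimate and subscriptions use double opt-in, a single complaint that lands you on an anti-spam organization's blacklist is a real nuisance, and it can take a long time to get delisted. Professional third-party mailing list providers, by contrast, generally have agreements with the major ISPs. That is why many large companies abroad also use third-party mailing list services.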
But most webmasters would still rather send from their own domain. The main reasons are usually:
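More and more webmasters are handing their sites' mail service over to Google Apps. The SMTP servers for Google Apps and Gmail require an SSL connection, but PHP's built-in mail function and most PHP mail programs do not yet support SSL. So how can you send a mailing list online through the Gmail or Google Apps SMTP server? My solution is PHPMailer.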
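PHPMailer is an open-source PHP mail-sending program released under the GNU LGPL and provided to PHP developers as a PHP class. It is developed by codeworxtech. Versions for PHP4 and PHP5 are currently available as free downloads. Main features: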
Sample code for sending mail through the Gmail/Google Apps SMTP server:
<?php
// Example: using PHPMailer with Gmail
include("class.phpmailer.php");

$mail = new PHPMailer();

// Load the HTML body and strip backslashes.
// The original used eregi_replace(), which was removed in PHP 7;
// str_replace() does the same job here.
$body = file_get_contents('contents.html');
$body = str_replace('\\', '', $body);

$mail->IsSMTP();
$mail->SMTPAuth   = true;               // enable SMTP authentication
$mail->SMTPSecure = "ssl";              // sets the prefix to the server
$mail->Host       = "smtp.gmail.com";   // sets Gmail as the SMTP server
$mail->Port       = 465;                // set the SMTP port
$mail->Username   = "yourname@gmail.com"; // Gmail username
$mail->Password   = "password";           // Gmail password

$mail->From     = "replyto@yourdomain.com";
$mail->FromName = "Webmaster";
$mail->Subject  = "This is the subject";
$mail->AltBody  = "This is the body when user views in plain text format"; // text body
$mail->WordWrap = 50;                   // set word wrap
$mail->MsgHTML($body);

$mail->AddReplyTo("replyto@yourdomain.com", "Webmaster");
$mail->AddAttachment("/path/to/file.zip");              // attachment
$mail->AddAttachment("/path/to/image.jpg", "new.jpg");  // attachment, renamed
$mail->AddAddress("username@domain.com", "First Last");
$mail->IsHTML(true);                    // send as HTML

if (!$mail->Send()) {
    echo "Mailer Error: " . $mail->ErrorInfo;
} else {
    echo "Message has been sent";
}
?>
As you can see, PHPMailer is simple to use. PHP programmers can easily adapt the code above to rework their own newsletter programs with PHPMailer.
Note that sending through the Gmail/Google Apps SMTP mail server is currently capped at 500 messages per day.
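One way to live with that cap is to split the subscriber list into daily batches and send one batch per day. The sketch below shows only the batching logic, in Python for brevity. The `daily_batches` helper, the `DAILY_LIMIT` constant and the example addresses are assumptions for illustration and are not part of PHPMailer. The same slicing is easy to port to PHP with `array_chunk()`.

```python
from typing import Iterator, List, Sequence

DAILY_LIMIT = 500  # per-day cap cited in the post (assumed unchanged)


def daily_batches(recipients: Sequence[str],
                  limit: int = DAILY_LIMIT) -> Iterator[List[str]]:
    """Yield consecutive slices of the recipient list, each at most `limit` long."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for start in range(0, len(recipients), limit):
        yield list(recipients[start:start + limit])


# Hypothetical subscriber list for demonstration.
subscribers = [f"user{i}@example.com" for i in range(1200)]
batches = list(daily_batches(subscribers))
print([len(batch) for batch in batches])
```

With 1,200 subscribers this produces three batches, so a full mailing would take three days. A real newsletter program would also need to record which batch has already been sent.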
Tags: PHPMailer | Mail server | Email marketing | Google Apps | Gmail | Programs | PHP
URL: http://www.xiaohui.com/dev/server/send-mail-newslatter-via-gmail-by-phpmailer.htm