With the recent acquisition of MySQL by Sun, there has been talk about the MySQL open source database now becoming relevant to large enterprises, presumably because it now benefits from Sun's global support, professional services and engineering organizations. In a blog post about the acquisition, Sun CEO Jonathan Schwartz wrote that this is one of his objectives.
While the organizational aspects may have been addressed by the acquisition, MySQL faces some technology limitations which hinder its ability to compete in the enterprise. Like other relational databases, MySQL becomes a scalability bottleneck because it introduces contention among the distributed application components.
There are basically two approaches to this challenge that I'll touch on in this post:
1. Scale your database through database clustering
2. Scale your application, while leaving your existing database untouched, by front-ending the database with In-Memory Data Grid (IMDG) or caching technologies. The database acts as a persistence store in the background. I refer to this approach as Persistence as a Service (PaaS).
While both options are valid (with pros and cons), in this post I'll focus mostly on the second approach, which introduces some thought-provoking ideas for addressing the challenge.
Disclaimer: While there are various alternative in-memory data grid products, such as Oracle Coherence and IBM ObjectGrid, in this post I'll focus on the GigaSpaces solution, because for obvious reasons I happen to know it better. Having said that, I try to cover the core principles presented here in generic terms as much as possible.
Scaling your database through database clustering:
There are two main approaches for addressing scalability through database clustering: database replication and database partitioning.

1. Database replication

Limitations:
- Limited to "read mostly" scenarios: when it comes to inserts and updates, replication overhead may be a bigger constraint than working with a single server (especially with synchronous replication)
- Performance: Constrained by disk I/O performance.
- Consistency: asynchronous replication leads to inconsistency as each database instance might hold a different version of the data. The alternative -- synchronous replication -- may cause significant latency.
- Utilization/Capacity: replication assumes that all nodes hold the entire data set. This creates two problems: 1) each table holds a large amount of data, which increases query/index complexity; 2) we need to provision (and pay for) more storage capacity in direct proportion to the number of replicated database instances.
- Complexity: most database replication implementations are hard to configure and are known to cause stability issues.
- Non-Standard: each database product has different replication semantics, configuration and setup. Moving from one implementation to another might become a nightmare.
2. Database partitioning

Limitations:
- Limited to applications whose data can be easily partitioned.
- Performance: we are still constrained by disk I/O performance
- Requires changes to data model: we need to modify the database schema to fit a partitioned model. Many database implementations require that knowledge of which partition the data resides in is exposed to the application code, which brings us to the next point.
- Requires changes to application code: Requires different model for executing aggregated queries (map/reduce and the like).
- Static: in most database implementations, adding or changing partitions involves down-time and re-partitioning.
- Complex: setting up database partitions is a fairly complex task, due to the number of moving parts and the potential for failure during the process.
- Non-standard: as with replication, each database product has different partitioning semantics, configuration and setup. Partitioning introduces more severe limitations, as it often requires changes to our database schema and application code when moving from one database product to another.
Time for a change - is database clustering the best we can do?
The fundamental problems with both database replication and database partitioning are the reliance on the performance of the file system/disk and the complexity involved in setting up database clusters. No matter how you turn it around, file systems are fairly ineffective when it comes to concurrency and scaling. This is pure physics: how fast can disk storage be when every data access must go through serialization/de-serialization to files, as well as mapping from binary format to a usable format? And how concurrent can it be when every file access relies on moving a physical disk head between different sectors? This puts hard limits on latency. In addition, latency is often severely affected by lack of scalability. Putting the two together, file systems -- and databases, which rely heavily on them -- suffer from limited performance and scalability.
These database patterns evolved under the assumption that memory is scarce and expensive, and that network bandwidth is a bottleneck. Today, memory resources are abundant and available at a relatively low cost. So is bandwidth. These two facts allow us to do things differently than we used to, when file systems were the only economically feasible option.
Scaling through In Memory Caching/Data Grid
It is not surprising that to enhance scalability and performance many Web 2.0 sites use an in-memory caching solution as a front-end to the database. One such popular solution is memcached. Memcached is a simple open source distributed caching solution that uses a protocol-level interface to reference data that resides in an external memory server. Memcached enables rudimentary caching and is designed for read-mostly scenarios. It is used mainly as an addition to the LAMP stack.
The simplicity of memcached is both an advantage and a drawback. Memcached is very limited in functionality. For example, it doesn't support transactions, advanced query semantics, and local-cache. In addition, its protocol-based approach requires the application to be explicitly exposed to the cache topology, i.e., it needs to be aware of each server host, and explicitly map operations to a specific node. These limitations prevent us from fully exploiting the memory resources available to us. Instead, we are still heavily relying on the database for most operations.
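To make the protocol-level, topology-aware nature of memcached concrete, here is a minimal sketch using the spymemcached Java client; the hostnames are placeholders:

```java
import net.spy.memcached.MemcachedClient;
import java.net.InetSocketAddress;
import java.util.Arrays;

public class MemcachedExample {
    public static void main(String[] args) throws Exception {
        // The client is constructed with an explicit list of server addresses:
        // the cache topology is visible to application code.
        MemcachedClient cache = new MemcachedClient(
                Arrays.asList(new InetSocketAddress("cache1.example.com", 11211),
                              new InetSocketAddress("cache2.example.com", 11211)));

        // Simple put/get by string key; no transactions, no queries.
        cache.set("user:42", 3600, "serialized-user-record");
        Object value = cache.get("user:42");
        System.out.println(value);

        cache.shutdown();
    }
}
```

Note how the server list is wired directly into the application; adding or removing a cache node means touching this code or its configuration.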
Enter in-memory Data Grids.
In-memory data grids (IMDG) provide object-based database capabilities in memory, and support core database functionality, such as advanced indexing and querying, transactional semantics and locking. IMDGs also abstract the data topology from application code. With this approach, the database is not completely eliminated, but put in the *right* place. I refer to this model as Persistence as a Service (PaaS). I covered the core principles of this model in this post. Below I'll respond to some of the typical questions I am asked when I present this approach.
How does Persistence as a Service work?
With PaaS, we keep the existing databases as-is: same data, same schema and so on. We use a "memory cloud" (i.e., an in-memory data grid) as a front-end to the database. The IMDG loads its initial state from the database and from that point on acts as the "system of record" for our application. In other words, all updates and queries are handled by the IMDG. The IMDG is also responsible for keeping the database in sync. To reduce performance overhead, synchronization with the database is done asynchronously. The rate at which the database is kept in sync is configurable.
The in-memory data model can be different from the one stored in the database. In most cases, the memory-based data model will be partitioned to gain maximum scalability and performance, while the database remains unchanged.
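As a rough illustration of the flow described above, here is a minimal, vendor-neutral sketch (the types are hypothetical, not a specific product's API): reads and writes are served from memory, while a background task periodically flushes changes to the database.

```java
// Hypothetical write-behind front end: the in-memory store acts as the system
// of record, and database synchronization happens in the background.
import java.util.concurrent.*;

class WriteBehindStore<K, V> {
    private final ConcurrentMap<K, V> grid = new ConcurrentHashMap<>();
    private final BlockingQueue<K> dirty = new LinkedBlockingQueue<>();

    WriteBehindStore(ScheduledExecutorService scheduler, long syncPeriodMs) {
        // The sync rate is configurable, as described above.
        scheduler.scheduleAtFixedRate(this::flush, syncPeriodMs, syncPeriodMs,
                                      TimeUnit.MILLISECONDS);
    }

    V read(K key) { return grid.get(key); }   // served purely from memory

    void write(K key, V value) {              // update memory, mark for sync
        grid.put(key, value);
        dirty.offer(key);
    }

    private void flush() {                    // asynchronous database sync
        K key;
        while ((key = dirty.poll()) != null) {
            persistToDatabase(key, grid.get(key));
        }
    }

    // Placeholder: a real implementation would issue a JDBC batch update here.
    void persistToDatabase(K key, V value) { }
}
```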
How does PaaS improve performance compared to a relational database?
Performance gains over relational databases are achieved because:
- Reads and writes are served entirely from memory, avoiding disk I/O and the serialization/de-serialization it entails.
- Data is stored as objects, so there is no O/R mapping layer on the critical path.
- The data can be partitioned across machines, so load is spread across the cluster rather than funneled through a single server.
If you keep the database in sync, isn't your solution limited by database performance?
No. Because synchronization with the database is asynchronous and batched, user-facing latency is determined by the in-memory operation; the database only needs to keep up with the background update stream rather than with peak user load.
Doesn't asynchronous replication mean that data might be lost in case of failure?

No, because asynchronous replication refers to the transfer of data between the IMDG and the database. The IMDG, however, maintains in-memory backups that are synchronously updated. This means that if one of the nodes in a partitioned cluster fails before the replication to the underlying database takes place, its backup will be able to instantly continue from that exact point.
What happens if one of my memory partitions fails?

The backup of that partition takes over and becomes the primary. The data grid's cluster-aware proxy re-directs the failed operation to the hot backup implicitly. This enables a smooth transition of the client application during failure -- as if nothing happened. Each primary node may have multiple backups to further reduce the chance of total failure. In addition, the cluster manager detects the failure and provisions a new backup instance on one of the available machines.
What happens if the database fails?

The IMDG maintains a log of all updates and can re-play them as soon as the database becomes available again. It is important to note that during this time the system continues to operate unaffected. The end user will not notice this failure!
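As a rough sketch of the idea (hypothetical types, not a particular product's API): database-bound updates are appended to a log, which simply grows while the database is down and is replayed in order once it returns.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical redo log: every database-bound update is appended here; if the
// database is down the log grows, and it is replayed when the database comes
// back. In-memory operations are unaffected throughout.
class RedoLog {
    interface DbWriter { void write(String update) throws Exception; }

    private final Deque<String> pending = new ArrayDeque<>();

    synchronized void append(String update) { pending.addLast(update); }

    synchronized void replay(DbWriter db) {
        while (!pending.isEmpty()) {
            String update = pending.peekFirst();
            try {
                db.write(update);      // re-apply in original order
                pending.removeFirst();
            } catch (Exception e) {
                return;                // database still down; try again later
            }
        }
    }
}
```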
How do I maintain transactional integrity?

The IMDG supports the standard two-phase commit protocol and XA transactions. Having said that, this model should be avoided as much as possible, because it introduces dependencies among multiple partitions and creates a single point of distributed synchronization in our system. Using a classic distributed transaction model doesn't take advantage of the full linear scalability potential of the partitioned topology. Instead, the recommended approach is to break transactions into small, loosely-coupled services, each of which can be resolved within a single partition. Each partition can maintain transactional integrity using local transactions. This model ensures that in partial failure scenarios the system is kept in a consistent state.
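Here is a rough sketch of the recommended pattern, using hypothetical helper interfaces: a routing attribute picks a single partition, and the transaction commits locally there.

```java
// Hypothetical sketch: instead of one distributed transaction spanning
// partitions, each operation is routed to one partition and committed with a
// local transaction there.
interface Partition {
    void beginLocalTx();
    void commitLocalTx();
    void rollbackLocalTx();
    void update(Object entry);
}

class OrderService {
    private final Partition[] partitions;

    OrderService(Partition[] partitions) { this.partitions = partitions; }

    // The routing attribute (here, the customer id) picks the partition, so
    // the whole transaction resolves inside one node.
    void placeOrder(String customerId, Object order) {
        Partition p = partitions[Math.abs(customerId.hashCode()) % partitions.length];
        p.beginLocalTx();
        try {
            p.update(order);
            p.commitLocalTx();
        } catch (RuntimeException e) {
            p.rollbackLocalTx();
            throw e;
        }
    }
}
```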
How is transactional integrity maintained with the database?

As noted above, distributed transactions might introduce a severe performance and scalability bottleneck, especially if done with the database. In addition, attempting to execute transactions with the database violates one of the core principles behind PaaS: asynchronous updates to the database. To avoid this overhead, the IMDG ensures that transactions are resolved purely in-memory and are sent to the database in a single batch. If the update to the database fails, the system will re-try that operation until the update succeeds.
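A minimal sketch of this batch-and-retry behavior, assuming plain JDBC on the database side (the queue contents and the URL are illustrative):

```java
import java.sql.*;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Hypothetical synchronizer: committed in-memory transactions are drained to
// the database in one batch; on failure the batch is retried until it lands.
class DatabaseSynchronizer implements Runnable {
    private final BlockingQueue<List<String>> committedBatches; // SQL per tx
    private final String jdbcUrl;                               // e.g. a MySQL URL

    DatabaseSynchronizer(BlockingQueue<List<String>> q, String jdbcUrl) {
        this.committedBatches = q;
        this.jdbcUrl = jdbcUrl;
    }

    public void run() {
        while (true) {
            try {
                List<String> batch = committedBatches.take();
                while (!tryFlush(batch)) {
                    Thread.sleep(1000); // retry until the database accepts it
                }
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    private boolean tryFlush(List<String> batch) {
        try (Connection c = DriverManager.getConnection(jdbcUrl);
             Statement s = c.createStatement()) {
            c.setAutoCommit(false);
            for (String sql : batch) s.addBatch(sql);
            s.executeBatch();
            c.commit();
            return true;
        } catch (SQLException e) {
            return false; // database unavailable or rejected the batch
        }
    }
}
```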
What types of queries are supported?
You can find some code snippets of the different query APIs here.
This model relies heavily on partitioning. How do I handle queries that need to span multiple partitions?

Aggregated queries are executed in parallel on all partitions. You can combine this model with stored procedure-like queries to perform more advanced manipulations, such as sum and max. See more details below.
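As a sketch of the execution model (the Partition interface is hypothetical): the same aggregation runs on every partition in parallel, and the client reduces the partial results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Hypothetical sketch: map phase runs next to the data on each partition,
// reduce phase combines the partial results on the client.
class ParallelMax {
    interface Partition { long maxOrderAmount(); } // executed next to the data

    static long globalMax(List<Partition> partitions) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(partitions.size());
        List<Future<Long>> partials = new ArrayList<>();
        for (Partition p : partitions) {
            partials.add(pool.submit(() -> p.maxOrderAmount())); // map phase
        }
        long max = Long.MIN_VALUE;
        for (Future<Long> f : partials) {
            max = Math.max(max, f.get()); // reduce phase on the client side
        }
        pool.shutdown();
        return max;
    }
}
```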
What about stored procedures and prepared statements?

Because the data is stored in memory, we avoid the use of a proprietary language for stored procedures. Instead, we can use either native Java/.Net/C++ or dynamic languages, such as Groovy and JRuby, to manipulate the data in memory. The IMDG provides native support for executing dynamic languages, routes the query to where the data resides, and enables aggregation of the results back to the client. A reducer can be invoked on the client-side to execute second-level aggregation.
See a code example that illustrates how this model works here.
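In the meantime, here is a hedged sketch of the mechanism using the standard JDK scripting API, with a Groovy engine assumed on the classpath; real products wrap this in their own task-execution APIs.

```java
import javax.script.*;

// Hypothetical sketch of the "stored procedure" replacement: a dynamic-language
// script is shipped to the partition that owns the data and evaluated there.
class ScriptTask {
    // Executed inside the partition's JVM, next to the data it manipulates.
    static Object executeOnPartition(Object partitionLocalData, String script)
            throws ScriptException {
        // Returns null if no Groovy engine is registered on the classpath.
        ScriptEngine groovy = new ScriptEngineManager().getEngineByName("groovy");
        groovy.put("data", partitionLocalData); // expose local data to the script
        return groovy.eval(script);             // result goes back to the client
    }
}
```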
Can I change these prepared statements and stored procedure equivalents without bringing down the data?
Yes. When you change the script, it is reloaded to the server while the server is up, without the need to bring down the data. The same capability exists in case you need to refresh collocated services code on the server-side.
Do I need to change my application code to use an IMDG?

It depends. There are cases in which introducing an IMDG can be completely seamless and cases in which you will need to go through a rewrite, depending on the programming model:
Integration option | Nature of integration with IMDG | Comments/limitations
Hibernate 2nd level cache | Seamless | Best fit for read-mostly applications. Limited performance gain, as it still relies heavily on the underlying database.
JDBC | Seamless, but limited | SQL commands written against the IMDG are guaranteed to work with other JDBC resources. Doesn't support full SQL-92, and therefore existing applications may require code changes. Recommended for monitoring and administration; not recommended for application development, as it introduces unnecessary O/R mapping complexity.
Map/caching API | Seamless | Extensions such as timeout and transaction support are available as well.
POJO-based data access (DAO) | Partially seamless | Abstracts the transaction handling from the code. The domain model is based on POJOs and therefore doesn't require explicit changes, only annotations (annotations can also be provided through an external XML file). If our application already uses a DAO pattern, only the DAO implementation needs to change, which narrows the scope of changes required to use an IMDG-specific interface. This option is highly recommended for best performance and scalability.
What topologies are supported?
Replicated (synchronous or asynchronous), partitioned, and partitioned-with-backup. See details here.
Do I need to change my code if I switch from one topology to another?

No. The topology is abstracted from the application code. The only caveat is that your code needs to be implemented with partitioning in mind, i.e., moving from a central server or a replicated topology to a partitioned one doesn't require changes to the code as long as your data includes an attribute that acts as a routing index.
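For example, a minimal POJO with a routing attribute might look like the sketch below. The @SpaceRouting annotation follows the GigaSpaces convention; the exact package may differ between versions, and other grids expose the same idea under different names.

```java
// Assumed GigaSpaces-style annotation import; treat the package as approximate.
import com.gigaspaces.annotation.pojo.SpaceRouting;

public class Order {
    private String customerId;
    private double amount;

    // All entries with the same customerId land in the same partition, so the
    // class works unchanged in central, replicated, or partitioned topologies.
    @SpaceRouting
    public String getCustomerId() { return customerId; }
    public void setCustomerId(String id) { this.customerId = id; }

    public double getAmount() { return amount; }
    public void setAmount(double amount) { this.amount = amount; }
}
```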
How are IMDGs and PaaS different from in-memory databases (IMDB)?
Unlike an IMDG, an in-memory database retains the relational model, which prevents us from taking full advantage of the fact that the data lives in memory. For example, when we use in-memory storage in an IMDG, the data is stored as objects, so we don't need the O/R mapping layer. In addition, we don't need a separate language to perform data manipulation; we can use the native application code, or dynamic languages, for that purpose.
Moreover, one of the fundamental problems with in-memory databases is that relational SQL semantics are not geared to deal with distributed data models. For example, an application that runs on a central server and uses Joins, which often maintain references among tables, or even aggregated queries such as Sum and Max, doesn't map well to a distributed data model. This is why many existing IMDB implementations support only very basic topologies and often require significant changes to the data schema and application code. This reduces the motivation for using an in-memory relational database, as it lacks transparency.
The GigaSpaces in-memory data grid implementation, for example, exposes a JDBC interface and provides SQL query support. Applications can therefore benefit from best of both worlds: you can read and write objects directly through the GigaSpaces API, query those same objects using SQL semantics, and view and manipulate the entire data set using regular database viewers.
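As a hedged illustration of this dual access, the snippet below reads grid data through plain JDBC; the driver URL is a placeholder, not the actual GigaSpaces connection string.

```java
import java.sql.*;

// Sketch: the same objects written through the grid API can be read back with
// standard JDBC. Consult the product documentation for the real URL/driver.
class DualAccessExample {
    static void queryGrid() throws Exception {
        try (Connection c = DriverManager.getConnection("jdbc:datagrid://localhost/mySpace");
             Statement s = c.createStatement();
             ResultSet rs = s.executeQuery(
                 "SELECT customerId, amount FROM Order WHERE amount > 100")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " " + rs.getDouble(2));
            }
        }
    }
}
```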
Can I use an existing Hibernate mapping to map data from the database to the IMDG?
Yes. In addition, with PaaS, the Hibernate mapping overhead is reduced as most of it happens in the background, during initial load or during the asynchronous update to the database.
Further information on Hibernate support is available here.
Can I use PaaS with .Net or C++ applications?
Yes. Starting with GigaSpaces 6.5, both Hibernate (Java) and nHibernate (.Net) are supported. C++ applications defer to the default Hibernate implementation. In addition, with GigaSpaces' new integration with Microsoft Excel, .Net users can easily access data in the IMDG directly from their Excel spreadsheets without writing code!
Final words:
While this approach is generic and can be applied to any database product, MySQL is the most interesting to discuss, as it is widely adopted by those who need cost-effective scalability the most, such as web services, social networks and other Web 2.0 applications. In addition, MySQL has faced several challenges in penetrating large enterprises. With the acquisition by Sun, MySQL becomes a viable option for such organizations, but it still requires the capabilities mentioned above to compete effectively with rival databases. The combination of IMDG/PaaS with MySQL provides a good solution for addressing some of the bigger challenges in cloud-based deployments. More on that in a future post.
It is March 29, and when I opened Google.com just now the whole screen had turned black. According to their explanation, US users of Google.com will see a black homepage for the entire day.
This is Google.com's response to an energy and environmental campaign called Earth Hour. On March 31, 2007, the well-known environmental organization the World Wide Fund for Nature and the Sydney Morning Herald, one of Australia's largest newspapers, jointly launched a campaign named Earth Hour. They called on businesses, government departments and individuals in Sydney, Australia's largest city, to stop using electrical appliances for one hour from 19:30 to 20:30 that evening, responding with action to the call to save energy, reduce greenhouse gas emissions and slow global warming.
The campaign continues in 2008, inviting people around the world to switch off their electrical appliances for one hour between 8:00 and 9:00 pm local time on March 29, 2008. On its page about the black homepage, Google.com says it decided to black out the homepage for the whole day on March 29 to support Earth Hour and help spread the idea to every corner of the world. Google.com also notes that the idea follows the company's long-standing philosophy: Google is committed to helping the world move toward a clean-energy future.
Google.com explained why it supports the campaign. Simply put, they love the idea. People agree on a time and switch off their lights together for an hour; the act is both practical and symbolic, anyone in the world can join in, and doing good takes no more than the flick of a switch. One hour looks small, but added together it becomes considerable. The idea also somewhat resembles PageRank, the foundation of Google.com's success, which uses collective intelligence to organize the order of the online world.
A mature web-service company influences people not only at the level of tools but also, if it chooses, at the level of attitudes. It has that kind of power: by shaping its interface, changing the information agenda, adjusting how information is ranked, or donating prominent page space, it can push people's views. When people suddenly find a familiar interface or information path looking different, their attention is engaged and a door called persuasion opens. Such power must of course not be abused; it is perhaps only suited to uncontroversial, universal causes like global environmental protection.
This is the Earth Hour project website; on its campaign page, Google.com placed a sizable thumbnail of it with a link. While repeatedly visiting the Earth Hour site I found it sometimes unreachable, and several times the address in the browser bar was different:
http://www7.earthhourus.org/
http://www5.earthhourus.org/
http://www2.earthhourus.org/
http://www3.earthhourus.org/
This suggests they have already mirrored the site across different servers, ready for the huge traffic arriving from Google.com :) Could those servers be sponsored by HP? :)
Finally, here is the official Earth Hour video; I hope you can all see it. I'm also attaching some related Chinese-language resource links.
The one above is the 2008 campaign promo video; the one below is the 2007 promo video.
The related group on Douban is here:
http://www.douban.com/event/10048959/
Searching Google.com for "地球时间" gives: Results 1 - 10 of about 815,000 for 地球时间
I've uploaded three screenshots to Yupoo; see here:
http://www.yupoo.com/photos/tags/?tag=earthhour
WordPress 2.5 is the culmination of six months of work by the WordPress community, people just like you. The improvements in 2.5 are numerous, and almost entirely a result of your feedback: multi-file uploading, one-click plugin upgrades, built-in galleries, customizable dashboard, salted passwords and cookie encryption, media library, a WYSIWYG that doesn't mess with your code, concurrent post editing protection, full-screen writing, and search that covers posts and pages.
For a short overview of the features with screenshots, it’d be best to visit our sneak peek announcement for RC1. Or check out a 4-minute screencast of the new interface in action. If you just want to jump straight to the good stuff here’s where you can find 2.5 upgrade and download information.
If you want to see everything I would grab a cup of coffee or a mojito, because this post is epic.
Cleaner, faster, less cluttered dashboard — we’ve worked hard to take your feedback about what’s most important in the dashboard and organize things to allow you to focus on what’s important — your blog — and get out of your way. In collaboration with Happy Cog and the community we’ve taken the first major step forward in the WordPress interface since version 1.5.
Dashboard Widgets — the dashboard home page is now a series of widgets, including ones to show you fun stats about your posting, latest comments, people linking to you, new and popular plugins, and of course WordPress news. You can customize any of the dashboard widgets to show, for example, news from your local paper instead of WP news. Plugins can also hook in, for example the WordPress.com stats widget adds a handy double-wide stats box.
Multi-file upload with progress bar — before when you would upload a large file you'd wait forever, never knowing how far along it was. And uploading more than one photo was an exercise in patience, as you could only do one at a time. Now you can select a whole folder of images or music or videos at once and it'll show you the progress of each upload.
Bonus: EXIF extraction — if you upload JPEG files with EXIF metadata like camera make and model, aperture, shutter speed, ISO, et al. WordPress will extract all the data into custom fields you can use in your template. If you use the EXIF title fields or similar those will be put into their equivalent fields in WP. Most modern digital cameras generate EXIF data.
Search posts and pages — search used to cover just posts, now it includes pages too, a great boon for those using WordPress as a CMS. New themes can style or sort pages differently in results.
Tag management — you can now add, rename, delete, and do whatever else you like to tags from inside WordPress, no plugins needed.
Password strength meter — when you change your password on your profile it’ll tell you how strong your password is to help you pick a good one.
Concurrent editing protection — for those of you on multi-author blogs, have you ever opened a post while someone was already editing it, and your auto-saves kept overwriting each other, irrecoverably losing hours of work? I bet that added a few words to your vocabulary. Now if you open a post that someone else is editing, WordPress magically locks it and prevents you from saving until the other person is done. You’ll see a message like below.
Few-click plugin upgrades — if the plugins you use are part of the plugin directory, since 2.3 we've told you when they have an update available. Now we take that to the next logical step — downloading and installing the upgrade for you. This depends a little bit on your host setup, and it may ask you for your FTP password much like OS X or Windows will ask you for a password, but it works well on the majority of hosts we were able to test. Your mileage may vary; plugins in mirror may be larger than they appear.
Friendlier visual post editor — I’m not sure how to articulate this improvement except to say “it doesn’t mess with your code anymore.” We’re now using version 3.0 of TinyMCE, which means better compatibility with Safari, and we’ve paid particular attention this release to its integration and interaction with complex HTML. It also now has a “no-distractions” mode which is like Writeroom for your browser.
Built-in galleries — when you take advantage of multi-file upload to upload a bunch of photos, we have a new shortcode that lets you easily embed galleries by just putting [ gallery] (without the space) in your post. It'll display all your thumbnails and captions, and each will link to a page where people can comment on the individual photos. I've been using this feature on my blog and have already uploaded over 1,200 pictures into 23 galleries. The shortcode has some hidden options too, check out this documentation.
Now for the geeky stuff. While we’re excited about the above features, each one represents a new opportunity or API for other developers to take to another level. (The best of which we’ll someday integrate back into WP.)
Salted passwords — we now use the phpass library to stretch and salt all passwords stored in the database, which makes brute-forcing them impractical. If you use something like mod_auth_mysql we’ve created a plugin that will allow you to use legacy MD5 hashing. (The hashing is completely pluggable.) Users will automatically switch to the more secure passwords next time they log in.
Secure cookies — cookies are now encrypted based on the protocol described in this PDF paper, which is something like user name|expiration time|HMAC(user name|expiration time, k), where k = HMAC(user name|expiration time, sk) and where sk is a secret key, which you can define in your config.
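For illustration, here is a hedged sketch of that construction in Java (WordPress itself implements it in PHP); the key derivation follows the formula quoted above.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.HexFormat;

// Sketch of the cookie scheme: the cookie carries the user name, an expiry,
// and an HMAC keyed with a per-credential key derived from the secret key sk.
class AuthCookie {
    static String hmac(String msg, byte[] key) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return HexFormat.of().formatHex(mac.doFinal(msg.getBytes(StandardCharsets.UTF_8)));
    }

    // cookie = username|expiration|HMAC(username|expiration, k)
    // where k = HMAC(username|expiration, sk)
    static String issue(String user, long expires, byte[] sk) throws Exception {
        String payload = user + "|" + expires;
        byte[] k = hmac(payload, sk).getBytes(StandardCharsets.UTF_8);
        return payload + "|" + hmac(payload, k);
    }
}
```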
Easy taxonomy and URL creation — probably best illustrated with an example: I can call register_taxonomy() with a few arguments to register a "people" taxonomy, and whenever I edit an image I'll see a UI like tags have for identifying the people in a photo, and these will be URL-addressable with /person/firstname-lastname/. All with a single function call.
Inline documentation — the vast majority of the new code going into WordPress includes inline documentation that explains the functions and documents their arguments.
Database optimization — we haven't changed the table layout in this release, which is one of the reasons so many plugins work fine with 2.5. We have added a few new indices and made a few default fields more flexible based on some bottlenecks we found on WordPress.com, which now hosts 2.7 million WordPress blogs. It should be invisible to the application, just a bit faster on the database side.
$wpdb->prepare() — now almost all of the SQL in WordPress is prepared first, and the same functions are available to your plugins. This should prevent elementary SQL escaping issues.
Media buttons — the add media buttons above the post editor are expandable, so you could have an "Add Google Map" button if you like. They can also be overridden, so if you think you can do the video or audio tab better than we have, you can replace the default.
Shortcode API — the new gallery functionality is powered by the new shortcode API. Shortcodes are little bracket-delineated strings that can be magically expanded at runtime to something more interesting. They give users a short, easy to type and copy/paste string they can move around their post without worrying about messing up complex HTML or embed codes. The Shortcode API is fully documented.
Now you see why 2.5 took a little extra time.
The upgrade instructions for this version are pretty much the same as any other version. The most important thing to check is your plugins, so if for example everything works except the new uploader, a legacy plugin might be causing a javascript error on the page and breaking it. If something goes wrong, the safest thing to do is turn your plugins off (we have a button to do them all at once, now) and turn them back on one-by-one, testing the problem along the way. This has solved almost everybody’s problems in testing, and it also lets you know which plugin author to show some love to so they’ll update their plugin, and which plugin authors already have so you can shower them with praises on your blog.
One brief note about some of the new upload and plugin upgrade features: there are some edge-case hosting platforms, like versions of Lighttpd before 1.5 or over-aggressive mod_security rules, which can break them. If something isn't working like it looked in the screenshots, ask your host if there's something on the server side that may be interfering. Hosts, feel free to join and post to our wp-testers mailing list if you have an environment that requires some extra code to work around. We'd be happy to include it in the next update.
Quick tip: in 2.5 you click the name of things to edit them, like your username to edit your profile or the title of a post to edit it.
More than growing, it’s on fire. We always talk about things like downloads, and the 2.3 branch has already had 1.92 million downloads as I write this post, but this time we have some far more interesting information I’d like to share.
There were over 1,200 commits to our repository since 2.3.0 and over 90 people were credited in them. This means in our core code, not plugins, there were at least 90 individual folks that contributed something high-quality enough that it made the cut to be part of the download you guys get today. I had no idea this group of people was so large.
Outside of the core commit team, there was particular help from these people, in rough order of number of credits and tickets: mdawaffe (Michael Adams), azaozz (Andrew Ozz), nbachiyski (Nikolay Bachiyski), andy (Andy Skelton), iammattthomas (Matt Thomas), tellyworth (Alex Shiels), josephscott (Joseph Scott), lloydbudd (Lloyd Budd), DD32 (Dion), filosofo (Austin Matzko), hansengel (Hans Engel), pishmishy, ffemtcj, Viper007Bond, ionfish (Benedict Eastaugh), jhodgdon (Jennifer Hodgdon), Otto42, thee17 (Charles E. Free-Melvin), and xknown. Also want to thank MichaelH and Lorelle on the documentation side, and moshu, Kafkaesqui, whooami, MichaelH, Otto42, and jeremyclark13 for helping with support.
The 2.5 branch is nicknamed “Brecker” in honor of Michael Brecker, an exceptionally talented saxophonist who could cross styles effortlessly and never stopped experimenting and pushing himself until he passed away last year.
All of this wasn’t enough, so in our copious spare time we decided to redesign WordPress.org to better match the aesthetics of the new dashboard and also to spruce up a few areas that needed lovin’. Some parts of the site, like the Codex, might show the old style for a day or two. We know, just give us a bit of time. Thanks to Matt Thomas for his epic effort in designing and coding the new site.
As always with WordPress, we don’t claim any of these features to be perfect, or to be better than everyone else in the world, but they are done by and for the people and the one thing we do promise is that with every release we listen and do our best to improve.
2.5 is a major milestone for WordPress not because it added dozens of user-requested features, but because it reaffirms that we’re as passionate about blogging as the day we started. Our community is too fierce to rest on its laurels — contrary to what pundits claim, blogging is far from “finished” and every improvement just whets our appetite for more. And more is coming.
It’s a good thing WordPress doesn’t limit the length of posts, because this one would have hit it. If you made it this far, thanks for sharing a bit of your day with us. I sincerely hope this new version of WordPress helps you do what you love to do.