22:01 To find the bottleneck, stop guessing and start measuring » MySQL Performance Blog

We recently examined a customer’s system to try to speed up an ETL (Extraction, Transformation and Loading) process for a big data set into a sort of datamart or DW.  What we typically do is ask customers to run the process in question, and then examine what’s happening.  In this case, the (very large, powerful) database server was almost completely idle, with virtually no I/O activity or CPU usage.  So we looked at the server where the ETL process was running.  It was running at 25% CPU usage and was writing some files to disk, but not waiting on I/O.

What’s going on here?  Where’s the bottleneck?  The process is slow, and neither machine is really doing much work.  Why?

Maybe you guessed the network.  Nope, not the network either.  There was plenty of spare network capacity.

If I told you the ETL machine was using exactly 25% of its CPU capacity, would you guess that it had 4 CPU cores and one of them was running at 100% usage?  This is what was happening.  The ETL app was single-threaded and CPU-bound.

Of course, we measured the entire process, so we could say authoritatively what was going on.  But this problem actually took only a few minutes to diagnose.  The point here is to look beyond the database server for what seems to be a database problem.  This is why we call ourselves “full-stack performance tuning experts.”  We try not to have tunnel vision. This reminds me of another problem I helped debug a few weeks ago — a really slow website was due to a curl call that was hidden in the code, and timing out because of DNS issues.

One of the best things you can do to improve your performance is build profiling into your application, or run the application under a profiler (Google for “profiling <language>”) and find the places where it consumes the most time.  It’s especially valuable to profile “external resource calls” such as calls to the database, web services, and so on. These approaches can make it much easier to find the slow parts.

We devoted part of a chapter to profiling in our book.  We explain a lot of useful techniques to help you build “light-weight” profiling into the application from the start (a very smart thing to do).

Sometimes our clients believe they already know the source of the problem, but they can’t prove it. They ask us to either prove them right or show them the real problem. The way we do this is to prefer measurements to guesses.

Of course, finding the problem is only part of the battle. Fixing it is another matter. But “how to fix every performance problem” doesn’t fit into a single blog post!


Entry posted by Baron Schwartz | One comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks

20:54 自暴自弃(20080804) » 呆伯特

原图

20:53 不上班那点事(呆伯特-20080802) » 呆伯特

原图

19:20 Userstats patches with information schema support » MySQL Performance Blog

Recently, we added information schema support to Google’s userstats patch.

There are three information schema tables added: user_statistics, table_statistics, index_statistics.

One can now use select * from information_schema.user_statistics along with show user_statistics.

Links:
Patch for 5.0.62
Patch for 5.1.26


Entry posted by Evgeniy | 4 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks

16:03 Spammers Now Paying For Help Evading Filters » I got Spam?!
Spammers targeting blogs have come up with a new technique. According to an article on NetworkWorld.com, they are now paying people to post for them. The apparent theory is that comment made by an automated system are easy to spot and block, while comments made by a real human are more likely to get through moderation. Spammers are paying around $20 for 100 comments, and seem to be hiring people from India and other third world countries. Good work if you can get it? Probably not. See full article.

Related Entries:

Storm Worm Bots Activated - 14 November 2007

Storm Worm Strikes Again With New Year's Attack - 27 December 2007

Stock Spammers Using Video to Dodge Filters - 30 December 2007

Spammers Using Google To Evade Filters - 25 January 2008


Knock out SPAM at server level!

Contents of this feed are a property of Creative Weblogging Limited and are protected by copyright laws. Violations will be prosecuted. Please email us if you'd like to use this feed for non-commercial activities at feeds - at - creative-weblogging.com.
08:43 一致通过(呆伯特-20080731) » 呆伯特

原图


^==Back Home: www.chedong.com

^==Back Digest Home: www.chedong.com/digest/

<== 2008-08-03
  八月 2008  
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
==> 2008-08-05