Monday, May 6, 2013

Thoughts on move to 10gen and MongoDB as NoSQL market leader

It has been a little over a month since I joined 10gen and I want to share some of my impressions now that some of the newness has worn off and I am settling into my role. For context, in March I left Oracle and my position as Director of Product Management for the MySQL database to assume a similar role at 10gen working with MongoDB. I still have many good friends and respected colleagues on the MySQL team and truly believe that they and the MySQL product are in good hands under Oracle. That said, here are my thoughts on my move and on the database market as a whole (note that the emphasis here is on use case rather than on proprietary vs. open source):

10gen reminds me very much of MySQL AB prior to the Sun acquisition. 
I joined MySQL in 2005, a full 3 years before the Sun acquisition. Back then, there was little distinction across the titles and roles within the MySQL team. Basically, everything that everybody did every day made every bit of a difference. This meant that there was a cohesion across the entire team and that most tasks were met with a “can-do”, positive attitude by the owner or owners regardless of where they fell on the org chart. I see this same trait in the 10geners I have been in contact with and hope it carries on even as the company continues to grow.

10gen places a huge importance on all employees having a working technical knowledge of its products. 
I joined 10gen with years of application and database development experience, mostly using relational models. While most concepts carry over to a document-oriented model, there are enough technical differences to leave me a complete novice when it comes to specific MongoDB details and its practical use cases. I have spent my first month in a series of self-guided and instructor-led technical training sessions and a practical, real-world bootcamp, all of which have proven to be a welcome quickstart to understanding and working with the technology. This will pay great dividends as I get more overwhelmed with my true PM duties.

MongoDB is winning the NoSQL database market. 
More importantly, in winning this market it is also winning many general-use deployments and projects that have traditionally been implemented on other open source or proprietary RDBMS solutions. How is this happening? It really boils down to a few simple but important factors:
  • MongoDB is winning the hearts and minds of the developer. By providing flexible, direct access to schema and data definition via JSON, it leaves little or no developer learning curve when moving between application development and data definition and management.
  • MongoDB is a true “hero” maker. Replication and cluster-based sharding are designed as the default deployments and are comparatively simple to implement. Developers can add HA and scalability to their upfront deployment plan without adding pain to a DBA or Sysadmin’s life, which is a huge advantage over other databases’ modus operandi.
  • MongoDB users leverage > 90% of its functionality, without the complexity and overhead of unneeded features. On the flip side, most Oracle and SQL Server users/applications leverage < 20% of the features while paying for them all.
These things, along with the tremendous momentum around downloads, user events, and big-name community and customer success stories, are good indicators that MongoDB is poised to win not only the NoSQL database market but, in due time, the overall database market.
It took Oracle 30 years to build an empire; I believe 10gen + MongoDB can do better for both community users and paying customers in a much shorter timeframe if we remain focused on and true to the points noted above.
One final thought. I have gained a true appreciation and respect for Eliot, Dwight, Max and the other executive-level leaders of the 10gen business. They are clear in their ambitions for MongoDB and any supporting products that come out of 10gen in the days, months and years to come. They are on public record as saying that 10gen’s goal is to build a sustainable investment and business around the best and most widely used general-use database in the world. That long-term mentality, with many small sprints factored in along the way, is what excites me most about my move to the 10gen team.

Cheers, from 10gen employee/partner #237! Very excited to be here.

Friday, January 22, 2010

5.1 + new InnoDB Plugin

If you're wondering what MySQL 5.1 with the new InnoDB Plugin is all about, you'll want to tune into the webinar I am hosting next Tuesday. I'll cover the why, what and how behind the immediate performance and scale improvements that can be had, especially on modern (> 4 CPU) servers, by enabling the new plugin in MySQL 5.1. Learn more and get registered; it should be a good time!

You can also take it to a deeper level by joining Brian Miezejewski from our PS team on 3/9 for a practical guide to using and tuning the new plugin features to improve performance and scale. Hope you can join us!

Thursday, January 14, 2010

One reason a no indexes approach is nice

Like a lot of you, I’ve been following with interest Percona’s testing of the open source column databases. One thing I think is pretty cool about some column databases that work with MySQL is that they don’t require you to create indexes. The reason is, in general, the column is the index. Not having to create indexes is nice because lots of indexes can really bog down a database if you’ve got a lot of load or DML activities because the indexes have to be maintained for all data input and alterations.

In Percona’s test, they showed the load time for all the different databases, but I noticed that the times didn’t include the index creation for LucidDB or MonetDB. I decided to follow Vadim’s link on the LucidDB index creation and totaled up the time it took to create the indexes. For the index and statistics times, it was 384,314 seconds and when you add that to the 140,736 seconds for the table load, you get 6 days just to create the database. That’s quite a difference from the 6 hours for InfiniDB and 14 hours for InfoBright, both of which don’t need or use indexes.
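For anyone who wants to check the arithmetic, here is a small Python sketch that adds up the two LucidDB figures quoted above and converts the total to hours and days. The variable names are mine; the numbers are the ones from this post, and the 6-hour and 14-hour figures are the rounded load times mentioned above.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# LucidDB figures quoted in this post.
index_and_stats_seconds = 384_314
table_load_seconds = 140_736

total_seconds = index_and_stats_seconds + table_load_seconds
total_hours = total_seconds / SECONDS_PER_HOUR
total_days = total_seconds / SECONDS_PER_DAY

# Rounded load times quoted in this post for the two index-free engines.
other_load_hours = {"InfiniDB": 6, "InfoBright": 14}

print(f"LucidDB: {total_seconds:,} s = {total_hours:.1f} h = {total_days:.2f} days")
for name, hours in other_load_hours.items():
    # How many times longer the LucidDB load plus indexing took.
    print(f"{name}: {hours} h (LucidDB took about {total_hours / hours:.1f}x as long)")
```

The total comes to 525,050 seconds, which is just over 6 days.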

I’m sure indexes supply a benefit for some column DBs in various use cases, but if the database were really dynamic and required a lot of new objects with indexes to be added, continuous heavy loads, or DML, it would seem that indexes could really put a ding in things. In that case, it would seem column DBs that don’t require indexes could have an edge.

Wednesday, September 9, 2009

Less time finding, more time fixing! Enterprise Monitor 2.1, Updated Query Analyzer Now GA!

I just wanted to tip my hat to the MySQL Enterprise Tools Engineering team for another great release of the Enterprise Monitor. Not to name names, but I want to give a special thanks to a team that always over delivers on a collective commitment to producing quality software. So, a mega thanks to:

Andy Bang, Sloan Childers, Darren Oldag, Eric Herman, Jan Kneschke, Kay Roepke, Mark Matthews, Bill Weber, Diego Medina, Marcos Palacios, Carsten "Pino" Segieth, Josh Sled, Keith Russell, Mark Leith, Heidi Bergh-Hoff, and Gary Whizin (and also welcome Michael Schuster!)

Yet another great job guys!

The new version, 2.1, was posted as GA early on Tuesday and it is quite possibly the best release of the Enterprise Monitor to date.

For those not familiar with the Enterprise Monitor, it is included in a MySQL Enterprise subscription and is a distributed web-based app that users deploy in their environment to monitor and tune the security, performance and availability of their dev, QA and production MySQL servers. It is composed of:

  • An agent, written in C, that is installed on each monitored data source, which collects MySQL and OS metrics, SQL code and exec results
  • A central server (aka "service manager"), written in Java, that collects, monitors and alerts on the data collected by each agent. The service manager uses MySQL Best Practice Advisors to measure the collected data against user-defined thresholds and to proactively notify DBAs of problems or tuning opportunities. Alerts are sent to a central console or via SMTP or SNMP notifications.
  • A repository that holds the data collected by each agent. The service manager monitors the data stored in the repository rather than maintaining a persistent connection to each MySQL server.
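To make the agent / repository / service manager split a little more concrete, here is a rough Python sketch of the threshold idea: samples that agents have written to the repository are checked against user-defined warning and critical thresholds. This is only an illustration, not the Monitor's actual code (the service manager is written in Java); the `AdvisorRule` class, the `check_repository` function, the metric name and the server names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdvisorRule:
    """Hypothetical rule: flag a metric that reaches a user-defined threshold."""
    metric: str
    warning: float
    critical: float

    def evaluate(self, value: float) -> str:
        if value >= self.critical:
            return "critical"
        if value >= self.warning:
            return "warning"
        return "ok"


def check_repository(samples, rules):
    """Return (server, metric, level) alerts for samples that breach a rule.

    `samples` stands in for rows the agents stored in the repository:
    a list of (server, metric, value) tuples.
    """
    by_metric = {rule.metric: rule for rule in rules}
    alerts = []
    for server, metric, value in samples:
        rule = by_metric.get(metric)
        if rule is None:
            continue  # no rule defined for this metric
        level = rule.evaluate(value)
        if level != "ok":
            alerts.append((server, metric, level))
    return alerts


# Made-up rule and samples, purely for illustration.
rules = [AdvisorRule("connections_used_pct", warning=75, critical=90)]
samples = [
    ("prod-db1", "connections_used_pct", 93.0),
    ("qa-db1", "connections_used_pct", 41.0),
    ("dev-db1", "connections_used_pct", 80.0),
]
alerts = check_repository(samples, rules)
for server, metric, level in alerts:
    print(f"{level.upper()}: {metric} on {server}")
```

The point of the design is that the checks run against stored samples, so the central server never needs a live connection to each monitored MySQL instance.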
The Enterprise Monitor also provides MySQL-specific monitoring features. There is a Replication Monitor that proactively monitors replication topologies for sync and performance issues and a new Query Analyzer that made its debut in release 2.0 in late 2008. The Query Analyzer is designed to save DBAs/Devs time in finding the most expensive queries (by total exec time, exec count, amount of data returned, etc.) running across all dev, QA and prod MySQL servers without any dependence on the MySQL logs or things like SHOW PROCESSLIST. You can learn more about the Query Analyzer here.

Based on customer interviews and years spent in the field, we understand the pain associated with finding bottlenecks rooted in poorly written or inefficient SQL code (this is consistently the #1 problem we hear when talking with MySQL DBAs and devs.) This new release helps a DBA/Dev spend less time "finding" and more time "fixing" poorly performing queries. The key new features in the Query Analyzer include:

  • Clickable MySQL and OS graphs that visually correlate system and query activity. Mouse over a spike in any graph and drill into the queries that were running at the same time. Big time saver.
  • Drill-down capability for query-specific executions: drill into any query and see execution-specific graphs for exec time, count, and data returned. Helps you see the "normal" exec pattern for a query and identify outliers that may occur during specific windows of time or for specific variable combinations.
  • Counts for SQL errors/warnings - help you quickly identify queries that may have never finished or that finished in error. These routinely go undetected by the MySQL logs, etc.
  • UI support for EXPLAIN generation threshold - gone are the days of hacking the quan.lua script to set this value!
We have also added new Advisor Rules and Graphs around connections, stale table statistics, tables without indexes or PKs, and locked and long-running processes.

The new release also includes an updated "What's New" page that allows you to optionally subscribe to live feeds for your open support issues (nice, especially for the issues with a status of "waiting on customer") and for MySQL product updates and alerts. Nice time saver, especially for getting updates on new releases of the Enterprise Server and for Monitor Advisors and graphs.

So how do you get it? Glad you asked... if you are an Enterprise subscriber you can grab the new release and all of the updated docs from the Enterprise Customer download page. If you are interested in learning more or want to try the new Monitor and Query Analyzer for yourself, you can register for a 30-day trial subscription, which includes a fully functioning version of the Monitor and Query Analyzer. You can also learn everything about MySQL Enterprise by visiting the MySQL web site.

Look forward to hearing about your experience with the new release!

Tuesday, September 1, 2009

Update on MySQL Enterprise Monitor and "Quan"

Just a quick update that the new MySQL Enterprise Monitor has reached RC readiness and is set for official launch in the next few weeks. This is version 2.1 and features:
  • Enhanced Query Analyzer with correlation graphs so you can highlight and drill into spikes in key system resources to see the queries that were running at the same time
  • Query specific execution graphs so you can track the "normal" behavior of your queries over time
  • GUI support for Query Analyzer EXPLAIN generation
  • Live feeds for MySQL product updates and alerts
  • Live feeds for your open support issues
  • New Advisor Rules and Graphs
  • and some other things...
Here's a screenshot of the Query Analyzer and correlation graphs...


Enterprise subscribers can grab the RC here. If you aren't a subscriber but want to try it for yourself, you can get it here.

Look for the official GA announcement early next week. Another great job by the MySQL Enterprise Tools Engineering team!

Wednesday, September 3, 2008

MySQL Query Analyzer: Quick Update

As expected, many people are interested in the "what" behind our new Enterprise Monitor 2.0 w/Query Analyzer, and again, as expected more than a fair number are asking "OK, so when can we try it?" With that in mind, here is a quick update on where things stand:

- MySQL Enterprise subscribers can download build 7038 (soon to be 704x) and all of the docs from the MySQL Enterprise customer site. It has been a popular download and we should have a refreshed build very soon.

- We are planning to post a public beta in the next 2-3 weeks, most likely after we (the Enterprise Tools engineering team) all return from our annual Engineering meeting (this year it is in Riga, Latvia). That event wraps up on 9/24.

- I am doing a webcast on Query Analyzer for our friends in EMEA tomorrow. You can learn more and get registered here. I did the same presentation in the US back on 8/20, so if you can't make the live event, the 8/20 recording/demo is here.

As the public beta kicks into gear I am interested in hearing your feedback on the Enterprise Monitor 2.0, especially the new Query Analyzer. Please plan to actively participate by using our public discussion forums (feel free to start posting now!) In advance of the public beta I would love to get your feedback on the Monitor/Advisors and how we can improve things to better fit your needs. The forums are open and your honest feedback is greatly appreciated.

Thursday, August 28, 2008

MySQL Query Analyzer: Tracking query executions

From a performance standpoint, sometimes even tightly tuned queries can cause a performance drag. The common problem here is not one of actual query performance, rather it is a function of:

- the velocity and frequency with which a query is submitted for execution
- the total execution time of the aggregated executions

This could be symptomatic of an application not properly configured for caching (see Darren Oldag's blog on this!), or just overall poor design. Regardless of why, when or how, we all know it happens. The trouble with this particular problem is that when a query is tuned, or very simple, it is usually not suspected of being a resource hog. Pulling aggregates for number of execs and total exec time for specific queries is a little tricky and labor intensive with the Slow Query Log, and not really a good option for SHOW PROCESSLIST. With this in mind, we designed the Query Analyzer to aggregate these values for quick reference in the Enterprise Monitor. Take a look:



We have also been listening when Mark Callaghan talks about reporting rollups for the top-N objects that are consuming resources on the server. Given we will probably have to wait until 6.0 to get the SHOW STATS extensions for this, we are looking at creative ways we can do this now using the proxy and Monitor service agent.
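As a rough illustration of the aggregation idea (execution count and total execution time per query, ranked as a top-N), here is a minimal Python sketch. It is not how the Query Analyzer is implemented; the `normalize` and `top_queries` functions, the regex-based literal stripping and the sample data are all assumptions made for the example.

```python
import re
from collections import defaultdict


def normalize(sql: str) -> str:
    """Replace literals with '?' so executions of the same statement group together."""
    sql = re.sub(r"'[^']*'", "?", sql)             # quoted string literals
    sql = re.sub(r"\b\d+(\.\d+)?\b", "?", sql)     # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def top_queries(executions, n=3):
    """Aggregate (sql, elapsed_ms) pairs and return the n costliest by total time."""
    stats = defaultdict(lambda: {"count": 0, "total_ms": 0.0})
    for sql, elapsed_ms in executions:
        entry = stats[normalize(sql)]
        entry["count"] += 1
        entry["total_ms"] += elapsed_ms
    ranked = sorted(stats.items(), key=lambda kv: kv[1]["total_ms"], reverse=True)
    return ranked[:n]


# Made-up sample: a fast query run often vs. a slower query run once.
executions = [
    ("SELECT name FROM users WHERE id = 1", 2.0),
    ("SELECT name FROM users WHERE id = 2", 2.0),
    ("SELECT name FROM users WHERE id = 3", 2.0),
    ("SELECT * FROM orders WHERE status = 'open'", 5.0),
]
top = top_queries(executions, n=2)
for query, agg in top:
    print(f"{agg['total_ms']:.1f} ms over {agg['count']} execs: {query}")
```

In this sample the 2 ms lookup ranks first because its three executions add up to more total time than the single 5 ms query, which is exactly the kind of "tuned but expensive in aggregate" case described above.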