Monday, 28 March 2011

Last Week in Drizzle

Welcome to this week's (slightly late) edition of Last Week in Drizzle.  This week sees the kick-off of many new features for the next release of Drizzle codenamed 'Fremont' and the mailing list is a hive of activity around Google Summer of Code.  I apologise for publishing a few days late this week and will try and stay on-track for future editions.

Fremont


In the tradition of Drizzle using Seattle road names in alphabetical order for codenames, the next release of Drizzle is codenamed 'Fremont' (the current GA release is codenamed 'Elliott').  Monty Taylor has outlined the merge process going forward in his mailing list post.

Google Summer of Code


We have been accepted for Google Summer of Code 2011 and are getting a lot of interest from potential applicants.  If you are interested in working on the Drizzle project as part of GSoC we have the following recommended instructions:

  1. Check out our wiki page on potential projects

  2. Email the mailing list with an introduction about yourself and join the Freenode #drizzle IRC channel to chat to us

  3. Look at our low-hanging-fruit tasks and try to take one or two on.  This gets you used to the code and Launchpad processes and gives us an insight into your abilities


Xtrabackup


Stewart Smith has been working hard on integrating Percona's Xtrabackup with Drizzle.  Xtrabackup is an online backup tool for InnoDB much like MySQL's Enterprise Backup.  This is nearly ready for merging into Drizzle and Stewart has written a great blog post on the subject.

Catalogs


Stewart has been a really busy guy this last week.  Another project he has been working on is getting catalog support working with more than one catalog.  For those not familiar with catalogs, they are a way of totally isolating one user's databases from another's, similar to having multiple installations of Drizzle on one box but all running from one daemon.  In the GA release a lot of the framework already existed for catalogs and everything in it runs from a catalog called 'local'.  More information on the progress Stewart has made can be found in his blog post.

Libdrizzle 2.0


In Fremont we are working towards Libdrizzle 2.0.  This is a C++ version of Libdrizzle with a C compatible API.  Eventually it will contain new features such as native sharding (we are still working on filling out a potential features list for it).  For now in the Drizzle trunk you can see libdrizzle has been moved to libdrizzle-1.0 and a new libdrizzle-2.0 directory exists for the new work.

Multi-Master Replication


David Shrewsbury has been working on multi-master replication in Drizzle with a beta release ready to try.  By multi-master I mean having multiple masters write to a single slave.  For more information on this work take a look at his blog post.

Final Thoughts


Development is starting to move forward at a rapid pace for our next GA and we have had a lot of branches merged that I haven't discussed here from people such as Olaf van der Spek who has contributed a lot towards code cleanups.

As always if you have any feedback or topics you would like me to cover, please let me know.

Tuesday, 22 March 2011

Using Wordpress 3.1 on Drizzle

Since the GA release of Drizzle7 I've had several people asking me about how to convert their MySQL sites to use Drizzle instead.  By far the most common one to crop up is Wordpress.  This is intended to be a simple guide to starting a new blog using Wordpress 3.1 and Drizzle.

Initial Problems


Wordpress by design is very MySQL orientated.  For the most part this is a good thing, but when trying to switch to another database there can be complications.  An attempt has been made to create a plugin to use Drizzle, but unfortunately it has side-effects such as modifying your content if you happen to blog about anything related to MySQL or Drizzle.  For the purposes of this blog post I have created a patch and will give instructions on how to use it below.  If any Wordpress guru has a way to make this into a good plugin, please get in touch!

Conversions Needed


Almost all the conversions for Wordpress 3.1 revolve around the date.  When creating a draft or any other table entry Wordpress uses the date '0000-00-00' in several columns.  In Drizzle we try to be closer to the SQL standards, and this means that the first valid date is '0001-01-01'.  A large majority of the patch is this particular conversion for the queries throughout the PHP code.  The rest is to do with schema creation, to be specific:

  1. Drizzle has no LONGTEXT, TINYTEXT, etc...  Just TEXT

  2. Drizzle doesn't support multiple character sets, just UTF-8, so we need to drop the character set part of schema creation
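
As a rough illustration of the three conversions above (and not the contents of the actual patch, which edits the PHP source), here is a minimal Python sketch.  The function name to_drizzle_sql and the regular expressions are assumptions made for this example only.

```python
import re

# Assumption: these patterns are illustrative and are NOT taken from the
# real wordpress-drizzle patch.
ZERO_DATE = re.compile(r"0000-00-00( 00:00:00)?")
SIZED_TEXT = re.compile(r"\b(?:TINY|MEDIUM|LONG)TEXT\b", re.IGNORECASE)
CHARSET_CLAUSE = re.compile(
    r"\s*(?:DEFAULT\s+)?(?:CHARACTER\s+SET|CHARSET)\s*=?\s*\w+", re.IGNORECASE
)


def to_drizzle_sql(sql: str) -> str:
    """Apply the three conversions described above to one SQL statement."""
    # '0000-00-00' is not a valid date in Drizzle; use the first valid date
    # and keep any time-of-day part that followed the zero date.
    sql = ZERO_DATE.sub(lambda m: "0001-01-01" + (m.group(1) or ""), sql)
    # Drizzle only has TEXT, so collapse the sized variants.
    sql = SIZED_TEXT.sub("TEXT", sql)
    # Drizzle is UTF-8 only, so drop the character set clause.
    sql = CHARSET_CLAUSE.sub("", sql)
    return sql
```

Passing a hypothetical statement such as "CREATE TABLE t (body LONGTEXT, d DATE DEFAULT '0000-00-00') DEFAULT CHARACTER SET utf8" through to_drizzle_sql gives a TEXT column, a default of '0001-01-01' and no character set clause.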


The Patch


To patch your Wordpress 3.1 source:

  1. Download the patch

  2. Enter the directory of your wordpress installation

  3. Run the following


patch -p1 < wordpress-drizzle.diff

You should now be good to run the install as normal.  Note that if you are not using the mysql-unix-socket-protocol plugin you should tell Wordpress to connect to '127.0.0.1' for a local database instead of 'localhost'.

Converting an Installation


If you already have Wordpress 3.1 installed and using MySQL, the patch combined with drizzledump's migration function should still work, but I have not tried this, so please back up first before attempting it.

Friday, 18 March 2011

Last Week in Drizzle

Welcome to the latest edition of Last Week in Drizzle.  This week we announced our GA release!!  Interest in Drizzle in the last week has been much higher than anticipated; this blog alone got 4,500 visitors on Wednesday! (which was also a nice test for the Drizzle database powering it)

GA Release


So, on Tuesday the tarball was cut for our GA release called Drizzle7.  Most of the changes from the last week relate to code cleanup, documentation and test suite improvements so that we could keep the codebase stable ready for the release (also many of us are busy writing conference talks around now :) ).  For a quick summary of what to expect in Drizzle7 and the future you can see my three-part special called "Drizzle - The Icing on the Cake": part 1, part 2 and part 3.

There have also been blog posts from several members of the Drizzle team, including Brian Aker, Patrick Crews and Stewart Smith.  More can be found on Planet Drizzle.

Drizzle in the Media


We have had huge coverage by many technology publications which has been fantastic to read, I'll try to link to some of them here but you should be able to find more by searching Google News:

There has also been a lot of coverage in foreign media which is also fantastic to see.

Drizzle Downloads


The only metric we have for the number of downloads of Drizzle is the source downloads.  We can't record the usage of the Ubuntu PPA or the installations using the pre-release of Ubuntu Natty (which has Drizzle in its repositories) and currently don't log the RPM yum repository.  But on source downloads alone there were more downloads in the first 2 days of GA than the 2 weeks of the RC2 release (I count a shade under 400 downloads at the time of publishing this).  On top of this many more people have come to the #drizzle IRC channel on Freenode to ask us questions and even one or two minor bugs have already been found by new users.

Criticisms


The biggest criticism I have seen so far is the name 'Drizzle' for a database.  I personally like the name 'Drizzle', but I come from the UK where drizzle is a very regular weather condition.  I actually think that is a really good thing: if the only big complaint is the name we must be going right somewhere :)

Fremont


Development has already started on the next version of Drizzle, codenamed Fremont (Drizzle7's codename was Elliott).  This time we aim to make you wait a little less time, with the next GA scheduled for later this year.  For those who haven't guessed it already, the codenames for Drizzle are based on road names in Seattle (another place where drizzle regularly happens).  It hasn't yet been decided whether this will be Drizzle7.1 or Drizzle8 and I'm sure we would take suggestions on this at the upcoming Drizzle Developer Day.

Final Thoughts


The overall feedback from the last week has been fantastic.  I'd like to thank everyone who has given us great coverage, everyone who has tried Drizzle and last but not least everyone who develops Drizzle whether it be at Rackspace, another company or just a community developer.

As always if you have any feedback or topics you would like me to cover, please let me know.

Thursday, 17 March 2011

Drizzle - The Icing on the Cake - part 3

[Photo by Sifu Renka under a CC BY-NC-SA 2.0 license]

In the second part of this three part special on the GA release of Drizzle7 I covered the development and testing model we use for Drizzle.  In this final part I will cover what you can expect from the future of Drizzle.

What to expect in the future


Whilst we are very proud of the GA release of Drizzle7 there are still features we would like to implement that we could not complete before the release.  Whether the next release is Drizzle7.1 or Drizzle8, we haven't quite decided yet.  But one thing is for sure, we will not be making you wait 3 years for it, expect the next GA to come later this year!  Some of the features I outline here might not make the next GA but we will work our hardest to make sure as much gets in as possible.

Several of the features outlined in this post are planned as possible Google Summer of Code projects, so if you are interested in picking one up please see our GSoC wiki page.

Data Types


We have done a lot of work on data types in Drizzle7 including microsecond precision TIMESTAMP along with new native BOOLEAN and UUID types.  We are planning a few additional types including a native IP address type and a TUPLE type which will encapsulate a replacement for the currently missing SET data type.

Catalogs


This is a big one and can be difficult to explain, but I will give it a go.  We have already put much of the framework for catalogs into Drizzle7 and hidden away from the user there is a built in 'local' catalog used when starting Drizzle.  But what is a catalog?

Think of it as a way of running multiple instances of the Drizzle server under one daemon.  So, inside a catalog you have users, schemas, tables, etc... and each catalog is isolated from the others.  If you think about this for a moment it is a huge deal in many ways.  For example if you have some kind of shared service (such as a cloud) each user of the cloud can have their own catalog, completely isolating their data from everyone else.
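
To make the isolation idea a little more concrete, here is a toy Python model.  It is purely conceptual: the Catalog and Server classes and the tables_visible_to method are hypothetical names invented for this sketch and say nothing about how catalogs are implemented inside Drizzle.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Catalog:
    """One isolated namespace with its own users and its own schemas."""
    name: str
    users: Set[str] = field(default_factory=set)
    # Maps a schema name to the set of table names inside it.
    schemas: Dict[str, Set[str]] = field(default_factory=dict)


@dataclass
class Server:
    """A single daemon hosting many catalogs (hypothetical model)."""
    catalogs: Dict[str, Catalog] = field(default_factory=dict)

    def create_catalog(self, name: str) -> Catalog:
        if name in self.catalogs:
            raise ValueError(f"catalog {name!r} already exists")
        self.catalogs[name] = Catalog(name)
        return self.catalogs[name]

    def tables_visible_to(self, catalog: str, user: str, schema: str) -> Set[str]:
        """Return tables in one schema, looking only inside one catalog."""
        cat = self.catalogs[catalog]
        if user not in cat.users:
            # Users belong to exactly one catalog in this toy model.
            raise PermissionError(f"{user!r} is not a user of catalog {catalog!r}")
        return set(cat.schemas.get(schema, set()))
```

In this model two catalogs can each contain a schema called "blog" without ever seeing the other's tables, which is the property that makes catalogs attractive for shared services.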

On top of this we are planning to add a tunable per-catalog limits system so that you can have some catalogs with higher priority of resources than others.

Stored Procedures


One of the first things we dropped in Drizzle was stored procedures.  In many articles it has been written that they are gone and will never return.  But in Drizzle if there is demand for something and the developer time to do it, we will bring it in.  What people may not realise is that part of the framework for this already exists and is used for the slave applier (much like the slave SQL thread in MySQL).  We will be doing stored procedures, we will be doing them properly and they will be done as a plugin so that if you don't want them you don't have to have them.

HailDB


Drizzle7 has an optional HailDB plugin.  But let's take a step back as many won't know what HailDB is.  HailDB is a fork of the Embedded InnoDB source code with many fixes and improvements in it.  It can be used completely independently of Drizzle and integrated into your own code.  It is the eventual goal to have HailDB take a bigger role in Drizzle, possibly replacing InnoDB as the primary storage engine.

Summary


This is just a small sample of the things we have planned.  But the great thing about Drizzle is you, the community, help shape it.  If there is something you feel Drizzle needs we may be able to include it.  Being able to code helps but contributions come in many forms, from helping in mailing lists and the IRC channel, to documentation, to filing bugs, to raw code.  Every contribution is valuable, and every contribution helps to evolve Drizzle.

I hope this three-part blog post has been useful.  If you have any questions please direct them to the #drizzle Freenode IRC channel, the mailing list or even contact me directly.

Wednesday, 16 March 2011

Drizzle - The Icing on the Cake - part 2

[Photo by Stéfan under a CC BY-NC-SA 2.0 license]

In part 1 of this 3 part series I talked about what is new in the recently released Drizzle7 and what makes it different to MySQL.  In this part I will talk about the development and testing processes behind Drizzle.

The Development Model


Drizzle is developed differently to many open source products.  Instead of dual-licensing, Drizzle is developed by companies and users that actually use the product.  No part of it is closed-source and there is no contributor agreement to sign.  We have had many open source developers come from seemingly nowhere to join in development, which is fantastic.  Development happens on Launchpad using the Bazaar version control system so that everyone can see what is happening.

Bug reports are also on Launchpad, where it is pretty easy to search, track and file bugs.  If you have a problem running Drizzle that may or may not be a bug, Launchpad has 'Questions' which are a bit like support tickets.  We also have the mailing list and Freenode #drizzle IRC channel to ask any questions on.

Testing


Every code branch to be merged goes through the same process regardless of whether it came from a developer from Rackspace or a general community developer.  First the code goes through a peer review and then gets tested on every platform we support using the Jenkins Continuous Integration system.  This doesn't just test to see if the code compiles and runs the test suite, every branch also goes through a Valgrind run and multiple performance benchmarks to make sure there are no regressions (and also to see if a branch improves performance).  All results of these tests are publicly available on our Builds and Benchmarks mailing lists.

Google Summer of Code


We are big fans of GSoC at Drizzle, and every year we have more and more students come to us asking to be a part of GSoC.  Many of these students have gone on to get really good jobs in the database industry straight after GSoC.  If you are a student and are interested in being a part of GSoC you can find a list of projects on our wiki page.  To register your interest please contact the mailing list.

O'Reilly MySQL Conference and Expo


We have 12 talks lined up for the conference as well as a section in the "Mastering the MySQL And Drizzle Plugin Development" tutorial.  This year the focus is much more on using Drizzle rather than the development of it.  But anyone interested is welcome to ask us questions.  We will have a Drizzle booth in the Expo hall for those who wish to come and have a chat.

Drizzle Developer Day


If you are at the MySQL conference and want to take part in shaping the future of Drizzle or just want to listen to talks about Drizzle development processes please come along to the Drizzle Developer Day which will be on Friday 15th April (the day after the UC).

Summary


One of the keys to Drizzle's success is the open development model.  Anyone wanting to join in can see our documentation on the subject or contact us on the Freenode #drizzle channel and the mailing list.

In part 3 I will discuss the features currently planned and in-progress for future versions of Drizzle.

 

Tuesday, 15 March 2011

Drizzle - The Icing on the Cake - part 1

As I'm sure all of you know already, today marks the GA release of Drizzle7.  But what was the recipe behind Drizzle?

  1. Take the raw ingredients from previous delicious, well-tried recipes

  2. Sieve out the lumps, separate the eggs and mix

  3. Bake using many cooks in many ovens for around 3 years

  4. Drizzle the special source on top


What is Drizzle?


There are many marketing buzzwords which can describe what Drizzle is such as "A lightweight, microkernel fork of MySQL optimized for the web and cloud".  To me such things are pretty meaningless.  So, let's start at the beginning...

Several people inside MySQL saw that the code could really do with re-factoring.  At the same time they believed that the focus was heading away from its core web-based installations.  They also loved open source, and whilst MySQL is open source, community contributions can be difficult.  These people got together inside Sun Microsystems (and other companies) to create a completely open development of a fork of MySQL 6.0 called "Drizzle".  They aimed to make it easy for new developers to pick up and develop on, and moved many parts out to plugins, thereby making it light on resources when features are not needed.

In 2010 the original development team moved to Rackspace and several more members were hired (including me), with the aim of Drizzle being used in its cloud-based products.  Even today the number of active community contributors is higher than the number of developers inside Rackspace working on Drizzle.

Differences From MySQL


I have been asked many times what the differences are between MySQL and Drizzle.  This is something I could probably write a book on now.  Something that should be clear at this stage is that Drizzle is not MySQL.  It was MySQL over 3 years ago but a lot has changed since then.  Having said that, applications that use MySQL can usually be converted to use Drizzle relatively easily.  For new users to Drizzle, here are a few of the key differences:

  • Strictness - Drizzle doesn't assume what you mean (which can cause incorrectly recorded data).  For example trying to store an invalid ENUM will error instead of storing an empty value.

  • Data Types - Drizzle has removed, altered and added data types to simplify things and become closer to the SQL standard.  For example:

    • There is no TINY/SMALL/MEDIUM INT, just INT (and BIGINT).

    • There is no TINY/MEDIUM/LONG TEXT/BLOB, there is just TEXT/BLOB.

    • TIMESTAMP supports microsecond precision.

    • UUID and a true BOOLEAN type added.



  • Replication - Drizzle's replication uses Google Protocol Buffer messages so a replication reader can be written in any language in minutes.  The replication data is stored in InnoDB as part of the transaction as it is being committed so that writing the replication log is very fast.

  • Development - Drizzle is developed using a completely open development model which I will discuss in part 2.

  • Licensing - The main drizzle source is GPLv2 licensed, libdrizzle is BSD licensed and the docs (which are also included in the docs directory of the source) are CC SA 3.0 licensed.  There is no proprietary licensing for any part of Drizzle.
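
The strictness point at the top of the list above can be illustrated with a small Python sketch.  This is only a model of the two behaviours described there; ALLOWED_SIZES, store_permissive and store_strict are hypothetical names and none of this is code from MySQL or Drizzle.

```python
# Hypothetical model of an ENUM column; not code from MySQL or Drizzle.
ALLOWED_SIZES = ("small", "medium", "large")


def store_permissive(value: str) -> str:
    """Guess at what was meant: an invalid value is silently stored as ''."""
    return value if value in ALLOWED_SIZES else ""


def store_strict(value: str) -> str:
    """Refuse to guess: an invalid value raises, so nothing wrong is stored."""
    if value not in ALLOWED_SIZES:
        raise ValueError(f"invalid ENUM value: {value!r}")
    return value
```

With the permissive function a typo such as "huge" quietly becomes an empty string in the table, whereas the strict function forces the application to notice and deal with the bad value at write time.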


Compatibility With MySQL


Despite many changes there is still a great deal of compatibility with MySQL.  Drizzle speaks the MySQL protocol, so existing MySQL connectors for PHP/Perl/etc... will also connect to and query Drizzle.  The SQL syntax is still very similar to MySQL and on top of all this, drizzledump (which is very similar to mysqldump) can convert table structures and data from MySQL to Drizzle on-the-fly.

Drizzle also includes libdrizzle.  This is a BSD licensed client library written in C which can talk to MySQL and Drizzle servers.  From our testing, as well as the testing of developers who are integrating libdrizzle into their products, it appears that libdrizzle performs better than libmysqlclient too.  Connectors for libdrizzle have been written for most widely used languages such as Python, Java, Perl and PHP.

Plugins


Drizzle uses a completely new plugin architecture so that almost everything is a plugin.  From storage engines, to functions, to protocols, to authentication and even query cache.  This makes it much simpler to switch off the parts you don't use as well as customising Drizzle for your unique application.  In total there are around 80 plugins bundled in the Drizzle source and several others available around the web.

Despite this we have tried to make this easy for most people by having the plugins that most people will use compiled and running by default.

Summary


It is almost impossible to get a feel for what Drizzle is like without trying it for yourself.  We have had some great feedback both positive and negative, and have made changes thanks to this feedback.  We are all very approachable on #drizzle on Freenode and the Drizzle mailing list.

In part 2 I will discuss the open development model behind Drizzle.

Friday, 11 March 2011

Last Week in Drizzle

Welcome to this week's edition of "Last Week in Drizzle".  As an introduction this week I would like to quote John David Duncan's recent Facebook post: "And what's in the weather forecast for next week? Drizzle.".  Yes, our first GA release is due next week, does that mean the development pace has slowed?  Heck no!  Over 150,000 lines of bzr diff in the trunk since last week and quite a few branches still in the merge queue going through our extensive regression testing system.

Google Summer of Code


We have once again applied to be part of the Google Summer of Code program.  We had some great students last year, and some new faces interested in being students on projects for Drizzle have already started taking on some low-hanging-fruit tasks to get them used to our code and processes.  We will have a sign-up form up soon for anyone interested in being part of the program, which I will blog about when ready.  In the meantime you can read our wiki page about participation and if you have any suggestions for projects this year, please let us know.

Race to GA


We are just a few short days away now from the first Drizzle GA.  The release schedule for Drizzle7 is as follows:

RC1 - 14th February 2011 Released
RC2 - 28th February 2011 Released
GA - 14th March 2011

Engine Removal


By "engine removal" I don't mean the poor state of my car but the fact that we have removed some of the bundled storage engines from Drizzle this week.  This is because some needed maintaining, some didn't quite fit in with Drizzle and some just plain didn't compile any more.  This also helps us as developers support Drizzle by concentrating on the storage engines that are important to users.  Will the removed engines be gone forever?  If there is demand for them and they can be maintained, they will return.  The removed engines are archive, blackhole, filesystem_engine, blitzdb, csv and pbxt.

Node.js


Mariano Iglesias has created a node.js binding for libdrizzle.  In his words "the libdrizzle binding is outperforming node.js mysql bindings by a factor of at least 2 to 1" which is great to hear, especially since it can be used against a MySQL server.  An example of how to use it can be found here.

Libdrizzle Only Option


Monty Taylor has added the configure option --without-server to go along with "make libdrizzle" which will only compile libdrizzle from the drizzle trunk.  This should help anyone who only requires libdrizzle from source and doesn't want to have the dependencies required for the server to get it.

Authentication Defaults


Brian Aker has outlined changes to authentication such as the requirement of a username to connect to Drizzle and only listening on localhost by default.  Further details can be found on the mailing list where he is also asking for feedback on changes.

Final Thoughts


This time of year is incredibly busy for us, preparing for the GA release whilst getting ready to give lots of conference talks and other such things.  But despite this spirits are still high.  I for one am very proud of what has been achieved in Drizzle by the team at Rackspace and other companies and community members involved.  I hope new users coming to Drizzle find it as exciting as we do.

As always if you have any feedback or topics you would like me to cover, please let me know.

Friday, 4 March 2011

Last Week in Drizzle

Welcome to the third edition of Last Week in Drizzle.  The diff of the trunk between last Friday and right now is just over 230,000 lines in size, 10x the size of the previous week!  This includes many changes to the documentation, code clean-ups and Patrick Crews' continued work on our new DBQP test suite.

Replication


David Shrewsbury (I'm going to spell his name correctly this week ;)) and Patrick Crews have been working hard on making replication even more rock solid.  The slave plugin is in, working and is stable with everything we can throw at it.

Drizzle developer day


We have a Drizzle Developer Day at the 2011 O'Reilly MySQL Conference and Expo.  Anyone is welcome to come and learn, contribute and make suggestions about Drizzle.  It will be on Friday 15th April from 9:30 a.m. to 5:00 p.m.  The exact location is to be confirmed, but it will likely take place in the Santa Clara Convention Centre.

If you wish to come along please sign-up here.

Race to GA


Our RC2 was released on the 28th of February and included the stabilised replication code amongst other fixes.  So the release schedule for Drizzle7 is:

RC1 - 14th February 2011 Released
RC2 - 28th February 2011 Released
GA - 14th March 2011

SQLAlchemy


Monty Taylor has been working hard to get Drizzle working with SQLAlchemy which has a very rigorous test suite.  This will help make it easier for Drizzle to act as a data store for OpenStack Compute which uses SQLAlchemy.

Libdrizzle


We have noticed that people have been downloading libdrizzle from the old libdrizzle Launchpad page.  Back in October we merged libdrizzle into the main Drizzle trunk and since then all bug fixes/development has happened there.  We highly recommend not using the old libdrizzle (which has known bugs) and we are in the process of shutting down the old libdrizzle development page.

Drizzle module for PHP


We have started developing the Drizzle module for PHP in Launchpad and have created a new release of this which is compatible with the libdrizzle in the Drizzle trunk.  We are working to get the fixes into PECL but in the meantime we will be developing in Launchpad and basing any binary releases on this.

Of course, Drizzle is compatible with the MySQL protocol so the existing MySQL functions and classes will work with Drizzle.

Drizzle module for Perl


Patrick Galbraith has released version 0.303 of DBD::Drizzle which contains several fixes for talking to a Drizzle server.

Final Thoughts


We are pretty much in the home stretch for the first GA: years of work from a huge number of contributors will result in a stable release.  Thanks to the many people who have helped make this happen.  We receive valuable feedback every day and all of it goes to make Drizzle a better product.

As always if you have any feedback or topics you would like me to cover, please let me know.

Thursday, 3 March 2011

libdrizzle

We have had several users report issues with libdrizzle lately, but on closer inspection it has been found they are using an old version with known problems.

Back in October we merged libdrizzle into the main drizzle trunk.  All libdrizzle development since then has happened in drizzle rather than the separate libdrizzle project.  We had intended to shut down the libdrizzle project page but for several reasons it had not happened.  The libdrizzle project page now has a message to state that you should use drizzle instead and we have pulled the downloads down.  In the next few weeks we intend to:

  1. kill the libdrizzle project page completely

  2. devise a way to compile just libdrizzle when using the drizzle trunk, removing the need to have all of drizzle's dependencies to compile it.


On a related note, I have created a release of the drizzle module for PHP on Launchpad which is now compatible with current libdrizzle.  I am also in the process of working on packaging for this as well as a PDO module.