Online poker isn't about staring a guy down to determine if he's bluffing. It is more about statistics -- examining playing patterns to improve your own game and exploit others. SharkScope, based in Alameda, Calif., maintains a MySQL database that tracks many of those statistics for customers. As online poker has boomed, so has SharkScope's database, and the company struggled with MySQL scalability.
SharkScope gathers data from popular online poker sites, digesting information from tournaments into MySQL for its customers to use. Aside from there being an estimated 6 million real-money online poker players, many players often do what's called multitabling, where more than one game is played at a time. The more extreme end of multitabling sees people playing 24 poker games or more simultaneously. This new way of playing poker online means more tournaments are being played, which means more data for SharkScope to collect.
"We had some huge data problems," CEO Steve Mayes said. "The data hadn't normalized properly when we were bringing it in, and we were growing. MySQL doesn't handle scalability that well."
SharkScope wanted to build its system to be able to sort both by player results and by individual tournament results, according to Mayes. But "having those two massive indexes really puts you at the mercy of the RAM," and SharkScope's existing MySQL compression efforts weren't enough.
SharkScope's MySQL infrastructure includes nine servers -- two masters and seven slaves. Its database is 451 GB in size and has 500 tables with 10 billion rows. The company was largely struggling with replication lag and a long cache-warming period when it rebooted the database. Most importantly, its customers were waiting longer than they wanted when they queried the data. SharkScope wanted customers to be able to get results instantly, but they weren't, and the company feared that customers might start leaving.
The database wasn't always that big. When SharkScope first starting tracking data in 2005, it was just "starting up in a bedroom" and was looking for a free database that could handle its modest needs. At the time the choice was between MySQL and PostgreSQL, and MySQL had a stellar reputation. MySQL compression an issue back then because SharkScope's database was so small.
"We had some experience in using Oracle, but we couldn't come close to affording that," Mayes said.
Solving scalability problems with MySQL compression
Mayes said the company could solve the problem in several ways. First, it could throw iron at the problem. Purchase more servers, which means more CPU and more RAM, and user wait times would drop. But SharkScope felt that wasn't addressing the root of the problem, which was the database itself. Data would continue to grow, and eventually the database would outgrow the hardware yet again.
SharkScope also considered moving to NoSQL database technology, which has a good reputation for supporting Web-based platforms in need of high scalability.
"[NoSQL] is the buzzword flying around," Mayes acknowledged. But SharkScope wanted to avoid moving to an entirely different database platform, as it would require a rewrite of the company's main application, possibly with some staff training to go along with it. SharkScope wasn't looking for the eventual consistency that NoSQL brings, Mayes added -- it needed the correct information quickly for players looking for results.
Next up was an evaluation of MyISAM, a common storage engine for MySQL. SharkScope had tried compressing tables with MyISAM, but that was a maintenance nightmare, Mayes said, because "you compress them, but then you have a lot of read-only tables. It was just impractical for us."
SharkScope took to Googling for an answer, and after looking around for a while, came upon TokuDB, which is an alternative MySQL storage engine. It is designed to improve performance on workloads that are write-intensive, which described SharkScope's database.
That was the answer the company was looking for. Server cache warm-up time went from hours to minutes. Querying times for users had been spiky, sometimes going higher than 20 seconds. After implementing TokuDB, SharkScope got sub-second query times on average. Finally, the company was able to use TokuDB for MySQL compression, shrinking its database to 166 GB, giving it plenty of storage on its existing servers and saving it from having to add more hardware.
"Most of the benefits you can trace back just to the data being smaller on the disk," Mayes said. "With less data, we have less I/O to transfer, and I/O has always killed us. It has always been about I/O. The benefit multiplies because then you can start storing indexes in your RAM because it's compressed. That again cuts down on I/O."
"It enabled us to use complex queries that weren't possible before," Mayes added. "It's pretty amazing."