FanGraphs’ advanced baseball analytics has a new cloud home: MariaDB

With the 2021 Major League Baseball season opening today, fans will be filling out their scorecards as they return to stadiums for the first time since the COVID-19 pandemic took hold last spring.

Of course, the data that MLB now regularly makes available goes well beyond the hits, runs, and errors fans typically record on a scorecard they purchase at a game. MLB has made the Statcast tool available since 2015. It analyzes player movements and athletic abilities. The Hawk-Eye service uses cameras installed at ballparks to provide access to instant video replays.

Fans now regularly consult a raft of online sites that use this data to analyze almost every aspect of baseball: top pitching prospects, players who hit the most consistently in a particular ballpark during a specific time of day, and so on.

One of those sites is FanGraphs, which has moved the SQL relational database platform it relies on to process and analyze structured data to a curated instance of the open source MariaDB database. That instance is deployed on the Google Cloud Platform (GCP) as part of the MariaDB SkySQL cloud service.

MariaDB provides IT organizations with an alternative to the open source MySQL database, which Oracle gained control of when it acquired Sun Microsystems in a deal announced in 2009. MariaDB is a fork of MySQL that is now managed under the auspices of the MariaDB Foundation, which counts Microsoft, Alibaba, Tencent, ServiceNow, and IBM among its sponsors, alongside MariaDB itself.

FanGraphs uses the data it collects to enable its editorial teams to deliver articles and podcasts that project, for example, playoff odds for a team based on the results of the SQL queries the company crafts. These insights might be of particular interest to a baseball fan participating in a fantasy league, someone who wants to place a more informed wager on a game at a venue where gambling is, hopefully, legal, or developers of baseball video games.

The decision to move from MySQL to MariaDB running on GCP was made after a few false starts involving attempts to lift and shift the company’s MySQL database instance into the cloud, FanGraphs CEO David Appelman said.

FanGraphs was attracted to MariaDB by the level of performance it could attain using a database-as-a-service (DBaaS) platform based on MariaDB, and by the access that platform provides to a columnstore storage engine that might one day be employed to drive additional analytics, Appelman said.

In addition, MariaDB now manages the underlying database FanGraphs uses. Appelman said he previously handled most of the IT functions for FanGraphs, including the crafting of SQL queries. Now he will have more time to create SQL queries and monitor the impact they have on the performance of the overall database. “I like to see where the bottlenecks created by a SQL query are,” he added.
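
As a rough illustration of the kind of bottleneck hunting Appelman describes, the sketch below asks a database how it plans to run a query before and after an index is added. It is not FanGraphs’ code or schema: the `batting` table, its columns, the sample rows, and the `query_plan` helper are all hypothetical, and Python’s built-in SQLite module stands in for MariaDB so the example runs anywhere. MariaDB has its own `EXPLAIN` statement, and its output looks different.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the plan descriptions SQLite reports for a query (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # Each row is a 4-tuple; the last field is the human-readable plan step.
    return [row[3] for row in rows]


# In-memory database with a made-up table; none of this reflects real FanGraphs data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE batting (player TEXT, season INTEGER, hits INTEGER)")
conn.executemany(
    "INSERT INTO batting VALUES (?, ?, ?)",
    [("player_a", 2019, 150), ("player_b", 2020, 60), ("player_c", 2020, 71)],
)

sql = "SELECT player, hits FROM batting WHERE season = 2020"

# Without an index, the filter on season forces a scan of the whole table.
plan_before = query_plan(conn, sql)

# With an index on season, the database can jump straight to matching rows.
conn.execute("CREATE INDEX idx_batting_season ON batting (season)")
plan_after = query_plan(conn, sql)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies by SQLite version, but the first plan should report a scan of the table and the second a search that uses `idx_batting_season`. On a large table, that difference is often where a slow query comes from.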

FanGraphs plans to eventually take advantage of the data warehouse service provided by MariaDB, Appelman noted.

It’s unlikely that the analytics capabilities provided by FanGraphs and similar sites will ever be able to predict which baseball team will win on any given day. However, the insights they surface do make the current generation of baseball fans a lot more informed about the nuances of the game than Abner Doubleday probably could have imagined.
