Business analysts versed in SQL have usually carried enterprise analytics efforts. Such traditional BI techniques are now being joined by new approaches as predictive machine learning moves from the research lab to front-line activity in mainstream organizations.
While SQL analytics tools continue to find wide use, the new classes of analytics and machine learning tools have led vendors to extend their language support beyond SQL. For Oracle, that need drove its acquisition earlier this year of DataScience.com, a small company with a data science platform that acts as a workbench-style hub for development, testing and deployment of predictive and machine learning models.
The people who use advanced analytics tools are as likely to use languages like R and Python as they are to use SQL. But the differences between data science and SQL analytics go deeper than just language.
Up from BI reporting
With traditional BI, you separate data into smaller parts by regularly asking set questions, said Tim Vlamis, vice president and analytics strategist at Vlamis Software Solutions, an Oracle partner based in Liberty, Mo., that focuses on analytics. Data science and machine learning differ from classic BI reporting, he said.
"In these approaches, we ask the data to tell us what the patterns show rather than asking queries and getting direct answers," Vlamis said. "It's a different way of dealing with data -- one with which a lot of very good business analysts may have trouble."
Demand for both data science and analytics skills is expected to keep rising, according to a 2017 poll conducted by Gallup. Performed for the Business-Higher Education Forum, the research showed that 69% of employers expect candidates with data science and analytics skills to get preference for jobs in their organizations by 2021.
Python enters the fray
The DataScience.com purchase is important for Oracle because it helps reinforce its position as a cloud provider, according to Mike Gualtieri, a Forrester Research analyst.
"Oracle is trying to build out a very competitive cloud portfolio in much the way that Microsoft and Amazon did," he said, pointing to Microsoft's Azure Machine Learning service and Amazon SageMaker from AWS.
Forrester has tracked DataScience.com both before and since the Oracle purchase. Gualtieri said the startup was building out an enterprise workbench platform that supported important open source tools for data science, but had yet to gain considerable customer traction when Oracle stepped in. It also faced competition from vendors like IBM, Cloudera and Domino Data Lab that offer similar platforms.
Nonetheless, DataScience.com supports a role that Oracle didn't have a way to fully support itself, he said: the role of the data scientist in organizations that want to improve collaboration and workflow management for their analytics teams.
Gualtieri noted that both Oracle and Microsoft moved a few years ago to support R as part of their relational database offerings, but that Python continues to gain ground as a language for data science. He said DataScience.com gives Oracle a platform capable of handling R, Python and other languages that are yet to come along for machine learning and predictive analytics.
Addressing enterprise gaps
Unveiled in May 2018, the DataScience.com deal closed in June. At the time of the purchase, Oracle outlined only sparse plans to add DataScience.com to its cloud and big data offerings via a new Data Science Cloud Service. Indications at the recent Oracle OpenWorld 2018 conference suggested it is still too early for the company to detail those plans further.
The gap that DataScience.com stands to fill is one between innovative new machine learning tooling and enterprise needs, according to Mahesh Thiagarajan, director of product management for Oracle Cloud Infrastructure, who spoke in an interview at OpenWorld 2018. The parade of tools like PyTorch, TensorFlow and others is hard to manage, he said.
"The innovation is happening in the open source community, and you want to get that innovation. But one of the biggest issues enterprise customers are facing is that there is so much tooling out there," Thiagarajan said.
Moreover, while much of the work of building predictive and machine learning models happens on the laptops or desktop systems of data scientists, it's important to properly protect data at the enterprise level. "You don't want that data exposed," he said.
DataScience.com will help Oracle move toward that goal, according to Thiagarajan, while "enabling collaboration in the cloud on an enterprise scale."
IDC analyst Dan Vesset said he also sees a benefit for Oracle in using DataScience.com to bring diverse advanced analytics efforts under an enterprise umbrella.
"There are a lot of discrete data science projects in a lot of companies," he said. "These are often one-off proof of concepts."
There's an opportunity to optimize the platform to run on Oracle infrastructure, but the data itself can be from any source, Vesset noted. He described DataScience.com as a platform for managing analytical modeling operations.
A challenge that Oracle faces, Vesset said, is getting out the message that this is for data scientists. The perception that Oracle is a database company steeped in SQL analytics won't be easy to change, he added.