Data As The New Oil: Producing Value for the Oil & Gas Industry

During a webinar for Data Science Central, Pivotal’s Senior Data Scientist, Rashmi Raghu, discussed how apt the popular metaphor “data is the new oil” is, particularly within the oil and gas industry. Advances in oil and gas production methods are generating an ever-increasing amount of sensor data, which companies can use to improve logistics, business operations, and more. By combining a wide range of data sources with big data technologies and techniques, oil companies can improve efficiency, realize new business opportunities, and enhance decision making.

There are certainly challenges and roadblocks for companies that want to draw new opportunities and insights from this growth in data production and analysis. As in many industries, oil and gas companies often keep data sources and skilled practitioners in silos, usually because of legacy software and policies. The systems rarely interoperate, and the domain experts who use them rarely communicate with one another. As a result, an organization may not know which data sources it has access to or what it can do with them.

In her presentation, Raghu made the case that the promise of data as “the new oil” is realized when we can tap into its value in a meaningful, cross-functional way to enhance decision making. It is precisely this collaboration, and access to all available data sources, that produces business value. In contrast to the siloed approaches common within the industry, a Data Lake model enables data to be stored centrally and curated in a meaningful way, and provides businesses with a comprehensive view of the truth. Integrating data assets leads to better-informed, more powerful models, and those models open up opportunities and insights that were previously out of reach. Moreover, businesses that operationalize the real-time application of predictive models can respond more rapidly to new events.

Such innovations can have an outsized effect upon gas and oil companies, across their operations. Data-driven use cases include predictive maintenance of equipment through the modeling of function and failure and the optimization of maintenance schedules, seismic imaging and inversion analysis, reservoir simulation and management, production optimization, supply chain optimization, and energy trading.

Most significant, perhaps, is the impact predictive analytics will have on drilling operations. The goals of such efforts are to increase efficiency, reduce costs, move towards zero unplanned downtime by predicting which equipment needs maintenance and establishing an early warning system for equipment failure, optimize drilling operations parameters, and improve health and safety while reducing environmental risks. The data for these tasks comes from a number of sources, including sensor-enabled machinery as well as data reported by operators.

Drilling wells is an expensive process, compounded by equipment failure. According to The American Oil & Gas Reporter, April 2014, drilling motor damage can account for 35% of rig non-productive time and $150,000 per incident.  By realizing an effective business data lake and introducing data science techniques, a business can track and predict drilling equipment function and failure, an important step on the path towards establishing early warning systems that ensure zero unplanned downtime. Downtime and failure can be reduced in a number of ways, as sensors demonstrate potential risk factors and businesses optimize their parameters for more efficient drilling.

During the presentation, Raghu demonstrated how the software within Pivotal’s Big Data Suite can be used throughout all phases of the analytics cycle. Pivotal’s products provide companies with a framework to integrate data from multiple sources across data warehouses and rig operators, the ability to analyze both structured and unstructured data in a unified manner, and support for developing complex and extensible models that predict equipment function and failure.

Predictive analytics for drill operations can be performed in a number of ways. This post considers two examples: predicting drill rate-of-penetration (ROP) and predicting drilling equipment failure. For both cases, there are a number of relevant data sources to consider. Drill rig sensor data includes depth, ROP, RPM, torque, weight on bit, and much more, and can amount to billions of records. Operator data, such as drill bit, failure, and component details, can amount to hundreds of thousands of records.

Performing analytics on such data sets requires a comprehensive framework for performing data integration at scale, which includes data cleansing and the standardization of columns. As in many big data analytics jobs, this presents a number of challenges: data sources do not use consistent entries in the features and columns that link them, there is the potential for errors in manually entered data, and sensor measurements can return invalid values due to malfunction.
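As a rough illustration of the cleansing step, the sketch below standardizes column names and drops records whose sensor readings are missing, non-numeric, or out of range. It is plain Python rather than anything from the Pivotal stack, and the alias table, column names, and bounds are hypothetical values chosen only for the example.

```python
import math

# Hypothetical alias table: maps source-specific column names to one standard name.
COLUMN_ALIASES = {
    "rate_of_penetration": "rop",
    "wob": "weight_on_bit",
    "rig": "rig_id",
}

def standardize_columns(record):
    """Normalize key spelling, then map known aliases to a standard name."""
    cleaned = {}
    for key, value in record.items():
        norm = key.strip().lower().replace(" ", "_").replace("-", "_")
        cleaned[COLUMN_ALIASES.get(norm, norm)] = value
    return cleaned

def is_valid_reading(value, low, high):
    """A reading is valid if it parses as a finite number inside [low, high]."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    return math.isfinite(number) and low <= number <= high

def cleanse(records, bounds):
    """Standardize columns and keep only records whose bounded columns are all valid."""
    kept = []
    for record in records:
        row = standardize_columns(record)
        if all(is_valid_reading(row.get(col), lo, hi)
               for col, (lo, hi) in bounds.items()):
            kept.append(row)
    return kept

# Example: three records from differently formatted (made-up) sources.
records = [
    {"Rig ID": "A1", "Rate-of-Penetration": "42.5", "WOB": 18.0},
    {"rig": "A1", "ROP": -999.25, "wob": 17.0},          # sentinel value from a faulty sensor
    {"rig_id": "B2", "rop": "n/a", "weight_on_bit": 20},  # manual-entry error
]
clean = cleanse(records, {"rop": (0, 500), "weight_on_bit": (0, 100)})
```

Dropping a whole record is the simplest policy; depending on the analysis, flagging or imputing the bad value may be preferable.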

Once such invalid values are accounted for and cleansed, the next step is understanding correlations in the available data. For useful results, summary statistics and correlations between variables need to be computed at scale for thousands of variable combinations. MADlib, included within the Pivotal Big Data Suite, can be particularly useful in this case thanks to its parallel implementations of the summary function (a generic function that produces summary statistics for any data table) and Pearson’s correlation (a measure of the strength of the linear relationship between two variables, which indicates how well one can be predicted from the other).
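To make the two computations concrete, here is a minimal single-machine sketch of per-column summary statistics and pairwise Pearson correlation. It is not MADlib's implementation or API; the function names and the toy columns are assumptions made for illustration, and the parallel, in-database execution described in the presentation is out of scope here.

```python
import math
from itertools import combinations

def summary(values):
    """Basic summary statistics for one numeric column (sample standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    return {"count": n, "min": min(values), "max": max(values),
            "mean": mean, "std": math.sqrt(var)}

def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)

def correlation_table(columns):
    """Correlation for every unordered pair of columns, keyed by sorted name pair."""
    return {(a, b): pearson(columns[a], columns[b])
            for a, b in combinations(sorted(columns), 2)}

# Toy columns (made-up values, not real rig data).
columns = {
    "rpm": [1.0, 2.0, 3.0, 4.0],
    "torque": [2.0, 4.0, 6.0, 8.0],
    "rop": [4.0, 3.0, 2.0, 1.0],
}
stats = {name: summary(vals) for name, vals in columns.items()}
table = correlation_table(columns)
```

The number of pairs grows quadratically with the number of variables, which is why a parallel implementation matters once there are thousands of combinations.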

When working with use cases that have a complex feature set across multiple data sources, it’s often useful to create features from time series variables rather than working strictly with raw data. One such class is statistical features computed over moving windows of time series data, a task that Pivotal’s Big Data Suite is well-equipped to perform rapidly using HAWQ (Pivotal’s SQL-on-Hadoop engine), MADlib, and PL/R. These tools can be used separately or in tandem for the fast computation of hundreds of features over time windows, using billions of rows of time series data. Pivotal Greenplum Database in particular has built-in support for dealing with time series data, which was covered in detail in a previous series of time series analysis blog posts.
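A moving-window feature can be sketched in a few lines. The example below computes the mean, standard deviation, minimum, and maximum over a trailing window of fixed length; the function name, window size, and readings are hypothetical, and this is plain Python rather than the SQL window functions or PL/R code the post refers to.

```python
import math
from collections import deque

def rolling_features(values, window):
    """Trailing-window statistics for each position that has a full window."""
    buf = deque(maxlen=window)  # oldest value is dropped automatically
    features = []
    for value in values:
        buf.append(value)
        if len(buf) == window:
            mean = sum(buf) / window
            # Population variance within the window.
            var = sum((x - mean) ** 2 for x in buf) / window
            features.append({
                "mean": mean,
                "std": math.sqrt(var),
                "min": min(buf),
                "max": max(buf),
            })
    return features

# Made-up torque readings, window of 3 samples.
torque = [1.0, 2.0, 3.0, 4.0, 5.0]
feats = rolling_features(torque, window=3)
```

Each output row summarizes recent behavior of one sensor, so a model can use trends and variability rather than a single instantaneous reading.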

Through these processes, predictive analytics can be performed for a number of drilling operations functions: predicting the rate of penetration, the occurrence of equipment failure in a chosen future time window, and the remaining life of equipment. This is done using elastic net regularized regression, a method that suits these problem statements, is easy to interpret, score, and operationalize, and yields a probability of failure in the binomial case. In this particular use case, our team leveraged MADlib’s in-database parallel processing capabilities to perform these tasks.
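For readers unfamiliar with the method, the following is a minimal sketch of elastic net regression fitted by coordinate descent, for the linear (Gaussian) case only. It is not MADlib's implementation, it does not cover the binomial case mentioned above, and the function names, penalty values, and toy data are assumptions made for illustration. It minimizes (1/2n)·||y − b0 − Xb||² + lam·(alpha·||b||₁ + (1 − alpha)/2·||b||²).

```python
def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def fit_elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for elastic net linear regression.

    alpha=1 gives the lasso penalty, alpha=0 gives ridge; the intercept is unpenalized.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    residual = [yi - intercept for yi in y]
    for _ in range(n_iter):
        # Re-fit the intercept to the current residual.
        shift = sum(residual) / n
        intercept += shift
        residual = [r - shift for r in residual]
        for j in range(p):
            col = [row[j] for row in X]
            denom = sum(c * c for c in col) / n + lam * (1 - alpha)
            if denom == 0:
                continue  # all-zero column with no ridge term: leave coefficient at 0
            # Correlation of feature j with the residual that excludes its own contribution.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam * alpha) / denom
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new_beta
    return intercept, beta

# Toy data (made up): y = 1 + 2*x1 + 0.1*x2, with centered, orthogonal features.
X = [[-1.5, 1.0], [-0.5, -1.0], [0.5, -1.0], [1.5, 1.0]]
y = [-1.9, -0.1, 1.9, 4.1]
b0_ols, beta_ols = fit_elastic_net(X, y, lam=0.0, alpha=1.0)      # no penalty
b0_lasso, beta_lasso = fit_elastic_net(X, y, lam=0.2, alpha=1.0)  # L1 penalty only
```

With the penalty switched on, the weak second feature is driven to exactly zero while the strong one is only shrunk, which is the source of the interpretability the method is valued for.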

Such predictive analytics can be applied to a number of aspects of drilling operations, using drill rig sensor and operator data to predict the rig’s rate of penetration as well as potential faults and failures. The goal is to achieve zero unplanned downtime for businesses in the oil and gas industry. This information can increase efficiency and reduce the cost and risk of drilling and maintaining wells. It also offers operational benefits: companies can make full use of big data, develop a comprehensive data integration framework for multiple complex data sources, build and operationalize predictive models, and ultimately gain a competitive advantage by leveraging the entire big data analytics pipeline.
