Machine Learning Tackles the Infill-Well Placement Problem - 石油圈

In complex oil fields, choosing well locations is extremely difficult, and machine-learning techniques can deliver better predictions.

Compiled by 惊蛰

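For most conventional reservoirs, numerical simulation can identify the best locations for new wells and yield valuable forecasts. At the Lost Hills field, however, numerical simulation has not worked well because of the unusual properties of its diatomite reservoirs: low permeability but high porosity, weak rock strength, and strong imbibition. Machine learning (ML) does not require a specific physical model and can give good estimates when enough data are available, which makes it a candidate method for well placement.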

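The Lost Hills field lies about 45 miles northwest of Bakersfield, California, USA. It was discovered in 1910, and hydraulic fracturing was introduced in the late 1970s. Waterflooding began in 1992 and has become the main production method. The main reservoir rock is diatomite, which forms from the accumulation of diatoms, single-cell organisms. The average pay thickness is about 800 to 1,000 ft, the depth to the reservoir base is about 2,000 ft, and the estimated original oil in place exceeds 2 billion bbl. Well-location selection is critical when developing the field with infill wells, and the operator has tried several different approaches.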

ML Algorithms and Voronoi Diagrams

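K-means clustering is an unsupervised learning algorithm. Unlike supervised algorithms, unsupervised learning algorithms do not need to be given the correct answers. K-means clustering identifies the members of each cluster iteratively. In brief: 1. randomly generate a centroid for each cluster, where the number of clusters, K, is a user input; 2. assign each data point to its nearest centroid, creating K clusters; 3. update the centroid of each cluster so that the Euclidean distances from its members to the centroid are minimized; 4. repeat steps 2 and 3 until convergence.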

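SVM is a supervised ML algorithm; supervised learning algorithms must be given the answers in order to train and build a predictive model. SVM has two main applications, classification and regression, and this study mainly used classification. A Voronoi diagram divides space into regions around a set of points so that each region boundary is equidistant from the two nearest points. In reservoir engineering, these boundaries serve as a proxy for no-flow boundaries. In reservoirs where no-flow boundaries are hard to estimate, such as Lost Hills, a Voronoi diagram gives approximate drainage areas.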

Data Analysis and Model Building

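Data collection: Three types of data were collected: reservoir data, first-year oil-production rates, and completion data. Most reservoir and production data are structured. Most completion data, however, are unstructured, the types of information recorded are inconsistent, and values are often missing. To build a consistent training set, the following criteria were applied:

Producers that were fractured after waterflooding began in 1992;
Producers with known proppant amount, slurry volume, and pad volume; these three features are the main factors governing the magnitude of hydraulic fractures and can be obtained from the various formats of hydraulic-fracture data;
Wells with log data for oil saturation, porosity, and thickness.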

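In addition to the well data, the current oil in place (COIP) of each Voronoi cell was queried from the Lost Hills 3D Earth model. A producer-based Voronoi diagram was overlaid on the 3D model, the model was cut vertically along the Voronoi cell boundaries, and the COIP of each Voronoi column was calculated. Data from more than 3,000 wells were used to build the model's structural framework, and roughly half of them were used to define the property data and reservoir description. Historical production was allocated according to permeability, on the assumption that production is directly proportional to permeability.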

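Clustering results and analysis: Six attributes were used to divide the input data into two classes, Class 0 and Class 1. The attributes include proppant, pad, and slurry, among others. The two classes are clearly distinct: Class 1 producers sit in better-quality reservoir and have larger hydraulic fractures. All distributions and means for Class 1 are higher than those for Class 0. Wells in areas with superior reservoir quality and large hydraulic fractures tend to be the better producers. Although oil-production rate was not used in the K-means clustering, the results clearly show that Class 1 producers far outperform Class 0 producers: the mean oil-production rate of Class 1 producers is 1.76 times that of Class 0 producers.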

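Building a predictive model with SVM: The K-means clustering results were used to build the SVM input data set and train the model. The input data set was split randomly into two subsets, a training set and a testing set. The testing set was not used for training, only for checking how well the trained model predicts. The regularization parameters were determined by grid search to prevent overfitting. Training and testing accuracies were 0.997 and 0.987, respectively. Impact factors were calculated to estimate the sensitivity of the input attributes.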

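On the basis of these findings and the conditions at the Lost Hills field, the following workflow was developed for selecting infill-well locations:

1. Set a COIP-per-unit-area criterion and keep only the Voronoi cells that pass it. If only high-oil-saturation areas are to be developed, a relatively high value must be set; if more new wells are wanted, a lower value can be used. This study set the normalized COIP per unit area to the mean for Class 1 wells.
2. Select Voronoi cells that have not been drained effectively. Such cells contain few Class 1 producers, even when nonactive wells are counted.
3. Select locations near Voronoi boundaries (close to the areas not drained effectively) but far enough from abandoned and active producers and from injector lines to prevent early water breakthrough.
4. Repeat steps 1 through 3 until no locations remain.
5. Generate a new Voronoi diagram to recalculate the COIP per unit area.
6. Use the SVM model to estimate the class of each recommended location and rank the candidates.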

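In the end, an ML-based method was introduced to help select infill-well locations. It identified about 550 candidate well locations, which were ranked and recommended to the Lost Hills asset-development team. The well-siting workflow consists of historical-data analysis, drainage-area estimation, ML-model training, and candidate-location estimation. K-means clustering and statistical methods were used to analyze and identify the characteristics of the better producers. Voronoi diagrams give a relatively good estimate of the remaining oil in place and of the undrained areas because they can approximate no-flow boundaries. An SVM model was trained to rank locations by production capability. The new workflow offers a more systematic way to select new-well locations and lets the development team make data-driven decisions.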

English original

For most conventional reservoirs, numerical simulation is successful in forecasting and extracting valuable information regarding optimal locations for new wells. The results of numerical simulations for the Lost Hills field, however, were not successful because of the special characteristics of its diatomite reservoirs—low permeability but high porosity, weak rock strength, and strong imbibition. Machine learning (ML) has been considered because it does not require specific physical models but can provide good estimations with enough data.

Introduction

The Lost Hills field is approximately 45 miles northwest of Bakersfield, California, USA. It was discovered in 1910, and hydraulic fracturing was introduced in the late 1970s. Waterflooding was introduced in 1992 and has become the main production method. The main reservoir rock type is diatomite, which is formed by the accumulation of diatoms, single-cell organisms. The average pay thickness is approximately 800 to 1,000 ft, and depth to the reservoir base is approximately 2,000 ft. The estimated original oil in place is more than 2 billion bbl.

ML Algorithms and Voronoi Diagram

K-Means Clustering. K-means clustering is one type of unsupervised learning algorithm. As opposed to supervised algorithms, unsupervised learning algorithms do not need answers supplied. K‑means clustering algorithms identify the members of each cluster by iterating the following steps:

1. Generate the centroid of each of the clusters randomly. The number of clusters, K, is a user input.
2. Assign each data point to its nearest centroid so that K clusters are created.
3. Update the centroid of each cluster so that the Euclidean distances to the centroid are minimized.
4. Repeat Steps 2 and 3 until convergence has been reached.
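
The four steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' implementation: the function name `kmeans`, the `seed` and `max_iter` parameters, and the choice to draw the initial centroids from the data points are all assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 1: pick K distinct data points as the initial centroids (an assumption;
    # the article only says the centroids are generated randomly).
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 3: move each centroid to the mean of its members, which
        # minimizes the sum of squared Euclidean distances within the cluster.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        # Step 4: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

Calling `kmeans(points, 2)` on a list of attribute tuples returns one label per point plus the two final centroids. Which cluster receives label 0 or 1 depends on the random start, so the labels only gain meaning after the clusters are inspected.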

Support Vector Machine (SVM). SVM is a supervised ML algorithm. For supervised learning algorithms, the answers must be supplied to train and build a predictive model. SVM has two main applications, classification and regression, and classification is used in this study.

SVM builds a hyperplane or set of hyperplanes that separate data into groups (classification) or form a regression line; a hyperplane is a subspace of one dimension less than its ambient space (e.g., a plane in 3D space). Fig. 1 shows an example of classification in 2D space. The red line is a hyperplane set to have the maximum gap between two classes. Data vectors in circles are borderline members and are called support vectors. When building a hyperplane is difficult, data are mapped into a higher-dimensional space to facilitate finding a hyperplane.

Voronoi Diagram. A Voronoi diagram is the division of space into regions around each point shaped so that the borders of the regions are equidistant from the two nearest points. In reservoir-engineering applications, these boundaries are the proxy of a no-flow boundary. Especially in reservoirs with no-flow boundaries that are hard to estimate, such as in Lost Hills, a Voronoi diagram represents approximated drainage areas.
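
To make the drainage-area idea concrete, the sketch below approximates Voronoi cell areas by assigning every cell of a regular grid to its nearest well, which is the membership rule that defines a Voronoi cell. It is a raster approximation under assumed inputs (the function name `voronoi_cell_areas`, the rectangular field outline, and the grid resolution `n` are hypothetical). A production workflow would more likely build exact polygons with a computational-geometry library.

```python
import math

def voronoi_cell_areas(wells, xmin, xmax, ymin, ymax, n=200):
    """Approximate each well's Voronoi cell area inside a rectangle.

    Each of the n*n grid cells is credited to the nearest well, so the
    summed area per well approximates that well's drainage area.
    """
    dx = (xmax - xmin) / n
    dy = (ymax - ymin) / n
    areas = [0.0] * len(wells)
    for i in range(n):
        for j in range(n):
            centre = (xmin + (i + 0.5) * dx, ymin + (j + 0.5) * dy)
            nearest = min(
                range(len(wells)), key=lambda w: math.dist(centre, wells[w])
            )
            areas[nearest] += dx * dy
    return areas
```

For two wells placed symmetrically in a square, the boundary falls midway between them and each well is credited with half of the area. Dividing a cell's COIP by its area then gives the COIP/area measure used later in the workflow.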

Data Analysis and Model Building

Data Collection. Three types of data were collected: reservoir data, first-year oil-production rate, and completion data. Most reservoir and production data are structured data. Most completion data, however, are unstructured, and the types of available information are not consistent. Data are also often missing. To build a consistent set of training data, the following criteria were set:

Producers fractured after waterflooding began in 1992
Producers for which proppant amount, slurry volume, and pad volume are known; these three features are the main factors that affect the magnitude of hydraulic fractures and are all available from various formats of hydraulic-fracture data
Wells that have log data with oil-saturation (So), porosity (ø), and thickness (H) data available

In addition to the well data, the current oil in place (COIP) of each Voronoi cell was queried from the Lost Hills 3D Earth model. A producer-based Voronoi diagram was overlapped on top of the 3D model. Then, the 3D model was cut vertically along the Voronoi cell boundaries and the COIP values of Voronoi columns were calculated. Data from more than 3,000 wells were used in the construction of the model’s structural framework, with approximately half used in defining the property data and reservoir description. Historic production was assigned on the basis of permeability with the assumption that production is directly proportional to permeability.

Clustering Results and Analysis. Six attributes were used to classify input data into two classes, Class 0 and Class 1. The attributes were proppant, pad, slurry, So/ø/H, COIP per Voronoi cell area, and sand So/ø/H. The two classes show distinct characteristics in that Class 1 producers are better-quality reservoirs and have large hydraulic fractures. All distributions and means of Class 1 are higher than those of Class 0. Producers in areas of superior reservoir quality coupled with larger hydraulic fractures tend to be better oil producers. Even though oil-production rates were not used in K‑means clustering training, results show clearly that Class 1 producers are much better producers than Class 0 producers. The mean oil-production rate of Class 1 producers is 1.76 times greater than that of Class 0 producers.

Building Predictive Models With SVM. Using K-means clustering results, SVM input data sets were built and trained. The input data set was divided randomly into two subsets, a training set (80%) and a testing set (20%). The testing set was not used for training but was used for testing how well the trained model predicted. Regularization parameters were determined using the grid-searching method to prevent overfitting. Training and testing accuracies were 0.997 and 0.987, respectively. To estimate the sensitivity of the input attributes, impact factors were calculated.
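
The train/test split and the grid search over the regularization parameter can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the data are synthetic, the classifier is a linear soft-margin SVM trained by stochastic subgradient descent (the article does not say which SVM implementation or kernel was used), and the names `train_linear_svm`, `accuracy`, and `grid_search_C` are hypothetical. It does not reproduce the reported accuracies.

```python
import random

def train_linear_svm(X, y, C, epochs=50, lr=0.01):
    """Stochastic subgradient steps on 0.5*||w||^2 + C*hinge; labels are -1/+1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # hinge term active: point is inside the margin
                w = [wj - lr * (wj - C * yi * xj) for wj, xj in zip(w, xi)]
                b += lr * C * yi
            else:           # only the regularizer contributes
                w = [wj - lr * wj for wj in w]
    return w, b

def accuracy(model, X, y):
    w, b = model
    hits = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        hits += (1 if score >= 0 else -1) == yi
    return hits / len(y)

def grid_search_C(X, y, grid, folds=4):
    """Return the C with the best mean validation accuracy across folds."""
    best_C, best_score = None, -1.0
    for C in grid:
        scores = []
        for f in range(folds):
            val = set(range(f, len(X), folds))
            Xtr = [x for i, x in enumerate(X) if i not in val]
            ytr = [t for i, t in enumerate(y) if i not in val]
            Xva = [x for i, x in enumerate(X) if i in val]
            yva = [t for i, t in enumerate(y) if i in val]
            scores.append(accuracy(train_linear_svm(Xtr, ytr, C), Xva, yva))
        mean = sum(scores) / folds
        if mean > best_score:
            best_C, best_score = C, mean
    return best_C

rng = random.Random(0)
# Synthetic stand-in for the clustered wells: two well-separated groups.
X = [(rng.gauss(c, 0.5), rng.gauss(c, 0.5)) for c in (-3.0, 3.0) for _ in range(50)]
y = [-1] * 50 + [1] * 50
order = list(range(len(X)))
rng.shuffle(order)
cut = int(0.8 * len(order))  # 80% training, 20% testing, as in the study
train, test = order[:cut], order[cut:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_test, y_test = [X[i] for i in test], [y[i] for i in test]

best_C = grid_search_C(X_train, y_train, grid=[0.1, 1.0, 10.0])
model = train_linear_svm(X_train, y_train, best_C)
train_acc = accuracy(model, X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
```

The point of the design is that the testing set plays no part in choosing `C`. The grid search only sees folds of the training set, so the testing accuracy remains an honest check for overfitting.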

Procedures for Selecting Infill-Well Locations

On the basis of findings and Lost Hills field conditions, the following work flow was developed for selecting infill-well locations:

1. Set a COIP/area criterion and keep only the Voronoi cells that pass it. If only high-oil-saturation areas are to be developed, a relatively high value must be set; if a longer drilling queue is desired, a lower value can be set. This study used a normalized COIP/area equal to the mean for Class 1 wells.
2. Select Voronoi cells that have not been drained effectively. Such cells contain few Class 1 producers, even when nonactive wells are counted.
3. Select locations near Voronoi boundaries (near the area not drained effectively) but far enough from abandoned and active producers and from the injector lines to prevent early water breakthrough.
4. Repeat Steps 1 through 3 until the available locations are consumed.
5. Generate the new Voronoi diagram to recalculate the COIP/area.
6. Estimate the classes of recommended infill locations using the SVM model to rank candidate locations.
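
The screening and ranking parts of this work flow can be sketched as two small functions. The `Cell` record, its field names, and the `max_class1` cutoff are hypothetical and exist only for illustration. The article does not give a numeric definition of "few" Class 1 producers, and the spacing check against existing wells and injector lines is left out.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    name: str              # hypothetical cell identifier
    coip: float            # current oil in place of the Voronoi column
    area: float            # Voronoi cell area
    class1_producers: int  # Class 1 producers in the cell, active or not

def select_cells(cells, min_coip_per_area, max_class1=0):
    """Keep oil-rich cells (COIP/area test) that hold few Class 1 producers."""
    return [
        c for c in cells
        if c.coip / c.area >= min_coip_per_area
        and c.class1_producers <= max_class1
    ]

def rank_candidates(candidates, predict_class):
    """Order candidates so that locations predicted as Class 1 come first."""
    return sorted(candidates, key=predict_class, reverse=True)
```

Here `predict_class` stands for whatever trained classifier is available, and it should return 1 for a predicted Class 1 location and 0 otherwise. Because Python's sort is stable, candidates with the same predicted class keep their original order.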

Conclusions

A novel methodology using ML was introduced to aid in selection of infill-well locations. This approach led to the identification of approximately 550 infill-well candidate locations, which were recommended to the Lost Hills asset-development team for execution with rankings of all candidate locations.
The infill-well location-selection work flow consists of historical-data analysis, drainage-area estimation, ML-model training, and candidate-location estimation.
Using K-means clustering and statistical means, the characteristics of better producers are analyzed and identified.
Voronoi diagrams provide a relatively good estimation of the remaining oil in place and the nondraining areas because Voronoi diagrams can approximate no-flow boundaries.
An SVM model was trained to rank the locations on the basis of their production capabilities. The novel work flow provides a more-systematic means of new-well location selection and enables the asset team to make data-driven decisions.
