For most conventional reservoirs, numerical simulation is successful in forecasting and extracting valuable information regarding optimal locations for new wells. The results of numerical simulations for the Lost Hills field, however, were not successful because of the special characteristics of its diatomite reservoirs: low permeability despite high porosity, weak rock strength, and strong imbibition. Machine learning (ML) was considered because it does not require a specific physical model and can provide good estimates when enough data are available.
Introduction
The Lost Hills field is approximately 45 miles northwest of Bakersfield, California, USA. It was discovered in 1910, and hydraulic fracturing was introduced in the late 1970s. Waterflooding was introduced in 1992 and has become the main production method. The main reservoir rock type is diatomite, which is formed by the accumulation of diatoms, single-cell organisms. The average pay thickness is approximately 800 to 1,000 ft, and depth to the reservoir base is approximately 2,000 ft. The estimated original oil in place is more than 2 billion bbl.
ML Algorithms and Voronoi Diagram
K-Means Clustering. K-means clustering is one type of unsupervised learning algorithm. As opposed to supervised algorithms, unsupervised learning algorithms do not need answers supplied. K‑means clustering algorithms identify the members of each cluster by iterating the following steps:
- Generate the centroid of each of the clusters randomly. The number of clusters, K, is a user input.
- Each data point is assigned to its nearest centroid so that K clusters are created.
- The centroid of each cluster is updated to the mean of its members, which minimizes the sum of squared Euclidean distances from the members to the centroid.
- Repeat Steps 2 and 3 until convergence has been reached.
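The four steps above can be sketched in plain Python. This is a minimal illustration, not the study's implementation: the function name `kmeans`, the toy points, and the choice to seed the centroids from randomly chosen data points (one common way of generating random initial centroids) are all assumptions.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples (illustrative only)."""
    rng = random.Random(seed)
    # Step 1: pick K distinct data points as the initial centroids (an assumption).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Step 4: convergence is reached when no assignment changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical 2D data: two well-separated groups of three points each.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels, centroids = kmeans(points, k=2)
```

Which group receives label 0 and which receives label 1 depends on the random initialization, so only the grouping itself is meaningful.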
Support Vector Machine (SVM). SVM is a supervised ML algorithm. For supervised learning algorithms, the answers must be supplied to train and build a predictive model. SVM has two main applications, classification and regression; classification is used in this study.
SVM builds a hyperplane or set of hyperplanes that separate data into groups (classification) or form a regression line; a hyperplane is a subspace of one dimension less than its ambient space (e.g., a plane in 3D space). Fig. 1 shows an example of classification in 2D space. The red line is a hyperplane set to have the maximum gap between two classes. Data vectors in circles are borderline members and are called support vectors. When building a hyperplane is difficult, data are mapped into a higher-dimensional space to facilitate finding a hyperplane.
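The maximum-margin idea can be sketched for the linearly separable case as follows. This is a minimal illustration in plain Python, assuming a linear soft-margin SVM fitted by batch subgradient descent on the hinge loss. The solver, the function names `train_linear_svm` and `predict`, the hyperparameter values, and the toy data are all assumptions rather than the study's implementation, and the mapping to a higher-dimensional space is not shown.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Fit w, b for the hyperplane w.x + b = 0; labels y must be -1 or +1.

    Minimizes 0.5 * lam * |w|^2 + mean(max(0, 1 - y * (w.x + b)))
    by batch subgradient descent (an illustrative solver choice).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularization term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return the class (-1 or +1) on whose side of the hyperplane x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical 2D data: class -1 lies on x1 + x2 = 2, class +1 on x1 + x2 = 6.
X = [(1.0, 1.0), (1.5, 0.5), (0.5, 1.5), (3.0, 3.0), (3.5, 2.5), (2.5, 3.5)]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

In this toy set every point sits at the same distance from the separating line, so all six act as borderline members; in real data only a small subset of points typically determines the hyperplane.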
Voronoi Diagram. A Voronoi diagram divides space into one region around each point, shaped so that every location in a region is closer to that region's point than to any other; the borders of the regions are therefore equidistant from the two nearest points. In reservoir-engineering applications, these borders serve as a proxy for no-flow boundaries. In reservoirs where no-flow boundaries are hard to estimate, such as Lost Hills, a Voronoi diagram provides approximate drainage areas.
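The nearest-point rule behind a Voronoi diagram can be illustrated with a short sketch. The code below assumes a rectangular area sampled on a regular grid and approximates each cell's area by counting the sample points closest to each well; the function names, the well coordinates, and the grid resolution are hypothetical, and a production workflow would more likely use a computational-geometry library.

```python
import math

def nearest_site(point, sites):
    """Index of the site closest to `point`; that index labels the Voronoi cell."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def cell_areas(sites, xmin, xmax, ymin, ymax, n=200):
    """Approximate Voronoi cell areas by classifying an n-by-n grid of sample points."""
    dx, dy = (xmax - xmin) / n, (ymax - ymin) / n
    areas = [0.0] * len(sites)
    for ix in range(n):
        for iy in range(n):
            # Center of the grid square (ix, iy).
            p = (xmin + (ix + 0.5) * dx, ymin + (iy + 0.5) * dy)
            areas[nearest_site(p, sites)] += dx * dy
    return areas

# Two hypothetical producers placed symmetrically in a 1,000 x 1,000 area.
wells = [(250.0, 500.0), (750.0, 500.0)]
areas = cell_areas(wells, 0.0, 1000.0, 0.0, 1000.0, n=100)

# A point on the shared border is equidistant from both wells.
border_point = (500.0, 123.0)
d0 = math.dist(border_point, wells[0])
d1 = math.dist(border_point, wells[1])
```

With two symmetric wells the border is the vertical line halfway between them, so each approximated drainage area is half of the rectangle.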
Data Analysis and Model Building
Data Collection. Three types of data were collected: reservoir data, first-year oil-production rate, and completion data. Most reservoir and production data are structured. Most completion data, however, are unstructured, and the types of available information are not consistent. Data are also often missing. To build a consistent set of training data, the following criteria were set:
- Producers fractured after waterflooding began in 1992
- Producers for which proppant amount, slurry volume, and pad volume are known; these three features are the main factors that affect the magnitude of hydraulic fractures and are all available from various formats of hydraulic-fracture data
- Wells that have log data with oil-saturation (So), porosity (ø), and thickness (H) data available
In addition to the well data, the current oil in place (COIP) of each Voronoi cell was queried from the Lost Hills 3D Earth model. A producer-based Voronoi diagram was overlapped on top of the 3D model. Then, the 3D model was cut vertically along the Voronoi cell boundaries and the COIP values of Voronoi columns were calculated. Data from more than 3,000 wells were used in the construction of the model’s structural framework, with approximately half used in defining the property data and reservoir description. Historic production was assigned on the basis of permeability with the assumption that production is directly proportional to permeability.
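The vertical cut along Voronoi boundaries can be sketched on a regular grid. The example assumes the Earth model is available as a nested list `oip[k][j][i]` of oil in place per grid block (layer `k`, row `j`, column `i`) with square blocks of side `cell_size`. That layout, the function name, and all numbers are hypothetical and are not the structure of the Lost Hills model.

```python
import math

def coip_per_voronoi_cell(oip, cell_size, producers):
    """Sum oil in place over every vertical grid column, grouped by nearest producer.

    Returns (totals, areas): COIP and map-view area of each producer's Voronoi cell.
    """
    n_layers, n_rows, n_cols = len(oip), len(oip[0]), len(oip[0][0])
    totals = [0.0] * len(producers)
    areas = [0.0] * len(producers)
    for j in range(n_rows):
        for i in range(n_cols):
            center = ((i + 0.5) * cell_size, (j + 0.5) * cell_size)
            # The nearest producer owns this column (Voronoi membership).
            owner = min(range(len(producers)),
                        key=lambda p: math.dist(center, producers[p]))
            # Cut vertically: add every layer of the column to the owner's total.
            totals[owner] += sum(oip[k][j][i] for k in range(n_layers))
            areas[owner] += cell_size ** 2
    return totals, areas

# Hypothetical model: 2 layers, 2 rows, 4 columns, 100-unit blocks.
layer_top = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
layer_bottom = [[2.0, 2.0, 4.0, 4.0], [2.0, 2.0, 4.0, 4.0]]
oip = [layer_top, layer_bottom]
producers = [(100.0, 100.0), (300.0, 100.0)]

totals, areas = coip_per_voronoi_cell(oip, 100.0, producers)
coip_per_area = [t / a for t, a in zip(totals, areas)]
```

Dividing each total by its cell area gives the COIP-per-area attribute used later for clustering and screening.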
Clustering Results and Analysis. Six attributes were used to cluster the input data into two classes, Class 0 and Class 1. The attributes were proppant, pad, slurry, So/ø/H, COIP per Voronoi cell area, and sand So/ø/H. The two classes show distinct characteristics: Class 1 producers are located in better-quality reservoir and have larger hydraulic fractures. The distributions and means of all attributes are higher for Class 1 than for Class 0. Producers in areas of superior reservoir quality coupled with larger hydraulic fractures tend to be better oil producers. Even though oil-production rates were not used in K-means clustering training, results show clearly that Class 1 producers are much better producers than Class 0 producers. The mean oil-production rate of Class 1 producers is 1.76 times greater than that of Class 0 producers.
Building Predictive Models With SVM. Using the K-means clustering results as class labels, SVM input data sets were built and trained. The input data set was divided randomly into two subsets, a training set (80%) and a testing set (20%). The testing set was not used for training but was held out to test how well the trained model predicted. Regularization parameters were determined with a grid search to prevent overfitting. Training and testing accuracies were 0.997 and 0.987, respectively. To estimate the sensitivity of the input attributes, impact factors were calculated.
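The split-and-tune procedure can be sketched as follows. The code assumes a linear SVM fitted by subgradient descent as a stand-in classifier, synthetic two-class data, a four-value grid for the regularization strength `lam`, and four-fold cross-validation inside the training set; none of these choices, names, or values come from the study.

```python
import random

def train_linear_svm(X, y, lam, lr=0.05, epochs=300):
    """Stand-in classifier: linear SVM via batch subgradient descent, labels -1/+1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def accuracy(model, X, y):
    """Fraction of samples whose predicted class matches the label."""
    w, b = model
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1 for xi in X]
    return sum(p == yi for p, yi in zip(preds, y)) / len(y)

def split_80_20(n, seed=0):
    """Random 80/20 split of the indices 0..n-1."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = (4 * n) // 5
    return idx[:cut], idx[cut:]

def grid_search(X, y, grid, folds=4):
    """Return the lam with the best mean cross-validated accuracy on (X, y)."""
    best_lam, best_score = None, -1.0
    for lam in grid:
        scores = []
        for f in range(folds):
            trn = [i for i in range(len(X)) if i % folds != f]
            val = [i for i in range(len(X)) if i % folds == f]
            model = train_linear_svm([X[i] for i in trn], [y[i] for i in trn], lam)
            scores.append(accuracy(model, [X[i] for i in val], [y[i] for i in val]))
        mean_score = sum(scores) / folds
        if mean_score > best_score:
            best_lam, best_score = lam, mean_score
    return best_lam, best_score

# Synthetic data: two Gaussian blobs standing in for Class 0 (-1) and Class 1 (+1).
rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(25)]
X += [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(25)]
y = [-1] * 25 + [1] * 25

train_idx, test_idx = split_80_20(len(X))
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

grid = [0.001, 0.01, 0.1, 1.0]
best_lam, cv_score = grid_search(X_train, y_train, grid)
model = train_linear_svm(X_train, y_train, best_lam)
train_acc = accuracy(model, X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
```

The testing set is touched only once, after the regularization value has been chosen, so the testing accuracy remains an estimate of performance on unseen wells.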
Procedures for Selecting Infill-Well Locations
On the basis of findings and Lost Hills field conditions, the following work flow was developed for selecting infill-well locations:
- Set a COIP/area criterion and keep only the Voronoi cells that meet it. If only high-oil-saturation areas are to be developed, a relatively high value must be set; if a longer drilling queue is desired, a lower value can be set. This study used a normalized COIP/area threshold equal to the mean for Class 1 wells.
- Select Voronoi cells that have not been drained effectively. Such cells contain few Class 1 producers, inactive wells included.
- Select locations near Voronoi boundaries (near the area not drained effectively) but far enough from abandoned and active producers and from the injector lines to prevent early water breakthrough.
- Repeat Steps 1 through 3 until the available locations are consumed.
- Generate the new Voronoi diagram to recalculate the COIP/area.
- Estimate the classes of recommended infill locations using the SVM model to rank candidate locations.
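The screening and ranking parts of this work flow can be sketched as a filter followed by a sort. The example assumes each Voronoi cell carries a normalized COIP/area value, each candidate location carries a precomputed classifier score, and existing producers and injector lines are reduced to a list of keep-away points. The dictionary keys, the threshold, the spacing, and the scores are hypothetical; the iteration and Voronoi regeneration steps are omitted.

```python
import math

def select_candidates(cells, candidates, keep_away, coip_area_min, min_spacing):
    """Screen candidate locations and rank the survivors, best first."""
    # Keep only Voronoi cells that meet the COIP/area criterion.
    good_cells = {c["id"] for c in cells if c["coip_per_area"] >= coip_area_min}
    kept = [
        c for c in candidates
        if c["cell_id"] in good_cells
        # Stay far enough from existing wells and injector lines.
        and all(math.dist(c["xy"], p) >= min_spacing for p in keep_away)
    ]
    # Rank by the classifier output (here a hypothetical precomputed score).
    return sorted(kept, key=lambda c: c["score"], reverse=True)

cells = [
    {"id": "A", "coip_per_area": 1.2},
    {"id": "B", "coip_per_area": 0.6},
]
candidates = [
    {"name": "c1", "cell_id": "A", "xy": (100.0, 100.0), "score": 0.40},
    {"name": "c2", "cell_id": "A", "xy": (400.0, 100.0), "score": 0.90},
    {"name": "c3", "cell_id": "B", "xy": (800.0, 100.0), "score": 0.95},
    {"name": "c4", "cell_id": "A", "xy": (10.0, 10.0), "score": 0.99},
]
keep_away = [(0.0, 0.0)]  # one existing producer

ranked = select_candidates(cells, candidates, keep_away,
                           coip_area_min=1.0, min_spacing=100.0)
ranked_names = [c["name"] for c in ranked]
```

Here `c3` is dropped because its cell falls below the COIP/area threshold and `c4` because it sits too close to the existing producer, which leaves `c2` ranked ahead of `c1`.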
Conclusions
- A novel methodology using ML was introduced to aid in selection of infill-well locations. This approach led to the identification of approximately 550 infill-well candidate locations, which were recommended to the Lost Hills asset-development team for execution with rankings of all candidate locations.
- The infill-well location-selection work flow consists of historical-data analysis, drainage-area estimation, ML-model training, and candidate-location estimation.
- Using K-means clustering and statistical means, the characteristics of better producers were analyzed and identified.
- Voronoi diagrams provide a relatively good estimation of the remaining oil in place and the nondraining areas because Voronoi diagrams can approximate no-flow boundaries.
- An SVM model was trained to rank the locations on the basis of their production capabilities. The novel work flow provides a more-systematic means of new-well location selection and enables the asset team to make data-driven decisions.