Technical Guide: Gaussian Processes and Kriging for Reservoir Modeling
A practical guide to spatial interpolation with uncertainty quantification
Overview
Reservoir modeling requires accurate spatial estimates of rock properties. Oil and gas production forecasts, recovery strategies, and capital allocation decisions all depend on how well we understand the permeability field. But we rarely observe permeability directly across the entire volume—we measure it at sparse well locations.
The Challenge: Construct a gridded model of reservoir properties that:
- Honors observed data at well locations
- Provides reliable estimates elsewhere
- Quantifies uncertainty for decision-making
- Handles non-stationary spatial patterns
The Solution: Gaussian Process (GP) regression provides a flexible, probabilistic framework that addresses all these requirements.
Theoretical Foundation
Why Gaussian Processes?
Traditional interpolation methods have limitations:
| Method | Strengths | Limitations |
|---|---|---|
| Linear Interpolation | Fast, simple | No uncertainty, ignores spatial structure |
| Inverse Distance | Intuitive | Arbitrary power parameter, no uncertainty |
| Kriging | Spatial covariance | Assumes stationarity, requires variogram |
| Gaussian Processes | Flexible, probabilistic | Computational cost for large datasets |
GP Advantages:
- Probabilistic: Returns mean AND variance
- Flexible: Custom kernels for different spatial patterns
- Principled: Bayesian framework
- Extensible: Easy to add features beyond coordinates
Mathematical Framework
A Gaussian Process is a collection of random variables, any finite number of which have a joint Gaussian distribution.
Definition:
Where:
m(x)is the mean function (often constant)k(x, x')is the covariance (kernel) function
Prediction:
Given training data (X, y) and test points X*:
f* | X, y, X* ~ N(μ*, Σ*)
μ* = K(X*, X)[K(X, X) + σ²I]⁻¹ y
Σ* = K(X*, X*) - K(X*, X)[K(X, X) + σ²I]⁻¹ K(X, X*)
Where K is the kernel matrix and σ² is noise variance.
Kernel Selection
The kernel encodes assumptions about spatial correlation:
1. Radial Basis Function (RBF)
- Infinitely differentiable
- Smooth predictions
- Good for continuous properties
2. Matérn Kernel
- Controls smoothness via ν parameter
- ν = 0.5: Exponential (rough)
- ν = 1.5: Once differentiable
- ν = 2.5: Twice differentiable
- ν → ∞: RBF
3. Composite Kernels
Implementation with PyGeomodeling
1. Data Preparation
from spe9_geomodeling import GRDECLParser, UnifiedSPE9Toolkit
import numpy as np
# Load SPE9 data
parser = GRDECLParser('data/SPE9.GRDECL')
data = parser.load_data()
# Extract permeability
permx = data['properties']['PERMX']
nx, ny, nz = data['dimensions']
print(f"Grid: {nx} × {ny} × {nz} = {nx*ny*nz} cells")
print(f"PERMX range: [{permx.min():.2f}, {permx.max():.2f}] mD")
Key Steps:
- Filter unrealistic values (zeros, negatives)
- Log-transform for normality
- Create coordinate arrays
- Normalize coordinates
# Create coordinates
x = np.arange(nx)
y = np.arange(ny)
z = np.arange(nz)
X, Y, Z = np.meshgrid(x, y, z, indexing='ij')
coordinates = np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])
values = permx.ravel()
# Filter valid values
mask = values > 0.1 # Remove unrealistic values
coordinates = coordinates[mask]
values = values[mask]
# Log transform
log_values = np.log(values)
print(f"Valid samples: {len(values)}")
2. Variogram Analysis
Before fitting a GP, analyze spatial correlation structure:
from spe9_geomodeling import (
compute_experimental_variogram,
fit_variogram_model,
plot_variogram
)
# Compute experimental variogram (2D slice for visualization)
layer = 5
coords_2d = coordinates[coordinates[:, 2] == layer][:, :2]
vals_2d = log_values[coordinates[:, 2] == layer]
lags, semi_var, n_pairs = compute_experimental_variogram(
coords_2d, vals_2d, n_lags=15
)
# Fit spherical model
vario_model = fit_variogram_model(
lags, semi_var,
model_type='spherical',
weights=np.sqrt(n_pairs)
)
print(f"Nugget: {vario_model.nugget:.4f}")
print(f"Sill: {vario_model.sill:.4f}")
print(f"Range: {vario_model.range_param:.2f}")
# Visualize
plot_variogram(lags, semi_var, model=vario_model, n_pairs=n_pairs)
Interpretation:
- Nugget: Measurement error or micro-scale variability
- Sill: Total variance (should match sample variance)
- Range: Distance of spatial influence (~correlation length)
3. Gaussian Process Model
Using scikit-learn (Smaller Datasets)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, ConstantKernel, WhiteKernel
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Split data
X_train, X_test, y_train, y_test = train_test_split(
coordinates, log_values, test_size=0.2, random_state=42
)
# Normalize coordinates
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Define composite kernel
kernel = (
ConstantKernel(1.0) * RBF(length_scale=1.0) +
ConstantKernel(1.0) * Matern(length_scale=1.0, nu=1.5) +
WhiteKernel(noise_level=0.1)
)
# Create and train GP
gp = GaussianProcessRegressor(
kernel=kernel,
n_restarts_optimizer=10,
random_state=42,
normalize_y=True
)
print("Training GP model...")
gp.fit(X_train_scaled, y_train)
print(f"Optimized kernel: {gp.kernel_}")
print(f"Log-likelihood: {gp.log_marginal_likelihood_value_:.2f}")
Using GPyTorch (Larger Datasets)
import torch
import gpytorch
class ExactGPModel(gpytorch.models.ExactGP):
def __init__(self, train_x, train_y, likelihood):
super().__init__(train_x, train_y, likelihood)
self.mean_module = gpytorch.means.ConstantMean()
self.covar_module = gpytorch.kernels.ScaleKernel(
gpytorch.kernels.RBFKernel() +
gpytorch.kernels.MaternKernel(nu=1.5)
)
def forward(self, x):
mean_x = self.mean_module(x)
covar_x = self.covar_module(x)
return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
# Convert to tensors
train_x = torch.FloatTensor(X_train_scaled)
train_y = torch.FloatTensor(y_train)
# Initialize model
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
# Training
model.train()
likelihood.train()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
print("Training GPyTorch model...")
for i in range(50):
optimizer.zero_grad()
output = model(train_x)
loss = -mll(output, train_y)
loss.backward()
optimizer.step()
if (i + 1) % 10 == 0:
print(f"Iter {i+1}/50 - Loss: {loss.item():.3f}")
4. Prediction with Uncertainty
# Predict on test set
y_pred, y_std = gp.predict(X_test_scaled, return_std=True)
# Back-transform from log space
perm_pred = np.exp(y_pred)
perm_true = np.exp(y_test)
# Compute metrics
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
r2 = r2_score(perm_true, perm_pred)
rmse = np.sqrt(mean_squared_error(perm_true, perm_pred))
mae = mean_absolute_error(perm_true, perm_pred)
print(f"\nTest Set Performance:")
print(f" R²: {r2:.4f}")
print(f" RMSE: {rmse:.2f} mD")
print(f" MAE: {mae:.2f} mD")
# Uncertainty statistics
print(f"\nUncertainty (log space):")
print(f" Mean std: {y_std.mean():.4f}")
print(f" Min std: {y_std.min():.4f}")
print(f" Max std: {y_std.max():.4f}")
5. Full Field Prediction
# Predict across entire grid
print("\nPredicting full field...")
X_full_scaled = scaler.transform(coordinates)
# Predict in batches for memory efficiency
from spe9_geomodeling import BatchPredictor
predictor = BatchPredictor(batch_size=5000, n_jobs=-1)
y_full_pred, y_full_std = predictor.predict(
gp, X_full_scaled, return_std=True
)
# Reshape to 3D grid
perm_pred_3d = np.exp(y_full_pred).reshape(nx, ny, nz)
perm_std_3d = y_full_std.reshape(nx, ny, nz)
print(f"Predicted field shape: {perm_pred_3d.shape}")
6. Visualization
import matplotlib.pyplot as plt
# Plot predictions vs truth
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
for i, layer_idx in enumerate([0, nz//2, nz-1]):
# Predicted
im1 = axes[0, i].imshow(
perm_pred_3d[:, :, layer_idx].T,
cmap='viridis', origin='lower'
)
axes[0, i].set_title(f'Predicted - Layer {layer_idx}')
plt.colorbar(im1, ax=axes[0, i], label='Permeability (mD)')
# Uncertainty
im2 = axes[1, i].imshow(
perm_std_3d[:, :, layer_idx].T,
cmap='Reds', origin='lower'
)
axes[1, i].set_title(f'Uncertainty - Layer {layer_idx}')
plt.colorbar(im2, ax=axes[1, i], label='Std Dev (log space)')
plt.tight_layout()
plt.savefig('gp_predictions.png', dpi=300, bbox_inches='tight')
7. Export for Simulation
# Export to GRDECL format
from spe9_geomodeling import GRDECLParser
def export_property_to_grdecl(values_3d, property_name, filename):
"""Export 3D property array to GRDECL format"""
with open(filename, 'w') as f:
f.write(f"{property_name}\n")
# Flatten in Fortran order (column-major)
values_flat = values_3d.ravel(order='F')
# Write values (8 per line)
for i in range(0, len(values_flat), 8):
line_values = values_flat[i:i+8]
line = ' '.join(f'{v:.6f}' for v in line_values)
f.write(f" {line}\n")
f.write("/\n")
# Export predicted permeability
export_property_to_grdecl(perm_pred_3d, 'PERMX', 'PERMX_GPR.GRDECL')
# Export uncertainty
export_property_to_grdecl(perm_std_3d, 'SIGMA', 'SIGMA_GPR.GRDECL')
print("✓ Exported to GRDECL format for Eclipse/CMG")
Comparison: GP vs Kriging
Ordinary Kriging Implementation
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
# Kriging is equivalent to GP with specific kernel
# Use variogram range as length scale
kriging_kernel = RBF(length_scale=vario_model.range_param)
kriging_model = GaussianProcessRegressor(
kernel=kriging_kernel,
alpha=vario_model.nugget, # Nugget as noise
optimizer=None, # Don't optimize (use variogram params)
normalize_y=False
)
kriging_model.fit(X_train_scaled, y_train)
y_krig_pred = kriging_model.predict(X_test_scaled)
# Compare
print("\nKriging vs GP:")
print(f" Kriging R²: {r2_score(y_test, y_krig_pred):.4f}")
print(f" GP R²: {r2:.4f}")
Key Differences
| Aspect | Kriging | Gaussian Process |
|---|---|---|
| Kernel | From variogram | Learned from data |
| Optimization | Manual (variogram) | Automatic (MLE) |
| Flexibility | Limited to spatial | Can add features |
| Dimensionality | Typically 2D | Any dimension |
| Uncertainty | Yes | Yes |
| Stationarity | Assumed | Can handle non-stationary |
Advanced Topics
1. Sparse GP for Large Datasets
import gpytorch
class SparseGPModel(gpytorch.models.ExactGP):
def __init__(self, train_x, train_y, likelihood, inducing_points):
super().__init__(train_x, train_y, likelihood)
self.mean_module = gpytorch.means.ConstantMean()
self.base_covar_module = gpytorch.kernels.ScaleKernel(
gpytorch.kernels.RBFKernel()
)
self.covar_module = gpytorch.kernels.InducingPointKernel(
self.base_covar_module,
inducing_points=inducing_points,
likelihood=likelihood
)
def forward(self, x):
mean_x = self.mean_module(x)
covar_x = self.covar_module(x)
return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
# Use 500 inducing points for 10,000+ training samples
n_inducing = 500
inducing_indices = np.random.choice(len(X_train), n_inducing, replace=False)
inducing_points = torch.FloatTensor(X_train_scaled[inducing_indices])
sparse_model = SparseGPModel(train_x, train_y, likelihood, inducing_points)
2. Multi-Output GP (Co-Kriging)
# Predict multiple properties simultaneously
class MultiOutputGP(gpytorch.models.ExactGP):
def __init__(self, train_x, train_y, likelihood, num_tasks):
super().__init__(train_x, train_y, likelihood)
self.mean_module = gpytorch.means.MultitaskMean(
gpytorch.means.ConstantMean(), num_tasks=num_tasks
)
self.covar_module = gpytorch.kernels.MultitaskKernel(
gpytorch.kernels.RBFKernel(), num_tasks=num_tasks, rank=1
)
def forward(self, x):
mean_x = self.mean_module(x)
covar_x = self.covar_module(x)
return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)
# Train on PERMX and PORO simultaneously
3. Anisotropic Kernels
# Different length scales in different directions
class AnisotropicRBF(gpytorch.kernels.Kernel):
def __init__(self, ard_num_dims):
super().__init__()
self.register_parameter(
'lengthscale',
torch.nn.Parameter(torch.ones(ard_num_dims))
)
def forward(self, x1, x2):
# Scale each dimension independently
x1_scaled = x1 / self.lengthscale
x2_scaled = x2 / self.lengthscale
dist = torch.cdist(x1_scaled, x2_scaled)
return torch.exp(-0.5 * dist ** 2)
# Use for reservoirs with different correlation lengths in x, y, z
aniso_kernel = AnisotropicRBF(ard_num_dims=3)
Best Practices
1. Data Preparation
- ✓ Remove outliers and invalid values
- ✓ Log-transform skewed distributions
- ✓ Normalize coordinates
- ✓ Check for trends (detrend if needed)
2. Model Selection
- ✓ Start with variogram analysis
- ✓ Use composite kernels (RBF + Matérn)
- ✓ Cross-validate with spatial folds
- ✓ Compare multiple models
3. Validation
- ✓ Hold-out test set
- ✓ Spatial cross-validation
- ✓ Check uncertainty calibration
- ✓ Visual inspection of predictions
4. Computational Efficiency
- ✓ Use sparse methods for >5,000 points
- ✓ Batch predictions for memory
- ✓ Parallel processing where possible
- ✓ GPU acceleration for GPyTorch
5. Production Deployment
- ✓ Save models with metadata
- ✓ Version control predictions
- ✓ Export to standard formats
- ✓ Document assumptions and limitations
Limitations and Considerations
Computational Complexity
- Exact GP: O(n³) for training, O(n²) for prediction
- Solution: Sparse methods, inducing points, or local GP
Stationarity Assumptions
- Issue: Kernel assumes similar correlation everywhere
- Solution: Detrend data, use local GPs, or non-stationary kernels
Kernel Selection
- Challenge: Many kernel options
- Solution: Start with variogram, use composite kernels, cross-validate
Extrapolation
- Issue: GPs revert to prior mean outside data range
- Solution: Ensure good data coverage, use physical constraints
Conclusion
Gaussian Process regression provides a solid foundation for modern geomodeling. It offers:
- Mathematical rigor: Bayesian framework with clear assumptions
- Uncertainty quantification: Built-in confidence intervals
- Flexibility: Custom kernels for different spatial patterns
- Extensibility: Easy to add features beyond coordinates
When paired with variogram analysis, spatial cross-validation, and proper visualization, GPs become a powerful tool for reservoir engineers seeking interpretable and defensible models.
Key Takeaway: Move beyond basic interpolation toward a more rigorous, probabilistic understanding of the reservoir.
References
- Rasmussen & Williams (2006). Gaussian Processes for Machine Learning
- Chilès & Delfiner (2012). Geostatistics: Modeling Spatial Uncertainty
- SPE9 Benchmark: Society of Petroleum Engineers
- GPyTorch Documentation: gpytorch.ai
Next Steps:
- Try the tutorial notebooks in
examples/notebooks/ - Explore the variogram analysis module
- Experiment with different kernels
- Apply to your own reservoir data