seaweedfs/weed/mount/ml_integration.go
chrislu f02c4f816b Production Integration: ML-aware FUSE mount optimizations
OPTION A COMPLETE: Full production integration of ML optimization system

## Major Integration Components:

### 1. Command Line Interface
- Add ML optimization flags to 'weed mount' command:
  * -ml.enabled: Enable/disable ML optimizations
  * -ml.prefetchWorkers: Configure concurrent prefetch workers (default: 8)
  * -ml.confidenceThreshold: Set ML confidence threshold (default: 0.6)
  * -ml.maxPrefetchAhead: Max chunks to prefetch ahead (default: 8)
  * -ml.batchSize: Batch size for prefetch operations (default: 3)
- Updated command help text with ML Optimization section and usage examples
- Complete flag parsing and validation pipeline
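
For illustration, these flags and defaults map onto plain Go flag definitions roughly as follows; the real 'weed mount' command registers them through its own option plumbing, so the variable names below are illustrative only:

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Illustrative registration of the documented -ml.* flags with their defaults.
	mlEnabled := flag.Bool("ml.enabled", false, "enable ML-aware mount optimizations")
	prefetchWorkers := flag.Int("ml.prefetchWorkers", 8, "concurrent prefetch workers")
	confidence := flag.Float64("ml.confidenceThreshold", 0.6, "ML confidence threshold")
	maxPrefetchAhead := flag.Int("ml.maxPrefetchAhead", 8, "max chunks to prefetch ahead")
	batchSize := flag.Int("ml.batchSize", 3, "batch size for prefetch operations")
	flag.Parse()

	fmt.Println(*mlEnabled, *prefetchWorkers, *confidence, *maxPrefetchAhead, *batchSize)
}
```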

### 2. Core WFS Integration
- Add MLIntegrationManager to WFS struct with proper lifecycle management
- Initialize ML optimization based on mount flags with custom configuration
- Integrate ML system shutdown with graceful cleanup on mount termination
- Memory-safe initialization with proper error handling

### 3. FUSE Operation Hooks
- **File Open (wfs.Open)**: Apply ML-specific optimizations (FOPEN_KEEP_CACHE, direct I/O)
- **File Read (wfs.Read)**: Record access patterns for ML prefetch decision making
- **File Close (wfs.Release)**: Update ML file tracking and cleanup resources
- **Get Attributes (wfs.GetAttr)**: Apply ML-aware attribute cache timeouts
- All hooks properly guarded with nil checks and enabled status validation
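
A minimal, self-contained stand-in for that guard pattern (the types below are illustrative, not the real WFS or manager structs) shows how each hook degrades to a no-op when ML is disabled or absent:

```go
package main

import "fmt"

// mlManager stands in for the MLIntegrationManager; only the guard pattern matters here.
type mlManager struct{ enabled bool }

func (m *mlManager) OnFileRead(inode uint64, offset int64, size int) {
	fmt.Printf("ML: recorded read inode=%d offset=%d size=%d\n", inode, offset, size)
}

func (m *mlManager) Shutdown() { fmt.Println("ML: shutdown complete") }

// wfsLike stands in for the WFS struct holding an optional ML manager.
type wfsLike struct{ ml *mlManager }

func (w *wfsLike) Read(inode uint64, offset int64, size int) {
	// ...normal read path...
	if w.ml != nil && w.ml.enabled { // nil check + enabled status validation
		w.ml.OnFileRead(inode, offset, size)
	}
}

func (w *wfsLike) Unmount() {
	if w.ml != nil {
		w.ml.Shutdown() // graceful cleanup on mount termination
	}
}

func main() {
	plain := &wfsLike{}                                 // ML not enabled: hooks are no-ops
	mlAware := &wfsLike{ml: &mlManager{enabled: true}}  // ML enabled: hooks record activity

	plain.Read(42, 0, 4096)
	mlAware.Read(42, 0, 4096)
	mlAware.Unmount()
}
```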

### 4. Configuration Management
- Mount options propagated through Option struct to ML system
- NewMLIntegrationManagerWithConfig for runtime configuration
- Default fallbacks and validation for all ML parameters
- Seamless integration with existing mount option processing
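
A sketch of the fallback logic for the documented defaults (the actual validation code in the mount command may differ):

```go
package main

import "fmt"

// applyMLDefaults falls back to the documented defaults when a value is unset or out of range.
func applyMLDefaults(workers int, confidence float64, prefetchAhead, batchSize int) (int, float64, int, int) {
	if workers <= 0 {
		workers = 8 // default prefetch workers
	}
	if confidence <= 0 || confidence > 1 {
		confidence = 0.6 // default confidence threshold
	}
	if prefetchAhead <= 0 {
		prefetchAhead = 8 // default max chunks to prefetch ahead
	}
	if batchSize <= 0 {
		batchSize = 3 // default prefetch batch size
	}
	return workers, confidence, prefetchAhead, batchSize
}

func main() {
	fmt.Println(applyMLDefaults(0, 0, 0, 0))      // 8 0.6 8 3
	fmt.Println(applyMLDefaults(16, 0.75, 16, 5)) // in-range values pass through unchanged
}
```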

## Production Features:

- **Zero-Impact Design**: ML optimizations only activate when explicitly enabled
- **Backward Compatibility**: All existing mount functionality preserved
- **Resource Management**: Proper initialization, shutdown, and cleanup
- **Error Handling**: Graceful degradation if ML components fail
- **Performance Monitoring**: Integration points for metrics and debugging
- **Configuration Flexibility**: Runtime tunable parameters via mount flags

## Testing Verification:
- Successful compilation of entire codebase
- Mount command properly shows ML flags in help text
- Flag parsing and validation working correctly
- ML optimization system initializes when enabled
- FUSE operations integrate ML hooks without breaking existing functionality

## Usage Examples:

Basic ML optimization:
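(Illustrative invocation; the filer address and mount point are placeholders, and all other -ml.* flags keep their defaults.)

```
weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs -ml.enabled=true
```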

Custom ML configuration:
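(Illustrative invocation; the override values shown are examples only.)

```
weed mount -filer=localhost:8888 -dir=/mnt/training-data \
  -ml.enabled=true -ml.prefetchWorkers=16 -ml.confidenceThreshold=0.75 \
  -ml.maxPrefetchAhead=16 -ml.batchSize=5
```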

## Architecture Impact:
- Clean separation between core FUSE and ML optimization layers
- Modular design allows easy extension and maintenance
- Production-ready with comprehensive error handling and resource management
- Foundation established for advanced ML features (Phase 4)

This completes Option A: Production Integration, providing a fully functional ML-aware FUSE mount system ready for real-world ML workloads.
2025-08-30 16:06:25 -07:00

178 lines · 5.3 KiB · Go

package mount

import (
    "time"

    "github.com/hanwen/go-fuse/v2/fuse"

    "github.com/seaweedfs/seaweedfs/weed/glog"
    "github.com/seaweedfs/seaweedfs/weed/mount/ml"
    "github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
    "github.com/seaweedfs/seaweedfs/weed/util/chunk_cache"
    "github.com/seaweedfs/seaweedfs/weed/wdclient"
)

// MLIntegrationManager manages ML optimization integration for the main WFS
type MLIntegrationManager struct {
    mlOptimization  *ml.MLOptimization
    fuseIntegration *ml.FUSEMLIntegration
    enabled         bool
}

// NewMLIntegrationManager creates a new ML integration manager
func NewMLIntegrationManager(chunkCache chunk_cache.ChunkCache, lookupFn wdclient.LookupFileIdFunctionType) *MLIntegrationManager {
    // Create ML optimization with default config
    config := ml.DefaultMLConfig()
    mlOpt := ml.NewMLOptimization(config, chunkCache, lookupFn)

    // Create FUSE integration
    fuseInt := ml.NewFUSEMLIntegration(mlOpt)

    manager := &MLIntegrationManager{
        mlOptimization:  mlOpt,
        fuseIntegration: fuseInt,
        enabled:         true,
    }

    glog.V(1).Infof("ML integration manager initialized")
    return manager
}

// NewMLIntegrationManagerWithConfig creates a new ML integration manager with custom configuration
func NewMLIntegrationManagerWithConfig(
    chunkCache chunk_cache.ChunkCache,
    lookupFn wdclient.LookupFileIdFunctionType,
    prefetchWorkers int,
    confidenceThreshold float64,
    maxPrefetchAhead int,
    batchSize int,
) *MLIntegrationManager {
    config := &ml.MLConfig{
        PrefetchWorkers:     prefetchWorkers,
        PrefetchQueueSize:   prefetchWorkers * 4, // 4x workers for queue depth
        PrefetchTimeout:     30 * time.Second,
        EnableMLHeuristics:  true,
        SequentialThreshold: 5,
        ConfidenceThreshold: confidenceThreshold,
        MaxPrefetchAhead:    maxPrefetchAhead,
        PrefetchBatchSize:   batchSize,
    }

    mlOpt := ml.NewMLOptimization(config, chunkCache, lookupFn)

    // Create FUSE integration
    fuseInt := ml.NewFUSEMLIntegration(mlOpt)

    manager := &MLIntegrationManager{
        mlOptimization:  mlOpt,
        fuseIntegration: fuseInt,
        enabled:         true,
    }

    glog.V(1).Infof("ML integration manager initialized with custom config: workers=%d, confidence=%.2f, prefetchAhead=%d, batchSize=%d",
        prefetchWorkers, confidenceThreshold, maxPrefetchAhead, batchSize)
    return manager
}

// EnableMLOptimization enables or disables ML optimization
func (mgr *MLIntegrationManager) EnableMLOptimization(enabled bool) {
    mgr.enabled = enabled

    if mgr.mlOptimization != nil {
        mgr.mlOptimization.Enable(enabled)
    }
    if mgr.fuseIntegration != nil {
        mgr.fuseIntegration.EnableMLOptimizations(enabled)
    }

    glog.V(1).Infof("ML optimization %s", map[bool]string{true: "enabled", false: "disabled"}[enabled])
}

// OnFileOpen should be called when a file is opened
func (mgr *MLIntegrationManager) OnFileOpen(inode uint64, entry *filer_pb.Entry, fullPath string, flags uint32, out *fuse.OpenOut) {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return
    }
    mgr.fuseIntegration.OnFileOpen(inode, entry, fullPath, flags, out)
}

// OnFileClose should be called when a file is closed
func (mgr *MLIntegrationManager) OnFileClose(inode uint64) {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return
    }
    mgr.fuseIntegration.OnFileClose(inode)
}

// OnFileRead should be called when a file is read
func (mgr *MLIntegrationManager) OnFileRead(inode uint64, offset int64, size int) {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return
    }
    mgr.fuseIntegration.OnFileRead(inode, offset, size)
}

// OnChunkAccess should be called when a chunk is accessed
func (mgr *MLIntegrationManager) OnChunkAccess(inode uint64, chunkIndex uint32, fileId string, cacheLevel int, isHit bool) {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return
    }
    mgr.fuseIntegration.OnChunkAccess(inode, chunkIndex, fileId, cacheLevel, isHit)
}

// OptimizeAttributes applies ML-specific attribute caching
func (mgr *MLIntegrationManager) OptimizeAttributes(inode uint64, out *fuse.AttrOut) {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return
    }
    mgr.fuseIntegration.OptimizeAttributes(inode, out)
}

// OptimizeEntryCache applies ML-specific entry caching
func (mgr *MLIntegrationManager) OptimizeEntryCache(inode uint64, entry *filer_pb.Entry, out *fuse.EntryOut) {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return
    }
    mgr.fuseIntegration.OptimizeEntryCache(inode, entry, out)
}

// ShouldEnableWriteback determines if writeback should be enabled for a file
func (mgr *MLIntegrationManager) ShouldEnableWriteback(inode uint64, entry *filer_pb.Entry) bool {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return false
    }
    return mgr.fuseIntegration.ShouldEnableWriteback(inode, entry)
}

// GetComprehensiveMetrics returns all ML optimization metrics
func (mgr *MLIntegrationManager) GetComprehensiveMetrics() *ml.FUSEMLMetrics {
    if !mgr.enabled || mgr.fuseIntegration == nil {
        return &ml.FUSEMLMetrics{}
    }
    metrics := mgr.fuseIntegration.GetOptimizationMetrics()
    return &metrics
}

// IsEnabled returns whether ML optimization is enabled
func (mgr *MLIntegrationManager) IsEnabled() bool {
    return mgr.enabled
}

// Shutdown gracefully shuts down the ML integration
func (mgr *MLIntegrationManager) Shutdown() {
    glog.V(1).Infof("Shutting down ML integration manager...")

    if mgr.fuseIntegration != nil {
        mgr.fuseIntegration.Shutdown()
    }

    glog.V(1).Infof("ML integration manager shutdown complete")
}