Feature Store¶
In order to use Amora's feature store, you need to install it with the feature-store
extra:
Using Amora's Feature Store capabilities enables data teams to:
- Run the necessary data pipeline to transform the data into feature values
- Easily productionize new features from Amora Models
- Store and manage feature data
- Track feature lineage, versions and related metadata
- Serve feature data consistently for training and inference purposes
- Share and reuse features across multiple teams
Components¶
Amora's Feature Store is composed by the following components:
graph LR
GCS[GCS]
subgraph Feature Store
subgraph Storage
OnS[Online Storage]
OffS[Offline Storage]
end
subgraph Serving
O[Feature Service]
end
subgraph Registry
FR[Feature Registry]
end
end
OnS --> R[Redis];
OffS --> B[Big Query];
O -->|get_online_features| M[Model Serving]
O -->|get_historical_features| X[Model Training]
FR --> GCS
Storage¶
Feature data is stored by the Feature Store to support retrieval through Online and Offline feature serving layers.
Offline Storage is typically used to store months or years of feature data for training purposes. In Amora, data is stored in Big Query.
Online Storage is used to persist feature values for low-latency lookup during inference. It only store the latest feature values for each entity, essentially modeling the current state of the world. In Amora, data is stored in Redis.
graph LR
subgraph Feature Store
subgraph Storage
OnS[Online Storage]
OffS[Offline Storage]
end
subgraph Serving
O[Feature Service]
end
end
OnS --> R[Redis];
OffS --> B[Big Query];
Serving¶
A ML Model require a consistent view of features through training and serving.
The definitions of features used to train a model must exactly match the features provided in online serving. When they donβt match, training-serving skew is introduced, which can cause model performance problems. Amora's Feature Store is able to consistently serve feature data to ML Models:
- During the generation of training datasets, querying the offline storage for historical feature values.
- Low-latency retrieval of the latest feature value from the online store.
graph LR
subgraph Feature Store
subgraph Storage
OnS[Online Storage]
OffS[Offline Storage]
end
subgraph Serving
O[Feature Service]
end
end
O -->|get_online_features| M[Model Serving]
O -->|get_historical_features| X[Model Training]
Registry¶
The registry is the single source of truth for information and metadata about features in a project. It is a central catalog for Data Teams and automated jobs that registers what kind of data is stored and how it is organized.
In Amora, the registry is stored on GCS using Feast's Feature Registry.
graph LR
GCS[GCS]
subgraph Feature Store
subgraph Storage
OnS[Online Storage]
OffS[Offline Storage]
end
subgraph Serving
O[Feature Service]
end
subgraph Registry
FR[Feature Registry]
end
end
FR --> GCS