The Open-Source Virtual Feature Store

Featureform allows data scientists to define, manage, and serve machine learning features across your organization. The days of untitled_128.ipynb are over. Transformations, features, and training sets can be pushed from notebooks to a centralized feature repository with metadata like name, variant, lineage, and owner.

Transform Your Existing Infrastructure into a Feature Store

Featureform allows teams to pick the right data infrastructure to solve their processing problems, while providing a feature store abstraction above it. Featureform orchestrates and manages transformations and offloads the computations to the organization's existing data infrastructure. Featureform is a workflow, not more data infrastructure.
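The idea of orchestrating transformations while leaving the computation to existing infrastructure can be sketched in plain Python. This is a minimal illustration, not Featureform's implementation: the `VirtualFeatureStore` class and its methods are hypothetical, the sample rows are made up, and an in-memory SQLite database stands in for whatever warehouse an organization already runs.

```python
import sqlite3

# Stand-in for "existing infrastructure": an in-memory SQLite database.
# In a real deployment this role would be played by Postgres, Snowflake, Spark, etc.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE transactions (customer_id TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("C1", 10.0), ("C1", 30.0), ("C2", 5.0)],  # made-up sample rows
)


class VirtualFeatureStore:
    """Hypothetical orchestrator: stores definitions, delegates execution."""

    def __init__(self, provider):
        self.provider = provider
        self.transformations = {}  # transformation name -> SQL text

    def register_sql(self, name, query):
        # Only the definition is kept here; no data is copied or computed.
        self.transformations[name] = query

    def run(self, name):
        # The provider executes the query; the orchestrator only dispatches it.
        return self.provider.execute(self.transformations[name]).fetchall()


store = VirtualFeatureStore(warehouse)
store.register_sql(
    "average_user_transaction",
    "SELECT customer_id, AVG(amount) FROM transactions "
    "GROUP BY customer_id ORDER BY customer_id",
)
rows = store.run("average_user_transaction")
print(rows)
```

The orchestrator holds only the SQL text and a handle to the provider, so swapping the provider changes where the work runs without changing the definition.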

# Definitions: register providers, sources, transformations, and features
import featureform as ff

redis = ff.register_redis(
    name = "redis-quickstart",
    host="quickstart-redis", # The internal dns name for redis
    port=6379,
    description = "A Redis deployment we created for the Featureform quickstart"
)

postgres = ff.register_postgres(
    name = "postgres-quickstart",
    host="quickstart-postgres", # The internal dns name for postgres
    port="5432",
    user="postgres",
    password="password",
    database="postgres",
    description = "A Postgres deployment we created for the Featureform quickstart"
)

transactions = postgres.register_table(
    name = "transactions",
    variant = "kaggle",
    description = "Fraud Dataset From Kaggle",
    table = "Transactions", # This is the table's name in Postgres
)

@postgres.sql_transformation(variant="quickstart")
def average_user_transaction():
    """the average transaction amount for a user """
    return "SELECT CustomerID as user_id, avg(TransactionAmount) " \
           "as avg_transaction_amt from {{transactions.kaggle}} GROUP BY user_id"

user = ff.register_entity("user")
# Register a column from our transformation as a feature
average_user_transaction.register_resources(
    entity=user,
    entity_column="user_id",
    inference_store=redis,
    features=[
        {"name": "avg_transactions", "variant": "quickstart", "column": "avg_transaction_amt", "type": "float32"},
    ],
)

# Training: stream a registered training set
import featureform as ff

client = ff.ServingClient()
dataset = client.dataset("fraud_training", "quickstart")
training_dataset = dataset.repeat(10).shuffle(1000).batch(8)
for feature_batch in training_dataset:
    pass  # Train the model on feature_batch here

# Inference: fetch feature values for a single user
import featureform as ff

client = ff.ServingClient()
fpf = client.features([("avg_transactions", "quickstart")], {"user": "C1410926"})
# Run features through model

A framework to define, manage, and share features.

Define your ML resources in Python.

Featureform ensures that transformations, features, labels, and training sets are defined in a standardized form, so they can easily be shared, re-used, and understood across the team.
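To make the idea of a standardized form concrete, here is a small sketch of a registry that keys every resource by name and variant and records its owner and upstream sources. The `Resource` and `Registry` classes, the `kind` labels, and the `data-team` owner are hypothetical names chosen for illustration; they are not Featureform's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical metadata record for a source, transformation, or feature."""
    name: str
    variant: str
    kind: str            # e.g. "source", "transformation", "feature"
    owner: str
    sources: tuple = ()  # (name, variant) pairs this resource is derived from


class Registry:
    def __init__(self):
        self._resources = {}

    def register(self, resource):
        key = (resource.name, resource.variant)
        if key in self._resources:
            raise ValueError(f"{key} is already registered")
        self._resources[key] = resource
        return resource

    def get(self, name, variant):
        return self._resources[(name, variant)]

    def lineage(self, name, variant):
        """Return upstream (name, variant) pairs, nearest first."""
        chain = []
        for src in self.get(name, variant).sources:
            chain.append(src)
            if src in self._resources:
                chain.extend(self.lineage(*src))
        return chain


registry = Registry()
registry.register(Resource("transactions", "kaggle", "source", "data-team"))
registry.register(Resource(
    "average_user_transaction", "quickstart", "transformation", "data-team",
    sources=(("transactions", "kaggle"),),
))
registry.register(Resource(
    "avg_transactions", "quickstart", "feature", "data-team",
    sources=(("average_user_transaction", "quickstart"),),
))
upstream = registry.lineage("avg_transactions", "quickstart")
print(upstream)
```

Keying on the (name, variant) pair lets several versions of the same feature coexist, and the recorded sources are enough to reconstruct lineage on demand.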

Check out our documentation

Read more about Feature Stores

Our articles explain everything from the feature store landscape to the architecture and design patterns of a virtual feature store.

Explore our resources

A unified feature store abstraction for your heterogeneous infrastructure

Featureform's virtual feature store is compatible with a wide range of data infrastructure. Mix and match providers like Snowflake, Redis, Spark, and Cassandra.
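One way to picture mixing and matching providers is a pair of small interfaces, one for the store that computes and holds feature data and one for the store that serves it at inference time. The sketch below is an assumption-laden illustration: the `OfflineStore` and `InferenceStore` protocols, the in-memory classes, the `materialize` helper, and the sample values are all hypothetical and do not reflect Featureform's provider API.

```python
from typing import Protocol


class OfflineStore(Protocol):
    def read_feature(self, column: str) -> dict: ...


class InferenceStore(Protocol):
    def set(self, feature: str, entity: str, value: float) -> None: ...
    def get(self, feature: str, entity: str) -> float: ...


class InMemoryWarehouse:
    """Stand-in for an offline provider such as a SQL warehouse."""

    def __init__(self, table):
        self.table = table  # {column: {entity_id: value}}

    def read_feature(self, column):
        return dict(self.table[column])


class InMemoryKV:
    """Stand-in for a low-latency key-value inference store."""

    def __init__(self):
        self.data = {}

    def set(self, feature, entity, value):
        self.data[(feature, entity)] = value

    def get(self, feature, entity):
        return self.data[(feature, entity)]


def materialize(offline: OfflineStore, online: InferenceStore,
                column: str, feature: str) -> int:
    """Copy one feature column from the offline store to the inference store."""
    values = offline.read_feature(column)
    for entity, value in values.items():
        online.set(feature, entity, value)
    return len(values)


# Made-up values for illustration only.
warehouse = InMemoryWarehouse({"avg_transaction_amt": {"C1410926": 42.5, "C2": 5.0}})
kv = InMemoryKV()
written = materialize(warehouse, kv, "avg_transaction_amt", "avg_transactions")
served = kv.get("avg_transactions", "C1410926")
print(written, served)
```

Because `materialize` depends only on the two interfaces, any pairing of offline and inference providers that implements them can be combined without changing the feature definition.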

Ready to get started?

See what a virtual feature store means for your organization.