Feathr is a data and AI engineering platform that has been widely used in production at LinkedIn for many years and was open sourced in 2022. It is currently a project under the LF AI & Data Foundation. Define data and feature transformations based on raw data sources (batch and streaming) using Pythonic APIs. Register transformations by name and get transformed data (features) for various use cases, including AI modeling, compliance, go-to-market, and more. Share transformations and data (features) across teams and the company. Feathr is particularly useful in AI modeling, where it automatically computes your feature transformations and joins them to your training data using point-in-time-correct semantics to avoid data leakage, and it supports materializing and deploying your features for online use in production.
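
To make "point-in-time-correct semantics" concrete, here is a minimal sketch in plain Python. It does not use Feathr's API; the function name `point_in_time_join` and the tuple layouts are assumptions made for illustration only. For each training observation it picks the latest feature value recorded at or before the observation time, so values from the future never leak into the training row. Whether Feathr treats the boundary as inclusive or exclusive is not stated in the description above; the sketch assumes inclusive.

```python
from bisect import bisect_right
from collections import defaultdict


def point_in_time_join(observations, feature_events):
    """Illustrative point-in-time join (hypothetical helper, not Feathr's API).

    observations:   list of (entity_id, obs_time, label)
    feature_events: list of (entity_id, event_time, value)
    Returns a list of (entity_id, obs_time, label, feature_value_or_None).
    """
    # Group feature events per entity.
    by_entity = defaultdict(list)
    for entity_id, event_time, value in feature_events:
        by_entity[entity_id].append((event_time, value))

    # Sort each entity's events by time and keep the timestamps for bisecting.
    times = {}
    for entity_id, events in by_entity.items():
        events.sort(key=lambda e: e[0])
        times[entity_id] = [t for t, _ in events]

    joined = []
    for entity_id, obs_time, label in observations:
        events = by_entity.get(entity_id, [])
        # Number of events with event_time <= obs_time.
        idx = bisect_right(times.get(entity_id, []), obs_time)
        # Latest value known at observation time, or None if nothing was known yet.
        value = events[idx - 1][1] if idx > 0 else None
        joined.append((entity_id, obs_time, label, value))
    return joined
```

In this sketch an observation at time 10 only sees feature events stamped 10 or earlier; an event stamped 12 is ignored even though it exists in the data, which is the property that prevents leakage.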

Features

  • Native cloud integration with a simplified and scalable architecture
  • Battle tested in production for more than 6 years: LinkedIn has used Feathr in production for over 6 years, and it is backed by a dedicated team
  • Scalable with built-in optimizations: Feathr can process billions of rows and PB scale data with built-in optimizations such as bloom filters and salted joins
  • Rich transformation APIs, including time-based aggregations, sliding window joins, and look-up features, all with point-in-time correctness for AI
  • Pythonic APIs and highly customizable user-defined functions (UDFs) with native PySpark and Spark SQL support to lower the learning curve for all data scientists
  • Unified data transformation API that works in offline batch, streaming, and online environments
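
The feature list mentions salted joins as one of the built-in optimizations. The sketch below shows the general technique in plain Python; it is not Feathr's implementation, and the function name `salted_join`, the `num_salts` parameter, and the tuple layouts are assumptions for illustration. The idea is to spread a heavily repeated ("hot") join key across several salted sub-keys on the large side, and to replicate the small side once per salt so that every salted row still finds its match.

```python
import random
from collections import defaultdict


def salted_join(large, small, num_salts, seed=0):
    """Illustrative salted inner join (hypothetical helper, not Feathr's API).

    large: list of (key, value) rows, possibly with heavily skewed keys
    small: list of (key, value) rows
    Returns a list of (key, large_value, small_value).
    """
    rng = random.Random(seed)

    # Replicate every small-side row once per salt value.
    small_index = defaultdict(list)
    for key, value in small:
        for salt in range(num_salts):
            small_index[(key, salt)].append(value)

    # Assign each large-side row a random salt. In a distributed engine,
    # (key, salt) would be the partitioning key, so rows sharing a hot key
    # are spread over up to num_salts partitions instead of one.
    joined = []
    for key, value in large:
        salt = rng.randrange(num_salts)
        for small_value in small_index.get((key, salt), []):
            joined.append((key, value, small_value))
    return joined
```

The result is the same set of rows as an unsalted inner join; only the distribution of work changes. The cost is that the small side is duplicated `num_salts` times, so the technique pays off mainly when one side is small and the other is skewed.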

Categories

Data Quality

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Scala

Related Categories

Scala Data Quality Tool

Registered

2023-06-12