

We are excited to announce that data lineage for Unity Catalog, the unified governance solution for all data and AI assets on lakehouse, is now available in preview. This blog will discuss the importance of data lineage, some of the common use cases, our vision for better data transparency and data understanding with data lineage, and a sneak peek into some of the data provenance and governance features we’re building.

What is data lineage and why is it important?

Data lineage describes the transformations and refinements of data from source to insight. Lineage includes capturing all the relevant metadata and events associated with the data in its lifecycle, including the source of the data set, what other data sets were used to create it, who created it and when, what transformations were performed, what other data sets leverage it, and many other events and attributes.

As more and more organizations embrace a data-driven culture and set up processes and tools to democratize and scale data and AI, data lineage is becoming an essential pillar of a pragmatic data management and governance strategy. With a data lineage solution, data teams get an end-to-end view of how data is transformed and how it flows across their data estate. To understand the importance of data lineage, we have highlighted some of the common use cases we have heard from our customers below.

Data goes through multiple updates or revisions over its lifecycle, and understanding the potential impact of any data changes on downstream consumers becomes important from a risk management standpoint. With data lineage, data teams can see all the downstream consumers (applications, dashboards, machine learning models or data sets, etc.) impacted by data changes, understand the severity of the impact, and notify the relevant stakeholders.
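
To make the idea of finding downstream consumers concrete, here is a minimal sketch that treats lineage as a directed graph and walks it breadth-first from a changed data set. It does not use the Unity Catalog API; the `LINEAGE` mapping, the asset names, and the `downstream_consumers` helper are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the assets listed as its value.
LINEAGE = {
    "raw.orders": ["clean.orders"],
    "clean.orders": ["agg.daily_revenue", "ml.churn_features"],
    "agg.daily_revenue": ["dashboard.revenue"],
    "ml.churn_features": ["model.churn"],
}


def downstream_consumers(graph, changed):
    """Return every asset reachable from `changed`, in breadth-first order."""
    seen = {changed}  # guards against cycles and duplicate visits
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    # List everything affected by a change to the cleaned orders table.
    for asset in downstream_consumers(LINEAGE, "clean.orders"):
        print(asset)
```

In this toy graph, a change to `clean.orders` reaches two direct consumers and then the dashboard and model built on top of them. A real lineage system would also record who owns each asset, so that the relevant stakeholders can be notified.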
