Understanding the data modeling capability¶
Note
The Data Playbook defines a set of capabilities that represent conceptual building blocks that are used to build data-related solutions. See Defining data capabilities to see the full set of capabilities defined in the playbook.
Data modeling is the process of analyzing and defining all the different data types businesses collect and produce, as well as the relationships between those bits of data. Data modeling concepts create visual representations of data as it’s captured, stored, and used.
The benefits of data modeling and the concepts of data modeling can be found on data modeling - Microsoft and data modeling - Wikipedia pages.
Data modeling tools - Wikipedia has a comparison of some data modeling tools.
Most applications are built by layering one data model on top of another. It is common to see multiple types of databases and multiple modeling techniques that are employed by a single modern big data system. The diagram - Types of Data Modeling shows different data modeling techniques and how they are employed in a data system. Click on each model to learn more.
Understanding data modeling characteristics¶
The most important consideration in data modeling is creating a foundation for a database that can rapidly load, retrieve, and analyze large data volumes. An effective data modeling concept requires mapping business data, connecting the relationships between that data, and understanding how the data is used. Key considerations include:
- Materialization: Tables vs Views; Type of storage and storage formats used.
- Permissions and Governance: Data access policies, data security and confidentiality.
- Storage usage/cost: Optimize for storage costs and data retrieval performance.
- Business requirements: Query patterns, data granularity, data frequency, aggregations, and reporting requirements.
- Performance: Indexing, partitioning strategies, and query latency.
- Portability: Compatible with other data models in the organization.
- Expandability: Model systems for growth.
Learn more about data modeling in Microsoft Fabric¶
- Microsoft Fabric supports a Power BI Semantic model which is a logical description of an analytical domain, with metrics, business friendly terminology, and representation, to enable deeper analysis.
Refer to data warehousing experience with the SQL analytics endpoint of the Lakehouse and scenarios for use of the Lakehouse in data warehousing.
Implementations¶
- For guidance on using SCD types and an example implementation of SCD type 2, refer to the SCD Type Example implementation.
For more information¶
- ER model - Research paper by Peter Chen
- ER model - Wikipedia
- Data models and query languages - Chapter 2 - Designing Data-Intensive Applications by Martin Kleppmann
- Dimensional-modeling-techniques
- Best-practices-for-dimensional-model-using-dataflows
- Guide-to-uml-diagramming-and-database-modeling
- Data-modeling-on-Power-Platforms
- Synapse Database Templates
- Dimension model - Star schema
- Dimensional model - Snowflake schema
- Data: Graph Database Best Practices