
Building an enterprise business glossary

By Rishu Shrivastava posted 03-23-2023 06:00

  

Introduction

Organizations work with large volumes of data to generate insights and drive critical business decisions. However, they often struggle to create a common data vocabulary, which leads to inconsistent business definitions across the organization's data. A common data vocabulary is a key component of a successful data governance program. Data catalog tools such as Lumada Data Catalog provide features like Business Glossary to help companies develop one.

Building a business glossary has many benefits, which is why an organization should invest in one. However, it also comes with its own set of challenges. This blog post provides an approach to building a business glossary successfully.

Defining Business Glossary

A business glossary is a collection of related terms, definitions, and other properties, explained in clear language that every member of an organization can understand. It ensures that an organization speaks a common language when discussing data. It helps clear up ambiguity in business terminology and in what a particular database field holds. A business glossary also helps set up data governance policies and data quality standards, which improves trust in and acceptance of the data within an organization.

Some of the attributes of a business glossary are shown in the figure below.

 


Fig-1: Components of a business glossary
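
As a rough illustration, a glossary and its terms can be modeled as a small data structure. The sketch below is only an assumption about how such a model could look: the class names, field names, and sensitivity levels are hypothetical and are not taken from Lumada Data Catalog. The fields mirror attributes discussed in this post (definitions, owners, sensitivity, relationships, and data associations).

```python
from dataclasses import dataclass, field
from enum import Enum


class Sensitivity(Enum):
    # Hypothetical sensitivity levels; real catalogs define their own.
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"


@dataclass
class GlossaryTerm:
    name: str
    definition: str
    owner: str  # accountable data owner or steward
    sensitivity: Sensitivity = Sensitivity.INTERNAL
    related_terms: list[str] = field(default_factory=list)
    data_associations: list[str] = field(default_factory=list)  # e.g. "schema.table.column"


@dataclass
class BusinessGlossary:
    name: str
    terms: dict[str, GlossaryTerm] = field(default_factory=dict)

    def add(self, term: GlossaryTerm) -> None:
        """Register a term; names are case-insensitive and must be unique."""
        key = term.name.lower()
        if key in self.terms:
            raise ValueError(f"term already defined: {term.name}")
        self.terms[key] = term

    def lookup(self, name: str) -> GlossaryTerm:
        return self.terms[name.lower()]


# Example usage with made-up values.
glossary = BusinessGlossary("Payments")
glossary.add(
    GlossaryTerm(
        name="Credit Card Number",
        definition="Primary account number printed on a payment card.",
        owner="payments-data-owner",
        sensitivity=Sensitivity.CONFIDENTIAL,
        data_associations=["sales.orders.cc_number"],
    )
)
```

Rejecting duplicate names at insertion time is one simple way to keep a single definition per term; a real catalog would typically add versioning and a review workflow on top.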

 

 

How an enterprise benefits from a business glossary

The benefits of a business glossary are as follows:

  • Improved understanding of your data

A business glossary gives your organization an improved, unified understanding of its data. It promotes clear, unambiguous definitions of business terms, KPIs (key performance indicators), data policies, and owners.

  • Common business language

Business glossaries give the organization a common understanding of its data. This drives consistency and enables correct interpretation when building data-driven solutions across the organization.

  • Establish Ownership and Accountability

A data catalog tool like Lumada Data Catalog enables the tagging of owners, which drives accountability. In large organizations, it is sometimes hard to find subject matter experts. Lumada Data Catalog makes it simpler to identify the subject matter experts for the associated data, saving time and effort.

  • Data policies governance

Every organization needs to comply with data regulations such as GDPR and HIPAA. Business glossaries help an organization set data policies at a logical level and enforce their implementation at the physical level. The business glossary plays the key role within that abstraction. For example, if an organization classifies “Credit Card Numbers” as “Personally Identifiable Information (PII) / Confidential” in a data catalog, then the physical implementation also needs to adhere to the definition of confidential by anonymizing or masking the underlying data. To implement this, companies could use custom scripting or the Lumada Data Catalog REST API endpoints to help enforce the policies.

  • Create trust in data!

Not every database attribute adheres to the same data quality rules. A business glossary enables data teams to build data quality rules that reflect business expectations. It facilitates uniform quality checks based on a business term across all data sources.
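
The credit card example and the term-based quality checks described above can be sketched together in a few lines. This is a minimal local illustration of the "custom scripting" route, not the Lumada Data Catalog REST API: the glossary dictionary, the column-to-term mapping, the regular expressions, and all function names are assumptions made up for this example.

```python
import re

# Hypothetical glossary: each business term carries a sensitivity label and a
# quality rule (a regular expression that a valid value must match in full).
GLOSSARY = {
    "Credit Card Number": {"sensitivity": "confidential", "pattern": r"\d{13,19}"},
    "Country Code": {"sensitivity": "public", "pattern": r"[A-Z]{2}"},
}

# Hypothetical tagging of physical columns to business terms.
COLUMN_TERMS = {"cc_number": "Credit Card Number", "country": "Country Code"}


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    hidden = len(value) - visible
    return "*" * hidden + value[hidden:]


def is_valid(term: str, value: str) -> bool:
    """Check a value against the quality rule attached to its business term."""
    return re.fullmatch(GLOSSARY[term]["pattern"], value) is not None


def apply_policies(row: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Return a policy-compliant copy of the row and the columns that failed their rule."""
    out: dict[str, str] = {}
    failures: list[str] = []
    for column, value in row.items():
        term = COLUMN_TERMS.get(column)
        if term is None:
            out[column] = value  # untagged columns pass through unchanged
            continue
        if not is_valid(term, value):
            failures.append(column)
        if GLOSSARY[term]["sensitivity"] == "confidential":
            value = mask(value)
        out[column] = value
    return out, failures


masked_row, failed_columns = apply_policies(
    {"cc_number": "4111111111111111", "country": "uk", "name": "Ann"}
)
```

Because both the masking rule and the quality rule hang off the business term rather than the column, any new column tagged with the same term inherits the same treatment without further code changes.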

 

What are the challenges of building a business glossary?

Building a business glossary for an enterprise is a challenging and time-intensive process. The challenges are not only technical: they also include understanding your data and getting the organization to accept your changes.

  • Technology

When rolling out a business glossary, an enterprise must have the right tools to capture relevant business definitions, relationships, and metadata. Data teams usually start documenting business glossaries in text documents, MS Excel, or MS PowerPoint. This choice of technology makes glossaries difficult to manage and communicate. Organizations must invest in data catalog tools that comply with industry standards and use the latest technological advancements.

  • People & Knowledge

One of the key factors when building a glossary is having the right people in your organization. The right combination of people, i.e., subject matter experts and data stewards, is essential to ensure that their knowledge is considered when building a glossary. It is a challenge for data teams to reach consensus on a global definition of a business glossary and its related terms.

  • Organizational Processes

When building a business glossary, it is essential to understand the key processes and policies that your data team executes. In large organizations, these processes and policies are highly convoluted and complex. They are difficult to understand, which results in lost time and frustration.

  • Writing Skills

It is safe to say that not everyone loves to write, especially when it comes to documentation. In a large organization, many of your subject matter experts will not entertain the idea of documenting their data or domain knowledge because they know their domain so well. This becomes a problem for the organization when such a person leaves the company and all that valuable knowledge is lost. Hence, data teams should build business glossaries that retain such knowledge over time and reduce dependencies on individuals.

To overcome these challenges, an enterprise needs to carefully build a structure around its data to facilitate the smooth building of business glossaries. In this blog post, I highlight the key stages of building a business glossary in an enterprise below.
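
Teams that begin with spreadsheets, as described under Technology above, can at least surface conflicting definitions before moving to a catalog tool. The sketch below assumes the spreadsheet has been exported to CSV with `term`, `definition`, and `source` columns; those column names, the sample rows, and the function name are hypothetical.

```python
import csv
import io
from collections import defaultdict


def find_conflicts(csv_text: str) -> dict[str, set[str]]:
    """Map each term to its distinct definitions, keeping only terms with more than one."""
    definitions: dict[str, set[str]] = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        term = row["term"].strip().lower()
        # Collapse whitespace and case so trivially different copies do not count as conflicts.
        definition = " ".join(row["definition"].split()).lower()
        definitions[term].add(definition)
    return {term: defs for term, defs in definitions.items() if len(defs) > 1}


# Made-up export combining glossaries kept by different departments.
SAMPLE = """term,definition,source
Active Customer,Customer with a purchase in the last 12 months,sales.xlsx
active customer,Customer with a login in the last 30 days,marketing.xlsx
Churn,Customer who cancelled a subscription,sales.xlsx
Churn,customer who cancelled a  subscription,support.xlsx
"""

conflicts = find_conflicts(SAMPLE)
for term, defs in conflicts.items():
    print(term, "has", len(defs), "competing definitions")
```

A list like this gives subject matter experts and data stewards a concrete starting point for agreeing on a single definition per term.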

Stages of building a business glossary

 


Fig-2: Stages of building an enterprise business glossary.

 

Building a business glossary for an enterprise can be broken down into four stages. As a data professional, you will follow these stages during your glossary-build process. It is also important to note that a glossary-building process is not a one-person job. It requires a team of data stewards, subject matter experts, data owners, and data engineers to iteratively support you with the necessary artefacts in your journey to build a glossary.

  • Information Gathering Stage

This is the first, preliminary stage. It is essential to investigate, search for, and collect all the knowledge that is available about your data. The knowledge acquired at this stage reduces the time and effort needed to build a business glossary.

During this stage, it is also essential for teams to start small and narrow the scope of the work, which keeps the glossary-building effort manageable. Also, consider using industry standards such as ISO/IEC 27001 and ISO/IEC 11179 to identify and prioritize the glossary-building work. For example, you can start with sensitive data discovery in your project.

  • Defining Business Terms Stage

In this stage, you define the business terms or elements and capture their attributes and their relationships with other business terms. Key components such as glossary and term descriptions, owners, sensitivity properties, data associations, and the governance workflow need to be considered. The business glossary and its components should be in line with the overall business goals and based on the information gathered in stage 1.

  • Building Consensus Stage

Building consensus is perhaps the most difficult stage of glossary building. This is when the business glossary created in stage 2 is circulated to the relevant stakeholders in the organization for review, and their proposals or changes are considered. It is an iterative process that involves getting majority agreement on the changes suggested to the business glossary.

  • Communication, Tagging and Curation Stage

The last stage revolves around communicating the changes to the broader audience, primarily targeting the data teams who are enforcing the glossary. It is also essential to communicate the impacts of the business glossary.

At this stage, the data catalog team can also start tagging data objects to a business glossary by enabling the AI/ML model to learn new rules about the data.

At this stage, data teams are also responsible for monitoring the real-world impact of the business glossary. Any suggestions or required changes can be added to the list of revisions for the next version of the business glossary.
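
The iterative review in the consensus stage can be pictured as a small state machine. The sketch below is one possible reading, assuming a term is approved once a strict majority of its assigned reviewers accept it; the status names, the majority rule, and the class itself are hypothetical and do not describe a Lumada Data Catalog workflow.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"


@dataclass
class TermProposal:
    term: str
    definition: str
    reviewers: set[str]
    status: Status = Status.DRAFT
    votes: dict[str, bool] = field(default_factory=dict)
    revision: int = 1

    def submit(self) -> None:
        """Circulate the current revision to the reviewers."""
        self.status = Status.IN_REVIEW
        self.votes.clear()

    def vote(self, reviewer: str, accept: bool) -> None:
        if self.status is not Status.IN_REVIEW:
            raise RuntimeError("proposal is not in review")
        if reviewer not in self.reviewers:
            raise ValueError(f"{reviewer} is not a reviewer for {self.term}")
        self.votes[reviewer] = accept
        # Assumed rule: approve once a strict majority of all reviewers has accepted.
        if sum(self.votes.values()) * 2 > len(self.reviewers):
            self.status = Status.APPROVED

    def revise(self, new_definition: str) -> None:
        """Start another round with an updated definition."""
        self.definition = new_definition
        self.revision += 1
        self.status = Status.DRAFT
        self.votes.clear()


# Example round with made-up reviewers.
proposal = TermProposal(
    term="Active Customer",
    definition="Customer with a purchase in the last 12 months",
    reviewers={"steward", "sales-sme", "finance-sme"},
)
proposal.submit()
proposal.vote("steward", True)
proposal.vote("sales-sme", True)
```

Calling `revise` after feedback resets the votes and increments the revision number, which mirrors the repeated circulation of the glossary until stakeholders agree.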

Team Structure and Roles Assignment

 


Fig-3: Sample Team Structure and roles assignments

 

Building a business glossary for an enterprise is not a one-person job. It requires people from multiple departments to collaborate and communicate efficiently. The diagram above shows a sample team structure and role assignment. The team consists of the Data Catalog Team (Business Glossary Builder), Data Stewards / Owners, Subject Matter Experts from both the business and engineering sides, Data Analysts, and the Engineering Team.

  • The Data Catalog Team (Business Glossary Builder) is responsible for communicating, coordinating, and building out the business glossary as its primary role within the organization.
  • During the glossary-building stages, a team of experts (Data Stewards and Subject Matter Experts) is responsible for guiding the data catalog team in the right direction when building out specific attributes of a glossary.
  • The Data Analysts and the Engineering Team are responsible for applying the newly added or revised business glossary in their data pipelines, reports, and dashboards to ensure uniformity between the business and engineering teams.

The structure and implementation can vary by organization, and it is advisable to seek expert help when setting up a catalog and business glossaries. A pitfall I have seen in the past is that the teams supporting the data catalog team do not have enough dedicated time for the work.

 

Conclusion

In an enterprise, data teams can create more than 2000 business terms when building a business glossary. Starting with the right approach and the right team is the key to the success of business glossaries. Starting with small use cases and reusing existing knowledge is the right way forward for any data team. Strong collaboration and effective communication among the data experts and teams also help in building a business glossary successfully.


Lumada Data Catalog enables its users to build business glossaries and terms efficiently.

Learn more about Lumada Data Catalog and how it helps your organization drive business value out of your data.

Try Lumada Data Catalog today to find, understand and govern your data.

Author: Rishu Shrivastava, Solutions Architect - Lumada, ASG, Hitachi Vantara, UK

Reviewer: Jon Hanson, Lead Architect - Lumada, ASG, Hitachi Vantara, USA


Comments

04-04-2023 03:53

Very good article by my colleague @Rishu Shrivastava on how to build a business glossary that allows you to create a robust Data Governance practice that puts the data, securely, in the hands of the people who can get the most value out of it.

I would just like to add that, based on my experience, you can approach a Data Catalog project in, at least, three ways.

One option is to start "just" from the business, creating the entities that serve to describe the business, without paying too much attention to the existing data and how it is stored. The downside here is that it can then be difficult to automate the governance or map those entities to the actual data that IT handles.

Another option is to focus on the data you have, analyzing databases, data lakes, and clouds and collecting all the information contained there without paying too much attention to the business entities. The downside here is ending up with an "infinite" glossary that does not represent the business and whose overly technical representation of the data keeps data citizens from using it in a self-service way.

From Hitachi Vantara, thanks to our Lumada Data Catalog solution, we combine both approaches to allow an effortless auto-discovery of existing data while our glossary capabilities speed up the construction of a business glossary, including business entities and relationships between business terms, while testing very quickly how these terms can be associated with data thanks to our patented #ml #artificialintelligence  #algorithms 

If you are interested in knowing more about our solutions, you know where to find us.

#datacatalog  #datagovernance  #Data