Blog: Jill Dyché

Jill Dyché

There you are! What took you so long? This is my blog and it's about YOU.

Yes, you. Or at least it's about your company. Or people you work with in your company. Or people at other companies that are a lot like you. Or people at other companies that you'd rather not resemble at all. Or it's about your competitors and what they're doing, and whether you're doing it better. You get the idea. There's a swarm of swamis, shrinks, and gurus out there already, but I'm just a consultant who works with lots of clients, and the dirty little secret - shhh! - is my clients share a lot of the same challenges around data management, data governance, and data integration. Many of their stories are universal, and that's where you come in.

I'm hoping you'll pour a cup of tea (if this were another Web site, it would be a tumbler of single-malt, but never mind), open the blog, read a little bit and go, "Jeez, that sounds just like me." Or not. Either way, welcome on in. It really is all about you.

About the author

Jill is a partner with Baseline Consulting, a data integration and business intelligence (BI) services firm. She is an internationally recognized speaker and writer on the topic of the business value of technology, and has been featured in the Wall Street Journal, CIO Magazine, Intelligent Enterprise and Newsweek.com. Jill leads the Customer Data Integration, Master Data Management and Data Governance channel for the BeyeNETWORK, and blogs regularly on those and other IT-related topics. She is the author of two acclaimed books, e-Data, which introduced enterprise data to business executives, and The CRM Handbook, which was the best-selling book on the topic of customer relationship management. Her latest book, Customer Data Integration: Reaching a Single Version of the Truth – co-authored by Baseline Partner Evan Levy – was recently published by John Wiley & Sons.

Editor's note: More articles, resources, news and events are available in Jill's BeyeNETWORK Expert Channel. Be sure to visit today!


By Caryn Maresic, Senior Consultant


Contribute to society and human well-being.   Avoid harm to others.   Be honest and trustworthy.   Be fair and take action not to discriminate.   Those are the first four items in the ACM Code of Ethics.   The ACM, for those who may not be familiar, is the Association for Computing Machinery, whose mission is to advance computing as a science and a profession.

In the course of a recent assignment with a major insurance carrier, our team was asked to create various target lists for sales and marketing based on certain selection criteria. While it is likely that everything they asked for was legal and ethical, we never questioned it. As good Data Stewards, what should we have done in this case? Should we be asking the business to justify their selection criteria? Should we be checking to make sure there are no legal or ethical violations inherent in the rules? A little research on the topic turned up this presentation, which is very interesting and thought-provoking. That being said, it focuses more on hot-topic issues like privacy and identity theft than it does on the ethical dilemmas of sales and marketing.

This article tells the story of an “Agent Profile System” set up by an insurer in Texas to rate its agents. Agents who didn’t score well were punished by not getting any new business. The agents filed suit, contending this was illegal as it compelled them to drop clients with low credit ratings, low income, and/or those who lived in undesirable locations in order to boost their own scores. Is the IT team that built the Agent Profile System responsible, at least in part, for discrimination?

When we are dealing with situations where lives are in danger, the ethical answer is clear. For example, no reasonable person would deny that engineers working on Space Shuttle software have a duty to report concerns regarding possible malfunction. In the BI community our issues are not always so clear-cut. Sometimes discrimination is good for the business’ bottom line, yet still unethical and possibly illegal. If we go back to the statements “Avoid harm to others,” “Be fair” and “take action not to discriminate,” it appears that we should take seriously our responsibility to be involved in how the business uses data. In fact, I would argue that we should make ethical considerations part of our data governance program.

photo by Robert S. Donovan via Flickr (Creative Commons License)



Caryn has over 20 years of experience in providing high-quality data solutions to clients in the areas of Business Intelligence, Data Warehousing and System Integration. Caryn has expertise across industries with an emphasis in Pharmaceutical, Manufacturing, and Insurance. Prior to joining Baseline, she ran her own consulting company.



Posted July 15, 2010 6:00 AM


By Caryn Maresic, Senior Consultant



The Data Architect is the core of any BI team. It is important to choose a person with the right skill set. As I tried to put together a list of skills I looked to IT Toolbox and Database Answers for help, but my mind wandered a bit. System Construction. Data Architect. Data Warehouse. Software Factory. We like to portray what we do in terms of construction and/or manufacturing. A recent client bemoaned her department’s inability to move from “building custom cars” to “an assembly line”. Comparing ourselves to these burly industries might make us feel strong, but does it accurately represent what we aspire to be?

What is a Data Architect?   What should they know how to do?     I borrowed the following description from this article. Before you click, read on and see if you can guess what this is really describing.   I think it is a great description for a Data Architect:

A Data Architect is qualified by education, experience, and imagination to enhance the function and quality of systems. The purpose of this pursuit is to improve the quality of life, increase productivity, and protect the health, security, and welfare of the business.

The best Data Architects are capable of analyzing a client's needs, goals, safety and business requirements and integrating this information into a design that is both pleasing to the eye and functional. They will work with the client closely to develop preliminary design concepts that meet their aesthetic, functional, and economic needs while maintaining adherence to standards.

In essence, the best Data Architects are part detective, part artist, and part psychologist and they use these skill sets to create systems that fit a client's tastes and needs with their budget in mind.

Doesn’t that sound like a great job?   Sign me up!   What this is actually describing is an interior designer.   While I doubt that HGTV has any plans to showcase the next dashboard you build, we are indeed closer to Designing Women than Rosie the Riveter!   Stay tuned for future posts on the talents of a good Data Design Star.

photo by Annahape Gallery via Flickr (Creative Commons License)




Posted July 8, 2010 6:00 AM


By Caryn Maresic, Senior Consultant



Most Data Warehouse designs include constructs for Address, Phone, and/or Email for Customers. Len Silverston came up with what he calls a Universal Data Model that does a very good job of abstracting address, email and phone number data. I have seen clients use the Contact Point portion of his model, both as-is and with a few simplifications, with great success. That being said, in the area of Marketing and Sales, the manner in which we reach out to our customers and prospects gets more diverse every day. Disneyland has just partnered with Verizon so that park guests can get real-time information about the park and play Disney games on their phones... and, of course, Disney gets access to more information about its customers!

How does this new and ever-changing world of communication change the way we think about and model contact points? What would my “address” look like if I were near the Haunted Mansion looking for a lunch spot? Would it be different than if I were at Downtown Disney looking for a cup of coffee? On Main Street looking for Winnie the Pooh? In all instances I would be using the same phone, possibly the same IP address, but I would be in different locations, which would be important to the marketeers at Disney.

As time goes by (and cell phone GPS systems become more accurate) I suspect that the way we run marketing campaigns to smart phones will be similar to the way in which we use billboards today. Where the customer is physically located at any given time will be as important as the phone number and/or IP address, thus creating a two-dimensional contact point.
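One way to picture a two-dimensional contact point is as a pairing of the familiar contact point (phone, email, IP address) with a location observed at a point in time. The sketch below is only an illustration: the class names, field names, phone number, and coordinates are all made up for the example and are not taken from Silverston's model or from any real system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ContactPoint:
    """First dimension: how we reach the customer."""
    kind: str   # hypothetical values: "phone", "email", "ip"
    value: str


@dataclass(frozen=True)
class LocatedContact:
    """Second dimension: where the contact point was at a given time."""
    contact_point: ContactPoint
    latitude: float
    longitude: float
    observed_at: datetime
    place_label: Optional[str] = None  # hypothetical free-text label


# Illustrative data only: one phone, observed in two places on the same day.
phone = ContactPoint(kind="phone", value="555-0100")
sightings = [
    LocatedContact(phone, 10.0, 20.0, datetime(2010, 7, 1, 12, 0), "Haunted Mansion"),
    LocatedContact(phone, 10.1, 20.1, datetime(2010, 7, 1, 15, 30), "Downtown Disney"),
]

# The contact point is identical; only the location dimension differs.
distinct_contact_points = {s.contact_point for s in sightings}
distinct_places = {s.place_label for s in sightings}
```

Keeping the location in its own record, rather than as a column on the contact point, means the same phone number can accumulate many observations without being duplicated.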

Have you come across this issue in your organization? Have you changed your data model to include two-dimensional contact points? If not, has the use of smart phones changed your data model in other ways?

photo by wrayckage via Flickr (Creative Commons license)






Posted July 1, 2010 6:00 AM


By Carol Newcomb, Senior Consultant

Diamond in the Rough: Data Quality

The third part of my summertime primer addresses Data Quality Analysis. Don’t even start a data quality analysis until you have completed the first two steps of your Root Cause Analysis: investigate and prioritize any potential causative factors, and start your metadata assessment. Otherwise, you may be misled by your findings.



Data quality is defined as complete and accurate data that is ready for business consumption. Sources of poor data quality may include lack of data entry rules, unclear data element definitions, inconsistent metadata definitions for field type, format or intent, or breakdowns in data transformation processes as data flow between systems or applications. Poor data quality results in bad business decisions; it contributes to major problems in using data effectively, and costs companies millions of dollars per year in rework and inefficiency. Data quality, in combination with robust metadata definitions, is part of the foundation of good data governance.

Data Quality Analysis

A Data Quality Management process should be designed to enable an area to start with a simple approach and over time to mature to one that is more proactive and comprehensive.   Initially, investigation may be focused on single data elements or events.   As patterns, data commonalities and other relationships appear, the data quality management process will grow to support complete business processes.     A mature data quality management process will not just resolve individual issues; it will also track relationships between data elements, ensure that business rules are consistent and generate statistical analyses to monitor previously addressed issues to ensure that data quality is stable and that an early warning system is in place as part of the data governance program.   The goal is to design a data quality management lifecycle, as shown in this diagram:


[Figure: data quality management lifecycle]

Initial Data Quality Analysis Process

I. Define data scope


    • Determine data elements that are associated with or are direct results of the reported issue

    • Check that all metadata definitions are present and current

    • Enlist the involvement of the Data SME or Data Stewards

    • Identify all source systems where the data originates, is entered, or is derived



II. Extract and profile the data


    • Extract the relevant data from all key source systems.

    • Design the profile.   A profile will consist, at a minimum, of total record counts, min/max values, frequency of unique values, and frequency of invalid values (if defined) for each data element profiled.  

    • Profile the data to determine key characteristics that are contributing to the issue, such as:


      1. Wrong values

      2. Missing values

      3. Corrupt transformation processes

      4. Incorrect business rules

      5. Incorrect usage rules
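The minimum profile described above (total record counts, min/max values, frequency of unique values, and frequency of invalid values where a rule is defined) can be sketched in a few lines. This is a minimal illustration, not a profiling tool: the function name `profile_element`, the sample ages, and the 0 to 120 validity rule are all assumptions invented for the example.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def profile_element(values: Iterable,
                    is_valid: Optional[Callable[[object], bool]] = None) -> dict:
    """Profile one data element: counts, min/max, and value frequencies."""
    values = list(values)
    present = [v for v in values if v is not None]
    profile = {
        "total_records": len(values),
        "missing": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "value_frequency": Counter(present),
    }
    if is_valid is not None:
        # Invalid-value frequencies only exist when a validity rule is defined.
        profile["invalid_frequency"] = Counter(v for v in present if not is_valid(v))
    return profile


# Hypothetical sample: ages extracted from a source system.
ages = [34, 51, None, 34, -1, 212, 47]
age_profile = profile_element(ages, is_valid=lambda a: 0 <= a <= 120)
```

Here the profile surfaces one missing value and two wrong values (-1 and 212), which map to items 1 and 2 in the list above. Whether the cause is a corrupt transformation or an incorrect business rule still has to be worked out with the Data Stewards.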





III. Analyze Data Profile Results


    • Summarize the key findings from the profile detail

    • Determine what key drivers are contributing to the impact

    • Determine accountability for the data quality issue

    • Involve other Data Stewards in troubleshooting and designing the data quality solution



IV. Design the Corrective Action Plan

Two types of plans should be developed to address known data quality issues: a corrective action plan to fix the immediate source of the problem identified, and an ongoing monitoring plan, where thresholds have been determined and metrics are routinely collected and reported to data stakeholders.   This monitoring process should be scalable based on the number of data elements being tracked.



    1. Corrective Action Plan


      • Does the scope of the problem warrant a change in metadata definitions, business practices or data entry rules?

      • Does the scope of the problem warrant a data governance standard?

      • Does the corrective action plan include details on how to fix the source of the problem as well as ways to correct historical data in the system?



    2. Preventive Action Plan


      • This plan will be designed to minimize the probability of data quality issues recurring

      • Determine ‘early warning triggers’ based on designated thresholds.   These thresholds should reflect the business tolerance for inaccurate data (is 95% acceptable?)

      • If data latency is the source of a data quality issue, then latency thresholds should be included in the monitoring plan

      • Determine how frequently results of the monitoring plan will be reported to data stakeholders or governance oversight committees
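An early warning trigger of the kind described in the Preventive Action Plan can be as simple as comparing the observed rate of valid values against the business tolerance. The sketch below assumes a 95% tolerance, as in the question above; the names `QualityThreshold` and `check_threshold` and the `postal_code` counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QualityThreshold:
    element: str
    min_valid_rate: float  # business tolerance, e.g. 0.95 (an assumption)


def check_threshold(threshold: QualityThreshold,
                    valid_count: int, total_count: int) -> dict:
    """Compare the observed valid rate with the tolerance and flag an alert."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    rate = valid_count / total_count
    return {
        "element": threshold.element,
        "valid_rate": rate,
        "alert": rate < threshold.min_valid_rate,
    }


# Hypothetical monitoring run: 930 of 1,000 postal codes passed validation.
postal_rule = QualityThreshold("postal_code", 0.95)
result = check_threshold(postal_rule, valid_count=930, total_count=1000)
```

With these numbers the valid rate is 93%, below the 95% tolerance, so the alert fires. The same result record is the kind of metric that would be reported to data stakeholders at whatever frequency the plan sets.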





So, now that summer is officially here, this wraps up my Data Governance Primer series.   Time for some iced tea and my favorite beach towel.   Come August, these little refreshers might be just the thing!

photo by Swamibu via Flickr (Creative Commons License)


Carol Newcomb is a Senior Consultant with Baseline Consulting. She specializes in developing BI and data governance programs to drive competitive advantage and fact-based decision making. Carol has consulted for a variety of health care organizations, including Rush Health Associates, Kaiser Permanente, OSF Healthcare, the Blue Cross Blue Shield Association and more. While working at the Joint Commission and Northwestern Memorial Hospital, she designed and conducted scientific research projects and contributed to statistical analyses.



Posted June 24, 2010 6:00 AM


By Carol Newcomb, Senior Consultant

Minding Your Metadata

The second part of my summertime primer addresses ‘Minding your Metadata’.   I can just hear the collective groans and yawns now.   Sorry, but metadata collection is one of those necessary evils that may not be fun in the doing, but having it available as a resource to understand your data and use it appropriately is invaluable.   And you just might find some interesting surprises along the way!



Metadata: What Is It & Why Do I Need It?

As you start your Root Cause Analysis (see last week’s primer), you first need to examine existing data definitions (or lack thereof).   Metadata is the foundation of good data management and forms the basis for Data Governance.     Pardon me for stating the obvious, but metadata is fundamental to investigating and resolving data issues and it is the first place to start when investigating data quality issues.

Metadata is “data about data”. Plain and simple. It includes descriptive information about electronic data used in common daily business practice. Metadata includes items usually found in a data dictionary: field name, field length, retention rules, and security access, as well as additional descriptive information that may include data origin (source or system), creation/entry date, method of creation (key-entry or the result of a calculation), purpose of the data (its intended use), how frequently it gets updated or refreshed, and current location in a database (table, view, schema). If a data element is the result of calculation logic or groupings (such as age categories), those business rules used to generate the resulting data values should be collected as part of the metadata.
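To make the list of metadata items concrete, here is a minimal sketch of what a single metadata record might look like. The class and field names are assumptions chosen to mirror the items named above, and the `age_category` example values (source system, retention period, age bands) are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElementMetadata:
    field_name: str
    field_length: int
    retention_rule: str
    security_access: str
    source_system: str
    creation_method: str    # "key-entry" or "calculation"
    purpose: str
    refresh_frequency: str
    location: str           # table, view, or schema
    business_rules: List[str] = field(default_factory=list)

    def is_derived(self) -> bool:
        return self.creation_method == "calculation"

    def missing_rules(self) -> bool:
        """A derived element with no recorded business rules is a metadata gap."""
        return self.is_derived() and not self.business_rules


# Hypothetical record for a grouped element such as age categories.
age_category = DataElementMetadata(
    field_name="age_category",
    field_length=10,
    retention_rule="7 years",
    security_access="internal",
    source_system="member_master",
    creation_method="calculation",
    purpose="demographic reporting",
    refresh_frequency="nightly",
    location="dw.member_dim",
    business_rules=["0-17 = Child", "18-64 = Adult", "65+ = Senior"],
)
```

The `missing_rules` check captures the last point in the paragraph: when an element is calculated or grouped, the rules that produced it belong in the metadata alongside the name and length.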

A good example of metadata that you may use every day would be ‘document properties’ in a Word document.   This feature captures data on the original document creation date, most recent access and update times, document creator, count of characters, words and pages.   If the document should be private, this will be indicated in its properties.   You may also tag the document by indicating key words in order to make it easier to find by you or others.

A few of the benefits of Metadata Management include:


  • Clarify rules for data entry

  • Reduce ambiguity around appropriate use of data elements

  • Eliminate problems associated with not having data definitions, business rules or transformation logic available

  • Validate legitimate values at the data element level

  • Provide evidence to regulators that security and confidentiality are protected

  • Centralize the storage and accessibility of metadata for end-users

  • Reduce the amount of effort required to research data results.


A Metadata Management Repository is a central location or system to collect and store metadata that may exist in disparate parts of the organization (data dictionaries, systems, spreadsheets, or people’s brains). The metadata repository will store detailed definitions centrally on a network where other users can find it.

There are three general sources of metadata that should be included in this repository:

Business Metadata – Business metadata attributes facilitate identification, understanding, and appropriate use of existing data elements. These include clear business names and descriptions, relevant business rules, descriptions of the data sources, security and privacy rules, etc.

Technical Metadata – Describes the technical attributes of data such as physical location (host server, database server, schema, etc.), data types, any transformations applied and domain of valid values, relationships to other data elements, precision, and lineage. Technical metadata is used by business users and by IT staff to design efficient databases, queries, and applications, and to reduce duplication of data.

Operational Metadata – Describes the attributes of routine operations on data and related statistics. These include job schedules and descriptions, data movement and transformation processes, data read, update and performance statistics, volume statistics, backup and archival information. Operational metadata is used by operations staff and DBAs to tune the system and ensure its continued efficient operation. It is also used by business users to track such events as “last use” of a field and “last load” of a data element.

Exciting stuff, huh? Well, the whole point of metadata is to have the information about data available to a multitude of users when they need it, to keep it current, and to avoid confusion around usage. So if you appreciate having a clean bathroom, and knowing where you keep your antiperspirant, you will also appreciate having good metadata! The time for spring cleaning is well overdue.




Posted June 17, 2010 6:00 AM


By Carol Newcomb, Senior Consultant



They say that Data Governance is about People, Process and Organization. Much of the design work in planning for data governance is around people’s roles and responsibilities, then designing the organizational structure that will provide authority for decisions to be made and enforced. The processes, however, are not new. They are probably already being practiced within your organization, just in a decentralized, informal way. In this blog series, I discuss the processes for 1) investigating and isolating the data quality issues (Root Cause Analysis), 2) starting to collect complete Metadata Definitions, and 3) performing Data Quality Analysis. Only when your governance group has worked through each step, in order, are you likely to design the appropriate solution.

Root Cause Analysis

The process of data governance is fundamentally very simple.


  1. Identify the data quality issues to address

  2. Prioritize the portfolio of issues to isolate/tackle the most important

  3. Perform Root Cause Analysis to determine the true source of the data issue

  4. Design the corrective action

  5. Formalize the correction through consideration & approval by the Data Governance organization

  6. Implement the fix

  7. Monitor the results


It seems like when we start to map out the discrete steps involved in the data governance process, much of the work is already being done in informal ways throughout the organization.   What some folks don’t realize is that data governance is often nothing more than formalizing a whole bunch of informal processes that either don’t get communicated, or aren’t accepted as a data standard.

Root Cause Analysis is the process of identifying probable causes of a data issue, and isolating the contributing factors.   In order to resolve any particular issue, root cause analysis involves fact-finding, drilling into details of the problem, talking to the right people, and separating out other associated (but not contributing) factors.

A standard tool for supporting the detailed findings is the Ishikawa Diagram, below.   


[Figure: Ishikawa Diagram]

To conduct a thorough Root Cause Analysis, use the following checklist:

  • Diagnose the problem as if you are a physician or a detective. Consider all possible sources of the symptom. Don’t rule anything out yet!

  • Boil the ocean—be exhaustive and creative.

  • Don't practice problem solving before collecting all possible causes.

  • Practice the “5 Whys”: don’t stop asking “Why” until you have exhausted every conceivable potential reason.

  • Rank the factors if possible.   Identify the Primary causes versus the Secondary or associated factors.

  • Rule out each possible factor one at a time.   Justify why (you may need to come back to this later).

  • Find all potential business process and data owners to involve them in your understanding of the possible sources of the problem.

  • Share the findings with everyone involved in troubleshooting. They could rule out certain factors with their knowledge.

  • Test your hypotheses with actual data.     

  • Fix the problem and test again.

  • Publish/share your findings and fixes.   Communicating your findings may reveal additional factors you hadn’t considered.


After a thorough Root Cause Analysis has been completed, Data Stewards should proceed to Metadata Analysis and Data Quality Analysis.   These two techniques will be discussed in my next blogs.





Posted June 10, 2010 6:00 AM


By Caryn Maresic, Senior Consultant



Julia’s parents were planning a vacation. Her mother thought Pensacola would be a great destination; she’d heard so much about the wildlife, especially the dolphins! Her father wanted to see the National Naval Aviation Museum and the Blue Angels. Since Julia had traveled extensively, her parents asked her to make all the arrangements. While having dinner with them to discuss plans, she jotted down the following notes:


  • Location:   Moderately-priced hotel close to water/sights.

  • Budget: $3,000 for transportation and accommodations.

  • Activities:   Beach and nature activities (Mom), science/historic sights (Dad)

  • Duration: 10 days.


Julia felt honored that her parents trusted her to get the job done. After doing some online research, she made all the reservations and met with her parents to review them. She eagerly awaited the look on her parents’ faces as they scanned the vacation itinerary and read through the glossy brochures.

“Hawaii?” they said in unison. “We didn’t want to go to Hawaii!”

“Honey, we chose Florida because we can drive there. I don’t want to fly anymore. Flying is such a pain,” Dad grumbled.

“I appreciate what you’ve done, Julia, but an old friend of mine lives near Pensacola and I was hoping to visit while we were there,” said Mom.

“But, Mom!” exclaimed Julia. “You said you wanted beaches, dolphins, sunny weather. Dad, you like science and history. What about Pearl Harbor? You two can’t go to the Gulf Coast. What about the oil spill?”

What happened here is typical of what happens to IT projects all the time.   It’s easy to say that we wouldn’t do what Julia did.   Would we?   Don’t we oftentimes:


  • Interview the business and record the requirements in an abstract way.

  • Believe that we can deliver something better than what the business asked for.

  • Assume that the business lacks the capability to understand the technology.

  • Fail to get all of the requirements.   Not exactly our fault, but still a problem.

  • Neglect to keep the business involved in the process.


There has been a lot of buzz on IT-Business alignment of late, including this article on some specific companies that are going the extra mile, Beyond Alignment, as well as this one on lack of user involvement, Why IT Projects Fail: Lack of User Involvement. Most companies aren’t as progressive. The willingness to work together has to occur at all levels. Only when we let the business drive can we deliver, if not what they asked for, then at least something useful.

photo by stevendepolo via Flickr (Creative Commons license)






Posted June 3, 2010 6:00 AM


By Rob Paller, Consultant



Recently at a client, the data warehouse administrator was asked to define a sandbox environment in the production data warehouse for analysts and developers working on a small project. The idea behind this sandbox was to give the team a working area for collaboration and intermediate storage of results while working with the data in a purely ad hoc capacity. It was instantly recognized that this could be the start of something bigger within the organization, something that the incumbent business intelligence tools could not currently provide. The response had to be formulated quickly to avoid stifling the creativity of the analysts or, worse, the progress of the project. But care had to be taken as well: if managed incorrectly, the sandbox could get out of hand and become a waste of system resources and a drain on human resources that had already been spread thin. The business unit in question was looking to move beyond the confines of the current business intelligence environment and push the edges.

This was a group of analysts that wanted to get their hands dirty and weren’t afraid to fail. They wanted to mash data together in ways that the business intelligence tools, with their controlled ad hoc environments, had not previously allowed. This was data mining for the next set of KPIs that would shape the way business moves forward.

The concept of agile analytics is not new; eBay presented on and blogged about this concept in 2008. The idea at this client was simple. By leveraging the existing enterprise data warehouse system to house the sandbox environments, the duplication of data is all but eliminated. Groups interested in sharing data between their sandbox environments are strongly discouraged from doing so until the data has been properly integrated into the production environment. The sandbox environments would also be given a short life expectancy at their inception, to prevent prototypes from becoming production and data from ending up in a wasteland. This all sounded great on paper.

In the midst of a development architecture overview, a brief conversation among a few enterprise architects uncovered the potential Screw-Me Scenario that could bring the concept of agile analytics to an untimely demise. “The users of the data warehouse are not permitted to write ad hoc queries outside of a controlled business intelligence tool. They might write a bad query.” Thanks for the warning; we’ll be sure to refine our pitch to the enterprise architects to defuse this scenario before it turns ugly.

In Oliver Ratzesberger’s presentation for eBay’s Analytics as a Service, he acknowledges that the metrics we already know are cheap and the unknown metrics are expensive. But the known metrics are not pushing the edges. Known metrics are found in the middle of the box. Agile analytics is about pushing the edges of how your enterprise data warehouse is used, to improve the response to the needs of the business. It is about the evolution of the user community from one that plays in controlled ad hoc environments to one that experiments with new ideas and does not fear failing along the way. Agile analytics is about encouraging your users to reach out for the edges and P U S H. Only once the edges are stretched can the middle of the box be redefined.

photo by edenpictures via Flickr (Creative Commons License)


Rob Paller is an expert at business analytics and database administration. Since joining Baseline, Rob has been responsible for developing a case analysis system to streamline the oversight of food assistance benefits, implementing a common citizen data model, and assisting in the rollout of a new public assistance data model integrating data from over 10 years of legacy with a new benefit eligibility determination system.


Posted May 27, 2010 6:00 AM


By Caryn Maresic, Senior Consultant



Logical Data Models (LDMs) were the standard means of recording business rules and data definitions back in the ’80s and ’90s. Business and IT partnered to learn the art of data modeling and third normal form in hopes of finding common ground for recording requirements. Over time, it became evident that in today’s fast-paced development environment the business doesn’t have the time to digest all the nuances in a data model. It is IT’s job to gather information via interviews. Many times, we skip the LDM task because we don’t have time or because LDMs seem too academic, and we find ourselves asking...

Are Logical Data Models really necessary?  Yes!

LDMs drive BI design by defining business rules independent of physical implementation. There are many tools (CA ERwin Data Modeling, Embarcadero Technologies) and techniques (UML, IDEF, Bachman) in use today. To illustrate the difference between a Physical Data Model (PDM) and an LDM, let’s look at an example: Customer/Address relationships.

[Graphic 01: PDM for Customer/Address]

The PDM above (created using) supports many Customer/Address business rules, but it does not tell me the Customer/Address relationships as defined by the business.

In contrast, the following LDM is more rigorous and definitive:

[Graphic 02: LDM for Customer/Address]

What does the LDM above tell me that the PDM did not? 

  • There are two types of Addresses – Physical and Mailing.
  • The Customer Bill To Address can be either a Physical or Mailing Address.
  • Customer Ship To Addresses must be Physical Addresses.
  • Every customer must have a Bill To Address.
  • A Customer has zero, one or many Ship To Addresses.
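
As one illustration, the rules listed above can also be written down as checks in code. The following is a minimal sketch in Python, not a rendering of the models pictured here: the class and field names (`AddressType`, `Address`, `Customer`, `bill_to`, `ship_to`) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AddressType(Enum):
    # Rule: there are two types of Addresses.
    PHYSICAL = "physical"
    MAILING = "mailing"


@dataclass(frozen=True)
class Address:
    line: str
    type: AddressType


@dataclass
class Customer:
    name: str
    # Rule: every Customer must have a Bill To Address (either type).
    bill_to: Address
    # Rule: a Customer has zero, one or many Ship To Addresses.
    ship_to: List[Address] = field(default_factory=list)

    def __post_init__(self):
        if self.bill_to is None:
            raise ValueError("Every Customer must have a Bill To Address")
        for address in self.ship_to:
            self._check_ship_to(address)

    @staticmethod
    def _check_ship_to(address):
        # Rule: Ship To Addresses must be Physical Addresses.
        if address.type is not AddressType.PHYSICAL:
            raise ValueError("Ship To Addresses must be Physical Addresses")

    def add_ship_to(self, address):
        self._check_ship_to(address)
        self.ship_to.append(address)
```

With these hypothetical classes, a Customer billed to a post office box is accepted, but adding that same Mailing Address as a Ship To Address raises a `ValueError`. The rule is stated once, in one place, which is the same benefit the LDM gives the business conversation.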

All of these relationships can be implemented using the PDM, but they aren’t dictated by the PDM.  The LDM explicitly defines each relationship and prompts us to ask better questions of the business, such as:

  • Are there any Customers who do not have a Bill To Address?
  • Are there any Customers who do not have a Ship To Address?
  • When there are multiple Ship To Addresses, how do you know which one to use?
  • Are there any Customers who have more than one Bill To Address?
  • Do you know of any upcoming business initiatives that might change these rules?
  • Was there ever a time when the business rules were different?

The answers to these questions may require changes to the model, but without the LDM, we may not have asked the right questions! The best part about the LDM is how valuable it is to the rest of the development process. It will be used in the production of training materials, test plans and test cases, data profiling and migration efforts and, finally, physical database design.


Caryn has over 20 years of experience in providing high-quality data solutions to clients in the areas of Business Intelligence, Data Warehousing and System Integration. Caryn has expertise across industries, with an emphasis on Pharmaceutical, Manufacturing, and Insurance. Prior to joining Baseline, she ran her own consulting company.



Posted May 20, 2010 6:00 AM


By Carol Newcomb, Senior Consultant



This is the final installment in a 3-part blog series, discussing the opportunities that cloud computing offers in healthcare. I present futuristic scenarios from each healthcare contingent’s vantage point: patients, providers and payers. A myriad of technologies exist today. It will be up to healthcare organizations of all types to get their data ready to meet the demands for data integration, security, portability, transparency and accountability in this brave new world. Mature data governance systems and enterprise-wide data integration will be critical in this endeavor.

III: Payers

I’ve worked in the insurance industry for 30 years now, and I’ve never seen anything like this! For years, we had consultants running around here, designing portals, databases, dashboards and training us on new software to use in running analyses. But half the time, we just went back to using the antiquated systems that most of our data servers were designed to use. In the last couple of years, we have completely stopped talking about those different hardware platforms. Now, the Risk Management division, Actuarial, Medical, Quality, Marketing and Claims Processing all use one software package; all the data management (since we get millions of transactions and claims each day) is handled offsite. It’s never been this simple! We still have our Data Governance Committee that works with each department to flush out any issues, but things have really improved.

Since all the data is now integrated on what they call “The Cloud”, we get daily dashboard updates on our desktops, showing use trends, costs, disease prevalence around our region, geographic mappings of where the large employers have families, results of marketing activities, and new health and wellness information that is also shared with our subscribers. We had a period of some pretty large layoffs, since our entire IT department is now outsourced, and all those consultants have disappeared.

Our Fraud and Risk Management analysts used to have hundreds of spreadsheets on their desktop computers, which they would constantly be updating and piecing together to look for patterns or gaps.   Now they can use more sophisticated statistical detection algorithms because the data is updated and fed to them each night.   The simplest algorithms are actually run offsite, and those results are delivered to them daily.   From those, they can then drill into services or providers that look suspect.   They can now turn around reports in about 2 days, where it used to take them 2 months!

Our business model has shifted from using claims history to cut anticipated high-cost cases, to using the data from those same claims to design health and wellness programs stressing prevention and care management. If we see pockets of infectious diseases in one region, for example, we do further analysis on the saturation and specialty mix of our network providers, the employer group mix, and the educational factors that may be contributing to higher medical costs. We then work with the schools, employers and practitioners to combat the spread of that particular disease. We have been able to avert hundreds of hospital admissions through lower cost prevention measures, which we track through our “Population Health” Program. Across our company, there is a competition to target and drive down claims rates, and each group gets an annual part of their bonus based on their results. It’s pretty interesting.

Even though we don’t get as much data through claims-processing as the local hospitals get in their clinical systems, we participate in national research studies of medical effectiveness.   Some procedures that we used to consider experimental, we now collect data on and enlist our Actuarial Department to help with statistical analysis and cost-accounting.     Our Medical Quality Department works with other government-sponsored research groups to compare the results of our findings, and often they lead to some pretty surprising conclusions.   Where we thought we were saving money in the past, we were actually driving up utilization in higher cost medical facilities, but now we’re helping encourage subscribers to get medical attention sooner, which prevents some pretty catastrophic claims.   We couldn’t do this before because our Utilization Review desk would deny those services that we now know help reduce lifetime costs.

Probably the best benefit we’ve gained from using Cloud computing services is that our claims processing is now practically effortless.   Twenty-five percent of our cost of business was dedicated to claims handling, denials, reviews, exceptions, and remediation.   Not only have we saved money, we have reduced premiums to our subscribers, and the healthcare providers have reduced the cost of their back-office operations, which used to handle all the churn of rejected and resubmitted claims.   They now get paid faster and we save money.   How’s that for amazing! We’ve also noticed another side benefit: our subscriber turnover rate has dropped by 10%!   Customer satisfaction rates have never been higher!




Integration of claims data with population-based data and actuarial model results has significant business impact. Clearly, departmental governance representation was required to mesh all the different data types. Several large data warehousing systems can be housed and integrated using Cloud technology, but the rules for matching and aligning data collected for entirely different purposes are the key to analytic power in this example. As companies offload daily transaction processing, which can be automated and scaled, business dollars can be better deployed to more strategic purposes, and those resources then achieve more with the data at hand.


photo by Simone Ravella via Flickr (Creative Commons license)


Carol Newcomb is a Senior Consultant with Baseline Consulting. She
specializes in developing BI and data governance programs to drive
competitive advantage and fact-based decision making. Carol has
consulted for a variety of health care organizations, including Rush
Health Associates, Kaiser Permanente, OSF Healthcare, the Blue Cross
Blue Shield Association and more. While working at the Joint Commission
and Northwestern Memorial Hospital, she designed and conducted
scientific research projects and contributed to statistical analyses.



Posted May 13, 2010 6:00 AM