Monday, November 10, 2014

Data accuracy and trends: How to measure anomalies and resolve them

In most enterprises there is a disconnect between the application teams and the data warehouse teams about the accuracy of the data coming from source systems. The data team has its own arguments, with their own merits, for why it needs accurate and predictable data from source systems, and the application team has its own for why the data can't always be 100% accurate.

The primary reason the application team has trouble providing 100% accurate data is the continuous evolution of business applications and systems, along with ongoing bugs in those applications, which tend to create anomalies in the data. The data team's argument is that unless it gets accurate data, it is hard to do meaningful analysis and build reports on it.

Here are some ideas for how this gap can be narrowed, if not eliminated entirely:

  1. How much accuracy is good enough? Let's face it: unless you are working on a transactional application such as a banking application, you will always have to deal with unpredictable and random end-user behavior. This gets worse in a B2C application, where data is primarily generated by end users' behavior and interaction with the application. There will always be edge cases, even setting aside application bugs (which will never go to zero either). That makes this a good starting point for the two teams to get together and define, at the business level, how accurate the data is expected to be.
  2. Fix it by process: Another way to reduce data inaccuracy in an ever-evolving business application is to integrate the data team more tightly into the app development team's development process. It's always a good idea for these two teams to communicate on an ongoing basis about the upcoming development plan and changes to the application.
  3. QA it! Another way to reduce data gaps introduced by bugs in an application release is to have QA plan and execute test plans with the reporting team's input. Any good QA org should build test cases with data needs in mind and with the data team's input.
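
To make the first idea concrete, here is a minimal Python sketch of how the two teams might measure a batch of records against an accuracy level they have agreed on. Everything in it is an assumption for illustration: the `order_is_valid` rule, the field names `user_id` and `amount`, the sample records, and the 95% threshold are all hypothetical, and a real check would use whatever rules the teams actually define.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class AccuracyReport:
    total: int        # number of records checked
    valid: int        # number of records that passed the rule
    threshold: float  # accuracy level agreed between the teams (assumed)

    @property
    def accuracy(self) -> float:
        # An empty batch is treated as fully accurate to avoid dividing by zero.
        return self.valid / self.total if self.total else 1.0

    @property
    def meets_target(self) -> bool:
        return self.accuracy >= self.threshold


def measure_accuracy(
    records: Iterable[dict],
    is_valid: Callable[[dict], bool],
    threshold: float,
) -> AccuracyReport:
    """Count how many records pass the validity rule and compare to the threshold."""
    records = list(records)
    valid = sum(1 for record in records if is_valid(record))
    return AccuracyReport(total=len(records), valid=valid, threshold=threshold)


def order_is_valid(order: dict) -> bool:
    # Hypothetical rule: an order needs a user and a positive numeric amount.
    amount = order.get("amount")
    return (
        order.get("user_id") is not None
        and isinstance(amount, (int, float))
        and amount > 0
    )


# Hypothetical sample batch, made up for this example.
sample_orders = [
    {"user_id": 1, "amount": 20.0},
    {"user_id": None, "amount": 15.0},  # missing user
    {"user_id": 3, "amount": 0},        # non-positive amount
    {"user_id": 4, "amount": 9.5},
]

report = measure_accuracy(sample_orders, order_is_valid, threshold=0.95)
print(f"accuracy={report.accuracy:.2%}, meets target: {report.meets_target}")
```

Running a check like this on every load and keeping the results over time would also show the trend, so a sudden drop after a release can be traced back to the change that caused it.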
Please share any other ideas you may have.
