Sunday, January 12, 2014

Innovating with Cloud & Big Data: 2013's People for a Smarter Planet

Building a Smarter Planet takes people who are passionate about using technology to improve the world and who are eager to innovate and take risks. During the past year, we profiled eight of these "People for a Smarter Planet," putting the spotlight on researchers, engineers, inventors and innovators who are focused on the future.
They include people like Lisa Seacat DeLuca, a young, prolific software engineer on the cutting edge of advanced cloud solutions; Andy Stanford-Clark, a pioneer in smarter energy solutions; Marie Kenerson, who's using cloud technology to bring quality healthcare to Haiti; Uyi Stewart, chief scientist at IBM's first research lab in Africa; and Michelle Zhou, who sees Big Data as a means for world peace and not just corporate profits.
They also include people like IBM software engineer and master inventor Brian O'Connell, who has been architecting television and online sports infrastructures with Big Data analytics; Thomas Schaeck, an IBM distinguished engineer at the forefront of social networking and collaboration in the enterprise; and IBM researcher Jeffrey Nichols, who is exploring the value of social media beyond sentiment.
Take a few minutes and meet these people who are working with IBM to build a Smarter Planet.

Saturday, July 23, 2011

NoSQL, NewSQL and MDM


Fixing poor data quality at its source and managing constant change is what Master Data Management is all about. As new database technologies evolve, they will change the MDM solutions landscape, improving the performance and scalability of working with large datasets (billions of rows). Most MDM solutions use an RDBMS (MSSQL, DB2, Oracle) for managing data, and we all know performance tuning is a pain point in today's MDM solutions. Let's look at the exciting new things happening in the database world, as these improvements will come to the MDM landscape sooner or later.

Friday, July 22, 2011

HTML5 and MDM – What You Need to Know

 

There is nothing you need to know about HTML5 from an MDM perspective. In the future, HTML5 might gain some graphing tags that could be useful for Business Intelligence, but at this point that is a far-fetched dream. Still, it is good to be aware of new technologies, as you never know what you might benefit from.

Thursday, May 6, 2010

What does the business need?


An excerpt from Data Intelligence Gap 

So, exactly what is it that the business needs to know that the data can’t provide? Here are some examples:
What the business wants to know: Can I lower my inventory costs and purchase prices? Can I get discounts on high-volume items purchased?
Data needed: Reliable inventory data.
What's inhibiting peak efficiency: Multiple ERP and SCM systems. Duplicate part numbers. Duplicate inventory items. No standardization of part descriptions and numbers. Global data existing in different code pages and languages.

What the business wants to know: Are my marketing programs effective? Am I giving customers and prospects every opportunity to love our company?
Data needed: Customer attrition rates. Results of marketing programs.
What's inhibiting peak efficiency: Typos. Lack of standardization of names and addresses. Multiple CRM systems. Many countries and systems.

What the business wants to know: Are any customers or prospects "bad guys"? Are we complying with all international laws?
Data needed: Reliable customer data for comparison to "watch" lists.
What's inhibiting peak efficiency: Lack of standards. Ability to match names that may have slight variations against watch lists. Missing values.

What the business wants to know: Am I driving the company in the right direction?
Data needed: Reliable business metrics. Financial trends.
What's inhibiting peak efficiency: Extra effort and time needed to compile sales and finance data – time to cross-check results.

What the business wants to know: Is the company we're buying worth it?
Data needed: Fast comprehension of the reliability of the information provided by the seller.
What's inhibiting peak efficiency: Ability to quickly check the accuracy of the data, especially the customer lists, inventory level accuracy, financial metrics, and the existence of "bad guys" in the data.

Again, these are just a few of the many cases where data lacks intelligence and can't meet the needs of the corporation.

Wednesday, May 5, 2010

Suspect Duplicate Processing in IBM MDM Server

The task of evaluating data, finding suspects in the data, and collapsing them based on rules is an exhaustive process. If the suspects do not have a high probability of being a match, what action should be taken? How can automated merges be leveraged so that the manual process of collapsing data is minimized?

There are a lot of questions that the business wants answered before it can make an informed decision. Let's talk about the basic terminology the business should know when discussing Suspect Duplicate Processing (SDP), also called Duplicate Suspect Processing (DSP).

What is SDP?
IBM MDM can identify duplicate parties in real time, either as part of adding or updating party data, or offline as part of Evergreening. The Suspect Duplicate Processing (SDP) feature provides the mechanism to identify these duplicate parties. Terminology business users should know:
- Critical Data
- Match/Non-Match Score
- Match Category
- Match Matrix

What is Critical Data?
The term "critical data" refers to the data elements selected by the business to be used for comparison in SDP. If all critical data fields match between two records, they are considered an exact match. For example: Last Name, SSN, Address Line One.
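
As a rough sketch in Java (not IBM MDM Server's actual API; the class and method names here are made up for illustration), the exact-match rule over the critical data fields above could look like this:

// Illustrative sketch only, not IBM MDM Server code: an exact match means every
// critical data field is equal between the two records being compared.
public final class CriticalData {
    private final String lastName;
    private final String ssn;
    private final String addressLineOne;

    public CriticalData(String lastName, String ssn, String addressLineOne) {
        this.lastName = lastName;
        this.ssn = ssn;
        this.addressLineOne = addressLineOne;
    }

    /** True only when all critical data fields match, i.e. an exact match. */
    public boolean isExactMatch(CriticalData other) {
        return equalsSafe(lastName, other.lastName)
                && equalsSafe(ssn, other.ssn)
                && equalsSafe(addressLineOne, other.addressLineOne);
    }

    private static boolean equalsSafe(String a, String b) {
        return a != null && a.equalsIgnoreCase(b);
    }
}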

What is Match/Non-Match Score?
Each critical data element is assigned a match relevancy score and a non-match relevancy score. For example:

Critical Data      | Match Relevancy Score | Non-Match Relevancy Score
Last Name          |           1           |             1
SSN                |           2           |             2
Address Line One   |           8           |             8


What is Match Category?
The match category is based on the Match/Non-Match Scores. Out of the box (OOTB) there are four match categories:
A1 - The Match/Non-Match scores indicate that a definite duplicate party has been found.
A2 - The Match/Non-Match scores indicate a high probability that a duplicate party has been found.
B - The Match/Non-Match scores indicate that it is fairly unlikely that a duplicate party has been found.
C - The Match/Non-Match scores indicate that the suspect party is not a duplicate.
These categories can be customized.

What is the Match Matrix?
The match matrix brings together the Match/Non-Match Scores and the Match Categories. For each critical data element:
- 0 means the data element is missing from one or both of the new and existing records
- A negative value means the data element is present in both the new and existing records but does not match
- A positive value means the data element is present in both the new and existing records and matches

Last Name | SSN | Address Line One | Match Score | Non-Match Score | Category
    1     |  2  |        8         |     11      |        0        |    A1
   -1     | -2  |       -8         |      0      |       11        |    C
    1     |  2  |        0         |      3      |        0        |    A2
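
To make the matrix concrete, here is a minimal Java sketch of the scoring logic described above. This is not IBM MDM Server code: the weights (1, 2, 8) are the example values from this post, and the category thresholds are illustrative assumptions that a real implementation would leave to business configuration.

import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of SDP scoring, not IBM MDM Server code. The weights are the
// example values from this post; the category thresholds are made-up assumptions.
public class SdpScoringSketch {

    // Critical data element -> relevancy weight (match and non-match use the same weight here).
    private static final Map<String, Integer> WEIGHTS = new LinkedHashMap<>();
    static {
        WEIGHTS.put("lastName", 1);
        WEIGHTS.put("ssn", 2);
        WEIGHTS.put("addressLineOne", 8);
    }

    /** Per-element value: 0 if missing on either record, +weight on a match, -weight on a mismatch. */
    static int elementScore(String element, String newValue, String existingValue) {
        if (newValue == null || existingValue == null) {
            return 0;
        }
        int weight = WEIGHTS.get(element);
        return newValue.equalsIgnoreCase(existingValue) ? weight : -weight;
    }

    /** Map the summed scores to a category; thresholds are illustrative, the business configures them. */
    static String category(int matchScore, int nonMatchScore) {
        if (nonMatchScore == 0 && matchScore >= 11) return "A1"; // definite duplicate
        if (nonMatchScore == 0 && matchScore >= 3)  return "A2"; // high probability
        if (matchScore > nonMatchScore)             return "B";  // fairly unlikely
        return "C";                                              // not a duplicate
    }

    public static void main(String[] args) {
        // Third row of the matrix above: last name and SSN match, address missing on the new record.
        Map<String, String> incoming = Map.of("lastName", "Smith", "ssn", "123-45-6789");
        Map<String, String> existing = Map.of("lastName", "Smith", "ssn", "123-45-6789",
                "addressLineOne", "1 Main St");
        int matchScore = 0;
        int nonMatchScore = 0;
        for (String element : WEIGHTS.keySet()) {
            int score = elementScore(element, incoming.get(element), existing.get(element));
            if (score > 0) matchScore += score;
            if (score < 0) nonMatchScore -= score;
        }
        // Prints: match=3 nonMatch=0 category=A2
        System.out.println("match=" + matchScore + " nonMatch=" + nonMatchScore
                + " category=" + category(matchScore, nonMatchScore));
    }
}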


The categories in the match matrix are decided by the business. Hopefully this provided a basic overview of the Suspect Duplicate Processing concept in IBM MDM. Feel free to leave comments or ask questions.

Wednesday, March 3, 2010

Five Pillars of MDM

The first pillar, content, is not only the backbone of MDM, but also the foundation of any
data model. Master data content is usually enterprise-oriented and highly specific to
your business, relying on terminology that is often unique to your company.
What separates MDM from other technologies is that, in mastering the subject area,
it goes beyond the mere content of your data and presents relationships between
different data elements, including hierarchies and groupings. For instance, a “household”
frequently refers to a group of people living in the same location. Such a grouping
requires its own set of rules and definitions, as established in a data model. While a retailer
may merely group all the related people living at the same physical address, a bank, for
example, may only consider wage earners living at a given address. Again, it depends on
your business. Establishing the parent/child hierarchy within a given household may be
crucial for some businesses, such as insurance companies, but not others.
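
As a purely illustrative Java sketch of the household idea (the grouping rules and the address normalization here are my assumptions, since real household definitions are business-specific):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative household grouping; real household rules are defined by the business.
public class HouseholdSketch {

    record Party(String name, String address, boolean wageEarner) {}

    /** Retailer-style rule: everyone at the same normalized address forms a household. */
    static Map<String, List<Party>> retailerHouseholds(List<Party> parties) {
        return parties.stream()
                .collect(Collectors.groupingBy(p -> normalize(p.address())));
    }

    /** Bank-style rule: only wage earners at a given address count toward the household. */
    static Map<String, List<Party>> bankHouseholds(List<Party> parties) {
        return parties.stream()
                .filter(Party::wageEarner)
                .collect(Collectors.groupingBy(p -> normalize(p.address())));
    }

    /** Trivial normalization for the sketch; a real hub would standardize addresses first. */
    private static String normalize(String address) {
        return address.trim().toUpperCase().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        List<Party> parties = List.of(
                new Party("Jane Doe", "12 Oak St", true),
                new Party("John Doe", "12  Oak St ", true),
                new Party("Jimmy Doe", "12 Oak St", false));
        System.out.println(retailerHouseholds(parties).get("12 OAK ST").size()); // 3
        System.out.println(bankHouseholds(parties).get("12 OAK ST").size());     // 2
    }
}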

Similarly, MDM is able to support access rules to deal with the details of data flowing
into and out of the MDM hub. The delivery (or provisioning) of the data content and
relationship details to other systems means enforcing the rigor associated with security
and access policy at an individual data-element level to ensure that all rules and details
associated with data access are maintained. While most believe that data access or
linkage is limited to in-house or corporate systems, MDM can support any data source,
so integrating with third-party or external data is no more complex than an internal
billing system. As more and more data is available and linked, tight access control is
essential to ensure that data doesn’t fall into the wrong hands.
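
A minimal Java sketch of what element-level provisioning control could look like; the roles, attribute names, and visibility rules are hypothetical, not an actual MDM hub configuration:

import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative element-level access check; roles, attribute names and rules are hypothetical.
public class ElementAccessSketch {

    // Which data elements each consuming system or role is allowed to receive.
    private static final Map<String, Set<String>> VISIBLE_ELEMENTS = Map.of(
            "BILLING_SYSTEM", Set.of("name", "addressLineOne", "accountNumber"),
            "MARKETING", Set.of("name", "addressLineOne"),
            "THIRD_PARTY", Set.of("name"));

    /** Provision only the attributes the requesting role is entitled to see. */
    static Map<String, String> provision(String role, Map<String, String> masterRecord) {
        Set<String> allowed = VISIBLE_ELEMENTS.getOrDefault(role, Set.of());
        return masterRecord.entrySet().stream()
                .filter(e -> allowed.contains(e.getKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        Map<String, String> record = Map.of(
                "name", "Jane Doe",
                "addressLineOne", "12 Oak St",
                "accountNumber", "A-1001",
                "ssn", "123-45-6789");
        System.out.println(provision("MARKETING", record));   // name and address only
        System.out.println(provision("THIRD_PARTY", record)); // name only
    }
}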

Because corporate information is always changing, mastering data also involves
supporting a robust change control component within MDM. Business data changes
in real-time, and an MDM environment must recognize which changes are acceptable.
A mature MDM hub understands when a data element change can be supported
in an automated fashion, or when it requires intervention from an external process
(e.g. another system, a data steward, etc.). Unlike traditional IT-based change control
methods that focus on application changes or business process changes, MDM focuses
on data value changes.
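
A simple Java sketch of that decision point; the element names, source systems, and rules are invented for illustration and would be configurable in a real hub:

// Illustrative change-control decision; the element names, sources and rules are
// invented for this sketch and would be configurable in a real MDM hub.
public class ChangeControlSketch {

    enum Decision { AUTO_APPLY, ROUTE_TO_STEWARD }

    /**
     * Decide whether a proposed data-value change can be applied automatically or
     * requires intervention from an external process such as a data steward.
     */
    static Decision evaluate(String element, String source) {
        // Identity-critical elements always require human review in this sketch.
        if (element.equals("ssn") || element.equals("dateOfBirth")) {
            return Decision.ROUTE_TO_STEWARD;
        }
        // Low-risk elements coming from a trusted source can be applied automatically.
        if (source.equals("TRUSTED_CRM") || source.equals("ADDRESS_VALIDATION_SERVICE")) {
            return Decision.AUTO_APPLY;
        }
        return Decision.ROUTE_TO_STEWARD;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("addressLineOne", "TRUSTED_CRM"));         // AUTO_APPLY
        System.out.println(evaluate("ssn", "TRUSTED_CRM"));                    // ROUTE_TO_STEWARD
        System.out.println(evaluate("addressLineOne", "SELF_SERVICE_PORTAL")); // ROUTE_TO_STEWARD
    }
}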

MDM also involves the actual processing of data, from basic matching and identification
to data correction and CRUD (create, read, update, delete) processing. Processing rules
contain the details for determining whether two records are the same. For example,
are customers Robert Smith and Bob Smyth the same? If so, should one of the values
be updated? If Bob calls and updates his address, should that change be propagated
throughout the other databases in the system, including the master record? Does it
matter if the address change comes from a third-party data vendor instead? Which
update should the system trust? And if it turns out that the data was incorrectly entered,
the data model should support unwinding the match to set the record straight.
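
As an illustrative Java sketch of the Robert Smith / Bob Smyth question (a toy nickname table plus edit distance, not a product matching algorithm; the threshold of one edit is an assumption):

import java.util.Map;
import java.util.Set;

// Illustrative name-matching rule for "are Robert Smith and Bob Smyth the same?".
// The nickname table and distance threshold are toy values, not a product algorithm.
public class NameMatchSketch {

    private static final Map<String, Set<String>> NICKNAMES = Map.of(
            "robert", Set.of("bob", "rob", "bobby"),
            "william", Set.of("bill", "will", "billy"));

    static boolean sameGivenName(String a, String b) {
        a = a.toLowerCase();
        b = b.toLowerCase();
        return a.equals(b)
                || NICKNAMES.getOrDefault(a, Set.of()).contains(b)
                || NICKNAMES.getOrDefault(b, Set.of()).contains(a);
    }

    /** Treat surnames within one edit of each other as a probable match (Smith vs Smyth). */
    static boolean similarSurname(String a, String b) {
        return editDistance(a.toLowerCase(), b.toLowerCase()) <= 1;
    }

    /** Standard Levenshtein edit distance. */
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // "Bob" is a known nickname of "Robert", and Smyth is one edit away from Smith.
        System.out.println(sameGivenName("Robert", "Bob") && similarSurname("Smith", "Smyth")); // true
    }
}
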
This discussion of the five pillars is helpful in understanding the differences between
MDM and other strategic technology solutions, and will help to put the data model in
better perspective.

In reference to: Data-Modeling-and-MDM.pdf

Wednesday, February 3, 2010

Analytic Auteurs

This is an interesting excerpt from the SmartDataCollective article by Steve Bennett called Analytic Auteurs. To read the complete article, please visit SmartDataCollective.

"Now I never thought of myself as an auteur, but my favourite analogy for 'doing analytics' was making a movie. The analogy runs something like this:

As the director (think of me as a cross between Tim Burton and Spike Jonze) I assemble a team for 1 - 2 years and together we create the analytic solution the organisation needs. The solution is generally made up of two components. The first is a suite of information products (data marts, cubes, reports, dashboards, etc.). The second is the skills for the organisation to 'self serve' and create new information products without us.

We then disperse to different places, only to meet again when the next 'movie' opportunity arises.

My trusted experts are:

* Business Analyst - Script Writer
* Data Integrator - Special Effects
* Solution Designer - Cinematographer
* Information Designer - Set Designer/Head of Props
* Quality Control - Post-Production
* Change Agent - PR and Marketing

Why do the same people agree to work with me more than once? Well, you'll have to ask them to really know. But I can tell you what I tell them when I'm trying to get them for one more gig. In no particular order I promise them:

* A real intellectual challenge
* A core team they can trust
* That they will learn from both the challenge and their fellow team members
* Very good remuneration - at least as good as they will get elsewhere
* Fun!

Don't get me wrong. The core team does get new blood each time and sometimes people just muscle their way into the team by being damn good at what they do. Never underestimate serendipity and the pleasure at meeting another person of whom you think 'wow - they're good!'"