Hello
Thanks for the question. I've worked on a couple of projects to build
Data Warehouses, so here is the benefit of my "wisdom".
Databases today can range in size into the terabytes (more than
1,000,000,000,000 bytes of data). Within these masses of data lies
hidden information of strategic importance to businesses. But the main
problem with so much data is seeing the wood for the trees. Two
overlapping concepts have been recently deployed to make sense of the
forest of data and improve the quality of access to the data.
The Data Warehouse
_________________
The data warehouse concept evolved from a need to quickly analyze
business information. Older corporate systems were not set up to
optimize data retrieval and getting at the data was difficult and
subject to a variety of errors due to:
- Lack of historical data in some operational systems
- Data required for analysis residing in different operational systems
- Complexity of running queries across numerous disparate databases
A data warehouse improves the quality of access to older data sets by
storing the current and historical data from disparate operational
systems in a single, consolidated system. This makes data readily
accessible to the people who need it without interrupting on-line
operational workloads and gives a single, unified view of the data.
For example, I recently worked on the development of a Data Warehouse
which took data from four international banks' customer systems and
consolidated it into one large database that treated them all as
customers of one central bank that had "taken over" the four. Although
all four banks had wildly different IT systems and database designs,
the warehouse enabled one view to be taken, and it made a "corporate"
environment of data in which staff could be trained on customer data
without having to learn four different systems.
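To make the consolidation idea concrete, here is a minimal sketch in Python. It is not the design of the bank project described above; the two source layouts, the field names and the mapping functions are all made up for illustration. It only shows the general pattern of translating each source system's records into one shared layout.

```python
# Hypothetical source records: each bank stores the same facts differently.
# All field names and values here are assumptions for illustration only.
bank_a = [{"cust_no": 17, "full_name": "Ann Lee", "bal": 1200.0}]
bank_b = [{"id": "B-9", "name": "Raj Patel", "balance_cents": 505000}]


def from_bank_a(row):
    """Map a bank A record onto the shared warehouse layout."""
    return {
        "source": "A",
        "customer_id": str(row["cust_no"]),
        "name": row["full_name"],
        "balance": row["bal"],
    }


def from_bank_b(row):
    """Map a bank B record onto the shared layout (cents to currency units)."""
    return {
        "source": "B",
        "customer_id": row["id"],
        "name": row["name"],
        "balance": row["balance_cents"] / 100,
    }


def consolidate(sources):
    """Apply each source's mapping and return one unified customer list."""
    warehouse = []
    for rows, mapper in sources:
        warehouse.extend(mapper(r) for r in rows)
    return warehouse


warehouse = consolidate([(bank_a, from_bank_a), (bank_b, from_bank_b)])

# One query now covers every source system.
total = sum(customer["balance"] for customer in warehouse)
```

Once every record is in the same layout, a question such as "what is the total balance across all customers?" is a single pass over one list, rather than a separate query against each bank's system.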
Data Mining
__________
Alongside the physical storage of the data warehouse, it was quickly
realized that means would have to be developed to access the huge
databases in a way that would give meaningful, quality, business
reports. The answer is data mining, which is being used both to
increase revenues (through improved marketing) and to reduce costs
(through detecting and preventing waste and fraud). Worldwide,
organizations of all types are achieving measurable payoffs from this
technology, improving on their "legacy" systems which are often
unwieldy when trying to get "forecast" data from their datasets.
Here's a good overview, with a business definition of Data Mining, from
"DIGGING UP $$$ WITH DATA MINING - AN EXECUTIVE'S GUIDE" by Tim
Graettinger - Discovery Corps, Inc.
(http://www.tdan.com/i010ht01.htm )
"We define data mining as "the data-driven discovery and modeling of
hidden patterns in large volumes of data." Data mining differs from
older technologies because it produces models - models that capture
and represent the hidden patterns in the data. Via data mining, a user
can discover patterns and build models automatically, without knowing
exactly what she's looking for. The models are both descriptive and
prospective. They address why things happened and what is likely to
happen next. A user can pose "what-if" questions to a data-mining
model that can not be queried directly from the database or warehouse.
Examples include: "What is the expected lifetime value of every
customer account," "Which customers are likely to open a money market
account," or "Will this customer cancel our service if we introduce
fees?"
[...] quality of business decisions."
In addition to the modelling algorithms, data mining software usually
has features to simplify the graphic representation of the data
(visualization tools) plus interfaces to common database formats, all
of which improve the quality of access to the data, and bring benefit
to the business through the improved "views" they can take of their
business processes.
The potential quality (and financial) improvements are enormous.
Businesses are using data mining to locate and appeal to higher-value
customers, reconfigure their product offerings to increase sales, and
minimize losses due to error or fraud. Among the highest-profile users
of data mining are the banking, financial, and telecommunications
industries, but the full spectrum of users is very broad.
You'll find a list of data mining applications here:
Two Crows Data Mining Applications
(http://www.twocrows.com/applics.htm )
Here is an example of improvements achieved, taken from "DIGGING UP $$$
WITH DATA MINING - AN EXECUTIVE'S GUIDE" by Tim Graettinger - Discovery
Corps, Inc.
(http://www.tdan.com/i010ht01.htm )
Expanding your business:
Keystone Financial, a Williamsport, PA company, wanted to expand their
customer base and attract new accounts through a LoanCheck offer. To
initiate a loan, a recipient just had to go to a Keystone branch and
cash the LoanCheck. Keystone introduced the $5000 LoanCheck by mailing
a promotion to existing customers.
The Keystone database tracks about 300 characteristics for each
customer. These characteristics include whether the person had already
opened loans in the past two years, the number of active credit cards,
the balance levels on those cards, and finally whether or not they
responded to the $5000 LoanCheck offer. Keystone used data mining to
sift through the 300 customer characteristics, find the most
significant ones, and build a model of response to the LoanCheck
offer. Then, they applied the model to a list of 400,000 prospects
obtained from a credit bureau.
By selectively mailing to the best-rated prospects determined by the
data-mining model, Keystone generated $1.6M in new revenue from just
three promotions.
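As a rough illustration of "build a model from past responses, then score a prospect list", here is a deliberately tiny Python sketch. It is not Keystone's method and not a real data mining algorithm: it uses a single characteristic where Keystone had about 300, and the records, prospect IDs and cut-off are invented. The "model" is simply the observed response rate for each value of that characteristic.

```python
from collections import defaultdict

# Hypothetical results of a past mailing (invented data):
# (had_recent_loan, active_credit_cards, responded_to_offer)
history = [
    (True, 3, True), (True, 1, True), (True, 2, False),
    (False, 0, False), (False, 1, False), (False, 4, True),
]


def response_rates(records):
    """Observed response rate for each value of had_recent_loan."""
    counts = defaultdict(lambda: [0, 0])  # value -> [responders, total]
    for had_loan, _cards, responded in records:
        counts[had_loan][0] += int(responded)
        counts[had_loan][1] += 1
    return {value: hits / total for value, (hits, total) in counts.items()}


# The "model": past response rate per characteristic value.
rates = response_rates(history)

# Hypothetical prospect list: (prospect_id, had_recent_loan)
prospects = [("P1", True), ("P2", False), ("P3", True)]

# Score each prospect with the model and mail only the best-rated ones.
ranked = sorted(prospects, key=lambda p: rates[p[1]], reverse=True)
mail_list = [prospect_id for prospect_id, _ in ranked[:2]]
```

A real data mining tool would weigh many characteristics together and test the model on data it has not seen, but the workflow has the same shape: learn from past responses, score the new list, and mail selectively.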
The article above also goes on to give examples of using data mining
to reduce costs and to improve sales effectiveness and
profitability.
This is a huge subject, and I've only given you a starting point. The
links below will take you further if you need them. If there's anything
that you need clarification on, just ask.
willie-ga
Google have a whole directory on each:
Data Warehousing here:
http://directory.google.com/Top/Computers/Software/Databases/Data_Warehousing/
Data Mining here:
http://directory.google.com/Top/Computers/Software/Databases/Data_Mining/
A paper on how to measure the Quality of your data in a Data Warehouse
here:
"Data Warehouse Quality Management"
( http://www.users.qwest.net/~lauramh/resume/dwqual.htm )