October 2003

 


Data Mining and Privacy Issues

by George W. Zobrist

Data mining, a technique people use to gather information by looking for hidden or obscure relationships in data, continues to generate considerable debate, especially about privacy issues. For example, data mining could take the form of searching through company orders, purchase amounts or zip codes to determine customer preferences. A company’s marketing department could then use the data for new product development or to determine who would most likely purchase certain products.

Data mining resembles pattern recognition or artificial intelligence applied to a database. Standard database use is more straightforward: users generally search for information they already know exists (such as the number of employees near retirement or the number of employees in a certain salary range). Data mining, by contrast, searches for information that is not known to exist on the surface.
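The contrast can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the order records, the zip codes and the function names are hypothetical, and simple pair counting stands in for the far more elaborate techniques real data mining tools use. The first function answers a question the user already knew to ask; the second surfaces item pairs that are frequently bought together, without anyone naming those pairs in advance.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order records: (customer zip code, items bought in one order).
ORDERS = [
    ("65401", {"tent", "lantern", "stove"}),
    ("65401", {"tent", "lantern"}),
    ("63101", {"laptop", "mouse"}),
    ("65401", {"lantern", "stove"}),
    ("63101", {"laptop", "mouse", "tent"}),
]


def count_orders_in_zip(orders, zip_code):
    """Standard database use: look up a figure the user knows exists."""
    return sum(1 for order_zip, _ in orders if order_zip == zip_code)


def frequent_pairs(orders, min_support=2):
    """Mining step: count every pair of items that appears in the same
    order and keep the pairs seen in at least min_support orders."""
    pair_counts = Counter()
    for _, items in orders:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


if __name__ == "__main__":
    print(count_orders_in_zip(ORDERS, "65401"))
    print(frequent_pairs(ORDERS))
```

A marketing department could read the second result as a hint about which products to promote together, which is precisely the kind of secondary use that raises the privacy questions discussed below.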

Consumer Issues

Where is the problem? The conflict centers on the privacy and legal issues that data mining can raise. Over the years, both corporate entities and the government have collected tremendous amounts of data, storing it in “data warehouses.” Today’s data mining technology can extract various patterns and relationships from these data warehouses, putting consumers’ privacy in jeopardy.

The heart of the matter is that consumers are aware that collected data is used for bill payment, for example, and they explicitly agree to that use. They have not necessarily agreed, even implicitly, to let the corporate entity mine that data; such use exceeds the original intent of the data collection.

Some privacy advocates believe consumers should be given various levels of “opt-out” choices: no data mining allowed; data mining for internal use only; or data mining for both internal and external uses. Many credit-card companies and others have begun offering their customers such opt-out choices. No matter what, the government, as well as publicly or privately owned companies, should inform customers about how they will use any data collected from or about them.
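To make the three opt-out levels concrete, here is a minimal Python sketch of how a company might record each customer's choice and honor it before any mining takes place. The `MiningConsent` enum, the record layout and the `mineable` function are hypothetical names invented for this illustration; they do not describe any particular company's system.

```python
from enum import Enum


class MiningConsent(Enum):
    """Hypothetical encoding of the three opt-out levels."""
    NONE = 0                   # no data mining allowed
    INTERNAL_ONLY = 1          # mining for internal use only
    INTERNAL_AND_EXTERNAL = 2  # mining for internal and external uses


# Minimum consent level each purpose requires (an assumed policy).
REQUIRED = {
    "internal": MiningConsent.INTERNAL_ONLY,
    "external": MiningConsent.INTERNAL_AND_EXTERNAL,
}


def mineable(records, purpose):
    """Return only the records whose owners consented to this purpose."""
    needed = REQUIRED[purpose].value
    return [r for r in records if r["consent"].value >= needed]


# Hypothetical customer records.
CUSTOMERS = [
    {"id": 1, "consent": MiningConsent.NONE},
    {"id": 2, "consent": MiningConsent.INTERNAL_ONLY},
    {"id": 3, "consent": MiningConsent.INTERNAL_AND_EXTERNAL},
]

if __name__ == "__main__":
    print([r["id"] for r in mineable(CUSTOMERS, "internal")])
    print([r["id"] for r in mineable(CUSTOMERS, "external")])
```

Filtering at the point of use, rather than at collection time, keeps the data available for the purpose the customer did agree to, such as bill payment.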

Government Action

The Defense Advanced Research Projects Agency (DARPA) has come under particularly heavy criticism recently. Its Total Information Awareness (TIA) program, set up to scour the Internet and various public and private databases to expose patterns of suspicious behavior by individuals and track potential terrorists, has been cited as one that could potentially violate Americans’ civil liberties. The Bush administration has denied that such a potential exists.

A 7 February Department of Defense release noted that two boards (one internal, the other external) will provide oversight of the TIA program. These boards would work with DARPA to ensure that the TIA program is consistent with constitutional and statutory law, and American values related to privacy.

DARPA has said that it does not plan to generate a gigantic database and is not collecting intelligence information, since that responsibility rests with U.S. foreign intelligence and counterintelligence units, operating under congressional oversight. DARPA also said it has never collected privately held consumer data.

Nevertheless, Sen. Ron Wyden (D-Ore.) attached an amendment to a recent spending bill that would block funding for the data mining aspects of the TIA program until the administration submits a report detailing the scope of TIA’s activities and their impact on civil liberties. The bill is currently in joint Senate-House committee, where differences are being worked out. Many expect, however, that the Wyden amendment will stay attached. A 14 July report noted that congressional funding for the TIA program is all but “dead.” As of mid-September, DARPA was lobbying to get funding for TIA, or parts of it, restored.

Vocally expressing the opposing viewpoint, the Heritage Foundation thinks the Wyden amendment has gone too far, and that it would restrict law enforcement efforts to deter terrorist activities.

Other Pending Legislation and Activities

The Citizens' Protection in Federal Databases Act, submitted 29 July, would require the Pentagon, the Central Intelligence Agency and the U.S. Departments of the Treasury and Homeland Security (DHS) to report to Congress within 60 days on their use of commercial databases to track terrorists, fugitives and "deadbeat" parents.

In addition, according to Sen. Charles Grassley (R-Iowa), the FBI is working on a Memorandum of Understanding with DARPA for possible experimentation with TIA technology. The FBI denies this collaboration, stating only that the organization seeks to improve its information technology.

Finally, Sen. Russ Feingold (D-Wis.) has introduced legislation that would freeze all Defense Department and DHS data mining programs until Congress can evaluate and authorize each one. This bill could be instrumental in increasing public debate on data mining.

The public is just beginning to address the relationship between data mining and privacy. Concern will most likely grow in the near future as we become more cognizant of data mining techniques and their implications for privacy.

 


Dr. George W. Zobrist is professor emeritus at the University of Missouri-Rolla, Department of Computer Science. He is IEEE-USA's Member Activities editor.

 

 

© Copyright 2003, The Institute of Electrical and Electronics Engineers, Inc.