The Database and the Telephone Company

Kim du Toit has the definitive smackdown on the whole NSA telephone database hype. From The Other Side of Kim:
Database ClueBat
I don't know much about a lot of stuff, but I know a great deal about databases and how to use them -- and I especially know a great deal about how to manage usage of terabytes of data. In a past life, I ran a customer database of grocery purchases (those annoying little loyalty cards that most supermarkets use to collect your data).

Just so we're all clear on this concept: the average supermarket carries about 40,000 different items (called stock-keeping units, or SKUs), and the average supermarket processes about one million transactions (sometimes called "baskets") a year. The chain I last did this for on a full-time basis had just under 300 stores, and a database of about 3 million active customers ("active" defined as anyone who shopped with us at least once over the past six months).

A lot has been written about how these programs intrude on people's privacy, and how this means that your shopping purchases can be tracked. Allow me to reassure you: almost nobody ever looks at a single customer's item purchases -- there are just too many items, and too many customers.

What I did was design ways to make data management easier -- it's what I still do -- and I always operated on the 80:20 principle (that 20% of the people will account for 80% of the activity).

Which meant that I ignored 80% of all customers' information. I was only interested in those people who spent a lot of money with us (the 20%), because the data showed that not only did those people account for 80% of sales, they accounted for about 98% of our profits.

And the reason I only looked at that group was that if I could effect a change in their behavior (get them to spend a little more each week, for instance), the effect on the entire business was disproportionate to the effort involved.
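As a rough illustration of the 80:20 cut described above, here is a minimal sketch in Python. The customer IDs, the spend figures, and the function name `top_customer_share` are all made up for the example; nothing here is taken from the actual loyalty-card system.

```python
def top_customer_share(spend_by_customer, top_fraction=0.2):
    """Rank customers by spend and return (top_ids, share_of_total_sales).

    spend_by_customer: dict mapping a customer ID to total spend (hypothetical data).
    top_fraction: the slice of customers to keep, 0.2 for the "20%".
    """
    if not spend_by_customer:
        return [], 0.0
    # Highest spenders first.
    ranked = sorted(spend_by_customer.items(), key=lambda kv: kv[1], reverse=True)
    # Always keep at least one customer so tiny datasets still work.
    cutoff = max(1, round(len(ranked) * top_fraction))
    top = ranked[:cutoff]
    total = sum(spend_by_customer.values())
    share = sum(amount for _, amount in top) / total if total else 0.0
    return [cid for cid, _ in top], share


# Illustrative figures only: ten customers, two of them heavy spenders.
spend = {
    "c01": 900, "c02": 700, "c03": 80, "c04": 70, "c05": 60,
    "c06": 50, "c07": 50, "c08": 40, "c09": 30, "c10": 20,
}
top_ids, share = top_customer_share(spend)
print(top_ids, f"{share:.0%}")
```

With these invented numbers the top two customers account for 80% of sales, which is the group the analysis concentrates on; the other eight are never looked at individually.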

More to the point, in all that time, I can count on two hands the number of times each year that I ever looked at any single customer's purchases -- and even then, it was to check the data, or for a merchandising purpose. Here's an example: suppose the buyers decided that a particular item wasn't selling, and they decided to discontinue ("de-list") the item in favor of one which was selling more, or to give the slow item's shelf space to an existing best-seller. Good, sound merchandising.

However, if that item was being bought by our best customers, then I would argue for the item to be kept in stock, because if the customer didn't find it at our store, she would go and find it somewhere else and we could, potentially, lose that "best" customer to our competitor -- which was our biggest nightmare.
Bona fides established, he then gets to the meat of the matter:
The reason they've been collecting this data since 9/11 was because someone at NSA was being really, really smart: if terrorists are communicating by phone, it's possible to establish linkages between numbers, and install pattern-recognition software to collect those linkages. And the reason that this was a smart thing to do is a simple one: the phone company doesn't store this data beyond (maybe) a few years -- the amount is just too massive to hold forever -- and lest we forget, we're coming up on the 5th anniversary of 9/11 already.

Note that none of this requires any names, nor the content of the calls -- that would be the privacy of the thing, and that's where it seems that the NSA, if they're telling the truth, has been quite circumspect.

But what this data gives the smart analyst is that when you establish that (357) 243-3006 belonged to Abdul El-Bomba, who received a call from his brother Aziz, a known member of Hezbollah in Syria, you now have the ability to focus only on all the calls Abdul made and received, to see who was calling him and whom he was calling. That would be a couple hundred calls, out of the (literally) tens of billions of records you've collected.

Here's the Big Clue for the Clueless: if you don't collect all the data, you can't narrow the search at all. And it's only once you've established that Abdul is a Bad Guy that you ascertain his number, and the numbers of his correspondents, and their names. Most of the calls will be innocent: the dry cleaners, the gas company, the liquor store, whatever.

But out of the couple hundred calls, you may find five that are to Mohamed Semmteks, and to Tariq Pilota, who are also terrorists, and whose calls you can now start investigating.

So from tens of billions to a couple hundred to five. And in these cases, it's NOW when you, as the investigator, can get a warrant for a wiretap so you can start listening to actual content, which, out of all the data mentioned so far, is the only part protected by the First Amendment.

That's how to do it -- and more importantly, that's the only way to do it when you're starting from scratch.
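The narrowing Kim describes (all records, then one number's contacts, then the few contacts already known) can be sketched in a few lines of Python. This is only a toy under stated assumptions: the record layout of `(caller, callee)` pairs, the `555-01xx` numbers, the watchlist, and the function names `contacts_of` and `flag_known` are all invented for illustration and say nothing about how the NSA or any phone company actually stores or queries call records.

```python
from collections import Counter


def contacts_of(records, number):
    """Count how often `number` called, or was called by, each other number.

    records: iterable of (caller, callee) pairs; no names, no call content.
    """
    counts = Counter()
    for caller, callee in records:
        if caller == number:
            counts[callee] += 1
        elif callee == number:
            counts[caller] += 1
    return counts


def flag_known(contacts, watchlist):
    """Return the contacts that already appear on a (hypothetical) watchlist."""
    return sorted(n for n in contacts if n in watchlist)


# Toy data: five records stand in for the "tens of billions".
records = [
    ("357-243-3006", "555-0101"),
    ("555-0102", "357-243-3006"),
    ("357-243-3006", "555-0101"),
    ("555-0103", "555-0104"),   # unrelated call, never examined
    ("357-243-3006", "555-0199"),
]
watchlist = {"555-0199", "555-0188"}

contacts = contacts_of(records, "357-243-3006")
flagged = flag_known(contacts, watchlist)
print(len(contacts), flagged)
```

Note that the unrelated record is skipped without anyone looking at it, and that the filter only works because the full set of records was there to filter in the first place, which is the point of the quoted argument.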
No violation of the 1st Amendment, and it is helping to keep this nation secure. What's not to love? If you feel outraged by this, go here now. Or emigrate like many of you promised but so few of you actually did.

1 Comment

Not the point. The point is that we have a history of misusing every tool for political purposes. That is the nature of man, an untrustworthy creature. It is why the Founding Fathers designed the Constitution to protect us from our own government.

Remember your Benjamin Franklin: "They who would give up an essential liberty for temporary security, deserve neither liberty nor security."



About this Entry

This page contains a single entry by DaveH published on May 16, 2006 12:27 AM.
