This article was posted to the Usenet group alt.hackers in 1995; any technical information is probably outdated.

Re: Kevin Mitnick

Article: 7443 of alt.hackers
From: (Dave Dubin)
Newsgroups: alt.hackers
Subject: Re: Kevin Mitnick
Date: 23 Feb 1995 19:36:22 GMT
Organization: University of Pittsburgh
Lines: 60
Message-ID: 3iio3m$
X-Newsreader: TIN [version 1.2 PL2]
Status: RO

Mad Mann ( wrote:
: Am I the only one who thinks Tsutomu Shimomura is a bit of
: an ass, and not a hero for helping the FBI catch Mitnick?

  I might think he was a bit of an ass if he hadn't been provoked, and
had aided the FBI just to be a cowboy. But as it stands, I consider him
neither a hero nor an ass. Just my opinion.

: The press is out to make Shimomura a hero, but I mean Mitnick
: hacked into his supposedly secure system.  Doesn't this make
: Mitnick one up on Shimomura?  They REALLY suspected Mitnick
: in the first place anyways because he is so high profile.

  Guess that depends on your criteria for one-upsmanship. Not being under
arrest is high on my list. You can argue who's one up on whom if you think
that either breaking system security or tracking down suspected felons is
kind of a cool thing to do. I'm not really interested in either.


: Aww, can we dispense with this OB Hack bullshit?  It is really
: cutting down on the actually hacker talk/gossip that I would

  I can't, since they're why I read this group.

  A game designer in France posted a request for a list of English words.
Each had to be five letters long and be ranked by "difficulty level" (based,
evidently, on how easy or difficult it would be to guess). Did anyone have
such a list with several thousand words?
  I pulled about 3 megs of Etext over from Project Gutenberg, and put together
a crude concordance program using the hashing functions in K&R and some nice
lexical analysis code from a book by Frakes and Baeza-Yates. It was then
a simple matter to rank all five-letter words by their frequency of occurrence
(a very rough predictor of difficulty).
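
  The counting-and-ranking step above could be sketched roughly as follows.
This is not the original program (which used the K&R hashing functions and
the Frakes and Baeza-Yates lexical analysis code); it is a minimal Python
sketch in which a Counter stands in for the hash table and a regular
expression stands in for the lexical analyzer. The function names and the
sample text are made up for illustration.

```python
import re
from collections import Counter


def five_letter_frequencies(text):
    """Count occurrences of each five-letter word in text (case-folded)."""
    # Crude tokenizer (an assumption): a word is a run of ASCII letters.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if len(w) == 5)


def rank_by_frequency(counts):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample text, standing in for the Project Gutenberg files.
sample = "Wrath and anger. The wrath of kings; great wrath, great anger."
ranked = rank_by_frequency(five_letter_frequencies(sample))
for word, count in ranked:
    print(word, count)
```

  On real input you would read the Etext files and pass their contents to
five_letter_frequencies() in place of the sample string.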
  Since the word frequencies follow a Zipf distribution, there were many
more rare and difficult words than common/easy ones. Also there were lots
of proper names of people and places that didn't belong. I was able to
address both problems by simply theta-joining the alphabetized list with
/usr/dict/words. Most of the proper names dropped out, and the distribution
became much more linear. That made it easier to choose arbitrary cutoff
points for the difficulty levels.
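
  The dictionary filtering and the cutoff step might look something like the
sketch below. It makes two assumptions that are not in the description above:
only all-lowercase dictionary entries are kept (on the guess that proper
names are capitalized in the word list), and the cutoff points are chosen by
splitting the ranked list into roughly equal-sized groups. The function names
and the sample data are hypothetical.

```python
def keep_dictionary_words(ranked_words, dictionary_words):
    """Keep only ranked words that also appear in the dictionary list."""
    # Assumption: capitalized dictionary entries are proper names, so skip them.
    allowed = {w for w in dictionary_words if w.islower()}
    return [w for w in ranked_words if w in allowed]


def split_into_levels(words, levels):
    """Split a frequency-ranked list into roughly equal difficulty levels."""
    size, extra = divmod(len(words), levels)
    result, start = [], 0
    for i in range(levels):
        # The first `extra` levels each take one additional word.
        end = start + size + (1 if i < extra else 0)
        result.append(words[start:end])
        start = end
    return result


# Made-up data: words already ranked from most to least frequent.
ranked_words = ["wrath", "anger", "great", "kings", "paris", "xylem"]
dictionary_words = ["anger", "great", "kings", "wrath", "Paris", "xylem", "zebra"]

kept = keep_dictionary_words(ranked_words, dictionary_words)
easy, hard = split_into_levels(kept, 2)
print(easy)
print(hard)
```

  In practice dictionary_words would be read from /usr/dict/words, one entry
per line, and the number of levels is whatever the game needs.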
  Two caveats:
    1) Frequency is only a rough indicator of difficulty.
    2) I should really have used recently written text, since frequencies
       in _Ivanhoe_ (and even _Tarzan_) aren't representative. For example,
       'wrath' ended up more frequent than 'anger.' But Stephen King novels
       aren't available yet.

  If you're doing any text processing, I recommend _Information Retrieval:
Data Structures and Algorithms_ (Prentice Hall '92), edited by W. Frakes
and Ricardo Baeza-Yates. You can get the code at:

  ...including the lexical analysis code I mentioned. If you like it, I think
you'll find the book is worth the price. (I'm not a contributor, just a
satisfied customer.)


