I wrote a reply by email which I'm converting into a draft
web page here below:
It's Probably Happening Already
Just because you're only starting to see it now doesn't mean
other firms who had programs that understood Word format (or
Word Perfect, or whatever) couldn't see such information
years ago!
Proprietary formats such as Word docs are usually some
kind of semi-binary, & are much less readable than HTML
(web pages), so they could always have held information such
as author & date etc. Until now it was perhaps not visible to
normal end users, but it was always visible to the editor
vendor, its subsidiaries, & developer licensees. It would
always have been simple to encrypt & hide such information
in files, so that only those who knew how would have been
able to view it, & not other end users. I'm not saying it
was done, just that it was always possible.
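To see how little effort it takes to read unencrypted fields
out of a semi-binary file, here is a minimal Python sketch.
The sample bytes and the "Author:" / "Edited:" labels are made
up for illustration; they are not the layout of any real
editor's format, & anything a vendor had encrypted would of
course not show up this way.

```python
import re

def printable_runs(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]

# Hypothetical "document": binary junk around two readable fields.
sample = (b"\x00\x01\xfeAuthor: Fred Smith\x00\x13\x07"
          b"Edited: 1998-03-02\xff\x02ab\x00")

for run in printable_runs(sample):
    print(run)
```

On a real file you would read the bytes with
open(path, "rb").read() & pass them to the same function.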
What Might Be Monitored
Certainly a word processing program written 10 years ago (or
more) to run on DOS could, if a commissioning company (such
as WP Inc or MS Inc) required it, fairly easily have kept an
encrypted secret list inside each file (or some files), that
end users wouldn't see & couldn't easily decrypt, but that
the company & its friends/licensees could, containing
information such as:
- log of when edited, & for how long
- how many keystrokes per minute each session (nice for
bosses to know the Tuesday secretary types more slowly than
the Thursday secretary, less nice for the typist)
- how much RAM your computer had, & which version of
DOS/Win (embarrassing if you're not licensed for it)
- list of other copyrighted programs of interest found
while running a background search on your computer
(simultaneously running the editor in foreground, so's
you'd not notice)
- note of what percentage of data files on your computer
had interesting/`dubious' key words in: heroin, tax
evasion, explosive, civil liberties, communism etc. After
all, companies spying on customers for the government is
viable. (Could be done in cumulative scans over successive
runs, not accessing the disc too often, so's not to alert
the user, storing the data centrally in a hidden file, &
encrypting & hiding inserted copies of the secret
information when other files were written with the editor!)
Export Of Data
The technology for nearly all of this was possible by 1986,
if any company had wanted to do it. The only limitation was
that, in the old days, one would have had to wait to receive
proprietary editor files in the post from corresponding
firms, then scan them for hidden data.
Easy to encourage the floppies really: the firm merely says
to its customers, "We offer faster/discounted services if
your correspondence/orders/trouble reports etc are filed
with us on floppy, not paper."
These days we have modems & ISDN, which make life much
easier for the illicit export of data from your computer,
unknown to its owner: merely bury a few extra bits of
data in the stream of legitimate data when connections are
established by the user!
User Names
MS-DOS didn't have the concept of user names, so it would
have been hard to include your name easily (but I bet, for
instance, that the name Fred Smith appears more than any other
name in files on a computer owned by Fred Smith, & can be
scanned for in the background. So even with DOS, although you
don't know the name of the owner of the computer using the
software being monitored, you (the spy company) can take a
pretty good automated guess!).
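A rough Python sketch of that automated guess follows. It
assumes, crudely, that a "name" is any two adjacent
capitalised words, & the three file contents are invented for
the example; a real scanner would need something much less
naive.

```python
import re
from collections import Counter

# Assumption: a "name" is two adjacent capitalised words.
NAME = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

def guess_owner(texts):
    """Return the most frequent name-like phrase, or None if there is none."""
    counts = Counter()
    for text in texts:
        counts.update(NAME.findall(text))
    return counts.most_common(1)[0][0] if counts else None

# Hypothetical contents of three files on the same computer.
files = [
    "Letter to Jane Brown about her order, signed Fred Smith",
    "Invoice prepared by Fred Smith for Acme Ltd",
    "Memo from Fred Smith to all staff",
]

print(guess_owner(files))
```

Note that the pattern also picks up things like company
names, so the guess only works because the owner's name
turns up far more often than anything else.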
Now Microsoft have, I believe, finally introduced more
personalisation to computers with Win98, so there's more
information easily available to export. (BTW decent operating
systems (such as Unix etc) had personalised login names over
a decade before MS-DOS was ever available. The problem lies
not in the local (PC) collection of personal information, but
in the export (floppy & modem).)
HTML
Another example of people giving away more information than
they realise is web pages:
Web HTML files produced by web page composer programs often
contain extra information such as author name, editor
program name, last edit date etc. This information doesn't
show up when viewed with a web browser, but is plainly
visible in a text editor. Embarrassing if it says (to quote
an actual example)
<meta name="GENERATOR" content="Microsoft Front-Page 2.0">
(unless you have a software licence for that product).
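If you'd rather not read the raw HTML by eye, a short Python
sketch using the standard html.parser module can list the
meta tags for you. The page below is a made-up example (the
Author tag is my assumption of what a composer might add);
only the GENERATOR line is taken from the text above.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        fields = dict(attrs)
        name, content = fields.get("name"), fields.get("content")
        if name and content:
            # Names are lower-cased so GENERATOR and generator match.
            self.meta[name.lower()] = content

def hidden_meta(html):
    parser = MetaCollector()
    parser.feed(html)
    return parser.meta

# Hypothetical page as a composer program might have saved it.
page = """<html><head>
<meta name="GENERATOR" content="Microsoft Front-Page 2.0">
<meta name="Author" content="Fred Smith">
<title>My page</title></head><body>Hello</body></html>"""

for name, value in hidden_meta(page).items():
    print(name, "=", value)
```

Run it over your own pages before you publish them, & delete
any tags you'd rather the world didn't see.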
Other examples of security concerns include "cookie files"
& browsers in general.
What Can You Do?
- Lie to your computer, tell it your name is Napoleon
Bonaparte, tell web registration servers a pack of lies,
'cos they're there to serve their masters, not you :-)
- If you catch a program up to illicit tricks, tell the
editors of computer magazines (bear in mind where their
advertising revenue comes from though).
- In Britain, British Telecom used to prosecute phone
piracy under the rather arcane law of "Stealing Post Office
Electricity". I presume any software package sold to you as
an editor, that lengthens your call by smuggling out even
one byte of information, is guilty of stealing Your
electricity (i.e. your phone bill), & as such you could
prosecute the software vendor. In many countries, there
will also be extra laws granting you a right to privacy.
- Switch to Free Software: then
you will have no licensing worries, & won't care who
knows what software you use. You'll also usually have
access to source code, so you (or some friend or firm you
trust) can check & see what information is possibly
being hidden &/or sent.
- https://www.torproject.org/