
Re: Auditing subcommittee meeting at clerk convention in Colorado



On Tue, Jan 13, 2009 at 05:10:34PM -0700, Neal McBurnett wrote:
> On Sun, Jan 11, 2009 at 10:52:31AM -0700, Paul E Condon wrote:
> > On Sat, Jan 10, 2009 at 12:30:25PM -0700, Neal McBurnett wrote:
> > > Attached is the final version of the "Colorado Law Improvements"
> > > document we've come up with and will present shortly.
> > I read your document using OpenOffice. The numbering/lettering of the
> > sections was not bound to any particular convention that OOW
> > understood, so I can't refer to sections by their numbering.
> 
> Thanks for your response, Paul.
> 
> I thought it worked for me in Open Office, with what I took to be cute
> statistical numbering for some points (alpha, beta etc), but I didn't
> look closely.  Can you say more?

I run OO on a Debian Lenny box, using the OO package from the Debian
repository. It opens the document fine. There is no problem with the
arrangement of the paragraphs, but where the paragraph numbers (or
letters?) should be there is only a little rectangular box. I take this
to indicate that the document does not actually store values for the
heading numbers, but only some indication of the rule that should be
applied to compute them.

It is an inconvenience only because one can't refer to sections by
their short designations. But it is indicative of what happens with
proprietary file formats.


> 
> More below....
> 
> > Under "Implement batch reporting:", first sub-section:
> > 
> > "Batch reports must be machine-readable."
> > 
> > You should expand to:
> > "Batch reports must be machine-readable in a computer file format that
> > is open, and non-proprietary, e.g. CSV."
> > 
> > HTML with cascading style sheets (CSS) is 'machine-readable', but
> > hardly audit-analyst friendly. But it is also 'open', so my wording
> > needs some tweaking.
> > 
> > Again, under "Aggregate state-wide unofficial data:"
> > 
> > "Such reports must be machine-readable."
> > 
> > What is needed, IMO, is a separate section discussing
> > 'machine-readable'.  Maybe words about character-based vs. image-based
> > are needed. And, yes. HTML is character based. This work is not yet
> > done.
> 
> This is all too true.  Some other folks are preparing a separate
> document on that, which I'll forward.
> 
> > re. the footnote:
> > "If a computer pseudo-random number generator is used to help select
> > the audit units, initial values or "seeds" for the generator should be
> > chosen using a publicly observed, physical source of randomness, such
> > as rolls of fair dice; also, the generator's algorithm must be
> > published so that the public can verify that a valid algorithm was
> > used."
> 
> Re: your suggestion to use timestamps for randomness: if you read the
> references cited for the procedure we used in Boulder's audit:

SSUE is, of course, not random. But it is unfudgeable, which is good
for fending off conspiracy theorists. But there are other ideas. OK. I
read Rivest's paper. I think the suggestion of using hand
calculators for the actual selection of precincts is --- crazy.

It is nice to have a calculation that can be repeated by a paranoid
person, but to do the actual work that way is only OK while the process
is still highly experimental and provisional. Later it should be
reduced to computer code. There exist suitable software emulations of
eight-digit calculators.

The use of ten-sided dice is peculiar. How does one make a fair
ten-faced die? The die cannot be one of the Platonic solids, which have
face counts of 4, 6, 8, 12, and 20.

> 
>  http://bcn.boulder.co.us/~neal/elections/boulder-audit-08-11/procedure/
> 

With real people doing the work, and suspicious people watching, there is
a lot of room for misunderstanding. The people who have influence over
what key gets generated are all in one room. The procedure is complicated
by human standards. An observer has no chance of watching every participant
with undivided attention. It is easy for an observer who is unhappy with
the result to believe that the participants pulled some trickery.

But it might work. It has some possibility of becoming a public ceremony
with photographs of participants published in the newspaper and whatnot.
Ceremonies are good for the community.

> you'll see that we really do want more than timestamps in order to get
> secure, publicly verifiable random numbers.  And it is important to
> have evidence that folks from different parties, etc. generate the
> numbers privately and unveil them at the same time as described there,
> so email won't do.
> 
> And the PRNG that we used (Rivest's SSR) is less complicated and can
> easily be verified by calculator which again makes it well suited to
> what we need.

Random numbers for statistical analysis are very different from random
numbers for generating secure encryption keys. 

Rivest has some handwaving about how to get the right total number of
precincts chosen to properly use the total level of effort that has
been budgeted. I have an algorithm:

1. choose a-priori weights Wi for each precinct Pi, i=0..(N-1). N is
   the number of precincts. (Weights are merely un-normalized 
   probabilities.)

2. Form the partial sums of the Wi. partial_sum() is available in the
   STL. Its output is a sequence PSi. The last number in the PSi
   sequence, call it Wtot, is the total sum of all the Wi.

3. Loop over all precincts, until enough are selected.
   a. Generate a random number Rj, j=0..N-1, in the
      range [ 0 : Wtot ).
   b. Find the first index k for which PSk > Rj. (Another STL function.)
   c. If k==j, choose precinct Pj for audit.
   d. Iterate 3. (There will be a single precinct selected on each
      iteration.)
   If a precinct is selected more than once, more iterations of the
   loop are needed to achieve the target number of selections.

The code for this sort of thing can be written in C++ and published.
The seeds that are used are also published. The whole thing can be
checked by the paranoids. Things that involve humans keeping paper
records of what they did, or merely think they did, can't be checked
so well. 
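
A minimal sketch of steps 1-3 above, in C++. It is one reading of the
steps, not a finished audit tool. Assumptions of the sketch: the name
select_precincts and its parameters are hypothetical; each draw selects
precinct k directly (the "single precinct selected on each iteration"
reading), so the k==j test of step 3c is not implemented; repeated
selections are simply redrawn; and std::mt19937_64 stands in for
whatever PRNG is finally agreed on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the sorted indices of `target` distinct
// precincts. Each draw picks precinct k with probability W[k] / Wtot.
std::vector<std::size_t> select_precincts(const std::vector<double>& weights,
                                          std::size_t target,
                                          std::uint64_t seed) {
    if (target == 0) return {};

    // Step 1 sanity checks: weights are un-normalized probabilities.
    std::size_t positive = 0;
    for (double w : weights) {
        if (w < 0.0) throw std::invalid_argument("negative weight");
        if (w > 0.0) ++positive;
    }
    if (target > positive)
        throw std::invalid_argument("target exceeds selectable precincts");

    // Step 2: PS[i] = W[0] + ... + W[i]; the last entry is Wtot.
    std::vector<double> ps(weights.size());
    std::partial_sum(weights.begin(), weights.end(), ps.begin());
    const double wtot = ps.back();
    assert(wtot > 0.0);

    // Step 3: draw R in [0, Wtot) and find the first k with PS[k] > R.
    std::mt19937_64 gen(seed);  // published seed makes the run repeatable
    std::uniform_real_distribution<double> draw(0.0, wtot);
    std::set<std::size_t> chosen;  // a set ignores repeated selections
    while (chosen.size() < target) {
        const double r = draw(gen);
        const auto it = std::upper_bound(ps.begin(), ps.end(), r);
        if (it == ps.end()) continue;  // guard: r rounded up to Wtot
        chosen.insert(static_cast<std::size_t>(it - ps.begin()));
    }
    return std::vector<std::size_t>(chosen.begin(), chosen.end());
}
```

Calling select_precincts(weights, 10, seed) twice with the same
published seed returns the same ten indices, which is what lets an
outsider re-run the selection. One caveat: the output of
std::mt19937_64 is fixed by the C++ standard, but the algorithm behind
std::uniform_real_distribution is not, so anyone repeating the run on a
different standard library would probably want a hand-written mapping
from generator output to [0, Wtot).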

> 
> Neal McBurnett                 http://neal.mcburnett.org/
> 
> > Consider also a computer based seed: 
> > Use the number of seconds since the Unix Epoch at the time of
> > initiation of the post-election audit process, or some other
> > convenient public event. The seed doesn't need to be truly random,
> > just beyond the power to scam by any participant.  Seconds since Unix
> > Epoch (SSUE) is surely the most available, ever-changing, non-repeatable
> > number in the world. And record this seed, so that calculations that
> > are based on it can be repeated precisely for software verification.
> > This verification is important for checking any corrections to
> > software bugs. 
> > 
> > But I wonder, is it practical to have only one PRNG process for the
> > whole body of work? Maybe just use, and record, the SSUE each time a
> > seed is used. If someone tries to scam the system by repeatedly
> > running an analysis, the record of too many seeds in a short time
> > could be used by auditors-of-the-auditors to catch the cheat.
> > Computers are good at keeping records of their operations
> > automatically, so this could work, IMO.
> > 
> > Another suggestion for seed: 
> > By prior arrangement with election managers of the two major parties,
> > they will each send email to the audit manager after the polls are
> > closed. Each email time-stamp is a value of SSUE. Use the product of
> > these two different SSUEs as a 64bit seed. 
> > 
> > My preferred PRNG is the one published by Press et al., "Numerical
> > Recipes ...". Read this book when it comes time to make a selection.
> > 
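
For concreteness, the two-timestamp seed quoted above could look
something like the following in C++. This is only a sketch: the
function names are hypothetical, and std::mt19937_64 is a stand-in
generator, not the Numerical Recipes one. Two present-day SSUE values
are each on the order of 10^9, so their product is on the order of
10^18 and fits in an unsigned 64-bit integer (maximum about 1.8 x
10^19); if it ever did overflow, unsigned arithmetic would wrap modulo
2^64.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: combine the two email timestamps (seconds since
// the Unix Epoch) into a single 64-bit seed by multiplication.
std::uint64_t seed_from_timestamps(std::uint64_t ssue_a, std::uint64_t ssue_b) {
    return ssue_a * ssue_b;  // wraps modulo 2^64 on overflow
}

// Hypothetical helper: build a generator from the recorded timestamps,
// so that anyone holding the two values can repeat the calculation.
std::mt19937_64 make_generator(std::uint64_t ssue_a, std::uint64_t ssue_b) {
    return std::mt19937_64(seed_from_timestamps(ssue_a, ssue_b));
}
```

Two generators built from the same recorded pair of timestamps produce
the same sequence, which is the property needed for software
verification.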

I don't see the need to make the audit decision at the precinct, as
Rivest seems to want. Workers at the precinct are unlikely to be able
to do an audit after a 14-hour day supervising the conduct of the
vote. But maybe this is something that is really important to someone.

Some research needs to be done on what kinds of ceremony actually
comfort people, and are not wasteful busy work. More ceremony
gives more opportunity for trivial, but observable mistakes in the
performance of the ceremony, which mistakes lead to suspicion.

-- 
Paul E Condon           
pecondon@xxxxxxxxxxxxxxxx