Mathematical Oddities to Anger and Inspire

- by Paul Murphy -

The risk with doing security work is that navel gazing causes such a loss of perspective that what is forward starts to look backward while users mutter angrily about your behavioral resemblance to another anatomical feature entirely. Hence today's attempt at stress relief focuses on two mathematical constructs, one from Willard Quine and the other ultimately from Claude Shannon.

To quote from David Malone's page about quines: "A “quine” (or “selfrep”) is a computer program which prints its own listing." The canonical approach is thus to create a bunch of variables representing the lines of the program that prints them, and then to append the lines that do the printing. For example (again from the quines page):

char *s1="#include <stdio.h>%c%cint%cmain (void)%c{%c";

and later

printf(s1,n,n,n,n,n);
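
Pulled together, a complete program in the same spirit might look like this - my own minimal sketch rather than Malone's exact listing, with comments left out because any comment would also have to appear in the output:

#include <stdio.h>
char *s="#include <stdio.h>%cchar *s=%c%s%c;%cint main(void){printf(s,10,34,s,34,10,10);return 0;}%c";
int main(void){printf(s,10,34,s,34,10,10);return 0;}

Compile it, run it, and diff the output against the source: the two should match byte for byte, assuming the file ends in a single newline.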

It's kind of a silly trick: fundamentally inelegant because it relies on brute force, and incomplete because it doesn't print out the listings for either printf or the compiler used. It's no better, in other words, than my own favorite quine - typing a 1 followed by a carriage return in APL.

A more sophisticated variant on this is to write a program that prints its input as both text and hex, use it to process itself, and use an include to compile the hex output in as the array to be processed in the final version. That looks far more "scientific" and is actually a lot easier to understand, but it is still a cheat because the program uses library resources whose code you should, in theory, show but don't.
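
For concreteness, here is a minimal sketch of the first stage only - a filter, assumed to read the source on standard input, that writes it back out as a C array initializer ready for the #include step; the text half and the final assembly are left out:

#include <stdio.h>

/* First stage of the hex variant: dump whatever arrives on stdin as a
   C array initializer that a later stage can #include. */
int main(void)
{
    int c;
    printf("unsigned char prog[] = {\n");
    while ((c = getchar()) != EOF)
        printf("0x%02x,", c);
    printf("\n};\n");
    return 0;
}

Run against its own source - say ./dump < dump.c > prog.h, to pick hypothetical names - the output becomes the array the finished program both prints as text and carries around as its listing.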

The boot loader on a SPARC workstation or small server is actually a Forth environment, so a truly elegant, and minimalist, quine would consist of ">:" - with or without a trailing body consisting of booting the machine, running it for a year or so, and then shutting down!

Quines are interesting to security researchers because it's provably impossible (a consequence of Kleene's recursion theorem) to develop a general purpose computer language in which quines cannot be written - and therefore it's impossible to make self-replicating code impossible.

"Self replicating code", of course, is just the long way of saying "virus" - meaning that Java may be a whole lot more secure than C, but only because it enforces standards of behavior and not because it actually is more secure.

If your goal is either to protect or to steal secrets, the Java virtual machine looks a lot like a limited access machine running inside a real machine - and if that's where the secrets are, the real machine may be of no importance because it's not the one you want to either protect or corrupt. In theory a network consisting of a few dozen Java virtual machines is every bit as vulnerable as one consisting of a few dozen real machines.

To succeed an attacker has to get his code operating on at least one member of the network, that code then has to replicate itself on the others, and there has to be some mechanism for getting the stolen information out without being caught. I haven't had a chance to try it, but my guess is that DTrace will enable a legal user on a Solaris machine to look inside at least his own Java sandbox during program execution - meaning that it could provide both the entry and exit points for the quine and the information.

Maybe the ultimate security recommendation is to stick to pencil and paper but, for now, the business corollary is that Java security is illusory; really just a sophisticated variation on security through obscurity. Reconsidering C for high security work may seem stupid, but actually makes sense if you think of it as eliminating a layer of obfuscation between you and the bad guys.

In 1949 Claude Shannon delivered a talk on Programming a Computer for Playing Chess to the Institute of Radio Engineers in New York. In it he laid the groundwork for what later became known as the general branch and bound algorithm. In the multi-player games version of this, branch and bound really consists of little more than the mathematical expression of the injunction that each player should try anything, and if it's dumb, try something else.
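
As a toy illustration - mine, not Shannon's, with a tree shape and leaf scores made up for the purpose - here is that injunction reduced to code: plain minimax over a complete binary game tree of depth three, in which each side tries every available move and keeps the best reply found so far:

#include <stdio.h>

/* Made-up leaf scores for a complete binary game tree of depth 3. */
static const int leaf[8] = { 3, 5, 6, 9, 1, 2, 0, -1 };

/* node indexes a position within its level; its children one level
   down are 2*node and 2*node + 1. */
static int minimax(int node, int depth, int maximizing)
{
    if (depth == 0)
        return leaf[node];

    int left  = minimax(2 * node,     depth - 1, !maximizing);
    int right = minimax(2 * node + 1, depth - 1, !maximizing);

    if (maximizing)
        return left > right ? left : right;
    return left < right ? left : right;
}

int main(void)
{
    printf("value of the root position: %d\n", minimax(0, 3, 1));
    return 0;
}

With these scores the program reports the root position as being worth 5 to the first player; branch and bound adds the later refinement of abandoning any line that provably cannot beat one already in hand.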

Two years earlier George Dantzig, later of Stanford, had found the simplex method for solving linear optimization problems of the kind immortalized by the feed mill example beloved of introductory texts - you know: given the cost, protein and energy content, and maximum availability of various constituents, find the cheapest way of mixing up a set quantity of dog food meeting some predetermined minimal nutritional standards.
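
Written out with numbers invented purely for illustration (two ingredients, one kilogram of dog food), the toy version looks like this:

g = kilograms of grain in the mix, m = kilograms of meal

minimize    30g + 90m                (cost, in cents)
subject to  0.10g + 0.40m >= 0.22    (protein)
            3.0g + 2.0m >= 2.4       (energy, Mcal)
            g + m = 1                (mix exactly one kilogram)
            g <= 0.8, m <= 0.9       (availability)
            g >= 0, m >= 0

Simplex walks the corners of the feasible region those constraints define; with these numbers the cheapest legal mix is 0.6 kg of grain and 0.4 kg of meal, with the protein constraint binding.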

The formal branch and bound algorithm applies to problems, like those involved in two player games, which cannot reasonably be represented as simple linear programming [LP] problems, but can be represented as a set of LPs none of which individually yields an overall optimal solution. The strategy therefore is to repeatedly solve these sub-problems, keeping partial results that branch toward optimality in the general problem and treating the results from those which don't as boundaries not to be crossed again.
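
A short sketch may make the strategy concrete: branch and bound applied to a toy 0/1 knapsack problem, where the relaxation solved at each node is just a greedy fractional fill of the items not yet decided. The numbers are invented, and the items are assumed to be pre-sorted by value per unit weight:

#include <stdio.h>

/* Toy 0/1 knapsack solved by branch and bound.  Items are pre-sorted
   by value/weight ratio so the fractional relaxation is a simple
   greedy fill; all numbers are made up for the example. */
#define N 4
static const double weight[N] = { 2, 3, 4, 5 };
static const double value[N]  = { 10, 12, 14, 15 };
static const double CAP = 8;

static double best = 0;  /* best complete solution found so far */

/* Optimistic bound: fill the remaining capacity fractionally. */
static double bound(int i, double w, double v)
{
    for (; i < N && w + weight[i] <= CAP; i++) {
        w += weight[i];
        v += value[i];
    }
    if (i < N)
        v += value[i] * (CAP - w) / weight[i];
    return v;
}

/* Branch on item i: take it if it fits, or leave it out. */
static void search(int i, double w, double v)
{
    if (v > best)
        best = v;
    if (i == N || bound(i, w, v) <= best)
        return;              /* bounded: this branch can't win */
    if (w + weight[i] <= CAP)
        search(i + 1, w + weight[i], v + value[i]);
    search(i + 1, w, v);
}

int main(void)
{
    search(0, 0, 0);
    printf("best value: %g\n", best);
    return 0;
}

Each recursive call is one of the sub-problems; any branch whose bound can't beat the best complete answer already in hand becomes a boundary that is never crossed again. With these numbers the search settles on the second and fourth items, for a value of 27.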

In British English a "bounder" is a social charlatan; someone of "good family" whose word means nothing and whose every action is dictated by a mixture of personal cowardice and short term self interest. One of my favorite puns, therefore, involves describing the typical Slashdot discussion as branch and bounder reasoning - in which the main topic is repeatedly lost and regained as some people create extensive subthreads by branching off to explore their own neuroses.

Branch and Bound provides a very good model for an attack strategy aimed at multi-site network operations where you don't know who's got the stuff you want.

Suppose, for example, that you wanted to steal secret data returned by the Spirit rover. In this case you know the data exists because there are large gaps in the public record and no evidence of transmission failures to account for them. What you don't know is who has the data or why it's secret - but you want to find out.

It's impossible to develop a top flight geologist without leaving a public record, so getting together a list of the fifty or so experts NASA would be likely to go to wouldn't be hard. Since you'll find most of them at universities, you'll usually have little or no trouble penetrating their networks. From then on it's almost mechanical: monitor the external communications of the people you find, follow through with those whose contacts lead toward the contractor or project owner, and abandon the others. At least some of your candidates will be using PCs, and their access will ultimately let you locate the data and map the project's collaborators. It may take some time, but it's just branch and bound: abandon the dead ends, quine the others, and eventually you'll have the information needed to plan and carry out the actual data theft.


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 20-year veteran of the IT consulting industry.