Conficker C and a future with self-evolving computer viruses

I’m absolutely enthralled by the Conficker C virus after reading this analysis from SRI International. The C variant is the third major generation of the Conficker virus and demonstrates the highest level of sophistication found in any computer virus or worm to date. What excites me most is the decentralized nature of its peer-to-peer method for quickly propagating updates to itself across the roughly 12 million infected computers around the globe.

Researchers have identified April 1st as the date when the virus wakes from hibernation. The event has been simulated with a copy of the virus in computer science laboratories, but since the virus gets its instructions through the peer-to-peer network and rendezvous points, it is impossible to tell what kind of code will be executed until it happens. Speculation has ranged from the biggest April Fools’ Day joke ever to a massive dragnet or “Dark Google” that would allow the virus’s authors to search for sensitive information on infected machines en masse and sell it to criminal organizations or governments. That is especially worrisome since Conficker has infected many government and military networks.

The authors have demonstrated cutting-edge knowledge in multiple disciplines, so this is not just a kid sitting in his mother’s basement. This is a closely coordinated effort by a group of extremely talented individuals, and I personally wouldn’t be surprised if the authors, if caught, turn out to be part of a government initiative. The SRI report says “those responsible for this outbreak have demonstrated Internet-wide programming skills, advanced cryptographic skills, custom dual-layer code packing and code obfuscation skills, and in-depth knowledge of Windows internals and security products.”

When it first installs, Conficker C checks for mutexes with pseudo-randomly generated names to avoid overwriting itself. Then it patches the Win32 net API to inhibit antivirus software and block antivirus websites. Because it patches only the in-memory copies of DLLs and not the copies stored on disk, removal tools can’t simply replace the compromised files with clean ones. It also patches the same Windows vulnerability it initially used to enter the system, but leaves a back door that only new variants of Conficker can use. This prevents other viruses from piggybacking on Conficker and competing for control of the system.

To defend itself from attack, Conficker spawns a thread that searches for processes matching a static list of known antivirus applications and terminates them, and it blocks the services that allow antivirus software to auto-update. It also deletes all Windows restore points and removes safe mode as a boot option. Conficker uses dual-layer encryption and code obfuscation to hinder efforts at reverse engineering it. The authors also move fast: Conficker released an update just a few weeks after the new MD6 hashing method was made publicly available by its original researchers at MIT.

The authors use two similar methods of propagating updates to infected machines. The first, which previous Conficker variants also use, is a clever “rendezvous” system that pseudo-randomly generates a huge list of possible locations, changing on a daily basis, where the authors may have placed a distribution server. The randomized nature and the large number of possible locations make efforts to block those domains impractical. The second is the peer-to-peer network: once a machine has an update, it can assist in spreading it to other machines. Currently, almost all peer-to-peer networks require some kind of “seed” or predefined peer list to introduce a new node into the network, but Conficker doesn’t. It uses the same kind of pseudo-randomly generated destination list as the rendezvous system to generate an initial peer list, which essentially lets it bootstrap itself into the network. There is no single bottleneck that can be attacked to stop Conficker from communicating with its peers.
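To make the rendezvous idea concrete, here is a minimal toy sketch of a date-seeded name list. It is not Conficker’s actual algorithm: the SHA-256 hash, the function name candidate_names, the ten-letter labels and the reserved “.example” suffix are all my own assumptions for illustration. What it shows is why the list changes every day, and why anyone who knows the scheme and the date (researchers included) can compute the same list ahead of time.

```python
import hashlib
from datetime import date


def candidate_names(day: date, count: int = 5, tld: str = ".example") -> list[str]:
    """Derive a deterministic list of pseudo-random names from a calendar date.

    Hypothetical illustration only: the hash, label length and suffix are
    assumptions, not the scheme Conficker uses.
    """
    names = []
    for index in range(count):
        # Hash the date plus a counter; the same inputs always give the same bytes.
        digest = hashlib.sha256(f"{day.isoformat()}:{index}".encode()).digest()
        # Map the first ten bytes onto lowercase letters to form a label.
        label = "".join(chr(ord("a") + b % 26) for b in digest[:10])
        names.append(label + tld)
    return names


if __name__ == "__main__":
    # Two parties running this on the same date get the same list.
    for name in candidate_names(date(2009, 4, 1)):
        print(name)
```

Since the output depends only on the date, the list can be precomputed, but raising count into the thousands is what makes blocking every candidate impractical.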

Some of my own ideas

Antivirus filters and the coordinated strategies used to thwart the spread of viral software rely on patterns to identify uninvited guests. I see future decentralized malware using a randomized approach to avoid detection. If the same application is also capable of virally updating its peers, a system of natural selection will evolve. This is essentially how genetic algorithms work. In this case, the natural fitness function is the ability to infect new systems (spawning offspring) and the ability to ward off efforts to remove it from the host (self-preservation). Randomized configurations of the virus (the genetic code) that do well on both counts will be propagated to new systems and over existing instances at a higher rate, and thus the more successful variants will become the most prevalent. And there we have a model of evolution.

The process works the same way as it does in HIV and flu viruses, and it has the effect of a self-healing, growing network that can autonomously adapt to new countermeasures developed by antivirus companies. A self-evolving computer virus has one advantage over biological evolution: electronic speed. Time is the greatest enemy of evolution. Digital organisms have the chance to excel beyond anything we have observed in nature.

Fortunately, genetic programming hasn’t yet advanced to the stage where the scenario I proposed is practical. Current genetic programming algorithms are limited to a constant set of numerically variable attributes that are randomly modified in each generation. Those attributes could never be complex enough to resemble an actual evolving organism, and too much of the logic for a computer virus has to be pre-programmed and static. I expect that to change over the next decade as more work is done in this area, so watch out.
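For anyone who hasn’t seen one, the kind of algorithm I’m describing looks roughly like the sketch below: a fixed-length list of numbers, randomly nudged each generation, with the fitter half surviving. Everything here is a toy assumption on my part (the function names evolve and toy_fitness, the population size, the mutation scale, and a fitness function that just measures closeness to an arbitrary target vector). It has nothing to do with malware and only shows how limited the “genome” is.

```python
import random


def evolve(fitness, n_attrs=4, pop_size=20, generations=50, sigma=0.1, seed=0):
    """Evolve a fixed-length list of numeric attributes toward higher fitness."""
    rng = random.Random(seed)
    # Start from random attribute vectors; the number of attributes never changes.
    population = [[rng.uniform(-1, 1) for _ in range(n_attrs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # Reproduction: each child is a copy of a random survivor with
        # small Gaussian noise added to every attribute.
        children = [[gene + rng.gauss(0, sigma) for gene in rng.choice(survivors)]
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


def toy_fitness(attrs):
    """Hypothetical fitness: negative squared distance to an arbitrary target."""
    target = [0.5, -0.25, 0.0, 0.75]
    return -sum((a - t) ** 2 for a, t in zip(attrs, target))


if __name__ == "__main__":
    best = evolve(toy_fitness)
    print(best, toy_fitness(best))
```

Note that only the numbers evolve. The selection and mutation logic is fixed by hand, which is exactly the pre-programmed, static part I mean.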