Secret codes have been around for hundreds of years. Everyone from pirates to diplomats has used them to lock away messages from prying eyes. In recent years, mathematicians have built even better algorithms that are harder than ever to break.
Along the way, mathematicians also started discovering that the algorithms could do more than secure a message or protect the location of treasure. They could enforce complex rules and synchronize people who were working together.
The best algorithms now take on many roles beyond just protecting a message. Some can stop cheating. Others can ensure fair decisions and help teams build consensus. Some offer a neutral decision-maker that works without bias or favor. New use cases are appearing often for the same basic algorithms.
These algorithms can transform workflows everywhere. In the interest of promoting a better, more secure IT infrastructure for any company, here are seven of the most promising approaches for improving the security, fairness, and efficiency of your entire data stream — each of which might also offer unexpected next-generation advantages.
Blockchain
The word is often used as a synonym for cryptocurrency, but the concept is much broader. It’s a general solution for getting competitors to arrive at consensus. Cryptocurrency issuers use it to track a ledger of who owns which coin, but you might use it to track any asset or decision.
The most capable chains can run arbitrarily complex algorithms in a shared public system that allows everyone to audit the computations. They rely on cryptographic building blocks such as Merkle trees and the Elliptic Curve Digital Signature Algorithm (ECDSA) to process all transactions in a tightly controlled way. Everyone from colleagues to competitors can be sure that the results were found openly and honestly.
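To make the Merkle tree idea concrete, here is a minimal sketch in Python. It is not how any particular chain computes its roots: the function name `merkle_root`, the sample transactions, and the choice to duplicate the last hash on odd-sized levels are all assumptions made for illustration. The point is only that one short hash commits to every transaction, so any tampering changes the root.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Return the Merkle root of a non-empty list of byte strings."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last hash on odd levels (one common convention).
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical transactions, for illustration only.
transactions = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
root = merkle_root(transactions)
tampered = merkle_root([b"alice->bob:500", b"bob->carol:2", b"carol->dave:1"])

print(root.hex())
print("tampering detected:", root != tampered)
```

Anyone holding the published root can recompute it from the transactions and notice if even one byte was changed.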
IT leaders can rely on the algorithms for updating the chain in any scenario that requires building trust between users who may be suspicious of one another. Some are building out ways to invest or wager on events. Others want to simplify complex transactions, such as buying a car, that require synchronizing multiple parties like lenders, insurance agents, or title service agents.
The costs for using some of the most prominent chains such as Ethereum can be significant, but there are now a number of good secondary or tertiary chains that offer much the same security at dramatically lower prices. Some options to check out are Solana, Arbitrum, Gnosis, and Skale, but there are too many to list here.
Private information retrieval
Securing a database is fairly straightforward. Protecting the privacy of the users, however, is a bit more difficult. Private information retrieval algorithms make it possible for people to search the database for specific blocks of data without revealing too much to the database owner.
This extra layer of protection relies on scrambling larger blocks of data into a complex and inscrutable mathematical blob. Only the right user can unpack the particular blocks they want to see. The database can’t track which particular bits were requested because the blob includes so many.
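The libraries in this space use heavy lattice math, but the core trick can be seen in a much older and simpler scheme. The sketch below is a classic two-server variant, not the single-server approach described above, and it assumes the two servers hold identical copies and do not collude. The function names and the sample records are hypothetical. Each server sees only a random-looking selection of records, yet the client can combine the two answers to recover exactly the record it wanted.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def server_answer(database: list[bytes], selection: list[int]) -> bytes:
    """Run by a server: XOR together every record whose selection bit is 1."""
    answer = bytes(len(database[0]))  # all-zero block; records must share one length
    for record, bit in zip(database, selection):
        if bit:
            answer = xor_bytes(answer, record)
    return answer


def make_queries(n_records: int, wanted: int) -> tuple[list[int], list[int]]:
    """Run by the client: two selections that differ only at the wanted index."""
    query_a = [secrets.randbelow(2) for _ in range(n_records)]
    query_b = list(query_a)
    query_b[wanted] ^= 1
    return query_a, query_b


def retrieve(database: list[bytes], wanted: int) -> bytes:
    query_a, query_b = make_queries(len(database), wanted)
    answer_a = server_answer(database, query_a)  # computed by server A
    answer_b = server_answer(database, query_b)  # computed by server B
    # Records selected by both servers cancel out, leaving only the wanted one.
    return xor_bytes(answer_a, answer_b)


# Hypothetical fixed-length records, for illustration only.
database = [b"AAPL", b"MSFT", b"GOOG", b"AMZN"]
print(retrieve(database, 2))
```

Taken alone, each query is a uniformly random bit string, so neither server learns which record was requested.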
The algorithms are useful in realms where even a database query can reveal too much. Stock trading desks, for example, might want to prevent insider trading by hiding their investigations from the back office that maintains the databases. Secure government agencies can protect compartmentalized information that’s stored in common infrastructure.
SealPIR, MuchPIR, and FrodoPIR are three examples of libraries that can be incorporated to provide private information retrieval services.
Snarks
Digital signatures are a well understood feature of modern encryption math. Someone with knowledge of a secret key uses it to certify some collection of bits. Software installations, database transactions, and DNS entries are just some of the many collections of bits that are certified.
ZK-Snarks offer a more powerful way to certify something with the additional feature of not revealing it. The term is an acronym for “Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge.” A snark in a digital ID may guarantee that someone is old enough to drink alcohol without revealing their exact age.
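A real ZK-Snark is far too involved for a short listing, but the flavor of proving something without revealing it can be sketched with a much simpler relative: a non-interactive Schnorr proof of knowledge. This is not a Snark, and the tiny group parameters below are toy values chosen for readability, with no security at all. The names `prove` and `verify` are hypothetical. The prover convinces anyone that it knows the secret behind a public key, and the proof never contains the secret itself.

```python
import hashlib
import secrets

# Toy parameters (assumption, insecure): P = 2*Q + 1 with both prime,
# and G generates the subgroup of order Q.
P = 2039
Q = 1019
G = 4


def challenge(public_key: int, commitment: int) -> int:
    """Derive the challenge by hashing, which removes the need for interaction."""
    data = f"{G}|{public_key}|{commitment}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret: int) -> tuple[int, tuple[int, int]]:
    """Return the public key and a proof of knowing `secret`."""
    public_key = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1
    commitment = pow(G, nonce, P)
    c = challenge(public_key, commitment)
    response = (nonce + c * secret) % Q
    return public_key, (commitment, response)


def verify(public_key: int, proof: tuple[int, int]) -> bool:
    """Check G^response == commitment * public_key^challenge (mod P)."""
    commitment, response = proof
    c = challenge(public_key, commitment)
    return pow(G, response, P) == (commitment * pow(public_key, c, P)) % P


public_key, proof = prove(secret=123)
print("proof accepted:", verify(public_key, proof))
```

The verifier only ever sees the public key, the commitment, and the response. Production systems use much larger groups and, for Snarks, entirely different machinery that can prove arbitrary statements.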
Some of the most obvious uses may be in digital contracts. One side can vouch for certain factors without revealing sensitive or personal information. A digital voting system may tabulate choices in a way that can be audited without revealing anyone’s vote.
Verifying a proof is generally very fast, even though generating one can be expensive, and some applications rely on ZK-Snarks more for that speed than for their ability to withhold data. Sometimes checking a ZK-Snark for some transaction can be much more efficient than combing through all the data in the transaction.
Some implementations include libsnark, DIZK, or ZoKrates.
Post-quantum cryptography
Traditional public-key cryptography algorithms will be broken easily when a quantum computer of sufficient size appears. No one has publicly described much real progress toward creating a machine that can break the systems currently being used, but some researchers have devoted their time to preparing for that moment. They’ve been creating new algorithms with a different structure that wouldn’t be immediately crackable.
The National Institute of Standards and Technology has been organizing contests to develop good algorithms and has already identified several potential options. These are a good foundation for any enterprise with reason to worry about the appearance of good quantum computers.
But even those who aren’t so worried might want to explore the algorithms because they are sufficiently different to offer some advantages. SPHINCS+, for example, relies on basic hash functions that are well-studied. Some chips have the hash functions implemented in silicon.
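To show how a signature can be built from nothing but a hash function, here is a sketch of a Lamport one-time signature, one of the oldest hash-based schemes. It is not SPHINCS+, which is far more elaborate and can sign many messages; a Lamport key must never sign more than one. The function names are hypothetical and the code is for illustration, not production use.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Secret key: 256 pairs of random values. Public key: their hashes."""
    secret_key = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public_key = [(H(a), H(b)) for a, b in secret_key]
    return secret_key, public_key


def message_bits(message: bytes) -> list[int]:
    """The 256 bits of the message's SHA-256 digest."""
    return [(byte >> shift) & 1 for byte in H(message) for shift in range(8)]


def sign(secret_key, message: bytes) -> list[bytes]:
    """Reveal one secret value per digest bit. Use each key for one message only."""
    return [secret_key[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(public_key, message: bytes, signature: list[bytes]) -> bool:
    """Hash each revealed value and compare it with the matching public hash."""
    bits = message_bits(message)
    return len(signature) == 256 and all(
        H(value) == public_key[i][bit]
        for i, (value, bit) in enumerate(zip(signature, bits))
    )


secret_key, public_key = keygen()
signature = sign(secret_key, b"ship the release")
print("valid:", verify(public_key, b"ship the release", signature))
print("forged:", verify(public_key, b"ship the malware", signature))
```

The security rests only on the hash function being hard to invert, which is why this family of schemes is attractive when the future of number-theoretic assumptions is uncertain.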
The work at NIST is a good place to begin. Its website points to draft standards, discussions, and reference implementations.
Federated learning with encryption
One of the biggest challenges for training AI algorithms is the need to collect all the data in one place. This is not just expensive; it’s also impractical because the data set is often larger than even the biggest machines can hold. It’s also a dangerous privacy risk to store all that information in one place where data thieves can enjoy one-stop shopping.
Some AI scientists are finding ways to split learning chores into separate locations so the data doesn’t need to be aggregated. They’re also mixing in layers of encryption to add extra privacy.
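One way to picture this is with a toy version of secure aggregation, sketched below. It is not taken from any of the frameworks in this area, and it replaces real model training with a trivial local average. The names, the fixed-point `SCALE`, and the assumption that values are non-negative are all choices made for illustration. Each client hides its update behind random masks that cancel only when everything is summed, so the server learns the total but not any individual contribution.

```python
import secrets

MODULUS = 2**32
SCALE = 1000  # assumption: fixed-point scale so updates become integers


def local_update(samples: list[float]) -> int:
    """Stand-in for local training: a quantized mean of non-negative samples."""
    return round(sum(samples) / len(samples) * SCALE) % MODULUS


def pairwise_masks(n_clients: int) -> list[int]:
    """Each pair of clients shares a random value; one adds it, the other subtracts it."""
    masks = [0] * n_clients
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            # In a real system the pair would agree on this value via a key exchange.
            shared = secrets.randbelow(MODULUS)
            masks[i] = (masks[i] + shared) % MODULUS
            masks[j] = (masks[j] - shared) % MODULUS
    return masks


def federated_average(client_data: list[list[float]]) -> float:
    updates = [local_update(samples) for samples in client_data]
    masks = pairwise_masks(len(updates))
    # The server only ever sees these masked values.
    masked = [(u + m) % MODULUS for u, m in zip(updates, masks)]
    total = sum(masked) % MODULUS  # the masks cancel in the sum
    return total / SCALE / len(client_data)


# Hypothetical data held at three separate sites.
client_data = [[1.0, 2.0, 3.0], [4.0], [5.0, 7.0]]
print(federated_average(client_data))
```

The raw samples never leave their sites, and the masked updates look random to the server. Real deployments also have to handle clients that drop out midway, which this sketch ignores.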
Some recent advances can be found in projects such as IBM FL, OpenFL, PySyft, and NVFlare. There are a number of others because it’s an area of much active research.
Differential privacy
Instead of simply scrambling data, differential privacy algorithms provide secrecy by adding random distortions and noise. The result is a data set that should be statistically similar to the original, but without personally identifiable information in the clear.
For instance, a database of cancer victims might identify the street the victim lives on but won’t have an accurate street number. Or, if more privacy is needed, the street might be replaced with one nearby. The amount of distortion is controlled by a privacy parameter, written with the Greek letter epsilon, where smaller values mean more noise and stronger privacy. Data scientists can still study the results, but identity thieves won’t find much value.
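The role of epsilon is easiest to see in the Laplace mechanism, a standard textbook technique sketched below. It adds noise to the answer of a counting query rather than to individual records, so it is a different setup from the street-address example. The function names and sample data are hypothetical. A smaller epsilon produces a larger noise scale and therefore more privacy.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """The difference of two exponential draws follows a Laplace distribution."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Noisy count. Adding or removing one person changes a count by at most 1,
    so the noise scale is 1 / epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random()
ages = [34, 51, 67, 45, 72, 29, 58, 63]  # hypothetical patient ages


def over_50(age: int) -> bool:
    return age > 50


print("epsilon=5.0:", private_count(ages, over_50, 5.0, rng))  # little noise
print("epsilon=0.1:", private_count(ages, over_50, 0.1, rng))  # heavy noise
```

Averaged over many queries the noise cancels out, which is why analysts can still learn from the data, but any single answer reveals little about whether one particular person is in the set.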
Google and IBM are distributing libraries for transforming your data. MIT Press is releasing a new book on the topic in their Essential Knowledge series.
Fully homomorphic encryption (FHE)
Traditionally, encrypted information is completely inscrutable because the goal is to ensure there’s no way to tell what’s inside that encoded packet without the key. But lately, some mathematicians have been finding clever ways to have our cake and eat it too. (I wrote about some of the early techniques in my book Translucent Databases.)
Today, there are many ways to reason about data without decrypting it, something that makes it possible to boost security for the enterprise. Some algorithms can search for specific database records. Others can do basic arithmetic.
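As an example of doing basic arithmetic on encrypted values, here is a toy version of the Paillier cryptosystem, which is additively homomorphic but not fully homomorphic. The primes are tiny, chosen only so the numbers stay readable, and offer no security. The function names are hypothetical. Multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts, so a server can add numbers it cannot read.

```python
import math
import secrets

# Toy key (assumption, insecure): real keys use primes of 1,024 bits or more.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAMBDA, -1, N)  # modular inverse, requires Python 3.8+


def encrypt(message: int) -> int:
    """Encrypt an integer 0 <= message < N using generator N + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(ciphertext: int) -> int:
    u = pow(ciphertext, LAMBDA, N_SQ)
    return ((u - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts (mod N)."""
    return (c1 * c2) % N_SQ


salary_a = encrypt(52000)
salary_b = encrypt(61000)
encrypted_total = add_encrypted(salary_a, salary_b)  # done without the secret key
print(decrypt(encrypted_total))
```

Only the holder of the secret values `LAMBDA` and `MU` can read the total. Because encryption is randomized, encrypting the same number twice gives different ciphertexts.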
The shining goal is to create fully homomorphic encryption (FHE) algorithms that allow arbitrary, Turing-complete computations on encrypted data without unscrambling it. There are some good algorithms that offer this, but the efficiency leaves much to be desired.
Some of the newest algorithms can be found cataloged in this list.