Smart Counter-Collusion Contracts for Verifiable Cloud Computing (2/2)

Previously, we showed the Prisoner’s contract and how it would force the two clouds to behave honestly, by creating a Prisoner’s Dilemma. However, this only works if the two clouds cannot make credible commitments. The fundamental problem in the Prisoner’s Dilemma and in game 1 (in the previous post) is that each player cannot believe the other, a selfish utility maximizer, will follow the collusion strategy.

Now if someone does not believe you, you cannot convince him/her by just talking. What convinces a rational player is showing that lying would make you worse off. If lying is not in your interest, you will not lie. And if you want someone to do what you expect, you have to show that doing so is in his/her best interest.

That is the idea behind the Colluder’s contract, in which both clouds show their loyalty to the collusion (i.e. sending the agreed wrong result r) by promising, in effect: “I will suffer a loss if I cheat, and any damage my cheating causes you will be compensated.” The cloud that initiates the collusion can also give a slice of its own profit to the other as an additional incentive. The contract is again based on deposits:

  • Each cloud pays a large enough deposit into the Colluder’s contract;
  • Any cloud that does not follow the collusion strategy loses its deposit, which is transferred to the other cloud;
  • The ringleader commits to paying a bribe if the other cloud follows the collusion strategy.
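The deposit-and-bribe logic above can be sketched as follows (the function name and the amounts are hypothetical, for illustration only; the actual contract and the bounds on the deposit are in the paper):

```python
# Hypothetical settlement logic of the Colluder's contract (illustrative
# numbers). Each cloud has deposited; deviating from the agreed wrong
# result forfeits the deposit to the other cloud, and the ringleader (C1)
# pays the promised bribe to a loyal accomplice.
def colluders_settlement(deposit, bribe, c1_follows, c2_follows):
    """Return the payoff adjustments (for C1, for C2); C1 is the ringleader."""
    u1, u2 = 0, 0
    if c1_follows and not c2_follows:
        u1 += deposit      # C2's deposit is transferred to C1
        u2 -= deposit
    elif c2_follows and not c1_follows:
        u2 += deposit      # C1's deposit is transferred to C2
        u1 -= deposit
    if c1_follows and c2_follows:
        u1 -= bribe        # ringleader pays the promised bribe
        u2 += bribe
    return u1, u2
```

With a large deposit, deviating unilaterally is now a guaranteed loss, which is exactly what makes the collusion promise credible.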

This Colluder’s contract, when in place, changes the game into:

As you can see, now the equilibrium (bold path) for the two clouds is to collude and both follow the collusion strategy.

This is bad. After trying to prevent collusion using smart contracts, we found that smart contracts can actually be used to enable collusion. And if the client tries to counter that with another contract, the clouds can use yet another contract to counter back. This is an endless loop.

What can we do then, if we cannot counter back directly? In the end, we came up with the idea of using a smart contract to incentivize secret betrayal and reporting. This leads to the Traitor’s contract. In this contract, the first cloud that reports the collusion will not be punished by the Prisoner’s contract and will receive an additional reward if the collusion attempt does exist (so there is a motivation to report). However, a cloud that reports a non-existent collusion has to bear the consequences and suffer a loss (so there is a motivation not to abuse the system).

The consequence of reporting is that the client calls the trusted party and finds out who cheated. Once the trusted party is called, there is no point in countering back with another contract, because the payoff of each cloud now depends only on whether it cheats, not on the other’s behaviour. So we break the loop. More importantly, the Traitor’s contract creates distrust between the clouds, because “agree to collude, then betray” is the best response if the other cloud tries to initiate collusion. Once both clouds understand that, no one will want to initiate collusion, because they know they will be betrayed and end up with a worse payoff. Then both will behave honestly in the first place.
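Why betrayal dominates once the Traitor’s contract is in place can be seen in a toy calculation (the numbers are hypothetical; the actual payoff conditions are derived in the paper):

```python
# Toy payoffs (hypothetical values, not the paper's) for a cloud that has
# just been invited to collude, once the Traitor's contract is in place.
def accomplice_payoff(action, salary=10, collusion_gain=15, report_reward=20):
    if action == "follow":                 # go through with the collusion
        return collusion_gain
    if action == "betray":                 # agree, then stay honest and report:
        return salary + report_reward      # immunity from punishment + reward
    return salary                          # decline and stay honest

best = max(["follow", "betray", "decline"], key=accomplice_payoff)
assert best == "betray"   # betrayal is the best response to an invitation
```

Because the reward makes betrayal strictly better than loyal collusion, a rational cloud anticipates being betrayed and never initiates collusion at all.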

The contract works again by manipulating deposits paid upfront, by the client and the reporting cloud. Details can be found in the paper. Here I just show the full game tree:

An implementation of the contracts in Solidity is available here. We actually tested the contracts on the official Ethereum network. There are challenges when implementing the contracts, one being the transparency of a public blockchain: everything you put on the blockchain is visible to anyone. To make it worse, a blockchain is append-only, so there is no way to delete the data later if you change your mind.

To preserve data privacy, we used some lightweight cryptography, including Pedersen commitments and non-interactive zero-knowledge proofs (NIZKs). A Pedersen commitment allows us to put a “commitment” (ciphertext) to a value on the blockchain rather than the value itself. The commitment is hiding, in that it leaks no information about the committed value, and binding, in the sense that you cannot find a different value and convince other people that this new value was the one committed to. One problem caused by the hiding property is that the miners cannot see the values inside the commitments, and thus cannot compare them to determine whether the values are equal (which is needed to execute the contracts). Fortunately, we can use NIZKs, which are cryptographic proofs that can be checked publicly with commitments as inputs. There are existing NIZK protocols for proving equality/inequality of committed values, which we can simply use.
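For intuition, here is a toy Pedersen commitment in Python. The parameters are tiny and purely illustrative; real deployments use large elliptic-curve groups, and h must be generated so that nobody knows log_g(h). The final check shows the algebraic relation that an equality NIZK proves knowledge of:

```python
# Toy Pedersen commitment over a small prime-order group (illustrative
# parameters only; real systems use large groups, and h must be chosen
# so that nobody knows log_g(h)).
import secrets

p, q = 2039, 1019          # p = 2q + 1, both prime
g, h = 4, 9                # generators of the order-q subgroup of Z_p*

def commit(v, r=None):
    """Commit to value v with randomness r: C = g^v * h^r mod p."""
    r = secrets.randbelow(q) if r is None else r
    return pow(g, v % q, p) * pow(h, r, p) % p, r

# Hiding: the same value with fresh randomness yields unrelated commitments.
c1, r1 = commit(42)
c2, r2 = commit(42)

# If two commitments hide the same value, their quotient is h^(r1 - r2):
# this is exactly the relation an equality NIZK proves knowledge of,
# without revealing r1, r2 or the committed value.
quotient = c1 * pow(c2, -1, p) % p
assert quotient == pow(h, (r1 - r2) % q, p)
```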

The cost of using the smart contracts comes from the transaction fees paid to the miners for storing and executing the contracts. In our experiments, conducted on the official Ethereum network, the transaction fees were small, ranging from $0.01 to $0.40 depending on the transaction. This was done in May 2017, when the price of Ether was about $90. Today the Ether price is about $360, so transaction fees would be higher. Luckily, the most expensive operations are the cryptographic ones, and the recent Ethereum hard fork has made elliptic-curve cryptography (which we use) cheaper than before. So the increase in transaction fees should not be as steep as the increase in the Ether price.

The End.

Smart Counter-Collusion Contracts for Verifiable Cloud Computing (1/2)

(Previous post)

The idea of our counter-collusion contracts is to make collusion a bad choice that leads to loss, so that the clouds will avoid it like the plague. Collusion has been studied for many years by economists, who have made several key observations:

  • Collusion is profit-driven. Collusion is often illegal; without additional profit, no one would have the motivation to collude.
  • Colluding parties have their own interests, and often those who collude are also competitors. This is certainly true in our case (cloud computing).
  • Cheating is a big problem in collusion. Often the party that deviates from the collusion can get an even higher profit, and so is motivated to do so.

Collusion is delicate and requires trust among the colluding parties. If we can take away the trust, the clouds cannot collude.

Everything I say below is based on certain assumptions. The most important ones include:

  • The clouds are rational, which means two things: they try to maximize their payoffs and they understand all consequences of the games.
  • There exists a trusted third party that can be called upon to re-compute the task and find out who was wrong, if the two clouds return different results. What is interesting about this trusted party is that our analysis shows that, if the two clouds are rational, it never needs to be involved at all.
  • The task to be outsourced must be deterministic, or reducible to a deterministic one (e.g. by including a seed as input and using a pseudorandom number generator for random choices).

There are other, less important assumptions; check them in the paper.
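The determinism assumption can be illustrated with a quick sketch (the task here is hypothetical; any PRNG-based computation works the same way): fix the seed as part of the input, and two independent clouds produce identical results that can be compared bit by bit.

```python
# Sketch: making a randomized task deterministic by fixing a seed, so
# that two independent clouds produce byte-identical, comparable results.
import random

def randomized_task(data, seed):
    rng = random.Random(seed)      # the PRNG seed is part of the input
    sample = rng.sample(data, 3)   # hypothetical "random" computation
    return sorted(sample)

cloud1 = randomized_task(list(range(100)), seed=42)
cloud2 = randomized_task(list(range(100)), seed=42)
assert cloud1 == cloud2            # identical inputs give identical results
```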

The Prisoner’s Contract is where we started. The contract is designed for outsourcing and needs to be signed by a client and two clouds. Informally, the contract requires each cloud to pay a deposit before it can take the job. The deposit, of course, needs to be large enough (we derive a lower bound in our paper). Then the clouds get the task to compute, and each returns a result before the deadline. An honest cloud will be paid a “salary” for the computation; a cheating cloud (if caught) will be punished by losing its deposit; and if one cheats while the other is honest, the honest cloud will get an additional reward (from the cheater’s deposit). In cases where the client cannot decide who is honest, the trusted party will be called to resolve the dispute. The cost of dispute resolution is always borne by the cheating cloud(s), from the deposit(s). This means that the client’s cost is bounded and will never be more than the two salaries, even in the unlikely case that the trusted party has to be involved.
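The settlement rules just described can be sketched as follows (a simplification with hypothetical amounts; the paper derives the actual bounds on the deposit and reward, and this sketch assumes a lone cheater is always caught, by cross-checking or by the trusted party):

```python
# Simplified settlement of the Prisoner's contract (illustrative amounts;
# assumes a lone cheater is always detected, and that the dispute cost
# comes out of the cheater's deposit).
def prisoners_settlement(c1_honest, c2_honest, salary=10, deposit=50, reward=20):
    def payoff(honest, other_honest):
        if honest:
            # salary, plus a reward from the cheater's deposit if the other cheated
            return salary + (0 if other_honest else reward)
        return -deposit    # a caught cheater forfeits its deposit
    return payoff(c1_honest, c2_honest), payoff(c2_honest, c1_honest)

assert prisoners_settlement(True, True) == (10, 10)
assert prisoners_settlement(True, False) == (30, -50)
```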

What is the consequence of the contract? The highest payoff each cloud can get comes from the case where it is honest and the other cheats. What does that mean? Let us play the roles of the clouds, you and me:

  • Me: let’s collude and cheat together!

What would you do?

  • A: decline to collude
  • B: collude with me and send the agreed wrong result
  • C: agree to collude but later remain honest
  • D: decline and try to cheat

If your choice is A, you are a good person, and your honest behaviour will force me to behave honestly, because I will be punished if I cheat and you do not cooperate.

If your choice is B, you are too naive. Of course collusion, if it succeeds, will lead to a higher profit (than being honest) for you. But have you ever considered the possibility that I am lying? I can take advantage of your trust and later send the correct result to get a higher profit.

If your choice is C, you are a smart badass, with a sense of “business acumen”. This is actually a choice that is no worse than A and could lead to the best payoff for you in the game (if I am naive or mad).

If your choice is D (I hope not), you are dangerous, because this choice makes no sense (it is the worst thing you could possibly do); whoever chooses it must be out of his mind.

Anyway, you should get the idea and can understand the game presented below:

C1 and C2 are the two clouds; the labels on the edges are the actions they can take: f(x) means sending the correct result, r means sending the agreed wrong result, and other means any other action. Below the leaf nodes are the payoffs: u1 is the payoff of C1 and u2 is the payoff of C2. No need to pay attention to the payoffs now; you can find how they are derived in the paper. The bold edges show the best choices of the players (C1 and C2). A bold path from the root to a leaf node is an equilibrium: a stable state in which neither party wants to change its strategy if the other does not.

In the game, there is only one equilibrium, and in it the choice of each party is strictly better than its other choices (a dominant strategy). In the equilibrium, both clouds play honestly because:

  • If no one asks me to collude, being honest leads to the best payoff.
  • If the other cloud asks me to collude, then I should agree to collude but later remain honest, in order to get the highest payoff.
  • Anyway, behaving honestly is always the best choice.

If you are familiar with game theory, you will have noticed that this is essentially a Prisoner’s Dilemma. For both clouds, the reasoning is the same and both will stay honest. If both clouds stay honest, everyone is happy and dispute resolution is not needed.

So far so good, and it seems that we have solved the problem. Unfortunately, no. In the next post, you will see how the clouds can use another contract to change the game completely and make collusion the best choice, and how we solve this problem.

Smart Counter-Collusion Contracts for Verifiable Cloud Computing (Prologue)

People say smart contracts are the next big thing in the blockchain space. In the simplest terms, a smart contract is a piece of program stored and executed on the blockchain. The fancy things about a smart contract are that its execution is (or will be) always correct (if you believe in the consensus protocol that maintains the blockchain), it is self-enforcing (executed and enforced by peers), it is trustless (no central authority), and it is cheap to use. It sounds so good, but what can smart contracts actually do? Of course, we want something more than ICOs. And this is what I will write about.

A Short Summary in case you are impatient: we use smart contracts to implement mechanisms designed using game theory, to enable cost-effective verifiable cloud computing. The paper (co-authored with Yilei Wang, Amjad Aldweesh, Patrick McCorry, Aad van Moorsel) was presented earlier this month at CCS 2017, and here are the full paper and slides.

The Need for Verifiable Cloud Computing comes from distrust. Everyone using cloud computing probably knows that “the cloud” is just a bunch of computers belonging to someone else. So when I outsource something to the cloud, how can I be sure it is done properly? In current practice, I cannot. You can imagine how annoying this is when the outsourced computation matters to me. This is not to say the clouds are malicious; it is simply a consequence of uncertainty: I do not know exactly what happens in the clouds, and I have no control over it either. So the best I can do, as a matter of due diligence, is not to trust the clouds and to verify all results they return. But how? Verification can be as expensive as recomputing the task, and I might not have the resources to do that (and if I had, I could avoid using the cloud in the first place by computing the task myself).

The Current State of verifiable computing is more or less divided into two streams: some verify using cryptography, others using replication. In the cryptography-based approach, the cloud must generate a proof that the computation was done correctly. Cryptography ensures that, unless our beloved cryptographic assumptions are wrong, the cloud cannot generate a valid proof if the computation is wrong. By checking the proof, I can be assured of the correctness of the computation. In the replication-based approach, I give the same task to several clouds, later collect the results, and cross-check them. If the results from all replicas match, I can assert with high confidence that the computation was done correctly. Of course, the more replicas I use, the more reliable my assertion. More replicas can also help me find the correct result, should something go wrong in some replicas.

What is Missing in all existing verifiable computing techniques is a sense of economy. Surely they are technically sound, but at an unaffordable price. The problem is that the cloud is not free: you pay for what you compute. Generating a cryptographic proof is much more expensive than you would think; currently, the overhead is 3–6 orders of magnitude more than the computation being verified. Simple primary school math:

  • The costs of my computation: £500 per month
  • The costs of getting the proofs: £500 * 1000 = half a million per month
  • What I get: bankruptcy and out of business

For the replication-based approach, since I have to pay each of the replicas, the cost is blown up by a factor equal to the number of replicas. It soon becomes unaffordable as that factor grows.
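Putting the two cost models side by side, using the figures from the text (with a 10^3 overhead, the optimistic end of the 3–6 orders of magnitude):

```python
# Back-of-the-envelope comparison of verification costs (figures from
# the text; the proof overhead uses the optimistic end of the range).
monthly_compute = 500                    # £ per month for the computation
proof_overhead = 1_000                   # 10^3, low end of 3-6 orders of magnitude
crypto_cost = monthly_compute * proof_overhead
replication_cost = monthly_compute * 2   # two replicas

assert crypto_cost == 500_000            # half a million pounds per month
assert replication_cost == 1_000         # merely double the original bill
```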

One reason, perhaps the most important one, why people want to use cloud computing is cost saving. When there is no cost advantage over an on-premises IT infrastructure, over which you have control and don’t need to worry much about correctness, many would not be that keen on the cloud.

The Question then is: can we have cost-effective verifiable cloud computing after all? Well, for the cryptography-based approach, I am a bit pessimistic. The gap is just too big; unless there is a big breakthrough, we won’t be able to use it in practice in the near future. For the replication-based approach, there might be some hope, if the number of replicas we pay for is small. How small can the number be? The least is 2. In fact, that might work. The premise of cloud computing is that it is cheaper than your own trusted on-premises IT infrastructure. “Two cloud replicas” means doubling the cost, which may not differ much from, or may even be lower than, the cost of your trusted infrastructure. Given the other good qualities cloud computing possesses, people would have the motivation to use the cloud.

This is straightforward, so why has no one come up with something like it? Let us forget all the engineering difficulties such as synchronization, replication, latency, etc., and focus on the idea. It has a fatal weakness: collusion. In the replication-based approach, verification is done by comparing the results. What if the two clouds collude and give you the same wrong result? You know nothing and you cannot verify anything. Can the clouds collude? Of course they can. Remember, it is not about whether they will collude; it is about whether you believe they will collude. You do not trust the clouds, so collusion is a threat to you. In the face of collusion, verification based on 2 replicas is insecure.
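A sketch of what 2-replica verification does, and why collusion defeats it (the helper below is hypothetical, for illustration):

```python
# 2-replica verification: accept only if the two results agree. Two
# colluding clouds that return the same wrong result pass the check
# undetected, which is the weakness described above.
def verify_by_replication(result1, result2):
    if result1 != result2:
        return None       # mismatch: escalate (e.g., to a trusted party)
    return result1        # agreement: accept, possibly wrongly

assert verify_by_replication(42, 42) == 42
assert verify_by_replication(42, 41) is None
assert verify_by_replication(13, 13) == 13   # agreed wrong result slips through
```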

How to Prevent Collusion is then our objective. The technical details will follow. A spoiler from the abstract of the paper: a client “uses smart contracts to stimulate tension, betrayal and distrust between the clouds, so that rational clouds will not collude and cheat”.

DRE-ip: A Verifiable E-Voting Scheme without Tallying Authorities

Next week in ESORICS 2016, I will be presenting our paper (co-authored with Feng Hao) on a new e-voting system which we call DRE-ip. The system is designed for e-voting in supervised environments, that is, in polling stations, but alternative implementations of the underlying idea can also be used for remote e-voting (sometimes called i-voting for internet voting).

DRE-ip is an end-to-end verifiable e-voting system that guarantees vote secrecy. These properties are similar to those provided by state-of-the-art verifiable systems like VoteBox, STAR-Vote, and vVote, designed to be used in elections in the US and Australia. Crucially, however, DRE-ip achieves these properties without requiring trusted tallying authorities: entities holding the decryption keys to the encrypted ballots.

In almost all systems with tallying authorities, the votes are encrypted to provide vote secrecy, and the encrypted ballots are posted on a publicly accessible bulletin board to enable vote verification. In some systems, the votes are shuffled (using mix-nets) before the tallying authorities decrypt them individually; in others, they are aggregated (using homomorphic encryption) before decryption, and the tallying authorities only decrypt the tally. Both techniques protect vote secrecy from the tallying authorities themselves. However, there is nothing to prevent the tallying authorities from getting together and decrypting the ballots on the bulletin board, and even worse, there is no way to detect whether this has happened. So at the end of the day, we are trusting the tallying authorities for vote secrecy.

DRE-ip is based on a simple observation: if a message is encrypted using randomness r, the ciphertext can be decrypted using either the secret key or the randomness r. Now imagine a situation where multiple messages are encrypted and, say, we are interested in finding the sum of these messages. One way would be to decrypt the ciphertexts individually and then compute the sum. Another way, if we use homomorphic encryption, would be to aggregate the ciphertexts first and then decrypt the encrypted sum. These two ways are what other systems do. But our observation tells us that there is a third way: whoever encrypts the messages can keep an aggregation of all the randomness used in the encryptions and release it at some point, which enables decrypting the sum of the messages. DRE-ip is built on top of this observation.
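The “third way” can be demonstrated in a few lines (a toy group with tiny, illustrative parameters; real instantiations use large groups, and this sketch omits the well-formedness proofs a real ballot needs):

```python
# Toy demonstration of the DRE-ip observation: each ballot is "encrypted"
# as g^v * h^r; publishing only the aggregate randomness R = sum(r) lets
# anyone verify the announced tally T = sum(v) without any secret key.
import secrets

p, q = 2039, 1019            # p = 2q + 1, both prime (illustrative only)
g, h = 4, 9                  # generators of the order-q subgroup of Z_p*

votes = [1, 0, 1, 1, 0]      # 1 = candidate A, 0 = candidate B
rands = [secrets.randbelow(q) for _ in votes]
ballots = [pow(g, v, p) * pow(h, r, p) % p for v, r in zip(votes, rands)]

T = sum(votes)               # tally announced at the end of the election
R = sum(rands) % q           # aggregate randomness released by the machine

# Public verification: the product of all ballots must equal g^T * h^R.
product = 1
for b in ballots:
    product = product * b % p
assert product == pow(g, T, p) * pow(h, R, p) % p
```

Individual ballots stay hidden (each r is never revealed on its own); only their aggregate is opened, and only the sum of the votes becomes verifiable.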

In DRE-ip, the direct-recording electronic (DRE) voting machine that captures and encrypts the votes also keeps an aggregation of the randomness used in the encryptions, and at the end of the election releases this value to the bulletin board along with the announced tally. This enables the public to verify the tally. No secret keys are involved in verifying the tallying integrity, and hence no tallying authorities are required. In fact, the system is set up in such a way that no one knows the secret key of the encryption scheme, which means that no one can decrypt individual ballots. The election tally is the only information that can be verified given the encrypted ballots, and this computation is public.

Having the idea is perhaps the easy part; the main work is to design the system carefully so that it provides full end-to-end verifiability while, at the same time, one can argue rigorously that it guarantees ballot secrecy. In the paper we give proofs of why using encryption in such a way is secure.

DRE-ip achieves levels of security comparable to those of state-of-the-art systems, but crucially with one fewer group of trusted authorities. To argue the significance of this, it suffices to quote Ross Anderson’s definition of a trusted third party:

A Trusted Third Party is a third party that can break your security policy.

A week in Darmstadt – Our Attendance in Security and Privacy Week (SPW’16)

Being at a world-class conference is always exciting. Now imagine that Security and Privacy Week is a conjunction of several well-known conferences that happen within one intensive week of parallel sessions: that triples the excitement! This year, the event was held in Darmstadt, Germany, and lasted a whole week, from July 18 to July 22. The major conferences involved were as follows:

  • Wisec, from July 18 to July 20
  • PETS, from July 19 to July 22
  • IFIPTM, from July 18 to July 22

Alongside these, the week included a number of parallel workshops such as EuroSec, ECRYPT, PLLS, SPMED, Infer, The Dark Side of Digitization, CrossFyre and HotPETs. Our main reason for attending was to present our paper published at EuroSec’16 about users’ perception of sensors in modern smartphones. Fortunately, our presentation and paper were well received, and parts of our results were used in the PETS keynote presentation by Angela Sasse.

I attended several talks at the conference, but some of the keynotes stand out in my notes. The PETS keynote, presented by Angela Sasse of UCL and entitled “Goodbye Passwords, Hello Biometrics – Do We Understand the Privacy Implications?”, discussed the new challenges posed by biometric authentication on mobile phones and other devices around us. Another keynote, for WiSec, “The Ultimate Frontier for Privacy and Security: Medicine” by Jean-Pierre Hubaux from EPFL, explained the vital importance of genetic sequences and the inadequate attention from security researchers to protecting this data. He argued that large-scale sequencing is still a challenge for attackers, but the lack of sufficient protection in the infrastructure could help them attack easily as soon as they have enough power. An interesting piece of information from his talk was the US “Wall of Shame” website: American medical institutions that breach the records of more than a certain number of patients must announce the breach publicly on this website to get a reduction in their penalties. He also pointed out the key differences between medical research and security research. The DigiDark keynotes were important in bringing governmental viewpoints to the new challenges of security. Susan Landau (Worcester Polytechnic Institute) presented “Crypto Wars: The Apple iPhone and the FBI”, giving a brief history of previous battles between governments and big companies over access to their data; she specifically emphasized that the recent Apple vs FBI court case over access to the San Bernardino attacker’s phone could potentially open the door to new legislation for such access in the future. She also highlighted the importance of current cyber-wars and discussed the strategies that could be involved in such a digital conflict.
The last keynote I attended was “Networks of ‘Things’ – Demystifying IoT” by Jeff Voas (National Institute of Standards and Technology). He discussed the lack of standard documentation on the principles of the IoT and mentioned NIST’s attempts to fill this gap, announcing that the official draft of the NIST documentation is now available for download here. He also argued that the correct term for IoT is “Network of Things”, because it fits the nature of the concept better. I somehow agree with him in principle, but I still think “Internet of Things” is cooler!

Furthermore, numerous interesting pieces of research were presented during the week, some with quite novel ideas. Miro Enev et al. from the University of Washington proposed the idea of “Automobile Driver Fingerprinting”: they recorded the sensors embedded in a modern automobile while it was driven by different drivers in various circumstances, and extracted a fingerprint of each driving style from the sensor records. Their research showed that brake usage is the most distinguishing feature of driving style. Their proposal has already gained attention in well-known blogs, such as here, here or here. In another piece of research, Vijay Sivaraman et al. from the University of New South Wales proposed a novel attack on smart-home sensors that leverages the lack of authentication in the Universal Plug and Play (UPnP) protocol: they managed to intrude into a smart home and control its devices based on this vulnerability. The talks mentioned above are only highlights of what happened at SPW’16. A list of all the talks, including the reviewers’ opinions, is available here. Our team members discussed the talks they attended on our wiki page, here. Furthermore, Ross Anderson’s live reports of the talks can be reached via here, and more relevant tweets can be found under #PETS16 and #SPW2016 on Twitter.

Apart from the exciting research presented at the conference, I have to confess it was a very interesting week for me in other respects too. Most importantly, I got to meet a mix of intelligent people from industry, government and, of course, academia; the number of attendees from industry and government surprised me. The organizers provided various social events, including a delightful evening in a Bavarian beer garden and a dinner gathering at Frankenstein Castle, both near Darmstadt. The only noteworthy drawback of the whole event was the mobile app they developed: users needed an internet connection to see the conference schedule, which is not ideal for travellers who depend on roaming services that might not always be available. Excluding this, it was an excellent experience and a great chance to meet talented people from all around the world!

ISO/IEC SC 27, NSA, Shrimp and Crab – A Trip Report

The 52nd meeting of ISO/IEC SC 27 was held last week, 11–15 April, in the beautiful city of Tampa, Florida, USA. Many readers may not be familiar with ISO/IEC SC 27 and the security standards it develops, so in this post I will provide a brief overview of SC 27, its organizational structure, and the process a new technique goes through to become part of the ISO/IEC SC 27 standards. I will also give a short account of some discussions that occurred in Working Group 2, of which I am a member.

SC 27 is a technical subcommittee under ISO and IEC, with the specific mission of standardizing security-related techniques. It is further divided into five working groups (WGs) with different working areas:

  • WG1: on information security management systems
  • WG2: on cryptography and security mechanisms
  • WG3: on security evaluation, testing and specification
  • WG4: on security controls and services
  • WG5: on identity management and privacy technologies

To standardize a new security technique, there are multiple stages to go through. A typical process is summarised as follows (also see the ISO/IEC stage codes): Study Period (SP) -> Working Draft (WD) -> Committee Draft (CD) -> Draft International Standard (DIS) -> Final Draft International Standard (FDIS) -> Final publication. Except for FDIS, all stages are compulsory. There are two ISO/IEC SC 27 meetings every year. In the six months between meetings, national-body experts are invited to comment on the working documents produced at each of the above stages. Comments are then discussed at the subsequent meeting, and hopefully resolved to everyone’s satisfaction. If the document is considered stable (e.g., the comments received are mainly editorial, and the technical comments are few and trivial), it can move on to the next stage, e.g., from the 1st Working Draft to the 1st CD; otherwise, it remains at the same stage with a new version of the document, i.e., the 1st WD becomes the 2nd WD. The new document is then circulated among national-body experts, with another cycle of comments discussed at the next meeting. Standardizing a new technique typically takes at least 3–4 years.

There are several criteria for deciding whether to include a new technique in the ISO/IEC standards. The first is the maturity of the technique: it is generally expected that the proposed technique has been published for a number of years (typically at least 4–5), that it is in a stable state, and that it has received a sufficient level of cryptanalysis with no weaknesses found. The second is industrial need: whether there is a practical demand for the technique to be standardized. Finally, it is considered desirable if the technique comes with security proofs. But “security proofs” can be a very tricky thing, as different people interpret what a “proof” should look like in different ways. Usually, the best security proof is still the proof of “time”, which is why the proposed technique should have been published for a number of years before it can be considered for standardization.

The ISO/IEC SC 27 standardization process may look dauntingly formal and lengthy, but once you get a grip of it, you will find it is actually easier than it looks. I started attending the ISO/IEC SC 27 meetings as a member of the UK national body delegation in April 2014 at the Hong Kong meeting, where I first presented J-PAKE to ISO/IEC SC 27 WG 2 for inclusion in ISO/IEC 11770-4. The J-PAKE paper was first published at SPW’08 in April 2008, so it was 6 years old when I first presented it to ISO/IEC. We were open about the public analysis results of J-PAKE, and a full discussion track record was publicly viewable at the Cambridge Blog. My presentation was well received at that first meeting, and it was agreed to start a study period and call for contributions from national bodies to comment on the possible inclusion of J-PAKE into ISO/IEC SC 27. Comments were discussed at the next meeting (Oct 2014), where the national body delegates unanimously agreed to start the 1st working draft for the inclusion of J-PAKE into ISO/IEC 11770-4, with me and another member of WG 2 appointed as the editors. After two working drafts, the J-PAKE proposal was formally accepted by ISO/IEC SC 27 at the Jaipur meeting (Oct 2015); this was the 1st CD stage. At this Tampa meeting, all comments received on the 1st CD of ISO/IEC 11770-4 were discussed and resolved, and it was agreed that we would proceed to the DIS stage. It is expected that the final ISO/IEC 11770-4 standard that includes J-PAKE will be published in 2017, so it will take approximately 3 years in total.

Attending ISO/IEC SC 27 has been a great personal experience. It is different from the usual academic conferences in that the majority of attendees are from industry and are keen to solve real-world security problems. Not many academics attend SC 27, though. One main reason is funding: attending two overseas meetings a year is quite a financial commitment. Fortunately, in the UK, universities are starting to pay more attention to research impact (a new assessment category first introduced in the 2014 Research Excellence Framework). Research impact concerns the effect of research on industry and society (i.e., how the research actually benefits society and changes the world, rather than how many citations it gets). I was fortunate and grateful that the CS faculty in my university decided to support my travel. Newcastle University CS did very well in REF 2014, ranking 1st in the UK for research impact, and hopefully it will continue to do well in the next REF. The development of an ISO/IEC standard for J-PAKE may help make a new impact case for REF 2020.

Tampa is a very beautiful city and it was such a long journey to get there. I would fail my university’s sponsorship if I stop here without sharing experience about other happenings in SC 27 Working Group 2.

In the Tampa meeting, one work item in WG 2 attracted a significant attention and heated debates. That item was about the NSA proposal to include SIMON and SPECK, two lightweight block ciphers designed by NSA, into the ISO/IEC 29192-2 standard.

Starting from the Mexico meeting in Oct 2014, the NSA delegate presented a proposal to include SIMON and SPEKE into the ISO/IEC 29192-2 standard. The two ciphers are designed to be lightweight, and are considered particularly suitable for Internet-of-Things (IoT) applications (e.g., ciphers used in light bulks, door locks etc). The proposal was discussed again in the subsequent Kuching meeting in April 2015, and a study period was initiated. The comments received during the study period were discussed in the subsequent Jaipur meeting in Oct 2015. There was a substantial disagreement among the delegates on whether to include SIMON and SPECK. In the end, it was decided to go for a straw poll by nation bodies (which rarely happened in WG 2). The outcome was a small majority (3 Yes, 2 No, all other countries abstained) to support the start of the first working draft, and meanwhile, continuing the study period and call for contributions to solicit more comments from national bodies. (However, the straw poll procedure at the Jaipur meeting was disputed to be invalid six months later in the Tampa meeting, as I’ll explain later.)

Comments on the 1st WD of SIMON and SPEKE were discussed in this Tampa meeting. Unsurprisingly, this, again, led to another long debate. The originally scheduled 45 minute session had to be stretched to nearly 2 hours. Most of the arguments were on technical aspects of the two ciphers. In summary, there were three main concerns.

First, the NSA proposal of SPECK includes a variant that has a block size of only 48 bits. This size was considered too small by crypto experts in WG 2. Some experts worry that a powerful attacker might perform pre-computation to identify optimal search paths. The pre-computation (2^48) is way beyond the capability of an ordinary attacker, but should be within the reach of a state-funded adversary. Also, the small block size makes key refreshing problematic. Due to the birthday paradox, a single key should not be used for encrypting more than 2^24 blocks, which is a rather small number. This bound is further reduced under the multi-user setting as some experts pointed out.

Second, the SIMON and SPECK ciphers were considered too young. The ciphers were first published in IACR eprint in June 2013 (yes, technical reports on IACR eprint are considered an acceptable form of publication according to ISO/IEC). When NSA first proposed them to ISO/IEC for standardization in the Mexico meeting (Oct 2014), the two ciphers were only 1 year and 4 months old. Both SIMON and SPECK are built on  ARJ, which is a relatively new technique. The public understanding of security properties of ARJ is limited, as acknowledged by many experts in the meeting.

Third, the public analysis on SIMON and SPECK was not considered sufficient. The supporting argument from NSA in the Jaipur meeting (Oct 2015) was that the cryptanalysis results on SIMON and SPECK had “reached a a plateau”. However, within the next 6 months, there has been progress on the analysis, especially on SPECK (see the latest paper in 2016 due to Song et al). Hence, the argument of reaching a plateau is no longer valid. Instead, in the Tampa meeting, NSA argued that the cryptanalysis results became “more uniformly distributed” — i.e., now the safety margins for all SIMON and SPECK variants, as measured against the best known public analysis, are roughly centered around 30%, while 6 months ago, the safety margins for some SPECK variants were as high as 46%.

Most of the arguments in the meeting were technical, however, it was inevitable that the trustworthiness of the origin of SIMON and SPECK was called into question. There have been plenty of reports that allege the NSA involvement in inserting backdoors in security products and compromising security standards. The infamous random number generator algorithm, Dual_EC_DRBG, was once added into ISO/IEC 18031:2005 as proposed by NSA delegates, but later had to be removed when the news about the potential backdoor in Dual_EC_DRBG broke out. In this meeting, NSA delegates repeatedly reminded experts in WG 2 that they must judge the inclusion of a proposal based on the technical merits not where it came from. This was met with scepticism and distrust by some people.

Given the previous troubles with NSA proposals, some experts demanded that the designers of SIMON and SPECK should show security proofs, in particular, proofs that no backdoor exists. This request was reputed by the NSA delegate as technically impossible. One can point out the existence of a backdoor (if any), but proving the absence of it in a block cipher design is impossible. No block ciphers in the existing ISO/IEC standards have this kind of proofs, as the NSA delegate argued.

When all parties made their points, and the arguments became a circling repetition, it was clear that another straw poll was the only way out. This time, the straw poll was conducted among the experts in the room. On whether to support the SIMON and SPECK to proceed to the next (CD) stage, the outcome was an overwhelming objection (8 Yes, 16 No, the rest abstained).

Near the end of the Tampa meeting, it also merged that the straw poll in the previous Jaipur meeting on starting the 1st working draft for SIMON/SPECK was disputed to be invalid. According to the ISO/IEC directive, the straw poll should have been done among the experts present in the meeting rather than the national bodies. The difference is that in the latter there can only be one vote per country, while in the former, there can be many votes per country. In the Tampa meeting, it was suggested to redo the straw poll of the Jaipur meeting among experts, but the motion was rejected by NSA on the grounds that there were not enough experts in the room. This matter is being escalated to the upper management of ISO/IEC SC 27 for a resolution. At the time of writing this blog, this dispute remains still unresolved.

OK. That’s enough for the trip report. After a long and busy day of meetings, how to spend the rest of the day? A nice dinner with friends, and some beers, should be deserved.

2016-04-14 19.40.122016-04-14 19.40.072016-04-14 19.39.56

On the Trust of Trusted Computing in the Post-Snowden Age

At the 8th CSF Workshop on Analysis of Security API (13 July, 2015), I presented a talk entitled “on the Trusted of Trusted Computing in the Post-Snowden Age”. The workshop produces no formal proceedings, but you can find the abstract here. My slides are available here.

In the field of Trusted Computing (TC),  people often take “trust” for granted. When secrets and software are encapsulated within a tamper resistant device, users are always educated to trust the device. The level of trust is sometimes boosted by the acquisition of a certificate (based on common criteria or FIPS).

However, such “trust” can be easily misused to break security. In the talk, I used TPM as an example. Suppose TPM is used to implement secure data encryption/decryption. A standard-compliant implementation will be trivially subject to the following attack, which I call the “trap-door attack”. The TPM first compresses the data before encryption, so that it can use the saved space to insert a trap-door block in the ciphertext. The trap-door block contains the decryption key wrapped by the attacker’s key. The existence of such a trap-door is totally undetectable so long as the encryption algorithms are semantically secure (and they should be).

trapdoor_encTo the best of my knowledge, no one has mentioned this kind of attack in the literature. But if I were NSA, this would be the sort attack that I would consider first. It is much more cost-effective than investing on quantum computers or parallel search machines. With the backing of the state, NSA could coerce (and bribe) the manufacturer to implant this in the TPM. No one will be able to find out since the software is encapsulated within the hardware and protected by the tamper resistance. In return, NSA would have the exclusive power to read all encrypted data at a mass scale with trivial efforts in decrypting data.

Is this attack realistic? I would argue yes. In fact, according to Snowden revelations, NSA has already done a similar trick by implanting an insecure random number generator in the RSA products (NSA reportedly paid RSA US$10m). What I have described is a different trick, and there may well be many more similar ones.

This attack highlights the importance of taking into account the threat of a state-funded adversary in the design of a Trusted Computing system in the post-Snowden age. The essence of my presentation is a proposal to change the (universally held) assumption of “trust” in Trusted Computing to “trust-but-verify”. I gave a concrete solution in my talk to show that this proposal is not just a theoretical concept, but is practically feasible. As highlighted in my talk, my proposal constitutes only a small step in a long journey – but it is an important step in the right direction I believe.

Topics about NSA and mass surveillance are always heavy and depressing. So while in Verona (where the CSF workshop was held), I took the opportunity to tour around the city. It was a pleasant walk with views of well-preserved ancient buildings, the sunny weather (yes, a luxury for someone coming from the UK) and familiar sound of cicadas (which I used to hear every summer during my childhood in China).

The Verona Arena is the area that attracts most of the tourists. The conference organizers highly recommended us to attend one of the operas, but I was eventually deterred by the thought of having to sit for 4 hours and listen to a language that I couldn’t understand. So I decided to wander freely. As I entered a second floor of a shop that sold hand-made sewing items, my attention was suddenly drawn by someone who passionately shouted while pointing figures toward outside the window, “Look, that’s the balcony where Juliet and Romeo used to date!” Wow, I was infected by the excitement and quickly took a photo. In the photo below, you can see the Juliet Statue next to the balcony. (Of course a logical mind will question how this dating is possible given that the two people are just fictional figures, but why should anyone care? It was in Verona, a city of legends.)


Perils of an Unregulated Global Virtual Currency

We (Dylan Clarke, Patrick McCorry and myself) recently presented a position paper at the 23rd Security Protocols Workshop (SPW) in Cambridge. Our paper, titled Bitcoin: Perils of an Unregulated Global P2P Currency, makes the case that the ideological and design choices that define Bitcoin’s strengths are also directly responsible for the Bitcoin-related crime that we encounter in the news so often today.

In a nutshell: Bitcoin’s anonymity and lack of regulation are key to freeing users from central banks but they also empower drug dealers and money launderers. Using virtual assets as money reduces dependence on banks as users can handle their own wealth, but this opens the door to hackers and malware. Mainstreaming an entire global financial infrastructure to trade virtual assets cuts banks out of the picture entirely, but also de-risks crime, exposes users to threats from all over the world and opens a Pandora’s box of ethical and legal dilemmas.

We do a quick survey of the landscape of Bitcoin-related crime and observe that crime is thriving with rapid growth and increasing sophistication. Dark markets are taken down often but they continue to grow in numbers and volume. Bitcoin also de-risks crime: drugs can be ordered almost as easily as pizza, and criminals no longer need to take the risks traditionally associated with setting up and protecting illicit financial flows. Bitcoin exchanges are a regular target for hackers and customers routinely end up losing their coins. Malware that steals bitcoins from victim’s computers is booming. The ransomware industry is also thriving. In a short space of three years, CryptoLocker and CryptoWall have claimed hundreds of thousands of victims and successfully made tens of millions of dollars. There’s now even a DIY ransomware kit out called Tox – customers download an executable, secretly infect someone’s computer, and then share the ransom with the makers of the kit.

Flipping Bitcoin’s positive strengths also gives us insight to anticipate future threats: Governments and law enforcement are already sounding the alarm that Bitcoin’s anonymity and lack of regulation is ideally suited for tax evasion and money laundering. Non-currency exploits can piggyback on the Bitcoin network infrastructure. Researchers have already demonstrated how to deliver malware and operate botnets by crafting Bitcoin transactions embedded with malicious payloads.

There are no easy answers to this. If Bitcoin becomes ubiquitous, this will be the new normal. It is not possible to ‘tweak’ Bitcoin to make the negatives go away without affecting its key strengths. This is similar to the TOR dilemma – i.e. an anonymity network for activists living under repressive regimes will also empower hate speech and illegal pornography. This tradeoff, for Bitcoin, has yet to be explicitly acknowledged.

This theme – that we must recognize certain security threats do not have solutions in the technological domain – emerged periodically on the three days in the workshop in talks on disparate topics, including browser fingerprinting, TOR deployment and software design.

Apart from that, it was good weather in Cambridge. This was my first time at SPW, this particular workshop was hugely inspirational during my own PhD, and I was very excited to participate in it for the first time. The food was spectacular. A big and surprising highlight was – I’m a big fan of Lord Protector Oliver Cromwell – and during the course of the workshop I discovered not only did he study in the college where our workshop was being conducted, Sidney Sussex college – but even more astounding – that Oliver Cromwell’s head was buried in a room right next to where we were convening. (Cromwell died in 1658, his body was disinterred after the British monarchy was restored in 1659 and was hung and decapitated. The head passed into the hands of private collectors and was finally secretly buried in Sidney Sussex College in 1960).

Plaque marking burial site of Oliver Cromwell's head in Sidney Sussex College, Cambridge

The technical report for our paper can be found here and all SPW talks are liveblogged here (courtesy of Ross Anderson).

Every Vote Counts: Ensuring Integrity in Large-Scale Electronic Voting

Last week, at USENIX EVT/WOTE’14, in the beautiful city of San Diego, I presented a paper that was jointly co-authored with my former colleague at Thales (Mr Matthew Kreeger) and colleagues at Newcastle University (Prof Brian Randell, Dr Dylan Clarke, Dr Siamak Shahandashti, Peter Hyun-Jeen Lee). The title of our joint paper is “Every Vote Counts: Ensuring Integrity in Large-Scale Electronic Voting” (presentation slides here).

In this paper, we first highlight a significant gap in the e-voting research field that many people seem to have ignored: while the End-to-End (E2E) e-voting systems have been extensively researched for over twenty years and have been commonly heralded as a rescuer to many controversies in e-voting, in practice few of those systems have actually been implemented and almost none of them used in real-world national elections.

We are motivated to find out the root cause and to narrow the gap. Our hypothesis is that the existing E2E systems’ universal dependence on a set of tallying authorities (who are assumed to be from parties of conflicting interests, be expert in cryptographic key management and be expert in computing) presents a significant hurdle towards the practical deployment of those systems.

We then show that the involvement of tallying authorities is not strictly necessary at least in some election scenarios. In particular, we focus on DRE-based (Direct Recording Electronic) elections conducted at supervised polling stations. This is perhaps the most common election scenario in national elections around the world, e.g., USA, India and Brazil.  We present a new cryptographic voting protocol called Direct Recording Electronic with Integrity (DRE-i). The DRE-i protocol provides the same E2E verifiability as other E2E voting protocols, but without involving any tallying authorities. Hence, the system is “self-enforcing”. By comparing with related E2E protocols that are dependent on tallying authorities, we demonstrate that a self-enforcing e-voting system is significantly simpler, earlier to implement, more efficient and has better usability – all of this is achieved without degrading security.

We welcome interested readers to scrutinize our paper, point out any error or discrepancy that you can find, and feel free to write your feedback in the “Comments” below.

Quantitative Workflow Resiliency

John Mace will present at ESORICS 2014 some joint work with Aad van Moorsel and myself, on the problem of quantitative workflow resiliency.

Intuitively, a workflow is a partially ordered set of tasks, such that if a task t is less than a task t’, then t must be executed before t’. Each task is executed by a user, and different tasks can be executed by different users. For instance, consider the reviewing process for a journal: the paper is assigned to some reviewers by an editor, then each reviewer submits a review for the paper, and the editor makes a decision. Clearly, the decision can only be made after each review has been submitted, and each review can only be submitted after the assignment has been done. However, the different reviews can be submitted in any order.

A security policy for a workflow consists of multiple constraints over which users can execute which tasks.  For instance, some tasks can only be executed by some users (e.g., only an editor can assign a reviewer), some tasks are bound by a separation of duty (e.g., someone submitting a paper cannot be reviewing that paper), while other tasks are bound by a binding of duty (e.g., the editor making the final decision must be the same than the editor in charge of the initial reviewer assignment). Assigning each task to a user while respecting the security policy is known as the Workflow Satisfiability Problem (WSP), and we refer to “On the Parameterized Complexity and Kernelization of the Workflow Satisfiability Problem”, by Crampton, Gutin and Yeo, from Royal Holloway, for a recent paper on the WSP (Jason Crampton is giving a Colloquium at Newcastle University on the 22/07 about this line of work).

To some extent, the WSP is rather static: the workflow is analysed together with its policy, and either a correct assignment is found, in which case the workflow can be executed, otherwise there is no possible assignment. However, in practice, the availability of the different users can vary from the start of the workflow until the end. For instance, a reviewer can become unavailable after being assigned a review, or the editor in charge of the paper at the beginning could develop a conflict-of-interest with the authors during the process. Wang and Li introduced the notion of workflow resiliencyroughly speaking, a workflow is N-resilient if the WSP can still be solved while N users becoming unavailable. Resiliency can be static, when users can only become unavailable before the start of the workflow, decremental when users can become unavailable at any point, or dynamic, when users can become unavailable at any point and then potentially become available again. 

Resiliency is a binary notion: a workflow is N-resilient if any subset of N unavailable users blocks it, and not N-resilient otherwise. In the ESORICS paper, we suggest that in practice, user availability is not a strictly deterministic property, but a probabilistic one. Indeed, if we can know in advance that a reviewer is not available at a particular time, then we can simply make that reviewer unqualified for the corresponding task, thus reducing resiliency to solving the WSP; Instead, we might just know that a reviewer has some probability to become unavailable at some point (e.g., sickness, over-scheduling, travelling, etc). In other words, a workflow has some probability to terminate execution, depending on whether some users become unavailable or not.

Now, consider two workflows both satisfying the WSP (i.e., assuming full user availability, it is possible to find a user assignment ensuring that each task is executed), such that, over 100 instances execution, the first one fails to terminate 1 time, while the second fails to terminate 99 times, for a similar user availability. Although both are not resilient, we believe these two workflows should have different measures, since the first one could actually be acceptable in many situations, while the second one could effectively be unusable. Of course, one could argue that, if possible, a workflow terminating in all cases is ideal, but in practice, if we had to choose between the two previous workflows, the first one is likely to be better.

Hence, we define the notion of Quantitative Workflow Resiliency, which measures the likelihood of a workflow to terminate given a probabilistic user availability model, and considering that users can be reassigned at runtime, as long as the policy remains satisfied. More precisely, we encode the problem of selecting a user for a task as a Markov Decision Process, where each process state consists of a context (containing the previous task assignment and the user availability) and a task, and each action corresponds to selecting a user for the task of the state. The transition function govern which task is to be executed next and how the context evolves, in particular concerning user availability. The termination of the workflow is associated with a reward of 1, while any other transition is associated with a reward of 0. The value of the initial state then corresponds to the probability of the workflow to terminate. We also showed that by associating each transition with a reward of 1 and termination with a reward of 0, the value of the initial state corresponds to the expected number of tasks executed by the workflow.

We believe these two metrics are useful to evaluate workflow resiliency, and provide more information than a binary notion of resiliency. In particular, we provide in the paper an  illustrative example, where the assignment optimising termination is different from that optimising the number of tasks, and both are different from that optimising the computation time required at runtime to calculate the assignment. In other words, finding an optimal user assignment might require to solve a tradeoff between termination, task execution and computation time.