Requirements for an Industry Blockchain

We see a large variety of technical blockchain developments on one side and business models tested against the blockchain approach on the other. The two do not always fit. This article lists blockchain features that I consider essential, or at least useful, if blockchain technology is to serve as an industry-grade infrastructure.

There is an increasing blockchain hype out there: new start-ups every day and marketing buzz everywhere. Admittedly, most conference presentations are more knowledgeable than a year ago, but all in all we mostly see two types of contributions:

  • Marketing noise, whenever great ideas for finance, energy, and other sectors are praised. Granted, the blockchain does set free the creativity of business developers, process engineers, and visionaries. But the link to what is actually feasible, i.e. to implementing the technology, is mostly weak. By “implement” we mean going beyond a prototype, which often ignores restrictions such as performance, access rights, or the immature code of the blockchain itself.
  • On the other hand, there are the many core technology experts, annoyed by the above noise, who do not want to throw overboard sacred values such as immutability, transparency, and publicly hosted nodes. But exactly this narrows the usage potential of today's blockchain implementations.

Who should be asked in more detail is the user. And a quite relevant user is industry, which is used to a software and service infrastructure that just works – at least at 99.9 percent. Such infrastructure has to meet a range of industry requirements – the usual “-ilities”: usability, reliability, maintainability, …

It should be clear that we need to think differently when applying the blockchain to industry requirements:

  • In the traditional software development process, users / organisations / consortia analyse their business processes, derive a “to-be” specification and hand it over to the IT department to implement just that. I.e., software follows the business. That is fine, it works, and it has been practised more or less well for over 50 years.
  • With technologies like the blockchain it is all different: it provides high potential if applied “as-is”, i.e. with publicly transparent transaction content, a low-cost hosting infrastructure, automatic node recovery, etc. What won't work, at least not efficiently, is coercing the capabilities of the blockchain into given business requirements. In that case encryption keys would have to be managed, the blockchain itself would need to be managed, adding new nodes may require hours of re-synching with the others, evolution of data content would require a general reset, and a blockchain growing into terabytes may hit system limits quite soon. Businesses would rarely benefit from this.

If we turn the direction around, actual disruption takes place when a new business model is developed that just fits the blockchain: e.g., using it as a notary's document register, using it to track ownership of artifacts, etc. And of course, the mother of all blockchains, Bitcoin itself, is definitely a success story. But its blockchain is built in, restricted, and riddled with weaknesses when it comes to industry requirements.

The introduction of blockchain technology brings new potential but also new restrictions. E.g., we could encrypt data written to the blockchain to establish access restrictions, but that would be very inefficient, and if third parties need access to that data, keys have to be transferred to them through a secure channel – which introduces an additional security threat.

On the other hand, the potential for using the blockchain would increase drastically if some useful features could be added or modified to better match industrial requirements. Such an industry-grade blockchain could increase its versatility by addressing the requirements listed below.

As this is a “living document”, I will add to or extend this list from time to time as further requirements emerge.

Proof of Stake

This is already commonplace in some implementations: instead of wasting Ireland's energy consumption on mining bitcoins – or somewhat less for ether – a more efficient way to create, find consensus on, and distribute new blocks is essential. Any mining effort drives block time up to seconds or even minutes. If the blockchain is used as a platform for communication and collaboration, however, minutes appear archaic in today's world of P2P communication. For industrial blockchain use, a trusted third-party role should be accepted that simply confirms blocks, as is the case, e.g., with Tendermint or Hyperledger. If the consensus effort is reduced to a few milliseconds, block time can be driven down to a second – which creates completely new opportunities for many-to-many communication patterns.
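
To make this concrete, here is a minimal sketch of block confirmation by a fixed validator set, loosely inspired by BFT-style voting as used in Tendermint. The node names and the two-thirds threshold are illustrative assumptions, not any product's actual API.

    # Illustrative sketch: a fixed validator set confirms a block by
    # collecting votes until a two-thirds majority is reached. This is a
    # simplification of BFT-style voting, not code from any real product.
    from dataclasses import dataclass

    @dataclass
    class Block:
        height: int
        payload: bytes

    def confirm(block: Block, validators: list[str], votes: dict[str, bool]) -> bool:
        # A block is final once more than 2/3 of the known validators have
        # voted for it -- no mining involved, so this takes milliseconds.
        approvals = sum(1 for v in validators if votes.get(v, False))
        return approvals * 3 > len(validators) * 2

    validators = ["node-a", "node-b", "node-c", "node-d"]
    votes = {"node-a": True, "node-b": True, "node-c": True}
    print(confirm(Block(1, b"tx-batch"), validators, votes))  # True: 3 of 4 > 2/3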

One implication should be noted: those who see smart contracts as the blockchain's crowning glory may face a restriction here. It is unclear what minimum block time is required to execute a large number of code snippets on the chain. Maybe this can be driven down to a minute or even 30 seconds, but that is still outside the real-time zone.

But, as said – this is a requirement already met today by some blockchain implementations.

Typed Blocks

This idea follows directly from the above analysis. As both smart contracts and high-speed blockchains make sense, would it not be useful to support a generic blockchain layer that allows “purpose block types” to be put on top? For example: blocks for high-speed payment transactions (1 second), blocks for industrial B2B transactions (10-60 seconds), blocks that contain smart contracts between individuals (1-10 minutes). Similar to the data formatting requirement further down, block types may vary in different ways: block time, consensus algorithm, number of transactions per block, content data types, etc.
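
A minimal sketch of what such block type definitions could look like on a hypothetical generic layer; all names and parameter values are illustrative assumptions:

    # Hypothetical "purpose block types" on top of a generic chain layer:
    # each type fixes its own block time, consensus flavour and transaction
    # limit. All names and values are illustrative, not an existing API.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class BlockType:
        name: str
        block_time_s: int     # target seconds between blocks
        consensus: str        # e.g. BFT voting vs. proof of stake
        max_tx_per_block: int

    BLOCK_TYPES = {
        "payment":  BlockType("payment", 1, "bft-vote", 10_000),
        "b2b":      BlockType("b2b", 30, "bft-vote", 1_000),
        "contract": BlockType("contract", 300, "proof-of-stake", 100),
    }

    def block_interval(type_name: str) -> int:
        """Return the block time the generic layer should enforce."""
        return BLOCK_TYPES[type_name].block_time_s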

Sharding

Sharding is the separation of content across subsets of nodes such that not all nodes need to carry the full processing load. It is also the subject of the current discussion of a “blockchain 2.0” infrastructure; see also Vitalik Buterin's Mauve Paper.

Sharding can be understood in different ways:

  • Sharding of smart contracts. Here the main goal is to take the burden off nodes, all of which execute all contract code today, so that the slowest computer is the bottleneck. Sharding is instead a load-balancing approach that distributes load according to the processing capacity of nodes, in such a way that a consensus mechanism still applies to the few nodes that execute a contract in parallel. This feels like an elastic cloud that reacts to new nodes by re-distributing the overall load, leaving each node less burdened.
  • Sharding of content in general. This focuses less on the load from executing smart contracts than on the number of transactions that need to be confirmed. Such a distribution of work could reflect geographical distribution in an IoT scenario or in distributed grid operation in the energy sector: local balancing activities within a power grid in Toscana are not really of interest to local grid stakeholders in Hamburg. But the technological infrastructure may be the same, as may the applications that access the blockchain. I.e., a vendor should be able to sell the same application software to users in Toscana and in Hamburg without it having to adapt to any localities of the infrastructure.
  • Hierarchical blockchain. This is “sharding with a linking layer”. In principle, the regions (i.e. shards) are separated and most participants do not need to tell each other anything. But in a congestion situation, there is a need to link local grids through the hierarchy of distribution and transmission system operators. Not locals, but intermediaries may buy and sell in each region and balance demand and supply across them at a higher level. Such transregional participants need linked local booking zones, so that delivery and settlement for a transaction between them can be performed with the least possible effort.

Think of “Scenario 2030” on page 30 of my book chapter and how hundreds of billions of transactions could be processed each day. If a basic zone consists of 1,000 local participants, with 1,000 such zones per region and 1,000 regions across Europe, this may work from a load perspective if 99% of transactions are executed within each zone or region.
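
A back-of-the-envelope check of these numbers; the per-participant transaction rate is an added assumption:

    # The article's zone/region figures plus an assumed transaction rate.
    participants = 1_000 * 1_000 * 1_000            # 1 billion across Europe
    tx_per_participant_per_day = 200                # assumption
    total_tx = participants * tx_per_participant_per_day  # in the hundreds of billions

    local_share = 0.99                              # stays within the zone
    zones = 1_000 * 1_000                           # 1,000 zones x 1,000 regions

    tx_per_zone = total_tx * local_share / zones
    print(f"{total_tx:.1e} tx/day overall, {tx_per_zone:,.0f} tx/day per zone")
    # -> 2.0e+11 tx/day overall, but only 198,000 tx/day (about 2.3 tx/s)
    #    hitting any single zone -- manageable for a small shard.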

Access rights management

In the blockchain world, this requirement is a contradiction in itself: on the one hand, a blockchain is transparent to an open or closed group of users; on the other, exactly these users do not always want to grant read access to others – or at least not to all others. So the question is how to establish Chinese walls between data objects owned by separate users.

The first response, in the spirit of the above, is to avoid hiding information from each other: forget the traditional business model. As I have written in my book chapter, the business model should fit the technology. Sophie (in the prologue) doesn't care if her neighbours (i.e., “competitors”) know that she earned 230 EnerCoins with her PV roof panel. Today, Vattenfall, E.On, Alpiq, ENEL, CEZ, etc. definitely do care; but if we think of “Scenario 2030” and a perfect market, all these market participants will trade the same products with the same cost structure and the same knowledge of weather forecasts and consumer demand – so they will probably trade energy at the same price anyway.

But if we still do want to prevent data access by other participants – what options do we have?

  • One is to use a distributed database approach as proposed by BigchainDB: replace the file-based blockchain with a distributed database and apply database mechanisms to define user rights and access restrictions.
  • Another is to introduce an access management layer between the “raw” blockchain and the application, which handles content encryption and the distribution of keys between authorised participants. This can be solved in different ways, one being, e.g., MIT's ChainAnchor project, which uses group encryption to allow encrypted content to be disclosed to defined third parties. A simple sketch of such a layer follows below.
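
Here is a minimal sketch of the second option, using off-the-shelf hybrid encryption with Python's cryptography package: the payload is encrypted once with a symmetric content key, which is then wrapped for each authorised reader. This is plain hybrid encryption for illustration, not ChainAnchor's actual group-encryption scheme.

    # Access-management layer sketch: encrypt the payload once, then wrap
    # the content key per authorised reader. Plain hybrid encryption for
    # illustration -- NOT ChainAnchor's group-encryption scheme.
    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    readers = {name: rsa.generate_private_key(public_exponent=65537, key_size=2048)
               for name in ("alice", "bob")}

    content_key = Fernet.generate_key()
    ciphertext = Fernet(content_key).encrypt(b'{"order": 42}')   # written on-chain

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    wrapped = {name: key.public_key().encrypt(content_key, oaep)
               for name, key in readers.items()}                 # per-reader keys

    # An authorised reader unwraps the content key and decrypts:
    alice_key = readers["alice"].decrypt(wrapped["alice"], oaep)
    assert Fernet(alice_key).decrypt(ciphertext) == b'{"order": 42}'

Note that the problem mentioned earlier persists: the wrapped keys still have to reach the readers, either over a secure channel or stored on-chain next to the payload.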

Cut the blockchain history into pieces

“Blockchain” implies an endlessly growing file size. At the time of writing, the bitcoin blockchain is approaching 80 GB with a block rate of just 10 minutes. If block time were reduced to one second, its size would grow 600-fold, i.e., the bitcoin blockchain would exceed 48 terabytes – which is no longer a manageable piece of data. Even worse, synchronising 48 terabytes is a challenge even if it were just copying a file. But synchronisation means more: individual blocks are exchanged and validated again and again whenever a new node is started somewhere. And how long would it take to synch a new node into the existing network? Nobody knows – but for sure too long!

On the other hand: is there really a need to treat historic transactions the same as the most recent ones? Not as long as they can still be verified as being valid transactions within valid blocks. So why not cut the blockchain into smaller pieces? Since it is just a file today, such a separation would not keep nodes from being able to validate the content.

If the blockchain is run privately, it is assumed anyhow that there is a dedicated trusted role. This role may still be shared by a number of node operators who do not fully trust each other, but the blockchain's consensus mechanism can also be applied to chunks of blocks in order to separate them from the live part of the blockchain and authenticate them as a file.

In other words: every new hour, the expired hour's blocks are moved to a file like “2016-09-16-15.bc”, here representing all blocks created between 15:00 and 16:00. Likewise, the file “2016-09-16.bc” could be created after midnight, and monthly and yearly chunks could be created and signed by the node operators.

Such a fragmentation would allow a new (or restarted) node to recover within a minute by synchronising only the live part, i.e. the most recent blocks. The strict two-layered blockchain approach of “transaction” and “block” could thus be relaxed into one transaction layer and N nested block containers – like a Russian doll. This way, a node could easily catch up with its peers and also efficiently load historic data in parallel.
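
A minimal sketch of the hourly archiving step, following the file naming from the example above; block layout is simplified, and the co-signing by node operators is only indicated as a comment:

    # Move the expired hour's blocks into an archive file such as
    # "2016-09-16-15.bc". Block and file layout are simplified for
    # illustration; signing is only indicated as a placeholder.
    import json
    from datetime import datetime, timezone

    def archive_hour(blocks: list[dict], hour_start: datetime) -> str:
        start = hour_start.timestamp()
        expired = [b for b in blocks if start <= b["ts"] < start + 3600]
        filename = hour_start.strftime("%Y-%m-%d-%H") + ".bc"
        with open(filename, "w") as f:
            json.dump(expired, f)
        # ... node operators would now co-sign the file's hash so that a
        # new node can verify the chunk without replaying every block.
        return filename

    blocks = [{"height": 1, "ts": 1474038600.0, "tx": []}]   # 15:10 UTC
    print(archive_hour(blocks, datetime(2016, 9, 16, 15, tzinfo=timezone.utc)))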

Standards for managing permissioned blockchains

Today, it is already a nightmare to do B2B integration among members of a business consortium if they do not use a shared PKI. Verifying public key certificates along certificate hierarchies and across certification authorities still tends to be a challenge – decades after public key cryptography was invented.

However, the blockchain can help here: if public CAs stored their certificates in a public blockchain, including their root certificates and revocations, the public could not only access them in a standard way, but thanks to the blockchain's immutability all information could be obtained in the correct chronological order. No local CA directories would be needed any more, and thanks to near-100% availability – in connection with the fragmentation described above – PKI users could always reach an up-to-date node.

And publishing certificates is just the beginning: standard user profiles could be written to the blockchain as well. Data elements such as IP addresses, participant names and codes, supported XML releases, and communication processes could easily be published within the consortium, so that everyone is informed about updates or new participants.
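
A minimal sketch of such a consortium record; the field names are assumptions for illustration, not an existing standard:

    # An illustrative participant profile as it could be written to the
    # chain. Field names and values are invented for illustration only.
    import hashlib, json

    profile = {
        "type": "participant-profile",
        "participant": "ACME Energy Trading GmbH",   # fictitious member
        "code": "ACME-ET",
        "ip": "203.0.113.17",                        # documentation address
        "xml_releases": ["77.2", "78.0"],
        "cert_fingerprint": "sha256:...",            # points to the CA entry on-chain
    }
    tx_payload = json.dumps(profile, sort_keys=True).encode()
    tx_id = hashlib.sha256(tx_payload).hexdigest()
    print(tx_id[:16], len(tx_payload), "bytes")
    # Every member sees the same immutable record in the same order, so
    # updates and new participants propagate without a central directory.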

Standard data formatting

I would bet that the same logical transaction data is physically represented in as many ways as there are blockchain usages. An order, a payment transaction, a purchase, or a smart contract for renting an apartment is implemented individually by every blockchain programmer out there.

It would be over the top to expect a standardised data format for all possible situations in life. But what would really make sense is a standard representation for data values, such that a number can be understood as a number and a date as a date. If this were achieved, generic blockchain browsers would make sense, accessing blockchains from different makers through a unified API. In other words, we need a standard – again.

Sometimes I feel transported back in time by 20 years: in 1997 the XML standard was nearly finalised as a data representation means to exchange data between organisations in a better way than in the EDIFACT age. The good thing about XML is that the standard is packaged with its own data definition language, XML Schema. The flip side is that using XML would let the blockchain explode with redundant mark-up overhead: instead of representing a number as the single byte 0x01, it would be marked up in XML as, e.g., <amount>1</amount>. So we are back to square one: use a standard mark-up language but avoid the massive overhead of XML. JSON is a good step forward here, but maybe others would fit even better.
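
To make the overhead tangible, a small self-contained comparison of three representations of the same logical value:

    # Mark-up overhead for one and the same logical value. The byte
    # counts are what this snippet actually prints.
    import json

    xml_form = "<amount>1</amount>".encode()          # 18 bytes
    json_form = json.dumps({"amount": 1}).encode()    # '{"amount": 1}', 13 bytes
    binary_form = (1).to_bytes(1, "big")              # a single byte: 0x01

    for label, value in [("xml", xml_form), ("json", json_form),
                         ("binary", binary_form)]:
        print(f"{label:6s} {len(value):2d} bytes")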

Anyway, whatever the standard finally is, it will unlock the power of genericity that we know from HTML, XML, and other languages. Competition for developing the most useful monitoring or inspection tools will emerge automatically. And as this focuses on the content layer, it could be decoupled from the evolution of the more physical plumbing of the blockchain stack.

Updateability

A rather unknown term at the business layer – why care whether a system can be updated? Usually the IT staff spends a weekend in the firm's server room and gets the new version running, with all the migration effort and the pains of unforeseen obstacles. This may cost a lot of pizzas and some extra money for the weekend work – but where's the issue?

Updating from data format release 2.1 to release 3.0 could be N times more complex on the blockchain, N being the number of nodes – or actually the number of users. Imagine a P2P trading tool like Enerchain: what if the 78th regulation of energy trading kicks in and the current trade data format, release 77.2, needs to be updated to release 78.0? Does this need a big-bang approach – stop all systems for 10 minutes and then restart them? Will all 2,000 participants be willing or able to do this? At night or on a weekend? Probably not.

So data migration is indeed a harder issue on the blockchain than with centrally hosted application logic. An industry blockchain should support the evolution of data and code in a systematic way, especially where the two are coupled, as is the case with smart contracts.
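
One conceivable approach, sketched under the assumption that every record carries its format release so that nodes can migrate lazily, one by one, instead of in a big bang; the migration content itself is invented:

    # Lazy, per-record migration: each payload names its release, and a
    # reader upgrades old records on the fly. Release numbers follow the
    # Enerchain example above; the new field is an invented placeholder.
    MIGRATIONS = {
        "77.2": lambda rec: {**rec, "release": "78.0",
                             "regulation_ref": "EU-78"},  # assumed new field
    }

    def read_trade(record: dict) -> dict:
        """Return the record in the newest release, migrating lazily."""
        while record["release"] in MIGRATIONS:
            record = MIGRATIONS[record["release"]](record)
        return record

    old = {"release": "77.2", "product": "base-2017", "mwh": 10}
    print(read_trade(old))
    # -> {'release': '78.0', 'product': 'base-2017', 'mwh': 10,
    #     'regulation_ref': 'EU-78'}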

P2P encryption between blockchain nodes

Modern B2B integration uses end-to-end encryption and authentication. In this regard, too, using the blockchain today means a leap back to the year 2000: transaction data and entire blocks are mostly exchanged through unencrypted channels. It is easy to make end-to-end encryption part of the blockchain protocol for the exchange of transactions and blocks – but it is not (yet) implemented in all products. And setting up a VPN would mirror the blockchain network at the infrastructure layer – this could be avoided if security were already built into the blockchain's P2P protocol.
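
A minimal sketch of transport encryption between two nodes using Python's standard ssl module; host name and port are placeholders, and real blockchain P2P stacks would build this into their own wire protocol:

    # Wrap a node-to-node connection in TLS so transactions and blocks
    # never cross the wire in the clear. Peer host and port are
    # placeholders, not a real network.
    import socket, ssl

    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    context.minimum_version = ssl.TLSVersion.TLSv1_2

    with socket.create_connection(("peer.example.org", 30303)) as raw:
        with context.wrap_socket(raw, server_hostname="peer.example.org") as conn:
            # The peer's certificate has been verified and everything sent
            # over `conn` is encrypted.
            conn.sendall(b'{"type": "tx", "payload": "..."}')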

In the future this list may be extended as we come across new challenges. It might be a bit frustrating for the reader to find only issues here and no immediate solutions – but as said, this is mainly a requirements statement. Solutions will surely follow over the coming years.