Open Document Repository Use Case: Open Data

The amount of data we generate about ourselves is increasing exponentially. Governments and corporations know who our friends are, where we are at any time of day and what we pay for. The (meta)data collected about us as citizens is machine-readable and (near) real-time; it can be stored for as long as required and analysed whenever necessary. Governments and corporations have near-unlimited capacity to store and process it.

By contrast, the data our governments collect on citizens' behalf and at taxpayers' expense is often inconsistent, outdated and not machine-readable when citizens try to access it. But even if data is open:

  • how do we know that what is published today is what was originally published?
  • how do we know that data which is accessible today will still be available tomorrow, or in 10 years' time?
  • how can we ensure that metadata is permanently recorded?

The Open Document Repository (ODR) addresses data authenticity, permanence, persistence and security.

How does it work?

  1. Put a document on an open distributed storage system (IPFS)
  2. Address the document by its hash, so any change to the content produces a different address and is immediately evident
  3. Store every document version on IPFS and record the references to them on an immutable public ledger (blockchain), so the history can be traced back
  4. Documents are signed by the authors
  5. Everything is put into a search engine for easy access
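
Steps 2 and 3 above can be illustrated with a minimal Python sketch. It is not the ODR demo code and makes several simplifying assumptions: an in-memory dict stands in for IPFS, a plain list stands in for the public ledger, and a SHA-256 hex digest stands in for an IPFS content identifier (real IPFS identifiers use a different encoding). All function and variable names are hypothetical.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in for an IPFS address."""
    return hashlib.sha256(data).hexdigest()

storage = {}  # hypothetical stand-in for IPFS: address -> document bytes
ledger = []   # hypothetical stand-in for the public ledger: append-only list

def put_version(doc_id: str, data: bytes) -> str:
    """Store one document version and record its address on the ledger."""
    address = content_address(data)
    storage[address] = data
    ledger.append({"doc_id": doc_id, "address": address})
    return address

def history(doc_id: str) -> list:
    """Return every recorded address for a document, oldest first."""
    return [entry["address"] for entry in ledger if entry["doc_id"] == doc_id]

# Two versions of the same document get two different addresses.
v1 = put_version("budget-2020", b"Total: 100")
v2 = put_version("budget-2020", b"Total: 120")
```

Because the address is derived from the content, editing a document can never silently replace an earlier version: the edit yields a new address, and the old one remains on the ledger.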

Publisher:

  1. The publisher selects “Upload new document” or “Change existing document”
  2. The publisher updates the document metadata (author, keywords, etc.)
  3. The publisher signs the changes
  4. The document and metadata are uploaded to IPFS
  5. The IPFS reference is recorded on the public ledger (blockchain)
  6. The document and metadata are indexed in the document repository for easy search and retrieval
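
The publisher steps above could be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the ODR implementation: the `Repository` class and its fields are hypothetical in-memory stand-ins for IPFS, the ledger and the search index, and HMAC is used only as a placeholder for signing (a real system would presumably use a public-key signature so that anyone can verify it).

```python
import hashlib
import hmac
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class Repository:
    """Hypothetical in-memory stand-ins for IPFS, the ledger and the search index."""

    def __init__(self):
        self.storage = {}  # address -> bytes (stand-in for IPFS)
        self.ledger = []   # append-only entries (stand-in for the blockchain)
        self.index = {}    # keyword -> set of document addresses (stand-in for search)

    def publish(self, document: bytes, metadata: dict, signing_key: bytes) -> dict:
        # Steps 2-3: bind the metadata to the document hash and sign the result.
        doc_address = sha256_hex(document)
        payload = json.dumps(
            {"document": doc_address, "metadata": metadata}, sort_keys=True
        ).encode()
        # Placeholder only: HMAC is a shared-secret MAC, not a digital signature.
        signature = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()

        # Step 4: upload the document and its metadata record.
        record_address = sha256_hex(payload)
        self.storage[doc_address] = document
        self.storage[record_address] = payload

        # Step 5: append a ledger entry that links to the previous entry's hash.
        previous = self.ledger[-1]["entry_hash"] if self.ledger else ""
        entry = {"record": record_address, "signature": signature, "previous": previous}
        entry["entry_hash"] = sha256_hex(json.dumps(entry, sort_keys=True).encode())
        self.ledger.append(entry)

        # Step 6: index the document under each keyword.
        for keyword in metadata.get("keywords", []):
            self.index.setdefault(keyword.lower(), set()).add(doc_address)
        return entry

repo = Repository()
first = repo.publish(b"Air quality 2020",
                     {"author": "Agency A", "keywords": ["air", "quality"]}, b"demo-key")
second = repo.publish(b"Air quality 2021",
                      {"author": "Agency A", "keywords": ["air"]}, b"demo-key")
```

Linking each ledger entry to the hash of the previous one is one simple way to make the sequence of publications tamper-evident; an actual public ledger provides this property itself.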

User:

  1. The user navigates to the web document repository search engine
  2. The user enters keywords in the search field, or browses by category
  3. The user selects the document from the list of results
  4. The document is presented with:
    • Metadata
    • Timeline with links to previous versions including public ledger references
    • Digital signatures
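
On the user side, the value of this presentation is that a reader can check a document independently. The sketch below shows one way that check could look, assuming ledger entries of the hypothetical form `{"doc_id": ..., "address": ...}` and a SHA-256 hex digest as a stand-in for the IPFS address; neither is taken from the ODR code.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_document(address: str, data: bytes) -> bool:
    """True if the retrieved bytes still match the address they were fetched by."""
    return sha256_hex(data) == address

def timeline(ledger: list, doc_id: str) -> list:
    """Return (version number, address) pairs for one document, oldest first."""
    addresses = [entry["address"] for entry in ledger if entry["doc_id"] == doc_id]
    return list(enumerate(addresses, start=1))

original = b"Population: 5,000,000"
revised = b"Population: 5,100,000"

# Hypothetical ledger contents for a document with two published versions.
ledger = [
    {"doc_id": "census", "address": sha256_hex(original)},
    {"doc_id": "census", "address": sha256_hex(revised)},
]

versions = timeline(ledger, "census")
ok = verify_document(versions[0][1], original)
tampered = verify_document(versions[0][1], b"Population: 9,000,000")
```

Here `ok` reports that the first version is intact, while `tampered` shows that altered content no longer matches the address recorded on the ledger.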

Open data needs to be seen as a digital public infrastructure, not a series of projects.

This infrastructure needs to be linked, secure and permanent.

[ * ] ODR demo code: https://github.com/kubrik-engineering/opendocumentrepository