The collection and retention of data has been a core function of the modern state since its inception. From the aggregation of mortality records kept by the church in the seventeenth century, to the patchy registers of slaveowners held in its colonial offices – which most often did not also include the names of the enslaved – the nature of information gathered by the British government has always been historically contingent. That is to say, decisions about what is worth collecting, how it is analysed, who gets access to it and for what purpose have emerged in particular political economic contexts.
Beginning in the late 1970s, the UK government embarked on a process of neoliberal reform, ushering in the increased involvement of commercial interests in the public sector and policies to deregulate finance. The sales of publicly-owned companies under Margaret Thatcher and her successor John Major, such as the steel industry and the railways, are perhaps the best-known examples of radical neoliberal economic policy change from this earlier period, though measures to outsource public services and commercialise public assets were also introduced under the 1997-2010 Labour government, and have continued under the Conservative-led governments since then.
The expansive data produced by the public sector, which includes health, meteorological and fiscal information, has not been immune to the transformation of the state over the past few decades. But the opening up of government datasets to commercial actors and the implications of this for the public sector and society more widely have been largely overlooked. Government, industry and civil society bodies, often under the banner of “open government data” (OGD) campaigns, have long advocated unrestricted and free access to public sector information, hailing it as indispensable for government accountability and digital innovation, among other causes. And while the democratic claims of certain groups advocating OGD are unquestionably important, the political economies of the public sector and of digital infrastructure over recent decades require us to consider whether the existing structures for accessing publicly-held data are the most appropriate for harnessing common wealth, collective ownership and public good.
This paper introduces “digital public assets” as a concept for rethinking how we value the data created in society that is held by the public sector. In today’s economy, characterised concomitantly by growth crises and the ascendancy of big data in global markets1, questions of who decides how publicly-held data is used and for what ends have perhaps never been more important.
Building on widely adopted legal definitions of public sector information (PSI)2, digital public assets can be defined as all registries, databases and information collected, produced and/or held by public sector actors that are available in digital format. Public sector actors responsible for the production and use of digital public assets at present include government departments, local authorities and other bodies, such as NHS England. Most of this data is textual or numerical, though it also comprises visual and audio recordings, such as images produced in healthcare settings. Policies enacted by the UK government under both Labour and the Conservatives have led to the digitisation of vast amounts of PSI in recent years, contributing, in their own way, to a particular historical understanding of public data. The appendage “digital” in the proposed term acknowledges the significant economic and societal implications that this transformation entails.
In practical terms, digital public assets include:
The current modes of describing and conceptualising the data held by government bodies within policymaking fail to capture the weight of the potential social and economic value stored within it. This matters because unlike the sprawling data infrastructures of big tech companies such as Facebook4, how publicly-held data is used – and which interests profit from its use – has largely failed to capture the public’s and policymakers’ attention. This is despite private companies having in recent years increasingly sought access to datasets developed and held by public bodies, such as the NHS5. Often, these datasets are far more structured and comprehensive than anything the private actor can collect by itself, meaning they can more accurately and easily be utilised.
Conceptualising publicly-held data as a “public asset” pays heed to its potential importance not just for commercial actors, but also for collective forms of wealth and social endeavours. Understanding how value is created and harnessed through digital public assets should be the first step in the development of any policy that concerns their ownership, governance or use.
Two decades ago, no one could have foreseen how lucrative the digital technology industry would become. Today, Microsoft, Apple, Amazon and Alphabet (which owns Google) are the four largest companies in the world, and their supremacy in capital markets is unlikely to abate any time soon. The factors underlying this startling rise to power have been described elsewhere,6 but crucially include developments in data processing techniques and the mushrooming of markets in the intangible goods that this innovation has enabled, such as cloud-based databanks and machine learning technologies.
Capital investment in these markets has increased sharply in the years since the banking sector-made financial crisis of 2008. But Mark Zuckerberg was barely out of his Harvard dormitory when OGD campaigns first began to gain steam in the UK. These were the early years of the Iraq War, when scrutiny of government actions was at an all-time high, and the idea that the company behind the search engine on your Windows XP might one day hold more information about your health than your GP surgery was entirely unimaginable. The OGD movement at this time was a disparate coalition that had the support of IT specialists, free speech advocates, social democrats, Marxists, liberals and libertarians.7 Media outlets from across the political spectrum lent support to its early calls, which centred most vocally on democratic accountability of the state and the liberating potential of digital innovation.8
And while these goals remain pertinent today, the revolution in the digital technology industry since then forces us to both reflect on the interests that have come to benefit from the open data policies that have been introduced in the UK in the ensuing years, and rethink how social value might best be harnessed collectively from those technologies.
Dr Jo Bates from the University of Sheffield has published extensive research on the role of commercial and financial interests in driving the “transparent government” initiatives introduced under the 2010-2015 coalition government.9 New OGD policies were initiated within a few weeks of the coalition government’s coming to power. They became a central feature of both the so-called Transparency Agenda, which purported to enable greater public scrutiny of government spending within the new confines of austerity, and later the Open Public Services Agenda, which, after successfully fostering popular agreement that public services were inefficiently delivered, pledged to open them up to private providers.10,11
Data.gov.uk, which had been established in the final years of the New Labour government, quickly became one of the largest open data repositories in the world. Today, it hosts over 40,000 de-identified datasets, developed across the public sector on topics including defence, the economy, and crime. The European Commission, which publishes a ranking of European countries based on their “open data maturity”, boasts of the UK’s success in promoting a culture of open data.12 OGD in the UK is used by actors across society, and often by civil society organisations, academic researchers and others whose work does not directly generate market value but may nonetheless be regarded as socially valuable. Its use by commercial and financial actors has skyrocketed, and public sector datasets that are not publicly available have also been accessed by commercial actors through public-private partnerships; the deal between DeepMind and the Royal Free NHS Trust in London, which saw the transfer of - to the Google-owned company, is perhaps the best-known example of this.13
It is this new relationship, in which the costs of producing digital public assets fall largely on the public sector and society, but the surplus value so often comes to be realised by large digital platform companies and the financial services industry, that perhaps most deserves our attention today. The next part of this paper will consider how we value the end-uses of digital public assets, before exploring how that value is created throughout their development.
In the late 1990s, a coalition of US energy industry giants, including Enron, developed a new financial instrument that would act as cover against non-extreme weather events, such as unusually warm winters. ‘Weather derivative’ contracts enabled companies to receive a pay-out if the temperature diverged significantly from the yearly average. Before long, a secondary market for weather contracts was established by sellers of contracts to manage risk in these derivatives. These financial markets saw huge growth in the mid-2000s.
When the ‘weather risk’ market first developed, the data most sought by its traders was produced by public meteorology bodies, such as the UK’s Met Office. This data had long been freely and publicly available in the US; that was not the case in the UK, where the Met Office had been under pressure as a recently converted Trading Fund to commercially exploit its resources. Important players in the financial services industry lobbied against this practice. Eventually, in the Autumn Statement of 2011, Chancellor of the Exchequer George Osborne announced that the government would publish the data sought by the weather risk market actors with unrestricted access. It has been suggested that this enabled the UK’s weather risk market to become a multi-billion-dollar market, generating colossal returns for financial traders at no extra expense to them – value that was created through but not directly recaptured by the public purse.
Dominant narratives about publicly-held data presuppose that its value for society can only be measured by the market prices of the goods produced from it. The implication of this is that what yields economic value for market actors, such as big tech companies, is also necessarily valuable for wider society. In reality, digital public assets can be harnessed by all the actors described above to produce market value, social value, or both. These are of course not necessarily mutually exclusive; the use of publicly-held data to create private profit may also produce outcomes that are socially valuable. Conversely, actors that use the data for primarily social ends, such as healthcare NGOs, may also indirectly benefit financially from doing so. It is nonetheless an important distinction, given the influence in contemporary society of conceptions that measure value on the basis of market prices alone.
There are a number of ways in which digital public assets might produce market value for a private actor. Broadly, these fall into two categories:
Where market value is generated from the exchange of these products or services, the public sector does not directly capture its surplus within the current ‘open’ data model. It has been argued that these uses nonetheless generate indirect economic returns to the public sector – and thus wider value for society – by stimulating growth in the economy and fostering opportunities for taxation. Evidence to support such claims is, however, limited, not least because existing tax systems themselves rely on the market value of end-use product transactions. On the contrary, we might contend that the establishment of government infrastructure to collect and publish publicly-held data for use by the private sector constitutes taxpayer subsidisation of a commercially valuable resource17, both as an intermediate and final good. It is also noteworthy that questions of tax avoidance and evasion have long circled around a number of the companies that benefit from (and have long been at the forefront of lobbying to open) UK government datasets, such as Google.18
While estimating the economic value of digital public assets is no mean feat, how we decide what is “good” for society – what produces social value – is also a highly contested matter. The Friedmanite view that pursuing the interests of shareholders is the ‘social responsibility’ of business actors exerts significant influence across our existing economic system.22 But it is certainly not how most people understand social value. Democratic structures can allow us to discern what is in the collective interest of the many, while also elevating the voices of groups most marginalised by capitalist and colonial power structures. Indeed, there are numerous non-commercial uses to which digital public assets are currently put that would be widely considered socially valuable. These include, for example, analysing air quality data to assess the potential health implications of inner-city fossil fuel emissions, or using de-identified A-Level grades to assess regional variability in education outcomes.
It is also easy to think of ways in which digital public assets could be used without an obvious profit motive for ends that are ultimately harmful to members of society – and history provides us with plenty of examples of that. Governments themselves have often used the data they possess disingenuously.23 Today, crime and biometric data are being harnessed by state police and intelligence agencies to develop algorithmic “predictive policing” and new surveillance technologies, which Jackie Wang and others show are used to further oppress people of colour and working-class populations.24 Activist and anthropologist Nanna Dahler has described how biometric data collected today by states in Scandinavia – so widely lauded as archetypical social democracies – has been utilised to limit the movement of asylum seekers and enforce border controls across Europe.25
It is precisely because the information collected about us holds within it so much potential power and force that its use should be democratised beyond existing representative structures. The present model of open government data, whereby many digital public assets are available for all to utilise, with little collective governance over the uses to which they are put – nor equitable distribution of the huge profits realisable from them – is antithetical to what motivated ambitions for a ‘data commons’, first envisioned by early OGD advocates.
Transport for London (TfL) has published real-time, open data through a free unified API (application programming interface) for just over eleven years. This data has been used to develop over 675 mobile phone and online apps, overwhelmingly within the private sector. Ahead of its initial public offering (IPO) earlier this year, which valued the Silicon Valley-based company at $82.4 billion,19 Uber integrated TfL data into its ride-hailing app, promising investors that it would become the market leader in journey planning.20 Undoubtedly, as Uber itself predicts, many Londoners will use the newly transformed app, and consider its features useful. However, the additional economic value that the app is expected to generate from using this open data will not be returned directly to TfL.
And of course, those working for Uber as drivers and couriers are unlikely to see improvement in their pay, at least while their demands for liveable conditions continue to be dismissed by the company.21 Like many financialised companies in the so-called “gig economy”, Uber’s anticipated market growth will not be distributed among its workers. Rather, this wealth will mostly be given to the shareholders who invest in the company on the basis of its promise to extract ever greater value from workers and wider society – including through our data and, in this case, the data produced by couriers and workers as they move through the city – to increase the value of their shares.
Recognising that the production of publicly-held data involves actors from across society and is impossible without a public sector in possession of the means of collection is also central to considering how collective control over its use might be realised in future.
The dominant understanding of value in publicly-held data omits the role of non-market actors in creating its value, and overlooks important processes necessary for its commercialisation by market interests.26 This is reflected in recent government policy documents, which suggest that it is “innovative businesses and entrepreneurs” that create value from accessible publicly-held data – through the eventual sale of commercial products and services.27 As the academic and open data advocate Rob Kitchin argues to the contrary, “[open] data might well be a free resource for end-users, but its production and curation is certainly not without significant cost.”28
Before publicly-held data is even collected, the government body has to decide – or “identify”29 – that it is information worth collecting at all. This process is invariably political and iterative, and can take place through informal structures. Perceived unmet data needs can also be identified by public sector bodies through the analysis of existing datasets internally. Sometimes, data is generated incidentally, and it may be held in an unstructured or messy format; this is true of a lot of data that was collected by public sector bodies before computer analysis tools were widely available. In these cases, the infrastructure that enabled the data to be collected was nonetheless indispensable for the eventual or potential realisation of economic and social value.
A significant amount of publicly-held data – and indeed, most of the data sought by commercial actors today – was collected purposively. Once it has been identified that a dataset is worth collecting, the government actor needs to develop the infrastructure and capabilities that enable it to do so. In earlier days of the NHS, for example, this entailed the creation of standardised paper forms that could be filled out by patients and healthcare professionals when needed. Today, it often involves the establishment of complicated digital infrastructure and the recruitment of technological and statistical expertise. Without this, publicly-held data could be neither collected nor maintained.
Only after this is in place can the data be collected. Depending on the objective of the dataset, a host of actors facilitate or enable the public actor to collect or extract data. Where information about social relations is of interest, such as high street spending, religious attitudes, or the use of transport routes, it is members of the public that embody the collective movements, behaviours and views necessary to generate data. We might regard the commercial and financial data of businesses and banks as also being created collectively, enabled as they are by the transactional and saving habits of the wider public and business sector. Sometimes, such as in the collection of housing and homelessness data (at present), data ‘subjects’ are bound by local geographies, though often local public datasets are aggregated nationally. The UK government also collects data on non-human activity, such as weather patterns (to the extent that they can be regarded as non-human today!). The collection of environmental data in this way often requires specialist expertise and extensive training.
Once the dataset has been collected – or, more commonly, as it is being collected – the public actor needs to process the data using the extensive infrastructure and expertise it has developed. Increasingly, the processing of so-called “big data” in the public sector is outsourced to commercial actors, still at a cost to the public purse. This stage of the digital public asset value chain can include cleaning the data – rendering it utilisable by a public, commercial and/or other actor in society. In some instances of open government data, the public actor will release the raw data that has been collected for analysis and use by commercial and/or non-commercial actors. Even the publishing of datasets requires infrastructure that is expensive to maintain, such as data.gov.uk. Often, the public actor will analyse the data for use internally or within government, to assist with public administration. It might publish analyses, which can then be used by other actors, for both commercial and social uses.
In reality, the value chain of digital public assets is complex and non-linear; which actors play a role in the creation of a publicly-held dataset depends on what is of interest and what has motivated the data production and collection, among other factors. A vast array of workers from across society help to create this value; not only statisticians and data scientists, but also the administrators, interns, technicians, cleaners, waste collectors, taxi drivers and others who maintain the systems they use and otherwise make it possible for them to work day-to-day. A more complete value chain schematic would extend across pages and pages, accounting for those individuals and groups around the world that produce the materials needed to collect, analyse and store data. The term “global poverty chain” has been coined to describe the relationship between the low wages30 distributed to those carrying out the dangerous tasks of minerals mining and electronics factory production, and the high profits of the digital technology companies that utilise them for products sold predominantly in higher-income countries. Digital public assets must be seen, from this perspective, as an issue of decolonial politics and global justice.31
In all cases, the value realised by the end-use of the data is created throughout its development, not merely in its eventual end-use, whether this generates surplus value for a commercial actor or wider social value.32
'If open data merely serves the interests of capital by opening public data for commercial re-use and further empowers those who are already empowered and disenfranchises others, then it has failed to make society more democratic and open.'
—Rob Kitchin, academic and open data advocate
The current open government data landscape is like an area of common land that everyone has access to and works to cultivate; except that only a few have the tools and technologies needed to harvest its crops. The land has yielded resources that the few have used to improve their tools over time. But they share neither the tools nor the nourishment they reap from their use with the other ‘commoners’ – the many. The many marvel at how quickly the fruit grows and how beautiful it is, how innovative the tools are that were built using what grew on the land through collective activity. But they do not eat the fruit.
It is often proposed that simply by opening up access to the tools needed to produce valuable data – for example, by teaching Python in primary schools – the enclosure of the digital commons by Silicon Valley-based technology companies can be surmounted. But the means of producing digital public assets is more than just the ability to harness code for some tech literati-determined end-use. And in reality, we have to accept that the specialist knowledge and time investment required to collect and utilise big data effectively will always be beyond the grasp – and wants – of most people; do any of us really desire a society consisting primarily of data scientists and programmers? What do we lose in this deification of digital expertise?
The state and the public sector will continue to amass data produced by our collective movements, actions and choices. Their ability to do so has helped improve welfare services, fostered biomedical innovation and provided civil society voices with an instrument to hold their decisions to account. But collective control of digital public assets should extend beyond government accountability and transparency, which in itself does not constitute a state that works in the interests of the many, as epitomised by the recent Conservative governments’ commitment to open data policies. And as discussed, publicly-held data has historically and even recently been used for ends that further oppress workers and people of colour in the UK, at its borders, and beyond.
So what could collective ownership of digital public assets look like? The following ideas draw from examples of democratised digital infrastructures around the world, including Decidim in Barcelona, the new Plaza Pública in Mexico City and the Open Data Institute’s proposals for data trusts.33 It is worth noting that the hegemony of the old OGD ideas today means that even in polities that have introduced democratic and redistributive structures for data produced by commercial actors (such as telecommunications companies), these same collective structures often do not exist for public sector data. The importance of public sector data, especially when it is generated incidentally, is rarely acknowledged by public bodies, whether at the municipal or national level. The proposals in this paper are described below in a sequence that mirrors the digital public assets value chain described earlier. As a starting point, these proposals reimagine collective ownership of digital public assets as a way of surmounting the issues around privacy, access, and profiteering that make the existing system so untenable.
Digital public assets could be accessible to all in a way that also harnesses collective value. But that is not the same thing as publishing data in a free and open format, as is currently widely accepted in the UK. As we have seen, open government data does not always produce social or economic value that is collectively shared. An alternative to the existing OGD model could involve access structures that ensure the end-uses of data are valuable for the many.
WeDecide.gov.uk could be part of that structure; an online space for everyone living in the UK – whether citizens, immigrants or asylum seekers; employed, unemployed or under-employed – to debate and decide priorities for how the data they help produce by living in the country should be used. National and Local Digital Priorities could provide the criteria on which accountable decisions about the use of digital public assets are based. They could be formally voted on via the platform over the course of a week, once a year, but discussed at all times. Individuals and organisations would propose Priorities across a range of areas, such as housing and social care at the local level, or financial services and public health at the national level. Local Digital Priorities would apply to local authorities and other local or regional bodies, like NHS Trusts.
Public sector bodies wishing to utilise the digital public assets they hold for a purpose beyond what is necessary for the functioning of existing services, perhaps after being approached by a technology company or cooperative, could be required to ensure new partnerships align with the collective social values decided upon via WeDecide.gov.uk. Community Digital Reps could be recruited via a civic lottery to ensure that partnerships between a public sector body and private actor (commercial, cooperative, or non-profit) align with the National or Local Digital Priorities. In a future economy, these roles could be voluntary, perhaps even elected. But at present, given that labour constraints and depletion often affect those most marginalised in the existing capitalist society, these positions would be remunerated.
At present, there is little consensus on how to estimate the economic value of data. And within the existing economic system, we should consider this value as the result of evolving and complex social relations between finance, companies, workers, and the state. But in both today’s and tomorrow’s economic systems, modelling the value of data is likely to remain a challenging domain.34 Understanding the economic value of data and the potential surplus it enables is fundamental for ensuring its end-uses benefit the many. Recognising that this is likely to remain a specialist and contested area of economics, this paper proposes that the Treasury establishes an advisory body that develops models for calculating and estimating the economic value of digital public assets. It could provide support to public sector actors at the local and national level that develop socially-valuable partnerships. There will of course be many potentially socially-valuable partnerships that do not generate market value, such as when a civil society body wishes to access data on health inequalities for a campaign.
When a commercial actor seeks to develop a socially-valuable product or service using digital public assets, the employment terms of its workers would be negotiated with trade union representatives and the public actor. In this way, digital public assets could be used as a means to improve labour conditions and the governance of private actors through partnerships with unions.
At present, commercial actors are able to generate huge surplus value from the sale of products and services that have been developed using digital public assets. This skewed distribution does not acknowledge the collective nature of the production of this data. As an alternative model for establishing collective economic ownership of digital public assets, all agreements with non-public actors to develop innovations that harness their value could stipulate that the majority of profit produced through their use is returned to: 1) digital workers’ cooperatives, to allow them to build the capability needed for future, socially-valuable innovations; and 2) the public actor, to operate WeDecide.gov.uk, remunerate Community Digital Reps, run the Digital Value Team, and invest in relevant research and education institutions. This does not undermine the potential of commercial actors to make some profit that can be fairly distributed. Rather, it fosters wider capacity for innovation by allowing non-commercial actors to build and grow capacity in a way that generates social value, with the long-term goal that this becomes a primary mode of innovation in the economy and society.
The dominance of digital technology companies in society and in the economy has quickly become a new common sense. Most young people in the UK today don’t remember the world before Facebook; they can’t imagine a time when your boss couldn’t check up on you via Instagram or when the adverts that appeared on the screen in front of you weren’t tailored to your recent search engine habits.
Transforming the existing relationship between digital infrastructures, the public sector and individuals is as much a social and political challenge as an economic one. But many people involved in the creation of value in data – who, we must remember, are at the heart of everything this paper has discussed – are already rethinking value and ownership in the platform age. It is up to not just policymakers, but also community activists and national campaigners, to support that change. Here are four ways we can do that:
The early OGD advocates recognised that publicly-held data had the potential to be used for the benefit of wider society. And the movement envisioned a world where this could be used to improve democracy. Those goals are perhaps even more important today than they were in the mid-2000s. But a lot has changed since then, and it is critical that we rethink how democratic ownership and collective value can be harnessed through digital public assets. The success of the OGD movement and the widespread support it receives across society even today should perhaps give us hope that the principles on which it developed remain salient. Working with those active in the open data movement, we can revive those principles and rethink what a digital infrastructure underpinned by them might look like.
Those with the specialist knowledge and skills in data technologies will be central to the development of ownership models that work for the many. As the workers who enable economic and social value to be realised from data, data scientists, programmers and others working with digital technologies wield significant influence over how value is distributed. Thankfully, there already exists a range of tech worker-led initiatives that are challenging the digital economy status quo. Platform cooperatives and other worker-owned digital enterprises have grown in popularity and visibility in recent years, particularly in the US. The international Tech Workers’ Coalition builds solidarity among those working in the industry through self-organisation and education.
When politicians tell us they want publicly-held datasets to be “unlocked” to allow the private sector to produce innovation, as Health Secretary Matt Hancock recently did in a report published with the right-wing lobby group, the TaxPayers’ Alliance, we have to question what motivates their calls. In the coming years, we are likely to witness further disruption to our financial markets and tightening budgets for our welfare services. As happened under the Coalition government, these developments may well be weaponised to further open up both public services and digital public assets to the private sector. Our response should always be: “Who benefits?”, recognising that an agenda of transparency and openness will not necessarily serve collective interests.
The value of digital public assets will ultimately always depend on how they are used in society. As this paper has discussed, certain groups, such as asylum seekers, have historically been exploited by information infrastructures built by the state. We should not anticipate that this will change – by virtue of “transparency”, for example – any time soon. It is imperative that these groups are actively involved in the development of our future collective ownership structures for digital public assets. With that in mind, this paper is intended as just one contribution – itself the product of years spent learning from others – in the cumulative, collaborative reimagining of the new economy.