Towards a Framework for Open Data Publishers: A Comparison Study between Sweden and Belgium

Abstract. Public organizations in the role of publishers publish data for anyone to reuse, which can lead to benefits. However, the process descriptions for this publishing work focus on one or a few issues, which leaves out important areas and decisions. Little seems to be known about variations between publishers based on one common point of comparison. Therefore, this paper presents a comparison between two publishers: Namur (Belgium) and Linköping (Sweden). The comparison is based on a process framework, seven in-depth interviews, document studies, and a verification meeting with one respondent. We learned that the OGD manager is an agent of change who needs to balance implementation and guidance, the unorthodox method of e-mail registration can be used to engage users and monitor impact, the organizational unit for OGD is cross-organizational, and the publisher process framework could be used as ex-ante strategic guidelines and context-specific recommendations.


Introduction
Open Government Data (OGD) refers to interoperable data that is freely shared by public organizations over the Internet for anyone to reuse without restriction [1,21]. Examples of such data include parliament minutes and weather reports [1,2,20]. The use of OGD could lead to benefits, such as increased governmental transparency and citizen participation [14,15,11,26], but there are also risks, such as privacy violations and misinterpretations of data [3,27].
This paper focuses on the key stakeholders who share data in public organizations: the publishers. A main challenge of OGD resides in the variation in how data is shared by publishers [1], which can come from differences in the publisher process, which tends to be ad-hoc. Descriptions of this process range from technical [13] and lifecycle [10] perspectives to metadata and information management [18,22]. They often focus on one or a few issues, which leaves out important areas and decisions. Moreover, little seems to be known about possible variations based on one common reference point. This lack of knowledge can impede the analysis and comparison of publishing processes by researchers or practitioners. Therefore, in this paper, we apply a publisher process framework to two international cases: Namur (Belgium) and Linköping (Sweden). The application of this framework reveals similarities and differences in a structured manner, and we discuss lessons learned from the comparison for research and practice.
The rest of this paper is organized as follows. Section 2 presents the background related to the publishing of OGD. Section 3 presents the methodology followed to perform the international comparison. Section 4 presents the publishing processes of both cities and their comparison. Section 5 discusses lessons learned, implications for both research and practice, the study's limitations, and leads for further research. Finally, Section 6 summarizes the contributions of this study.

Background
The publisher process of OGD has been studied by academic and practical authors. Two early examples are [13], which details the steps to publish linked OGD, and [10], which focuses on the general lifecycle. [5] presents a publisher process with a focus on strategies, technical publishing, and a lifecycle. In recent practical descriptions, there is an increased focus on information management and metadata [18,22], but publishers are still using ad-hoc processes. In general, the descriptions have gaps, variations, uneven details, and fragmented coverage of the publisher process, which indicates that publishing is more than releasing data.
In this paper, the publisher process is viewed as groups of processes in a sequence with variations, decisions, and choices, and the final output is OGD. This approach follows the publisher process framework of Crusoe and Ahlin [6]. This framework is a recent attempt to synthesise previous research and empirical material on the publishers' processes and is based on the findings of [13,10,8,17,19,16,25,5,18,22]. The authors explain that publishing OGD is more than releasing data and, thus, the framework comprises six process groups, detailed here. The groups cover the work from the introduction of OGD to its withdrawal by the publisher. Initiation processes contribute to long-term and sustainable work with OGD. Examples include education, the appointment of an OGD manager, and the creation of strategies. Inventory processes contribute to the organization's information management with the purpose of enabling and helping the prioritization and preparation of publishing OGD. Examples include finding an information center and auditing the organization's information resources. Publish processes design and implement the data publishing that extracts data from inside the organization to share it with external users. Examples include preparing the data and IT-systems and maintaining OGD. Sustain processes maintain the data provision and monitor the internal and external impact of OGD to guide and direct the work with OGD. Examples include evaluating and improving the data provision. Withdrawal processes stop the data provision or cut the connection between data production and OGD storage [19]. User engagement processes are red threads through the work with OGD and can broadly be divided into raising awareness and promoting reuse, which can have the purpose of identifying valuable datasets to publish or building trust [17].
This framework is theoretically well grounded but still needs to be applied in practice. Also, it does not capture the variations between different publishing processes. We attempt to fill this gap in our study.
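To illustrate how the six process groups can act as a common reference point, a comparison between two publishers can be sketched as a small data structure. This is only a minimal sketch under stated assumptions: the names `ProcessGroup` and `compare`, the "not observed" default, and the placeholder notes are hypothetical and are not part of the framework in [6].

```python
from enum import Enum


class ProcessGroup(Enum):
    """The six process groups of the publisher process framework."""
    INITIATION = "Initiation"
    INVENTORY = "Inventory"
    PUBLISH = "Publish"
    SUSTAIN = "Sustain"
    WITHDRAWAL = "Withdrawal"
    USER_ENGAGEMENT = "User engagement"


def compare(case_a: dict, case_b: dict) -> list:
    """Return one row per process group: (group name, note for A, note for B).

    Each case maps a ProcessGroup to a free-text note (hypothetical format).
    Groups without a note are marked "not observed" so gaps stay visible.
    """
    rows = []
    for group in ProcessGroup:  # Enum iteration follows definition order
        rows.append((
            group.value,
            case_a.get(group, "not observed"),
            case_b.get(group, "not observed"),
        ))
    return rows


# Placeholder notes, for illustration only.
case_a = {ProcessGroup.SUSTAIN: "placeholder note for publisher A"}
case_b = {ProcessGroup.SUSTAIN: "placeholder note for publisher B"}
for row in compare(case_a, case_b):
    print(row)
```

Iterating over all six groups, rather than only those with notes, means that a group with no recorded activity still produces a row in the comparison.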

Methodology
We conducted two qualitative case studies to explore how the publisher process groups are followed in two cities. Here, we note that this research is part of verifying and further developing the framework proposed in [6]. From this framework, we selected the process groups presented in Section 2. First the research process is described and then the cases are introduced.

Data Collection and Data Analysis
Between November 2019 and January 2020, three researchers collected empirical data through a combination of seven in-depth interviews with key stakeholders (Table 1) and supplementary official documents, agendas, and internal documents provided to us by the interviewees or identified on their websites. The interviews were semi-structured [9]. We structured the interview guide around the main process groups of the publisher framework described in Section 2. The interviews were limited to three for Namur and four for Linköping as the interviewees stated that these were, for each case, the main functions involved in the OGD strategy and day-to-day implementation at the city level. The interviews were analyzed with process and initial coding [23]. First, we transcribed the recorded interviews into memos, structured around the publisher framework's groups. Then, we skimmed the memos to grasp their themes and highlight important sentences based on the research objective. The codes were then inserted in a table to summarize the main insights. The official documentation helped us to outline the context of each city as it provided a frame of reference regarding local policies, agendas, and strategies. It also served as a resource for additional information regarding topics that were discussed during interviews. Thanks to the diversity and complementarity in the profiles and backgrounds of the interviewees, the analysis performed by multiple researchers, and the triangulation with the official documentation, we were able to limit the subjective perception in the data. The comparison and framework were discussed with the Swedish OGD manager and follow-up questions were asked.

Case Studies
The selected cities are Namur (Belgium) and Linköping (Sweden). Namur is the regional capital of Wallonia and has 110,939 inhabitants. The service industry is dominant (presence of a university, commercial activities, etc.). Namur has 127 datasets (e.g., mobility, thermography) published on its OGD portal. Linköping is the regional capital of Östergötland with a population of 161,499 people. The city focuses on ICT and knowledge development (with a university and several large IT- and technology-focused businesses), manufacturing, and a growing service sector. It has 18 datasets (e.g., air quality, open job positions) published on its OGD portal. The cities were selected based on accessibility for the researchers, history with OGD, and perceived closeness in context: both are regional capitals with a population above 100,000, similar industries, and the presence of a university.

Findings
In this section, we describe the OGD publisher process groups for Namur and Linköping. For both, the insights of the interviewees were merged for each group of the process framework. Their comparison is given in Table 2. In order to structure the comparison, each process group was refined into themes defined from the analysis of the transcript memos.

Publisher Process: The Case of Namur
Initialization. The emergence of OGD in Namur is the result of four factors. First, the possibility to technically monitor the administration. Second, the call from the political opposition for data openness and transparency. Third, the increasing interest in and hype around the smart city theme. Finally, the PSI directive from the European level. It led to the creation of a dedicated data office, based on a former geographic department. Later, a person was hired to manage the cartographic data. The work on OGD was reorganized around this person, who is now the OGD manager. Namur has set its goal at the fourth level of Tim Berners-Lee's model [4]. At the start of the OGD work, the OGD manager conducted numerous meetings, department by department, almost person by person, to discuss OGD. He organized training sessions (first with external companies, then by himself) and found that most of the public agents were willing to learn and improve the functioning of the administration. As for the resources, there was no additional hiring of people for OGD. The data office's budget is part of the IT department's budget and is decreasing over the years. Overall, this merge causes resource issues for the OGD work.
Inventory. OGD in Namur is driven by the existing cartographic data. The publishing of data can be viewed as a migration of the cartographic data to the portal. The OGD manager is in charge of the prioritization of the datasets to publish; however, prioritization is rare. He started by publishing data "he was certain about", such as buildings or roads. The data had no potential sharing issues and contained no personal details. The decision to publish data or not was in fact a balance between the necessity to publish it and its quality. This balance was weighed through meetings and discussions with the technical agent and the head of the concerned department. In addition to cartographic data, some datasets come from other departments; these are identified by the OGD manager and then published on the portal by the technical agent. These datasets are the most used or relevant for the entire administration (e.g., mobility data). One of the goals is to make the datasets available to the entire administration.
Publishing. The technical agent is in charge of uploading the datasets to the OGD portal. The most popular solution in Belgium (Opendatasoft) was picked for the portal, without involving the technical agent. The data publishing is straightforward with the Opendatasoft back office. When the data is cleaned beforehand through the ETL software (SafeFME), publishing is done fast without impediments. When anomalies are spotted, the technical agent corrects them or asks other departments for corrections, which she can obtain easily. There were some issues with older data (30+ years), as their encoding had changed, or content was irrelevant or incorrect. This required a careful cleansing of the data. However, the most time-consuming part of the process is generating metadata. The technical agent manually sets column titles and descriptions of the datasets.
Sustain. The OGD portal in Namur serves both as an internal tool to allow all departments to access the data of other units and as an external tool to give citizens access to data from the administration. Some data is automatically updated (e.g., every minute for the location of bikes). On the other hand, some data still needs manual updates, which the staff no longer have time for. In order to reach the fourth level of Tim Berners-Lee's model, the head of the data office is pushing towards dynamic data on the portal. The city of Namur has received no feedback on the impact of OGD (even though there is a feature on the portal where re-users can share their reuse), which the technical agent described as "pretty frustrating". The only identified reuse is a one-off collaboration with the University of Namur [7]. Data corrections are based on feedback received by e-mail from citizens or public agents. The feedback is often about the metadata.
Withdrawal. The city of Namur has not withdrawn any dataset from the OGD portal. The only negative feedback they received was on the thermographic data they represented on a 3D plan. However, they explained the relevance of having this data on the portal and the complaint was dropped. As a consequence, there is now the possibility for people to ask for the withdrawal of data.
User engagement. The OGD manager used several participation methods to develop and improve the OGD portal, such as interviews and meetings for the requirements analysis and prototyping for the development. The technical agent views citizen participation as a great opportunity to foster OGD reuse. The head of the data office indicated that monitoring the reuse of OGD was still at the stage of a project. Namur wants to make efforts in this direction, although the lack of available resources is a barrier to conducting OGD projects. A first step was made through the 3D representation of some open datasets on the portal.

Publisher Process: The Case of Linköping
Initialization. OGD in the municipality started for two reasons: (1) the ex-politician realized the potential of OGD when visiting a hackathon and (2) a citizen requested an OGD portal for the municipality. The ex-politician advocated for this proposal and a mission was given to an official (the first OGD manager) to create the portal in 2012. The ex-politician drove the issue forward to ensure it was prioritized. The first OGD manager left and was replaced by the OGD manager interviewed for this study. The two managers have worked together. The current OGD manager is a member of the IT-unit. Part of the OGD manager's work is to inform about OGD and implement APIs. In his daily work, he comes in contact with information resources that he can evaluate. The current objective is to release 3-4 datasets per year and continue to develop the OGD portal. He and the ex-politician are in favor of the open by default principle. Moreover, no OGD strategy has been created and there are no strategic goals. They have more resources than they can consume, but the OGD manager expressed a need for more people to work with OGD. The municipality has also developed an information security policy and guidelines, but has none for OGD.
Inventory. The OGD manager explained that they identify data that is interesting to citizens, which is often already published on the municipality's website. He once looked at the statistics for the top 10 visited information pages and continued to investigate them. Business representatives often have a feeling for what data is demanded by the public. Once they have identified interesting data, they meet with the unit responsible to inform them about OGD, that they seek to supplement the municipality's OGD, what OGD can and cannot do, and its possible effects. Sometimes users come with suggestions about new datasets to publish and these are taken into account. For example, a citizen asked if they could publish the road works dataset and so they did. Moreover, the system manager was introduced to OGD by the OGD manager when the latter wanted to publish the unit's data. The system manager and his organizational unit accepted the idea of OGD as they understood the data to be interesting to the public. He views it as a form of public service. He helps the OGD manager select parameters to publish. When publishing data, they do risk and technical analyses. The risk analysis involves studying the data, its information classification, legal parts, and data quality and its possible consequences.
Publishing. The technical analysis can involve identifying the current IT-systems and the need for an intermediate storage. Intermediate storage is important for sensitive data and can help to protect the original IT-systems, which was emphasized by the system manager. The OGD manager puts together groups where they discuss the risk and technical issues. The organizational unit often has an idea about the potential of the data and of how the citizens want the data. They also write documentation for the data and have recently started to experiment with tools for it. They control what information they can share, publish it, and then communicate about it to the users.
Sustain. They aim to have all of their data automatically updated. They tried manual updating at the start, but it was too resource-consuming. Their OGD portal is custom-built, but is slowly being replaced by a privately provided solution (Entryscape). On their portal, roughly 50% of the published OGD requires an API-key. To get the API-key, the user has to register an e-mail address. This setup allowed the publishers to contact users and ask questions about what they are doing with their data. This setup was not planned from the start, but came from internal worries and is perceived as unorthodox from an OGD perspective. They collect statistics about the number of API calls, but it is not something they monitor. Moreover, their OGD is used internally by their decision makers in different decision-support systems. However, there is currently limited monitoring of the Swedish development of OGD. They do not follow any maturity framework; rather, they focus on users' needs and on API access.
Withdrawal. The OGD manager explained that GDPR started discussions on limiting the real estate designations dataset. However, they do not want to withdraw data and the risk analysis is supposed to avoid it. They do not want data to "end up in the wrong hands" or break the law. The food data is currently not accessible as a new law is stopping it from being published.
User engagement. The e-mails collected for their API-key registration have allowed the municipality to contact users. For example, a family uses the data in their digital home, while a developer has implemented a food inspection presenter app. The OGD manager said that the developer praised the municipality for their APIs as it was the easiest data retrieval of all municipalities. The municipality used to participate in the arrangement of a hackathon called East Sweden Hack, but it closed after five years due to a decision from the top. The municipality has since then arranged Innovations and Social impacts instead. Today, they publish news and information about their OGD, but do not actively work with the users since some organizations, such as SKL (Sweden's municipalities and regions) and Vinnova (Sweden's innovation agency), focus on what data municipalities can publish instead of user engagement. The OGD manager explained that they only try to publish data and inform that it exists. They listen to users' feedback and have supplemented their dataset with new parameters based on requests.

Discussion
This section discusses the lessons learned from the similarities and differences of the cases, the theoretical and practical implications of the study, and limitations and future research leads.

Lessons Learned from the Comparison
We reflect on three key discrepancies between the case studies and previous research: the OGD manager role, user engagement, and re-use monitoring.
OGD manager role. In previous literature, the OGD manager is responsible for OGD and requirements and/or coordinating and managing activities [22,19]. The two cases nuance this description. The managers are close enough to the operations to get their hands into the manual publishing process, which they found resource-consuming. They are not responsible for data production (unlike [5]), nor are they considered project managers (unlike [10]). Both OGD managers cause change in the organization and manage an OGD unit, which is an overlapping subunit of other organizational units (e.g., IT and the Construction and Environment Committee). They are responsible for OGD and related requirements [22]. For example, when needed they can take help from developers, legal advisors, and data owners [10,19,5]. The managers also work with continual education and help to spread the idea of OGD in the organization. Education is a critical success factor for OGD initiatives [25]. However, the degree of education differs between the two cases. Namur had external technical training, while Linköping had not, which is likely a consequence of the commonness of digital skills in Sweden. In conclusion, the findings provide new insight into the work of OGD managers. Once basic OGD infrastructure is in place, such as OGD portals and curating tools (e.g., Opendatasoft), an OGD manager works to connect the data production processes with the OGD infrastructure while simultaneously changing people's minds and behaviors.
User engagement. Previous research stresses the importance of user engagement [10,17,25,5]. The cases do not actively work with user engagement. Namur is impeded by the lack of resources, while Linköping focuses on publishing their data because of the focus of other organizations in their environment. At the same time, the cases do not currently monitor the impact of OGD. Linköping did engage the users through a hackathon, but then stopped. It seems that the cases lack resource-effective methods to engage users.
Re-use monitoring. Previous literature also recommends monitoring OGD [10,25,5,18]. However, it seems that a simple e-mail solution has provided Linköping with evidence of the value of their OGD in the wild, which their hackathon could not. The e-mails combine very light user engagement and anonymous monitoring. They could track API call statistics for individual users and then contact them and ask questions. This setup, while unorthodox, could be a good way forward to get local evidence of valuable OGD for a publisher.
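The combination of e-mail registration and per-user API call statistics can be sketched as follows. This is a minimal illustration under stated assumptions, not Linköping's actual implementation: the class `ApiKeyRegistry`, its method names, and the example addresses are hypothetical.

```python
import secrets
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ApiKeyRegistry:
    """Hypothetical registry that issues one API key per registered e-mail."""
    emails_by_key: dict = field(default_factory=dict)
    calls_by_key: Counter = field(default_factory=Counter)

    def register(self, email: str) -> str:
        """Issue a random API key; the e-mail is the only required detail."""
        key = secrets.token_hex(16)
        self.emails_by_key[key] = email
        return key

    def record_call(self, key: str) -> bool:
        """Count one API call; return False for an unknown key."""
        if key not in self.emails_by_key:
            return False
        self.calls_by_key[key] += 1
        return True

    def most_active(self, n: int = 3) -> list:
        """Return (e-mail, call count) pairs for the n most active keys."""
        return [
            (self.emails_by_key[key], count)
            for key, count in self.calls_by_key.most_common(n)
        ]


registry = ApiKeyRegistry()
key_a = registry.register("user-a@example.org")  # placeholder address
key_b = registry.register("user-b@example.org")  # placeholder address
for _ in range(3):
    registry.record_call(key_a)
registry.record_call(key_b)
print(registry.most_active(1))
```

In such a setup, the list of most active keys gives the publisher a short list of users to contact with questions about how the data is reused, which is the light form of engagement described above.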

Theoretical Implications
The publisher process framework [6] is a good fit for the comparison of the two cases as it allowed us to structure the data collection and analysis. The overall structure was followed in the work of the publishers and interviews generated rich empirical material, allowed us to identify new roles, and brought some points of reflection to the participants (e.g., monitoring and user engagement). However, the framework is divided into different process groups, which created some dilemmas when analysing the data. For instance, the initiation group was revisited between inventory and publishing when OGD managers educated organizational units about OGD. The empirical material also showed a need for certain basic OGD infrastructure to be in place before any data could be published, which is not currently part of the framework. However, all mentioned activities of the publishers fit well within the framework.

Practical Implications
This study helped to provide empirical validation of the publisher process framework. This framework can be considered as a basic ex-ante strategic framework for OGD development. Practitioners within administrations could use the process groups as an actionable template for their strategies and use the diverse themes as key attention points to be taken into account.
Furthermore, the analysis of the OGD publisher processes of Namur and Linköping revealed similarities and also key differences in their process. Therefore, context-specific recommendations could be issued for each group depending on the variations of each city.

Limitations and Further Research
This study also presents some limitations. First, data collection was based on document studies, one in-depth interview with each of the participants, and a verification meeting with the Swedish OGD manager. The empirical material could be further enriched with meetings where the framework is presented and discussed between different participants.
Second, the analysis of the publishing process of the two cities did not follow a pure deductive approach, but the tentative OGD publishing process framework is based on several literature sources [6]. This approach allowed for structure in the data collection and analysis, but it is possible that more process groups, activities, and variations need to be identified.
Third, the publisher process framework has been applied to two cities that are comparable in terms of size and number of stakeholders involved. Even though they constitute a first validation step, the application of the framework to a more diverse set of cities would provide more extensive validation, allow the discovery of more variations in the OGD publishing processes, and help to suggest context-specific recommendations. The elicitation of the factors impacting the process, such as national culture [12], city context [24], or degree of OGD institutionalization, would constitute a promising step. Furthermore, studying the impact of these factors on the process and the variations they introduce would also be an interesting next step. In this study, we focused on key variations in the process, but an analysis of the impediments that publishers face would also make it possible to issue better recommendations for practice.

Conclusion
In this paper, through the lens of a theoretical framework, we have compared the OGD publishing process groups of two cities: Namur (Belgium) and Linköping (Sweden). This study contributes at several levels. First, we provide a first empirical validation of the OGD publishing framework suggested by [6]. Second, we take an in-depth look at the processes followed by Namur and Linköping and clearly identify their similarities and differences. We identified that OGD managers are agents of change for the operations and business of the organization, while needing to balance implementation and guidance. Linköping used an unorthodox method to engage and monitor users (account registration with an e-mail address as the only requirement), which allowed them to identify concrete value of their published data in the wild. Namur placed their OGD unit in a data office, while Linköping placed their OGD unit in an IT-unit. Finally, we suggest using the process framework as ex-ante strategic guidelines and opening the discussion for context-specific recommendations.

Table 2. Comparison between Namur and Linköping publisher processes.