Registry of State Data Sets

On the President's instructions, the Russian government is developing a framework for regulating access to government data. To conduct effective research and build advanced AI solutions, developers need access to large volumes of data.

The government has submitted a draft law on data depersonalization to the State Duma, which aims to provide favorable legal conditions for collecting, storing, and processing data using new technologies. In January 2022, the government instructed the Ministry of Digital Development, Telecommunications and Mass Media to develop a plan to give businesses access to data for AI training.

Citizen data should be stored in a single and secure government system. The government should be responsible for storing the information while ensuring free access for AI developers.

The essence of the bill

Creates conditions for comprehensive training and improvement of AI models using data from Russian corporations and government agencies;

Provides that businesses will be able to transfer anonymized citizen data to the state 3 years after its collection, for use by developers of AI solutions;

Defines how personal data will be anonymized.

Access to data sets

Information will be transmitted to developers in encrypted form to prevent access to citizens' personal data. In addition, large domestic companies will transfer citizen data to the state 3 years after collection. By that time, the data will have lost its commercial value but will remain in demand for building theoretical AI models.

Thanks to this initiative, developers of AI technologies will have unprecedented access to a vast array of data from government agencies and large companies, allowing them to create breakthrough AI solutions.

Planned results

Data sets

Each federal executive authority should prepare two labeled state data sets to improve the efficiency of public administration.

Control system

The Ministry of Digital Development, Telecommunications and Mass Media will begin work on creating a single state data lake and a marketplace for data sets.

Access to data

Hackathon participants will be given access to state data sets to create and validate solutions that use AI technologies.

Existing achievements


In 2021, 26 departmental data sets were created: 4 for use by third-party developers in business solutions and 22 for the internal needs of federal executive bodies as part of their digital transformation.