Digital industry of Central and Eastern Europe appeals to EU authorities to allow the use of public first-party data for training large language models
Economic growth and competitiveness are increasingly driven by data. Artificial intelligence, one of the most promising technologies, is particularly dependent on access to data. Meanwhile, EU regulations prevent companies from fully realizing this technological potential, argue digital industry associations from 8 Central and Eastern European countries. In a joint statement addressed to national governmental authorities and the European Commission, they call for a coherent and transparent regulatory framework for the use of data.
The appeal was signed by 9 digital industry associations from Central and Eastern European countries, including Bulgaria, the Czech Republic, Lithuania, Latvia, Poland, Romania, Slovakia and Slovenia. The associations cooperate within the CEE Digital Coalition, an initiative that aims to boost the digital transformation of the region.
According to the signatories of the joint letter, the development of artificial intelligence requires steady access to data, including locally relevant data gathered in particular regions of the world. Meanwhile, as the CEE Digital Coalition points out, the current European regulatory framework does not allow technology companies to make progress in the field. „While we believe in the importance of protecting individuals' privacy and ensuring that their personal data is used in a responsible manner, privacy - as a fundamental right under the Charter and the GDPR - must be balanced against other fundamental rights such as freedom of expression and information, and the freedom to run a business. For this reason we believe that the use of publicly available first-party data for training large language models (LLMs) based on legitimate interest should be allowed” – signatories of the letter argue.
AI development in Europe will be difficult without sufficient access to data
The authors of the statement note first that LLMs are not personal databases of user information. „They are mathematical objects that do not store copies of the data on which they were trained. Instead, they aggregate learnings from across the training data to predict typical language patterns and logical entailments between words” – the letter reads.
The industry organizations also point out that the use of publicly available first-party data for training LLMs based on legitimate interest is essential for the development of AI technology in Europe. They add that it allows companies to make accurate business decisions, optimize warehouse management, and develop innovative products that can compete with those offered by companies in other regions. „Limiting the use of publicly available first-party data for training LLMs would put European companies at a disadvantage and hinder the development of AI technology in Europe” – the digital industry experts warn.
The CEE Digital Coalition also highlights that the use of publicly available first-party data for training LLMs based on legitimate interest is consistent with individuals' earlier decision to make their thoughts public. It is also consistent with individuals' wishes regarding the use of their personal data in AI training datasets, which they can express through their ability to object. „The focus should be on the implementation of the right to object” – representatives of the CEE digital industry say.