Generative AI and the media sector: Preliminary thoughts on a legal and policy agenda

Generative AI (or “GenAI”) is undoubtedly becoming the buzz term of the year. However, amid the excitement about the new tools that have emerged, the Italian data protection authority announced in March that it was temporarily blocking ChatGPT. In May, the US Senate held a hearing on the oversight of AI at which the need for regulation was extensively discussed, while the EU institutions are about to start the so-called trilogue negotiations that will lead to the adoption of the AI Act (the European Parliament is voting on its report in its plenary session today).

GenAI is expected to create opportunities for most sectors of the economy, leading to faster and perhaps more accurate decision-making. As regards the media sector in particular, GenAI can be used to improve the production and management of content. For example, journalists could rely on GenAI tools to quickly analyse information and summarise editorial content. In fact, according to a recent survey by the World Association of News Publishers, half of newsrooms currently use GenAI tools. GenAI may also enable media companies to enhance the personalisation of audience experiences by offering effective search and recommendation services on their platforms. Such use cases illustrate how GenAI can help media organisations reduce production costs and improve their monetisation capabilities. However, GenAI can also be expected to pose significant challenges, a glimpse of which we have already caught. Those challenges include infringements of copyright-protected works and the spread of illegal, harmful and manipulative content, such as disinformation and deepfakes.

Policymakers and regulators have a unique opportunity to influence how GenAI will evolve, for there are many lessons we have learnt in recent years. For starters, we know that we should not wait decades before regulating a technology (or an application thereof) that is being adopted at a fast pace. In that regard, the EU AI Act, which will introduce rules to increase transparency and ensure accountability in this area, seems to be a step in the right direction. Moreover, many of the problems experienced by media organisations are neither new nor specific to GenAI. Therefore, in many cases, the solution may not necessarily consist in adopting new rules, but in sensibly revising and extending the scope of existing rules.

This blog post discusses some of the concerns that GenAI may raise and what the existing (or soon-to-be adopted) rules can do to address them.

Competition and fairness concerns

A key factor determining the economic impact of GenAI is who owns the data and the models. This is why a recent report on the opportunities and risks of foundation models notes that “pushing the frontier of foundation models has thus far primarily been the purview of large corporate entities. As a result, the ownership of data and models are often highly centralised, leading to market concentration” (emphasis added). Statements made at the recent US Senate hearing on the oversight of AI identify concentration of market power as one of the problems that policymakers need to keep an eye on:

Senator Cory Booker: “[O]ne of my biggest concerns about this space is what I’ve already seen in the space of web two, web three is this massive corporate concentration. It is really terrifying to see how few companies now control and affect the lives of so many of us. And these companies are getting bigger and more powerful. And I see, you know, OpenAI backed by Microsoft. Anthropic is backed by Google. Google has its own in-house products. We know Bard. So I’m really worried about that. [..] Sam [Altman] [..] [a]re you worried about the corporate concentration in this space and what effect it might have?”

Sam Altman: “I think there will be many people that develop models. What’s happening in the open source community is amazing, but there will be a relatively small number of providers that can make models at the cutting edge” (emphasis added).

Considering the dynamics and characteristics of similar and adjacent markets, it is likely that concerns about abuses of dominance and “gatekeeping” will arise. We can already think of how practices such as tying/bundling, self-preferencing, default settings, and refusal to grant access to data may be employed to strengthen existing ecosystems. Examples include bundling search or social networks with (Gen)AI tools, tying cloud services packages to AI services, etc.

The question that may arise in the not-so-distant future is whether issues such as those described above would (or should) be addressed through the enforcement of competition law or the Digital Markets Act (“DMA”).

Competition law (and in particular Article 102 TFEU) may apply to GenAI providers that abuse their dominant position. Competition law has been enforced in recent years to address harmful behaviour in which tech giants engage, such as lack of transparency, self-preferencing, and the excessive processing of user data. Given that it is drafted in broad terms (with the effect that a wide range of practices fall within its scope), competition law would be capable of remedying the anti-competitive conduct of dominant GenAI providers. However, competition law applies ex post (i.e., after the harm materialises) and any remedies that may be imposed would only apply to the GenAI provider that has been found to act anti-competitively (i.e., a competition decision does not set an industry-wide standard).

The DMA applies to “gatekeeper” platforms, that is, tech companies that (a) act as gateways between consumers and businesses, and (b) provide “core platform services”, namely services that have been included in a closed list in the DMA (e.g., app stores, video-sharing platforms).

What speaks in favour of the DMA addressing the practices mentioned above (e.g., certain forms of bundling/tying, default settings) is that they have already been dealt with by competition authorities. In other words, it may not be necessary to go through the (same) learning curve that led to the adoption of the DMA.

But could GenAI services qualify as one of the “core platform services” that are regulated by the DMA? For instance, there may be GenAI services that could meet the criteria defining search engines (which the DMA defines as “digital services that allow users to input queries in order to perform searches of, in principle, all websites, or all websites in a particular language, on the basis of a query on any subject in the form of a keyword, voice request, phrase or other input, and return results in any format in which information related to the requested content can be found”). This definition relies on a search engine’s function of crawling websites and indexing their words and phrases. Popular GenAI tools work differently, though. For example, ChatGPT is a natural language processing model that is limited to the information it was trained on and does not, by default, have access to the internet. Similarly, one may wonder whether GenAI services may qualify as “virtual assistants” (which the DMA defines as “software that can process demands, tasks or questions, including those based on audio, visual, written input, gestures or motions, and that, based on those demands, tasks or questions, provides access to other services or controls connected physical devices”). GenAI services do not seem to fit squarely within the definition of virtual assistants, for they do not (always) provide access to other services (which is a requirement for virtual assistants to fall under the DMA).

Assuming that GenAI services start to perform a gatekeeping function and that they do not currently fall in scope, the Commission may conduct a market investigation for the purpose of examining whether such services should be added to the list of core platform services. However, the devil lies in the details. Unlike the obligations established in Articles 5, 6 and 7 (which may be revised through a delegated act), including additional services in the DMA requires a legislative proposal. In other words, unless GenAI services are currently deemed to qualify as core platform services, the DMA will need to be revised for them to be covered.

Finally, it is worth noting that the fact that GenAI services may not qualify as core platform services (for now) does not bring them outside the scope of the DMA. This is because certain DMA obligations apply to “other services” offered by gatekeepers, including core platform services for which an entity has not been designated as a gatekeeper and services that do not qualify as core platform services. For example, the prohibition whereby gatekeepers should not combine data across services unless users grant their consent applies to all services provided by the gatekeeper. This is expected to raise many questions in terms of (a) the datasets on which a foundation model can be trained, and (b) whether, even if users consent to data combination, compliance with the GDPR (and, in particular, the requirement to obtain informed consent) is even possible in such cases.
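Purely by way of illustration, the following Python sketch shows what consent-gated data combination could look like at the data-handling level. The `UserRecord` structure and the service names are hypothetical constructs introduced for this example, not a description of how any gatekeeper actually stores or processes data.

```python
from dataclasses import dataclass, field

@dataclass
class UserRecord:
    user_id: str
    # Data held about the user, per service (service name -> data points).
    service_data: dict[str, list[str]] = field(default_factory=dict)
    # Services whose data the user has consented to having combined.
    combination_consent: set[str] = field(default_factory=set)

def combinable_data(user: UserRecord) -> list[str]:
    """Return only the data points that may be combined (e.g., pooled
    into a training set), i.e., those from services the user opted in to."""
    combined: list[str] = []
    for service, data in user.service_data.items():
        if service in user.combination_consent:
            combined.extend(data)
    return combined

# The user consented to combining "search" data only, so the
# hypothetical "video" data stays out of the combined dataset.
user = UserRecord(
    user_id="u1",
    service_data={"search": ["query_history"], "video": ["watch_history"]},
    combination_consent={"search"},
)
print(combinable_data(user))  # ['query_history']
```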

Protection of intellectual property rights (“IPRs”)

Ensuring that GenAI evolves in a manner that is conducive to IPR protection is arguably one of the greatest challenges to address from the perspective of the media and creative industries. It is widely accepted that inadequate IPR protection chills content creativity and innovation. Moreover, considering how digital technologies (including AI) have been used to spread disinformation, it is ever more important to protect the right of (trusted) media service providers to authorise the use of (and be fairly remunerated for) their works.

In the EU, the DSM Copyright Directive establishes rules for text and data mining (TDM). TDM is defined in the Directive as “any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations”. The Directive establishes the so-called TDM exception, which allows GenAI providers to access large amounts of data to train the model concerned and generate “new” content.

Concretely, Article 4 of the DSM Copyright Directive establishes two conditions under which the TDM exception applies. First, the copyright-protected work must be lawfully accessible, including when it has been made available to the public online (e.g., a news article that can be accessed online). Second, the copyright holder must not have expressly reserved the extraction of text and data. This condition essentially establishes an opt-out mechanism whereby the owner of the copyright-protected work expressly reserves TDM for itself (usually through machine-readable means, such as metadata, or through the Terms and Conditions of its website or service). As a result, in order to protect their content against unauthorised use by GenAI providers, media organisations should opt out of the TDM exception in an explicit and clear manner.
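By way of illustration, one machine-readable channel through which such a reservation can be expressed is the W3C Community Group’s TDM Reservation Protocol (“TDMRep”), which provides, among other things, for a `tdm-reservation` HTTP header (a value of `1` signalling that TDM rights are reserved). The Python sketch below shows how a crawler feeding a GenAI training pipeline might check for that signal before mining a page; it is a simplified sketch of one possible compliance check, not a full implementation of the protocol (which also provides for a `/.well-known/tdmrep.json` file).

```python
import urllib.request

def tdm_rights_reserved(url: str) -> bool:
    """Check the TDMRep 'tdm-reservation' HTTP header for a page.

    A value of "1" means the rightsholder has reserved TDM rights.
    Simplified sketch: a full implementation would also consult the
    site's /.well-known/tdmrep.json file and any robots.txt directives.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("tdm-reservation") == "1"

# A crawler would skip pages for which rights have been reserved:
# if tdm_rights_reserved("https://publisher.example/article"):
#     ...  # do not mine this page
```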

As GenAI services proliferate and their popularity increases, the question arises whether the TDM exception is fit for purpose. On the one hand, the TDM exception does not strip content creators of copyright protection. The exception only applies where the copyright holder has not expressly reserved the exclusive right to engage in TDM. On the other hand, it does impose a regulatory burden on the copyright holder. In that regard, it is worth recalling that the DSM Copyright Directive regulates online content-sharing service providers (“OCSSPs”), such as YouTube, in the opposite way. Under Article 17(1) of the Directive, it is the OCSSP that must seek to obtain an authorisation from the copyright holder in order to communicate to the public or make available to the public the protected work. As (at least certain) GenAI providers will hold significant bargaining power, one may wonder whether the regulatory burden introduced by the TDM exception should be carried by, e.g., a local newspaper publisher (and not a tech giant).

However, the TDM exception (and the opt-out system it establishes) is not the only issue that merits attention when discussing copyright protection and the rise of GenAI. Another equally important issue concerns the difficulties encountered in detecting copyright infringements. The soon-to-be adopted AI Act may provide a solution (though this will depend on the final text that is adopted). The report on the AI Act proposal that was adopted by the European Parliament (“EP”) in May proposes obligations for providers of foundation models, including foundation models used in GenAI systems. One of those obligations consists in “documenting and making publicly available a sufficiently detailed summary of the use of training data protected under copyright law” (Article 28b(4)(c)). This provision would essentially enable copyright holders, including newspaper publishers, to identify instances where their content has been used without prior authorisation and to claim damages.

However, the final text of the AI Act should arguably clarify the meaning and scope of this obligation. Notably, it is not clear whether the objective of the provision is for GenAI providers to list all or most of the copyrighted material they use. If this is indeed the case, legitimate concerns have been expressed that the obligation may be difficult (if not impossible) to comply with, given the territorial fragmentation of copyright (though some EU instruments harmonise certain aspects of copyright, this remains an area that is primarily regulated at the national level), the absence of a registration requirement for copyright-protected works, and the poor state of copyright ownership metadata.
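To make the compliance question more concrete, one could imagine such a summary taking the form of a structured manifest along the following lines. The schema and field names below are purely hypothetical assumptions introduced for illustration; neither the EP’s text nor the AI Act prescribes any particular format.

```python
import json

# Hypothetical manifest for a "sufficiently detailed summary" of
# copyright-protected training data; every field name is illustrative.
training_data_summary = {
    "model": "example-foundation-model-v1",
    "sources": [
        {
            "dataset": "news-crawl-2023",
            "description": "Publicly accessible news articles",
            "copyright_status": "protected",
            "tdm_optouts_honoured": True,
            "rightsholders_identified": ["Example Publisher A"],
        },
        {
            "dataset": "public-domain-books",
            "description": "Pre-1900 literature",
            "copyright_status": "public_domain",
        },
    ],
}

print(json.dumps(training_data_summary, indent=2))
```

Even a simple manifest like this presupposes that the provider can actually identify rightsholders, which is precisely what the fragmentation and metadata problems described above call into question.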

Brand attribution

One of the main concerns media organisations have expressed about GenAI tools (see, for instance, here and here) is that they extract much more proprietary content from the original sources than traditional search does, and often provide little or no attribution. Brand attribution (or the lack thereof) is an important issue for several reasons. Notably (and this applies to businesses across the board), lack of attribution prevents the creation of brand awareness. In addition, lack of attribution to (trusted) media service providers prevents them from establishing a direct relationship with their audiences, which may exacerbate the trend toward zero-click searches. What is more, lack of attribution makes it harder to counter the spread of disinformation, as audiences cannot trace content back to a trusted source.

In EU law, brand attribution is addressed in the platform-to-business Regulation, which establishes an obligation for providers of online intermediation services to “ensure that the identity of the business user providing the goods or services on the online intermediation services is clearly visible”.

The term “online intermediation service” covers services which meet all of the following requirements: (a) they constitute “information society services” within the meaning of EU law (i.e., services normally (but not necessarily) provided for remuneration, at a distance, by electronic means and at the individual request of a recipient of services); (b) they allow business users to offer goods or services to consumers, with a view to facilitating the initiating of direct transactions between those business users and consumers, irrespective of where those transactions are ultimately concluded; and (c) they are provided to business users on the basis of contractual relationships between the provider of those services and business users which offer goods or services to consumers. Though GenAI services could qualify as “information society services”, it is far from clear whether they fulfil the other two criteria set by the P2B Regulation. Concretely, they do not seem to perform an intermediation function similar to app stores and e-commerce marketplaces (which the P2B Regulation currently covers) nor is there always a contractual relationship between businesses and the providers of GenAI services (see, for instance, the points on copyright protection that were made in the previous section).  

Transparency and accountability

Two of the most debated issues in discussions about the regulation of the digital economy are the lack of transparency underpinning how technologies and applications work and the limited accountability of their providers. These issues clearly also concern the design and development of GenAI services, which may, for instance, disseminate illegal, harmful, and manipulative content. In the EU, the main instruments that (will) establish rules seeking to promote transparency and accountability in the digital economy are the Digital Services Act (“DSA”) and the AI Act.

The DSA establishes obligations to mitigate the spread of illegal content, such as hate speech, and manipulative content. For example, platforms that fall under the scope of the DSA must establish a notice and action mechanism under which users may flag potentially illegal content, which the platform must then review and, where warranted, remove. Platforms are required to establish an internal complaint-handling and redress system and offer out-of-court dispute settlement, so that users are not required to initiate court proceedings to challenge content decisions. Platforms should also refrain from designing, organising or operating their online interfaces in a way that deceives or manipulates users. Moreover, very large online platforms must establish a compliance system consisting, inter alia, of proactive risk management strategies and independent audits, including by identifying and reporting on systemic risks, such as risks to democracy and media pluralism.
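To give a rough sense of what such a notice must convey, Article 16 DSA requires, inter alia, a substantiated explanation of the alleged illegality, the exact electronic location of the content (such as a URL), the notifier’s name and email address, and a statement of the notifier’s good-faith belief in the accuracy of the notice. The minimal Python sketch below models those elements; the class and field names are illustrative choices, not terms drawn from the DSA.

```python
from dataclasses import dataclass

@dataclass
class IllegalContentNotice:
    """Illustrative model of the elements an Article 16 DSA notice
    must contain; the field names are not drawn from the DSA itself."""
    explanation: str            # substantiated reasons why the content is allegedly illegal
    exact_location: str         # exact electronic location, e.g. the URL(s)
    notifier_name: str          # name of the notifying individual or entity
    notifier_email: str         # email address of the notifier
    good_faith_statement: bool  # bona fide belief that the notice is accurate and complete

notice = IllegalContentNotice(
    explanation="The post contains hate speech targeting a protected group.",
    exact_location="https://platform.example/posts/12345",
    notifier_name="Jane Doe",
    notifier_email="jane@example.com",
    good_faith_statement=True,
)
```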

Clearly, all of the above provisions are relevant to GenAI tools. However, it is doubtful whether GenAI providers fall under the DSA at all. In particular, GenAI tools do not seem to qualify as “hosting services” as defined in the DSA, because the information those tools store is not provided by the recipient of the service (the information is generated by the GenAI tool itself). As a result, GenAI tools would not qualify as online platforms (which the DSA defines as “a hosting service that, at the request of a recipient of the service, stores and disseminates information to the public”). This definitional issue essentially means that the content GenAI providers offer is left up to Member States to regulate. However, national rules vary considerably across the EU and (more importantly) do not establish the same obligations as the DSA for ensuring the expeditious removal of illegal, including manipulative, content.

As regards the AI Act, a notable development in the legislative process that concerns GenAI is the EP’s report on the AI Act. The main change proposed by the EP in this area is the establishment of rules specific to foundation models, including foundation models used in GenAI systems. In particular, Article 4a of the EP’s report sets out that foundation models should be developed in accordance with the general principles that bind AI systems, including human agency and oversight, privacy and data governance, and transparency. Article 4a also provides that “for foundation models, the general principles are translated into and complied with by providers by means of the requirements set out in Articles 28 and 28b.”

The above amendments, which (a) explicitly bring foundation models under the scope of the AI Act, and (b) establish that foundation models should observe the principles that apply to all AI systems, are a response to the rapid and widespread adoption of GenAI tools (having reached 100 million active users within two months of its launch, ChatGPT has been described as the “fastest-growing consumer application ever launched”). However, it is not clear to what extent the rules proposed by the EP that are specific to GenAI will contribute towards increasing the accountability of GenAI providers.

For example, under Article 28b(4), providers of foundation models used in GenAI systems should, inter alia, “train, and where applicable, design and develop the foundation model in such a way as to ensure adequate safeguards against the generation of content in breach of Union law in line with the generally-acknowledged state of the art, and without prejudice to fundamental rights, including the freedom of expression” (emphasis added). This provision seems to establish a content moderation requirement. However, the exact meaning of this obligation would need to be fleshed out, for it is currently not clear what GenAI providers are expected to do.
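In the absence of further guidance, one could imagine such safeguards taking the form of generation-time output filtering along the following (deliberately naive) lines. The `violates_union_law` classifier is a placeholder assumption, and nothing in the EP’s text mandates this particular design.

```python
def violates_union_law(text: str) -> bool:
    """Placeholder classifier: in practice this would be a trained
    moderation model or rule set, which the EP's text does not specify."""
    blocked_markers = ["<illustrative-unlawful-marker>"]  # assumption only
    return any(marker in text for marker in blocked_markers)

def generate_with_safeguards(prompt: str, generate) -> str:
    """Wrap any text generator with a post-generation compliance check,
    withholding output that the (placeholder) classifier flags."""
    candidate = generate(prompt)
    if violates_union_law(candidate):
        return "[output withheld by content safeguard]"
    return candidate

# Usage with any callable text generator, e.g.:
# answer = generate_with_safeguards("Summarise today's news", my_model)
```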

This is an important point because the two main instruments that regulate the dissemination of illegal/copyright-infringing content in the EU, namely the DSA and the DSM Copyright Directive, do not apply to GenAI providers. Concretely, as discussed above, GenAI tools may not qualify as “hosting services” under the DSA. Similarly, GenAI tools may not qualify as “online content-sharing service providers” under the Copyright Directive because the copyright-protected works are not uploaded by the users of the service (the copyright-protected works are scraped by the GenAI tool itself). As a result, the obligations the DSM Copyright Directive and the DSA establish for mitigating the dissemination of illegal/copyright-infringing content (e.g., the obligation to establish “notice and take down” mechanisms) do not apply to GenAI providers. The AI Act should fill that gap in a meaningful manner.

Conclusions

GenAI creates many opportunities for the media sector that may reduce the costs of producing and distributing media content, enhance personalisation, and improve monetisation capabilities. However, GenAI also poses several challenges, including copyright infringements (and difficulties in detecting them), lack of brand attribution, and limited accountability. The EU legal framework may address some of the above challenges through existing rules and rules that will soon be adopted.

The list of challenges discussed in this post is far from exhaustive. However, it illustrates that, given the fast pace at which GenAI tools are being adopted, it is imperative for media organisations to understand the tools the existing framework offers to address their concerns, and to design a regulatory strategy that ensures that ongoing and future initiatives deliver for the media sector.
