The copyright directive is a warning signal for Europe’s AI ambitions

The EU has set great ambitions around artificial intelligence, seeking to accelerate innovation and foster a much more competitive environment. But as the example of the copyright directive shows, much can go wrong for Europe’s AI businesses if they do not pay attention to what will be proposed.

It has now been over a month since the EU put forward its White Paper on Artificial Intelligence to harness and unleash Europe’s potential in the tech sector. This initiative can be regarded as the EU’s first serious attempt to join the race for AI leadership, competing on a global scale with countries such as China and the US. It comes as no surprise to many that the EU’s approach to becoming a leader in AI takes the form of more regulation and the reconciling of existing pieces of legislation, so that Europe is ‘Fit for the Digital Age’ in the context of the rapid emergence of AI-based technologies.

Whether more regulation is the answer to unlocking Europe’s potential in AI technologies remains to be seen. However, with the EU embarking on this path, companies may want to engage along the way to ensure that the end result is not another Copyright Directive, a piece of legislation that dragged EU legislators into years of discussions and has done European AI businesses few favours.

Copyright obstacles for AI

In the age of “Big Data”, access to high-quality data is fundamental for all businesses, but perhaps even more so for AI machine learning algorithms, whose development relies on being fed huge amounts of labelled data in order to provide accurate results. Although the debate around data tends to focus on the protection of personal data, AI developers can rely just as much on “raw” data such as literary, typographic, and artistic creations to develop certain types of algorithms.

Under EU law, this data enjoys hardly less protection than personal data. In particular, Articles 3 and 4 of the Copyright Directive act as a potential stumbling block for AI development by limiting the exceptions from copyright rules for text and data mining (TDM), a technique which involves gathering information through an automated process to extract and analyse previously unknown information.

Whereas Article 3 limits the exception to copyright rules to non-profit entities, such as non-profit research organisations and cultural institutions, the exception under Article 4 covers a wider category of users, including for-profit entities, but is narrower in scope; the caveat is that the copyright holder has the right to opt out of the exception. This creates a derivative market for text and data mining in which content holders license, control or restrict TDM, leaving for-profit data miners at the mercy of copyright holders.

With many industries relying heavily on the ability of machines to “read” and “interpret” data to develop state-of-the-art systems (such as digital and virtual assistants, chatbots, image recognition and comparison applications, and voice or personal assistants), mining data is fundamental to their business. The “opt out” clause of Article 4, however, leaves for-profit entities, commercial research organisations, innovators and developers with their hands somewhat tied in terms of competitiveness when compared with other copyright regimes around the world.

Glancing over at the competition

Looking at the US copyright landscape, businesses find themselves comparatively safe when engaging in TDM practices, thanks to the flexibility of the “fair use” principle under §107 of the US Copyright Act. This concept has accommodated the growth and use of data mining as a tool for AI, in the absence of specific provisions for machine learning and TDM techniques. Although the scope of “fair use” remains a matter of case law development, which requires assessing whether a particular use of a protected work is “fair”, US case law has so far held that incidental or intermediate copying should not be considered infringement and that TDM activities fall into this category.

In other words, in the US, mining techniques are not considered to be in breach of copyright rules because they are viewed as extracting the principles, facts, correlations and informational value contained in works, rather than using the works as works. This stands in clear contrast to the focus on authorship that prevails under EU copyright rules. In an era in which ‘data is the new gold’, the principles governing the US copyright landscape are more favourable for AI businesses.

Is Europe ready to take the lead on AI?

With AI technologies scaling up all over the world, Europe has realised it needs to step up if it wants to remain visible on the global tech map. As a means to do so, the Commission has chosen the regulatory path, having already announced it will follow up later this year with a legislative proposal for AI.

This background may provide a window of opportunity for European businesses to guide policymakers towards the type of regulation that would create a competitive environment for AI businesses, especially since the new Commission has been vocal about its determination to catch up in the AI race. For companies, not engaging may very well result in disproportionate regulatory measures that do not reflect the realities of the market or that lead to an insufficiently innovative environment for AI.