Denas Grybauskas, Chief Governance and Strategy Officer, Oxylabs – Interview Series

Denas Grybauskas is Chief Governance and Strategy Officer at Oxylabs, a global leader in web intelligence collection and premium proxy solutions.

Founded in 2015, Oxylabs operates one of the world’s largest ethically sourced proxy networks, spanning over 177 million IPs across 195 countries, alongside advanced tools such as Web Unblocker, Web Scraper API, and OxyCopilot, an AI-powered scraping assistant that converts natural language into structured data queries.

You have had an impressive legal and governance journey through Lithuania’s legal-tech field. What motivated you to take on ethics and copyright, one of AI’s most polarizing challenges, in your role at Oxylabs?

Oxylabs has long been the industry’s standard-bearer for responsible innovation. We were the first to advocate for industry standards in ethical proxy sourcing and web scraping. Now, with AI moving so fast, we must ensure that innovation remains balanced with responsibility.

We think this is a huge problem facing the AI industry, and we can also see the solution. By providing these datasets, we can get developers on the same page about fair AI development, which benefits everyone involved. We wanted to put creators’ rights at the forefront while still providing content for the development of future AI systems, so we built these datasets as something that can meet the needs of today’s market.

The UK is in the midst of a fierce copyright battle, with strong voices on both sides. How would you describe the current state of the debate between AI innovation and creator rights?

While it is important for the UK government to prioritize productive technological innovation, it is crucial that creators are empowered and protected by AI rather than stolen from. The legal framework currently under debate must find a sweet spot between promoting innovation and protecting creators, and I hope that in the coming weeks we will see them find a way to achieve that balance.

Oxylabs has just launched the world’s first YouTube dataset built on creator consent for AI training. How does this consent process work, and how scalable is it to other industries such as music or publishing?

Every original video in the dataset comes with the creator’s explicit consent to its use for AI training, connecting creators and innovators through ethical means. The datasets provided by Oxylabs include videos, transcripts, and rich metadata. While such data has many potential use cases, Oxylabs has refined and prepared it specifically for AI training, the purpose the content creators knowingly agreed to.

Many tech leaders believe that requiring all creators to explicitly opt in would “kill” the AI industry. What is your reaction to this claim, and how does the Oxylabs approach prove otherwise?

Requiring prior explicit opt-in for every piece of material used in AI training poses significant operational challenges and would come at a substantial cost to AI innovation. Rather than protecting creators’ rights, it could inadvertently push companies to relocate development to jurisdictions with laxer enforcement or different copyright regimes. However, this does not mean there is no middle ground that encourages AI development while respecting copyright. What we need instead are viable mechanisms that simplify the relationship between AI companies and creators.

Consent-based datasets like ours provide one way forward. Another is an opt-out model, under which content can be used unless the copyright owner explicitly opts out. A third approach is to facilitate deals between publishers, creators, and AI companies through technical solutions such as online platforms.

Ultimately, any solution must operate within the scope of applicable copyright and data protection laws. At Oxylabs, we believe that AI innovation must be pursued responsibly, and our goal is to contribute to lawful, practical frameworks that respect creators while enabling progress.

What were the biggest hurdles your team had to overcome to make consent-based datasets feasible?

YouTube opened the way for us by enabling content creators to quickly and easily license their work for AI training. From there, our work was primarily technical: collecting, cleaning, and structuring the data to prepare the datasets, and building the entire technical setup for companies to access the data they need. But this is something we have been doing, in one form or another, for years. Of course, each case presents its own set of challenges, especially when dealing with something as large and complex as multimodal data, but we have the knowledge and technical capability. Given that, once YouTube authors had the opportunity to give consent, the rest was simply a matter of investing our time and resources.

Beyond YouTube content, do you envision a future where other major content types, such as music, writing, or digital art, can also be systematically licensed as training data?

For some time, we have pointed out the need for a systematic approach to consent and content licensing that enables AI innovation while balancing creator rights. Mutual benefit is only possible if both parties have convenient, cooperative ways to achieve their goals.

This is just the beginning. We believe that offering datasets like ours across various industries can provide a solution that ultimately brings the copyright debate to an amicable close.

How do the differing AI governance approaches in the EU, the UK, and other jurisdictions affect the importance of products such as Oxylabs’ ethical datasets?

On the one hand, the availability of cleared datasets levels the playing field for AI companies based in jurisdictions that lean toward stricter regulation. The main concern of these companies is that strict consent rules will simply hand an unfair advantage to AI developers in other jurisdictions. The problem is not that these companies do not care about consent, but that without a convenient way to obtain it, they are doomed to fall behind.

On the other hand, we believe that if obtaining consent and accessing consented data for AI training is simplified, there is no reason why this approach should not become the preferred one worldwide. Our dataset of licensed YouTube content is a step toward that simplification.

With the public increasingly distrustful of how AI is trained, how do you see transparency and consent becoming a competitive advantage for tech companies?

Although transparency is often seen as an obstacle to competitive advantage, it is also our biggest weapon against distrust. The more transparency an AI company can provide, the more evidence there is of ethical and beneficial AI training, thereby rebuilding trust in the AI industry. In turn, when creators see that they and society gain value from AI innovation, they have more reason to give consent in the future.

Oxylabs is often associated with web scraping and web intelligence. How does this new ethical initiative fit into the company’s broader vision?

The release of the YouTube datasets continues our mission at Oxylabs to establish and promote ethical industry practices. To this end, we co-founded the Ethical Web Data Collection Initiative (EWDCI) and introduced an industry-first transparency tier framework for proxy sourcing. We have also launched Project 4β as part of our efforts to enable researchers and academics to maximize their research impact and enhance understanding of critical public web data.

Going forward, do you think governments should mandate consent by default for training data, or should this remain a voluntary initiative led by the industry?

In a free market economy, it is usually best to let the market correct itself. By allowing innovation to respond to market demand, we continuously reshape and renew our prosperity. Heavy-handed legislation has never been a good first choice, and it should be resorted to only when all other avenues of letting innovation ensure fairness have been exhausted.

It does not look like we have reached that point in AI training yet. The licensing options YouTube offers creators, and our datasets, suggest that the ecosystem is actively finding ways to adapt to the new reality. So while clear regulation is certainly needed to ensure everyone acts within their rights, governments may want to tread lightly. Rather than requiring explicit consent in every situation, they might study the mechanisms the industry develops to resolve current tensions, and take cues from them when legislating, so as to encourage innovation rather than hinder it.

For startups and AI developers who want to prioritize ethical data usage, how can they do so without delaying innovation?

One way startups can help promote ethical data usage is by developing technical solutions that simplify the process of obtaining consent and deriving value from consented data. And with transparently sourced data already available as an option, AI companies do not need to compromise on speed. So my advice is to keep an eye out for this kind of product.

Thank you for the excellent interview; readers who wish to learn more should visit Oxylabs.
