Microsoft released a new kind of artificial intelligence technology.

Called Copilot, the tool was designed to help programmers. As they worked, it would suggest ready-made blocks of computer code for them to add to their own.

Many programmers liked the new tool. Matthew Butterick was not among them. He and a group of other lawyers have filed a lawsuit that seeks class-action status against Microsoft and the other companies that designed and deployed Copilot.

Copilot developed its skills by analyzing vast amounts of data, in this case computer code posted to the internet. Mr. Butterick equates this process to piracy because the system does not acknowledge its debt to existing work. His lawsuit accuses Microsoft and its partners of violating the legal rights of millions of programmers.

The suit is believed to be the first legal attack on a design technique called "A.I. training," which is poised to remake the tech industry. Many artists, writers, pundits and privacy activists complain that companies are training their systems on data that does not belong to them.

Matthew Butterick, a programmer and lawyer, said he was concerned that work he had done was being improperly employed in new artificial intelligence systems. Credit: Tag Christof for The New York Times

The technology industry has seen similar conflicts in the past. In the 1990s and 2000s, Microsoft fought the rise of open source software, seeing it as a threat to its business. As the importance of open source grew, Microsoft embraced it and even acquired GitHub, a place where open source programmers build and store their code.

Almost every new generation of technology has faced legal challenges of some kind. Bradley J. Hulbert is an intellectual property lawyer who specializes in this area of the law.

The suit is part of a groundswell of concern over artificial intelligence. Creative types worry that companies and researchers are using their work to create new technology without their consent and without providing compensation. Companies train a wide variety of systems in this way, including art generators, speech recognition systems and even self-driving cars.

OpenAI, an artificial intelligence lab in San Francisco backed by a billion dollars in funding from Microsoft, built the technology behind Copilot. The lab is part of a widespread effort to train artificial intelligence technologies using digital data.

Nat Friedman, the chief executive of GitHub, said that using existing code to train the system was fair use of the material under the Copyright Act. No court has yet tested this argument.

Mr. Butterick said in an interview that the ambitions of Microsoft and OpenAI go far beyond Copilot. The companies, he said, want to train on any data for free.

Mr. Butterick and a team of other lawyers are suing Microsoft and other developers of Copilot. Credit: Mike Segar/Reuters

In 2020, OpenAI unveiled a system called GPT-3. Researchers trained it on huge amounts of digital text, including thousands of books.

By finding patterns in that text, the system learned to predict the next word in a sequence. When someone typed a few words into the model, it could complete the thought with entire paragraphs of text. In this way, the system could write its own posts on social media.

The researchers who built the system were surprised to learn that it could even write computer programs.

OpenAI then trained a new system, Codex, on a new collection of data stocked with code. In a research paper detailing the technology, the lab said that at least some of this code came from GitHub, a programming service owned by Microsoft.

Codex became the underlying technology for Copilot, which Microsoft distributed to programmers. After being tested with a relatively small number of programmers, Copilot was made available to all programmers.

Many programmers who have used the technology said that the code Copilot produces is simple and might be useful to a larger project, but must be adjusted, augmented and scrutinized. Some programmers find it useful only when they are learning to code or trying to learn a new language.

Codex became the building block for Copilot. Credit: Jason Henry for The New York Times

Mr. Butterick was worried that Copilot would destroy the global community of programmers who have built the code at the heart of most modern technologies. A few days after the system's release, he published a post titled "This copilot is stupid and wants to kill me."

Mr. Butterick identifies as an open source programmer, part of the community of programmers who openly share their code with the world. Over the past 30 years, open source software has helped drive the rise of most of the technologies that consumers use on a daily basis.

The sharing of open source software is governed by licenses designed to ensure that it is used in ways that benefit the community of programmers. Mr. Butterick believes that Copilot, as it improves, will make open source coders obsolete.

He filed his suit after complaining about the issue for a long time. The suit is still in its early stages and has not yet been granted class-action status.

Many legal experts were surprised that Mr. Butterick's suit does not accuse Microsoft, GitHub or OpenAI of copyright infringement. Instead, it argues that the companies violated GitHub's terms of service and privacy policies and ran afoul of a federal law that requires companies to display copyright information when they make use of material.

Mr. Butterick and Joe Saveri, another lawyer involved in the lawsuit, said the suit could eventually address the copyright issue.

Joe Saveri is one of the lawyers involved in the lawsuit. Credit: Tag Christof for The New York Times

Microsoft and OpenAI did not comment on the lawsuit.

Training an A.I. system on copyrighted material is not necessarily illegal, according to most experts. But it could be if the system ends up creating material that is similar to the data it was trained on.

Pam Samuelson, a professor at the University of California, Berkeley, who specializes in intellectual property and its role in modern technology, said a legal assessment is needed.

Dr. Samuelson said that it is no longer a toy issue.