the legalities, ethics and practicalities of building or distilling more intelligent models from 2.0 and o3
with the u.s. copyright office recently clarifying that content generated exclusively by an ai does not have copyright protection, the question arises: can one legally use gemini 2.0 or openai o3 to build a more intelligent ai through distillation or other means?
so first let's explore the legalities of using top models like google's gemini 2.0 and openai's o3 to build more intelligent models.
perplexity:
"The U.S. Copyright Office clarified that purely AI-generated content (e.g., o3 responses to text prompts) cannot be copyrighted, even with complex prompts. This means:
Technical knowledge/patterns in o3 outputs could theoretically be extracted without copyright infringement
Exact verbatim copies of protected training data in outputs remain problematic
While o3 outputs might not be protected, their training data often includes copyrighted works:
Ongoing lawsuits challenge whether AI training constitutes infringement (Silverman v. OpenAI)
Courts haven't ruled definitively on whether training qualifies as fair use."
so until courts rule definitively otherwise, it is provisionally legal to distill, or otherwise build, more intelligent ai models from 2.0, o3 and other top models.
now let's explore terms of service considerations.
perplexity:
"OpenAI's terms explicitly prohibit using outputs to train competing models. Violations could lead to:
Account termination
Potential breach of contract claims
API access revocation (as occurred with DeepSeek)
While copyright law might permit distillation of non-protected elements, contractual terms create separate legal exposure:
Microsoft blocked DeepSeek's API access despite unclear copyright status
Enterprise partners often require contractual compliance beyond copyright law"
essentially, terms of service prohibitions are purely contractual. considering that the question of whether it was legal and/or ethical for top ai developers to scrape the internet for their data sets remains unanswered, terms of service violations are an as yet unresolved ethical matter that is therefore left to the discretion of developers considering building more intelligent models from the top ais.
lastly, let's consider how building these models would be done.
perplexity:
"Building or distilling an AI model from another model (like OpenAI's GPT-3) without access to its weights and other core parameters is challenging but theoretically possible under certain conditions:
Distillation via APIs:
Knowledge distillation can occur by querying a model through its API, capturing input-output pairs, and using this data to train a new "student" model. This approach does not require direct access to the original weights but relies on extensive interaction with the model.
Weight-Agnostic Neural Networks:
Some research explores architectures that perform tasks without explicit weight training. These models optimize architectures rather than weights, showing potential for tasks like reinforcement learning and basic supervised learning.
Custom Architectures:
AI models can be built from scratch without frameworks or pre-trained weights by leveraging fundamental principles of neural networks and programming tools like NumPy or MATLAB. However, this requires significant expertise and computational resources.
These methods avoid direct access to proprietary weights."
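the api distillation route described above can be sketched in a few lines. this is a minimal illustration, not a working pipeline: the `query_teacher` function here is a hypothetical stand-in for a real model api call, and the "student" is a simple softmax regression rather than a large language model. the point is the workflow: query the teacher, collect input-output pairs, then train the student to match the teacher's outputs.

```python
# minimal sketch of api-based knowledge distillation.
# query_teacher() is a hypothetical stand-in for a real model api;
# a real pipeline would call e.g. an openai or gemini endpoint here.
import numpy as np

rng = np.random.default_rng(0)

def query_teacher(x):
    # stand-in "teacher": a fixed linear model returning soft
    # class probabilities, as an api might return token probabilities
    w_teacher = np.array([[2.0, -1.0], [-1.5, 1.0], [0.5, 0.5]])
    logits = x @ w_teacher.T
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# step 1: collect input-output pairs by "querying" the teacher
X = rng.normal(size=(500, 2))
soft_labels = query_teacher(X)

# step 2: train a student (softmax regression) on the soft labels
W = np.zeros((3, 2))
for _ in range(300):
    logits = X @ W.T
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    grad = (probs - soft_labels).T @ X / len(X)  # cross-entropy gradient
    W -= 0.5 * grad

# step 3: the student now mimics the teacher on unseen inputs,
# without ever having seen the teacher's weights
X_test = rng.normal(size=(100, 2))
agreement = np.mean(
    query_teacher(X_test).argmax(axis=1) == (X_test @ W.T).argmax(axis=1)
)
print(f"student/teacher agreement: {agreement:.2f}")
```

note that the student never touches the teacher's weights; everything it learns comes from the input-output pairs, which is exactly why this approach collides with terms of service rather than copyright law.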
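the "custom architectures" point above can also be made concrete. the following is a toy sketch, assuming nothing but numpy: a two-layer network built from scratch, with no framework and no pre-trained weights, trained by plain gradient descent on the classic xor problem. the architecture and hyperparameters are illustrative choices, not from the source.

```python
# a from-scratch neural network in numpy: no framework, no
# pre-trained weights. a tiny 2-layer mlp learns xor.
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# randomly initialized parameters: 2 inputs -> 8 hidden -> 1 output
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass (gradient of binary cross-entropy loss)
    d_out = out - y
    d_h = (d_out @ W2.T) * (1 - h ** 2)  # tanh derivative
    W2 -= 0.1 * (h.T @ d_out); b2 -= 0.1 * d_out.sum(0)
    W1 -= 0.1 * (X.T @ d_h);   b1 -= 0.1 * d_h.sum(0)

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(float)
```

as the quoted text notes, scaling this approach from a toy problem to a foundation model is where the significant expertise and computational resources come in.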
with deepseek r1 having substantially lowered the barrier to entry for creating foundational ai models, the above considerations become increasingly relevant for ai developers.