Apparently, stealing other people’s work to create a product for money is now “fair use,” according to OpenAI, because they are “innovating” (stealing). Yeah. Move fast and break things, huh?

“Because copyright today covers virtually every sort of human expression—including blogposts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today’s leading AI models without using copyrighted materials,” wrote OpenAI in the House of Lords submission.

OpenAI claimed that the authors in that lawsuit “misconceive[d] the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence.”

  • SloppySol@lemm.ee
    7 months ago

    I would just like to say, with open curiosity, that I think a nice solution would be for OpenAI to become a nonprofit with clear guidelines to follow.

    What does that make me? Other than an idiot.

    Of that at least, I’m self aware.

    I feel like we’re disregarding the significance of artificial intelligence’s existence in our future, because the only thing anybody that cares is trying to do is get back control to DO something about it. But news is becoming our feeding tube for the masses. They’ve masked that with the hate of all of us.

    Anyways, sorry, diatribe, happy new year

    • MagicShel@programming.dev
      7 months ago

      I think OpenAI (or some part of it) is a non-profit. But corporate fuckery means it can largely be funded by for-profit companies which then turn around and profit from that relationship. Corporate law is so weak and laxly enforced that it’s a bit of a joke, unfortunately.

      I agree that AI has an important role to play in the future, but it’s a lot more limited in the current form than a lot of people want to believe. I’m writing a tool that leverages AI as a sort of auto-DM for roleplaying, but AI hasn’t written a line of code in it because the output is garbage. And frankly I find the fun and value of the tool comes from the other humans you play with, not the AI itself. The output just isn’t that good.

    • sculd@beehaw.orgOP
      7 months ago

      It is supposedly a non-profit, and that is how the board of OpenAI tried to fire Altman, but then big tech (Microsoft) intervened and wrested control.

      It’s basically Microsoft now.

      • SloppySol@lemm.ee
        7 months ago

        I would like to apologize for the following opinions, because they come from a place of unresolved hypocrisy that is me.

        Non-profit my ass. No such thing in America or anywhere else in the world, if you have the perspective to hunt and the money to signify modern value.

        Survival of the fittest, and the newborn technology that is at its core a mirror of us, to the most complex level of modern mathematics (I’m of the firm belief that logic is discovered, not created).

        With those seemingly unrelated concepts made with vague words, I ask you this:

        What does it mean to feel? To know many different kinds of “one,” to live without fear but still be whole? I am sorry, again, I’m naught but gibberish and I’m just so glad you responded. I forgot and came back to find a word I sent, and now I find what I seek, an event in which I can say we’ve been bonded.

        But now try to, now that I splay out, all I’ve got and am about, all I can see, is that to you my head, seems to be on my knees.

        Again, sorry! Thank you for responding! I’m just glad to vent, and in expression have my soul rend into two, and sent into a new view.

    • jarfil@beehaw.org
      7 months ago

      CC BY-NC-SA 4.0

      In order to apply that license, you will need to fully and unequivocally identify yourself (aka: doxx yourself). Not sure that’s what you really want.

      • morhp@lemmynsfw.com
        7 months ago

        I don’t believe this is true, a nickname or online account works completely fine for attribution if nothing else is given.

        • jarfil@beehaw.org
          7 months ago

          Attribution is not the problem. The problems are:

          1. Entering a valid license agreement under a non-registered pseudonym.
          2. Enforcing the conditions of the license, particularly the NC and SA parts, without revealing one’s legal name.

          Depending on the applicable legislation (US, UK, EU, other), either one or both of those points may not be possible.

  • kingthrillgore@lemmy.ml
    7 months ago

    …so stop doing it!

    This explains why Valve was, until recently, not so cavalier about AI: they didn’t want to be left holding the bag on copyright matters outside of their domain.

  • The Doctor@beehaw.org
    7 months ago

    As with many things, the golden rule applies. They who have the gold, make the rules.

  • intensely_human@lemm.ee
    7 months ago

    “Oh you know it’s like in Fight Club”

    “Sorry I have been trained in a legal matter and therefore know no cultural references from the last 30 years”

  • randomaside@lemmy.dbzer0.com
    7 months ago

    OpenAI now needs to go to court and argue fair use forever. That’s the burden of our system. Private ownership is valued higher than anything else, so… good luck, we’re all counting on you (unfortunately).

  • jlow (he/him)@beehaw.org
    7 months ago

    It’s also “impossible” to have multiple terabytes of media on my homeserver without copyright infringement, so piracy is ok, right!?

    Oh no, wait, it actually is possible; it’s just more expensive and more work to do it legally (and leaves a lot of plastic trash in the form of Blu-rays and DVDs), just like with AI. But laws are just for poor people, I guess.

  • explodicle@local106.com
    7 months ago

    Having read through these comments, I wonder if we’ve reached the logical conclusion of copyright itself.

    • frog 🐸@beehaw.org
      7 months ago

      Perhaps a fair compromise would be doing away with copyright in its entirety, from the tiny artists trying to protect their artwork all the way up to Disney, no exceptions. Basically, either every creator has to be protected, or none of them should be.

      • zaphod@lemmy.ca
        7 months ago

        IMO the right compromise is to return copyright to its original 14 year term. OpenAI can freely train on anything up to 2009 which is still a gigantic amount of material while artists continue to be protected and incentivized.

        • frog 🐸@beehaw.org
          7 months ago

          I’m increasingly convinced of that myself, yeah (although I’d favour 15 or 20 years personally, just because they’re neater numbers than 14). The original purpose of copyright was to promote innovation by ensuring a creator gets a good length of time in which to benefit from their creation, which a 14-20 year term achieves. Both extremes - a complete lack of copyright and the exceedingly long terms we have now - suppress innovation.

          • jarfil@beehaw.org
            7 months ago

            I’d favour 15 or 20 years personally, just because they’re neater numbers than 14

            Another neat number is: 4.

            That’s it, if you don’t make money on your creation in 4 years, then it’s likely trash anyway.

            • averyminya@beehaw.org
              7 months ago

              I’ve said it before and I’ll say it again! (My apologies if it happens to be to the same person, lol)

              Early access developers in shambles!

      • explodicle@local106.com
        7 months ago

        Apparently they’re going to just make only the little guy’s copyrights effectively meaningless, so yeah.

    • sanzky@beehaw.org
      7 months ago

      Copyright has become a tool of oppression. Individual authors’ copyright is constantly being violated, with few resources for them to fight back, while big tech abuses others’ work and big media uses theirs to the point of censorship.

    • t3rmit3@beehaw.org
      7 months ago

      IP law used to stop corporations from profiting off of creators’ labor without compensation? Yeah, absolutely.

      IP law used to stop individuals from consuming media where purchases wouldn’t even go to the creators, but some megacorp? Fuck that.

      I’m against downloading movies by indie filmmakers without compensating them. I’m not against downloading films from Universal and Sony.

      I’m against stealing food from someone’s garden. I’m not against stealing food from Safeway.

      If you stop looking at corporations as being the same as individuals, it’s a very simple and consistent viewpoint.

      IP law shouldn’t exist, but if it does it should only exist to protect individuals from corporations. When that’s how it’s being used, like here, I accept it as a necessary evil.

      • jarfil@beehaw.org
        7 months ago

        IP law used to compensate creators “until their death + 70 years”… you can spin it however you want, that’s just plain wrong.

        If you stop looking at corporations as being the same as individuals

        That’s a separate bonkers legislation. Two wrongs don’t make one right.

        • t3rmit3@beehaw.org
          7 months ago

          I never said I like IP law. I explicitly said it shouldn’t exist. I wish they’d strip out any posthumous ownership, absolutely. But I’m fine beating OpenAI over the head with that or any other law. Whether I advocate for or against copyright law will ultimately have no impact on its existence, so I may as well cheer it on when it’s used to hurt corporations, and condemn it when it’s used to protect corporations over individuals.

          That’s a separate bonkers legislation

          I’m not talking about the legislation, I’m talking about the mindset, which is very prevalent in the pro-AI tech spaces. Go to HackerNews and see just how hard the AI-bros there will fellate each other over “corporate rights”.

          My whole point is that there is nothing logically inconsistent about being against IP law while also understanding that, since its existence is a reality, it should be leveraged as well as possible (i.e. to hurt corporations).

    • interdimensionalmeme@lemmy.ml
      7 months ago

      I still think IP needs to eat shit and die. Always has, always will.

      I recently found out we could have had 3d printing 20 years earlier but patents stopped that. Cocks !

    • Daxtron2@startrek.website
      7 months ago

      It’s almost like most people are idiots who don’t understand the thing they’re against and are just parroting what they hear/read.

    • Mnglw@beehaw.org
      7 months ago

      I’m not so much in favor of IP law as I am in favor of informed consent in every aspect of the word.

      When posting photos, art and text content years ago, I was not able to imagine an AI might be trained on it. As such, I was not able to make a decision based on informed consent about whether I agreed to that or not.

      Even though quotes such as “once you post it, it’s on the internet forever” were around, I was not aware of the extent to which this reached, and that had my art been vacuumed up by a generative AI model (it hasn’t, luckily), people could create art that pretends to be created by me. Thus I could not consent.

      I think this goes for a lot of artists actually, especially those who exist far more publicly than I do, who are in those databases and who are a keyword to be used in prompts. There is no possible way they could have given informed consent to that at the time they posted art/at the time they started that social media profile/youtube channel etc

    • JokeDeity@lemm.ee
      7 months ago

      I’m the detractor here, I couldn’t give less of a shit about anything to do with intellectual property and think all copyright is bad.

  • bedrooms@kbin.social
    7 months ago

    Alas, AI critics jumped to conclusions this one time. Read this:

    Further, OpenAI writes that limiting training data to public domain books and drawings “created more than a century ago” would not provide AI systems that “meet the needs of today’s citizens.”

    It’s a plain fact. It does not say we have to train AI without paying.

    To give you some context, virtually everything on the web is copyrighted, from reddit comments to blog articles to open source software. Even open data usually comes with a copyright notice. Open research articles too.

    If misled politicians write a law banning the use of copyrighted materials, that’ll kill all AI developments in the democratic countries. What will happen is that AI development will be led by dictatorships, and that’s absolutely a disaster even for the critics. Think about it. Do we really want Xi, Putin, Netanyahu and Bin Salman to control all the next-gen AIs powering their cyber warfare while the West has to fight them with Siri and Alexa?

    So, I agree that, at the end of the day, we’d have to ask how much rule-abiding AI companies should pay for copyrighted materials, and that’d be less than the copyright holders would want. (And I think it’s sad.)

    However, you can’t equate these particular statements in this article to a declaration of fuck-copyright. Tbh Ars Technica disappointed me this time.

    • P03 Locke@lemmy.dbzer0.com
      7 months ago

      It’s bizarre. People suddenly start voicing pro-copyright arguments just to kill a useful technology, when we should be trying to burn copyright to the fucking ground. Copyright is a tool for the rich and it will remain so until it is dismantled.

      • AVincentInSpace@pawb.social
        7 months ago

        Life plus 70 years is bullshit.

        20 years from release date is not.

        No one except corporate bigwigs will say they should be allowed to do so in perpetuity, but artists still need legal protections to make money off of what they create, and Midjourney (making OpenAI boatloads of money off of making automated collages from artwork they obtained not only without compensation but without attribution) is a prime example of why.

    • krellor@beehaw.org
      7 months ago

      The issue is that fair use is more nuanced than people think, but that the barrier to claiming fair use is higher when you are engaged in commercial activities. I’d more readily accept the fair use arguments from research institutions, companies that train and release their model weights (llama), or some other activity with a clear tie to the public benefit.

      OpenAI isn’t doing this work for the public benefit, regardless of the language of altruism they wrap it in. They, and Microsoft, are hoovering up others’ data to build a for-profit product and make money. That’s really what it boils down to for me. And I’m fine with them making money. But pay the people whose data you’re using.

      Now, in the US there is no case law on this yet and it will take years to settle. But personally, philosophically, I don’t see how Microsoft taking NYT articles and turning them into a paid product is any different than Microsoft taking an open source project that doesn’t allow commercial use and sneaking it into a product.

      • bedrooms@kbin.social
        7 months ago

        Well, regarding text online, most of it is there for visitors to read for free. So, if we end up treating this AI training like a human reading text, one could argue they don’t have to pay.

        Reddit doesn’t pay their users, anyway.

        But personally, philosophically, I don’t see how Microsoft taking NYT articles and turning them into a paid product is any different than Microsoft taking an open source project that doesn’t allow commercial use and sneaking it into a product.

        Agreed. That said, NYT actually intentionally allows Google and Bing servers to parse their news articles in order to put their articles at the top of the search results. In that regard they might like a certain form of processing by LLMs.

        • krellor@beehaw.org
          7 months ago

          I thought about the indexing situation in contrast to the user paywall. Without thinking too much about any legal argument, it would seem that NYT having a paywall for visitors is them enforcing their right to the content and signaling that it isn’t free for all use, while them allowing search indexers access is allowing the content to be visible but not free on the market.

          It reminds me of the Canadian claim that Google should pay Canadian publishers for the right to index, which I tend to disagree with. I don’t think Google or Bing should owe NYT money for indexing, but I don’t think allowing indexing confers the right to commercial use beyond indexing. I highly suspect OpenAI spoofed search indexers while crawling content specifically to bypass paywalls and the like.

          I think part of what the courts will have to weigh for the fair use arguments is the extent to which NYT is harmed by the use, the extent to which the content is transformed, and the public interest between the two.

          I find it interesting that OpenAI or Microsoft already pay AP for use of their content because it is used to ensure accurate answers are given to users. I struggle to see how the situation is different with NYT in OpenAI’s opinion, other than perhaps on price.

          It will be interesting to see what shakes out in the courts. I’m also interested in the proposed EU rules which recognize fair use for research and education, but less so for commercial use.

          Thanks for the reply! Have a great day!

  • Esqplorer@lemmy.zip
    7 months ago

    The amount of second hand content an LLM needs to consume to train inevitably includes copyrighted material. If they used this thread, the quotes OP included would end up in the training set.

    The many fan forums and wikis about copyrighted material provide copious amounts of information about the stories and facilitate the retelling. They’re right that avoiding it is impossible for a general-purpose LLM.

    My personal experience so far though has been that general purpose and multiple modality LLMs are less consistently useful to me than GPT4 was at launch. I think small, purpose built LLMs with trusted content providers have a better chance of success for most users, but we will see if anyone can make that work given the challenge of bringing users to the right one for the right task.

    • flatbield@beehaw.org
      7 months ago

      Money is not always the issue. Take FOSS software, for example: who wants their FOSS software gobbled up by a commercial AI regardless? So there are a variety of issues.

      • intensely_human@lemm.ee
        7 months ago

        I don’t care if any of my FOSS software is gobbled up by a commercial AI. Someone reading my code isn’t a problem to me. If it were, I wouldn’t publish it openly.

        • sub_o@beehaw.org
          7 months ago

          I do, especially when someone’s profiting from it, while my license is strictly for non-commercial use.

          • The Doctor@beehaw.org
            7 months ago

            Same. I didn’t write it for them. I wrote it for folks who don’t necessarily have a lot of money but want something useful.

            • intensely_human@lemm.ee
              7 months ago

              Well, for $20/mo I get a super-educated virtual assistant/tutor. It’s pretty awesome.

              I’d say that’s some good value for people without much money. All of my open source libs are published under the MIT license if I recall correctly. I’ve made so much money using open source software, I don’t mind giving back, even to people who are going to make money with my code.

              It makes me feel good to think my code could be involved in money changing hands. It’s evidence to me that I built something valuable.

              • ParsnipWitch@feddit.de
                7 months ago

                $20/mo

                good value for people without much money

                The absolute majority of people cannot afford that. This is especially true for a huge share of the artists whose art was used to train various models.

                AI currently is a tool for rich people by rich people which uses the work of poor people who themselves won’t be able to benefit from it.

                • intensely_human@lemm.ee
                  7 months ago

                  And yet it is orders of magnitude less than it cost a year ago to hire someone to do research, write reports, and tutor me in any subject I want.

                  If an artist can’t afford $20/mo they need a job to support that hobby.

    • sanzky@beehaw.org
      7 months ago

      What’s stopping AI companies from paying royalties to artists they ripped off?

      Profit. AI is not even a profitable business now. These companies exist because of the huge amount of investment being poured into them. If they had to pay their fair share, they would not exist as a business.

      So what OpenAI says is actually true. The issue, IMHO, is the idea that we should give them a pass to do it.

      • sub_o@beehaw.org
        7 months ago

        Uber wasn’t making a profit either, despite all the VC money behind it.

        I guess they have reasons not to pay drivers properly. Give Uber a free pass for it too.

        • frog 🐸@beehaw.org
          7 months ago

          When you think about it, all companies would make so much more money if they didn’t have to pay their staff, or pay for materials they use! This whole economy and capitalism business, which relies on money being exchanged for goods and services, is clearly holding back profits. Clearly the solution here is obvious: everybody should embrace OpenAI’s methods and simply grab whatever they want without paying for it. Profit for everyone!