Maven Imported 1.12 Million Fediverse Posts (wedistribute.org)
from deadsuperhero@lemmy.ml to fediverse@lemmy.ml on 12 Jun 20:34
https://lemmy.ml/post/16803328

Maven, a new social network backed by OpenAI’s Sam Altman, found itself in a controversy today when it imported a huge amount of posts and profiles from the Fediverse, and then ran AI analysis to alter the content.

#fediverse

FinchHaven@sfba.social on 12 Jun 20:40

@deadsuperhero

Been looking into #Maven all morning

Just going to copy-paste two posts

The head admin/dev @jsecretan claims:

"Happy to remove any of your posts from Maven and cease ingestion from those servers going forward"

So, after the fact, individuals on Mastodon have to contact you personally and ask you to stop?

Is that your position?

Reminds me of Byron Miller (@Supernovae@universeodon.com) and his since-deleted "In four months of having full text search [we haven't heard from anyone who has been directly harmed]..."

That last is a paraphrase because Supernovae has pretty much removed any mention of himself from the Fediverse, right down to deleting his involvement with Mastodon on Github, causing renchap to opine:

"I suspect that @Supernovae closed it because they do not want to be involved with Mastodon anymore."

here: https://github.com/mastodon/mastodon/issues/21398#issuecomment-2145321855

Executive summary: there are a lot of people On Here(tm) who don't appreciate every new idea all you bright-eyed young creatives can come up with

FinchHaven@sfba.social on 12 Jun 20:41

@deadsuperhero

Instructive to read #Maven's #About page and see who's behind it.

Here: https://www.heymaven.com/about

Selected excerpts from "Who is behind Maven?"

"CEO Ken Stanley is an expert on open-ended discovery in both AI and human systems and ... (most recently leading the Open-Endedness Team at #OpenAI )."

At: "Is Maven part of a larger company?"

"No, Maven is an independent startup."

But

"Here are a few of our investors, who also commented on their reasons for supporting Maven:

-- Ev Williams, co-founder of #Twitter: “Maven lets you follow your deepest curiosities instead of the trends of the day.”

-- Sam Altman, CEO of OpenAI: “In Maven, there is a chance for AI to play a role in fixing much that is broken in our online discourse.”

-- Rana El Kaliouby, co-founder of Affectiva..."

Sam Altman

#SamAltman

Where have I heard that name before?

fossphi@lemm.ee on 13 Jun 22:40

Sam Altman

Everything he touches tends to inevitably turn to shit. This guy and his fellow silicon valley cronies are scum of the earth

QuadratureSurfer@lemmy.world on 13 Jun 02:01

Hmmm it was even able to pull in private DMs.

Maybe private DMs on Mastodon aren’t as private as everyone thinks… that, or the open nature of ActivityPub is leaking them somehow?

Edit - From the article:

Even more shocking is the revelation that somehow, even private DMs from Mastodon were mirrored on their public site and searchable. How this is even possible is beyond me, as DM’s are ostensibly only between two parties, and the message itself was sent from two hackers.town users.

From what @delirious_owl@discuss.online mentioned below, it sounds like this shouldn’t be very shocking at all.

delirious_owl@discuss.online on 13 Jun 02:14

They’re called DMs not PMs

QuadratureSurfer@lemmy.world on 13 Jun 02:20

They’re called DMs not PMs

? Did you mean that the other way around? And if you did… forgive me, I don’t really use Mastodon. I was never much of a twitter fan. I don’t really like how all of my likes are public (although I guess I have had to get used to that with Lemmy).

delirious_owl@discuss.online on 13 Jun 02:21

No. They’re direct. They’re not private.

QuadratureSurfer@lemmy.world on 13 Jun 02:23

Ah, I see. So it’s the same mistake that Lemmy users make when thinking that Upvotes/Downvotes aren’t public.

It sounds like DMs on Mastodon are public, but are commonly mistaken to be private then?

delirious_owl@discuss.online on 13 Jun 15:58

I don’t know why anyone would think any of this stuff is private. It can be pseudonymous, but that’s up to you.

JackbyDev@programming.dev on 13 Jun 04:14

PM never implied any form of end to end encryption. It only ever meant people couldn’t see it apart from site operators. I genuinely don’t believe people thought it meant otherwise.

delirious_owl@discuss.online on 13 Jun 15:56

But on a federated system, everyone can see all messages. That’s expected.

JackbyDev@programming.dev on 13 Jun 16:44

No, it should just be your instance admin and the admin of the instance you’re messaging.
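
For context on the addressing being argued about here: in ActivityPub, a post's audience is set by its `to` and `cc` fields, and a Mastodon "direct" post is commonly described as an ordinary note addressed only to the mentioned accounts and delivered unencrypted to their servers. The Python sketch below illustrates that distinction under stated assumptions: the `visibility` helper, the example accounts, and the idea that the followers collection lives at `<actor>/followers` are all hypothetical and are not Mastodon's actual code.

```python
# Sketch only: classify a post's audience from its ActivityPub addressing.
# Assumes "to" and "cc" are lists of URLs and that the author's followers
# collection is "<actor>/followers" (an illustrative assumption).

PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def visibility(note: dict) -> str:
    """Return a rough visibility label for an ActivityPub Note."""
    to = set(note.get("to", []))
    cc = set(note.get("cc", []))
    followers = f"{note.get('attributedTo', '')}/followers"

    if PUBLIC in to:
        return "public"
    if PUBLIC in cc:
        return "unlisted"
    if followers in to or followers in cc:
        return "followers-only"
    # Only specific actors are addressed: a "direct" message.
    return "direct"


# Hypothetical accounts on example domains.
example_dm = {
    "type": "Note",
    "attributedTo": "https://a.example/users/alice",
    "to": ["https://b.example/users/bob"],
    "cc": [],
    "content": "hello",
}

print(visibility(example_dm))  # direct
```

Nothing in this sketch explains how Maven obtained the messages; it only shows that "direct" is an addressing choice rather than encryption, so any server that receives the note can read and store it.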

deadsuperhero@lemmy.ml on 13 Jun 03:32

The shocking part was less about Maven’s methods or lack of ethics, and more along the lines of “How the fuck did they do that?!”

QuadratureSurfer@lemmy.world on 13 Jun 03:47

What @delirious_owl@discuss.online seemed to be implying is that direct messages on Mastodon should be considered “public” rather than “private”.

I’m assuming that’s along the same lines as how Lemmy users generally think that their upvotes/downvotes are private when in reality, if you know how to look for them, you can see them.

minnix@lemux.minnix.dev on 13 Jun 02:04

Even more shocking is the revelation that somehow, even private DMs from Mastodon were mirrored on their public site and searchable. How this is even possible is beyond me, as DM’s are ostensibly only between two parties, and the message itself was sent from two hackers.town users.

I find this hard to believe but stranger things have happened.

delirious_owl@discuss.online on 13 Jun 02:15

Why would you expect anything that you post on social media to be private? I don’t get it.

ShortN0te@lemmy.ml on 13 Jun 17:36

You missed the point. It is not about whether it is private or not, it is about how they use it. You are allowed (on some pages) to read news articles. Are you allowed to copy and publish them on your own site? No. You have a copyright on your posts, the same as an author has on his books.

Whether it is legal or not is still to be discussed.

Similar to how data was mined (or even still is) about users without consent. Now there is for example the GDPR.

delirious_owl@discuss.online on 14 Jun 00:49

We should definitely be copying and pasting authwalled news articles. Do what’s right, not what’s legal.

GBU_28@lemm.ee on 14 Jun 13:10

Still doesn’t explain how public posts on a public, decentralized social media platform are implied to be “mine” or that I have any influence on the end use. It’s hosted on someone else’s computer from the get-go; if anything, the server owners are the content owners more than I am.

Edit: it’d be like if I started seeding a file on a torrent platform, then got upset when someone downloaded it.

unionagainstdhmo@aussie.zone on 14 Jun 13:17

That’s not how copyright works; if it did, GitHub would own every project on their site.

GBU_28@lemm.ee on 14 Jun 16:28

Where is the copyright on Lemmy?

unionagainstdhmo@aussie.zone on 14 Jun 21:55

Any material you create is implicitly copyrighted and owned by you. Comments without licences are equivalent to GitHub repos without licences: you can’t use them.

GBU_28@lemm.ee on 14 Jun 21:58

Implicit is not durable, especially when the servers could be federated all over the world.

ShortN0te@lemmy.ml on 14 Jun 13:31

I write a book that gets published. I still hold copyright over it even if it is on someone else’s bookshelf. What rights the copyright holder and the owner of the copy have is regulated by law. For example, a physical book can be resold or lent to someone else, but it is not allowed to copy it and sell the copies.

I can cite text from the book, which falls under fair use, but I cannot use whole chapters in a derived work.

I still hold copyright over my messages online, even when they are public or published; that is basic copyright law in most relevant jurisdictions. Whether training an LLM on copyright-infringing data and later selling access to that LLM is fair use is yet to be determined.

GBU_28@lemm.ee on 14 Jun 16:27

There is no copyright on publicly posted messages.

Edit: none that is durable.

ShortN0te@lemmy.ml on 14 Jun 17:44

Sure there is. Most messages are probably too short, but in general, yes. There is no difference from an online article.

GBU_28@lemm.ee on 14 Jun 18:11

There is no evidence you have any durable rights to your comments on Lemmy. It’s hosted on someone else’s machine and they have complete control.

hollyberries@programming.dev on 13 Jun 04:55

To be honest, the extreme negative reaction was a surprise to me, as I thought interaction between disparate systems was the entire point, but clearly we didn’t navigate the culture correctly.

Noooo fucking shit? If they spent more than a minute on a proper instance and not milquetoast mastodon dot social, they would have realised that a good number of fedi users despise shenanigans like this?

bartolomeo@suppo.fi on 13 Jun 09:18

And there were people on here saying that licensing your comments CC was stupid…

Railcar8095@lemm.ee on 14 Jun 11:26

Have those not been harvested? Did the users get compensation?

The day that has any effect aside from bloating the thread, I’ll accept they are not stupid.

bartolomeo@suppo.fi on 15 Jun 01:48

Ah yes, the old “laws don’t work so let’s get rid of them” argument.

Railcar8095@lemm.ee on 15 Jun 05:44

Nobody said to get rid of the laws. I’m saying to enforce them. Posting that and not following through is why people think they are stupid.

I’ll root hard for anybody suing them. If they don’t sue because they think winning is too hard, then they are the ones de facto giving up on the law.

neme@lemm.ee on 14 Jun 13:02

Maybe Lemmy should start adding NoAI meta tags to posts and comments like some other websites have started doing for images? Though I doubt it helps that much.

jeena@piefed.jeena.net on 14 Jun 13:53

Normally you can do it in your robots.txt so each instance can choose to do it.
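
As a rough illustration of the robots.txt approach, the sketch below uses Python's standard `urllib.robotparser` to check what a given rule set would tell a crawler. The rules shown are just one possible policy an instance admin might pick, `GPTBot` is used as an example AI crawler user agent, and `lemmy.example` plus the `allowed` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Example policy (an assumption, not any instance's real file):
# refuse one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


post_url = "https://lemmy.example/post/1"  # hypothetical instance
print(allowed("GPTBot", post_url))        # False
print(allowed("SomeOtherBot", post_url))  # True
```

Keep in mind that robots.txt only affects crawlers that choose to honor it, and it would not by itself cover posts that reach another server through federation rather than crawling.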

Danterious@lemmy.dbzer0.com on 16 Jun 02:51

I think it is more important to have a non-commercial tag/license added.

Anti Commercial-AI license (CC BY-NC-SA 4.0)

GBU_28@lemm.ee on 14 Jun 13:06

Offers no protections that are durable.

LiveLM@lemmy.zip on 14 Jun 10:17

A short while ago, Jimmy Secretan posted this response on everything that happened today:
“We have paused everything related to our Fediverse ingestion for now and we are removing everything ingested.
To be honest, the extreme negative reaction was a surprise to me, as I thought interaction between disparate systems was the entire point, but clearly we didn’t navigate the culture correctly.”

Make the world a better place, bully your local tech bro today!

Railcar8095@lemm.ee on 14 Jun 11:29

Please, if there’s a god, don’t let ChatGPT learn from hexbears.

I don’t want it explaining why actually invading Ukraine is good for Ukraine when I ask for a smoothie recipe.

ReakDuck@lemmy.ml on 14 Jun 13:31

But… it’s good for the Ukraine you know /s

feoh@lemmy.ml on 14 Jun 14:18

I looked at their site and thought: What a #!@$ stupid idea.

The whole thing stinks of Twitter brain. “Follow topics, not people”? So what you’re saying is that the null brains on Twitter are far too focused on whenever one of the Kardashians farts to focus on anything real?

Puhh-lease. The Fediverse isn’t about that at all.

Hard pass.