Elsevier
from fossilesque@mander.xyz to science_memes@mander.xyz on 20 Jun 11:31
https://mander.xyz/post/14370720

#science_memes


Diplomjodler3@lemmy.world on 20 Jun 11:46 next collapse

Just print it to a PDF printer.

unexposedhazard@discuss.tchncs.de on 20 Jun 12:06 next collapse

This feels like it should be a browser plugin that automatically anonymizes anything you download.

NeatNit@discuss.tchncs.de on 20 Jun 12:08 collapse

I feel like this will cause quality degradation, like repeatedly re-compressing a jpeg. Relevant xkcd

Edit: though obviously for most use cases it shouldn’t matter

Zorsith@lemmy.blahaj.zone on 20 Jun 12:26 next collapse

I feel like it would be negligible degradation for this purpose. It still might not anonymize whoever shares it, though: the output could carry a watermark of the same kind (en.m.wikipedia.org/…/Machine_Identification_Code) without being noticeable to the naked eye

onion@feddit.de on 20 Jun 12:37 next collapse

You can ask ChatGPT to spit out the latex code

NeatNit@discuss.tchncs.de on 20 Jun 14:09 collapse

What

Diplomjodler3@lemmy.world on 20 Jun 12:54 next collapse

That’s not how PDF works at all.

NeatNit@discuss.tchncs.de on 20 Jun 13:55 collapse

See my reply to another comment

Diplomjodler3@lemmy.world on 20 Jun 14:00 collapse

You’re still wrong. The only place it could cause quality loss is if embedded bitmap images are compressed with lower quality settings (which you can adjust). PDF is a vector format, i.e. a mathematical description of what is to be rendered on screen. It was explicitly designed to be scalable, transmittable and rendered on a wide variety of devices without quality loss.

NeatNit@discuss.tchncs.de on 20 Jun 14:25 collapse

No point discussing this if neither of us is going to prove it one way or the other.

Bitmaps are actually a key part of what I was thinking about, so it seems you agree with me there. There’s also the issue of using the wrong paper size. IIRC Windows usually defaults to Letter for printing, even in places where A4 is the only common size and no one has heard of Letter, and most people don’t realise their prints are cropped/resized. This would still apply when printing to PDF.

Diplomjodler3@lemmy.world on 20 Jun 15:06 collapse

My point is that all these things can be controlled in the settings of your PDF printer driver. So it’s not completely straightforward but definitely doable.

Passerby6497@lemmy.world on 20 Jun 13:29 next collapse

Why would it cause degradation? You’re not recompressing anything, you’re taking the visible content and writing it to a new PDF file.

NeatNit@discuss.tchncs.de on 20 Jun 13:53 collapse

You’re pushing it through one system that converts a PDF file into printer instructions, and then through another system that converts printer instructions into a PDF file. Each step probably has to make adjustments to the data it’s pushing through.

Without looking deeply into the systems involved, I have to assume it’s not a lossless process.

TomSelleck@lemm.ee on 20 Jun 14:07 next collapse

You should maybe look a bit more into it. How do you think commercial printers or even hobbyists maintain fidelity in their images? Most images pass through multiple programs during the printing process and still maintain the quality. It’s not just copy/paste.

NeatNit@discuss.tchncs.de on 20 Jun 14:21 next collapse

They maintain high quality, but it’s not lossless.

As a trivial example, if you use the wrong paper size (like Letter instead of A4) then it might crop parts of the page or add borders or resize everything. Again I’ll admit, in 99% of cases it doesn’t matter, but it might matter if, say, an embedded picture was meant to be exactly to scale.

FellowEnt@sh.itjust.works on 20 Jun 14:42 next collapse

Lossless is the default for print output.

TomSelleck@lemm.ee on 20 Jun 16:24 collapse

My friend, I worked in commercial printing for 2 decades. You’re still making assumptions that are wrong. There are ways to transfer files that are lossless and even ways to improve and upscale artwork. Why do you care so much about this?

NeatNit@discuss.tchncs.de on 20 Jun 16:51 collapse

“There are ways” ≠ this is what happens by default when done by the average user

tacosanonymous@lemm.ee on 20 Jun 15:35 collapse

Magnum PI over here hittin em up with the facts.

4am@lemm.ee on 20 Jun 16:31 collapse

Those printer instructions are called PostScript, and they’re the basis of PDF.

You are thinking that the printing process will rasterize the PDF and then essentially OCR/vector map it back. It’s (usually) not that complicated.

Diplomjodler3@lemmy.world on 20 Jun 18:05 collapse

Unless of course you print everything and then scan it again, like this guy probably does.

Turun@feddit.de on 20 Jun 21:14 collapse

I don’t understand the “that’s not how PDFs work” criticism.

Removing data from the original file is the whole point of the exercise! Of course unique tokens can be hidden in plain sight in images, letter spacing, etc. If we want to make sure to remove that we need to degrade the quality of the PDF so that this information is lost in said lossy conversion.

maegul@lemmy.ml on 20 Jun 11:58 next collapse

Yea, academics need to just shut the publication system down. The more they keep pandering to it the more they look like fools.

mayo_cider@hexbear.net on 20 Jun 12:14 next collapse

I feel like most of the academia in the research side would be happy to see it collapse, but the current system is too deeply tied in the money for any quick change

I worked in academia for almost a decade and never met a researcher who wouldn’t openly support sci-hub (well, some warned their students that it was illegal to type these specific search terms and click on the wrong link downloading the pdf for free)

mayo_cider@hexbear.net on 20 Jun 12:19 next collapse

One lecturer actually had notes on their slides for the differences between the latest version of the course book and the one before it, since the latest one wasn’t available for free anywhere but they wanted to use a couple of chapters from the new book (they scanned and distributed the relevant parts themself)

TankieTanuki@hexbear.net on 20 Jun 18:30 next collapse

So you’re saying the problem is capitalism… <img alt="thinkin-lenin" src="https://www.hexbear.net/pictrs/image/cc7e95a8-2de0-4fcb-823c-db5b7e923093.png">

maegul@lemmy.ml on 21 Jun 04:00 collapse

Yep. But that is all a part of the problem. If academics can’t organise themselves enough to have some influence over something which is basically owned and run by them already (they write the papers, then review the papers, and then are the ones reading and citing the papers and caring the most about the quality and popularity of the papers) … then they can’t be trusted to ensure the quality of their practice and institutions going forward, especially under the ever increasing encroachment of capitalistic forces.

Modern day academics are damn well lucky that they inherited a system and culture that developed some old aristocratic ideals into a set of conventions and practices!

mayo_cider@hexbear.net on 21 Jun 17:56 collapse

Tbh they already do everything they can, if you ever need a paper, e-mail the author and they’ll most likely send you the “last version” before publication they still hold the rights to distribute

bolexforsoup@lemmy.blahaj.zone on 20 Jun 12:59 next collapse

It’s a chicken-and-egg or “you first” problem.

You spend years on your work. You probably have loans. Your income is pitiful. And this is the structural thing that gets your name out. Now someone says “hey take a risk, don’t do it and break the system.”

Well…you first 🤷‍♂️ they publish on this garbage because it’s the only way to move up, and these garbage systems continue on because everyone has to participate. Hate the game. Don’t blame those who are by and large forced to participate.

It would require a lot of effort from people with clout. It’s a big fight to pick. I am very much in favor of picking that fight, but we need to be a little sympathetic to what that entails.

Rolando@lemmy.world on 20 Jun 13:25 next collapse

There are a couple things we can do:

  • decline to review for the big journals. why give them free labor? Do academic service in other ways.
  • if you’re organizing a workshop or conference, put the papers online for free. If you’re just participating and not organizing, then suggest they put the papers online for free. Here’s an example: aclanthology.org. If that’s too time-consuming, use arxiv.org.

RootBeerGuy@discuss.tchncs.de on 20 Jun 13:29 next collapse

Fully agree, but regarding point 1 I can tell you there are enough gullible scientists in the world who see nothing wrong with the current system.

They will gladly pick up free review work when Nature comes knocking, since it’s “such an honour” for such a reputable paper.

Feathercrown@lemmy.world on 20 Jun 13:46 collapse

Such a reputable paper that’s no doubt accepted dozens of ChatGPT papers by now. Wow, how prestigious!

xantoxis@lemmy.world on 20 Jun 14:43 collapse

Something else we can do: regulate. Like every other corrupt industry in the history of the world, we need the force of law to fix it–and for pretty much all the same reasons. People worked at Triangle Shirtwaist because they had to, not because they thought it was a great place to work.

bolexforsoup@lemmy.blahaj.zone on 20 Jun 16:27 collapse

Totally agree

iAvicenna@lemmy.world on 20 Jun 14:22 next collapse

more like the only way to float, not just move up. good luck getting grants without papers in these scum of the Earth publishers

bolexforsoup@lemmy.blahaj.zone on 20 Jun 15:18 collapse

Too true

angrymouse@lemmy.world on 20 Jun 15:13 next collapse

100%. People need to stop thinking big changes can be made “by individuals”. This kind of stuff needs regulation and state alternatives won by popular pressure; otherwise it’s impossible to break as an average worker dealing with the private sector.

bolexforsoup@lemmy.blahaj.zone on 20 Jun 15:16 next collapse

Exactly. Asking some grad student to take on these ancient, corrupt publishing systems at the expense of their career and livelihood is ridiculous

skillissuer@discuss.tchncs.de on 20 Jun 21:33 collapse

applied for a grant last month, now to finalize grant you need to publish things in open access format. (EU country; there’s a push for all publicly funded research to be open access, with it being a requirement from year ??? on, not sure when, but soon) there’s some special funding set aside just for open access fees, which is still rotten because these leeches still stand to profit. then, if you miss that, then there’s an agreement where my uni pays a selection of publishers to let in certain number of articles per year open access, which is basically the same thing but with different source of funding (not from grant, but straight from ministry)

qjkxbmwvz@startrek.website on 20 Jun 15:24 next collapse

Funding agencies have huge power here; demanding that research be published in OA journals is perhaps a good start (with limits on $ spent publishing, perhaps).

blindsight@beehaw.org on 20 Jun 18:25 next collapse

This is probably the avenue to shut this down. If funding is contingent on making the publication freely available to download, and that comes from a major government funding source, then this whole scam could die essentially overnight.

That would need to somehow get enough political support to pass muster in the first place and pass the inevitable legal challenge that follows, too. So, really, this is just another example of regulatory capture ruining everything.

skillissuer@discuss.tchncs.de on 20 Jun 21:36 collapse

i hear you, but this leaves this massive gaping hole very quickly filled by predatory journals

the better solution would be journals created and maintained by universities or other institutions with national (or international, like from EU) funding

maegul@lemmy.ml on 21 Jun 04:04 collapse

I’m sympathetic, but to a limit.

There are a lot of academics out there with a good amount of clout and who are relatively safe. I don’t think I’ve heard of anything remotely worthy on these topics from any researcher with clout, publicly at least. Even privately (I used to be in academia), my feeling was most don’t even know how to think and talk about it, in large part because I don’t think they think and talk about it at all.

And that’s because most academics are frankly shit at thinking and engaging on collective and systematic issues. Many just do not want to, and instead want to embrace the whole “I live and work in an ivory tower disconnected from society because what I do is bigger than society”. Many get their dopamine kicks from the publication system and don’t think about how that’s not a good thing. Seriously, they don’t deserve as much sympathy as you might think … academia can be a surprisingly childish place. That the publication system came to be at all is proof of that frankly, where they were all duped by someone feeding them ego-dopamine hits. It’s honestly kinda sad.

bolexforsoup@lemmy.blahaj.zone on 21 Jun 12:33 collapse

I’m sympathetic but to a limit

That’s all I’m saying 🤷‍♂️

Ragdoll_X@lemmy.world on 20 Jun 14:31 next collapse

As someone who’s not too familiar with the bureaucracy of academia I have to ask: Can’t the authors just upload all their studies to ResearchGate or some other website if they want? I know that they often share it privately with others when they request a paper, so can they post it publicly too?

veganpizza69@lemmy.world on 20 Jun 19:32 next collapse

Publishing comes with IP laws and copyright. For example, open access articles should be easy to upload without concern. “Private” articles being republished somewhere without license is “piracy”, and ResearchGate did get in trouble for it. It’s complicated. www.chemistryworld.com/news/…/4018095.article

Pre-prints are a different story.

nintendiator@feddit.cl on 21 Jun 01:11 collapse

That can easily be fixed at the source: as the author of the paper, you can just license it to be open if you want.

skillissuer@discuss.tchncs.de on 20 Jun 21:40 next collapse

you’re risking copyright nastygrams, but people still do it, and even upload preprints and full articles to scihub, because fuck that and it’s maybe free citations

maegul@lemmy.ml on 21 Jun 04:07 collapse

The problems are wider than that. Besides, relying on “individuals just doing the right thing and going a little further to do so” is, IMO, a trap. Fix the system instead. The little thing everyone can do is think about the system and realise it needs fixing.

ID411@lemmy.dbzer0.com on 20 Jun 16:32 collapse

I imagine there must be a payoff for them? Wider distribution?

porous_grey_matter@lemmy.ml on 20 Jun 17:47 collapse

Nope, you just can’t get a job unless you suck it up and publish in these journals, because they’re already famous. And established profs use their cosy relationships with editors to gatekeep and stifle competition for their funding :(

NeatNit@discuss.tchncs.de on 20 Jun 12:05 next collapse

I kind of assume this with any digital media. Games, music, ebooks, stock videos, whatever - embedding a tiny unique ID is very easy and can allow publishers to track down leakers/pirates.

Honestly, even though as a consumer I don’t like it, I don’t mind it that much. Doesn’t seem right to take the extreme position of “publishers should not be allowed to have ANY way of finding out who is leaking things”. There needs to be a balance.

Online phone-home DRM is a huge fuck no, but a benign little piece of metadata that doesn’t interact with anything and can’t be used to spy on me? Whatever, I can accept it.

cron@feddit.de on 20 Jun 12:17 next collapse

Definitely better than some of the DRM-riddled proprietary eBook formats.

aberrate_junior_beatnik@midwest.social on 20 Jun 12:18 next collapse

Plus, if you have two people with legit access, you can pretty easily figure out what’s going on and defeat it.

blindsight@beehaw.org on 20 Jun 18:27 collapse

It would be pretty trivial for a script to automatically detect and delete tags like this, I would think. Diff two versions of the file and swap all diff characters to any non-display character.
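
A minimal sketch of that diff idea, assuming you’ve already extracted the text of two copies of the same document (the invisible Unicode characters below are hypothetical watermark carriers, just for illustration):

```python
import difflib

def find_watermark_sites(copy_a: str, copy_b: str):
    """Compare the extracted text of two copies of the 'same' paper and
    return the character spans where they differ -- candidate watermark
    sites that a scrubber could blank out or normalize."""
    sites = []
    matcher = difflib.SequenceMatcher(a=copy_a, b=copy_b)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            sites.append((i1, i2, copy_a[i1:i2], copy_b[j1:j2]))
    return sites

# Two downloads that differ only in one invisible character:
a = "Results were significant\u200b at p < 0.05."  # zero-width space
b = "Results were significant\u2060 at p < 0.05."  # word joiner
for start, end, seen_a, seen_b in find_watermark_sites(a, b):
    print(start, end, repr(seen_a), repr(seen_b))
```

Of course this only finds per-copy differences; it can’t catch a mark that is identical in both copies.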

henfredemars@infosec.pub on 20 Jun 12:22 next collapse

I object because my public funds were used to pay for most of these papers. Publishers shouldn’t behave as if they own it.

NeatNit@discuss.tchncs.de on 20 Jun 12:29 collapse

That’s true. I was actually thinking/talking about this practice in general, not specifically with regards to Elsevier.

I definitely agree that scientific journals as they are today are unacceptable.

plinky@hexbear.net on 20 Jun 12:37 next collapse

It can be used to spy on any decent scientist who sends papers that their institution has access to but their friend doesn’t. Much fun. As a reminder: publishers don’t pay reviewers, don’t pay for additional research, editing is typically minimal, and research is funded publicly, so what they own is the social capital of owning a big journal

NeatNit@discuss.tchncs.de on 20 Jun 14:07 collapse

It can be used to spy on any decent scientist who will send papers his/hers/theirs institution has access to, but their friend doesn’t.

By “spy” I mean things like: know how many times I’ve read the PDF, when I’ve opened it, which parts of it I’ve read most, what program I used to open the PDF, how many copies of the PDF I’ve made, how many people I’ve emailed it to, etc. etc. etc.

This technique can do none of that. The only thing it can do is this: if someone uploads the PDF to a mass sharing network, and an employee of the publisher downloads it from there and compares the metadata with their internal database, they can see which of their users originally downloaded it, and when. It tells them nothing about how it got there. Maybe the original user shared it with 20 of their colleagues (a legitimate use of a downloaded PDF), and one of those colleagues uploaded the file to the sharing site without telling the original downloader. It doesn’t prove it one way or the other. It’s an extremely small amount of information, only useful for catching systematic uploaders, e.g. a single user who has uploaded hundreds or thousands of PDFs downloaded from the publisher using the same account.

And a savvy user can always strip that metadata out.

As a reminder, …

All true, and fucked up, but it’s not related to what I was talking about. I was talking about the general use of this technique.

grue@lemmy.world on 20 Jun 13:35 next collapse

Doesn’t seem right to take the extreme position of “publishers should not be allowed to have ANY way of finding out who is leaking things”. There needs to be a balance.

Nah, fuck that; that’s both the opposite of an extreme position and is exactly the one we should take!

Copyright itself is a privilege and only exists in the first place “to promote the progress of science and the useful arts.” Any entity that doesn’t respect that purpose doesn’t deserve to benefit from it at all.

NeatNit@discuss.tchncs.de on 20 Jun 14:17 collapse

You are arguing that Elsevier shouldn’t exist at all, or needs to be forcibly changed into something more fair and more free. I 100% agree with this.

But my point was general, not about Elsevier but about all digital publications of any kind. This includes indie publications and indie games. If an indie developer makes a game, and it sells maybe 20 copies but gets pirated thousands of times, do you still say “fuck that” to figuring out which “customer” shared the game?

I agree with “fuck that” to huge publishers, and by all means pirate all their shit, but smaller guys need some way to safeguard themselves, and there’s no way to decide that small guys can use a certain tool and big guys cannot.

HexBroke@hexbear.net on 20 Jun 14:38 next collapse

Imagine thinking any of the half dozen industry specific publishers have a right to exist in the 2020s

Black_Mald_Futures@hexbear.net on 20 Jun 15:12 next collapse

Doesn’t seem right to take the extreme position of “publishers should not be allowed to have ANY way of finding out who is leaking things”.

That’s a fun opinion but have you considered that property is theft and intellectual property is bullshit

NeatNit@discuss.tchncs.de on 20 Jun 16:47 collapse

Without IP your favourite books, movies, TV shows, music and video games would not exist.

booty@hexbear.net on 20 Jun 16:50 next collapse

<img alt="jagoff" src="https://hexbear.net/pictrs/image/51ec5471-2e83-41c7-8d86-48618d45c836.png">

the artists still exist and would continue to make art even if we abolished the systems of exploitation we apply to that art.

frankly, art would instantly become far better without capitalism weighing it down

Sotuanduso@lemm.ee on 20 Jun 17:13 collapse

For your average art, I can see that. But movies and TV shows take a lot more than just someone with a passion. You’d need some system to decide whose movie idea is worth pursuing, and you’d need a robust mechanism to get them a team to make it with. Capitalism has a lot of flaws, yes, but at least if you write a role for a specific actor, you can pay them to do it instead of just hoping they’ll like it enough to sign on.

And yeah, we can have those systems under communism, but they don’t come automatically, so it’s not going to be instantly far better.

newerAccountWhoDis@hexbear.net on 21 Jun 10:12 collapse

It’s not the artists, creators, researchers etc. who profit off IP laws. It’s always the capitalists

OrganicMustard@lemmy.world on 20 Jun 16:15 collapse

Enlightened centrist

Rayspekt@lemmy.world on 20 Jun 12:09 next collapse

When will scientists just self-publish? I mean seriously, nowadays there is nothing between a researcher and publishing their stuff on the web. Only thing would be peer-reviewing, if you want that, but then just organize it without Elsevier. Reviewers get paid jack shit so you can just do a peer-reviewing fediverse instance where only the mods know the people so it’s still double-blind.

This system is just to dangle carrots in front of young researchers chasing their PhD

Kyle_The_G@lemmy.world on 20 Jun 12:20 next collapse

Because of “impact score”, the journal your work gets placed in has a huge impact on future funding. It’s a very frustrating process, and trying to go around it is like suicide for your lab, so it has to be more of a top-down fix, because bottom-up is never going to happen.

That’s why everyone uses sci-hub. These publishers are terrible companies, up there with EA in unpopularity.

WhatAmLemmy@lemmy.world on 20 Jun 12:33 next collapse

It sounds like all it would take to destroy the predatory for-profit publication oligarchs is a majority of the top few hundred scientists, across major disciplines, rejecting it and switching to a completely decentralized peer-2-peer open-source system in protest… The publication companies seem to gatekeep and provide no value. It’s like Reddit: the site’s essentially worthless; all of the value is generated by the content creators.

kwomp2@sh.itjust.works on 20 Jun 12:41 next collapse

Successfully initiating this from the fediverse would be such a massive boost in public visibility and discursive strength for the project of collectivizing information infrastructure (like lemmy).

Imagine we fluffin freed science from capital and basically all the scientists openly stated how useful this was

kwomp2@sh.itjust.works on 20 Jun 12:43 next collapse

(What I’m trying to say is you have my bow)

Rayspekt@lemmy.world on 20 Jun 12:57 next collapse

I can only get so erect, please stop.

anothercatgirl@lemmy.blahaj.zone on 20 Jun 13:49 next collapse

oh so this is the kind of stuff that turns on asexual people?

Tlaloc_Temporal@lemmy.ca on 20 Jun 19:55 collapse

That would make them scisexual or politisociosexual I guess.

kwomp2@sh.itjust.works on 20 Jun 15:40 collapse

Thank you, this justifies introducing myself as a campaign porn producer from now on

essteeyou@lemmy.world on 20 Jun 15:42 collapse

So, shall we do it?

Kyle_The_G@lemmy.world on 20 Jun 12:52 next collapse

Ya that would be awesome and I think that movement would gain momentum really fast since most high profile labs have all had to deal with this nonsense.

That or legislation/open access rules to make these papers more accessible. One can dream.

Rolando@lemmy.world on 20 Jun 13:31 collapse

most high profile labs have all had to deal with this nonsense.

It’s even worse for low profile labs because those publication fees eat up a greater proportion of our budget.

porous_grey_matter@lemmy.ml on 20 Jun 17:51 next collapse

Those few top people are assholes who love the enormous power they wield over PhD students, postdocs and junior faculty, and they are usually editors on those big name journals. Unlike the people who actually do the work, they are getting paid from this system.

skillissuer@discuss.tchncs.de on 20 Jun 21:45 collapse

the thing they’re supposed to provide is peer review; solve that and we’re good to go. it would be easier with some kind of central oversight and stable funding, since we’re not talking about a shitposting instance for 250 people that nobody will notice if it goes down

Rayspekt@lemmy.world on 20 Jun 12:52 next collapse

I know about impact factor, but still, this system is shit and only works because people contribute to it.

CareHare@sh.itjust.works on 20 Jun 20:43 collapse

Even Nature publishes shit articles now and then. Impact score is becoming more and more of a joke.

galoisghost@aussie.zone on 20 Jun 12:42 next collapse

I agree but if it was that easy it would have been done already and there would already be another evil gatekeeper to hate.

half@lemy.lol on 20 Jun 13:02 next collapse

We (I’m a CS researcher) already kind of do, I upload almost everything to arxiv.org and researchgate. Some fields support this more than others, though.

BearOfaTime@lemm.ee on 20 Jun 13:15 next collapse

As if peer review weren’t a massive fucking joke.

Rayspekt@lemmy.world on 20 Jun 13:32 collapse

We should just self-publish and then openly argue about the findings like the OG scientists did. It didn’t stop them from discovering anything.

roguetrick@lemmy.world on 20 Jun 14:00 next collapse

Bone Wars 2: Electric Boogaloo. In the end you really do need a way to discern who is having an appreciable impact in a field in order to know who to fund. I have yet to hear a meaningful metric for that, though.

Edit: I should clarify, the other option is strictly political through an academy of sciences and has historical awfulness associated with it as well.

veganpizza69@lemmy.world on 20 Jun 19:45 collapse

Editors can act as filters, which is required when dealing with an excess of information streaming in. Just like you follow celebrities on social media or you follow pseudo-forums like this one, you get a service of information filtration which increases the concentration of useful knowledge.

In the early days of modern science the rate of publication was small, making it easier to “digest” entire fields even with self-publishing. The number of published papers grows exponentially, as does the number of journals. www.researchgate.net/publication/…/figures

Just like with these forums, the need for moderators (editors, reviewers) grows with the number of users who add content.

macarthur_park@lemmy.world on 20 Jun 13:24 next collapse

When will scientists just self-publish?

It’s commonplace in my field (nuclear physics) to share the preprint version of your article, typically on arxiv.org. You can update the article as you respond to peer reviewers, too. The only difference between this and the paywalled publisher version is the journal’s additional formatting edits.

If you search for articles on Google Scholar, it groups the preprint and published versions together, so it’s easy to find the non-paywalled copy. The standard journals I publish in even sort of encourage this; you can submit the LaTeX documents and figures by just giving the URL of an arxiv manuscript.

The US Department of Energy now requires any research they fund be made publicly available. So any article I publish is also automatically posted to osti.gov 1 year after its initial publication. This version is also grouped into the google scholar search results.

It’s an imperfect system, but it’s getting much better than it was even just a decade ago.

Rayspekt@lemmy.world on 20 Jun 13:29 collapse

Yeah I know about this, but personally in our field I don’t see anybody bothering with preprints sadly. Maybe we should though, sounds like the first step.

nossaquesapao@lemmy.eco.br on 20 Jun 19:34 collapse

What’s the problem with peer-reviewed open access journals maintained by universities?

starchylemming@lemmy.world on 20 Jun 13:10 next collapse

is there hassle-free software that simulates low quality printing and rescanning with text recognition?

4am@lemm.ee on 20 Jun 16:37 collapse

Print to PDF might just convert the PDF into Postscript instructions and back again without the original PDF’s metadata, but that probably depends on the Print to PDF software being used and its settings.

Jocker@sh.itjust.works on 20 Jun 13:24 next collapse

If we build a decentralized system for paper publishing, like lemmy based on activitypub… will it work?

Allero@lemmy.today on 20 Jun 14:33 collapse

Probably won’t take off, because scientists need reputable journals, not some random fediverse publishers.

Is it fucked up? Absolutely. But something else needs to be changed before this would be possible.

Also, why not ditch the concept of a “publisher” to begin with? Why not have a national or international article index, graded by article level? It’s not like we live in a paper era, and for those who still need it, we can always print.

Jocker@sh.itjust.works on 20 Jun 15:26 next collapse

Exactly, a decentralized platform would only make an index and universities or institutions can maintain their own instances

Allero@lemmy.today on 20 Jun 15:39 collapse

This I generally approve of, if availability is good enough

MeowZedong@lemmygrad.ml on 20 Jun 15:42 next collapse

Institutions could easily form their own journals. National organizations that provide grants could also require you to publish in their journal. Universities can run their own journals. These sorts of entities already exist and provide article access for free, publishing in them would just need to be normalized.

These are just a few options without researchers organizing anything for themselves.

Allero@lemmy.today on 20 Jun 21:50 collapse

Fair enough, though why should journals even be a thing? Why not just have universities publish papers online as soon as they are accepted?

Currently we already have this thing with some journals publishing online the articles that are meant for future issues, which fucks up citations quite a bit. Why not just ditch the entire “journal” format altogether?

MeowZedong@lemmygrad.ml on 21 Jun 01:52 collapse

I was kind of thinking of that with the institutional journal bit. It doesn’t need to be a traditional journal; the only things important to me are:

  1. peer review (skip #2)

  2. open access

  3. professional editors to help improve phrasing, spelling, flow, etc.

  4. DOI link or similar unique identifier

I’m totally down to ditch the traditional journal format otherwise. It was just a quick comment not meant to go in-depth, but point out that we already have public institutions that can host publications.

Allero@lemmy.today on 21 Jun 07:20 collapse

Ah, this I can agree with :)

philpo@feddit.de on 20 Jun 15:53 collapse

Well, we could assign the reviewers more “significance” here. We could give them points and if they “upvote” a paper it gives the paper a bit more visibility/reputation. If the reviewer has actually reviewed the paper it gives the paper more points.

How much a reviewer is able to “spend” could be based on the reputation of the institution, their own papers in the same field and the points they get for their reviews by other users.

Just a raw idea, but it seems possible, indeed.

Allero@lemmy.today on 20 Jun 21:52 collapse

Interesting concept for an open collaboration!

It should also address misuse of the points: a big-name researcher who doesn’t care to peer review handing their power to someone else, hacked accounts spending points, or whatever other threats there may be

Passerby6497@lemmy.world on 20 Jun 13:27 next collapse

That’s where you print the downloaded PDF to a new PDF. New hash, same content; good luck tracing it back to me, fucko.

Syn_Attck@lemmy.today on 20 Jun 14:35 next collapse

Unfortunately that wouldn’t work, as this information is embedded inside the PDF itself, so it has nothing to do with the file hash (although the hash is one way to track).

Now that this is known, it’s not enough to remove metadata from the PDF itself. Each image inside a PDF, for example, can carry its own metadata. I mention this because a game of whack-a-mole is apparently starting, and it won’t stop here.

There are multiple ways of removing ALL metadata from a PDF, here are most of them.

It will be slow-ish and probably make the file larger, but if you’re sharing a PDF that only you are supposed to have access to, it’s worth it. MAT or exiftool should work.
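To make that whack-a-mole concrete, here is a naive Python sketch of one place metadata hides. The key names are standard PDF Info-dictionary fields, but the regex approach is a toy assumption: it only works on uncompressed objects and ignores XMP and per-image metadata, which is exactly why dedicated tools like exiftool or MAT exist.

```python
import re

# Naive illustration, not a replacement for exiftool/MAT: blank
# out common Info-dictionary fields in an uncompressed PDF. Real
# PDFs usually compress their streams and may also carry XMP and
# per-image metadata, so treat this purely as a demo.
INFO_KEYS = (b"Author", b"Producer", b"Creator", b"Title", b"CreationDate")

def strip_info_fields(pdf_bytes: bytes) -> bytes:
    for key in INFO_KEYS:
        # Matches e.g. b"/Author (Alice)" and empties the value.
        pdf_bytes = re.sub(rb"/" + key + rb"\s*\([^)]*\)",
                           b"/" + key + b" ()", pdf_bytes)
    return pdf_bytes
```

Even when this works, it says nothing about watermarks hidden in the page content itself, which is the harder half of the problem.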

Edit: as spoken about in another comment thread here, there is also pdf/image steganography as a technique they can use.

Passerby6497@lemmy.world on 20 Jun 15:42 next collapse

Wouldn’t printing the PDF to a new PDF inherently strip the metadata put there by the publisher?

sandbox@lemmy.world on 20 Jun 15:44 next collapse

it’s possible, using steganographic techniques, to embed digital watermarks that would not be stripped by simply printing to PDF.

Syn_Attck@lemmy.today on 20 Jun 16:38 next collapse

This is a great point. Image watermarking steganography is nearly impossible to defeat unless you can obtain multiple copies of the ‘same’ file from different users and look for differences. It could be a change of as few as 5-15 pixels, each shifted by one RGB step, e.g. from

rgb(255, 251, 0)

to

rgb(255, 252, 0)

That change would be imperceptible to the human eye. Depending on the number of users, it may need to change more or fewer pixels.

There is a ton of work in this field and it’s very interesting, for anyone considering majoring in computer science / information security.

Another ‘neat’ technology everyone should know about is machine identification codes: the tiny secret tracking dots that color printers print on every page to identify the specific make, model, and serial number (I think?) of the printer the page was printed from. I don’t believe B&W printers have tracking dots, which were originally used to track creators of counterfeit currency. EFF has a page listing color printers that do not include tracking dots on printed pages, covering color LaserJets along with InkJets, although I would not be surprised if a similar tracking feature exists now or appears in the future “for safety and privacy reasons”; none that I am aware of, though.

sus@programming.dev on 20 Jun 21:57 collapse

I wonder if it’s common for those steganography techniques to have some mechanism for defeating the fairly simple strategy of getting 2 copies of the file from different sources and comparing them to expose all the watermarks.

(I’d think you would need watermark sections that are identical across any 2 (or n) copies of the data, which may be pretty easy to do in many cases, though those shared sections make the general watermarking strategy much easier for the un-watermarkers to detect)
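The two-copies attack is easy to sketch. Below is a minimal, hypothetical Python illustration (toy data, not a real image decoder): two watermarked copies of the ‘same’ image are modeled as flat lists of RGB tuples, and any position where they differ is a candidate watermark pixel. Shared watermark bits, as noted above, would not show up this way.

```python
# Sketch of the two-copies diffing attack: each copy is a flat
# list of (R, G, B) tuples; positions that differ between copies
# are candidate per-user watermark pixels.
def find_watermark_pixels(copy_a, copy_b):
    return [i for i, (p, q) in enumerate(zip(copy_a, copy_b)) if p != q]

copy_a = [(255, 251, 0), (10, 10, 10), (0, 0, 0), (80, 80, 80)]
copy_b = [(255, 252, 0), (10, 10, 10), (0, 0, 0), (80, 81, 80)]
suspects = find_watermark_pixels(copy_a, copy_b)  # pixels 0 and 3 differ
```

Real schemes spread the payload redundantly across many pixels, so un-watermarking tools typically need several copies rather than just two.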

FinalRemix@lemmy.world on 20 Jun 16:39 next collapse

Got it. Print to a low quality JPG, then use AI upscaling to restore the text and graphs.

Syn_Attck@lemmy.today on 20 Jun 16:54 collapse

You should spread that idea around more, it’s pretty ingenious. I’d add first converting to B&W if possible.

Thann@lemmy.ml on 20 Jun 17:09 collapse

Which is why you steghide random data into the image to fuck up the other end =]

Syn_Attck@lemmy.today on 20 Jun 17:27 collapse

Unless you know specifically what they’re adding or changing, this wouldn’t work. If they have a hidden ‘barcode’ and you add another hidden ‘barcode’ on top, or modify the image in a way that only removes part of theirs, they’d still be able to read theirs.

Thann@lemmy.ml on 20 Jun 17:47 collapse

yeah, you’d have to sample other downloads to collect statistics and un-steghide theirs to make sure your fuzzing actually worked

Syn_Attck@lemmy.today on 20 Jun 16:32 collapse

Good question. I believe the browser’s “Print to PDF” function simply saves the loaded PDF to a local file, so it wouldn’t work (if I’m correct).

I’m not an expert in this field, but you can ask on StackExchange or the authors of MAT or exiftool. You can also test it yourself (I’ll explain how): make a PDF containing a JPG with known metadata, open it and print to PDF, then extract the image. Do let us know your findings! I’m on a smartphone so I can’t do it myself.

If you do try it yourself, a note from the linked SE page: you won’t be able to recover the original file extension (it’s unknown, so you either have to know what it was, look at the file headers, or try all extensions). So if you use your own .jpg with your own EXIF data, rename the extracted file to .jpg when finished (I believe EXIF is handled differently depending on file type).

There are multiple tools to add exif data to an image but the exiftool website has some easy examples for our purpose.

(do this as the first step before adding to the PDF)

(command line here, but there are exiftool GUIs)

exiftool -artist="Phil Harvey" -copyright="2011 Phil Harvey" YourFile.jpg

Adds Phil Harvey as the artist and the copyright information to the file. If you’re on a smartphone and have the time and really have to know, there should hypothetically be web-based tools for every step needed. I’m just not familiar with any, and it’s possible a web-based tool would strip the metadata itself when creating or extracting the PDF.

Zacryon@lemmy.wtf on 21 Jun 08:43 collapse

Okay, got it. Print the PDF, then scan it and save as PDF.

Or get some monks to get a handwritten copy, like the good old times.

Olgratin_Magmatoe@lemmy.world on 20 Jun 17:32 next collapse

You’d be safer IRL printing it on a printer without yellow ink, then scanning it, then deleting the metadata from the scan.

ChaoticNeutralCzech@feddit.de on 20 Jun 20:24 next collapse

I know PDF providers who visibly print the customer’s name or number in the header of every page, along with short copyright text. I use qpdf --stream-decompress to turn the PDF’s content streams into human-readable, PostScript-like text, and then Python+regex to remove each header text, which stands out a bit from the other PDF elements. The script throws an error if more or fewer elements than pages have been removed, but that hasn’t happened yet. Processed documents sometimes have screwed-up non-ASCII characters in the Table of Contents for some reason, but I don’t have the originals anymore so IDK if it’s my fault. Still, I wouldn’t share the PDFs except in text-only or printed form, because of possible other steganographic shenanigans in the file. I would absolutely torrent them if I could repurchase them under a new identity and verify that the files are identical.
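A minimal sketch of that regex step (the operator layout and watermark string are assumptions; real streams vary): after `qpdf --stream-decompress`, page text is drawn with operators like `(some text) Tj`, so you can drop any text-show operator containing the known header string and count removals to sanity-check the one-per-page expectation.

```python
import re

# Remove every `(... ) Tj` text-show operator whose string
# contains the known watermark text, and return how many were
# removed so the caller can verify it matches the page count.
def remove_header_text(stream: str, watermark: str):
    pattern = re.compile(r"\([^)]*" + re.escape(watermark) + r"[^)]*\)\s*Tj")
    cleaned, n = pattern.subn("", stream)
    return cleaned, n
```

This deliberately leaves every other operator alone, so the rest of the page renders unchanged.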

BTW, has anyone figured out how to embed Python code in a PDF? The whitespace always gets re-encoded as x-coordinates, so copy&pasting it never preserves indentation. No, you can’t use the Ogham Space Mark (Unicode’s only non-blank character classified as a space) for indentation in Python; I tried.

IlIllIIIllIlIlIIlI@lemmy.world on 20 Jun 21:09 collapse

I saw some that also add background watermarks on random pages and in random locations.

andrew_bidlaw@sh.itjust.works on 20 Jun 16:02 next collapse

If the paper is worth it and has an original (not OCR-ed) text layer, it’d be better exported to just about any other format. We don’t call good things “a PDF file”, lol. It’s clumsy, heavy, has an unadjustable font size and useless empty margins, comes with various limits and takes on DRM, and editing it usually requires paid software. This format should die off.

The only reason academia needs it is strict references to exact pages, but that’s not hard to emulate. The upsides are overwhelming.

I’ve spent my share of time properly digitizing PDFs into e-books and text-processing formats, and it’s a pain in the ass, but if I know it’ll be read by someone other than me, I’m okay with putting a bit more effort into it.

fossilesque@mander.xyz on 20 Jun 16:19 next collapse

github.com/Stirling-Tools/Stirling-PDF

andrew_bidlaw@sh.itjust.works on 20 Jun 16:26 next collapse

Thanks. I’ve used simpler tools (besides pirated Acrobat) and wrote some scripts to streamline de-DRMing and breaking passwords on them. The one you posted looks promising; I’ll save it to toy with in my free time.

fossilesque@mander.xyz on 21 Jun 13:12 collapse

It’s the bee’s knees. Bonus theme for it: draculatheme.com/stirling-pdf

Syn_Attck@lemmy.today on 20 Jun 18:26 collapse

Wow, this is awesome, thanks!

visc@lemmy.world on 20 Jun 17:17 next collapse

What format do you suggest?

andrew_bidlaw@sh.itjust.works on 20 Jun 17:31 next collapse

FB2 is a format well known among Russian pirates, but it can and should be improved because it sucks ass in many ways. FB3 was announced long ago but hasn’t gotten any traction yet.

EPUB is more popular, so it’s probably the go-to format for most books the US and EU produce, but it isn’t much better.

Other than that, even DOC/DOCX is better than PDF, though I’d recommend RTF since it has fewer traces of M$ bullshit; it’s an imperfect format, but still better.

humbletightband@lemmy.dbzer0.com on 20 Jun 17:56 next collapse

Maybe for books. I’ve seen only pdf and PostScript widely used for papers in academia.

Edit: ok my supervisor liked div but he was the only one I knew with this kind of taste

andrew_bidlaw@sh.itjust.works on 20 Jun 18:06 collapse

Div? Can you unpack your thoughts on that, as I haven’t faced it yet?

I only know DJVU or deja vu format that’s usually used for raw scans.

humbletightband@lemmy.dbzer0.com on 20 Jun 19:43 collapse

Djvu is also for books and similar.

I don’t know much about the div format, but I remember mktex producing it as a side effect

Syn_Attck@lemmy.today on 20 Jun 18:28 next collapse

Whatever the format, let’s hope it doesn’t end up having the extension .map

(minor attracted persons aka PDF file joke)

andrew_bidlaw@sh.itjust.works on 20 Jun 18:34 collapse

Get ready for a sweaty techbro to explain why Least Optimized Lossless Image is the superior format.

JasonDJ@lemmy.zip on 20 Jun 23:13 collapse

Only if you use the Self-Hashing Orthogonal Tracing Algorithm, naturally.

sem@lemmy.blahaj.zone on 20 Jun 22:06 next collapse

I don’t like docx because it looks different in LibreOffice compared to Word on Windows; you can also run into problems with fonts

andrew_bidlaw@sh.itjust.works on 21 Jun 03:47 collapse

DOC is a mess across different editions of Word too, especially if you do complex formatting, but it’s the default format for text documents thanks to MS.

visc@lemmy.world on 21 Jun 21:50 collapse

Docx, doc, rtf and all those serve a different purpose than PDF: Word docs don’t even necessarily look the same on two different computers with the same version of Word, and RTF doesn’t even attempt any kind of page description; it’s literally only a rich format for text. None of these is a true “if I give this to someone to print, I know what I will get” portable document format.

I will look at fb*, I had not heard of them. Thanks!

ElderWendigo@sh.itjust.works on 20 Jun 22:55 collapse

Most papers are written in TeX or LaTeX. These formats separate presentation from content in such a way that a document can be quickly reformatted to a variety of page sizes, margins, text sizes, and so on with minimal effort. It’s basically an open standard typesetting system. You can create and edit TeX in any text editor and run it through a program to prepare it for print or viewing. Nothing else handles math formulas, tables, charts, etc. with the same elegance. If you’ve ever struggled to write a math paper in Microsoft Word, seriously question why your professor hasn’t already forced you to learn LaTeX.
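As a tiny example of why people put up with the learning curve: here is the kind of display equation that is painless in plain-text LaTeX but tedious in a WYSIWYG equation editor (an illustrative formula, not from any particular paper).

```latex
\documentclass{article}
\begin{document}
The least-squares estimator, typeset with no equation-editor clicking:
\begin{equation}
  \hat{\beta} = (X^{\top} X)^{-1} X^{\top} y
\end{equation}
\end{document}
```

Because the source is plain text, it also diffs and version-controls cleanly, which matters for collaborative papers.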

petersr@lemmy.world on 20 Jun 19:18 collapse

Well, I guess PDF has one thing going for it (which might not be relevant for scientific papers): the same file will render the same on any platform (assuming the reader implements the full PDF spec to a tee).

Dark_Dragon@lemmy.dbzer0.com on 20 Jun 18:21 next collapse

Couldn’t all of us researchers who are technically good with web servers start an open-source alternative to these paid services? I get that we need to publish with a renowned publisher, but we could also decide together to publish to an alternative open-source option. That way the open-source alternative also grows.

BeardedGingerWonder@feddit.uk on 20 Jun 18:45 next collapse

Like arxiv.org?

Dark_Dragon@lemmy.dbzer0.com on 20 Jun 18:49 collapse

Does it have all the new research papers on medicine, pharmacological actions, newer drug interactions and stuff?

JackbyDev@programming.dev on 20 Jun 23:36 collapse

That’s not what was asked for though lol

No_Change_Just_Money@feddit.de on 20 Jun 20:37 next collapse

I mean, a paper is renowned if many people cite it

We could just try citing more free papers whenever possible (as long as they still have peer review)

barsoap@lemm.ee on 21 Jun 13:20 collapse

Citation count is a shoddy metric for a paper’s quality: not just because there are citation cartels, but because the reason stuff gets cited isn’t captured by the metric. And to top it all off, as soon as a metric becomes a target, it ceases to be a metric.

Sal@mander.xyz on 20 Jun 22:03 next collapse

Some time last year I learned of an example of such a project (peerreview on GitHub):

The goal of this project was to create an open access “Peer Review” platform:


Peer Review is an open access, reputation based scientific publishing system that has the potential to replace the journal system with a single, community run website. It is free to publish, free to access, and the plan is to support it with donations and (eventually, hopefully) institutional support.

It allows academic authors to submit a draft of a paper for review by peers in their field, and then to publish it for public consumption once they are ready. It allows their peers to exercise post-publish quality control of papers by voting them up or down and posting public responses.


I just looked it up now to see how it is going… and I am a bit saddened to find that the developer decided to stop. The author has a blog where he wrote about the project and why he is no longer so optimistic about the prospects of crowdsourced peer review: theroadgoeson.com/crowdsourcing-peer-review-proba… , and related posts referenced therein.

It is only one opinion, but at least it is the opinion of someone who has thought about this for some time and made a real effort towards the goal, so maybe you will find some value in his perspective.

Personally, I am still optimistic about this being possible. But that’s easy for me to say as I have not invested the effort!

fossilesque@mander.xyz on 21 Jun 13:08 next collapse

I do like the intermediaries that have popped up, like PubPeer. I highly recommend that everyone get the extension as it adds context to many different articles.

pubpeer.com

Sal@mander.xyz on 21 Jun 13:24 collapse

That’s really cool, I will use it

fossilesque@mander.xyz on 21 Jun 14:10 collapse

It’s been surprisingly helpful, it even flags linked pages, like on Wikipedia.

barsoap@lemm.ee on 21 Jun 13:18 collapse

This kind of thing needs to be started by universities and/or research institutes. Not the code part, but the organising the first journals part. It’s going to get nowhere without establishment buy-in.

vin@lemmynsfw.com on 21 Jun 10:22 collapse

The challenge is how to jump-start a platform that researchers will actually come to

veganpizza69@lemmy.world on 20 Jun 19:24 next collapse

Purge metadata, convert PDF to rendered graphics (including bitmaps), add OCR layer.

xenoclast@lemmy.world on 20 Jun 23:10 collapse

There are tools for this already… but it sure would be nice to have a Firefox plugin that scrubs all metadata on downloads by default.

(Note I’m hoping this exists and someone will Um, Actually me)

nearhat@lemmy.world on 20 Jun 23:43 next collapse

It’s a multi-step process, but if you still have the XPS Viewer from Windows 10, you can ‘print’ the file to XPS, then open it in the XPS Viewer and ‘print’ it to PDF using your favourite print-to-PDF solution. That strips the metadata but doesn’t rasterize everything.

purplemonkeymad@programming.dev on 22 Jun 12:27 collapse

I feel like you could just print to PDF from your PDF viewer?

nearhat@lemmy.world on 22 Jun 13:26 collapse

I tried that before, but was unsuccessful in clearing out metadata. Whatever options I tried, PDF-to-PDF just output an identical file with a different name.

lastweakness@lemmy.world on 21 Jun 08:38 collapse

You could write a script that watches a folder for new files and strips metadata from each one, I guess. I did something like that for images a while back.
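A rough stdlib-only sketch of that idea (the follow-up exiftool call is an assumption; a production version might use inotify or the watchdog package instead of polling):

```python
import os

# Poll a directory and report files not seen before; a real
# version would then run e.g. `exiftool -all= <file>` on each
# new file to strip its metadata.
def new_files(directory, seen):
    current = set(os.listdir(directory))
    fresh = sorted(current - seen)
    seen.update(current)
    return fresh
```

Call it in a loop with a shared `seen` set, sleeping between polls.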

KillingTimeItself@lemmy.dbzer0.com on 20 Jun 20:59 next collapse

I think this is less of a meme and more of a scientifically dystopian fun fact, but sure.

skillissuer@discuss.tchncs.de on 20 Jun 22:29 collapse

“fun”

KillingTimeItself@lemmy.dbzer0.com on 21 Jun 03:48 collapse

the fact is, in fact, rather fun(ny)

chemicalwonka@discuss.tchncs.de on 20 Jun 21:33 next collapse

Elsevier is the reason I donate to Sci-Hub.

NigelFrobisher@aussie.zone on 20 Jun 22:43 next collapse

The famously uneditable PDF format.

boonhet@lemm.ee on 21 Jun 08:27 collapse

In metadata, no less.

tuna@discuss.tchncs.de on 21 Jun 00:52 collapse

Imagine they have an internal tool to check if the hash exists in their database, something like

"SELECT user FROM downloads WHERE hash = '" + hash + "';"

You set the PDF hash to 1'; DROP TABLE books;-- and when they scan it, it effectively deletes their entire business lmfaoo.
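For anyone who hasn’t met Bobby Tables: a minimal sqlite3 sketch (table and column names are made up) showing how the concatenated query smuggles in a second statement, while a parameterized query treats the whole ‘hash’ as a harmless literal. Whether the injected statement actually runs depends on the victim’s driver allowing multiple statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE downloads (user TEXT, hash TEXT)")
conn.execute("INSERT INTO downloads VALUES ('alice', 'abc123')")

malicious = "1'; DROP TABLE books;--"

# Naive concatenation, as in the comment above: the "hash" closes
# the string literal and injects a second statement.
unsafe_sql = "SELECT user FROM downloads WHERE hash = '" + malicious + "'"

# Parameterized version: the driver quotes the value, so the
# payload never escapes the string literal.
rows = conn.execute(
    "SELECT user FROM downloads WHERE hash = ?", (malicious,)
).fetchall()  # no match, and nothing gets dropped
```

The fix is one character class away: always pass user-controlled values as parameters, never by string concatenation.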

Another idea might be to duplicate the PDF many times and insert bogus metadata into each copy, then submit requests saying you found an illegal distribution of the PDF. If their process isn’t automated, it would waste a lot of their time finding the culprit, lol

I think it’s more interesting to think of how to weaponize their own hash rather than deleting it

thesporkeffect@lemmy.world on 21 Jun 04:22 next collapse

That’s using your ass. This is an active threat to society and it demands active countermeasures.

I’d bet they have a SaaS ‘partner’ who trawls SciHub and other similar sites. I’ll try to remember to check over the next few days whether there’s any hint of how this is being accomplished.

nephs@lemmygrad.ml on 23 Jun 10:47 collapse

Bobby Tables has started his academic career!