Do any of you have M$ Word running in present form?
from j4k3@lemmy.world to linux@lemmy.ml on 25 Aug 2024 17:28
https://lemmy.world/post/19044101

My old man has a bunch of .dox stuff saved. He has large, complicated files that none of the FOSS conversion tools support. I’ve tried LibreOffice, AbiWord, and every command-line tool and converter I can find. These are entire book-sized files.

I have a W10 machine with Word. Is extracting the .exe and running it with wine feasible without making an epic mess or massive project of this?

#linux

just_another_person@lemmy.world on 25 Aug 2024 17:31 next collapse

You can try Pandoc and see if that works, Google Docs, Office365, finding an abandonware version of Word and running on Wine…lots of options to work with.

It might be easier to start narrowing down where you need to look if you get the header info from one of these files.
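
For anyone unsure what "header info" means here, a minimal sketch in Python: it prints the first few bytes of a file as hex plus printable ASCII, which is usually enough to tell what the file really is. The function name `hexdump_head` is made up for this example.

```python
def hexdump_head(path, count=16):
    """Return the first `count` bytes of a file as hex plus printable ASCII."""
    with open(path, "rb") as handle:
        data = handle.read(count)
    hex_part = " ".join(f"{byte:02x}" for byte in data)
    # Printable ASCII is shown as-is; every other byte becomes a dot.
    ascii_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in data)
    return f"{hex_part}  |{ascii_part}|"


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, hexdump_head(name))
```

Run it with one or more file paths as arguments and post the output for one of the problem files.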

[deleted] on 25 Aug 2024 20:55 collapse

.

just_another_person@lemmy.world on 25 Aug 2024 21:23 collapse

Okay. First off, I downvoted you for obvious reasons.

Second, if you’re not sure how to extract the header of a file, just Google that. You may be ill prepared and asking for help here.

[deleted] on 25 Aug 2024 22:52 collapse

.

just_another_person@lemmy.world on 25 Aug 2024 22:54 collapse

You don’t understand how file formats work I guess. You can’t just ‘head’ an encoded file and expect a terminal to output what you want. Do some research.

Tippon@lemmy.dbzer0.com on 25 Aug 2024 17:34 next collapse

Have you tried the online version of MS Office? I’m not sure, but I think there’s a free version. Depending on the file, you might be able to convert it to another format, then use a FOSS tool going forwards.

neidu2@feddit.nl on 25 Aug 2024 17:55 next collapse

I was thinking along the same lines. Use the online version available via portal.office.com, and use that to convert everything to something more FOSS-friendly.

Not sure if access is free, though.

Telorand@reddthat.com on 25 Aug 2024 18:09 collapse

This is what I would recommend as well. Try to convert within Word to an older version or open version that’s likely to be compatible with other software. Test one and see if it converts okay.

j4k3@lemmy.world on 25 Aug 2024 18:04 collapse

Too many documents, and Office 365 is a $10-a-month subscription scam.

Tippon@lemmy.dbzer0.com on 25 Aug 2024 18:11 next collapse

I can’t comment on how many documents you have, but there’s a free version of Office 365

[deleted] on 25 Aug 2024 21:04 collapse

.

Grangle1@lemm.ee on 25 Aug 2024 22:14 collapse

If you’re already thinking of extracting/attempting to run a desktop version of Office, you may as well save yourself the effort if you can and give the free online version a try. You’ll be using a proprietary piece of software either way.

[deleted] on 25 Aug 2024 22:56 collapse

.

Blue_Morpho@lemmy.world on 25 Aug 2024 21:01 next collapse

You don’t have to keep the sub. Pay $10, convert the files, unsubscribe.

frazorth@feddit.uk on 27 Aug 2024 10:03 collapse

Don’t buy Office 365, just use Office on live.com. Yes, Microsoft is reusing its live.com domain.

It is free, I use it for dealing with résumés.

onedrive.live.com

pcouy@lemmy.pierre-couy.fr on 25 Aug 2024 18:02 next collapse

In my experience, OnlyOffice has the best compatibility with M$ Office. You should try it if you haven’t

anothermember@lemmy.zip on 25 Aug 2024 20:51 collapse

It’s worth a try, though in my experience it can struggle with very large files.

_edge@discuss.tchncs.de on 25 Aug 2024 18:06 next collapse

I wouldn’t even try with wine these days.

Why don’t you use the Win10 machine you have, the online version of Microsoft Office (web browser or app), a VM with Windows, or (if it works for your case) Google Docs or OnlyOffice?

data1701d@startrek.website on 25 Aug 2024 18:33 next collapse

How old are the .docx files you’re talking about? Something like Office 2010 might run quite well, and your father would probably have had to use some very weird features for it to be incompatible.

[deleted] on 25 Aug 2024 21:00 collapse

.

data1701d@startrek.website on 25 Aug 2024 23:14 collapse

That sucks.

richardisaguy@lemmy.world on 25 Aug 2024 19:10 next collapse

Why do you spell it as “m$ office”?

XEAL@lemm.ee on 25 Aug 2024 19:22 collapse

Greedy fucks.

turkalino@lemmy.yachts on 25 Aug 2024 19:25 next collapse

VMWare and archive dot org are your friend

wizardbeard@lemmy.dbzer0.com on 25 Aug 2024 19:49 next collapse

Assuming the latest version of OpenOffice doesn’t work for these files…


My next course of action would be using the Win 10 machine with Word, or a VM with Win10 or 11 and the latest version of Word. Use MASGrave to trick M$ into considering it licensed if you need to.

Use a PowerShell script to interact with Word through the COM object interface and automate opening Word, opening the file, saving it as a different filetype, and closing. Here’s a snippet of PowerShell from Reddit for going in the opposite direction (odt to docx) for a single file. I wouldn’t try to do this through Linux. Just suck it up and use Windows so you don’t have an extra layer of mess to deal with.

Going off M$ documentation of the save types enum, I would replace “wdFormatDocumentDefault” in that snippet with wdFormatOpenDocumentText or wdFormatStrictOpenXMLDocument, then test it with a single file to see which gives the output you need.

Getting all the files of the starting type from a folder can be done using Get-ChildItem. Store those in a variable and use a foreach loop over the initial file list.
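
The same loop can be sketched in Python instead of PowerShell. This is only a rough outline under stated assumptions: it must run on the Windows machine with Word installed, it assumes the third-party pywin32 package is available, and the function names `plan_conversions` and `convert_all` are made up for this example. The two numeric constants are the values listed for those names in Microsoft's WdSaveFormat enumeration.

```python
from pathlib import Path

# Values from Microsoft's WdSaveFormat enumeration.
WD_FORMAT_OPEN_DOCUMENT_TEXT = 23        # wdFormatOpenDocumentText (.odt)
WD_FORMAT_STRICT_OPEN_XML_DOCUMENT = 24  # wdFormatStrictOpenXMLDocument (.docx)


def plan_conversions(folder, pattern="*.docx", new_suffix=".odt"):
    """Pair each matching file with its output path (the Get-ChildItem step)."""
    sources = sorted(Path(folder).glob(pattern))
    return [(src.resolve(), src.resolve().with_suffix(new_suffix)) for src in sources]


def convert_all(folder, file_format=WD_FORMAT_OPEN_DOCUMENT_TEXT, new_suffix=".odt"):
    """Open each file in Word through COM and save a copy in another format."""
    # Imported here because pywin32 only exists on Windows (assumed installed).
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        for src, dst in plan_conversions(folder, new_suffix=new_suffix):
            doc = word.Documents.Open(str(src))
            try:
                doc.SaveAs2(str(dst), FileFormat=file_format)
            finally:
                doc.Close(False)  # False: close without saving changes to the source
    finally:
        word.Quit()


if __name__ == "__main__":
    import sys

    convert_all(sys.argv[1])
```

Try it on a folder containing a copy of a single file first. If you switch to the strict Open XML format, pick an output suffix or folder that does not overwrite the originals.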

eruchitanda@lemmy.world on 25 Aug 2024 20:05 next collapse

OnlyOffice.

eruchitanda@lemmy.world on 25 Aug 2024 20:05 collapse

Not to be confused with OpenOffice.

(LibreOffice forked from OO back then.)

GetOffMyLan@programming.dev on 25 Aug 2024 20:34 next collapse

Why not just use the windows machine?

theshatterstone54@feddit.uk on 25 Aug 2024 21:48 next collapse

Just use OnlyOffice or this: flathub.org/apps/io.gitlab.o20.word

j4k3@lemmy.world on 25 Aug 2024 22:59 collapse

All of the office suites seem to use either Python 3 docx or iconv under the surface. These tools do not support whatever default encoding M$ is using. It is clearly a font encoding issue, but I won’t know what that font is until my back is in good enough shape to set up a desktop at my bedside workstation.

transientpunk@sh.itjust.works on 26 Aug 2024 01:16 collapse

Why not just use a VM?

[deleted] on 25 Aug 2024 22:04 next collapse

.

datavoid@lemmy.ml on 26 Aug 2024 00:37 collapse

I started using M$ around 2010 personally

whatsgoingdom@rollenspiel.forum on 25 Aug 2024 23:30 next collapse

I’ve had some success with fmstrat/winapps (if I remember the repo correctly) but that might be overkill for your use case

absGeekNZ@lemmy.nz on 25 Aug 2024 23:48 next collapse

I have Office 2007 on a WinXP VM. I haven’t had to use it in a few years, but it is there as a backup.

possiblylinux127@lemmy.zip on 26 Aug 2024 01:33 collapse

That’s EOL

absGeekNZ@lemmy.nz on 26 Aug 2024 01:41 collapse

Long past, but for old files, especially old .doc files, it is great as a backup.

It lives in a VM that never has access to the internet, it almost never gets started up.

muhyb@programming.dev on 26 Aug 2024 00:25 next collapse

It’s not open source but probably has the best compatibility. You can give it a shot.

www.freeoffice.com/en/

Needs an account after one week though.

theshatterstone54@feddit.uk on 27 Aug 2024 08:27 collapse

Looks interesting. Any info on whether Excel Macros work for it?

muhyb@programming.dev on 27 Aug 2024 10:08 collapse

It doesn’t use Visual Basic for its macros, so I wouldn’t expect much compatibility with complex macros. To be fair, Excel macros are usually a problem outside of MS Office.

possiblylinux127@lemmy.zip on 26 Aug 2024 01:33 next collapse

No

You need Windows

nik9000@programming.dev on 26 Aug 2024 01:52 next collapse

Try your local library.

UnbalancedFox@lemmy.ca on 26 Aug 2024 03:14 next collapse

I bought a cheap Win11 + Office 2021 combo on the net and use a VM. It’s not the easiest way, but it works…

😔

thayer@lemmy.ca on 26 Aug 2024 04:28 next collapse

Assuming you meant “.docx files”, those should open without issue in LibreOffice. As others have said, OnlyOffice is another popular option if format preservation is a goal.

What do you mean when you say the files are “not supported” by the tools you’ve tried? What, exactly, is happening and what are you trying to accomplish? The end goal wasn’t clear to me from your post.

Getting Word to run under wine will require much more effort than copying the Word binary.

j4k3@lemmy.world on 26 Aug 2024 05:08 collapse

Yes .docx.

It appears as though the encoding is missing in such a way that nothing in Linux recognizes the file. The underlying CLI tools don’t have a way of converting the file. I tried with Python’s docx tool and with iconv. It has to be encoding related because some tools initially load the file with several sets of Asian characters instead of English. However, there is no hexadecimal or sections of entirely binary-looking data. Archiving tools do not open up the file to reveal anything else like a metafile or header. Neovim shows garbled nonsense throughout. Bat warns of binary. Python won’t load the file, nor will OnlyOffice. LibreOffice and AbiWord load initially with Asian characters before crashing.

The only option is likely going to be setting up the W10 machine and converting a bunch of files within it.

Ultimately, my old man thinks he can be an author all of a sudden and is trying to write. He’s not very capable of learning. I’m not confident that he can learn to use FOSS to do the same thing he has been doing. This post was just to see if there are options I am not already aware of that might actually work in practice. I can easily do everything I need in FOSS. I can do everything he needs to do. I’m more concerned about becoming his tech support when he forgets how to copy pasta. He already fails to separate the internet hardware connectivity from the web browser and operating system within his mental model of technology.
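
On the Asian-characters symptom: one way it can happen (an assumption, not a diagnosis of these particular files) is a tool reading single-byte text as if it were UTF-16, so every pair of Latin letters fuses into one code point that often lands in the CJK range. A small Python sketch of the effect, with made-up function names:

```python
def misdecode_as_utf16(text):
    """Encode text as single-byte ASCII, then wrongly decode the bytes as UTF-16."""
    raw = text.encode("ascii")
    # errors="replace" keeps a trailing odd byte from raising an exception.
    return raw.decode("utf-16-le", errors="replace")


def cjk_count(text):
    """Count characters in the CJK Unified Ideographs block (U+4E00 to U+9FFF)."""
    return sum(1 for ch in text if 0x4E00 <= ord(ch) <= 0x9FFF)


if __name__ == "__main__":
    garbled = misdecode_as_utf16("Chapter1")
    # Print counts rather than the characters, so this works in any terminal.
    print(len(garbled), cjk_count(garbled))
```

Eight Latin letters collapse into four characters here, three of them CJK ideographs, which looks a lot like "several sets of Asian characters instead of English".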

thayer@lemmy.ca on 26 Aug 2024 05:26 next collapse

Thanks for clarifying, and I can appreciate your overall concerns as I face the same dilemma with my aging relatives.

Just to confirm, have you opened these files in Word yourself (or witnessed them being opened), to verify they are in fact valid documents? If valid, are they meant to be in English?

It wouldn’t be the first time I’ve seen “other” files renamed with an incorrect file extension.

[deleted] on 26 Aug 2024 05:46 collapse

.

MonkderVierte@lemmy.ml on 26 Aug 2024 12:20 next collapse

Sure it’s not .doc? Earlier .docx files were rather more standard-compliant than new ones. .doc is the old proprietary MS Word format, while .docx follows the OOXML standard (though with all the proprietary extensions, making the standard useless).

flubba86@lemmy.world on 26 Aug 2024 12:42 collapse

Sounds like it’s actually a .doc file that has been renamed to a .docx for some reason. Real MS Word would probably still open it fine, but open source tools would fall over hard.

You mentioned you can’t decompress it either. If it was a real .docx you could rename the extension to .zip and unzip it with any archiver to see the contents. If the archiver complains about the format, then it’s not a real docx.
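
The rename-and-unzip check can also be done without renaming anything. A minimal Python sketch using only the standard library; the function names are made up for this example, and it relies on the fact that Word normally stores the main text at `word/document.xml` inside the archive:

```python
import zipfile


def inspect_docx(path):
    """Return the archive member names if `path` is a ZIP container, else None."""
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def looks_like_docx(path):
    """True when the file is a ZIP that contains word/document.xml."""
    names = inspect_docx(path)
    return names is not None and "word/document.xml" in names


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, "docx-like" if looks_like_docx(name) else "not a real docx")
```

If `inspect_docx` returns None, the file is not a ZIP at all, which points to a renamed .doc or something else entirely.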

nyan@sh.itjust.works on 27 Aug 2024 14:57 collapse

If it really is a .doc file and written in an ASCII-compatible encoding as most English-language documents are, opening it in a hex editor (or a non-codepage-aware text editor like the Notepad on a W10 or earlier Windows machine) will show an indecipherable proprietary header followed by the text in the file, possibly with a single space or “junk” character between each letter depending on the exact version of Word and system encoding it was written with. There may be occasional additional stretches of markup junk. At the end, there will be a footer with occasional decipherable text strings like “MSWordDoc” and font names.

If you open a .docx file in such a program, you should get a typical zipfile signature: the letters “PK” at the beginning of the file, followed by a lot of gobbledegook. If you don’t get that “PK”, it probably isn’t a .docx.

(I’ve looked at a lot of MS file guts, for both curiosity and information extraction purposes.)
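
The signature check described above is easy to script if you have many files to sort through. A rough Python sketch: the "PK" bytes are the ZIP signature, and the eight-byte value is the signature of the OLE2 compound file container that legacy .doc files use. The function name `sniff_container` is made up for this example.

```python
ZIP_MAGIC = b"PK\x03\x04"                         # ZIP local file header (.docx, .odt)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound file (legacy .doc)


def sniff_container(path):
    """Classify a file by its leading bytes as 'zip', 'ole2', or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(f"{sniff_container(name):8} {name}")
```

A result of "ole2" on a file named .docx would fit the renamed .doc theory; "unknown" means the file is neither container and needs a closer look.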

Kit@lemmy.blahaj.zone on 26 Aug 2024 04:52 next collapse

Honestly it might be worthwhile to just get a month of Microsoft 365 and use the web client. You can upload all of the files to OneDrive and open them in the web version of Word to do what you need. Nothing beats native compatibility in a project of this scope.

ProgrammingSocks@pawb.social on 27 Aug 2024 22:45 collapse

Isn’t there a free tier anyways?

VinesNFluff@pawb.social on 26 Aug 2024 10:54 next collapse

I will agree with the people suggesting “VM and a pirated copy”

Just get like office 2010 and windows 7 off of the web, run it in a VM, convert the files, dump it all.

Arfman@aussie.zone on 26 Aug 2024 13:36 next collapse

Yeah, this is the last version of Office that doesn’t nag you, and you can find keys or buy generated ones off eBay if you feel guilty or worried about malicious cracks.

charliegrahamm@lemm.ee on 26 Aug 2024 14:21 next collapse

Instead of pirating anything, you can instead use:

github.com/…/Microsoft-Activation-Scripts for activation

massgrave.dev/office_msi_links for download of office

VinesNFluff@pawb.social on 26 Aug 2024 15:02 collapse

(these count as piracy, but yes, they work well and are reliable)

daggermoon@lemmy.world on 27 Aug 2024 08:09 collapse

Doesn’t Office 2010 work in Wine?

VinesNFluff@pawb.social on 27 Aug 2024 15:42 collapse

I wouldn’t know, but since OP is having compatibility issues, I’d try to get as close to native as I could. Eliminate the room-for-error. Hence the VM with actual Windows.

They can just delete the lot after they’ve converted their files to an opener format. :P

ninekeysdown@lemmy.world on 26 Aug 2024 11:59 next collapse

To be honest, there are a few good comments here linking to scripts and methods to batch convert them on a Windows PC/VM. That’s the best way to go.

To add to their comments: if you’re just interested in preserving them, then printing them to PDF, specifically PDF/A, would be my approach once you get them opened.

Tabzlock@lemmy.ml on 27 Aug 2024 23:07 next collapse

I’m pretty sure that CodeWeavers CrossOver still works for Microsoft 365. At least I used it with Office 365 last year without major issues.

Presi300@lemmy.world on 29 Aug 2024 07:26 collapse

Generally, no. M$ Office has some pretty invasive DRM, so your best bet for running it on Linux is a Windows virtual machine.