Round trip with Word


#1

I had an outline in OmniOutliner and exported it to Word format so that a colleague could add information to certain lines. I would like to import this Word file back into OO, but the only way seems to be to select all, copy and paste. This removes all indenting and destroys the parent/child relationships.
Surely I am missing something basic here? Is there an easy way to pull Word into OO and retain structure?
Any help appreciated as I would like to do this workflow a lot.


#2

Can you output to .opml format from Word? OO will open OPML files.
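
For anyone unsure what such a file contains: OPML is plain XML in which nested `<outline>` elements carry the row text. Below is a minimal sketch in JavaScript (Node.js) that writes one from a nested list. The node shape (`text`, `note`, `nest`), the function names and the sample data are all made up for illustration, and carrying notes in a `_note` attribute is an assumption about what OO reads, so compare against a file exported from OO itself before relying on it.

```javascript
// Escape characters that are special inside XML text and attribute values.
// Newlines become character references so they survive attribute parsing.
const escapeXml = s =>
    String(s)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/\n/g, '&#10;');

// outlineXml :: Int -> [{ text, note?, nest? }] -> String
// One <outline> element per node, with children nested inside their parent.
const outlineXml = (depth, nodes) =>
    nodes.map(node => {
        const pad = '  '.repeat(depth);
        const attrs = `text="${escapeXml(node.text)}"` +
            // ASSUMPTION: notes are carried in a `_note` attribute.
            (node.note ? ` _note="${escapeXml(node.note)}"` : '');
        const kids = node.nest || [];
        return kids.length > 0 ? (
            `${pad}<outline ${attrs}>\n` +
            outlineXml(depth + 1, kids) +
            `${pad}</outline>\n`
        ) : `${pad}<outline ${attrs}/>\n`;
    }).join('');

// opmlFromNest :: String -> [Node] -> String
const opmlFromNest = (title, nodes) =>
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<opml version="2.0">\n' +
    `  <head>\n    <title>${escapeXml(title)}</title>\n  </head>\n` +
    '  <body>\n' +
    outlineXml(2, nodes) +
    '  </body>\n' +
    '</opml>\n';

// Hypothetical sample outline.
const sample = [{
    text: 'Chapter 1',
    nest: [
        { text: 'Section 1.1', note: 'Added by a colleague' },
        { text: 'Section "1.2" & more' }
    ]
}];

console.log(opmlFromNest('Round trip', sample));
```

Saving the printed text with an .opml extension gives a file to try opening in OO.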


#3

From a forum on exporting OPML from Word:
I’ve also been trying to find something that works… I’m usually really good at finding time-efficient, manageable ways of converting documents, but this one is frustrating me. I’ve recently downloaded the OmniOutliner program, which has the potential to do amazing things for students, especially. I’ve been looking for a way to take documents that I’ve previously made and convert them into the notebook layout, so I can import them into OmniOutliner, which supports the OPML format.
To answer your question:
Yes, there is. I’ve been using a program called ReadTree, which does the conversion for you.
The only problem is the Notebook Layout… Regular file formats (not originally created in Notebook Layout) lose their formatting when transferred over. This forces you to edit every single line of the document to get it into the OPML format you want.
From what your question stated, though, it should work for you.

Well, that sounds just great, doesn’t it? Seriously though, it’s a bit of an issue. I want to be able to work on an OO document, export to Word, and then have a relatively painless return trip. Import from Word outline mode would be good…


#5

Where can I find ReadTree? Or does anyone else know a simple workaround to import a Word document (just the structure: headings, subheadings etc., without formatting)?


#6

If you are happy to use the Terminal.app prompt, I would explore:

http://pandoc.org/


#7

I tried to recommend the Pandoc document converter to you, but the post was helpfully flagged as spam :-)

Pandoc (open source) is written by John MacFarlane at UCB. It’s quite a powerful general-purpose converter which can work, inter alia, between OPML and .docx.

(I won’t include the link again, in case it prompts automatic deletion of the post a second time, but www pandoc org with dots restored should work …)
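
For the Word to OO direction, here is a rough sketch of calling Pandoc from a Node.js script. The equivalent Terminal command would be `pandoc --from docx --to opml --standalone --output outline.opml outline.docx`. The function names and file names are hypothetical, and `--standalone` is included on the assumption that a complete OPML document rather than a fragment is wanted. How well the outline structure survives will depend on how the Word file is styled.

```javascript
const { execFileSync } = require('child_process');

// pandocArgs :: FilePath -> FilePath -> [String]
// Builds the argument list for a .docx -> .opml conversion.
const pandocArgs = (docxPath, opmlPath) => [
    '--from', 'docx',
    '--to', 'opml',
    '--standalone', // assumed: ask for a full document, not a fragment
    '--output', opmlPath,
    docxPath
];

// docxToOpml :: FilePath -> FilePath -> Buffer
// Runs pandoc, which must already be installed and on the PATH.
const docxToOpml = (docxPath, opmlPath) =>
    execFileSync('pandoc', pandocArgs(docxPath, opmlPath));

// Usage (hypothetical file names):
// docxToOpml('outline.docx', 'outline.opml');
```

Passing the arguments as an array through `execFileSync` avoids shell quoting problems with file names that contain spaces.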


#8

Still hoping myself. No news yet, I am sorry to say.


#9

Did try to suggest John MacFarlane’s Pandoc (for OPML ⇄ Word and many other directions) in another thread, but perhaps the spam filter didn’t like seeing a link …


#10

Hello,

I’ve been manually converting my docx outline into OmniOutliner and it’s very time consuming. I tried Pandoc and was able to convert the docx to OPML and open it in OmniOutliner, but there wasn’t anything in the file. I’m going to learn more about Pandoc and try to make it work because I think it’ll pay off in the end.

Does anyone have any advice or a script that can help make this happen? I don’t care about formatting beyond keeping the structure of the outline from Microsoft Word to OmniOutliner.

Thanks,

Davi


#11

It depends a bit on how you want to map the Word outline structures to your OO outline, but here is a JavaScript for Automation script which essentially maps MSWord outline headers and body paragraphs to Outline rows.

If the mapping you use is different (e.g. from body paragraphs to OO notes), then it would need a bit of redrafting.

USAGE

This script reads the outline in the currently active MS Word window, and creates a new document in OO (I have only tested with OO5).

Paste the source script into Script Editor, and test with the language dropdown (at top left) set to JavaScript rather than AppleScript:

(function () {
    'use strict';

    // Create new OO outline from outline in MS Word front window.
        // Ver 0.01  Rob Trew (2017)

        // GENERIC FUNCTIONS -----------------------------------------------------

        // foldl :: (b -> a -> b) -> b -> [a] -> b
        function foldl(f, a, xs) {
            return xs.reduce(f, a);
        };

        // map :: (a -> b) -> [a] -> [b]
        function map(f, xs) {
            return xs.map(f);
        };

        // mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
        function mapAccumL(f, acc, xs) {
            return xs.reduce(function (a, x) {
                var pair = f(a[0], x);
                return [pair[0], a[1].concat([pair[1]])];
            }, [acc, []]);
        };

        // read :: Read a => String -> a
        function read(s) {
            return JSON.parse(s);
        };

        // NESTING FROM INDENTATION ----------------------------------------------

        // nestFromIndents :: [{ text::String, indent::Int }]
        //        -> Tree { text::String, indent:: Int, nest:: [Tree] }
        function nestFromIndents(xs) {
            return foldl(function (levels, x) {
                var indent = x.indent,
                    iNext = indent + 1,
                    iMax = levels.length - 1,
                    node = {
                        text: x.text,
                        nest: []
                    };
                return (
                    levels[indent < iMax ? (
                        indent
                    ) : iMax].nest.push(node),
                    iNext > iMax ? levels.push(node) : levels[iNext] = node,
                    levels
                );
            }, [{
                text: undefined,
                nest: []
            }], xs)[0];
        };

        // FROM MICROSOFT WORD ---------------------------------------------------

        var mw = Application('Microsoft Word'),
            ds = mw.documents,
            mbd = ds.length > 0 ? {
                just: ds.at(0)
            } : {
                nothing: true
            };

        return mbd.nothing ? [] : function () {
            // TEXT NEST FROM MS WORD --------------------------------------------
            // Document must be in draft view
            var oView = mw.windows.at(0)
                .view,
                startViewType = oView.viewType();

            // TEMPORARILY SWITCH TO NORMAL VIEW TO READ OUTLINE LEVELS
            // (FOR SOME REASON NOT ACCESSIBLE IN OUTLINE VIEW)
            startViewType !== "normal view" && (oView.viewType = "normal view");

            // wordParaLevels :: [{ indent :: Int, text :: String }]
            var wordParaLevels = mapAccumL(function (a, x) {
                    var strLevel = x.outlineLevel()
                        .slice(-1),
                        blnHeader = !isNaN(strLevel),

                        // Plain text paras are interpreted as one level more
                        // indented than the most recent outline header.
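                        // Word outline levels start at 1; nestFromIndents treats
                        // indent 0 as the top level, so headers are shifted down.
                        lngLevel = blnHeader ? read(strLevel) - 1 : a + 1;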
                    return [blnHeader ? lngLevel : a, {
                        text: x.textObject.content()
                            .slice(0, -1), // The final CR is not required.
                        indent: lngLevel
                    }];
                }, -1, mbd.just.paragraphs())[1],
                topRows = nestFromIndents(wordParaLevels)
                .nest;

            // RESTORE ORIGINAL VIEW TYPE IF WE HAD CHANGED IT
            oView.viewType() !== startViewType && (oView.viewType = startViewType);

            // TO OMNIOUTLINER ---------------------------------------------------

            // jsoToOO ::  App -> [{ text :: String, nest :: TextNest}] -> [OO Row]
            function jsoToOO(app, oParent, lstNests) {
                return map(function (x) {
                    var oRow = app.Row({
                        topic: x.text
                    });
                    return oParent.children.push(oRow), x.nest.length > 0 ? (
                        jsoToOO(app, oRow, x.nest)
                    ) : oRow;
                }, lstNests);
            };

            var oo = Application('OmniOutliner'),
                d = function () {
                    return oo.activate(), oo.documents.push(oo.Document({
                        name: topRows.length > 0 ? topRows[0].text : 'untitled'
                    })), oo.documents.at(0);
                }(),
                ooRows = jsoToOO(oo, d, topRows);

            return(
                d.expandall(),
                'Nested text copied from MS Word to OmniOutliner'
            );
        }();
    })();
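
To see the nesting step on its own, without Word or OO installed, here is a standalone sketch of the same idea that runs in Node.js. It is a simplified rewrite rather than the code above: indents are assumed to be zero-based (0 = top level), deeper levels are discarded whenever a shallower row appears, and the sample rows are invented.

```javascript
// nestFromIndents :: [{ text :: String, indent :: Int }] -> Tree
// levels[i] is the most recent node that can act as parent of a row
// whose indent is i; levels[0] is always the invisible root.
function nestFromIndents(xs) {
    var root = { text: undefined, nest: [] };
    xs.reduce(function (levels, x) {
        var iMax = levels.length - 1,
            // An indent deeper than any open level is clamped
            // to the deepest level currently open.
            iParent = Math.min(x.indent, iMax),
            node = { text: x.text, nest: [] };
        levels[iParent].nest.push(node);
        // Rows that follow at a deeper indent attach to this node;
        // anything previously open below it is dropped.
        return levels.slice(0, iParent + 1).concat([node]);
    }, [root]);
    return root;
}

// Hypothetical rows, as they might be read from a document.
var rows = [
    { text: 'Introduction', indent: 0 },
    { text: 'Background', indent: 1 },
    { text: 'Some body text', indent: 2 },
    { text: 'Methods', indent: 0 },
    { text: 'Sampling', indent: 1 }
];

var tree = nestFromIndents(rows);
console.log(JSON.stringify(tree.nest, null, 2));
```

The printed structure has two top-level rows, each with its own children, which is the shape that is then pushed row by row into OO.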


#12

Or this variant – possibly closer to the ‘round trip’ spirit, in which

  1. MS Word Heading -> OO Row
  2. MS Word other para -> Additional para for Note of most recent OO Row

((options) => {
    'use strict';

    // MS WORD -> JSO-TEXT-NEST -> OMNI-OUTLINER OUTLINE

        // See options dictionary at foot of code:

        // bodyAsNote : true -> Body paras are notes for preceding header
        // bodyAsNote : false -> Body paras are child rows of preceding header

        // Ver 0.08 Rob Trew (c) 2017

        // GENERIC FUNCTIONS -------------------------------------------------

        // (++) :: [a] -> [a] -> [a]
        const append = (xs, ys) => xs.concat(ys);

        // foldl :: (b -> a -> b) -> b -> [a] -> b
        const foldl = (f, a, xs) => xs.reduce(f, a);

        // map :: (a -> b) -> [a] -> [b]
        const map = (f, xs) => xs.map(f);

        // mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
        const mapAccumL = (f, acc, xs) =>
            xs.reduce((a, x) => {
                const pair = f(a[0], x);
                return [pair[0], a[1].concat([pair[1]])];
            }, [acc, []]);

        // read :: Read a => String -> a
        const read = s => JSON.parse(s);

        // TEXT-NEST JSO FROM MS WORD --------------------------------------------

        // nestFromIndents :: [{ text::String, indent::Int }]
        //        -> Tree { text::String, indent:: Int, nest:: [Tree] }
        const nestFromIndents = xs =>
            foldl((levels, x) => {
                const
                    indent = x.indent,
                    iNext = indent + 1,
                    iMax = levels.length - 1,
                    node = {
                        text: x.text,
                        nest: [],
                        note: x.note || undefined
                    };
                return (
                    levels[indent < iMax ? indent : iMax].nest.push(node),
                    iNext > iMax ? levels.push(node) : levels[iNext] = node,
                    levels
                );
            }, [{
                text: undefined,
                nest: []
            }], xs)[0];

        // For reading body paras as child rows
        // bodyAsRowLevels :: [MSWord Para] -> [{ indent :: Int,  text :: String}]
        const bodyAsRowLevels = xs => mapAccumL(
            (a, x) => {
                const
                    strLevel = x.outlineLevel()
                    .slice(-1),
                    blnHeader = !isNaN(strLevel),
                    // Plain text paras are interpreted as one level more
                    // indented than the most recent outline header.
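                    // Word outline levels start at 1; nestFromIndents treats
                    // indent 0 as the top level, so headers are shifted down.
                    lngLevel = blnHeader ? read(strLevel) - 1 : a + 1;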
                return [blnHeader ? lngLevel : a, {
                    text: x.textObject.content()
                        .slice(0, -1), // Final CR discarded
                    indent: lngLevel
                }];
            }, -1,
            xs
        )[1];

        // For reading body paras as notes
        // bodyAsNoteLevels :: [MSWord Para] ->
        //          [{ indent :: Int, text :: String, note :: String}]
        const bodyAsNoteLevels = xs => foldl(
                (a, x) => {
                    const
                        strLevel = x.outlineLevel()
                        .slice(-1),
                        blnHeader = !isNaN(strLevel),
                        node = blnHeader ? {
                            text: x.textObject.content()
                                .slice(0, -1), // Final CR discarded
                            // Word levels start at 1; top level here is 0.
                            indent: read(strLevel) - 1,
                            note: ''
                        } : Object.assign(
                            a.lastHeader, {
                                note: (a.lastHeader.note || '') +
                                    x.textObject.content()
                                    .slice(0, -1) + '\n'
                            }
                        );
                    return {
                        lastHeader: blnHeader ? node : a.lastHeader,
                        nodes: blnHeader ? (
                            append(a.nodes, [node])
                        ) : a.nodes
                    };
                }, {
                    lastHeader: {
                        indent: 0
                    },
                    nodes: []
                },
                xs
            )
            .nodes;

        // jsoFromMSWord :: MSWord App -> Bool -> MSWord ActiveDoc -> TextNest
        const jsoFromMSWord = (app, blnBodyAsNote, d) => {
            const
                oView = app.windows.at(0)
                .view,
                startViewType = oView.viewType(),
                textNest = ( // TEMPORARY SWITCH TO NORMAL VIEW REVEALS LEVELS
                    (startViewType !== "normal view") && (
                        oView.viewType = "normal view"
                    ),
                    nestFromIndents(
                        (blnBodyAsNote ? bodyAsNoteLevels : bodyAsRowLevels)(
                            d.paragraphs()
                        )
                    )
                    .nest
                );
            return ( // RETURNING TO ORIGINAL VIEW IF CHANGED
                (oView.viewType() !== startViewType) && (
                    oView.viewType = startViewType
                ),
                textNest
            );
        };

        // OO OUTLINE FROM TEXT NEST JSO -----------------------------------------

        // jsoToOO ::  App -> [{ text :: String, nest :: TextNest}] -> [OO Row]
        const jsoToOO = (app, oParent, lstNests) => map(
            x => {
                const oRow = app.Row({
                    topic: x.text,
                    note: x.note ? x.note.slice(0, -1) : ''
                });
                return (
                    oParent.children.push(oRow),
                    x.nest.length > 0 ? jsoToOO(app, oRow, x.nest) : oRow
                );
            },
            lstNests);

        const
            mw = Application('Microsoft Word'),
            ds = mw.documents,
            mbd = ds.length > 0 ? {
                just: ds.at(0)
            } : {
                nothing: true
            };

        return mbd.nothing ? [] : (() => {
            const
                msWordNest = jsoFromMSWord(mw, options.bodyAsNote, mbd.just),
                oo = Application('OmniOutliner'),
                d = (() => (
                    oo.activate(),
                    oo.documents.push(oo.Document({
                        name: msWordNest.length > 0 ? (
                            msWordNest[0].text
                        ) : 'untitled'
                    })),
                    oo.documents.at(0)
                ))(),
                ooRows = jsoToOO(
                    oo, d,
                    msWordNest
                );
            return (
                d.expandall(),
                ooRows
            );
        })();
    })({
        bodyAsNote: true
    });
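
The body-as-note mapping can also be tried in isolation. The sketch below runs in Node.js on plain data and is a simplified rewrite, not the code above: the input shape (`level` as a 1-based heading level, or `null` for body text), the function name and the sample paragraphs are assumptions made for illustration. Body text that appears before any heading is dropped in this sketch.

```javascript
// foldNotes :: [{ level :: Int | null, text :: String }]
//           -> [{ text :: String, indent :: Int, note :: String }]
const foldNotes = paras =>
    paras.reduce((rows, p) => {
        if (p.level !== null) {
            // Heading: start a new row, with a zero-based indent.
            return rows.concat([{
                text: p.text,
                indent: p.level - 1,
                note: ''
            }]);
        }
        const last = rows[rows.length - 1];
        // Body text with no preceding heading has nowhere to go.
        if (last === undefined) return rows;
        // Body text: append as a further paragraph of the last row's note.
        const note = last.note === '' ? p.text : last.note + '\n' + p.text;
        return rows.slice(0, -1).concat([Object.assign({}, last, { note })]);
    }, []);

// Hypothetical paragraphs, in document order.
const paras = [
    { level: 1, text: 'Project plan' },
    { level: null, text: 'Drafted in OO.' },
    { level: null, text: 'Reviewed in Word.' },
    { level: 2, text: 'Timeline' },
    { level: 2, text: 'Budget' },
    { level: null, text: 'To be confirmed.' }
];

const rows = foldNotes(paras);
console.log(JSON.stringify(rows, null, 2));
```

Six paragraphs become three rows here, with the body paragraphs kept as the notes of the headings they follow.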


#13

Thanks for posting these; however, I couldn’t get it to work if there were one or more images in the MS Word file, which is disappointing. I wish that OO would get more import support. It’s really a big bottleneck.


#14

“images in the MS Word file”

Yes – these are only textual functions.

(Word-embedded images are not part of my workflows, so it is unlikely that I will get around to looking at that, I’m afraid.)


#15

This solution is awesome; I guess you are a programmer!? I signed up JUST to say thank you; on my own iMac this 100% deals with my problem (I mostly use the Outline mode of Word at work, but would like to take this format about on my iPad).

All I need to do now is work out how to reproduce this from my office PC too…