.. highlight:: cpp

Overview
========

Composition of the library
--------------------------

The primary goal of the orcus library is to provide a framework to import the
contents of documents stored in various spreadsheet or spreadsheet-like
formats.  The library also provides several low-level parsers that can be used
independently of the spreadsheet-related features.  In addition, it supports
some hierarchical document formats, such as JSON and YAML, which were added to
the library later.

You can use this library through its C++ API, its Python API, or its CLI.
However, the three do not expose all features of the library equally; the C++
API is more complete than the other two.

The library is physically split into four parts:

    1. the parser part that provides the aforementioned low-level parsers,
    2. the filter part that provides higher-level import filters for spreadsheet
       and hierarchical documents that internally use the low-level parsers,
    3. the spreadsheet document model part that includes the document model suitable
       for storing spreadsheet document contents, and
    4. the CLI for loading and converting spreadsheet and hierarchical documents.

If you only need the parser part of the library, link against the
``liborcus-parser`` library file alone.  If you need the import filter part,
link against both the ``liborcus-parser`` and the ``liborcus`` libraries.
Likewise, if you need the spreadsheet document model part, link against the
aforementioned two plus the ``liborcus-spreadsheet-model`` library.

Also note that the spreadsheet document model part has an additional dependency
on the `ixion library <https://gitlab.com/ixion/ixion>`_ for handling formula
re-calculations on document load.


Loading spreadsheet documents
-----------------------------

The orcus library's primary aim is to provide a framework to import the contents
of documents stored in various spreadsheet or spreadsheet-like formats.  It
supports two primary use cases.  The first use case is where the client
program does not have its own document model, but needs to import data from a
spreadsheet-like document file and access its content without implementing its
own document store from scratch.  In this use case, you can simply populate an
instance of the :cpp:class:`~orcus::spreadsheet::document` class, and access
its content through its API afterward.

The second use case, which is a bit more advanced, is where the client program
already has its own internal document model, and needs to use orcus
to populate that document model.  In this use case, you can implement your own
set of classes that support the necessary interfaces, and pass them to the
orcus import filter.

For each document type that orcus supports, there is a top-level import filter
class that serves as the entry point for loading the content of a document.
You don't pass your document to this filter directly; instead, you wrap your
document with what we call an **import factory**, then pass this factory
instance to the filter.  The import factory is required to implement the
interfaces that the filter class uses to pass data to the document as the file
is being parsed.

When using orcus's own document model, you can simply use orcus's own import
factory implementation to wrap its document.  When using your own document
model, on the other hand, you'll need to implement your own set of interface
classes to wrap your document with.
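
The arrangement described above (filter, import factory, document) can be
sketched in a few lines of standalone C++.  This is only an illustration of
the pattern, not the orcus API: the names ``toy_import_factory``,
``toy_sheet``, ``toy_csv_filter``, ``my_document`` and ``load_example`` are
all hypothetical, and the "filter" is a naive comma splitter rather than a
real parser.  The actual interfaces and filter classes are covered in the
sections linked below.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// NOTE: every type below is hypothetical and exists only to illustrate the
// factory pattern.  None of these are the real orcus interfaces.

// Interface the toy filter uses to push cell values into a sheet.
struct toy_sheet
{
    virtual ~toy_sheet() = default;
    virtual void set_cell(int row, int col, const std::string& value) = 0;
};

// Interface the toy filter uses to obtain sheets from the wrapped document.
struct toy_import_factory
{
    virtual ~toy_import_factory() = default;
    virtual toy_sheet* append_sheet(const std::string& name) = 0;
};

// A client program's own document model: sheet name -> (row, col) -> value.
using cell_store = std::map<std::pair<int, int>, std::string>;

struct my_document
{
    std::map<std::string, cell_store> sheets;
};

// Wraps one sheet of my_document so that the filter can write into it.
class my_sheet : public toy_sheet
{
    cell_store& m_cells;
public:
    explicit my_sheet(cell_store& cells) : m_cells(cells) {}

    void set_cell(int row, int col, const std::string& value) override
    {
        m_cells[std::make_pair(row, col)] = value;
    }
};

// Wraps my_document as a whole; this is what gets handed to the filter.
class my_factory : public toy_import_factory
{
    my_document& m_doc;
    std::vector<std::unique_ptr<my_sheet>> m_sheets;
public:
    explicit my_factory(my_document& doc) : m_doc(doc) {}

    toy_sheet* append_sheet(const std::string& name) override
    {
        m_sheets.push_back(std::make_unique<my_sheet>(m_doc.sheets[name]));
        return m_sheets.back().get();
    }
};

// Stand-in for an import filter: it only ever talks to the factory
// interfaces and knows nothing about my_document.
class toy_csv_filter
{
    toy_import_factory* m_factory;
public:
    explicit toy_csv_filter(toy_import_factory* factory) : m_factory(factory)
    {
        assert(m_factory);
    }

    void read_stream(const std::string& content)
    {
        toy_sheet* sheet = m_factory->append_sheet("data");
        std::istringstream lines(content);
        std::string line;
        for (int row = 0; std::getline(lines, line); ++row)
        {
            std::istringstream cells(line);
            std::string cell;
            for (int col = 0; std::getline(cells, cell, ','); ++col)
                sheet->set_cell(row, col, cell);
        }
    }
};

// Wrap the document in a factory, hand the factory to the filter, then load.
my_document load_example(const std::string& content)
{
    my_document doc;
    my_factory factory(doc);
    toy_csv_filter filter(&factory);
    filter.read_stream(content);
    return doc;
}
```

The point of the indirection is that the filter depends only on the two
interfaces, so the same filter can populate either a library-provided document
or a client-defined one, depending on which factory it is given.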

The following sections describe how to load a spreadsheet document by using 1)
orcus's own spreadsheet document class, and 2) a user-defined custom document
class.

.. toctree::
   :maxdepth: 1

   doc-orcus.rst
   doc-user.rst


Loading hierarchical documents
------------------------------

The orcus library also includes support for hierarchical document types such
as JSON and YAML.  The following sections delve more into the support for
these types of documents.

.. toctree::
   :maxdepth: 1

   json.rst
   yaml.rst