Post CAXobB8EUFr

Lúthien Merilin Aug 05, 2017 (20:09)

Hello +Paul Strack, (+Ekin Gören, +Eryn Galen, +Roman Rausch)

there's something that I would like to have your opinion about :)

Over the past few weeks Roman and I have been wondering whether it would be possible to simplify the Eldamo model somewhat without sacrificing any of its versatility.

It had already struck me that many of the compound types that you see in the Word and Ref elements - Derivations, Inflections, Cognates, etcetera - are quite similar: most of them link a Form to either a Word entry or a Ref element.
This is why I grouped them together in the Linked entity.
Of course, the information about the particular type (is it a Derivation, an Inflection or a Cognate, etc.) is preserved.

But then again, we were wondering whether all Ref elements are not essentially the same thing, as you describe it on the XML Schema Documentation page: "Reference(s) to an attested form (@v) in Tolkien’s writing (...)". The only significant difference is that they have a link to a Source. But other elements have links to a Source as well.

OK, to make it more concrete: I was working on the dictionary app today and wondering how to represent the deriv element. The entry for Edhel is a great example with lots of _Ref_ elements in it (simplifying things somewhat for the sake of the discussion):















... many more ....







... many more ....
--- and then the Child Word elements go even two more levels deep.

I was at one point wondering what the added value is here to have all those , , elements embedded inside elements.

Could the same richness not also be captured if you were to replace all those different types of related entities with just two entities, an Entry (Word) and a Form, and let the relationship between them carry the specifics (is it a Derivation, a Cognate, or whatever else)?
There would still be some other entities, like Doc, Source, Language - but not a great deal.

It's essentially a collection of graphs, each starting from a root Word (Entry) element, where the related nodes are all Forms connected to one another by a "deriv", "cognate", "child", "before" or whatever other type of relationship.
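To make that a bit more tangible, here is a minimal sketch in Python of what such a graph could look like. It is only an illustration under my own assumptions: the names `Node`, `Graph`, `link` and `related` are made up, the form and source values are placeholders rather than real Eldamo data, and nothing here reflects how Eldamo is actually implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Either an Entry (Word) or a Form; hypothetical, not an Eldamo type."""
    kind: str   # "entry" or "form"
    value: str  # the word or attested form itself


@dataclass
class Graph:
    """Typed, directed edges: (from node, relationship, to node, source ref)."""
    edges: List[Tuple[Node, str, Node, Optional[str]]] = field(default_factory=list)

    def link(self, src: Node, rel: str, dst: Node,
             source_ref: Optional[str] = None) -> None:
        # The relationship type ("deriv", "cognate", ...) lives on the edge,
        # so no separate element type is needed for each kind of link.
        self.edges.append((src, rel, dst, source_ref))

    def related(self, src: Node, rel: str) -> List[Node]:
        # All nodes reachable from src through one edge of the given type.
        return [dst for s, r, dst, _ in self.edges if s == src and r == rel]


g = Graph()
edhel = Node("entry", "edhel")
form_a = Node("form", "placeholder-form-a")  # placeholder, not attested data
form_b = Node("form", "placeholder-form-b")  # placeholder, not attested data

g.link(edhel, "deriv", form_a, source_ref="placeholder-source-1")
g.link(edhel, "cognate", form_b)
g.link(form_a, "deriv", form_b)

derivs = g.related(edhel, "deriv")
cognates = g.related(edhel, "cognate")
```

The point of the sketch is that a Derivation, a Cognate and so on differ only in the label on the edge, while the optional `source_ref` shows that a link to a Source can hang on any relationship.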

Another reason I started thinking about this is that I have been working a lot with graph databases at work over the past few months: we use them there to model hierarchical Cultural Heritage Objects such as newspapers, magazines, and the like.
I'm going to think about this in the coming days, but I am really curious whether you ever considered something like this, and what your thoughts about it are.


Lúthien Merilin Aug 06, 2017 (18:08)

PS - for the record, what I wrote above is not in any way meant as criticism of the existing model. Above all, I'm still very much impressed with the rigour and sheer size of your effort.

The sort of alternate representation that I am thinking about does not concern the data itself or how the entities relate to one another. It would just be another way to wrap it up; a way that might make things more transparent.

This could be done using a regular relational database, but also with a special graph database like Blazegraph (https://wiki.blazegraph.com/wiki/index.php/Main_Page). Anyhow, it's an interesting option.
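As a rough sketch of the relational variant, the same idea fits in two tables, one for nodes and one for typed edges. This uses Python's built-in `sqlite3` module with an in-memory database; the table names, column names and helper functions are my own assumptions, and the inserted values are placeholders rather than real Eldamo data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (
    id    INTEGER PRIMARY KEY,
    kind  TEXT NOT NULL,      -- 'entry' or 'form'
    value TEXT NOT NULL
);
CREATE TABLE edge (
    src        INTEGER NOT NULL REFERENCES node(id),
    dst        INTEGER NOT NULL REFERENCES node(id),
    rel        TEXT NOT NULL, -- 'deriv', 'cognate', 'child', ...
    source_ref TEXT           -- optional link to a Source
);
""")


def add_node(kind, value):
    # Returns the id of the newly inserted row.
    cur = conn.execute(
        "INSERT INTO node (kind, value) VALUES (?, ?)", (kind, value))
    return cur.lastrowid


def add_edge(src, rel, dst, source_ref=None):
    conn.execute(
        "INSERT INTO edge (src, dst, rel, source_ref) VALUES (?, ?, ?, ?)",
        (src, dst, rel, source_ref))


entry = add_node("entry", "edhel")
form_a = add_node("form", "placeholder-form-a")  # placeholder value
form_b = add_node("form", "placeholder-form-b")  # placeholder value
add_edge(entry, "deriv", form_a, "placeholder-source-1")
add_edge(entry, "cognate", form_b)

# All forms linked to the entry by a 'deriv' relationship.
rows = conn.execute(
    """SELECT n.value
       FROM edge e JOIN node n ON n.id = e.dst
       WHERE e.src = ? AND e.rel = ?
       ORDER BY n.id""",
    (entry, "deriv")).fetchall()
```

A dedicated graph database would mostly make the multi-level traversals (children of children, and so on) easier to write; for single-hop lookups like this one a plain join is enough.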