Update March 23rd
Over the past few weeks I've been busy working on a few other things, but that was a real benefit for this project as well, as I got to know the ins and outs of what you can do with an HTML form backed by a lot of JS and CSS. Knowing what is possible, I have now written a first concept for the interface with which one will be able to search through the database. And I have gone nuts with my ideas: nothing shall be impossible, and all of it shall be easy to use even without any technical knowledge. I do, however, also want to make it possible to inject RegEx and maybe even custom SQL SELECT statements.
Here's the written draft so far; I'll work on a GUI mock-up next to make the options a bit easier to understand. If you have any further ideas about what sort of query one might want to run on this database, please voice them!
https://docs.google.com/document/d/1U7nIIaDWv0YRCqODPN4SBwfMeEcL_POoUZV38wv2WQY/edit?usp=sharing
Furthermore, I documented what I have exported from eldamo so far, and how, so that may be interesting for you, +Paul Strack!
https://docs.google.com/document/d/1CY-ys_C4i5UCGkBpqKHFfydlRzraEKYDlS6qG77GEgU/edit?usp=sharing
While writing the document about the search form I constantly had to consider how to display the information I'd like to gather in my database. For now I have combined those thoughts with the big open question: how much of eldamo do I want to export? +Lúthien Merilin kicked off the discussion about the more in-depth aspects of eldamo, and Paul in particular elaborated on the phonological rules. Connected to these is a very large field of data with a great many facets, which I have yet to understand in its entirety. For now I have decided to exclude this and other sorts of data, e.g. the data of the grammar pages, from my project, for the following reasons:
- My database is aimed at helping people (inexperienced and professional students of elvish alike) make translations with the mindset of "getting it done", i.e. not to create new theories about the languages, but to use the languages.
- I don't know how to display this data in a meaningful way, or what options to supply to query it.
- I do not want to replace eldamo; on the contrary, I have now decided that I'd much rather incorporate eldamo into my database by providing links to the detailed word pages along with the results (provided you are okay with this idea, Paul!)
- It makes Lúthien's project differ more from mine, which makes either of the two more interesting to follow, and eventually (hopefully) use frequently for working with Tolkien's languages.
More concretely, I will discard all phonology-related elements from the eldamo XML file and filter out the following:
- phoneme
- phonetic-rule
- grammar
- phonetic-group
- phonetics
Everything else, including names, phrases, text and roots, I will include.
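The filtering step described above could look roughly like the sketch below. It uses Python's standard `xml.etree.ElementTree`; the element names are the ones listed in the post, while the sample document (root name, attributes) is made up for illustration and is not the real eldamo layout.

```python
import xml.etree.ElementTree as ET

# Element names listed in the post as excluded from the export.
EXCLUDED_TAGS = {"phoneme", "phonetic-rule", "grammar", "phonetic-group", "phonetics"}


def strip_excluded(root: ET.Element) -> ET.Element:
    """Remove every element whose tag is in EXCLUDED_TAGS, at any depth."""
    # Snapshot the tree first so that removing children does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in EXCLUDED_TAGS:
                parent.remove(child)
    return root


# Tiny invented sample; the real file is far richer and may be structured differently.
SAMPLE = """<word-data>
  <word l="s" v="acharn"/>
  <phonetic-rule from="x" to="y"/>
  <word l="s" v="aelin"><grammar/></word>
  <phonetics><phoneme v="a"/></phonetics>
</word-data>"""

root = strip_excluded(ET.fromstring(SAMPLE))
remaining = [element.tag for element in root.iter()]
print(remaining)
```

Everything not named in `EXCLUDED_TAGS` passes through untouched, so names, phrases and roots would survive without being listed explicitly.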
- Change of topic -
I have a question about the data, and it is about
Paul Strack Mar 24, 2017 (04:54)
One thing your system can do that Eldamo can't is sophisticated search (since Eldamo is not backed by a DB). It might be nice to have Soundex search for cases where you don't know exactly how to spell a word.
For Sindarin in particular, it would be useful to build indexes of lenited and plural forms of all words. The search results could then include the singular and unlenited forms. For those new to Sindarin, I imagine that would be extremely useful.
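To give an idea of what a Soundex search involves, here is a minimal sketch of the classic American Soundex code in Python. Stripping diacritics before encoding is an assumption added here so that accented spellings compare equal; Soundex was designed for English surnames, so how well it suits elvish words would need to be tested.

```python
import unicodedata

# Letter groups of classic American Soundex; vowels, h, w and y carry no code.
CODES = {}
for digit, group in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for letter in group:
        CODES[letter] = str(digit)


def soundex(word: str) -> str:
    """Return the four-character Soundex code of a word ('' if it has no letters)."""
    # Decompose accented letters (ú -> u + accent) and keep only plain a-z.
    decomposed = unicodedata.normalize("NFKD", word.lower())
    letters = [c for c in decomposed if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "hw":
            continue  # h and w do not separate two letters with the same code
        code = CODES.get(letter, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the run, so a repeated code counts again
    return (result + "000")[:4]


print(soundex("acharn"), soundex("akharn"), soundex("Lúthien"))
```

Storing the code in an extra indexed column would let a search for a misspelled word match every entry that shares its code.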
Paul Strack Mar 24, 2017 (05:01)
eldamo.org - Eldamo : Sindarin : acharn
The hierarchy may be based on similarities of form or meaning. In many cases this relationship can be subjective, though.
Paul Strack Mar 24, 2017 (05:04)
eldamo.org - Eldamo : Sindarin : Aelin-uial
That, and Christopher Tolkien has done most of this work already in the History of Middle Earth series.
Severin Zahler Mar 24, 2017 (11:23)
The open question is just whether it is better performance-wise to have every single inflection of every single word pre-generated and saved to the database (with a couple of thousand words and a handful of inflections each, that would be a few tens of thousands of data points), or whether those inflections that have no exceptions should be generated when they're needed. Initially I intended to follow the second idea, but since that amount of data can easily be stored without having to worry about disk space, it is probably both easier and, I presume, also faster to have them pre-generated.
I have not heard of Soundex before, will check that out definitely :D
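The pre-generated variant discussed above can be sketched with an indexed lookup table that maps each inflected or mutated form back to its base form. The table and column names are hypothetical and the three rows are just sample data; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inflection (
    form TEXT NOT NULL,   -- inflected or mutated form as the user might type it
    base TEXT NOT NULL,   -- dictionary form it belongs to
    kind TEXT NOT NULL    -- what sort of form it is
);
CREATE INDEX idx_inflection_form ON inflection(form);
""")

# Sample rows; a real table would be filled once from the generated forms.
conn.executemany(
    "INSERT INTO inflection VALUES (?, ?, ?)",
    [
        ("edain", "adan", "plural"),
        ("yrch", "orch", "plural"),
        ("galad", "calad", "soft mutation"),
    ],
)


def base_forms(form: str) -> list:
    """Return (base, kind) pairs for every stored form equal to the input."""
    cursor = conn.execute(
        "SELECT base, kind FROM inflection WHERE form = ? ORDER BY base", (form,)
    )
    return cursor.fetchall()


print(base_forms("yrch"))
```

With the index on `form`, a lookup stays a single indexed query no matter how many forms are stored, which is the main argument for pre-generating them.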
Paul Strack Mar 25, 2017 (19:57)
In particular, for Sindarin, it tells you when a word began with a nasalized stop, which is important for lenition.
For Quenya it indicates older ñ and th in spelling (using thorn, which I can't type on my phone).
Severin Zahler Mar 27, 2017 (08:31)
Paul Strack Mar 27, 2017 (16:10)
Severin Zahler Mar 29, 2017 (15:22)
Although I did not take direct inspiration from this conversation when drafting the concept for my own elvish dictionary / database, it meets astonishingly many of the points mentioned, mainly thanks to Paul's eldamo, but also because of the interfaces I plan to add:
Mentioned by +Lúthien Merilin:
- Reliability of words: Filterable by eldamo-marks
- Etymological information like silme vs. thule: Thanks to Paul's reminder, this information will also be available.
- Exclude reconstructions and neologisms: Yes, by marks
- Include names: Yes, maybe sentences as well
- "Easy Regex": No; a great many buttons make it possible to submit almost any query without having to know any Regex.
- Full Regex injections: Yes
- Automatically generate new words by applying certain rules to the existing vocabulary: No, although it may be interesting to build a tool that makes this half-automatic, i.e. the tool suggests such a neologism and the user decides whether to add it or not.
- Search inflected words (as leo.org does to some extent) and deliver base form of word as result: Yes
- Accommodate both academic and non-academic users: Yes. I may not supply the phonological development and rules, but for "academic" users wanting to translate something the DB should offer plenty of uses.
- Provide Tolkien's roots: Yes, as far as eldamo contains them.
- differentiate attestations: Yes
- Mark deduced words: Yes. If the deduced word comes from eldamo, the relation should already be present; if the word comes from a list of neologisms, such relations should be added along with it.
- Users can add synonymous glosses: Yes (cf. below)
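One way the "full Regex" option from the list above could be wired up is sketched here with SQLite, which lets the application supply the function behind its `REGEXP` operator. The `word` table and its contents are hypothetical, and other database engines handle regular expressions differently.

```python
import re
import sqlite3


def regexp(pattern: str, value) -> int:
    """Backs SQLite's REGEXP operator: 'X REGEXP Y' calls regexp(Y, X)."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE word (form TEXT)")
conn.executemany(
    "INSERT INTO word VALUES (?)",
    [("acharn",), ("aelin",), ("calad",), ("galad",)],
)


def search(pattern: str) -> list:
    """Return all forms matching a user-supplied regular expression."""
    re.compile(pattern)  # reject invalid patterns before touching the database
    rows = conn.execute(
        "SELECT form FROM word WHERE form REGEXP ? ORDER BY form", (pattern,)
    )
    return [form for (form,) in rows]


print(search("^[cg]alad$"))
```

The pattern is passed as a bound parameter, so the user's input never becomes part of the SQL text itself. Custom SELECT statements would need much stricter safeguards than this.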
Mentioned by David Giraudeau:
- Normalizations (k >> c) being visible: Should be covered by the eldamo word --> reference relations.
- Full etymology: Any relation that's known will find a place in the DB and can be displayed accordingly, additionally the words will be linked to their respective eldamo-Page where the etymology is very nicely unfolded (I am rather trying to display the entries in a compact manner)
- Editorial notes and external + internal history: Rather No.
- Links between quotes (i.e. phrases) and entries: Afaik eldamo only has the relations phrase --> reference (
-> word. Can easily be added though.
- external dating: Via the linked source: Yes
As mentioned by +Tamas Ferencz, Roman Rausch and others:
The fact that so many projects of this kind have stalled before really made me consider the entire story around opening the database to user inputs. Right now I have the following two things in my concept:
- Suggest changes: Every word displayed when you submit a query shall have a button next to it that lets anyone suggest changes to that word, be that fixing mistakes or adding new things like synonyms as discussed above. Suggestions can be made on any part of the entry.
- Translation page (as voiced by +Andre Polykanine): As a more specific tool I want to add a page where one can efficiently translate the queried words into a real language of one's choice. The words will be listed on one page along with the English glosses and a text field each to enter a translation.
Either sort of submission will be stored in a suggestions database / table and I will be notified about new suggestions. I then can accept or decline the suggestions and accordingly the data gets added to the main database.
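The review flow described above might be modelled roughly as follows. This is only a sketch: the table layout is hypothetical, it covers suggested glosses only, and the notification step is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gloss (word TEXT NOT NULL, gloss TEXT NOT NULL);
CREATE TABLE suggestion (
    id     INTEGER PRIMARY KEY,
    word   TEXT NOT NULL,
    gloss  TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
);
""")


def suggest(word: str, gloss: str) -> int:
    """Store a user suggestion and return its id; the main table is untouched."""
    with conn:
        cursor = conn.execute(
            "INSERT INTO suggestion (word, gloss) VALUES (?, ?)", (word, gloss)
        )
    return cursor.lastrowid


def review(suggestion_id: int, accept: bool) -> bool:
    """Accept or decline a pending suggestion; False if it is not pending."""
    row = conn.execute(
        "SELECT word, gloss FROM suggestion WHERE id = ? AND status = 'pending'",
        (suggestion_id,),
    ).fetchone()
    if row is None:
        return False
    with conn:  # copy to the main table and update the status in one transaction
        if accept:
            conn.execute("INSERT INTO gloss VALUES (?, ?)", row)
        conn.execute(
            "UPDATE suggestion SET status = ? WHERE id = ?",
            ("accepted" if accept else "declined", suggestion_id),
        )
    return True


first = suggest("mellon", "companion")   # invented example suggestion
second = suggest("mellon", "not a real gloss")
review(first, accept=True)
review(second, accept=False)
print(conn.execute("SELECT word, gloss FROM gloss").fetchall())
```

Keeping declined suggestions in the table with a status, rather than deleting them, leaves a record of what has already been reviewed.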
I would also be open to having a login area where specifically trusted and interested people can log in and participate in the review process, and thus in the DB administration. For that there would need to be someone willing to help out, but before that happens my project has to prove its usefulness first, I guess. If it does end up being useful, the motivation, both mine and probably that of others using the database, may be big enough to really want to preserve it for the days to come.
Regarding the topic of adding altogether new entries (as mentioned by +Paul Strack): I may add a tool for submitting single entries as suggestions, but I doubt it would be worth providing a fancy interface for the presumably very rare bulk imports. Those could be forwarded to me and imported directly via SQL from a CSV file or similar.
On the other end: bulk exports. I have not thought about this so far. On the one hand the results will surely be presented in a table format, but some of the fields contain things that won't fit into a CSV file (for instance) so easily, e.g. buttons to pop up the entire inflection table of a word. Generally I don't think it should be very difficult to save a set of query results as a file.
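A rough sketch of such an export with Python's `csv` module follows. The field names and the sample entry are assumptions; the nested inflection table is flattened into a single text cell, which is only one of several possible ways to deal with fields that do not fit a flat file.

```python
import csv
import io

FIELDS = ["word", "language", "gloss", "inflections"]


def export_csv(results: list) -> str:
    """Serialize query results to CSV text, flattening the inflection table."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for entry in results:
        flat = dict(entry)
        # A nested table cannot live in one CSV cell, so join it as "kind=form" pairs.
        inflections = entry.get("inflections", {})
        flat["inflections"] = "; ".join(
            f"{kind}={form}" for kind, form in sorted(inflections.items())
        )
        writer.writerow(flat)
    return buffer.getvalue()


# Invented sample result row.
sample = [
    {
        "word": "adan",
        "language": "s",
        "gloss": "man, human",
        "inflections": {"plural": "edain"},
    }
]
output = export_csv(sample)
print(output)
```

The `csv` module quotes cells that contain commas, so glosses such as "man, human" stay in a single column.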
Regarding what platforms I am headed for: Right now I am completely focused on a browser interface. Personally I am unsure how useful a mobile app would be: someone making a proper translation probably won't do it in the morning on the train from their phone, but rather sitting at a desk at home, or at least with a laptop at hand. I will, however, make sure that the website is responsive and usable on mobile devices.
It also looks pretty good as to whether the project will see the light of day at all, and with the already fantastic dataset of eldamo it should be useful from day one. Personally, even after multiple years of working with elvish, I am still getting into the matter rather than feeling like I have been in the "business" for ages, and I have already had fantastic experiences with it.
plus.google.com - Elvish dictionary app(lication) Mellyn, following a session at the Omentiel...