User:Most2dot0/Ideas

From Angelina Jordan Wiki
Revision as of 10:50, 29 September 2025 by Most2dot0 (talk | contribs) (Events, Contexts, and Categories: new Cargo implementation subsection section)

Database

Many of the ideas sketched out below already have some implementation. I will link to the Modules, Data, etc. that realize some elements.

Common Database Structure

Have a common database structure for video sources (and possibly other content, like pictures, e.g. FB & Insta posts w/o video, etc.). The data would be identified by URL, and include

  • URL
  • Source type (YT, IG, FB, ...)
  • Channel name & id (as subcategory of source type)
  • Language
  • Comment
  • Recording and publish dates
  • Open Graph data derived from the URL, especially all basic tags, i.e. Title, Description, Image

This would be stored in its own table. Possibly the Open Graph data would even have its own table, so that it is easier to regenerate.
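How the Open Graph tags might be derived from a page can be sketched as follows. This is only a minimal illustration using Python's standard `html.parser`; the class and function names and the sample HTML are assumptions made for the example, not part of any existing module on this wiki.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:") and "content" in attributes:
            # Keep the first occurrence of each tag, keyed without the "og:" prefix.
            self.tags.setdefault(prop[3:], attributes["content"])


def extract_open_graph(html):
    """Return a dict of Open Graph tags, e.g. title, description, image."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


# Hypothetical page content; a real run would first fetch the HTML for the URL.
sample_html = """
<html><head>
<meta property="og:title" content="Example title">
<meta property="og:description" content="Example description">
<meta property="og:image" content="https://example.org/thumb.jpg">
</head><body></body></html>
"""
og_data = extract_open_graph(sample_html)
```

Keeping the result in a separate structure keyed by URL matches the idea of a dedicated Open Graph table that can be regenerated without touching the manually maintained fields.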

The songs table would not repeat this information, other than what is needed for linkage. Instead, a table of events would be added with the song title as link to the songs database. Events could be specific rehearsal sessions, concerts, album (recordings), etc. The Events should be uniquely identifiable by name. Another table would then list recordings, linked by the song title to the songs table, by the event name to the event table, and by the URL to the video source info table.
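The table layout described above could look roughly like this. The sketch uses an in-memory SQLite database purely as a stand-in for illustration (the wiki itself stores the data in JSON pages or Cargo tables); all table names, column names, and sample rows are assumptions.

```python
import sqlite3

# Hypothetical schema mirroring the tables described in the text.
SCHEMA = """
CREATE TABLE video_sources (
    url            TEXT PRIMARY KEY,
    source_type    TEXT,   -- YT, IG, FB, ...
    channel_name   TEXT,
    channel_id     TEXT,
    language       TEXT,
    comment        TEXT,
    recording_date TEXT,
    publish_date   TEXT
);
CREATE TABLE open_graph (
    url         TEXT PRIMARY KEY REFERENCES video_sources(url),
    title       TEXT,
    description TEXT,
    image       TEXT
);
CREATE TABLE songs  (title TEXT PRIMARY KEY);
CREATE TABLE events (name  TEXT PRIMARY KEY);
CREATE TABLE recordings (
    song_title TEXT REFERENCES songs(title),
    event_name TEXT REFERENCES events(name),
    url        TEXT REFERENCES video_sources(url)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample rows, only to show how the linkage works.
conn.execute(
    "INSERT INTO video_sources (url, source_type) VALUES (?, ?)",
    ("https://example.org/v1", "YT"),
)
conn.execute("INSERT INTO songs VALUES (?)", ("Example Song",))
conn.execute("INSERT INTO events VALUES (?)", ("Example Concert",))
conn.execute(
    "INSERT INTO recordings VALUES (?, ?, ?)",
    ("Example Song", "Example Concert", "https://example.org/v1"),
)

# A recording resolves to its source info through the URL.
rows = conn.execute(
    """
    SELECT r.song_title, r.event_name, v.source_type
    FROM recordings AS r
    JOIN video_sources AS v ON v.url = r.url
    """
).fetchall()
```

The recordings table carries only the three linking keys, so source details never need to be repeated in the songs or events tables.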

Performances are meant here in the sense of a single song performance, generally as part of an event.

Implementation of Elements

The following data may be stored in their own wiki articles, one per entry. Such articles can be identified as data of this kind by belonging to a corresponding category (though consider whether a location should instead point to a section in an event article):

  • Songs
  • Events
  • Locations
  • Artists

In the future, it could be considered to have templates there that tag available data for scraping purposes, so that e.g. a bot could periodically collect them into tables for use in other pages as well.

The following data may be stored in data formats like JSON, with a single page with a list of entries:

That the SongsToVideos table contains links to video-segments is an interim solution, which should not be needed in the long run. Instead, I envision the Performances table to be the central data storage that links all the other elements together. It will be augmented by the autogenerated MetaData table providing additional info for linked videos.

Performances Table

An entry in the performances table does not necessarily have its own name. It will link to other database elements by name, and can be identified by the combination of those, plus its data. These are:

  • Song Title
  • Event
  • Date (possibly only as month or year indication)
  • Type (Music-Video, Track, Recording, Live, Soundcheck, Rehearsal, Sing-along, Lip-sync)
  • Position (e.g. on setlist of concert, or sequence number of different rehearsals of same title/event)
  • Venue (if known)
  • Comment
  • Video-Segments, with
      - URL,
      - start & end times (possibly only contained in URL),
      - Duration (full, short, fragment),
      - Quality (best, good, acceptable, poor)
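As an illustration of how such an entry could be represented and identified without a name of its own, here is a minimal sketch with Python dataclasses. The class names, field names, and the choice of fields used in the identifying key are assumptions for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class VideoSegment:
    url: str
    start: Optional[int] = None  # seconds; may instead be encoded in the URL
    end: Optional[int] = None
    duration: str = "full"       # full, short, fragment
    quality: str = "good"        # best, good, acceptable, poor


@dataclass
class Performance:
    song_title: str
    event: str
    date: str                    # may be only a year or a year-month
    type: str                    # Music-Video, Track, Recording, Live, ...
    position: Optional[int] = None
    venue: Optional[str] = None
    comment: str = ""
    segments: List[VideoSegment] = field(default_factory=list)

    def key(self) -> Tuple[str, str, str, str, Optional[int]]:
        """Identify the entry by the combination of its linking fields."""
        return (self.song_title, self.event, self.date, self.type, self.position)


# Two rehearsals of the same (made-up) song at the same event differ by position.
first = Performance("Example Song", "Example Concert", "2020-01", "Rehearsal", position=1)
second = Performance("Example Song", "Example Concert", "2020-01", "Rehearsal", position=2)
first.segments.append(VideoSegment("https://example.org/v1", start=30, end=95))
```

Including the position in the key is what keeps repeated rehearsals of the same title at one event distinguishable.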

Identification of Songs

Problem

See also Talk:Songs#Some conventions

In general, different songs can share the same title. If that is the case, there is usually a disambiguation given in brackets, like the name of the original artist, of the author, etc.

This is handled differently for these situations:

  1. In this Wiki, for the linkage to the song pages and their names, a disambiguation is only used if Angelina actually performed different songs of the same title.
  2. In the Songs listing, a disambiguation is given when more than one song with the title exists. As part of the page title, it cannot be formally distinguished from song titles that use brackets in their title.
  3. Wikipedia sometimes uses a disambiguation even though it is not needed. In the Songs listing, Wikipedia's disambiguation is then given as a hidden comment.

Current Solution

Use the titles according to 1., which will make their use easiest and most consistent with the rest of the wiki. The question is then whether it is required to offer the core title and its disambiguation separately for display purposes.

If we decide for reasons of simplicity that it is not required, we could have these three fields:

  • Title (solely used for technical identification)
  • Disambiguation (used for display)
  • WPext (just the disambiguation to form WP links)

For display purposes, we could then print

Title (Disambiguation),

whereas Wikipedia links would be formed as

{{w|Title (WPext)}}

If no WPext is given, the Disambiguation would be used in its place.

The mistake this makes is that, e.g.,

All of Me (jazz standard)

will be misprinted as

All of Me (jazz standard)

On the other hand, this has the added benefit of making it clear what is needed to identify it in this Wiki.
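The display and link rules of this section can be sketched as two small functions. The function names are hypothetical; the only rules taken from the text are "Title (Disambiguation)" for display, `{{w|Title (WPext)}}` for Wikipedia links, and the fallback to the Disambiguation when no WPext is given.

```python
def display_title(title, disambiguation=None):
    """Return the printed form: 'Title (Disambiguation)' or just 'Title'."""
    if disambiguation:
        return f"{title} ({disambiguation})"
    return title


def wikipedia_link(title, disambiguation=None, wpext=None):
    """Return the {{w|...}} template call; WPext wins over Disambiguation."""
    suffix = wpext or disambiguation
    target = f"{title} ({suffix})" if suffix else title
    return "{{w|" + target + "}}"


# Title without local disambiguation, but with a Wikipedia-only extension.
shown = display_title("All of Me")
linked = wikipedia_link("All of Me", wpext="jazz standard")
```

With only a WPext set, the title is printed bare while the Wikipedia link still carries the extension.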

Quality Indicators

Present some sort of "quality" indication, possibly covering (either separately or maybe integrated):

  • performance type (studio, live, rehearsal, sing-along, lip-sync),
  • recording length (full, short, fragment), and
  • audio quality (best, good, acceptable, poor)

These could possibly be integrated into a common 1-10 scale, calculated e.g. like this:

  • Base by Type: "studio"=8, "live"=8, "rehearsal"=6, "sing-along"=2, "lip-sync"=0
  • Modify by Length: "full" +/-0, "short" -1, "fragment" -4
  • Modify by Quality: "best" +2, "good" +/-0, "acceptable" -2, "poor" -4

Limit result to 1-10.
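The calculation could be sketched as follows. The lookup values are the ones proposed above; the function and constant names are made up for this example.

```python
BASE_BY_TYPE = {"studio": 8, "live": 8, "rehearsal": 6, "sing-along": 2, "lip-sync": 0}
LENGTH_MODIFIER = {"full": 0, "short": -1, "fragment": -4}
QUALITY_MODIFIER = {"best": 2, "good": 0, "acceptable": -2, "poor": -4}


def quality_score(performance_type, length, quality):
    """Combine type, length, and audio quality into one score, limited to 1-10."""
    raw = (
        BASE_BY_TYPE[performance_type]
        + LENGTH_MODIFIER[length]
        + QUALITY_MODIFIER[quality]
    )
    return max(1, min(10, raw))


best_case = quality_score("studio", "full", "best")        # 8 + 0 + 2
typical_live = quality_score("live", "short", "good")      # 8 - 1 + 0
worst_case = quality_score("lip-sync", "fragment", "poor")  # 0 - 4 - 4, limited to 1
```

Because of the limit at the lower end, quite different poor recordings all collapse to a score of 1, which may or may not be desired.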

Use of Database

The above information can be linked to e.g. display table listings for Channels, Events, Songs, etc.

Ideally there should be a common generic list generator, that can be configured for data (columns) displayed, and with arbitrary filters for each data type (column contents).
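A generic list generator of this kind might, in its simplest form, look like the sketch below. It assumes rows are plain dictionaries and filters are per-column predicate functions; the function name, field names, and sample data are all hypothetical.

```python
def generate_list(rows, columns, filters=None):
    """Return the selected columns of every row that passes all column filters."""
    filters = filters or {}
    selected = []
    for row in rows:
        if all(predicate(row.get(column)) for column, predicate in filters.items()):
            selected.append([row.get(column, "") for column in columns])
    return selected


# Made-up sample data.
performances = [
    {"song": "Example Song A", "event": "Example Concert", "type": "Live", "year": 2019},
    {"song": "Example Song B", "event": "Example Session", "type": "Rehearsal", "year": 2020},
    {"song": "Example Song C", "event": "Example Concert", "type": "Live", "year": 2021},
]

# Show song and year for live performances from 2020 onwards.
listing = generate_list(
    performances,
    columns=["song", "year"],
    filters={"type": lambda v: v == "Live", "year": lambda v: v >= 2020},
)
```

Passing the filters as functions keeps the generator independent of the data type of each column.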

There could be templates to generate references (e.g. for footnotes) simply by providing the URL. (note {{cite by url}}, which will not scale)

Cargo Implementation

Moving the data from the big JSON files into the songs' articles via the Cargo extension is being looked into. One reason is that MediaWiki is inefficient with regard to storage when handling changes to big files. But Cargo also offers additional features, like the possibility to browse the data without predefined queries and with dynamic filters.

Database tables

The basic structure described above can be retained.

Database storage

The entries (rows) of several tables can be defined in the article page of the related song. It will contain several Cargo templates for the Song, the song Performances, and the Videos for each Performance. The song's Cargo template might be integrated into an existing song template, like the {{Infobox song}}. The songs listing in the Songs article could then be generated using a Cargo query.

Events, Contexts, and Categories

As discussed in Data_talk:Performances.json#Splitting_event_field, there is the plan to have a hierarchical context description instead of event entries in the database. This has already been implemented in some form in the current User/Most2dot0/Performances-devel prototype. The feedback on this prototype pointed in the direction that verbatim links to events should rather point to wiki articles about them, as opposed to the event summary in the database.

Transition

The distributed storage to be achieved with Cargo will likely be more difficult to maintain in regards to bulk changes to data. Unlike with the JSON pages, it's not possible to do bulk modifications with simple regex search and replace calls. Instead, it would likely require scripting or use of robots. For this reason, the database structure and most of the existing data should be in the desired final form before making the transition.

To build the Cargo data easily, the preload feature could be used to add a preloaded section. Either the normal or a special version of the JSON-based Performances listings could preload a template that obtains the required data from the JSON tables and then replaces itself with Cargo template calls for the data.
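As an alternative illustration of the conversion step, done off-wiki by a script rather than via preload, the sketch below turns JSON entries into template calls that could be pasted into a song article. The template name `Performance`, the parameter names, and the sample JSON are assumptions; the real Cargo templates may look different.

```python
import json


def to_template_call(template, params):
    """Format one template call, skipping empty parameters."""
    lines = ["{{" + template]
    for name, value in params.items():
        if value not in (None, ""):
            # Note: values containing "|" would need escaping in real wikitext.
            lines.append(f"|{name}={value}")
    lines.append("}}")
    return "\n".join(lines)


# Hypothetical extract of the JSON data for one song.
json_text = """
[
  {"song": "Example Song", "event": "Example Concert", "type": "Live", "venue": ""},
  {"song": "Example Song", "event": "Example Session", "type": "Rehearsal", "venue": "Studio X"}
]
"""

entries = json.loads(json_text)
wikitext = "\n".join(to_template_call("Performance", entry) for entry in entries)
```

The resulting text contains one template call per performance, which is the form in which Cargo would pick up the rows when the article is saved.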

Once the Cargo database is complete, the JSON data could be reconstructed from it, and it should be possible to keep most of the existing presentation code if still desired or for a transitional phase.

But the long-term goal is likely to replace the Lua/JSON-based data presentations with Cargo queries and Cargo browsing via the drill-down interface.

Embedded YouTube Player

The player already features easy configuration and arbitrary start/end times for videos in a playlist. The following improvements are being considered:

Further Improvements

Also Play FB, IG, TT Videos

This has been considered in the past, but is postponed, as the other social media don't offer an API comparable to YouTube's, so many features would be missing.

Repeat Options

Enable no autoplay, repeat single clip, complete playlist once, complete playlist loop (this is the current fixed mode).

Consider if at the end of a playlist, the player could start the next one on the page (might be possible through the numbered IDs each player gets assigned when being substituted).
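The four repeat options can be expressed as a small decision function that says which clip to play after the current one ends. This is only a language-neutral sketch of the logic, written in Python; the names are hypothetical and it does not use the actual player code or the YouTube API.

```python
from enum import Enum
from typing import Optional


class RepeatMode(Enum):
    NONE = "no autoplay"
    SINGLE = "repeat single clip"
    PLAYLIST_ONCE = "complete playlist once"
    PLAYLIST_LOOP = "complete playlist loop"


def next_index(mode: RepeatMode, current: int, playlist_length: int) -> Optional[int]:
    """Return the index of the clip to play next, or None to stop."""
    if mode is RepeatMode.NONE:
        return None
    if mode is RepeatMode.SINGLE:
        return current
    following = current + 1
    if following < playlist_length:
        return following
    # End of playlist reached: wrap around only in loop mode.
    return 0 if mode is RepeatMode.PLAYLIST_LOOP else None


after_last_loop = next_index(RepeatMode.PLAYLIST_LOOP, 2, 3)
after_last_once = next_index(RepeatMode.PLAYLIST_ONCE, 2, 3)
```

The point where the function returns None in "complete playlist once" mode is where a hand-over to the next player on the page could be hooked in.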

Player Indication at URL

Consider indicating, e.g. via color coding / added icon / changed icon, that a URL will start the player instead of following the link to the source.