Tag System
Idea behind this project
Goal
Manage a database of tags, which are metadata that can be associated with anything.
I define a tag as a piece of data associated with another piece of data (that is, metadata), composed of a name, a value, and any number of relationships with other tags.
Here, a relationship is any kind of link between exactly two different tags. A relationship can be directional.
Let's say I want to find a given tool in my toolbox. Several contexts can lead me to search for a tool. Is it because I want to do a given job? Do I already know the tool but not where it is? Do I want to check the quality of the tools? Does the job I want to do limit which materials the tool can be made of? Am I looking for a tool in my toolbox that belongs to another person?
In this example, you can associate multiple tags with a job, a person, or a tool in order to search the toolbox effectively.
Tag structure
A tag contains a unique identifier (called its name), a value (Quantity, string, or boolean) and a set of relationships.
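To make the structure concrete, here is a minimal sketch of how such a tag could be modeled in PHP 8.1 or later. The class name `Tag`, its methods, the relationship kind strings, and the choice of `int|float` for a quantity are all assumptions made for illustration; they are not taken from the code in src/plugins/TagSystem.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: class, method, and kind names are assumptions,
// not the classes shipped in this repo.
final class Tag
{
    /** @var array<string, string[]> relationship kind => names of related tags */
    private array $relationships = [];

    public function __construct(
        public readonly string $name,                        // unique identifier
        public readonly int|float|string|bool $value = true  // quantity, string, or boolean
    ) {
    }

    public function relateTo(string $kind, Tag $other): void
    {
        // A relationship always links two different tags.
        if ($other->name === $this->name) {
            throw new InvalidArgumentException('A relationship links two different tags.');
        }
        $this->relationships[$kind][] = $other->name;
    }

    /** @return string[] names of the tags related to this one by $kind */
    public function related(string $kind): array
    {
        return $this->relationships[$kind] ?? [];
    }
}

$dog = new Tag('dog');
$mammal = new Tag('mammal');
$dog->relateTo('subset_of', $mammal);
```

Storing related tags by name rather than by object reference keeps the sketch close to what a database row would hold.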
Relationships
A tag A can be a subset of another tag B, which means the tag B is
a set containing the tag A.
A tag A can be the negation of a tag B (the relationship is symmetric, so B is then also the negation of A).
A tag A can be equivalent to a tag B (this relationship is symmetric as well).
Together, these relationships define a Boolean algebra of sets over tags, so any Boolean operation can be applied between two tags.
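The difference between the directional relationship (subset) and the two symmetric ones (negation, equivalence) can be sketched as follows. The `Relationships` class and its constants are hypothetical names chosen for this example, not the repo's API.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: names are illustrative assumptions.
final class Relationships
{
    public const SUBSET_OF   = 'subset_of';   // directional: A is a subset of B
    public const NEGATION_OF = 'negation_of'; // symmetric
    public const EQUIVALENT  = 'equivalent';  // symmetric

    /** @var array<string, array<string, array<string, true>>> kind => from => to => true */
    private array $edges = [];

    public function add(string $kind, string $a, string $b): void
    {
        if ($a === $b) {
            throw new InvalidArgumentException('A relationship links two different tags.');
        }
        $this->edges[$kind][$a][$b] = true;
        if ($kind !== self::SUBSET_OF) {
            // Negation and equivalence hold in both directions.
            $this->edges[$kind][$b][$a] = true;
        }
    }

    public function has(string $kind, string $a, string $b): bool
    {
        return isset($this->edges[$kind][$a][$b]);
    }
}

$r = new Relationships();
$r->add(Relationships::SUBSET_OF, 'dog', 'mammal');
$r->add(Relationships::NEGATION_OF, 'dog', 'cat');
```

Here `$r->has('subset_of', 'mammal', 'dog')` is false because a subset link only goes one way, while `$r->has('negation_of', 'cat', 'dog')` is true.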
Corruption
This relationship system can create corruption issues.
Corruption with negation and equivalency
If A is the opposite of B (so B is the opposite of A), A is
equivalent to C (so C is equivalent to A), and B is also
equivalent to C, then the database is corrupted. Indeed, there is no
good answer to the question "is A equivalent to B?".
A = NOT B AND A = C = B
A real-life example could be a tag dog marked as the opposite of a tag
cat, and a tag cute marked as equivalent to both the tag cat and
the tag dog, which would be a mistake.
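One possible way to detect this pattern is to group tags into equivalence classes and then check that no negation pair ends up inside a single class. The sketch below uses a simple union-find for the grouping. It assumes equivalence is transitive, as the A = C = B formula suggests, and the function name `findNegationConflicts` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of this repo: returns the negation pairs
 * whose two tags are also (transitively) equivalent.
 *
 * @param array<int, array{string, string}> $equivalences
 * @param array<int, array{string, string}> $negations
 * @return array<int, array{string, string}>
 */
function findNegationConflicts(array $equivalences, array $negations): array
{
    $parent = [];

    // Union-find "find": follow parent links up to the class representative.
    $find = function (string $x) use (&$parent): string {
        $parent[$x] ??= $x;
        while ($parent[$x] !== $x) {
            $x = $parent[$x];
        }
        return $x;
    };

    // Merge the classes of every equivalent pair.
    foreach ($equivalences as [$a, $b]) {
        $rootA = $find($a);
        $rootB = $find($b);
        $parent[$rootA] = $rootB;
    }

    // A negation inside a single equivalence class is a contradiction.
    $conflicts = [];
    foreach ($negations as [$a, $b]) {
        if ($find($a) === $find($b)) {
            $conflicts[] = [$a, $b];
        }
    }
    return $conflicts;
}

$conflicts = findNegationConflicts(
    [['cute', 'cat'], ['cute', 'dog']], // cute = cat, cute = dog
    [['dog', 'cat']]                    // dog = NOT cat
);
```

With the dog, cat, and cute example, `$conflicts` contains the single pair `['dog', 'cat']`. This check only covers the pattern described here; other contradictions (for instance, three tags that are pairwise negations of each other) would need additional rules.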
To avoid this kind of corruption, one should use negation in tag
composition rather than storing it. For searching, for example, you can
define a not dog search that creates a composite tag that only lives
in memory. Still, this kind of relationship should be allowed in the
database, depending on the usage. The same goes for equivalency.
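As an illustration of a composite tag that only lives in memory, a search can be expressed as a predicate over the tag names attached to an item. This is a sketch under assumptions: the helpers `hasTag`, `notTag`, and `allOf`, and the sample items, are invented for this example.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: a composite tag is modeled as a predicate over the
// list of tag names attached to an item. Nothing here is stored.
function hasTag(string $tag): Closure
{
    return fn(array $tags): bool => in_array($tag, $tags, true);
}

function notTag(Closure $predicate): Closure
{
    return fn(array $tags): bool => !$predicate($tags);
}

function allOf(Closure ...$predicates): Closure
{
    return function (array $tags) use ($predicates): bool {
        foreach ($predicates as $predicate) {
            if (!$predicate($tags)) {
                return false;
            }
        }
        return true;
    };
}

// Sample data: item name => tag names attached to it.
$items = [
    'rex'   => ['dog', 'cute'],
    'tom'   => ['cat', 'cute'],
    'shark' => ['fish'],
];

// "cute AND NOT dog", built at search time and never persisted.
$cuteNotDog = allOf(hasTag('cute'), notTag(hasTag('dog')));
$matches = array_keys(array_filter($items, $cuteNotDog));
```

Here `$matches` is `['tom']`. Because the negation exists only inside the search predicate, no negation relationship has to be written to the database.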
Corruption with set and subset
If A is a subset of B (so B is a set containing A), A is a
set containing C, and C is not a subset of B, then the database is
corrupted. Indeed, there is no good answer to the question "is C in
B?".
C ⊆ A ⊆ B AND C ⊈ B.
A real-life example could be a tag dog being a subset of a tag
mammal, and a tag cute being a subset of the tag dog without
being a subset of the tag mammal. You can avoid this by taking great
care with your tagging taxonomy. For example, the tag cute should not
be a subset of dog, but a set that contains dog.
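One way to surface this situation is to walk the declared subset links and list every pair that transitivity implies but that was never declared, so each one can be reviewed. This is only one possible reading of the check; the function name `impliedSubsets` and the input shape are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of this repo: lists subset pairs that are
 * implied by transitivity but were never declared directly.
 *
 * @param array<string, string[]> $subsetOf tag => tags declared as directly containing it
 * @return array<int, array{string, string}> pairs [subset, implied superset]
 */
function impliedSubsets(array $subsetOf): array
{
    $implied = [];
    foreach (array_keys($subsetOf) as $start) {
        $seen = [];
        $stack = $subsetOf[$start];
        // Depth-first walk over every set reachable from $start.
        while ($stack !== []) {
            $current = array_pop($stack);
            if (isset($seen[$current]) || $current === $start) {
                continue; // already visited, or a cycle back to the start
            }
            $seen[$current] = true;
            if (!in_array($current, $subsetOf[$start], true)) {
                $implied[] = [$start, $current];
            }
            foreach ($subsetOf[$current] ?? [] as $next) {
                $stack[] = $next;
            }
        }
    }
    return $implied;
}

$implied = impliedSubsets([
    'cute' => ['dog'],
    'dog'  => ['mammal'],
]);
```

For the example above, `$implied` is `[['cute', 'mammal']]`: declaring cute as a subset of dog silently makes it a subset of mammal, which is exactly the kind of consequence worth confirming before it reaches the database.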
Limits
The possibility of corruption means that any system implementing tags needs a near-perfect user experience (UX).
This definition of a tag greatly increases the complexity of using tags. For a tagging system to work with this definition, the taxonomy must be planned before tags are implemented, keeping in mind that no taxonomy is perfect for everything. Indeed, reality is made of analog data, which cannot be classified exactly. This follows from the definition of a tag: a characterization of a discrete, limited, and well-defined set of parameters.
A tagging system is better than a hierarchical system, as it can emulate one perfectly. However, it is not the ultimate answer to every data management issue.
TL;DR: I know it's bad, but at least, it exists.
Getting started
This repo contains a PHP implementation for storing this tag system in any supported database.
Interaction with the data can be implemented later; it is not a priority while there is no way to store the data yet.