Wikidata:WikiProject DH Tool Registry/Data Model

For the purposes of our tool registry, we distinguish between a number of related concepts. The following briefly summarises how we think about tools:

  1. research tools comprise both methods and concrete software.
  2. methods are informed by theories and have a purpose.
  3. methods are implemented through (multiple) layers of software. Software, in turn, requires hardware and infrastructural resources such as electricity, internet connectivity, or licences, and it interacts with data formats and serialisations (reading and writing).
  4. software is written in programming languages and can be interacted with through interfaces. Command line interfaces (CLI), including application programming interfaces, require programming languages to interact with them.
  5. methods, languages, and formats rely on and implement abstract concepts.

We have mapped the relations between the core concepts in this very basic ontology and linked them to specific Wikidata items.

Basic conceptual data model for the DH tool registry
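The relations summarised in the numbered list above can be written down as simple subject, relation, object triples. The following is only a minimal sketch: the relation labels are our own paraphrases for illustration, not Wikidata property names, and the helper `related` is hypothetical.

```python
from collections import defaultdict

# Triples paraphrased from the numbered summary of core concepts.
# The relation labels are illustrative and do not correspond to
# Wikidata properties.
RELATIONS = [
    ("research tool", "comprises", "method"),
    ("research tool", "comprises", "software"),
    ("method", "informed by", "theory"),
    ("method", "has", "purpose"),
    ("method", "implemented through", "software"),
    ("software", "requires", "hardware"),
    ("software", "requires", "infrastructural resource"),
    ("software", "reads and writes", "data format"),
    ("software", "written in", "programming language"),
    ("software", "interacted with through", "interface"),
    ("method", "relies on", "abstract concept"),
    ("programming language", "relies on", "abstract concept"),
    ("data format", "relies on", "abstract concept"),
]


def related(subject):
    """Group everything linked from `subject` by relation label."""
    grouped = defaultdict(list)
    for source, relation, target in RELATIONS:
        if source == subject:
            grouped[relation].append(target)
    return dict(grouped)


if __name__ == "__main__":
    for relation, targets in related("software").items():
        print(f"software {relation}: {', '.join(targets)}")
```

Keeping the model as a flat list of triples mirrors how statements are expressed in Wikidata and makes it easy to add or remove relations as the ontology evolves.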


specific data models

software

The data model for describing software has a minimal core of required information, which can be extended through optional components.

Many properties will suggest providing a source for the information in the form of a URL reference. This can simply be the official website or GitHub repository of a software project.
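As a rough illustration of a statement that carries such a URL reference, consider the sketch below. The `Statement` class, its field names, and the example repository URL are hypothetical and only model the idea; they are not part of any Wikidata API.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class Statement:
    """A simplified statement: property label, value, optional source URL."""
    prop: str
    value: str
    reference_url: Optional[str] = None

    def has_usable_reference(self) -> bool:
        # A usable reference is an absolute http(s) URL with a host name.
        if not self.reference_url:
            return False
        parts = urlparse(self.reference_url)
        return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    # Hypothetical software item described with two statements.
    statements = [
        Statement("programmed in", "Python",
                  "https://github.com/example/example-tool"),
        Statement("licence", "MIT"),
    ]
    for s in statements:
        status = "sourced" if s.has_usable_reference() else "needs a reference"
        print(f"{s.prop}: {s.value} ({status})")
```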

minimal required model

optional components

methods

Classification, taxonomies, and ontologies

We strive to classify all methods, and the software implementing them, according to field-specific taxonomies and ontologies. Within the digital humanities, the Taxonomy of Digital Research Activities in the Humanities (TaDiRAH) has gained some traction and is now used for classifying conference papers, posters, and workshops at the annual ADHO and DHd conferences. Its most current iteration is version 2.x.x, which limited coverage to 168 methods and saw a move towards a proper SKOS vocabulary, whose triple store and SPARQL endpoint are hosted by DARIAH. Online documentation unfortunately lags severely behind development, and the best documentation of v2 can be found in (Borek et al. 2021).

Wikidata does not provide a means for directly applying external taxonomies, but methods can be mapped to their equivalent in TaDiRAH through the TaDiRAH ID property. TaDiRAH has already mapped many of its concepts to Wikidata, and we have replicated this mapping in the other direction, from Wikidata to TaDiRAH, in all these cases (85 items). This leaves 83 of the 168 concepts which have not yet been mapped.
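To inspect which items already carry such a mapping, one could query the Wikidata Query Service for all items that have a value for the external-ID property. The sketch below only builds the SPARQL query string. The function name is hypothetical, and the property ID used in the example is a placeholder: the actual ID of the TaDiRAH ID property has to be looked up on Wikidata.

```python
import re


def build_mapping_query(property_id: str) -> str:
    """Build a SPARQL query listing items that have a value for `property_id`."""
    # Guard against accidentally passing a label instead of an ID.
    if not re.fullmatch(r"P[1-9][0-9]*", property_id):
        raise ValueError(f"not a Wikidata property ID: {property_id!r}")
    return (
        "SELECT ?item ?itemLabel ?externalId WHERE {\n"
        f"  ?item wdt:{property_id} ?externalId .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


if __name__ == "__main__":
    # Placeholder only: replace with the real ID of the TaDiRAH ID property.
    TADIRAH_ID_PROPERTY = "P123"
    print(build_mapping_query(TADIRAH_ID_PROPERTY))
```

The resulting query can be pasted into the Wikidata Query Service. Counting its results gives the number of mapped items, which can then be compared with the 168 concepts in TaDiRAH.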

minimal required model

  • instance of: this is the core property of Wikidata. Each item must be an instance (think of it as a manifestation) of something.
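A check against this minimal model could look like the following sketch. It assumes a simplified representation of an item's claims as a dictionary from property ID to a list of values, which is not the full Wikidata JSON structure. The function name and the item ID in the example are placeholders.

```python
# Required properties of the minimal model for methods.
# P31 is the Wikidata property "instance of".
REQUIRED_PROPERTIES = {"P31": "instance of"}


def missing_required(claims):
    """Return the labels of required properties without any value."""
    return [
        label
        for property_id, label in REQUIRED_PROPERTIES.items()
        if not claims.get(property_id)
    ]


if __name__ == "__main__":
    # "Q_PLACEHOLDER" stands in for the item the method is an instance of.
    complete = {"P31": ["Q_PLACEHOLDER"]}
    incomplete = {}
    print(missing_required(complete))
    print(missing_required(incomplete))
```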

optional components

  • TaDiRAH ID: points to an equivalent entity in the Taxonomy of Digital Research Activities in the Humanities (TaDiRAH). As TaDiRAH cannot possibly be comprehensive and might not react flexibly to the development of new methods, this property is optional. We nevertheless strongly recommend checking whether TaDiRAH already provides an equivalent for a method to be added to Wikidata.
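One simple way to perform this check is to compare the label of the new method against a local copy of the vocabulary's labels, ignoring case and surplus whitespace. The sketch below assumes such a copy is available as a dictionary from concept ID to label. The function name, the concept IDs, and the labels shown are made up for illustration and are not taken from TaDiRAH.

```python
def normalise(label):
    """Collapse whitespace and ignore case for comparison."""
    return " ".join(label.split()).casefold()


def find_equivalent(method_label, vocabulary):
    """Return the ID of the concept whose label matches, or None."""
    wanted = normalise(method_label)
    for concept_id, label in vocabulary.items():
        if normalise(label) == wanted:
            return concept_id
    return None


if __name__ == "__main__":
    # Made-up sample entries standing in for a real vocabulary export.
    sample_vocabulary = {
        "exampleConceptA": "Annotating",
        "exampleConceptB": "Topic Modeling",
    }
    print(find_equivalent("  topic   modeling ", sample_vocabulary))
    print(find_equivalent("Stylometry", sample_vocabulary))
```

An exact label match is deliberately conservative: a result of `None` does not prove that no equivalent exists, so a manual look at the taxonomy remains advisable.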

scholarly literature

… COMING SOON …

Referenced works

Borek, Luise, Canan Hastik, Vera Khramova, Klaus Illmayer, and Jonathan D. Geiger. 2021. “Information Organization and Access in Digital Humanities: TaDiRAH Revised, Formalized and FAIR.” In Information Between Data and Knowledge, 321–32. Schriften zur Informationswissenschaft 74. Glückstadt: Werner Hülsbusch. https://doi.org/10.5283/epub.44951.