Wikidata:WikiProject DH Tool Registry/Data Model
For the sake of our tool registry, we differentiate between a number of related concepts. The following is a brief summary of how we think about tools:
- research tools comprise both methods and concrete software
- methods are informed by theories and have a purpose.
- methods are implemented through (multiple) layers of software, which, in turn, requires hardware and infrastructural resources such as electricity, internet connectivity or licences and which interacts with data formats and serialisations (reading and writing).
- software is written in programming languages and can be interacted with through interfaces. Command line interfaces (CLI), including application programming interfaces, require programming languages to interact with them.
- methods, languages, and formats rely on and implement abstract concepts
We have mapped the relations between the core concepts in this very basic ontology and linked them to specific Wikidata items.
specific data models
software
The data model for describing software has a minimal core of required information, which can be extended through the optional components.
Many properties will suggest that one provides a source for the information by means of a URL reference. This can readily be the official website or the GitHub repository of a software project.
minimal required model
- label: a simple string. While not strictly required, some maintenance bots will question the necessity of an item if no label is provided.
- name: as a regular Wikidata property, we prefer name over label, as it can be queried more easily.
- instance of: this is the core property of Wikidata. Each item must be an instance (think of it as a manifestation) of something.
  - Suggested values should all be subclasses of software category
- (programmed in): this property is necessary if an item is an instance of software library
- official website
- copyright license: setting this property makes it necessary to also set
  - copyright status: most software will be copyrighted
- has use: this property allows mapping software to Wikidata items describing methods, e.g. Gephi has use network analysis and data visualization, which in turn is linked to the Taxonomy of Digital Research Activities in the Humanities (TaDiRAH) through the property TaDiRAH ID
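The dependencies between the properties of the minimal model can be sketched as a small check. This is only an illustration under stated assumptions: the item is represented as a plain dictionary keyed by English property labels rather than Wikidata property IDs, the function name `missing_properties` is hypothetical, and the example values are placeholders rather than data taken from Wikidata.

```python
# Properties of the minimal required model, keyed by their English labels.
# (Assumption: the label is left out here, as it is not strictly required.)
REQUIRED = ("name", "instance of", "official website", "copyright license", "has use")


def missing_properties(item: dict[str, list[str]]) -> list[str]:
    """Return the labels of properties that the minimal model still expects."""
    missing = [prop for prop in REQUIRED if not item.get(prop)]

    # "programmed in" becomes necessary for instances of "software library".
    if "software library" in item.get("instance of", []) and not item.get("programmed in"):
        missing.append("programmed in")

    # Setting "copyright license" makes "copyright status" necessary as well.
    if item.get("copyright license") and not item.get("copyright status"):
        missing.append("copyright status")

    return missing


# Placeholder item for illustration only.
example_item = {
    "name": ["ExampleLib"],
    "instance of": ["software library"],
    "official website": ["https://example.org"],
    "copyright license": ["MIT License"],
    "has use": ["network analysis"],
}

print(missing_properties(example_item))
```

For the placeholder item, the check reports the two conditional properties, programmed in and copyright status, as still missing.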
optional components
- logo image: URI of an image file representing the official logo of a software
- source code repository URL: setting this property requires two additional qualifying properties
  - version control system: e.g. “Git”
  - web interface software: e.g. “GitHub”
- operating system
  - Common values
- programmed in: e.g. Java
- business model: e.g. “freemium”, “subscription”
- reads file format: e.g. “JSON”
- writes file format: e.g. “RDA”
- software version identifier
- described at URL: link to other tool registries or tutorials
  - tool registries
    - TaPOR 3: URLs follow the pattern https://tapor.ca/tools/{ID}, e.g. https://tapor.ca/tools/171 for Gephi.
    - SSH Open Marketplace: URLs follow the pattern https://marketplace.sshopencloud.eu/tool-or-service/{ID}, e.g. https://marketplace.sshopencloud.eu/tool-or-service/87wJWo for Gephi.
- used by: pointing to items for research papers, software packages, or project websites that make use of a specific tool. One can, for instance, as we did, query the full corpus of Digital Humanities Quarterly, add bibliographical information for papers to Wikidata, and then point to the Wikidata item of each paper, e.g. Daniel Burckhardt, “Comparing Disciplinary Patterns: Exploring the Humanities through the Lens of Scholarly Communication”, makes use of Gephi.
- short name: property for acronyms such as “XML” for “eXtensible Markup Language”. This property is necessary for querying for acronyms, as the label does not allow for specifically designating full names or acronyms.
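The URL patterns given for the tool registries under described at URL can be expanded mechanically. The following is a minimal sketch; the dictionary and the function name `registry_url` are hypothetical helpers introduced for illustration, while the two patterns and the Gephi identifiers are the ones listed above.

```python
# URL patterns of the tool registries listed under "described at URL".
REGISTRY_URL_PATTERNS = {
    "TaPOR 3": "https://tapor.ca/tools/{ID}",
    "SSH Open Marketplace": "https://marketplace.sshopencloud.eu/tool-or-service/{ID}",
}


def registry_url(registry: str, tool_id: str) -> str:
    """Fill the registry's URL pattern with the given tool identifier."""
    return REGISTRY_URL_PATTERNS[registry].format(ID=tool_id)


# The Gephi identifiers quoted in the list above.
print(registry_url("TaPOR 3", "171"))
print(registry_url("SSH Open Marketplace", "87wJWo"))
```

Keeping the patterns in one place means a change on the side of a registry only has to be reflected once before values for described at URL are regenerated.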
methods
Classification, taxonomies, and ontologies
We strive to classify all methods, and the software implementing them, according to field-specific taxonomies and ontologies. Within the digital humanities, the Taxonomy of Digital Research Activities in the Humanities (TaDiRAH) has gained some traction and is now adopted for classifying conference papers, posters, and workshops at the annual ADHO and DHd conferences. Its most current iteration is version 2.x.x, which limited coverage to 168 methods and saw a move towards a proper SKOS vocabulary, whose triple store and SPARQL endpoint are hosted by DARIAH. Online documentation unfortunately lags severely behind development, and the best documentation of v2 can be found in (Borek et al. 2021).
Wikidata does not provide a means for directly applying external taxonomies, but methods can be mapped to their equivalent in TaDiRAH through the TaDiRAH ID property. TaDiRAH has already mapped many of its concepts to Wikidata, and we have replicated this mapping from Wikidata to TaDiRAH in all these cases (85 items). This leaves 83 concepts which have not yet been mapped.
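The state of this mapping can be inspected with a SPARQL query against Wikidata. The sketch below only builds the query text and does not send it anywhere. The function name `tadirah_mapping_query` is hypothetical, and the default property ID `P9309` for TaDiRAH ID is an assumption that should be verified on the property page before use. The `wdt:`, `wikibase:`, and `bd:` prefixes are the ones predefined by the Wikidata Query Service.

```python
def tadirah_mapping_query(tadirah_property: str = "P9309", limit: int = 200) -> str:
    """Build a SPARQL query listing Wikidata items that carry a TaDiRAH ID.

    Assumption: "P9309" is the property ID of "TaDiRAH ID"; pass another
    value if the property page says otherwise.
    """
    return f"""
SELECT ?item ?itemLabel ?tadirahId WHERE {{
  ?item wdt:{tadirah_property} ?tadirahId .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


print(tadirah_mapping_query())
```

The resulting text can be pasted into the Wikidata Query Service; comparing the returned identifiers with the full list of TaDiRAH concepts shows which concepts remain unmapped.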
minimal required model
- instance of: this is the core property of Wikidata. Each item must be an instance (think of it as a manifestation) of something.
  - Suggested values: method or its subclasses
optional components
- TaDiRAH ID: pointing to an equivalent entity in the Taxonomy of Digital Research Activities in the Humanities (TaDiRAH). As TaDiRAH cannot possibly be comprehensive and might not react flexibly to the development of new methods, we can only strongly recommend checking whether TaDiRAH already provides an equivalent for a method to be added to Wikidata.
scholarly literature
… COMING SOON …
Referenced works
Borek, Luise, Canan Hastik, Vera Khramova, Klaus Illmayer, and Jonathan D. Geiger. 2021. “Information Organization and Access in Digital Humanities: TaDiRAH Revised, Formalized and FAIR.” In Information Between Data and Knowledge, 321–32. Schriften zur Informationswissenschaft 74. Glückstadt: Werner Hülsbusch. https://doi.org/10.5283/epub.44951.