entity_management.core

Provenance entities

Inheritance diagram of entity_management.core

Classes

Activity(*[, name, status, used, generated, ...])

Base class for provenance activity.

Agent()

Agent.

Contribution(*, agent)

DataDownload(*[, name, license, contentUrl, ...])

External resource representation; this can be a file or a folder on gpfs.

Entity(*[, name, description, ...])

Enables provenance metadata when publishing/deprecating entities.

ModelRuntimeParameters(*[, name, ...])

Model runtime parameters.

MultiDistributionEntity(*[, name, ...])

Entity with one or more distributions.

Person(*, email[, name, givenName, familyName])

Person.

SoftwareAgent(*, version)

Software agent.

WorkflowExecution(*[, status, used, ...])

Represents activity of a workflow execution.

class entity_management.core.Activity(*, name=None, status=None, used=None, generated=None, startedAtTime=None, endedAtTime=None, wasStartedBy=None, wasInformedBy=None, wasInfluencedBy=None, wasAssociatedWith=None)

Bases: Identifiable

Base class for provenance activity.

publish(*, resource_id=None, sync_index=False, base=None, org=None, proj=None, use_auth=None, include_rev=False, activity=None)

Create or update activity resource in nexus.

Parameters:
  • resource_id (str) – Resource identifier.

  • include_rev (bool) – Whether to include _rev in the linked entities or not.

  • activity (Activity) – Optional activity which provided information to the current activity.

Returns:

New instance of the same class with revision updated.
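Because publish returns a new instance instead of updating the object in place, callers should keep the returned value. The following is a minimal sketch of that pattern only: FakeActivity and its rev field are hypothetical stand-ins, not part of entity_management, and no call to nexus is made.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FakeActivity:
    """Hypothetical stand-in; NOT entity_management.core.Activity."""

    name: Optional[str] = None
    rev: Optional[int] = None  # assumed revision counter, for illustration

    def publish(self) -> "FakeActivity":
        # No remote call: this only models "return a new instance of the
        # same class with the revision updated".
        next_rev = 1 if self.rev is None else self.rev + 1
        return replace(self, rev=next_rev)


draft = FakeActivity(name="demo")
published = draft.publish()  # rebind: the original object is unchanged
```

With the real class the same habit applies: assign the result of publish to a variable and use that instance afterwards.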

class entity_management.core.Agent

Bases: Identifiable

Agent.

Parameters:

name (str) – Name of the agent.

class entity_management.core.Contribution(*, agent)

Bases: BlankNode

class entity_management.core.DataDownload(*, name=None, license=None, contentUrl=None, url=None, contentSize=None, digest=None, encodingFormat=None)

Bases: BlankNode

External resource representation; this can be a file or a folder on gpfs.

Parameters:
  • name (str) – The distribution name.

  • license (Identifiable) – A Link towards the distribution license.

  • contentUrl (str) – When followed this link leads to the actual data.

  • url (str) – When followed this link leads to a resource providing further description on how to download the attached data.

  • contentSize (int) – Size of the resource, if known in advance.

  • digest (int) – Hash/Checksum of the resource.

  • encodingFormat (str) – Type of the resource accessible by the contentUrl.

Either downloadURL for files or accessURL for folders must be provided.

as_dict(use_auth=None)

Get contentUrl as dict.

Parameters:

use_auth (str) – Optional OAuth token.

download(path=None, file_name=None, use_auth=None)

Download contentUrl.

Parameters:
  • path (str) – Optional path where to save the file. If not provided, the current folder will be used.

  • file_name (str) – Optional file name. If not provided, file name will be taken from the name stored in Nexus.

  • use_auth (str) – Optional OAuth token.
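The defaults described for path and file_name can be sketched as follows. This is an illustration of the documented fallback rules, not the library's implementation; resolve_destination and stored_name are hypothetical names, with stored_name standing for the name stored in Nexus.

```python
from pathlib import Path
from typing import Optional


def resolve_destination(
    stored_name: str,
    path: Optional[str] = None,
    file_name: Optional[str] = None,
) -> Path:
    """Hypothetical helper mirroring the documented defaults of download()."""
    # No path given: fall back to the current folder.
    folder = Path(path) if path is not None else Path.cwd()
    # No file name given: fall back to the name stored remotely.
    return folder / (file_name if file_name is not None else stored_name)
```

For example, resolve_destination("cells.h5") points into the current working directory, while passing both path and file_name overrides both defaults.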

classmethod from_file(file_like, name=None, resource_id=None, storage_id=None, content_type='application/octet-stream', base=None, org=None, proj=None, use_auth=None)

Create DataDownload object from file_like.

Parameters:
  • file_like (Union[str, Path, BytesIO]) – Path to the file or BytesIO buffer with data.

  • name (str) – Optional name for the DataDownload name property. If not provided, the file name with extension from file_like will be used.

  • resource_id (str) – Optional file resource identifier. If not provided, it will be generated by Nexus.

  • storage_id (str) – Optional identifier of the storage backend where the file will be stored. If not provided, the project’s default storage is used.

  • content_type (str) – File content type, for example text/plain. Default value: application/octet-stream.

  • use_auth (str) – Optional OAuth token.
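The name fallback for from_file can be pictured with the sketch below. It is not the library's code: default_name is a hypothetical helper, and raising an error for an unnamed BytesIO buffer is an assumption made here, since the documentation above does not say what happens in that case.

```python
from io import BytesIO
from pathlib import Path
from typing import Optional, Union


def default_name(
    file_like: Union[str, Path, BytesIO],
    name: Optional[str] = None,
) -> str:
    """Hypothetical helper: pick the name used for the DataDownload."""
    if name is not None:
        return name
    if isinstance(file_like, (str, Path)):
        # File name with extension, as described for the name parameter.
        return Path(file_like).name
    # Assumption: an in-memory buffer carries no file name of its own.
    raise ValueError("pass name explicitly for in-memory buffers")
```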

classmethod from_json_str(json_str, resource_id=None, name=None, base=None, org=None, proj=None, use_auth=None)

Create DataDownload object representing JSON from a dict serialized as a string.

Parameters:
  • json_str (str) – Dict serialized as json string.

  • resource_id (str) – Optional file resource identifier. If not provided, it will be generated by Nexus.

  • name (str) – Optional name for the DataDownload name property. If not provided, a uuid will be generated.

  • use_auth (str) – Optional OAuth token.
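A possible way to prepare the json_str argument, and to picture the documented uuid fallback for name, is sketched below. prepare_json_payload is a hypothetical helper and the use of uuid4 is an assumption; the documentation only states that a uuid is generated.

```python
import json
import uuid
from typing import Optional, Tuple


def prepare_json_payload(data: dict, name: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper: serialize a dict and resolve a default name."""
    # from_json_str expects the dict already serialized as a JSON string.
    json_str = json.dumps(data, sort_keys=True)
    # Assumption: a random (version 4) uuid stands in for the generated name.
    resolved_name = name if name is not None else str(uuid.uuid4())
    return json_str, resolved_name


json_str, name = prepare_json_payload({"dt": 0.025, "celsius": 34})
```

The resulting json_str is what would be passed as the first argument of from_json_str.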

get_id()

Retrieve _id property.

get_location(use_auth=None)

Get file location when applicable.

For files located on the gpfs storage backend, this will give direct file URI.

Parameters:

use_auth (str) – Optional OAuth token.

get_location_path(use_auth=None)

Get file path when applicable.

For files located on the gpfs storage backend, this will give direct filesystem path.

Parameters:

use_auth (str) – Optional OAuth token.
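The relation between a file URI, as returned by get_location, and a filesystem path, as returned by get_location_path, can be illustrated with the standard library. This is a sketch only: file_uri_to_path is a hypothetical helper, the URI in the usage line is made up, and the library may derive the path differently.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def file_uri_to_path(uri: str) -> PurePosixPath:
    """Hypothetical helper: turn a file:// URI into a POSIX path."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        # Only locally mounted files have a direct filesystem path.
        raise ValueError(f"not a file URI: {uri}")
    # Undo percent-encoding such as %20 for spaces.
    return PurePosixPath(unquote(parsed.path))


example = file_uri_to_path("file:///gpfs/example/project/data%20set.h5")
```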

get_url_as_path()

Get url path when applicable.

Move file to nexus managed folder in gpfs project and return created resource identifier.

Parameters:
  • file_path (str) – Path to the file.

  • name (str) – Optional name for the DataDownload name property. If not provided, the file name with extension from file_path will be used.

  • resource_id (str) – Optional file resource identifier. If not provided, it will be generated by Nexus.

  • storage_id (str) – Optional identifier of the storage backend where the file will be stored. If not provided, the project’s default storage is used.

  • content_type (str) – File content type, for example text/plain.

  • use_auth (str) – Optional OAuth token.

class entity_management.core.Entity(*, name=None, description=None, wasAttributedTo=None, wasGeneratedBy=None, wasDerivedFrom=None, dateCreated=None, distribution=None, contribution=None)

Bases: Identifiable

Enables provenance metadata when publishing/deprecating entities.

publish(*, resource_id=None, sync_index=False, base=None, org=None, proj=None, use_auth=None, activity=None, was_attributed_to=None, include_rev=False)

Create or update resource in nexus. Makes a remote call to the nexus instance to persist resource attributes. If a use_auth token is provided, the user agent will be extracted from it and a corresponding activity with the createdBy field will be created.

Parameters:
  • resource_id (str) – Resource identifier.

  • activity (Activity) – Optionally provide activity which generated this resource. Otherwise, when running in the context of a bbp-workflow (NEXUS_WORKFLOW env variable is provided), activity default value will be workflow execution activity.

  • was_attributed_to (Person) – Person to add to the set of attributions in self.wasAttributedTo.

  • use_auth (str) – OAuth token in case access is restricted. Token should be in the format for the authorization header: Bearer VALUE.

  • include_rev (bool) – Whether to include _rev in the linked entities or not.

Returns:

New instance of the same class with revision updated.
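The use_auth parameter expects the token in authorization header form, Bearer VALUE. A small helper for normalizing a raw token into that form might look like the sketch below; as_bearer is a hypothetical name and is not provided by entity_management.

```python
def as_bearer(token: str) -> str:
    """Hypothetical helper: return the token as 'Bearer VALUE'."""
    token = token.strip()
    prefix = "bearer "
    if token.lower().startswith(prefix):
        # Already prefixed: normalize the capitalization and spacing.
        return "Bearer " + token[len(prefix):].strip()
    return f"Bearer {token}"


header_value = as_bearer("abc123")
```

The resulting string is what would be passed as use_auth when access is restricted.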

classmethod was_generated_by(generated_by, **kwargs)

List all resources which were generated by specified resource.

Parameters:

generated_by – Resource activity that generated entities.

Returns:

Iterator through the generated resources.

class entity_management.core.ModelRuntimeParameters(*, name=None, description=None, wasAttributedTo=None, wasGeneratedBy=None, wasDerivedFrom=None, dateCreated=None, contribution=None, distribution=None, model, purpose=None, modelBuildingSteps=None, allocationPartition='prod', numberOfNodes=None, nodeConstraint=None, exclusiveNodeAllocation=False, allocationDuration=None, qualityOfService='', memoryAmount=None, numberOfTasksPerNode=None)

Bases: MultiDistributionEntity

Model runtime parameters.

Parameters:
  • model (Identifiable) – Model reference to which the parameters apply.

  • name (str) – Name of the parameter collection.

  • purpose (str) – Purpose of the parameter set. For example parameters for simulation or visualization.

  • modelBuildingSteps (int) – Core neuron model building steps parameter if applicable.

classmethod list_by_model(model_resource_id, **kwargs)

List all instances belonging to the schema this type defines.

Parameters:

changes – Keyword changes in the new copy, should be a subset of class constructor(__init__) keyword arguments.

Returns:

New instance of the same class with changes applied.

class entity_management.core.MultiDistributionEntity(*, name=None, description=None, wasAttributedTo=None, wasGeneratedBy=None, wasDerivedFrom=None, dateCreated=None, contribution=None, distribution=None)

Bases: Entity

Entity with one or more distributions.

class entity_management.core.Person(*, email, name=None, givenName=None, familyName=None)

Bases: Agent

Person.

Parameters:
  • email (str) – Email.

  • givenName (str) – Given name.

  • familyName (str) – Family name.

class entity_management.core.SoftwareAgent(*, version)

Bases: Agent

Software agent.

Parameters:

version (str) – Version of the software used.

class entity_management.core.WorkflowExecution(*, status=None, used=None, generated=None, startedAtTime=None, endedAtTime=None, wasStartedBy=None, wasInformedBy=None, wasInfluencedBy=None, wasAssociatedWith=None, name, module, task, version, parameters=None, configFileName=None, output=None, distribution=None)

Bases: Activity

Represents activity of a workflow execution.

Parameters:
  • name (str) – The user-friendly workflow execution entry point. By convention, it contains the full name of the luigi task which was executed.

  • module (str) – Python module from which the luigi task was launched.

  • task (str) – Luigi task which was launched for execution.

  • version (str) – Version of the workflow engine used to execute the workflow.

  • parameters (str) – Concatenated list of parameters provided on the command line when the workflow was launched.

  • configFileName (str) – Name of the config file if one was explicitly provided.

  • output (str) – Any string that workflow tasks want to deliver as output to external agents.

  • distribution (DataDownload) – Zip file of the additional python modules and the configuration file used to launch the workflow.